EP4649087A2 - Non-viral expression systems and methods of use thereof
- Publication number
- EP4649087A2 (application EP24705886.0A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- seq
- fold
- amino acid
- viral system
- sequence
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/005—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from viruses
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/85—Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K38/00—Medicinal preparations containing peptides
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K48/00—Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2710/00—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA dsDNA viruses
- C12N2710/00011—Details
- C12N2710/16011—Herpesviridae
- C12N2710/16211—Lymphocryptovirus, e.g. human herpesvirus 4, Epstein-Barr Virus
- C12N2710/16222—New viral proteins or individual genes, new structural or functional aspects of known viral proteins or genes
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2800/00—Nucleic acids vectors
- C12N2800/10—Plasmid DNA
- C12N2800/108—Plasmid DNA episomal vectors
Definitions
- Nonviral gene expression systems may broadly be classified as DNA or mRNA-based systems, both of which have drawbacks.
- Epstein-Barr virus (EBV)-based vectors are self-replicating episomal plasmids that permit long-term expression of exogenous genes in mammalian cells.
- EBV-based vectors have been used to improve the efficiency and duration of transgene expression (see, e.g., Belt, et al (1991); Mazda, et al (1997); Tu, et al (2000); Mazda (2002)).
- EBV-based vectors typically encode the Epstein-Barr nuclear antigen-1 (EBNA1) protein and comprise an EBV origin of replication (OriP).
- the EBNA1 protein comprises a DNA binding domain (DBD) that binds to DNA binding elements (DBEs) present in the OriP. Once expressed, the EBNA1 protein functions to bind the EBV-based vector via the OriP to facilitate replication and episomal maintenance.
- DBD DNA binding domain
- DBEs DNA binding elements
- Synthetic plasmids have been developed comprising a nucleotide sequence encoding EBNA1 protein and an OriP. Such systems decrease the apparent dilution of plasmid DNA during successive rounds of cellular replication and increase overall protein production.
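The dilution effect described above can be illustrated with a toy model. This is a minimal sketch, not a model taken from the disclosure: the function name `copies_per_cell` and the `retention` parameter are hypothetical, and the numbers are illustrative only. A `retention` of 0.5 represents a plasmid that is never replicated, so its copies are simply split between daughter cells; a value of 1.0 represents ideal replication and segregation.

```python
def copies_per_cell(initial_copies: float, divisions: int, retention: float) -> float:
    """Expected plasmid copies per cell after a number of cell divisions.

    retention is a hypothetical per-division factor (assumption, not from
    the source): 0.5 means no replication (copies are halved each division),
    1.0 means the copy number is fully maintained.
    """
    if not 0.0 <= retention <= 1.0:
        raise ValueError("retention must be between 0 and 1")
    if divisions < 0:
        raise ValueError("divisions must be non-negative")
    return initial_copies * retention ** divisions


if __name__ == "__main__":
    # Non-replicating plasmid versus an idealized, fully maintained episome.
    for n in (0, 3, 10):
        print(n, copies_per_cell(100, n, 0.5), copies_per_cell(100, n, 1.0))
```

Under these assumptions a non-replicating plasmid falls to one eighth of its starting copy number after three divisions, which is the dilution that episomal maintenance is meant to counteract.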
- the therapeutic potential of such plasmids is limited, in part due to potential safety concerns associated with constitutive EBNA1 expression.
- EBV has been implicated in the pathophysiology of multiple cancers, most notably B cell cancers. Constitutive expression of EBNA1 alone has been found to be sufficient for transformation of B cells.
- EBV is also implicated in nasopharyngeal cancer, gastric cancers, and others. This limits the therapeutic potential of EBNA1 and OriP containing DNA-based therapies.
- the present disclosure provides a non-viral system for increasing expression of at least one transgene in a cell, the system comprising: (i) an mRNA comprising an open reading frame (ORF) encoding a DNA-binding protein comprising (a) one or more chromatin-binding domains; and (b) a DNA-binding domain (DBD) of an Epstein-Barr nuclear antigen-1 (EBNA1) polypeptide, wherein (i)(a) and (b) are operably linked; (ii) a recombinant expression vector comprising (a) the at least one transgene; and (b) a polynucleotide comprising one or more DNA binding elements (DBEs) of an Epstein-Barr virus (EBV) origin of replication (OriP), wherein (ii)(a) and (b) are operably linked.
- ORF open reading frame
- EBNA1 Epstein-Barr nuclear antigen-1
- the disclosure provides a method for increasing expression of at least one transgene in a cell comprising contacting the cell with a system comprising: (i) an mRNA comprising an ORF encoding a DNA-binding protein, wherein the DNA-binding protein comprises (a) one or more chromatin-binding domains; and (b) a polypeptide comprising an EBNA1 DBD, wherein (i)(a) and (b) are operably-linked; (ii) a recombinant expression vector comprising (a) the at least one transgene; and (b) a polynucleotide comprising one or more DBEs of an EBV OriP, wherein (ii)(a) and (b) are operably-linked, thereby increasing expression of the at least one transgene in the cell.
- the disclosure provides a method for increasing expression of at least one transgene in a dividing cell comprising contacting the cell with a system comprising: (i) an mRNA comprising an ORF encoding a DNA-binding protein, wherein the DNA-binding protein comprises (a) one or more chromatin-binding domains; and (b) a polypeptide comprising an EBNA1 DBD, wherein (i)(a) and (b) are operably-linked; (ii) a recombinant expression vector comprising (a) the at least one transgene; and (b) a polynucleotide comprising one or more DBEs of an EBV OriP, wherein (ii)(a) and (b) are operably-linked, thereby increasing expression of the at least one transgene in the dividing cell.
- the disclosure provides a non-viral system for expression of a transgene, wherein the system comprises: (i) a DNA-binding protein, a nucleic acid encoding the DNA-binding protein, or a recombinant expression vector comprising the nucleic acid, wherein the DNA-binding protein comprises (a) a DNA binding domain (DBD) of an Epstein-Barr nuclear antigen-1 (EBNA1) homolog, wherein the EBNA1 homolog is of a non-human primate (NHP) lymphocryptovirus (LCV), and (b) a chromatin-binding domain; wherein (i)(a) and (i)(b) are operably-linked; (ii) a recombinant expression vector comprising (a) a transgene; and (b) a DNA binding polynucleotide comprising a DBE of an EBV, or a variant thereof, and/or a DBE of an NHP LCV, or a
- the disclosure provides a method for increasing expression of a transgene in a dividing cell comprising contacting the cell with a system comprising: (i) a DNA-binding protein, a nucleic acid encoding the DNA-binding protein, or a recombinant expression vector comprising the nucleic acid, wherein the DNA-binding protein comprises (a) a DBD of an EBNA1 homolog, wherein the EBNA1 homolog is of a NHP LCV, and (b) a chromatin-binding domain; wherein (i)(a) and (b) are operably-linked; (ii) a recombinant expression vector comprising (a) a transgene; and (b) a DNA binding polynucleotide comprising a DBE of an EBV, or a variant thereof, and/or a DBE of an NHP LCV, or a variant thereof, wherein (ii)(a) and (b) are operably-
- the DBE comprises an EBV DBE or a variant or a fragment thereof. In some embodiments, the DBE comprises an NHP LCV DBE or a variant or a fragment thereof.
- the disclosure provides a method for selectively expressing a transgene in a target tissue and/or target cell population in a subject, comprising administering to the subject a system comprising: (i) a DNA-binding protein, a nucleic acid encoding the DNA-binding protein, or a recombinant expression vector comprising the nucleic acid, wherein the DNA-binding protein comprises (a) a DBD of an EBNA1 homolog, wherein the EBNA1 homolog is of an NHP LCV, and (b) a chromatin-binding domain; wherein (i)(a) and (b) are operably-linked; (ii) a recombinant expression vector comprising (a) a transgene; and (b) a DNA binding polyn
- the DBE comprises an EBV DBE or a variant or a fragment thereof. In some embodiments, the DBE comprises an NHP LCV DBE or a variant or a fragment thereof.
- the disclosure provides a non-viral system for expression of a transgene, wherein the system comprises: (i) a DNA-binding protein, a nucleic acid encoding the DNA-binding protein, or a recombinant expression vector comprising the nucleic acid, wherein the DNA-binding protein comprises (a) a DNA binding domain (DBD) of an Epstein-Barr nuclear antigen-1 (EBNA1) homolog, wherein the EBNA1 homolog is of a non-human primate (NHP) lymphocryptovirus (LCV), and (b) a chromatin-binding domain; wherein (i)(a) and (i)(b) are operably-linked; (ii) a recombinant expression vector comprising (a) a transgene
- the disclosure provides a method for increasing expression of a transgene in a dividing cell comprising contacting the cell with a system comprising: (i) a DNA-binding protein, a nucleic acid encoding the DNA-binding protein, or a recombinant expression vector comprising the nucleic acid, wherein the DNA-binding protein comprises (a) a DBD of an EBNA1 homolog, or a variant thereof, wherein the EBNA1 homolog is of a NHP LCV, wherein the DBD is a sequence represented by the formula: N′-[Xaa1]w-[A]-[Xaa2]x-[B]-[Xaa3]y-[C]-[Xaa4]z-C′, wherein Xaa1, Xaa2, Xaa3, and Xaa4 are any amino acid, wherein w, x, y, and z are integers referring to the
- the DBE comprises an EBV DBE or a variant or a fragment thereof. In some embodiments, the DBE comprises an NHP LCV DBE or a variant or a fragment thereof.
- the disclosure provides a method for selectively expressing a transgene in a target tissue and/or target cell population in a subject, comprising administering to the subject a system comprising: (i) a DNA-binding protein, a nucleic acid encoding the DNA-binding protein, or a recombinant expression vector comprising the nucleic acid, wherein the DNA-binding protein comprises (a) a DBD of an EBNA1 homolog, or a variant thereof, wherein the EBNA1 homolog is of a NHP LCV, wherein the DBD is a sequence represented by the formula: N′-[Xaa1]w-[A]-[Xaa2]x-[B]-[Xaa3]y-[C]-
- the DBE comprises an EBV DBE or a variant or a fragment thereof. In some embodiments, the DBE comprises an NHP LCV DBE or a variant or a fragment thereof.
- In some embodiments of any of the foregoing or related aspects, the first sequence motif has at least about 90% similarity to KX48X49X50YX51LRRX52 (SEQ ID NO: 284).
- the first sequence motif has at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to KX48X49X50YX51LRRX52 (SEQ ID NO: 284), wherein X48 is T, I, N, or W; X49 is C, S, or P; X50 is V, L, C, or I; X51 is N or S; and X52 is C, G, or A.
- the first sequence motif is KX48X49X50YX51LRRX52 (SEQ ID NO: 284), wherein X48 is T, I, N, or W; X49 is C, S, or P; X50 is V, L, C, or I; X51 is N or S; and X52 is C, G, or A. In some embodiments, the first sequence motif is X48X49X50YX51LRRX52 (SEQ ID NO: 284).
- the first sequence motif has at least 80% similarity to a sequence selected from KTSLYNLRRG (SEQ ID NO: 287), KTCCYNLRRC (SEQ ID NO: 288), KIPIYNLRRG (SEQ ID NO: 289), KTSCYNLRRC (SEQ ID NO: 290), KTCVYNLRRC (SEQ ID NO: 291), KNSCYNLRRC (SEQ ID NO: 292), and KWPLYSLRRA (SEQ ID NO: 293).
- the first sequence motif has at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to a sequence selected from KTSLYNLRRG (SEQ ID NO: 287), KTCCYNLRRC (SEQ ID NO: 288), KIPIYNLRRG (SEQ ID NO: 289), KTSCYNLRRC (SEQ ID NO: 290), KTCVYNLRRC (SEQ ID NO: 291), KNSCYNLRRC (SEQ ID NO: 292), and KWPLYSLRRA (SEQ ID NO: 293).
- the first sequence motif is a sequence selected from KTSLYNLRRG (SEQ ID NO: 287), KTCCYNLRRC (SEQ ID NO: 288), KIPIYNLRRG (SEQ ID NO: 289), KTSCYNLRRC (SEQ ID NO: 290), KTCVYNLRRC (SEQ ID NO: 291), KNSCYNLRRC (SEQ ID NO: 292), and KWPLYSLRRA (SEQ ID NO: 293).
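The degenerate first motif (SEQ ID NO: 284) can be written as a regular expression, which makes it easy to check that each of the seven listed sequences (SEQ ID NOs: 287-293) is an exact instance of it. This is a minimal sketch for illustration: the names `FIRST_MOTIF`, `LISTED_FIRST_MOTIFS` and `find_first_motif` are hypothetical, and the pattern covers only exact matches, not the "at least about 80%" similarity or identity embodiments.

```python
import re
from typing import List

# K-X48-X49-X50-Y-X51-L-R-R-X52, with the residues allowed at each
# variable position as listed for SEQ ID NO: 284.
FIRST_MOTIF = re.compile(r"K[TINW][CSP][VLCI]Y[NS]LRR[CGA]")

# SEQ ID NOs: 287-293 as given in the text.
LISTED_FIRST_MOTIFS = [
    "KTSLYNLRRG", "KTCCYNLRRC", "KIPIYNLRRG", "KTSCYNLRRC",
    "KTCVYNLRRC", "KNSCYNLRRC", "KWPLYSLRRA",
]


def find_first_motif(protein: str) -> List[int]:
    """Return 0-based start positions of non-overlapping exact motif matches."""
    return [m.start() for m in FIRST_MOTIF.finditer(protein.upper())]


if __name__ == "__main__":
    for seq in LISTED_FIRST_MOTIFS:
        print(seq, bool(FIRST_MOTIF.fullmatch(seq)))
```

The same construction applies to the second and third motifs by substituting their allowed residues at each variable position.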
- the second sequence motif has at least about 90% similarity to RX61X62X63LX64RLPX65 (SEQ ID NO: 285).
- the second sequence motif is RX61X62X63LX64RLPX65 (SEQ ID NO: 285). In some embodiments, the second sequence motif has at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to RX61X62X63LX64RLPX65 (SEQ ID NO: 285), wherein X61 is A, L, S, or I; X62 is T or S; X63 is P or T; X64 is G, S, or F; and X65 is Y or F.
- the second sequence motif is RX61X62X63LX64RLPX65 (SEQ ID NO: 285), wherein X61 is A, L, S, or I; X62 is T or S; X63 is P or T; X64 is G, S, or F; and X65 is Y or F.
- the second sequence motif has at least 80% similarity to a sequence selected from RLTPLSRLPF (SEQ ID NO: 294), RATPLSRLPY (SEQ ID NO: 295), RSTTLGRLPY (SEQ ID NO: 296), RLTPLGRLPF (SEQ ID NO: 297), RATPLGRLPY (SEQ ID NO: 298), RLTPLSRLPY (SEQ ID NO: 299), and RISPLFRLPY (SEQ ID NO: 300).
- the second sequence motif has at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to a sequence selected from RLTPLSRLPF (SEQ ID NO: 294), RATPLSRLPY (SEQ ID NO: 295), RSTTLGRLPY (SEQ ID NO: 296), RLTPLGRLPF (SEQ ID NO: 297), RATPLGRLPY (SEQ ID NO: 298), RLTPLSRLPY (SEQ ID NO: 299), and RISPLFRLPY (SEQ ID NO: 300).
- the second sequence motif is a sequence selected from RLTPLSRLPF (SEQ ID NO: 294), RATPLSRLPY (SEQ ID NO: 295), RSTTLGRLPY (SEQ ID NO: 296), RLTPLGRLPF (SEQ ID NO: 297), RATPLGRLPY (SEQ ID NO: 298), RLTPLSRLPY (SEQ ID NO: 299), and RISPLFRLPY (SEQ ID NO: 300).
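The percent-identity thresholds used for the motifs can be made concrete for two sequences of equal length. The sketch below assumes an ungapped, position-by-position comparison; the source does not state which alignment method defines identity, and the function name `percent_identity` is hypothetical. It computes identity only. Similarity, which typically also credits conservative substitutions, would need a substitution matrix and is not covered here.

```python
def percent_identity(a: str, b: str) -> float:
    """Ungapped percent identity between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("this sketch only compares sequences of equal length")
    if not a:
        raise ValueError("sequences must be non-empty")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


if __name__ == "__main__":
    # SEQ ID NO: 294 versus SEQ ID NO: 295: two of ten positions differ.
    print(percent_identity("RLTPLSRLPF", "RATPLSRLPY"))
```

For a 10-residue motif under this assumption, an 80% identity threshold allows at most two substitutions and a 90% threshold allows one.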
- the third sequence motif has at least about 90% similarity to GPX71PX72PX73X74ES (SEQ ID NO: 286).
- the third sequence motif is GPX71PX72PX73X74ES (SEQ ID NO: 286). In some embodiments, the third sequence motif has at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to GPX71PX72PX73X74ES (SEQ ID NO: 286), wherein X71 is Q or E; X72 is G or T; X73 is L, M, or I; and X74 is R, K, M, or L.
- the third sequence motif is GPX71PX72PX73X74ES (SEQ ID NO: 286), wherein X71 is Q or E; X72 is G or T; X73 is L, M, or I; and X74 is R, K, M, or L.
- the third sequence motif has at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to a sequence selected from GPQPGPLRES (SEQ ID NO: 301), GPQPGPLKES (SEQ ID NO: 302), GPQPGPMRES (SEQ ID NO: 303), GPEPTPLMES (SEQ ID NO: 304), and GPQPGPILES (SEQ ID NO: 305).
- GPQPGPLRES SEQ ID NO: 301
- GPQPGPLKES SEQ ID NO: 302
- GPQPGPMRES SEQ ID NO: 303
- GPEPTPLMES SEQ ID NO: 304
- GPQPGPILES SEQ ID NO: 305
- the third sequence motif has at least 80% similarity to a sequence selected from GPQPGPLRES (SEQ ID NO: 301), GPQPGPLKES (SEQ ID NO: 302), GPQPGPMRES (SEQ ID NO: 303), GPEPTPLMES (SEQ ID NO: 304), and GPQPGPILES (SEQ ID NO: 305).
- the third sequence motif is a sequence selected from GPQPGPLRES (SEQ ID NO: 301), GPQPGPLKES (SEQ ID NO: 302), GPQPGPMRES (SEQ ID NO: 303), GPEPTPLMES (SEQ ID NO: 304), and GPQPGPILES (SEQ ID NO: 305).
- w is 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, or 60.
- x is 6, 7, 8, 9, or 10.
- y is 4, 5, 6, 7, or 8.
- z is 48, 49, 50, 51, 52, or 53.
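The DBD formula N′-[Xaa1]w-[A]-[Xaa2]x-[B]-[Xaa3]y-[C]-[Xaa4]z-C′ and the integer ranges above can be combined into a single pattern. This is a sketch under stated assumptions: it takes [A], [B] and [C] to be the first, second and third sequence motifs (the passage defining them is truncated in this text), it treats every spacer residue as any of the 20 standard amino acids, and it ignores the additional constraints of SEQ ID NOs: 306-309. All names (`SPACER_RANGES`, `DBD_PATTERN`, `length_bounds`, `matches_dbd_architecture`) are hypothetical.

```python
import re
from typing import Tuple

AA = "[ACDEFGHIKLMNPQRSTVWY]"  # any standard amino acid

# Inclusive (min, max) spacer lengths from the listed values of w, x, y, z.
SPACER_RANGES = {"w": (48, 60), "x": (6, 10), "y": (4, 8), "z": (48, 53)}

# Assumption: [A], [B], [C] are the degenerate motifs SEQ ID NOs: 284-286.
MOTIF_A = "K[TINW][CSP][VLCI]Y[NS]LRR[CGA]"
MOTIF_B = "R[ALSI][TS][PT]L[GSF]RLP[YF]"
MOTIF_C = "GP[QE]P[GT]P[LMI][RKML]ES"
MOTIF_LENGTH = 10


def spacer(name: str) -> str:
    """Regex fragment for one spacer with its allowed length range."""
    low, high = SPACER_RANGES[name]
    return f"{AA}{{{low},{high}}}"


DBD_PATTERN = re.compile(
    spacer("w") + MOTIF_A + spacer("x") + MOTIF_B
    + spacer("y") + MOTIF_C + spacer("z")
)


def length_bounds() -> Tuple[int, int]:
    """Shortest and longest total length allowed by the formula."""
    low = sum(r[0] for r in SPACER_RANGES.values()) + 3 * MOTIF_LENGTH
    high = sum(r[1] for r in SPACER_RANGES.values()) + 3 * MOTIF_LENGTH
    return low, high


def matches_dbd_architecture(protein: str) -> bool:
    """True if the whole sequence fits the spacer/motif layout."""
    return DBD_PATTERN.fullmatch(protein.upper()) is not None
```

Under these assumptions the formula spans 136 to 161 residues in total (106 to 131 spacer residues plus three 10-residue motifs).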
- [Xaa1]w comprises the amino acid sequence X1X2X3GGX4X5X6X7X8RGX9X10X11X12X13X14X15KX16X17X18X19X20X21X22X23X24X25LLX26RX27X28X29X30X31X32TX33X34X35X36X37WX38X39X40X41X42X43X44X45X46X47 (SEQ ID NO: 306), wherein X1 is R, G, P, or K; X2 is K or P; X3 is K or R; X4 is W, V, or -; X5 is F or -; X6 is G, -, or Y; X7 is K, -, R, or V; X8 is H, -
- X38 is V, M, P, K, G, or C;
- X39 is A, N, F, Y, or C;
- X40 is G or A;
- X41 is V or L;
- X42 is F, M, L, or I;
- X43 is V, A, or I;
- X44 is Y or V;
- X45 is G or N;
- X46 is G, L, P, or Y;
- X47 is S, -, or C; and “-” is a deletion.
- [Xaa2]x comprises the amino acid sequence X53X54X55X56X57X58X59X60 (SEQ ID NO: 307), wherein X53 is T, L, I, or M; X54 is A, G, or S; X55 is L, C, V, or I; X56 is A or C; X57 is I, A, V, or C; X58 is P or N; X59 is Q, E, W, or G; and X60 is C, V, or G.
- [Xaa3]y comprises the amino acid sequence GX66X67X68X69X70 (SEQ ID NO: 308), wherein X66 is M, S, Y, H, I, or T; X67 is A, S, or T; X68 is P, F, or W; X69 is G or E; and X70 is P, T, A, or G.
- [Xaa4]z comprises the amino acid sequence X75X76X77X78FX79X80FX81X82X83X84X85X86X87X88X89X90X91X92X93X94X95X96X97X98X99X100X101PX102PX103X104X105X106X107VX108X109X110X111FX112X113X114X115X116X117LP (SEQ ID NO: 309), wherein X75 is I, S, T, C, or G; X76 is V, T, D, E, or W; X77 is C, W, or S; X78 is Y or G; X79 is M, L, or I; X80 is V, F, or Y; X81 is L, T, or V; X82 is Q, P, or N; X83 is T
- the DBD comprises the amino acid sequence of SEQ ID NO: 310.
- the disclosure provides a non-viral system for expression of a transgene, the system comprising: (i) a DNA-binding protein, a nucleic acid encoding the DNA-binding protein, or a recombinant expression vector comprising the nucleic acid, wherein the DNA-binding protein comprises (a) a DNA binding domain (DBD) of a NHP LCV Epstein-Barr nuclear antigen-1 (EBNA1) homolog or a variant thereof, wherein the DBD comprises an amino acid sequence having at least about 80% similarity to SEQ ID NO: 322, and (b) a chromatin binding domain, wherein (i)(a) and (i)(b) are operably linked, and (ii) a recombinant expression vector comprising (a) the transgene; and (b) a DNA binding polynucleotide comprising a NHP LCV DBE or
- the DBD comprises SEQ ID NO: 322. In some embodiments, the DBD consists of SEQ ID NO: 322.
- In some aspects, the disclosure provides a method for increasing expression of a transgene in a dividing cell comprising contacting the cell with a system comprising: (i) a DNA-binding protein, a nucleic acid encoding the DNA-binding protein, or a recombinant expression vector comprising the nucleic acid, wherein the DNA-binding protein comprises (a) a DNA binding domain (DBD) of a NHP LCV Epstein-Barr nuclear antigen-1 (EBNA1) homolog or a variant thereof, wherein the DBD comprises an amino acid sequence having at least about 80% similarity to SEQ ID NO: 322, and (b) a chromatin binding domain, wherein (i)(a) and (i)(b) are operably linked, and (ii) a recombinant expression vector comprising (a) the transgene; and (b
- the DBD comprises SEQ ID NO: 322. In some embodiments, the DBD consists of SEQ ID NO: 322.
- In some aspects, the disclosure provides a method for selectively expressing a transgene in a target tissue and/or target cell population in a subject, comprising administering to the subject a system comprising: (i) a DNA-binding protein, a nucleic acid encoding the DNA-binding protein, or a recombinant expression vector comprising the nucleic acid, wherein the DNA-binding protein comprises (a) a DBD of a NHP LCV EBNA1 homolog or a variant thereof, wherein the DBD comprises an amino acid sequence having at least about 80% similarity to SEQ ID NO: 322, and (b) a chromatin binding domain, wherein (i)(a) and (i)(b) are operably linked, and (ii) a recombinant expression vector comprising (a) the transgene; and (b) a DNA
- the DBD comprises SEQ ID NO: 322. In some embodiments, the DBD consists of SEQ ID NO: 322.
- In some embodiments of any of the foregoing or related aspects, the DBD comprises an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 322. In some embodiments, the DBD comprises SEQ ID NO: 322. In some embodiments, the DBD consists of SEQ ID NO: 322.
- the disclosure provides a non-viral system for expression of a transgene, the system comprising: (i) a DNA-binding protein, a nucleic acid encoding the DNA-binding protein, or a recombinant expression vector comprising the nucleic acid, wherein the DNA-binding protein comprises (a) a DNA binding domain (DBD) of a NHP LCV Epstein-Barr nuclear antigen-1 (EBNA1) homolog or a variant thereof, wherein the DBD comprises an amino acid sequence having at least about 80% similarity to an amino acid sequence selected from SEQ ID NOs 215-220, and (b) a chromatin binding domain, wherein (i)(a) and (i)(b) are operably linked, and (ii) a recombinant expression vector comprising (a) the transgene; and (b) a DNA binding polynucleotide comprising an EBV DBE or a variant thereof, wherein (ii)(a) and (b) are
- the DBD comprises an amino acid sequence selected from SEQ ID NOs 215-220. In some embodiments, the DBD consists of an amino acid sequence selected from SEQ ID NOs 215-220.
- In some aspects, the disclosure provides a method for increasing expression of a transgene in a dividing cell comprising contacting the cell with a system comprising: (i) a DNA-binding protein, a nucleic acid encoding the DNA-binding protein, or a recombinant expression vector comprising the nucleic acid, wherein the DNA-binding protein comprises (a) a DNA binding domain (DBD) of a NHP LCV Epstein-Barr nuclear antigen-1 (EBNA1) homolog or a variant thereof, wherein the DBD comprises an amino acid sequence having at least about 80% similarity to an amino acid sequence selected from SEQ ID NOs 215-220, and (b) a chromatin binding domain, wherein (i)(a) and (i)(b) are operably linked, and (i) a DNA
- the DBD comprises an amino acid sequence selected from SEQ ID NOs 215-220. In some embodiments, the DBD consists of an amino acid sequence selected from SEQ ID NOs 215-220.
- In some aspects, the disclosure provides a method for selectively expressing a transgene in a target tissue and/or target cell population in a subject, comprising administering to the subject a system comprising: (i) a DNA-binding protein, a nucleic acid encoding the DNA-binding protein, or a recombinant expression vector comprising the nucleic acid, wherein the DNA-binding protein comprises (a) a DNA binding domain (DBD) of a NHP LCV Epstein-Barr nuclear antigen-1 (EBNA1) homolog or a variant thereof, wherein the DBD comprises an amino acid sequence having at least about 80% similarity to an amino acid sequence selected from SEQ ID NOs 215-220, and (b) a chromatin binding domain, wherein (i)(a) and (i)(b
- the DBD comprises an amino acid sequence selected from SEQ ID NOs 215-220. In some embodiments, the DBD consists of an amino acid sequence selected from SEQ ID NOs 215-220.
- In some embodiments of any of the foregoing or related aspects, the DBD comprises an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to an amino acid sequence selected from SEQ ID NOs: 215-220. In some embodiments, the DBD comprises an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 215. In some embodiments, the DBD comprises SEQ ID NO: 215.
- the DBD comprises an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 216. In some embodiments, the DBD comprises SEQ ID NO: 216. In some embodiments, the DBD comprises an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 217. In some embodiments, the DBD comprises SEQ ID NO: 217. In some embodiments, the DBD comprises an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 218.
- the DBD comprises SEQ ID NO: 218. In some embodiments, the DBD comprises an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 219. In some embodiments, the DBD comprises SEQ ID NO: 219. In some embodiments, the DBD comprises an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 220. In some embodiments, the DBD comprises SEQ ID NO: 220.
- the disclosure provides a non-viral system for expression of a transgene, the system comprising: (i) a DNA-binding protein, a nucleic acid encoding the DNA-binding protein, or a recombinant expression vector comprising the nucleic acid, wherein the DNA-binding protein comprises (a) a DBD of an EBNA1 homolog or a variant thereof, wherein the EBNA1 homolog is of a NHP LCV; and (b) a chromatin binding domain, wherein (i)(a) and (i)(b) are operably linked, and (ii) a recombinant expression vector comprising (a) the transgene; and (b) a DNA binding polynucleotide comprising a sequence represented by the formula 5′-[D1]-[L1]-[D2]-[L2]-[D3]-[L3]-([Dn]-[L
- the disclosure provides a method for increasing expression of a transgene in a dividing cell comprising contacting the cell with a system comprising: (i) a DNA-binding protein, a nucleic acid encoding the DNA-binding protein, or a recombinant expression vector comprising the nucleic acid, wherein the DNA-binding protein comprises (a) a DBD of an EBNA1 homolog or a variant thereof, wherein the EBNA1 homolog is of a NHP LCV; and (b) a chromatin binding domain, wherein (i)(a) and (i)(b) are operably linked, and (ii) a recombinant expression vector comprising (a) the transgene; and (b) a DNA binding polynucleotide comprising a sequence represented by the formula 5′-[D1]-[L1]-[D2]-[L2]-[D3]-[L3]-([Dn
- the disclosure provides a method for selectively expressing a transgene in a target tissue and/or target cell population in a subject, comprising administering to the subject a system comprising: (i) a DNA-binding protein, a nucleic acid encoding the DNA-binding protein, or a recombinant expression vector comprising the nucleic acid, wherein the DNA-binding protein comprises (a) a DBD of an EBNA1 homolog or a variant thereof, wherein the EBNA1 homolog is of a NHP LCV; and (b) a chromatin binding domain, wherein (i)(a) and (i)(b) are operably linked, and (ii) a recombinant expression vector comprising (a) the transgene; and (b) a DNA binding polynucleotide comprising a sequence represented by the formula 5′-[D1]-[L1]-[D2]-[L2]-[D3]-[L
- the NHP is of the family Hominoidea or Cercopithecoidea. In some embodiments, the NHP is of the family Hominoidea. In some embodiments, the NHP (e.g., the NHP is of the family Hominoidea) is selected from the group consisting of species listed in Table 2.
- In some embodiments of any of the foregoing or related aspects, the NHP is of the family Cercopithecoidea. In some embodiments, the NHP (e.g., NHP is of the family Cercopithecoidea) is selected from the group consisting of species listed in Table 3.
- the LCV (e.g., the LCV of an NHP of the family Hominoidea or Cercopithecoidea) is selected from Cercocebus atys lymphocryptovirus 1, Cercopithecus hamlyni lymphocryptovirus 1, Cercopithecus cephus lymphocryptovirus 1, Cercopithecus neglectus lymphocryptovirus 1, Cercopithecus neglectus lymphocryptovirus 2, Cercopithecus nictitans lymphocryptovirus 1, Chlorocebus aethiops lymphocryptovirus 1, Chlorocebus aethiops lymphocryptovirus 2, Colobus guereza lymphocryptovirus 1, Colobus polykomos lymphocryptovirus 1, Erythrocebus patas lymphocryptovirus 1, Gorilla gorilla lymphocryptovirus 1, Gorilla gorilla lymphocryptovirus 2, Hylobates lar lymphocryptovirus 1, Hylobates muelleri lymph
- the LCV (e.g., the LCV of an NHP of the family Hominoidea or Cercopithecoidea) is selected from gorilline gammaherpesvirus 1, macacine gammaherpesvirus 4, macacine gammaherpesvirus 10, macacine gammaherpesvirus 13, panine gammaherpesvirus 1, paniine gammaherpesvirus 1, and pongine gammaherpesvirus 2.
- the disclosure provides a non-viral system for expression of a transgene, the system comprising: (i) a DNA-binding protein, a nucleic acid encoding the DNA-binding protein, or a recombinant expression vector comprising the nucleic acid, wherein the DNA-binding protein comprises (a) a DBD of an EBNA1 homolog or a variant, wherein the EBNA1 homolog is of a NHP LCV, wherein the NHP is of the parvorder Platyrrhini; and (b) a chromatin binding domain, wherein (i)(a) and (i)(b) are operably linked, and (ii) a recombinant expression vector comprising (a) the transgene; and (b) a DNA binding polynucleotide comprising a DBE or a variant thereof, wherein the DBE is present in the genome of the NHP LCV, wherein (ii)(a) and (b
- the disclosure provides a method for increasing expression of a transgene in a dividing cell comprising contacting the cell with a system comprising: (i) a DNA-binding protein, a nucleic acid encoding the DNA-binding protein, or a recombinant expression vector comprising the nucleic acid, wherein the DNA-binding protein comprises (a) a DBD of an EBNA1 homolog or a variant, wherein the EBNA1 homolog is of a NHP LCV, wherein the NHP is of the parvorder Platyrrhini; and (b) a chromatin binding domain, wherein (i)(a) and (i)(b) are operably linked, and (ii) a recombinant expression vector comprising (a) the transgene; and (b) a DNA binding polynucleotide comprising a DBE or a variant thereof, wherein the DBE is present in the genome of the NHP LCV, wherein
- the disclosure provides a method for selectively expressing a transgene in a target tissue and/or target cell population in a subject, comprising administering to the subject a system comprising: (i) a DNA-binding protein, a nucleic acid encoding the DNA-binding protein, or a recombinant expression vector comprising the nucleic acid, wherein the DNA-binding protein comprises (a) a DBD of an EBNA1 homolog or a variant, wherein the EBNA1 homolog is of a NHP LCV, wherein the NHP is of the parvorder Platyrrhini; and (b) a chromatin binding domain, wherein (i)(a) and (i)(b) are operably linked, and (ii) a recombinant expression vector comprising (a) the transgene; and (b) a DNA binding polynucleotide comprising a DBE or a variant thereof, wherein the DBE is present in the
- the DBE (e.g., the DBE is present in the genome of the NHP LCV) comprises: (i) CGCCAACAAACGTTG (SEQ ID NO: 317), a nucleotide sequence having 1, 2, 3, or 4 mismatches relative to SEQ ID NO: 317, or a nucleotide sequence having at least 80% identity to SEQ ID NO: 317; (ii) CAACACCCAGTCACGCAGTCTCAAGGGTCCT (SEQ ID NO: 318), a nucleotide sequence having 1, 2, 3, or 4 mismatches relative to SEQ ID NO: 318, or a nucleotide sequence having at least 80% identity to SEQ ID NO: 318; (iii) TTTGTTGGCGCCAACAAA (SEQ ID NO: 319), a nucleotide sequence having 1, 2, 3, or 4 mismatches relative to SEQ ID NO: 319, or a nucleot
- the DNA binding polynucleotide (e.g., the DNA binding polynucleotide comprising a DBE or a variant thereof, wherein the DBE is present in the genome of the NHP LCV) comprises a nucleotide sequence having at least about 70% identity to SEQ ID NO: 316. In some embodiments, the DNA binding polynucleotide comprises SEQ ID NO: 316.
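The mismatch and percent-identity thresholds recited for the DBE can be illustrated with a minimal sketch. It assumes an ungapped, position-by-position comparison of equal-length sequences, which is a simplification: the text above does not state how identity is computed, and in practice it is usually derived from a sequence alignment. The function names and the candidate sequences are hypothetical; only SEQ ID NO: 317 is taken from the text.

```python
# Minimal sketch: ungapped mismatch count and percent identity.
# Assumption: both sequences have the same length and are compared
# position by position (no alignment, no gaps).

SEQ_ID_NO_317 = "CGCCAACAAACGTTG"  # DBE sequence quoted in the text


def count_mismatches(candidate: str, reference: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    if len(candidate) != len(reference):
        raise ValueError("ungapped comparison needs equal-length sequences")
    return sum(a != b for a, b in zip(candidate.upper(), reference.upper()))


def percent_identity(candidate: str, reference: str) -> float:
    """Percentage of identical positions over the reference length."""
    mismatches = count_mismatches(candidate, reference)
    return 100.0 * (len(reference) - mismatches) / len(reference)


def meets_dbe_thresholds(candidate: str,
                         reference: str = SEQ_ID_NO_317,
                         max_mismatches: int = 4,
                         min_identity: float = 80.0) -> bool:
    """True if the candidate has at most max_mismatches mismatches
    or at least min_identity percent identity to the reference."""
    return (count_mismatches(candidate, reference) <= max_mismatches
            or percent_identity(candidate, reference) >= min_identity)


# Hypothetical candidate with two substitutions (first and last position).
example = "AGCCAACAAACGTTA"
print(count_mismatches(example, SEQ_ID_NO_317))
print(round(percent_identity(example, SEQ_ID_NO_317), 1))
print(meets_dbe_thresholds(example))
```

For a 15-nucleotide reference such as SEQ ID NO: 317, the two alternatives are not equivalent: four mismatches correspond to about 73% identity, so the mismatch criterion is the more permissive of the two.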
- the NHP (e.g., the NHP of the parvorder Platyrrhini) is selected from the group consisting of species listed in Table 4.
- the LCV (e.g., the LCV of an NHP of the parvorder Platyrrhini) is selected from Ateles paniscus lymphocryptovirus 1, Callithrix penicillata lymphocryptovirus 1, Leontopithecus rosalia lymphocryptovirus 1, Pithecia pithecia lymphocryptovirus 1, Saimiri sciureus lymphocryptovirus 2, and Saimiri sciureus lymphocryptovirus 3.
- the LCV (e.g., the LCV of an NHP of the parvorder Platyrrhini) is callitrichine gammaherpesvirus 3.
- the DBD comprises an amino acid sequence set forth in SEQ ID NO: 322, or an amino acid sequence having at least about 80% similarity to SEQ ID NO: 322. In some embodiments, the DBD comprises an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 322.
- the disclosure provides a non-viral system for expression of a transgene, the system comprising: (i) a DNA-binding protein, a nucleic acid encoding the DNA-binding protein, or a recombinant expression vector comprising the nucleic acid, and (ii) a recombinant expression vector comprising (a) a transgene; and (b) a DNA binding polynucleotide, wherein (I) the DNA-binding protein comprises an amino acid sequence having at least about 80% similarity to an amino acid sequence selected from SEQ ID NOs: 215-220 and the DNA binding polynucleotide comprises a sequence having at least about 80% identity to SEQ ID NO: 69; (II) the DNA-binding protein comprises an amino acid sequence selected from SEQ ID NOs: 215-220 and the DNA binding polynucleotide comprises SEQ ID NO: 69; (III) the DNA-binding protein comprises an amino acid sequence having at least about 80% similarity
- the disclosure provides a method for increasing expression of a transgene in a dividing cell comprising contacting the cell with a system comprising: (i) a DNA-binding protein, a nucleic acid encoding the DNA-binding protein, or a recombinant expression vector comprising the nucleic acid, and (ii) a recombinant expression vector comprising (a) a transgene; and (b) a DNA binding polynucleotide, wherein (I) the DNA-binding protein comprises an amino acid sequence having at least about 80% similarity to an amino acid sequence selected from SEQ ID NOs: 215-220 and the DNA binding polynucleotide comprises a sequence having at least about 80% identity to SEQ ID NO: 69; (II) the DNA-binding protein comprises an amino acid sequence selected from SEQ ID NOs: 215-220 and the DNA binding polynucleotide comprises SEQ ID NO: 69; (III) the DNA-binding protein comprises an amino acid sequence
- the disclosure provides a method for selectively expressing a transgene in a target tissue and/or target cell population in a subject, comprising administering to the subject a system comprising: (i) a DNA-binding protein, a nucleic acid encoding the DNA-binding protein, or a recombinant expression vector comprising the nucleic acid, and (ii) a recombinant expression vector comprising (a) a transgene; and (b) a DNA binding polynucleotide, wherein (I) the DNA-binding protein comprises an amino acid sequence having at least about 80% similarity to an amino acid sequence selected from SEQ ID NOs: 215-220 and the DNA binding polynucleotide comprises a sequence having at least about 80% identity to SEQ ID NO: 69; (II) the DNA-binding protein comprises an amino acid sequence selected from SEQ ID NOs: 215-220 and the DNA binding polynucleotide comprises SEQ ID NO: 69; (III)
- the DNA-binding protein comprises an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to an amino acid sequence selected from SEQ ID NOs: 215-220 and the DNA binding polynucleotide comprises a sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 69.
- the DNA-binding protein comprises an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 322 and the DNA binding polynucleotide comprises a sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 316.
- expression of the at least one transgene is increased by at least about 5-fold, about 10-fold, about 20-fold, about 30-fold, about 40-fold, about 45-fold, or about 50-fold as compared to introducing the recombinant expression vector alone.
- expression of the at least one transgene is increased by about 5-fold, about 10-fold, about 20-fold, about 30-fold, about 40-fold, about 45-fold, about 50-fold, about 55-fold, about 60-fold, about 65-fold, about 70-fold, about 75-fold, about 80-fold, about 85-fold, about 90-fold, about 95-fold, about 100-fold, about 110-fold, about 120-fold, about 130-fold, about 140-fold, about 150-fold, about 160-fold, about 170-fold, about 180-fold, about 190-fold, about 200-fold, about 250-fold, about 300-fold, about 350-fold, about 400-fold, about 450-fold, about 500-fold, about 600-fold, about 700-fold, about 800-fold, about 900-fold, about 1x10^3-fold, about 5x10^3-fold, or about 1x10^4-fold as compared to introducing the recombinant expression vector alone.
- expression of the at least one transgene is increased by about 10-fold to about 100-fold, about 10-fold to about 500-fold, about 10-fold to about 1x10^3-fold, about 100-fold to about 500-fold, about 100-fold to about 1x10^3-fold, about 100-fold to about 2x10^3-fold, about 100-fold to about 5x10^3-fold, or about 1x10^3-fold to about 1x10^4-fold as compared to introducing the recombinant expression vector alone.
- expression of the at least one transgene is increased by at least about 5-fold, about 10-fold, about 20-fold, about 30-fold, about 40-fold, about 45-fold, or about 50-fold as compared to introducing the system to a control cell contacted with a mitosis inhibitor.
- expression of the at least one transgene is increased by about 10-fold to about 100-fold, about 10-fold to about 500-fold, about 10-fold to about 1x10^3-fold, about 100-fold to about 500-fold, about 100-fold to about 1x10^3-fold, about 100-fold to about 2x10^3-fold, about 100-fold to about 5x10^3-fold, or about 1x10^3-fold to about 1x10^4-fold as compared to introducing the system to a control cell contacted with a mitosis inhibitor.
- expression of the at least one transgene is increased by about 10-fold to about 10^2-fold, about 10-fold to about 10^3-fold, about 10^2-fold to about 10^3-fold, about 10^2-fold to about 10^4-fold, or about 10^3-fold to about 10^4-fold, as compared to introducing the system to a control cell contacted with a mitosis inhibitor.
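The fold increases recited above are ratios of transgene expression with the system to expression under a comparison condition (the recombinant expression vector alone, or a control cell contacted with a mitosis inhibitor). A minimal sketch of that arithmetic follows; the function names and the numeric readouts are hypothetical and are not data from the disclosure.

```python
# Minimal sketch: fold increase as a ratio of two expression readouts.
# The readout values below are made-up numbers in arbitrary units.


def fold_increase(system_expression: float, control_expression: float) -> float:
    """Expression with the system divided by expression in the control."""
    if control_expression <= 0:
        raise ValueError("control expression must be positive")
    return system_expression / control_expression


def within_fold_range(fold: float, low: float, high: float) -> bool:
    """True if the fold increase falls inside an inclusive range."""
    return low <= fold <= high


# Hypothetical readouts: system versus vector alone.
fold = fold_increase(system_expression=5000.0, control_expression=100.0)
print(fold)                                  # ratio of the two readouts
print(within_fold_range(fold, 10.0, 100.0))  # "about 10-fold to about 100-fold"
```

The sketch uses hard inclusive bounds for simplicity; the text qualifies each value with "about", so a tolerance on the bounds would be a reasonable design choice in practice.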
- the introducing is ex vivo. In some embodiments, the introducing is in vivo.
- the disclosure provides a method for selectively expressing at least one transgene in a target tissue and/or target cell population in a subject, comprising administering to the subject a system comprising: (i) an mRNA comprising an ORF encoding a DNA-binding protein, wherein the DNA-binding protein comprises (a) one or more chromatin-binding domains; and (b) a polypeptide comprising an EBNA1 DBD, wherein (i)(a) and (b) are operably-linked; (ii) a recombinant expression vector comprising (a) the at least one transgene; and (b) a polynucleotide comprising one or more DBEs of an EBV OriP, wherein (ii)(a) and (b) are operably linked, thereby selectively expressing the at least one transgene in the target tissue and/or target cell population.
- the target tissue comprises tumor tissue.
- the target cell population comprises tumor cells.
- the system is administered systemically.
- the system is administered intratumorally.
- expression of the at least one transgene in the target tissue and/or target cell population is increased by at least about 5-fold, about 10-fold, about 20-fold, about 30-fold, about 40-fold, about 50-fold, about 100-fold, about 150-fold, or about 200-fold as compared to administering the recombinant expression vector alone.
- expression of the at least one transgene in the target tissue and/or target cell population is increased by about 10-fold to about 100-fold as compared to administering the recombinant expression vector alone. In some embodiments, expression of the at least one transgene in the target tissue and/or target cell population is increased by about 10-fold to about 200-fold as compared to administering the recombinant expression vector alone. In some embodiments, expression of the at least one transgene in the target tissue and/or target cell population is increased by about 10-fold, about 20-fold, about 30-fold, about 40-fold, or about 50-fold as compared to administering the recombinant expression vector alone.
- the at least one transgene is not substantially expressed in a non-target tissue and/or a non-target cell population. In some embodiments, expression of the at least one transgene in a non-target tissue and/or a non-target cell population is substantially equivalent to administering the recombinant expression vector alone.
- the non-target tissue comprises liver tissue and/or the non-target cell population comprises hepatocytes. In some embodiments, expression of the at least one transgene in the target tissue and/or the target cell population is at least about 10-fold, about 20-fold, about 30-fold, about 40-fold, or about 50-fold higher as compared to the non-target tissue and/or non-target cell population.
- expression of the at least one transgene in the target tissue and/or the target cell population is about 10-fold to about 50-fold, about 10-fold to about 100-fold, about 50-fold to about 100-fold, about 50-fold to about 200-fold, about 100-fold to about 200-fold, about 100-fold to about 300-fold, about 100-fold to about 500-fold, or about 100-fold to about 10^3-fold higher as compared to the non-target tissue and/or non-target cell population.
- expression of the at least one transgene in the target tissue and/or the target cell population is about 10-fold to about 100-fold, about 10-fold to about 500-fold, about 10-fold to about 1x10^3-fold, about 100-fold to about 500-fold, about 100-fold to about 1x10^3-fold, about 100-fold to about 2x10^3-fold, about 100-fold to about 5x10^3-fold, or about 1x10^3-fold to about 1x10^4-fold higher as compared to the non-target tissue and/or non-target cell population.
- the DBD comprises or consists of an amino acid sequence having at least about 60%, about 65%, about 70%, about 75%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 18.
- the DBD of an EBNA1 polypeptide comprises or consists of an amino acid sequence having at least about 60%, about 65%, about 70%, about 75%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 18.
- the amino acid sequence has at least about 80% identity to SEQ ID NO: 18.
- the amino acid sequence has at least about 90% identity to SEQ ID NO: 18.
- the DBD comprises SEQ ID NO: 18.
- the DBD consists of SEQ ID NO: 18.
- the DBD of an EBNA1 polypeptide comprises SEQ ID NO: 18.
- the DBD of an EBNA1 polypeptide consists of SEQ ID NO: 18.
- the one or more chromatin binding domains is operably linked to the N-terminus of the polypeptide comprising the EBNA1 DBD.
- the polypeptide comprising the DBD is operably linked to the N-terminus of the one or more chromatin binding domains.
- the DNA-binding protein further comprises one or more NLSs.
- the polypeptide comprises or consists of an amino acid sequence having at least about 60%, about 65%, about 70%, about 75%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to amino acid residues 459 to 641 of SEQ ID NO: 1.
- the amino acid sequence has 80% identity to amino acid residues 459 to 607 of SEQ ID NO: 1.
- the amino acid sequence has 90% identity to amino acid residues 459 to 607 of SEQ ID NO: 1.
- the polypeptide comprises or consists of an amino acid sequence corresponding to amino acid residues 459 to 607 of SEQ ID NO: 1.
- the polypeptide comprises or consists of an amino acid sequence having at least about 60%, about 65%, about 70%, about 75%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to amino acid residues 459 to 641 of SEQ ID NO: 1.
- the amino acid sequence has 80% identity to amino acid residues 459 to 641 of SEQ ID NO: 1.
- the amino acid sequence has 90% identity to amino acid residues 459 to 641 of SEQ ID NO: 1.
- the polypeptide comprises or consists of an amino acid sequence corresponding to amino acid residues 459 to 641 of SEQ ID NO: 1.
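The residue ranges above use 1-based, inclusive numbering, so residues 459 to 607 span 149 residues and residues 459 to 641 span 183 residues. The sketch below shows how such a range maps onto a 0-based string slice. The helper name is hypothetical, and the placeholder string merely stands in for SEQ ID NO: 1, whose sequence is not reproduced in this section.

```python
# Minimal sketch: extract a 1-based, inclusive residue range from a
# protein sequence held as a Python string (0-based, end-exclusive).


def residue_range(sequence: str, start: int, end: int) -> str:
    """Return residues start..end (1-based, inclusive)."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("range falls outside the sequence")
    return sequence[start - 1:end]


# Placeholder only: NOT the real SEQ ID NO: 1, just a string long
# enough to demonstrate the index arithmetic.
placeholder = "ACDEFGHIKLMNPQRSTVWY" * 35  # 700 residues

short_region = residue_range(placeholder, 459, 607)
long_region = residue_range(placeholder, 459, 641)
print(len(short_region))  # 607 - 459 + 1
print(len(long_region))   # 641 - 459 + 1
```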
- the one or more chromatin binding domains binds to at least one element of human chromatin. In some embodiments, the one or more chromatin binding domains binds to at least one element of the nuclear matrix or nuclear lamina. In some embodiments, the one or more chromatin binding domains binds to euchromatin, heterochromatin, or both. In some embodiments, the one or more chromatin binding domains binds to genomic DNA, a histone protein, a nucleosome, or a combination thereof.
- the one or more chromatin binding domains are selected from a bromodomain, a PHD finger domain, a chromodomain, a MBT domain, a tudor domain, a PWWP domain, an ADD domain, a Zf-CW domain, an ankyrin repeat domain, a WD40 domain, and a combination thereof.
- the one or more chromatin binding domains comprises an AT-hook.
- the AT-hook is from HMGA1, HMGA2, AF-17, SETBP1, TTF-I interacting peptide 5, SC1, X box-binding regulatory factor, LIM/homeodomain protein LH-2, Retinoblastoma-binding protein 1, ELF3, DFS70, ZNF213, Peregrin, Methyl-CpG-binding protein 2, and MLLT10.
- the one or more chromatin binding domains comprises a histone protein or portion thereof.
- the histone protein or portion thereof is an H1.1 protein or portion thereof, an H1.2 protein or portion thereof, an H1.3 protein or portion thereof, an H1.4 protein or portion thereof, an H1.5 protein or portion thereof, an H1.6 protein or portion thereof, an H1.7 protein or portion thereof, an H1.8 protein or a portion thereof, an H1.9 protein or a portion thereof, or an H1.10 protein or a portion thereof.
- the one or more chromatin binding domains comprises an IL-33 chromatin-binding sequence.
- the one or more chromatin binding domains comprises a chromatin binding domain of a Kaposi’s sarcoma-associated herpesvirus (KSHV) latency-associated nuclear antigen (LANA) or a human papillomavirus (HPV) E2 protein.
- KSHV: Kaposi’s sarcoma-associated herpesvirus
- LANA: latency-associated nuclear antigen
- HPV: human papillomavirus
- the one or more chromatin binding domains are selected from EBNA1 domain A, EBNA1 domain B, and a combination thereof.
- the one or more chromatin binding domains comprises SEQ ID NO: 14 or an amino acid sequence having at least about 80% sequence identity to SEQ ID NO: 14.
- the one or more chromatin binding domains comprises SEQ ID NO: 16 or an amino acid sequence having at least about 80% sequence identity to SEQ ID NO: 16. In some embodiments, the one or more chromatin binding domains comprise (i) SEQ ID NO: 14 or an amino acid sequence having at least about 80% sequence identity to SEQ ID NO: 14; and (ii) SEQ ID NO: 16 or an amino acid sequence having at least about 80% sequence identity to SEQ ID NO: 16, wherein (i) and (ii) are operably linked. In some embodiments, (i) is upstream of (ii). In some embodiments, (ii) is upstream of (i).
- the one or more chromatin-binding domains comprises a sequence of linked amino acids comprising the formula N'-[A]-[L]-[B]-C', wherein A and B are each independently selected from SEQ ID NO: 14, an amino acid sequence having at least about 80% sequence identity to SEQ ID NO: 14, SEQ ID NO: 16, and an amino acid sequence having at least about 80% sequence identity to SEQ ID NO: 16, and wherein L, if present, is a spacer between A and B.
- the chromatin binding domain comprises (i) an amino acid sequence selected from SEQ ID NOs: 338-340 and 347, or an amino acid sequence having at least about 80% sequence identity to an amino acid sequence selected from SEQ ID NOs: 338-340 and 347; and (ii) an amino acid sequence selected from SEQ ID NOs: 341-346, or an amino acid sequence having at least about 80% sequence identity to an amino acid sequence selected from SEQ ID NOs: 341-346, wherein (i) and (ii) are operably linked.
- (i) is upstream of (ii). In some embodiments, (ii) is upstream of (i).
- the chromatin binding domain comprises a sequence of linked amino acids according to the formula N'-[A]-[L]-[B]-C', wherein A and B are each independently an amino acid sequence set forth in SEQ ID NOs: 338-346 and 347 or an amino acid sequence having at least about 80% sequence identity to an amino acid sequence selected from SEQ ID NOs: 338-346 and 347, and wherein L, if present, is a spacer between A and B.
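The N-terminal to C-terminal arrangement of two domains A and B with an optional spacer L, as in the formula above, can be pictured as a simple concatenation. The sketch below is illustrative only: the function name and all sequences are hypothetical placeholders, not the sequences of SEQ ID NOs: 14, 16, or 338-347.

```python
# Minimal sketch: assemble domains in the order A, optional spacer L, B,
# read from the N-terminus to the C-terminus.
from typing import Optional


def assemble_domains(a: str, b: str, spacer: Optional[str] = None) -> str:
    """Concatenate A, the spacer L (if present), and B in N-to-C order."""
    if not a or not b:
        raise ValueError("both A and B must be non-empty")
    return a + (spacer or "") + b


# Hypothetical placeholder sequences, for illustration only.
domain_a = "MKRGR"
domain_b = "PRGRP"
spacer = "GGS"

print(assemble_domains(domain_a, domain_b, spacer))  # A, spacer, B
print(assemble_domains(domain_a, domain_b))          # A fused directly to B
```

Because A and B are each selected independently, the same helper covers the A-B and B-A orders as well as arrangements in which both positions hold the same domain.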
- the disclosure provides a non-viral system for increasing expression of a transgene in a cell, the system comprising: (i) an mRNA comprising an open reading frame (ORF) encoding a DNA-binding protein comprising (a) one or more chromatin-binding domains comprising a sequence of linked amino acids comprising the formula N'-[A]-[L]-[B]-C', wherein A and B are each independently selected from SEQ ID NO: 14, an amino acid sequence having at least about 80% sequence identity to SEQ ID NO: 14, SEQ ID NO: 16, and an amino acid sequence having at least about 80% sequence identity to SEQ ID NO: 16, and wherein L, if present, is a spacer between A and B; and (b) a DNA-binding domain (DBD) of an Epstein-Barr nuclear antigen-1 (EBNA1) polypeptide, wherein (i)(a) and (b) are operably linked; (ii) a recombinant expression vector
- ORF: open reading frame
- the DBD comprises SEQ ID NO: 18, or an amino acid sequence having at least about 80% identity to SEQ ID NO: 18.
- the one or more chromatin binding domains is operably linked to the N-terminus of the DBD.
- the DBD is operably linked to the N-terminus of the one or more chromatin binding domains.
- the DBD is directly fused to the chromatin binding domain.
- the DBD is linked to the chromatin binding domain via a linker.
- the linker is a peptide linker.
- the peptide linker is a Gly-Ser linker.
- the chromatin binding domain is operably-linked to the N-terminus of the DBD.
- the DBD is operably-linked to the N-terminus of the chromatin binding domain.
- the DNA-binding protein comprises one chromatin binding domain.
- the DNA-binding protein comprises two or more chromatin binding domains.
- the DNA-binding protein further comprises one or more nuclear localization sequences (NLSs).
- the one or more NLSs are positioned at the N-terminus, at the C-terminus, between the one or more chromatin-binding domains and the DBD, or a combination thereof.
- the one or more NLSs is selected from a monopartite NLS, a bipartite NLS, a non-classical NLS, and a combination thereof.
- the one or more NLSs is selected from a c-Myc NLS, SV40 NLS, a nucleoplasmin NLS, a 53BP1 NLS, an ING4 NLS, an IER5 NLS, and an ERK5 NLS.
- the one or more NLSs comprises the amino acid sequence of SEQ ID NO: 17. In some embodiments, the one or more NLSs comprises an amino acid sequence having at least about 90% identity to an amino acid sequence selected from SEQ ID NOs: 17 and 35-50. In some embodiments, the DNA-binding protein comprises SEQ ID NO: 1 or an amino acid sequence having at least 80% identity to SEQ ID NO: 1. In some embodiments, the DNA-binding protein comprises or consists of an amino acid sequence having at least about 60%, about 65%, about 70%, about 75%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 3. In some embodiments, the amino acid sequence has at least about 80% identity to SEQ ID NO: 3.
- the amino acid sequence has at least about 90% identity to SEQ ID NO: 3.
- the DNA-binding protein comprises or consists of SEQ ID NO: 3.
- the DNA-binding protein is a variant of an EBNA1 polypeptide, wherein the variant comprises (a) one or more EBNA1 chromatin binding domains, (b) an EBNA1 DBD, and (c) one or more modifications of an EBNA1 domain selected from a Gly-Ala repeat region, an NLS, a transactivation (TA) domain, and a combination thereof.
- the DNA binding protein comprising a DBD of an EBNA1 polypeptide is a variant of an EBNA1 polypeptide, wherein the variant comprises (a) one or more EBNA1 chromatin binding domains, (b) an EBNA1 DBD, and (c) one or more modifications of an EBNA1 domain selected from a Gly-Ala repeat region, an NLS, a transactivation (TA) domain, and a combination thereof.
- the DNA binding protein comprises or consists of an amino acid sequence having at least about 60%, about 65%, about 70%, about 75%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to an amino acid sequence selected from SEQ ID NOs: 3, 7-11, 130, 133, 136, 139, 142, 145, 150, and 153.
- the amino acid sequence has at least about 80% identity to an amino acid sequence selected from SEQ ID NOs: 3, 7-11, 130, 133, 136, 139, 142, 145, 150, and 153.
- the amino acid sequence has at least about 90% identity to an amino acid sequence selected from SEQ ID NOs: 3, 7-11, 130, 133, 136, 139, 142, 145, 150, and 153. In some embodiments, the amino acid sequence is selected from SEQ ID NOs: 3, 7-11, 130, 133, 136, 139, 142, 145, 150, and 153.
- the ORF comprises or consists of a nucleotide sequence having at least about 60%, about 65%, about 70%, about 75%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to any one of SEQ ID NOs: 24-29, 129, 132, 135, 138, 141, 144, 149, and 152.
- the nucleotide sequence has at least about 80% identity to any one of SEQ ID NOs: 24-29, 129, 132, 135, 138, 141, 144, 149, and 152.
- the nucleotide sequence has at least about 90% identity to any one of SEQ ID NOs: 24-29, 129, 132, 135, 138, 141, 144, 149, and 152. In some embodiments, the nucleotide sequence is any one of SEQ ID NOs: 24-29, 129, 132, 135, 138, 141, 144, 149, and 152.
- the disclosure provides a non-viral system for increasing expression of at least one transgene in a cell, the system comprising: (i) an mRNA comprising an open reading frame (ORF) encoding an Epstein-Barr nuclear antigen-1 (EBNA1) polypeptide comprising from N-terminus to C-terminus: (a) a sequence of linked amino acids comprising the formula N'-[A]-[L]-[B]-C', wherein A and B are each a chromatin binding domain and L is a spacer between A and B, (b) a nuclear localization signal (NLS), and (c) a DNA binding domain (DBD), wherein (a), (b), and (c) are operably linked, and wherein the EBNA1 polypeptide comprises a substitution of all or a part of the sequence of linked amino acids with one or more heterologous chromatin binding domains, and (ii) a recombinant expression vector comprising (a) at least one transgen
- the DBD comprises SEQ ID NO: 18, or an amino acid sequence having at least about 80% identity to SEQ ID NO: 18.
- the EBNA1 polypeptide further comprises a deletion of the NLS.
- the disclosure provides a non-viral system for increasing expression of a transgene in a cell, the system comprising: (i) an mRNA comprising an ORF encoding a DNA-binding protein comprising (a) one or more heterologous chromatin-binding domains; and (b) a polypeptide comprising an EBNA1 DBD, wherein (i)(a) and (b) are operably-linked; and (ii) a recombinant expression vector comprising (a) at least one transgene; and (b) a polynucleotide comprising one or more DBEs of an EBV OriP, wherein (ii)(a) and (b) are operably-linked.
- the DBD comprises SEQ ID NO: 18. In some embodiments of any of the foregoing or related aspects, the DBD comprises an amino acid sequence having at least about 60%, about 65%, about 70%, about 75%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 18. In some embodiments, the EBNA1 DBD comprises an amino acid sequence having at least about 60%, about 65%, about 70%, about 75%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 18. In some embodiments, the DBD comprises an amino acid sequence having at least about 80% identity to SEQ ID NO: 18.
- the EBNA1 DBD comprises an amino acid sequence having at least about 80% identity to SEQ ID NO: 18. In some embodiments, the DBD comprises an amino acid sequence having at least about 90% identity to SEQ ID NO: 18. In some embodiments, the EBNA1 DBD comprises an amino acid sequence having at least about 90% identity to SEQ ID NO: 18. In some embodiments, the polypeptide comprises an amino acid sequence corresponding to amino acid residues 459 to 641 of SEQ ID NO: 1. In some embodiments, the polypeptide comprising an EBNA1 DBD comprises an amino acid sequence corresponding to amino acid residues 459 to 641 of SEQ ID NO: 1.
- the polypeptide comprises an amino acid sequence having at least about 60%, about 65%, about 70%, about 75%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to amino acid residues 459 to 641 of SEQ ID NO: 1.
- the polypeptide comprising an EBNA1 DBD comprises an amino acid sequence having at least about 60%, about 65%, about 70%, about 75%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to amino acid residues 459 to 641 of SEQ ID NO: 1.
- the polypeptide comprises an amino acid sequence having at least about 80% identity to amino acid residues 459 to 641 of SEQ ID NO: 1.
- the polypeptide comprising an EBNA1 DBD comprises an amino acid sequence having at least about 80% identity to amino acid residues 459 to 641 of SEQ ID NO: 1. In some embodiments, the polypeptide comprises an amino acid sequence having at least about 90% identity to amino acid residues 459 to 641 of SEQ ID NO: 1. In some embodiments, the polypeptide comprising an EBNA1 DBD comprises an amino acid sequence having at least about 90% identity to amino acid residues 459 to 641 of SEQ ID NO: 1. In some embodiments of any of the foregoing or related aspects, the one or more heterologous chromatin binding domains binds to at least one element of the nuclear matrix or nuclear lamina.
- the one or more heterologous chromatin binding domains binds to at least one element of human chromatin. In some embodiments, the one or more heterologous chromatin binding domains binds to euchromatin, heterochromatin, or both. In some embodiments, the one or more chromatin binding domains binds to genomic DNA, a histone protein, a nucleosome, or a combination thereof.
- the one or more heterologous chromatin binding domains are selected from a bromodomain, a PHD finger domain, a chromodomain, a MBT domain, a tudor domain, a PWWP domain, an ADD domain, a Zf-CW domain, an ankyrin repeat domain, a WD40 domain, and a combination thereof.
- the one or more heterologous chromatin binding domains comprises an AT-hook.
- the AT-hook is from HMGA1, HMGA2, AF-17, SETBP1, TTF-I interacting peptide 5, SC1, X box-binding regulatory factor, LIM/homeodomain protein LH-2, Retinoblastoma-binding protein 1, ELF3, DFS70, ZNF213, Peregrin, Methyl-CpG-binding protein 2, and MLLT10.
- the one or more heterologous chromatin binding domains comprises a histone protein or portion thereof.
- the histone protein or portion thereof is an H1.1 protein or portion thereof, an H1.2 protein or portion thereof, an H1.3 protein or portion thereof, an H1.4 protein or portion thereof, an H1.5 protein or portion thereof, an H1.6 protein or portion thereof, an H1.7 protein or portion thereof, an H1.8 protein or a portion thereof, an H1.9 protein or a portion thereof, or an H1.10 protein or a portion thereof.
- the one or more heterologous chromatin binding domains comprises an IL-33 chromatin-binding sequence.
- the one or more heterologous chromatin binding domains comprises a chromatin binding domain of a Kaposi’s sarcoma-associated herpesvirus (KSHV) latency-associated nuclear antigen (LANA) or a human papillomavirus (HPV) E2 protein.
- the one or more heterologous chromatin binding domains comprises KSHV LANA.
- the one or more heterologous chromatin binding domains is operably-linked to the N-terminus of the polypeptide, or wherein the polypeptide is operably-linked to the N-terminus of the one or more heterologous chromatin binding domains.
- the DNA-binding protein further comprises one or more NLSs.
- the one or more NLSs are positioned at the N-terminus, at the C-terminus, between the one or more heterologous chromatin-binding domains and the polypeptide, or a combination thereof.
- the one or more NLSs is selected from a monopartite NLS, a bipartite NLS, a non-classical NLS, and a combination thereof.
- the disclosure provides a non-viral system for increasing expression of a transgene in a cell, the system comprising: (i) an mRNA comprising an ORF encoding a DNA-binding protein, wherein the DNA binding protein is a variant of an EBNA1 polypeptide, wherein the variant comprises (a) one or more EBNA1 chromatin binding domains, (b) an EBNA1 DBD, and (c) one or more modifications of an EBNA1 domain selected from a Gly-Ala repeat region, an NLS, a transactivation (TA) domain, and a combination thereof, and (ii) a recombinant expression vector comprising (a) at least one transgene; and (b) a polynucleotide comprising one or more DBEs of an EBV OriP, wherein (ii)(a) and (b) are operably linked.
- the EBNA1 variant comprises one or more modifications selected from a deletion, an insertion, a substitution, and a combination thereof.
- the one or more modifications comprises a deletion of the Gly-Ala repeat region or a portion thereof.
- the one or more modifications comprises a deletion of the NLS or portion thereof.
- the one or more modifications comprises a substitution of the NLS or portion thereof.
- the NLS is substituted with a heterologous NLS.
- the heterologous NLS is a human NLS.
- the heterologous NLS is an NLS encoded by a human gene.
- the one or more modifications comprises a deletion of the TA domain or a portion thereof. In some embodiments, the one or more modifications comprises a substitution of the TA domain or a portion thereof. In some embodiments, the one or more modifications comprises a deletion of the full TA domain. In some embodiments, the one or more modifications comprises a deletion or substitution of one or more antigens in the TA domain. In some embodiments, the one or more antigens comprises a sequence motif having the amino acid sequence of SEQ ID NO: 101. In some embodiments, the one or more modifications comprises a deletion of the NLS or portion thereof and a deletion of the TA domain or portion thereof.
- the one or more modifications comprises a substitution of the NLS or portion thereof and a deletion of the TA domain or portion thereof.
- the disclosure provides a non-viral system for increasing expression of a transgene in a cell, the system comprising: (i) an mRNA comprising an open reading frame (ORF) encoding a DNA binding protein, wherein the DNA binding protein comprises an amino acid sequence set forth in any one of SEQ ID NOs: 3 and 7-11 or an amino acid sequence having at least 80% identity to an amino acid sequence set forth in any one of SEQ ID NOs: 3 and 7-11; (ii) a recombinant expression vector comprising (a) at least one transgene; and (b) a polynucleotide comprising one or more DNA binding elements (DBEs) of an Epstein-Barr virus (EBV) origin of replication (OriP), wherein (ii)(a) and (b) are operably linked.
- the disclosure provides a non-viral system for increasing expression of a transgene in a cell, the system comprising: (i) an mRNA comprising an ORF encoding a DNA binding protein, wherein the DNA binding protein comprises an amino acid sequence set forth in any one of SEQ ID NOs: 3, 7-11, 130, 133, 136, 139, 142, 145, 150, and 153 or an amino acid sequence having at least 80% identity to an amino acid sequence set forth in any one of SEQ ID NOs: 3, 7-11, 130, 133, 136, 139, 142, 145, 150, and 153; (ii) a recombinant expression vector comprising (a) at least one transgene; and (b) a polynucleotide comprising one or more DBEs of an EBV OriP, wherein (ii)(a) and (b) are operably linked.
- the DBD selectively binds to the DNA binding polynucleotide or a portion thereof.
- the DBD is about 140 to about 160 amino acid residues in length.
- the DBD comprises an amino acid sequence having not more than about 90% similarity to SEQ ID NO: 18.
- the DBD of an EBNA1 homolog comprises an amino acid sequence having not more than about 90% identity to SEQ ID NO: 18.
- the DBD comprises at least about 15 to about 149 mismatches relative to SEQ ID NO: 18.
- the DBD of an EBNA1 homolog comprises at least about 15 to about 149 mismatches relative to SEQ ID NO: 18. In some embodiments, the DBD lacks a contiguous sequence of 10 amino acids present in SEQ ID NO: 18. In some embodiments, the DBD of an EBNA1 homolog lacks a contiguous sequence of 10 amino acids present in SEQ ID NO: 18. In some embodiments, the DBD lacks a sequence set forth in SEQ ID NOs: 221, 227, 233, 239, 246, 252, 258, 264, 270, and 276.
- the DBD of an EBNA1 homolog lacks a sequence set forth in SEQ ID NOs: 221, 227, 233, 239, 246, 252, 258, 264, 270, and 276.
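The "lacks a contiguous sequence of 10 amino acids" condition above can be illustrated with a sliding-window comparison. The following is a minimal sketch, not the method of the disclosure: the function names are hypothetical, and the sequences are invented toy strings, because SEQ ID NO: 18 is not reproduced in this section.

```python
def shared_windows(candidate: str, reference: str, k: int = 10) -> set:
    """Return every length-k window of `reference` that also occurs in `candidate`."""
    ref = {reference[i:i + k] for i in range(len(reference) - k + 1)}
    cand = {candidate[i:i + k] for i in range(len(candidate) - k + 1)}
    return ref & cand


def lacks_contiguous_match(candidate: str, reference: str, k: int = 10) -> bool:
    """True when no run of k contiguous residues is common to both sequences."""
    return not shared_windows(candidate, reference, k)


# Invented toy sequences; NOT SEQ ID NO: 18 or any sequence from the disclosure.
REFERENCE_TOY = "MKTAYIAKQRQISFVKSHFSRQ"
CANDIDATE_SHARED = "GGSAKQRQISFVKGGS"    # carries one 10-residue run of the reference
CANDIDATE_DISTINCT = "GGSAKQRQASFVKGGS"  # same run interrupted by one substitution

print(lacks_contiguous_match(CANDIDATE_SHARED, REFERENCE_TOY))
print(lacks_contiguous_match(CANDIDATE_DISTINCT, REFERENCE_TOY))
```

A single substitution inside the shared run is enough to satisfy the condition in this toy case, which shows that the window test is independent of overall percent identity.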
- the DNA binding protein comprises an amino acid sequence set forth in SEQ ID NO: 3 or an amino acid sequence having at least 80% identity to an amino acid sequence set forth in SEQ ID NO: 3.
- the ORF comprises SEQ ID NO: 24 or a nucleotide sequence having at least about 70% identity to SEQ ID NO: 24.
- the DNA binding protein comprises an amino acid sequence set forth in SEQ ID NO: 7 or an amino acid sequence having at least 80% identity to an amino acid sequence set forth in SEQ ID NO: 7.
- the ORF comprises SEQ ID NO: 25 or a nucleotide sequence having at least about 70% identity to SEQ ID NO: 25.
- the DNA binding protein comprises an amino acid sequence set forth in SEQ ID NO: 8 or an amino acid sequence having at least 80% identity to an amino acid sequence set forth in SEQ ID NO: 8.
- the ORF comprises SEQ ID NO: 27 or a nucleotide sequence having at least about 70% identity to SEQ ID NO: 27.
- the DNA binding protein comprises an amino acid sequence set forth in SEQ ID NO: 9 or an amino acid sequence having at least 80% identity to an amino acid sequence set forth in SEQ ID NO: 9.
- the ORF comprises SEQ ID NO: 28 or a nucleotide sequence having at least about 70% identity to SEQ ID NO: 28.
- the DNA binding protein comprises an amino acid sequence set forth in SEQ ID NO: 10 or an amino acid sequence having at least 80% identity to an amino acid sequence set forth in SEQ ID NO: 10.
- the ORF comprises SEQ ID NO: 26 or a nucleotide sequence having at least about 70% identity to SEQ ID NO: 26.
- the DNA binding protein comprises an amino acid sequence set forth in SEQ ID NO: 11 or an amino acid sequence having at least 80% identity to an amino acid sequence set forth in SEQ ID NO: 11.
- the ORF comprises SEQ ID NO: 29 or a nucleotide sequence having at least about 70% identity to SEQ ID NO: 29.
- the DNA binding protein comprises an amino acid sequence set forth in SEQ ID NO: 130 or an amino acid sequence having at least 80% identity to an amino acid sequence set forth in SEQ ID NO: 130.
- the ORF comprises SEQ ID NO: 129 or a nucleotide sequence having at least about 70% identity to SEQ ID NO: 129.
- the DNA binding protein comprises an amino acid sequence set forth in SEQ ID NO: 133 or an amino acid sequence having at least 80% identity to an amino acid sequence set forth in SEQ ID NO: 133.
- the ORF comprises SEQ ID NO: 132 or a nucleotide sequence having at least about 70% identity to SEQ ID NO: 132.
- the DNA binding protein comprises an amino acid sequence set forth in SEQ ID NO: 136 or an amino acid sequence having at least 80% identity to an amino acid sequence set forth in SEQ ID NO: 136.
- the ORF comprises SEQ ID NO: 135 or a nucleotide sequence having at least about 70% identity to SEQ ID NO: 135.
- the DNA binding protein comprises an amino acid sequence set forth in SEQ ID NO: 139 or an amino acid sequence having at least 80% identity to an amino acid sequence set forth in SEQ ID NO: 139.
- the ORF comprises SEQ ID NO: 138 or a nucleotide sequence having at least about 70% identity to SEQ ID NO: 138.
- the DNA binding protein comprises an amino acid sequence set forth in SEQ ID NO: 142 or an amino acid sequence having at least 80% identity to an amino acid sequence set forth in SEQ ID NO: 142.
- the ORF comprises SEQ ID NO: 141 or a nucleotide sequence having at least about 70% identity to SEQ ID NO: 141.
- the DNA binding protein comprises an amino acid sequence set forth in SEQ ID NO: 145 or an amino acid sequence having at least 80% identity to an amino acid sequence set forth in SEQ ID NO: 145.
- the ORF comprises SEQ ID NO: 144 or a nucleotide sequence having at least about 70% identity to SEQ ID NO: 144.
- the DNA binding protein comprises an amino acid sequence set forth in SEQ ID NO: 150 or an amino acid sequence having at least 80% identity to an amino acid sequence set forth in SEQ ID NO: 150.
- the ORF comprises SEQ ID NO: 149 or a nucleotide sequence having at least about 70% identity to SEQ ID NO: 149.
- the DNA binding protein comprises an amino acid sequence set forth in SEQ ID NO: 153 or an amino acid sequence having at least 80% identity to an amino acid sequence set forth in SEQ ID NO: 153.
- the ORF comprises SEQ ID NO: 152 or a nucleotide sequence having at least about 70% identity to SEQ ID NO: 152.
- the DBE is an EBV DBE or a variant thereof.
- the EBV DBE comprises TAGCATATGCTA (SEQ ID NO: 51), a nucleotide sequence having 1, 2, 3, or 4 mismatches relative to SEQ ID NO: 51, or a nucleotide sequence having at least 80% identity to SEQ ID NO: 51.
- the DNA binding polynucleotide comprising an NHP LCV DBE, or a variant thereof, comprises a sequence having at least about 80% identity to SEQ ID NO: 316.
- the DNA binding polynucleotide comprising an NHP LCV DBE, or a variant thereof, comprises SEQ ID NO: 316.
- the DNA binding polynucleotide comprising a DBE of an EBV, or a variant thereof, comprises SEQ ID NO: 69, or a nucleotide sequence having at least about 70% identity to SEQ ID NO: 69. In some embodiments, the DNA binding polynucleotide comprising a DBE of an EBV, or a variant thereof, comprises SEQ ID NO: 71, or a nucleotide sequence having at least about 70% identity to SEQ ID NO: 71.
- the DNA binding polynucleotide comprising a DBE of an EBV, or a variant thereof, comprises a family of repeats (FR) of the EBV OriP or a portion of the FR, wherein the recombinant expression vector lacks a dyad symmetry (DS) of the EBV OriP.
- the DBE is an NHP LCV DBE or a variant thereof.
- the NHP LCV DBE comprises CGCCAACAAACGTTG (SEQ ID NO: 317), a nucleotide sequence having 1, 2, 3, or 4 mismatches relative to SEQ ID NO: 317, or a nucleotide sequence having at least 80% identity to SEQ ID NO: 317.
- the NHP LCV DBE comprises CAACACCCAGTCACGCAGTCTCAAGGGTCCT (SEQ ID NO: 318), a nucleotide sequence having 1, 2, 3, or 4 mismatches relative to SEQ ID NO: 318, or a nucleotide sequence having at least 80% identity to SEQ ID NO: 318.
- the NHP LCV DBE comprises TTTGTTGGCGCCAACAAA (SEQ ID NO: 319), a nucleotide sequence having 1, 2, 3, or 4 mismatches relative to SEQ ID NO: 319, or a nucleotide sequence having at least 80% identity to SEQ ID NO: 319.
- the NHP LCV DBE comprises AATGTTGGCGCCAACAAA (SEQ ID NO: 336), a nucleotide sequence having 1, 2, 3, or 4 mismatches relative to SEQ ID NO: 336, or a nucleotide sequence having at least 80% identity to SEQ ID NO: 336.
- the one or more DBEs comprise TAGCATATGCTA (SEQ ID NO: 51), a nucleotide sequence having 1, 2, 3, or 4 mismatches relative to SEQ ID NO: 51, or a nucleotide sequence having at least 80% identity to SEQ ID NO: 51.
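The mismatch and percent-identity alternatives recited for the 12-nucleotide EBV DBE can be compared with a short calculation. The sketch below assumes a simple ungapped, position-by-position comparison of equal-length sequences; how identity is actually computed under the disclosure (for example, by alignment) is not stated in this section, and the function names and the variant sequence are hypothetical.

```python
EBV_DBE = "TAGCATATGCTA"  # SEQ ID NO: 51 as recited in the text


def count_mismatches(candidate: str, reference: str = EBV_DBE) -> int:
    """Count positions that differ; assumes an ungapped, equal-length comparison."""
    if len(candidate) != len(reference):
        raise ValueError("this sketch only compares sequences of equal length")
    return sum(a != b for a, b in zip(candidate.upper(), reference.upper()))


def percent_identity(candidate: str, reference: str = EBV_DBE) -> float:
    """Percent of positions identical to the reference under the same assumption."""
    matches = len(reference) - count_mismatches(candidate, reference)
    return 100.0 * matches / len(reference)


variant = "TGGCATATGCCA"  # hypothetical variant with two substitutions
print(count_mismatches(variant), round(percent_identity(variant), 1))
```

Under this simple definition, two mismatches in a 12-mer give 10/12 (about 83.3%) identity, while three mismatches give 9/12 (75%), so the "1, 2, 3, or 4 mismatches" alternative reaches sequences that the "at least 80% identity" alternative would not.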
- the polynucleotide comprises 1 to 50, 1 to 40, 1 to 30, 1 to 20, 1 to 10, or 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1 DBEs.
- the polynucleotide comprises a sequence represented by the formula: 5’-(C-D)n-3’, wherein (i) C comprises TAGCATATGCTA (SEQ ID NO: 51), a nucleotide sequence having 1, 2, 3, or 4 mismatches relative to SEQ ID NO: 51, or a nucleotide sequence having at least 80% identity to SEQ ID NO: 51; (ii) D comprises CCCAGATATAGATTAGGA (SEQ ID NO: 61), a nucleotide sequence having 1, 2, 3, or 4 mismatches relative to SEQ ID NO: 61, or a nucleotide sequence having at least 80% identity to SEQ ID NO: 61; and (iii) n is an integer of 1 to 30.
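As an illustration of the (C-D)n formula, the following sketch concatenates the exact C and D sequences recited above. It covers only the unmodified sequences, not the mismatch or percent-identity variants, and the function name is hypothetical.

```python
C_UNIT = "TAGCATATGCTA"        # SEQ ID NO: 51 (12 nt)
D_UNIT = "CCCAGATATAGATTAGGA"  # SEQ ID NO: 61 (18 nt)


def build_cd_repeat(n: int) -> str:
    """Return 5'-(C-D)n-3' built from the unmodified C and D sequences."""
    if not 1 <= n <= 30:
        raise ValueError("n is an integer of 1 to 30 in the recited formula")
    return (C_UNIT + D_UNIT) * n


repeat = build_cd_repeat(20)
print(len(repeat))           # each C-D unit is 30 nt long
print(repeat.count(C_UNIT))  # number of DBE copies in the repeat
```

Each C-D unit is 30 nucleotides, so the recited range of n corresponds to polynucleotides of 30 to 900 nucleotides when the unmodified sequences are used.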
- the polynucleotide comprises SEQ ID NO: 69 or a nucleotide sequence having at least about 70% identity to SEQ ID NO: 69. In some embodiments, the polynucleotide comprises SEQ ID NO: 70 or a nucleotide sequence having at least about 70% identity to SEQ ID NO: 70.
- the polynucleotide comprises a first nucleotide sequence operatively linked to a second nucleotide sequence, wherein (i) the first nucleotide sequence comprises SEQ ID NO: 69 or a nucleotide sequence having at least about 70% identity to SEQ ID NO: 69; and (ii) the second nucleotide sequence comprises SEQ ID NO: 70 or a nucleotide sequence having at least about 70% identity to SEQ ID NO: 70.
- the first nucleotide sequence is upstream of the second nucleotide sequence. In some embodiments, the first nucleotide sequence is downstream of the second nucleotide sequence.
- the polynucleotide comprises SEQ ID NO: 2 or a nucleotide sequence having at least about 70% identity to SEQ ID NO: 2. In some embodiments, the polynucleotide comprises a nucleotide sequence having at least about 70% identity to nucleotides 107-1821 of SEQ ID NO: 2. In some embodiments of any of the foregoing or related aspects, the polynucleotide comprises at least 4 DBEs, wherein the DBEs are the same or different. In some embodiments, the polynucleotide comprises 4 to 50, 4 to 40, 4 to 30, 4 to 20, 4 to 10, or 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, or 4 DBEs.
- the at least 4 DBEs are contiguous. In some embodiments, the at least 4 DBEs are operably linked via a spacer sequence. In some embodiments, the spacer sequence is about 1-56 nucleotides in length. In some embodiments, the spacer sequence is 25-35 nucleotides in length. In some embodiments, the spacer sequence comprises an AT-content of at least about 50%. In some embodiments, each DBE comprises a 3’ spacer sequence, wherein the length of the DBE and the 3’ spacer sequence is about 20-50 nucleotides.
- the polynucleotide comprises a sequence according to the formula 5’-[D1]-[L1]-[D2]-[L2]-[D3]-[L3]-([Dn]-[Ln])x-3’, wherein [D1], [D2], [D3], and [Dn] each comprise TAGCATATGCTA (SEQ ID NO: 51), a nucleotide sequence having 1, 2, 3, or 4 mismatches relative to SEQ ID NO: 51, or a nucleotide sequence having at least 80% identity to SEQ ID NO: 51; wherein [L1], [L2], [L3], and [Ln] are each selected from: a phosphate linkage and a spacer sequence of 1-56 nucleotides, and wherein x indicates the number of ([Dn]-[Ln]) units in the sequence and is an integer of 1-47.
- x is 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17.
- [D1], [D2], [D3], and [Dn] are the same or different.
- the spacer sequence is 25-35 nucleotides in length.
- the spacer sequence comprises or consists of a nucleotide sequence set forth in Table 9.
- [D1]-[L1], [D2]-[L2], [D3]-[L3], and/or [Dn]-[Ln] have a length of about 20 to about 50 nucleotides.
- the spacer sequence comprises an AT-content of greater than 50%.
- [L1], [L2], [L3], and [Ln] are the same or different.
- [L1], [L2], [L3], and [Ln] each comprise or consist of a nucleotide sequence set forth in Table 9, a nucleotide sequence having 1, 2, 3, or 4 mismatches relative to a nucleotide sequence set forth in Table 9, or a nucleotide sequence having at least about 80% identity to a nucleotide sequence set forth in Table 9.
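The [D]-[L] formula and the spacer constraints recited above (spacer of 1-56 nucleotides or a bare phosphate linkage, AT-content above 50%, DBE-plus-spacer length of about 20 to about 50 nucleotides) can be checked programmatically. This is a minimal sketch under stated assumptions: the Table 9 spacers are not reproduced in this section, so the sequence of SEQ ID NO: 61 is borrowed purely as an illustrative spacer, an empty string stands in for a phosphate linkage, and all function names are hypothetical.

```python
DBE = "TAGCATATGCTA"  # SEQ ID NO: 51
# Illustrative spacer only: the sequence of SEQ ID NO: 61, used here because
# the Table 9 spacers are not reproduced in this section.
EXAMPLE_SPACER = "CCCAGATATAGATTAGGA"


def at_content(seq: str) -> float:
    """Fraction of A and T bases in a non-empty sequence."""
    seq = seq.upper()
    return sum(base in "AT" for base in seq) / len(seq)


def build_dbe_array(spacers: list, dbe: str = DBE) -> str:
    """Concatenate [D]-[L] units; an empty spacer models a phosphate linkage."""
    # D1..D3 plus x units with x of 1 to 47 gives 4 to 50 units in total.
    if not 4 <= len(spacers) <= 50:
        raise ValueError("the formula covers 4 to 50 [D]-[L] units")
    if any(len(s) > 56 for s in spacers):
        raise ValueError("spacer sequences are 1-56 nucleotides long")
    return "".join(dbe + s for s in spacers)


def unit_lengths(spacers: list, dbe: str = DBE) -> list:
    """Length of each DBE plus its 3' spacer."""
    return [len(dbe) + len(s) for s in spacers]


spacers = [EXAMPLE_SPACER] * 4
array = build_dbe_array(spacers)
print(len(array), unit_lengths(spacers), round(at_content(EXAMPLE_SPACER), 2))
```

With the illustrative 18-nucleotide spacer, each unit is 30 nucleotides long and the spacer AT-content is 11/18 (about 61%), so both the length and AT-content conditions are met in this example.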
- the DNA binding polynucleotide comprises SEQ ID NO: 69, or a nucleotide sequence having at least about 70% identity to SEQ ID NO: 69.
- the DNA binding polynucleotide comprises SEQ ID NO: 71, or a nucleotide sequence having at least about 70% identity to SEQ ID NO: 71.
- the polynucleotide comprises a family of repeats (FR) of the EBV OriP or a portion of the FR.
- the polynucleotide consists of the FR or a portion of the FR.
- the recombinant expression vector lacks a dyad symmetry (DS) of the EBV OriP.
- the nucleic acid comprises one DNA binding polynucleotide. In some embodiments, the nucleic acid comprises more than one DNA binding polynucleotide. In some embodiments of any of the foregoing or related aspects, the polynucleotide is about 0.1 kb, about 0.2 kb, about 0.3 kb, about 0.4 kb, about 0.5 kb, about 0.6 kb, about 0.7 kb, about 0.8 kb, about 0.9 kb, about 1 kb, about 1.2 kb, about 1.3 kb, about 1.4 kb, about 1.5 kb, about 1.6 kb, about 1.7 kb, about 1.8 kb, about 1.9 kb, or about 2 kb in length.
- the fusion protein is 100 to 500 residues, 100 to 600 residues, 100 to 700 residues, 100 to 800 residues, 100 to 900 residues, 100 to 1,000 residues, 200 to 500 residues, 200 to 600 residues, 200 to 700 residues, 200 to 800 residues, 200 to 900 residues, 200 to 1,000 residues, 300 to 500 residues, 300 to 600 residues, 300 to 700 residues, 300 to 800 residues, 300 to 900 residues, or 300 to 1,000 residues in length.
- the DNA binding protein is 100 to 500 residues, 100 to 600 residues, 100 to 700 residues, 100 to 800 residues, 100 to 900 residues, 100 to 1,000 residues, 200 to 500 residues, 200 to 600 residues, 200 to 700 residues, 200 to 800 residues, 200 to 900 residues, 200 to 1,000 residues, 300 to 500 residues, 300 to 600 residues, 300 to 700 residues, 300 to 800 residues, 300 to 900 residues, or 300 to 1,000 residues in length.
- the DNA binding protein is about 160 to about 200 residues, about 160 to about 250 residues, about 160 to about 300 residues, about 160 to about 350 residues, about 160 to about 400 residues, about 160 to about 450 residues, or about 160 to about 500 residues in length.
- the mRNA is present at about 20%, about 30%, about 40%, about 45%, about 50%, about 55%, about 60%, about 65%, or about 70% of the total nucleic acid. In some embodiments, the mRNA is present at about 5%, 10%, 15%, 20%, about 30%, about 40%, about 45%, about 50%, about 55%, about 60%, about 65%, or about 70% by mass of the total nucleic acid.
- the mRNA is present at about 10% to about 20% or about 10% to about 40% by mass of the total nucleic acid. In some embodiments, the mRNA and the recombinant expression vector are present at a 1:1 w/w ratio. In some embodiments, the mRNA and the recombinant expression vector are present at a 1:1 molar ratio. In some embodiments, the recombinant expression vector is a plasmid DNA. In some embodiments, the recombinant expression vector is a linear DNA vector. In some embodiments, the mRNA, the recombinant expression vector, or both are formulated in a lipid nanoparticle (LNP).
- the mRNA and the recombinant expression vector are formulated in the same LNP. In some embodiments, the mRNA and the recombinant expression vector are individually formulated in LNPs. In some embodiments, the cell is a dividing cell. In some embodiments, when the system is introduced to the cell, the DNA-binding protein is translated from the mRNA and the DBD binds to the one or more DBEs.
- binding of the DBD to the one or more DBEs results in (i) tethering the recombinant expression vector to chromatin during mitosis, (ii) increased nuclear uptake of the recombinant expression vector, (iii) increased replication of the recombinant expression vector, (iv) increased transactivation of the recombinant expression vector, or (v) a combination of (i)-(iv), as compared to introducing to the cell the recombinant expression vector alone.
- binding of the DBD to the one or more DBEs results in (i) tethering the recombinant expression vector to chromatin during mitosis, (ii) increased nuclear uptake of the recombinant expression vector, (iii) increased transcription of the transgene encoded in the recombinant expression vector, or (iv) a combination of (i)-(iii), as compared to introducing to the cell the recombinant expression vector alone.
- expression of the at least one transgene is increased by at least about 5-fold, about 10-fold, about 20-fold, about 30-fold, about 40-fold, about 45-fold, about 50-fold, about 55-fold, about 60-fold, about 65-fold, about 70-fold, about 75-fold, about 80-fold, about 85-fold, about 90-fold, about 95-fold, about 100-fold, about 110-fold, about 120-fold, about 130-fold, about 140-fold, about 150-fold, about 160-fold, about 170-fold, about 180-fold, about 190-fold, or about 200-fold as compared to introducing to the cell the recombinant expression vector alone.
- expression of the at least one transgene is increased by at least about 5-fold, about 10-fold, about 20-fold, about 30-fold, about 40-fold, about 45-fold, about 50-fold, about 55-fold, about 60-fold, about 65-fold, about 70-fold, about 75-fold, about 80-fold, about 85-fold, about 90-fold, about 95-fold, about 100-fold, about 110-fold, about 120-fold, about 130-fold, about 140-fold, about 150-fold, about 160-fold, about 170-fold, about 180-fold, about 190-fold, about 200-fold, about 250-fold, about 300-fold, about 350-fold, about 400-fold, about 450-fold, about 500-fold, about 600-fold, about 700-fold, about 800-fold, about 900-fold, about 1x10³-fold, about 5x10³-fold, or about 1x10⁴-fold as compared to introducing to the cell the recombinant expression vector alone.
- expression of the at least one transgene in the cell is increased by at least about 5-fold, about 6-fold, about 7-fold, about 8-fold, about 9-fold, or about 10-fold as compared to introducing the system to a control cell contacted with a mitosis inhibitor.
- the at least one transgene encodes a non-coding RNA (ncRNA), a polypeptide, or a combination thereof.
- the at least one transgene encodes a ncRNA selected from a ribosomal RNA, a transfer RNA, an immunostimulatory RNA, and a small RNA.
- the small RNA is selected from an antisense oligonucleotide, a small interfering RNA, a short hairpin RNA, a microRNA, a small nucleolar RNA, and a small nuclear RNA.
- the at least one transgene encodes a polypeptide.
- the polypeptide is selected from an intracellular polypeptide, a secreted polypeptide, a membrane-bound polypeptide, and a transmembrane polypeptide.
- the polypeptide is selected from a hormone, an antibiotic, an enzyme, a signaling protein, and a structural protein.
- the polypeptide is an immunomodulatory polypeptide.
- the immunomodulatory polypeptide is selected from a cytokine, a chemokine, an immune cell activator, a multispecific immune cell engager, an antibody or antigen binding fragment thereof, and a TME modulator.
- the immunomodulatory polypeptide is a cytokine.
- the immunomodulatory polypeptide is an antibody or antigen binding fragment thereof.
- the immunomodulatory polypeptide is an immune checkpoint inhibitor.
- the transgene is operably linked to a promoter.
- the promoter is a cancer-specific promoter.
- the promoter is a tumor-specific promoter.
- the polynucleotide is 5’ of the at least one transgene. In some embodiments, the polynucleotide is 3’ of the at least one transgene. In some embodiments, the recombinant expression vector comprises at least one second polynucleotide comprising one or more DBEs of an EBV OriP. In some embodiments of any of the foregoing or related aspects, the non-viral expression system comprises the DNA-binding protein as a polypeptide. In some embodiments, the DNA-binding protein and the recombinant expression vector are in the same composition. In some embodiments, the DNA-binding protein and the recombinant expression vector are in different compositions.
- the DNA-binding protein and the recombinant expression vector are formulated in different lipid nanoparticles (LNPs). In some embodiments, the DNA-binding protein and the recombinant expression vector are formulated in the same LNP.
- the non-viral expression system comprises the nucleic acid encoding the DNA-binding protein. In some embodiments, the nucleic acid is an mRNA comprising an open reading frame (ORF) encoding the DNA-binding protein. In some embodiments, the mRNA and the recombinant expression vector are present at a 10:1 to a 1:10 molar ratio.
- the mRNA has a mass percent of about 5% to about 60% of the total nucleic acid.
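The molar-ratio and mass-percent embodiments describe the same mixture in two different units, and the conversion depends on the lengths of the two nucleic acids. The sketch below is a rough illustration only: the per-nucleotide and per-base-pair masses are common textbook approximations, the mRNA and vector lengths are hypothetical and not taken from the disclosure, and the function name is invented.

```python
# Rough average molar masses (approximations, not values from the disclosure).
RNA_G_PER_MOL_PER_NT = 340.0    # single-stranded RNA, per nucleotide
DSDNA_G_PER_MOL_PER_BP = 650.0  # double-stranded DNA, per base pair


def mrna_mass_percent(mrna_nt: int, vector_bp: int,
                      mrna_per_vector_molar: float = 1.0) -> float:
    """Mass percent of mRNA in the total nucleic acid for a given molar ratio."""
    mrna_mass = mrna_per_vector_molar * mrna_nt * RNA_G_PER_MOL_PER_NT
    vector_mass = vector_bp * DSDNA_G_PER_MOL_PER_BP
    return 100.0 * mrna_mass / (mrna_mass + vector_mass)


# Hypothetical lengths: a 2,000 nt mRNA and a 6,000 bp plasmid.
for ratio in (0.1, 1.0, 10.0):
    print(ratio, round(mrna_mass_percent(2000, 6000, ratio), 1))
```

For these assumed lengths, a 1:1 molar ratio corresponds to roughly 15% mRNA by mass, so a 1:1 molar ratio and a 1:1 w/w ratio describe quite different compositions.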
- the nucleic acid and the recombinant expression vector are in the same composition. In some embodiments, the nucleic acid and the recombinant expression vector are in different compositions. In some embodiments, the nucleic acid and the recombinant expression vector are formulated in different LNPs. In some embodiments, the nucleic acid and the recombinant expression vector are formulated in the same LNP. In some embodiments of any of the foregoing or related aspects, the non-viral expression system comprises the recombinant expression vector comprising the nucleic acid encoding the DNA binding protein.
- the recombinant expression vector and the recombinant expression vector encoding the DNA binding protein are in the same composition. In some embodiments, the recombinant expression vector and the recombinant expression vector encoding the DNA binding protein are in different compositions. In some embodiments, the recombinant expression vector and the recombinant expression vector encoding the DNA binding protein are formulated in different LNPs. In some embodiments, the recombinant expression vector and the recombinant expression vector encoding the DNA binding protein are formulated in the same LNP. In some embodiments of any of the foregoing or related aspects, the recombinant expression vector is a plasmid DNA.
- the recombinant expression vector is a close-ended linear DNA vector.
- the transgene is operably linked to a promoter, optionally wherein the promoter is a tumor-specific promoter.
- the disclosure provides an LNP comprising the non-viral system described herein.
- the disclosure provides a cell comprising a non-viral system described herein or an LNP described herein. In some embodiments, the cell is a dividing cell.
- the disclosure provides a pharmaceutical composition comprising the non-viral system described herein, and a pharmaceutically acceptable carrier.
- the disclosure provides a pharmaceutical composition comprising the LNP described herein, and a pharmaceutically acceptable carrier.
- the disclosure provides a pharmaceutical composition comprising the cell described herein, and a pharmaceutically acceptable carrier.
- the disclosure provides a method of increasing expression of at least one transgene in a cell comprising contacting the cell with the non-viral system described herein, the LNP described herein, or the pharmaceutical composition described herein, wherein upon introducing to the cell the system, the LNP, or the pharmaceutical composition, the DNA-binding protein is translated from the mRNA and the DBD binds to the one or more DBEs, thereby increasing expression of the at least one transgene in the cell.
- the cell is a dividing cell.
- the disclosure provides a method of increasing expression of a transgene in a cell comprising contacting the cell with a non-viral system described herein, an LNP described herein, or a pharmaceutical composition described herein, wherein upon introducing to the cell the system, the LNP, or the pharmaceutical composition, the DNA-binding protein, or the DNA-binding protein produced from the nucleic acid or the recombinant expression vector encoding the DNA-binding protein, binds to the DNA binding polynucleotide, or a portion thereof, thereby increasing expression of the transgene in the cell.
- the cell is a dividing cell.
- the disclosure provides a method for increasing expression of a transgene in a dividing cell comprising contacting the cell with a non-viral expression system described herein, an LNP described herein, or a pharmaceutical composition described herein.
- the cell (e.g., the dividing cell) is contacted ex vivo.
- the cell (e.g., the dividing cell) is contacted in vivo.
- the cell is contacted ex vivo.
- the cell is contacted in vivo.
- binding of the DBD to the one or more DBEs results in (i) increased nuclear uptake of the recombinant expression vector, (ii) increased replication of the recombinant expression vector, (iii) increased transactivation of the recombinant expression vector, or (iv) a combination of (i)-(iii), as compared to introducing to the cell the recombinant expression vector alone.
- binding of the DBD to the one or more DBEs results in (i) increased nuclear uptake of the recombinant expression vector, (ii) increased transcription of the transgene encoded in the recombinant expression vector, or (iii) a combination of (i)-(ii), as compared to introducing to the cell the recombinant expression vector alone.
- binding of the DNA binding protein to the DNA binding polynucleotide, or a portion thereof results in (i) increased nuclear uptake of the recombinant expression vector, (ii) tethering of the recombinant expression vector to chromatin, (iii) increased retention of the recombinant expression vector in the nucleus during mitosis, (iv) increased transcription of the transgene encoded in the recombinant expression vector, or (v) a combination of (i)-(iv), as compared to contacting the cell with the recombinant expression vector.
- expression of the at least one transgene is increased by at least about 40-fold, about 45-fold, about 50-fold, about 55-fold, about 60-fold, about 65-fold, about 70-fold, about 75-fold, about 80-fold, about 85-fold, about 90-fold, about 95-fold, about 100-fold, about 110-fold, about 120-fold, about 130-fold, about 140-fold, about 150-fold, about 160-fold, about 170-fold, about 180-fold, about 190-fold, or about 200-fold as compared to introducing to the cell the recombinant expression vector alone.
- expression of the at least one transgene is increased by at least about 40-fold, about 45-fold, about 50-fold, about 55-fold, about 60-fold, about 65-fold, about 70-fold, about 75-fold, about 80-fold, about 85-fold, about 90-fold, about 95-fold, about 100-fold, about 110-fold, about 120-fold, about 130-fold, about 140-fold, about 150-fold, about 160-fold, about 170-fold, about 180-fold, about 190-fold, about 200-fold, about 250-fold, about 300-fold, about 350-fold, about 400-fold, about 450-fold, about 500-fold, about 600-fold, about 700-fold, about 800-fold, about 900-fold, about 1x10³-fold, about 5x10³-fold, or about 1x10⁴-fold as compared to introducing to the cell the recombinant expression vector alone.
- expression of the transgene is increased by at least about 40-fold, about 45-fold, about 50-fold, about 55-fold, about 60-fold, about 65-fold, about 70-fold, about 75-fold, about 80-fold, about 85-fold, about 90-fold, about 95-fold, about 100-fold, about 110-fold, about 120-fold, about 130-fold, about 140-fold, about 150-fold, about 160-fold, about 170-fold, about 180-fold, about 190-fold, about 200-fold, about 250-fold, about 300-fold, about 350-fold, about 400-fold, about 450-fold, about 500-fold, about 600-fold, about 700-fold, about 800-fold, about 900-fold, about 1x10³-fold, about 5x10³-fold, or about 1x10⁴-fold as compared to contacting the cell with the recombinant expression vector.
- the expression of the transgene is increased by at least about 5-fold, about 10-fold, about 20-fold, about 30-fold, about 40-fold, about 45-fold, or about 50-fold as compared to introducing the recombinant expression vector alone. In some embodiments, expression of the transgene is increased by at least about 5-fold, about 10-fold, about 20-fold, about 30-fold, about 40-fold, about 45-fold, or about 50-fold as compared to introducing the system to a control cell contacted with a mitosis inhibitor.
- expression of the transgene is increased by about 10-fold to about 10²-fold, about 10-fold to about 10³-fold, about 10²-fold to about 10³-fold, about 10²-fold to about 10⁴-fold, or about 10³-fold to about 10⁴-fold, as compared to introducing the system to a control cell contacted with a mitosis inhibitor.
- the disclosure provides a method of treating a disease or disorder in a subject, the method comprising administering to the subject an effective amount of the non-viral system described herein, the LNP described herein, the cell described herein, or the pharmaceutical composition described herein.
- the subject has cancer.
- the effective amount is administered systemically. In some embodiments, the effective amount is administered intratumorally.
- the disclosure provides a method for selectively expressing a transgene in a target tissue and/or target cell population in a subject, comprising administering to the subject a non-viral expression system described herein, an LNP described herein, a cell described herein, or a pharmaceutical composition described herein.
- the target tissue comprises tumor tissue.
- the target cell population comprises tumor cells.
- the system is administered systemically. In some embodiments, the system is administered intratumorally.
- expression of the transgene in the target tissue and/or target cell population is increased by about 10-fold to about 100-fold as compared to administering the recombinant expression vector alone. In some embodiments, expression of the transgene in the target tissue and/or target cell population is increased by about 10-fold, about 20-fold, about 30-fold, about 40-fold, or about 50-fold as compared to administering the recombinant expression vector alone. In some embodiments, the transgene is not substantially expressed in a non-target tissue and/or a non-target cell population. In some embodiments, expression of the transgene in a non-target tissue and/or a non-target cell population is substantially equivalent to administering the recombinant expression vector alone.
- the non-target tissue comprises liver tissue and/or the non-target cell population comprises hepatocytes.
- expression of the transgene in the target tissue and/or the target cell population is at least about 10-fold, about 20-fold, about 30-fold, about 40-fold, or about 50-fold higher as compared to the non-target tissue and/or non-target cell population.
- expression of the transgene in the target tissue and/or the target cell population is about 10-fold to about 50-fold, about 10-fold to about 100-fold, about 50-fold to about 100-fold, about 50-fold to about 200-fold, about 100-fold to about 200-fold, about 100-fold to about 300-fold, about 100-fold to about 500-fold, or about 100-fold to about 10³-fold higher as compared to the non-target tissue and/or non-target cell population.
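The selectivity embodiments above rely on two different comparisons: expression with the system versus the vector alone within one tissue, and expression in the target tissue versus a non-target tissue. The sketch below separates the two ratios. The readout values are invented placeholders in arbitrary units, not data from the disclosure, and the names are hypothetical.

```python
def fold_change(test_signal: float, control_signal: float) -> float:
    """Ratio of a test readout to its control readout."""
    if control_signal <= 0:
        raise ValueError("control signal must be positive")
    return test_signal / control_signal


# Invented reporter readouts (arbitrary units), for illustration only.
readouts = {
    "tumor_system": 5000.0,
    "tumor_vector_alone": 100.0,
    "liver_system": 120.0,
    "liver_vector_alone": 100.0,
}

# Increase attributable to the system within each tissue.
tumor_fold = fold_change(readouts["tumor_system"], readouts["tumor_vector_alone"])
liver_fold = fold_change(readouts["liver_system"], readouts["liver_vector_alone"])
# Target versus non-target expression when the full system is administered.
selectivity = fold_change(readouts["tumor_system"], readouts["liver_system"])

print(tumor_fold, liver_fold, round(selectivity, 1))
```

In this made-up example the system raises expression 50-fold in the target tissue while the non-target tissue stays close to the vector-alone level, giving a target-to-non-target ratio of about 42-fold.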
- the disclosure provides use of the non-viral system described herein, the LNP described herein, the cell described herein, or the pharmaceutical composition described herein for treating a disease or disorder in a subject.
- the disclosure provides use of the non-viral system described herein, the LNP described herein, the cell described herein, or the pharmaceutical composition described herein in the manufacture of a medicament for treating a disease or disorder in a subject.
- the disclosure provides use of a non-viral system described herein, an LNP described herein, a cell described herein, or a pharmaceutical composition described herein for selectively expressing a transgene in a target tissue and/or target cell population in a subject.
- the disclosure provides use of a non-viral system described herein, an LNP described herein, a cell described herein, or a pharmaceutical composition described herein in the manufacture of a medicament for treating a disease or disorder in a subject.
- the disclosure provides a kit comprising a container comprising the non-viral system described herein, the LNP described herein, the cell described herein, or the pharmaceutical composition described herein, and a package insert comprising instructions for contacting a cell with the system, the LNP, the cell, or the pharmaceutical composition for increasing expression of a transgene.
- the contacting is ex vivo. In some embodiments, the contacting is in vivo.
- the disclosure provides a kit comprising a container comprising the non-viral system described herein, the LNP described herein, the cell described herein, or the pharmaceutical composition described herein, and a package insert comprising instructions for administering the system, the LNP, the cell, or the pharmaceutical composition to a subject for treating a disease or disorder.
- the disclosure provides a kit comprising a non-viral system described herein, an LNP described herein, or a pharmaceutical composition described herein, and a package insert comprising instructions for administering the system, the LNP, or the pharmaceutical composition to a subject for selectively expressing a transgene in a target tissue and/or target cell population.
- FIG. 1 shows a graphical representation of cellular expression of a transgene following transfection with an exemplary hybrid mRNA/DNA nonviral gene expression system of the disclosure.
- mRNA comprising an open reading frame encoding an EBNA1 polypeptide described herein (1.04) and a non-viral vector (1.05) encoding a DNA binding element (1.06) and transgene are formulated as a lipid nanoparticle (LNP) (1.07) and introduced to a target cell (1.08).
- the exemplary EBNA1 polypeptide encoded by the mRNA comprises one or more chromatin binding domains (1.01) operably linked (1.03) to an EBNA1 DNA binding domain (DBD) (1.02).
- EBNA1 mRNA is translated into EBNA1 protein (1.09).
- Exemplary functions of EBNA1 protein include binding to the DNA binding element (1.10), association with chromosomal DNA during S-phase of the cell cycle (1.11-1.12), increasing nuclear retention following mitosis (1.13), and increasing transgene expression.
- FIG. 2A shows average peak luminescence (four highest values) for HepG2 liver cancer cells transfected in vitro with mRNA encoding EBNA1 polypeptide with a Gly-Ala repeat region truncation (“C211” or “EBNA1 mRNA”) and one of two different plasmid DNA (pDNA) constructs encoding luciferase and having an OriP (“C198” or “C220”) alone or in the presence of the mitosis inhibitor aphidicolin (“Aph”).
- mRNA and pDNA were co-formulated in an LNP. Control cells were treated with LNP-formulated pDNA (C198 or C220) alone or with neither mRNA nor pDNA.
- FIGs. 2B-2E provide graphs showing average peak luminescence for HepG2 (FIG. 2B), A375 (FIG. 2C), CT26 (FIG. 2D), or H1299 (FIG. 2E) cells transfected in vitro with EBNA1 C211 mRNA and luciferase-encoding pDNA having an OriP FR (FR+ pDNA) (C673) co-formulated in an LNP. Control cells were transfected with LNP-formulated C673 pDNA alone. Cells were transfected in media alone or media containing Aph to model dividing and non-dividing cells, respectively.
- FIG. 3 shows the average in vivo luminescence at day 1 and day 3 in vitro following administration of EBNA1 mRNA (C211) and C198 co-formulated as an LNP. Positive control was mRNA encoding luciferase (C605) and formulated as an LNP or in MessengerMax transfection reagent. C198 was administered alone formulated as an LNP or in lipofectamine.
- FIG. 4 shows the difference in luciferase expression between Balb/c mice or C57bl/6 mice intravenously administered with EBNA1 mRNA (C211)/C198 or Control mRNA/C198 formulated as an LNP, or saline control.
- FIGs. 5A-5B show a time series of luminescence (RLU) by each specified condition (at days following transfection of HepG2 cells with LNP co-formulated EBNA1 mRNA (C211) and C198 pDNA, control mRNA and C198 pDNA, or C198 pDNA only). Conducted with control Cre mRNA where specified. Data were plotted as average luminescence for peak concentrations (four replicates) on a log RLU scale (FIG. 5A), and maximum luminescence (single point, maximum value per condition) on a linear RLU scale (FIG. 5B).
- FIGs. 6A-6B show a time series of luminescence (RLU) by each specified condition (at days following transfection of HepG2 cells with the LNPs described in FIGs. 5A-5B followed by a wash step). LNPs were allowed to incubate with target cells overnight prior to the wash step. Data were plotted as average luminescence for peak concentration (two replicates) on a log RLU scale (FIG. 6A), and maximum luminescence (single point, maximum value per condition) on a linear RLU scale (FIG. 6B).
- FIG. 7 shows luminescence (RLU) by condition 24 hours after transfection in the non-small cell lung cancer (NSCLC) cell line H1299.
- FIG. 8 shows HiBit tag luminescence (RLU) by condition 5 days after transfection across four separate cell lines (HepG2, A431, Hep3B, and H1299).
- FIG. 9 shows tumor bioluminescence at days 1-28 after intratumoral injection of luciferase pDNA in NCG mice bearing HepG2 tumors. Mice received an LNP formulation of C211 EBNA1 mRNA and C198 pDNA or control mRNA and C198 pDNA. Control mice received no injection.
- FIG. 10 shows the average luminescence dose-response curve 24 hours post-transfection with the fusion construct C558 (mRNA encoding a fusion protein comprising a chromatin binding domain (CBD) of the latency associated nuclear antigen (LANA) fused to the EBNA1 DBD).
- Control cells were transfected with C198 pDNA only.
- FIGs. 11A-11B show peak luminescence as fold-change relative to control (FIG. 11A) and by concentration (FIG. 11B) at 24 hours following lipofectamine-mediated transfection with EBNA1 mRNA (C211) or fusion construct mRNAs and OriP+ luciferase-encoding pDNA (C198).
- Fusion construct mRNAs encoded a fusion protein comprising (i) EBNA1 amino acid residues 1-15, a LANA CBD, and an EBNA1 DBD (C558), (ii) a LANA CBD and EBNA1 DBD (C581), (iii) an IL33 AT-hook CBD and EBNA1 DBD (C569), (iv) a histone H1 CBD and EBNA1 DBD (C577), or (v) an E2 CBD and EBNA1 DBD (C587). Control cells were transfected with pDNA alone.
- FIG.11C shows peak luminescence at 24 hours and 6 days post-transfection of HepG2 cells with EBNA1 mRNA (C211) or fusion construct mRNAs (C558, C569, C577, C581, or C587) co-administered with OriP+luciferase pDNA (C198), relative to DNA alone.
- FIG. 12A provides a graph showing luciferase expression in the tumor of xenograft mice bearing HepG2 tumors over time following a single intratumoral injection of an LNP formulation containing EBNA1 mRNA (C211) and luciferase-encoding pDNA having an OriP FR (FR+ pDNA) (C673), C673 pDNA only, or control mRNA encoding luciferase (C605). Luciferase expression was measured using bioluminescence imaging and represented as radiance (photons/sec/cm²/steradian) averaged across mice in each cohort.
- FIG. 12B provides representative bioluminescent images of mice described in FIG. 12A.
- FIGs. 12C-12D provide graphs showing average bioluminescent signal in the tumor or liver of mice described in FIG.12A (FIG.12C) or the ratio of tumor to liver signal of the same mice (FIG.12D).
- FIGs. 13A-13B provide graphs showing peak luminescence in HepG2 cells (FIG.13A) or H1299 cells (FIG.13B) transfected in vitro with pDNA containing a FR+ repeat array and encoding luciferase under control of a CMV promoter (C673) or an alpha fetoprotein (AFP) promoter (C780) and EBNA1 mRNA (C211) formulated in lipofectamine.
- FIG. 14A provides a graph showing luciferase expression in the tumor of xenograft mice bearing HepG2 tumors following a single intratumoral injection of an LNP formulation containing EBNA1 mRNA (C211) and luciferase encoding FR+ pDNA under control of an alpha fetoprotein (AFP) promoter (C780). Luciferase expression was measured as in FIG.12A.
- FIG.14B provides representative bioluminescent images of the mice described in FIG.14A.
- FIG. 15 provides a graph showing tumor volume over time in B16F10-tumor bearing mice administered a single intratumoral injection of an LNP formulation containing EBNA1 mRNA (C211) and a combination of FR+ pDNAs encoding murine single-chain IL-12 (scIL12), interferon- ⁇ (IFN ⁇ ), granulocyte-macrophage colony-stimulating factor (GM-CSF), and IL-15 sushi.
- FIGs. 16A-16D provide graphs showing in vitro luciferase expression in HepG2 (FIG. 16A), H1299 (FIG. 16B), CT26 (FIG. 16C), or A375 (FIG. 16D) tumor cells at day 5 following transfection with an LNP formulation containing EBNA1 mRNA (C211) and luciferase-encoding FR+ pDNA (C673) or EBNA1 mRNA (C211) and luciferase-encoding FR+ linear, close-ended DNA.
- Control cells were transfected with an LNP-formulation containing pDNA only or linear DNA only.
- FIG. 17A provides a graph showing in vitro luciferase expression in H1299 cells at 1 and 5 days following lipofectamine-mediated transfection with luciferase-encoding FR+ pDNA (C673) and mRNA encoding EBNA1 (C211) or an mRNA encoding an EBNA1 variant (EBNA1 having a deletion of the transactivation (TA) domain (C804; “EBNA1 ΔTA domain”); deletion of the TA domain and NLS (C805; “EBNA1 ΔNLS-TA domain”); deletion of the NLS (C807; “EBNA1 ΔNLS”); or substitution of the NLS with a c-myc NLS (C806; “EBNA1 C-Myc NLS”)).
- FIG. 17B provides a graph showing in vitro luciferase expression in HepG2 cells at 1 and 5 days following lipofectamine-mediated transfection with luciferase-encoding FR+ pDNA (C673) and mRNA encoding EBNA1 (C211) or mRNA encoding an EBNA1 variant (EBNA1 having a deletion of a putative autoantigen motif (C778) or portion thereof (C779)). Control cells were transfected with pDNA (C673) only.
- FIG.18 provides representative histology images of livers harvested from mice at three days following administration of a single intravenous injection of an LNP formulation containing mRNA encoding a minimal EBNA1 (C804) and FR+ pDNA encoding GFP (C983) or mRNA encoding GFP.
- FIGs. 19A-19C provide graphs showing in vitro luciferase expression in HepG2 (FIG. 19A), A375 (FIG. 19B), and Hepa1-6 (FIG.
- FIGs. 20A-20B provide graphs showing in vitro luciferase expression in HepG2 (FIG. 20A) and H1299 (FIG. 20B) cells at 1 day following transfection with an LNP formulation containing an EBNA1 mRNA (C211) and luciferase-encoding pDNA having a full-length OriP (C590) or OriP FR (C608). Control cells were transfected with LNP-formulated pDNA (C608 or C590) only.
- FIGs.22A-22B provide graphs showing in vitro luciferase expression in H1299 (FIG. 22A) and HepG2 (FIG.21B) cells at 1 day following transfection with an LNP formulation or lipofectamine formulation of EBNA1 mRNA (C211) and luciferase-encoding FR+ pDNA (C608). Control cells received pDNA only formulated in an LNP or lipofectamine.
- FIG. 23 provides a graph showing in vivo tumor luminescence in subcutaneous A375 human melanoma tumors engrafted on NCG mice, following intratumoral administration of LNPs encapsulating an FR+ pDNA encoding luciferase and mRNA encoding either C804 EBNA1 ΔTA domain (“IT Delta TA”) or C211 EBNA1 (“IT EBN WT”). The control received no injection (“NA Neg Ctrl”).
- FIGs. 24A-24B show luminescence by condition 5 days after transfection by lipofectamine in the HepG2 (FIG. 24A) or H1299 (FIG. 24B) cell lines.
- FIGs. 25A-25D show luminescence by condition 5 days after transfection by lipofectamine in the HepG2 (FIG. 25A and FIG. 25C) and H1299 (FIG. 25B and FIG. 25D) cell lines.
- Cells were transfected with EBNA1 mRNA (except control condition) and pDNAs encoding luciferase and containing one or more repeat arrays of an OriP DBE.
- FIGs. 26A-26D provide graphs showing in vitro luciferase expression in HepG2 (FIG.26A), Hep3B (FIG.26B), Hepa1-6 (FIG.26C), or B16F10 (FIG.26D) cells at days 1, 3, and 5 following transfection with LNP encapsulating mRNA encoding EBNA1 (mC1036) or an EBNA1 homolog derived from baboon LCV (mC1033), rhesus LCV (mC1037), arctoides LCV (mC1061), cynomolgus LCV (mC1062), or gorilla/rhesus LCV (mC1068) and pDNA encoding a luciferase transgene and
- FIGs. 27A-27C provide graphs showing in vitro luciferase expression in HepG2 (FIG. 27A), Hep3B (FIG. 27B), or Hepa1-6 (FIG.
- FIGs. 28A-28B provide graphs showing in vitro luciferase expression in A375 (FIG. 28A) or H1299 (FIG. 28B) cells at days 1, 3, and 5 following transfection with LNP encapsulating (i) truncated EBNA1 mRNA (mC211) and luciferase-encoding pDNA having an EBV OriP FR (pC673); (ii) mRNA encoding an EBNA1 homolog from arctoides LCV (mC1061) and pC673; or (iii) mRNA encoding an EBNA1 homolog from marmoset LCV (mC1035) and luciferase-encoding pDNA having a marmoset LCV OriP FR (pC1072).
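Several of the figure descriptions above report "average peak luminescence (four highest values)". A minimal sketch of one way such a summary could be computed is given below; the function name, the default of four values, and the sample readings are assumptions for illustration and are not taken from the disclosure's data.

```python
def average_peak(readings: list[float], n_peak: int = 4) -> float:
    """Average the n_peak highest luminescence readings for one condition.

    `readings` is a hypothetical list of RLU values; the default of four
    follows the "four highest values" wording of the figure description.
    """
    if n_peak <= 0:
        raise ValueError("n_peak must be positive")
    if len(readings) < n_peak:
        raise ValueError("not enough readings to average the peak values")
    # Sort in descending order and keep only the highest n_peak values.
    top_values = sorted(readings, reverse=True)[:n_peak]
    return sum(top_values) / n_peak


# Illustrative readings only (arbitrary units).
example_readings = [10.0, 40.0, 30.0, 20.0, 50.0, 5.0]
example_average = average_peak(example_readings)
```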
- Non-viral expression systems comprising (i) a DNA binding protein (e.g., as a polypeptide or a nucleic acid or recombinant expression vector encoding the polypeptide) comprising a DNA binding domain (DBD) and a chromatin binding domain, and (ii) a recombinant expression vector comprising a transgene and a DNA binding polynucleotide comprising a DNA binding element (DBE) that binds the DBD, wherein the non-viral expression system provides an enhanced level and/or duration of expression of the transgene when introduced to a cell as compared to introducing the recombinant expression vector alone.
- the DNA binding protein mediates increased nuclear uptake of the recombinant expression vector and/or tethering of the recombinant expression vector to chromatin during mitosis, thereby retaining the recombinant expression vector in the nucleus and enabling enhanced transgene expression.
- a non-viral recombinant expression vector comprising at least one transgene is enhanced when engineered to comprise a polynucleotide comprising one or more Epstein Barr virus (EBV) DNA binding elements (DBEs) and introduced with an mRNA encoding a polypeptide comprising one or more chromatin binding domains and at least one EBNA1 DNA binding domain (DBD).
- polypeptides of the disclosure comprising at least one EBNA1 DBD are referred to herein as “DNA binding proteins.”
- once introduced to a cell, the DNA binding protein is expressed and mediates nuclear uptake, episomal replication, and/or transactivation of the non-viral vector, thereby enhancing the level and duration of expression of the transgene as compared to introduction of the non-viral vector alone.
- while the system is amenable to expression of the DNA binding protein using a DNA vector, such systems are potentially deleterious due to constitutive expression of the DNA binding protein that creates a risk of altering the cellular state (e.g., via transformation to a malignant state).
- An advantage of the systems described herein is the use of an mRNA to encode the DNA binding protein, which allows for expression of the DNA binding protein for a sufficient duration to achieve the desired outcome (i.e., high level of expression of the transgene for an extended duration), without risk of constitutive expression.
- the desired outcome was substantially equivalent or even improved if the system combined the non-viral recombinant expression vector with an mRNA encoding an EBNA1 fusion protein comprising an EBNA1 DBD and one or more heterologous chromatin binding domain as compared to an mRNA encoding a full-length EBNA1 protein or fragment thereof.
- while exemplary systems of the disclosure were shown to increase expression of a transgene in non-dividing cells (e.g., dividing cells contacted with a mitosis inhibitor) as compared to the non-viral vector alone, the greatest increase was observed in dividing cells.
- active cell division e.g., mitotic cells
- a recombinant expression vector introduced to non-dividing cells is excluded from the nucleus, whereas in dividing cells, which have a disassembled nuclear envelope, the recombinant expression vector is able to associate with chromosomal DNA to undergo processing.
- the recombinant expression vector is preferentially expressed in dividing cells as compared to non-dividing cells. As shown herein, this effect is substantially enhanced by co-introducing the recombinant expression vector with an mRNA encoding a DNA binding protein described herein capable of binding to one or more EBV DBEs present on the vector.
- a system of the disclosure e.g., a system comprising an mRNA encoding an EBNA1 polypeptide and a recombinant expression vector comprising a polypeptide-encoding transgene and a polynucleotide comprising one or more DBEs of an EBV OriP
- the level and duration of expression of a polypeptide-encoding transgene was increased in the tumor as compared to that in control subjects that received the recombinant expression vector alone or an mRNA encoding the polypeptide.
- In contrast to control subjects administered either the recombinant expression vector or the mRNA, in which levels of expression in tumor were respectively low for an extended duration and high for a short duration, the test subjects demonstrated a level of expression in tumor that was high for an extended duration.
- transgene expression in the test subjects was preferentially localized to tumor tissue, with levels of transgene expression in non-tumor tissue (e.g., liver tissue) substantially equivalent to background (e.g., as compared to the same tissue in a control or untreated subject). This result was observed whether the system was administered by intratumoral injection or systemic (e.g., intravenous) injection.
- the systems of the disclosure are effective for providing a level of expression in cells undergoing cell division that is enhanced as compared to that observed in quiescent and/or non-dividing cells, and as such, are amenable to selective expression following in vivo administration (e.g., systemic administration) in tissues comprising cells undergoing cell division as compared to tissues substantially comprising quiescent and/or non-dividing cells.
- tumor tissue is composed of rapidly dividing cells (e.g., rapidly dividing tumor cells), whereas non-tumor tissue substantially comprises quiescent and/or non-dividing cells (e.g., liver tissue).
- The present disclosure is further based, at least in part, on identification of components of the DNA binding protein encoded by the mRNA of the system and the polynucleotide comprising one or more DBEs of an EBV OriP presented by the recombinant expression vector that yield the desired enhancement in expression when introduced to a cell as compared to the recombinant expression vector introduced alone.
- an mRNA encoding (i) a fusion protein comprising one or more non-EBNA1 (“heterologous”) chromatin binding domains and an EBNA1 DBD, or (ii) a variant of an EBNA1 polypeptide comprising one or more EBNA1 chromatin binding domains, an EBNA1 DBD, and at least one modification (e.g., deletion, insertion, and/or substitution) of the NLS, TA domain, and/or Gly-Arg repeat region, was effective for increasing expression of a transgene present on a recombinant expression vector comprising one or more DBEs of an EBV OriP.
- an encoded variant of an EBNA1 polypeptide having a modified (e.g., deleted) TA domain was effective for achieving this outcome.
- a variant of an EBNA1 polypeptide comprising a modified (e.g., deleted) TA domain has the further desirable benefit of reduced immunogenicity when encoded by an mRNA administered to a subject (e.g., a human subject), as the TA domain comprises one or more antigens that are thought to trigger a deleterious immune response.
- a recombinant expression vector comprising an FR region of an EBV OriP, but lacking the DS region, was shown to yield the desired increase in transgene expression when introduced to a cell with an mRNA encoding a DNA binding protein described herein.
- the DS region present in the EBV genome has been implicated in plasmid replication.
- a system of the disclosure comprising a recombinant expression vector lacking the DS region does not undergo replication, thereby providing an improved safety profile that is desirable in certain in vivo applications by reducing the risk of the recombinant expression vector replicating and persisting indefinitely.
- the present disclosure provides a non-viral system for enhancing expression of a transgene, the system comprising an mRNA encoding a DNA binding protein and a non-viral recombinant expression vector comprising a transgene and a polynucleotide comprising one or more EBV DBEs.
- the DNA binding protein comprises at least one EBNA1 DBD and one or more chromatin binding domains.
- the DNA binding protein comprises at least one EBNA1 DBD and one or more EBNA1 chromatin binding domains.
- the DNA binding protein comprises a full-length EBNA1 polypeptide.
- the DNA binding protein comprises a truncated EBNA1 polypeptide. In some embodiments, the DNA binding protein comprises at least one EBNA1 DBD and one or more heterologous chromatin binding domains described herein.
- As described herein, a DNA binding protein comprising an EBNA1 DBD effectively achieves the desired cellular function(s) to yield increased and extended expression of a transgene-encoding recombinant expression vector comprising an EBV DBE as compared to the recombinant expression vector alone.
- such systems are susceptible to diminished efficacy in individuals previously exposed to EBV as a result of an immune response against EBNA1, such as a memory T cell response against peptide antigens present in EBNA1.
- EBNA1 immune response against EBNA1 or a portion thereof (e.g., a DBD thereof)
- the immune response induced by re-exposure to EBNA1 or a portion thereof has the potential to trigger a T cell-mediated response against cells transfected with the EBNA1 or portion thereof (e.g., a DBD thereof), potentially resulting in undesirable cell death of otherwise healthy transfected cells.
- systems comprising a DNA binding protein and recombinant expression vector, wherein the DNA binding protein comprises a DBD from an EBNA1 homolog, wherein the EBNA1 homolog is derived from an LCV infecting an NHP host and the recombinant expression vector comprises a DBE from EBV or from the same LCV, result in enhanced transgene expression compared to the recombinant expression vector alone, despite the EBNA1 homolog having relatively low sequence homology to EBNA1 (e.g., about 50% to about 65% sequence identity to EBNA1).
- a non-viral expression system of the disclosure comprising a DNA binding protein comprising a DBD of an EBNA1 homolog, wherein the EBNA1 homolog is derived from an LCV infecting an NHP of the family Hominoidea or Cercopithecoidea (also known as apes or Old World monkeys respectively; see, e.g., representative species listed in Table 22), effectively increased expression of a transgene-encoding recombinant expression vector comprising an array of EBV DBEs in a manner comparable to a control system comprising the recombinant expression vector and a DNA binding protein comprising an EBNA1 DBD.
- expression of a transgene-encoding recombinant expression vector was increased in the presence of a DNA binding protein comprising a DBD of an EBNA1 homolog, wherein the EBNA1 homolog is derived from an LCV infecting an NHP of the parvorder Platyrrhini (also known as New World monkeys; e.g., Callithrix jacchus, also known as the common marmoset) and the recombinant expression vector comprises an array of DBEs derived from the genome of the same LCV.
- a system comprising a DNA binding protein of the disclosure (e.g., a DNA binding protein comprising a chromatin binding domain and a DBD of an EBNA1 homolog described herein) functions to increase expression of a recombinant expression vector of the disclosure (e.g., a recombinant expression vector comprising a transgene and an array of EBV DBEs or an array of DBEs from an NHP LCV), while reducing the risk of an EBNA1-associated immune response following in vivo administration due to the complete or partial absence of EBNA1 T cell epitopes in the DNA binding protein (e.g., one or more of the EBNA1 T cell epitopes listed in Table 26).
- the reduced immune recognition of a system of the disclosure in an individual previously exposed to EBV enables the system to be administered to the individual to achieve increased and durable transgene expression, without the risk of an EBV-associated immune response directed to transfected cells that express components of the system (e.g., transfected cells expressing the DNA binding protein).
- the present disclosure provides a non-viral system for enhancing expression of a transgene, the system comprising a DNA binding protein, or a nucleic acid or a recombinant expression vector encoding the DNA binding protein, and a non-viral recombinant expression vector comprising a transgene and a DNA binding polynucleotide, wherein the DNA binding protein comprises a DBD of an EBNA1 homolog of an NHP LCV, or a variant thereof, and wherein the DNA binding polynucleotide comprises a DBE of EBV or an NHP LCV.
- the DNA binding protein comprises a DBD of an EBNA1 homolog or a variant thereof, wherein the EBNA1 homolog is of an NHP LCV.
- the NHP is of the family Hominoidea. In some embodiments, the NHP is an ape. In some embodiments, the NHP is of the family Cercopithecoidea. In some embodiments, the NHP is an Old World monkey. In some embodiments, the NHP is of the parvorder Platyrrhini. In some embodiments, the NHP is of the family Callitrichidae. In some embodiments, the NHP is of the family Cebidae. In some embodiments, the NHP is of the family Aotidae.
- the NHP is of the family Pitheciidae. In some embodiments, the NHP is of the family Atelidae. In some embodiments, the NHP is a New World monkey.
- the DNA binding protein further comprises a chromatin binding domain. In some embodiments, the chromatin binding domain is an EBNA1 chromatin binding domain. In some embodiments, the chromatin binding domain is an NHP LCV chromatin binding domain. In some embodiments, the chromatin binding domain is a heterologous chromatin binding domain described herein.
- In some embodiments, the DNA binding polynucleotide comprises a DBE of EBV (e.g., 1, 2, 3, 4, 5, or more EBV DBEs).
- the DNA binding polynucleotide comprises an array of EBV DBEs. In some embodiments, the DNA binding polynucleotide comprises a DBE of an NHP LCV (e.g., 1, 2, 3, 4, 5, or more NHP LCV DBEs). In some embodiments, the DNA binding polynucleotide comprises an array of NHP LCV DBEs.
- the system comprises a nucleic acid encoding the DNA binding protein. In some embodiments, the system comprises an mRNA encoding the DNA binding protein. In some embodiments, the system comprises a recombinant expression vector encoding the DNA binding protein. In some embodiments, the system comprises the DNA binding protein as a polypeptide.
- the disclosure provides a delivery vehicle comprising a system described herein.
- the delivery vehicle comprises one or more LNPs.
- the DNA binding protein and the transgene-encoding recombinant expression vector are co-formulated as an LNP, wherein the DNA binding protein is a polypeptide or encoded by a nucleic acid or recombinant expression vector.
- the DNA binding protein and the transgene-encoding recombinant expression vector are formulated as separate LNPs.
- the DNA binding protein is encoded by an mRNA.
- the mRNA and non-viral recombinant expression vector are co-formulated as an LNP.
- the disclosure provides a method of increasing an expression level of a transgene in a cell, comprising introducing to the cell a system described herein.
- the cell is a dividing cell.
- the cell is a non-dividing cell.
- the disclosure provides an in vivo method for increasing expression of a transgene in a target cell population and/or target tissue in a subject, comprising administering to the subject a system described herein or a delivery vehicle comprising the system described herein.
- the target cell population comprises tumor cells.
- the target tissue comprises cancerous tissue.
- expression of the transgene is enhanced in the target cell population and/or target tissue as compared to a non-target cell population and/or non-target tissue.
- the disclosure provides a method for treating a disease or disorder in a subject comprising administering to the subject a system described herein or a delivery vehicle comprising the system described herein.
- the subject has cancer.
- the administering comprises intratumoral injection.
Definitions
- the term “a” or “an” refers to one, or more than one, of that entity. In some embodiments, “a” refers to plural referents.
- a “DNA binding protein” refers to a polypeptide comprising at least one DNA binding domain (DBD), wherein the DBD comprises an amino acid sequence that binds to a target DNA (e.g., a single- or double-stranded DNA).
- the DNA binding protein comprises a DNA binding domain (DBD) and a chromatin binding domain, wherein the DBD comprises an amino acid sequence that binds to a target DNA (e.g., a single- or double-stranded DNA).
- the amino acid sequence binds to the target DNA with a binding affinity sufficient to achieve binding under physiological conditions in a mammalian cell.
- the binding affinity is micromolar or lower (e.g., 10⁻⁶ M to 10⁻⁹ M).
- the binding affinity is nanomolar or lower (e.g., less than 10⁻⁹ M).
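The two affinity ranges recited above can be expressed as a simple classification. The sketch below assumes the binding affinity is reported as a dissociation constant in molar units, which the text does not state; the function name and the label for values outside both ranges are hypothetical and used only for illustration.

```python
def classify_affinity(kd_molar: float) -> str:
    """Map a dissociation constant (in M, an assumed convention) to a range.

    Ranges follow the examples in the text:
      - less than 1e-9 M   -> "nanomolar or lower"
      - 1e-9 M to 1e-6 M   -> "micromolar or lower"
    Anything weaker than 1e-6 M falls outside both recited ranges.
    """
    if kd_molar <= 0:
        raise ValueError("kd_molar must be positive")
    if kd_molar < 1e-9:
        return "nanomolar or lower"
    if kd_molar <= 1e-6:
        return "micromolar or lower"
    return "weaker than micromolar"


# Illustrative values only.
example_labels = [classify_affinity(kd) for kd in (5e-10, 5e-8, 1e-4)]
```

A smaller dissociation constant corresponds to tighter binding, which is why "lower" in the text maps to the smaller numeric values here.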
- the target DNA is a DNA binding polynucleotide described herein.
- the target DNA is a polynucleotide comprising an array of DBEs, wherein the DBE comprises a sequence that binds to the DBD.
- chromatin binding domain refers to an agent that associates with, binds to, and/or localizes to chromatin, chromatin-associated structures, the nuclear lamina, and/or the nuclear matrix.
- DNA binding polynucleotide refers to a polynucleotide comprising an array of sequence elements, wherein each sequence element of the array comprises a DBE described herein or a variant thereof or a fragment thereof.
- the term “DNA binding polynucleotide” is used interchangeably herein with the term “polynucleotide comprising one or more DBEs” or “polynucleotide comprising a DBE.”
- the polynucleotide comprises an array of DBEs (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, or more DBEs).
- polynucleotide refers to a polymer of nucleotides/nucleosides joined together, e.g., by a phosphodiester linkage between 5′ and 3′ carbon atoms.
- internucleoside linkage refers to a linkage joining one nucleotide/nucleoside unit of a polynucleotide to another nucleotide/nucleoside unit.
- the linkage is a phosphodiester linkage.
- the linkage is a modified linkage, e.g., a phosphorothioate linkage.
- sequence element refers to a segment of nucleotides/nucleosides in a polynucleotide, wherein the segment or a portion thereof comprises a biological activity/function.
- the biological activity/function comprises binding to a DNA binding protein described herein.
- a sequence element is at least 5 nucleotides/nucleosides in length and up to 100 nucleotides/nucleosides in length.
- the sequence element comprises a DBE that binds to a DNA binding protein described herein.
- nucleoside refers to a molecule comprising a purine or pyrimidine base covalently linked to a ribose or deoxyribose sugar (e.g., adenosine, guanosine, cytidine, uridine, and thymidine).
- nucleotide refers to a nucleoside comprising one or more phosphate groups joined in ester linkages to the sugar moiety (e.g., nucleoside monophosphates, diphosphates, and triphosphates).
- the term “array” refers to a tandem arrangement of two or more sequence elements, wherein the two or more sequence elements are operably linked (e.g., by a phosphate linkage, or analog thereof, or by a spacer sequence).
- the array comprises at least about 2, at least about 3, at least about 4, at least about 5, at least about 6, at least about 7, at least about 8, at least about 9, or at least about 10 sequence elements.
- the sequence elements in the array have the same sequence. In some embodiments, the sequence elements in the array do not have the same sequence.
- an array comprises a tandem arrangement of two or more DBEs, wherein the two or more DBEs are operably linked (e.g., by a phosphate linkage, or analog thereof, or by a spacer sequence).
- the array comprises at least about 2, at least about 3, at least about 4, at least about 5, at least about 6, at least about 7, at least about 8, at least about 9, or at least about 10 DBEs.
- the DBEs in the array have the same sequence. In some embodiments, the DBEs in the array do not have the same sequence. In some embodiments, a portion of (e.g., about 20% or more) or a majority of (i.e., 50% or more) the DBEs in the array have the same sequence. In some embodiments, a portion of (e.g., about 20% or more) or a majority of (i.e., 50% or more) the DBEs in the array do not have the same sequence.
- each DBE is selected from a DBE of an EBV OriP described herein, a fragment thereof, and a variant thereof.
- the DBE is selected from a DBE of an NHP LCV described herein, a fragment thereof, and a variant thereof.
- the DBEs in the array are arranged in a manner (e.g., having a number, orientation, sequence similarity, and/or spacing) that results in a polynucleotide that binds to the DNA binding protein, e.g., as determined by a method of measuring binding interactions described herein.
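The array definition above can be sketched in a few lines of Python. This is a minimal illustration only: the function names `build_array` and `fraction_sharing_most_common` are hypothetical, the sequences are placeholders rather than sequences from the disclosure, and joining elements with an optional spacer string is an assumed way to model "operably linked."

```python
from collections import Counter


def build_array(elements, spacer=""):
    """Join two or more sequence elements in tandem, optionally separated by a spacer."""
    if len(elements) < 2:
        raise ValueError("an array needs two or more sequence elements")
    return spacer.join(elements)


def fraction_sharing_most_common(elements):
    """Return the fraction of elements identical to the most common element sequence."""
    counts = Counter(elements)
    return counts.most_common(1)[0][1] / len(elements)


# Placeholder sequences for illustration only (hypothetical, not from the disclosure).
dbes = ["ACGTACGTAC", "ACGTACGTAC", "ACGTTCGTAC", "ACGTACGTAC"]
array_seq = build_array(dbes, spacer="TT")
shared = fraction_sharing_most_common(dbes)
```

Here three of the four placeholder elements share one sequence, so `shared` falls in the "majority (50% or more)" case described above. Whether such an array binds the DNA binding protein is an experimental question that this sketch does not address.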
- the term “DNA binding element” (DBE) refers to a polynucleotide sequence that binds to a DNA binding protein described herein, e.g., under physiological conditions.
- a fragment of the DBE is understood to mean a sequence shorter than the DBE (e.g., a truncation of the DBE comprising one or more deletions at the 5′ end, the 3′ end, and/or an internal region) that retains binding to the DNA binding protein (e.g., retains substantially equivalent binding to the DNA binding protein as compared to the DBE).
- the fragment of the DBE comprises one or more deletions at the 5′ end, the 3′ end, and/or an internal region.
- a “variant of the DBE” is understood to mean a sequence comprising one or more mismatches relative to the DBE that retains binding to the DNA binding protein (e.g., retains substantially equivalent binding to the DNA binding protein as compared to the DBE).
- the variant of the DBE comprises 1, 2, 3, 4, 5, or more mismatches relative to the DBE.
- the variant comprises 1 or 2 mismatches relative to the DBE.
- the variant comprises at least about 70%, about 75%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to the DBE. In some embodiments, the variant comprises at least about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98%, or about 99% identity to the DBE.
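The sequence criteria for a DBE variant (number of mismatches and percent identity relative to the DBE) can be checked as sketched below. The sketch assumes an ungapped, equal-length comparison; the function names and both sequences are hypothetical placeholders. It covers only the sequence criteria; retained binding to the DNA binding protein would still have to be measured.

```python
def count_mismatches(reference, candidate):
    """Count positions that differ between two equal-length sequences."""
    if len(reference) != len(candidate):
        raise ValueError("this sketch only compares equal-length sequences")
    return sum(1 for r, c in zip(reference, candidate) if r != c)


def percent_identity_ungapped(reference, candidate):
    """Percent of positions that are identical in an ungapped comparison."""
    mismatches = count_mismatches(reference, candidate)
    return 100.0 * (len(reference) - mismatches) / len(reference)


# Placeholder 20-nucleotide sequences (hypothetical, not from the disclosure).
reference_dbe = "ACGTACGTACGTACGTACGT"
candidate = "ACGTACGAACGTACGTACCT"

mismatches = count_mismatches(reference_dbe, candidate)
identity = percent_identity_ungapped(reference_dbe, candidate)
```

With these placeholders the candidate has 2 mismatches and 90% identity, so it would meet both the "1 or 2 mismatches" and the "at least about 90% identity" sequence thresholds.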
- the term “EBV DBE,” interchangeably used with “DBE of EBV,” refers to a repeating sequence element present in an origin of replication (OriP) of the EBV genome. Representative repeating sequence elements present in the OriP of the EBV genome are set forth in Table 8.
- the repeating sequence element is a DBE that binds to a DNA binding protein described herein (e.g., a DNA binding protein comprising a DBD of EBNA1 or a DBD of an EBNA1 homolog).
- the terms “EBV DBE” and “DBE of EBV” are used interchangeably herein with the terms “OriP DBE” and “EBV OriP DBE.” Methods for measuring binding of a repeating sequence element to a DNA binding protein are further described herein.
- a “consensus sequence” refers to a sequence having the most frequent residues found at each position of a sequence alignment. Methods to generate a consensus sequence are known in the art.
- a consensus sequence is generated by aligning a series of related sequences, determining the frequency of each nucleobase occurring at each position of the alignment, and selecting the nucleobase that occurs most frequently at each position.
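The consensus procedure described above (align, count nucleobases per position, take the most frequent) can be sketched as follows. This is a minimal sketch that assumes the input sequences are already aligned to equal length; the function name `consensus` and the input sequences are hypothetical, and ties are resolved by whichever nucleobase appears first in the column, which is an arbitrary choice made here.

```python
from collections import Counter


def consensus(aligned_sequences):
    """Return the most frequent residue at each column of pre-aligned sequences."""
    lengths = {len(seq) for seq in aligned_sequences}
    if len(lengths) != 1:
        raise ValueError("sequences must be non-empty in number and aligned to equal length")
    result = []
    for column in zip(*aligned_sequences):
        counts = Counter(column)
        # most_common(1) returns the highest count; ties keep first-seen order.
        result.append(counts.most_common(1)[0][0])
    return "".join(result)


# Placeholder aligned sequences (hypothetical, not from the disclosure).
example = consensus(["ACGT", "ACGA", "TCGT"])
```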
- an “OriP DBE consensus sequence” refers to a sequence having the most frequent residues found at each position of a sequence alignment formed from repeating sequence elements present in the OriP of the EBV genome.
- the repeating sequence elements are set forth in SEQ ID NOs: 52-60.
- the OriP DBE consensus sequence is set forth in SEQ ID NO: 51.
- a “fragment of an OriP DBE consensus sequence” refers to a sequence shorter than the OriP DBE consensus sequence (e.g., a truncation of the OriP DBE sequence comprising one or more deletions at the 3′ end, the 5′ end, and/or an internal region).
- a fragment of an OriP DBE consensus sequence comprises one or more deletions at the 3′ end, the 5′ end, and/or an internal region of SEQ ID NO: 51.
- a “variant of an OriP DBE consensus sequence” refers to a sequence comprising one or more mismatches relative to the OriP DBE consensus sequence (e.g., the OriP DBE consensus sequence SEQ ID NO: 51).
- the variant of the OriP DBE consensus sequence comprises 1, 2, 3, 4, 5, or more mismatches relative to the OriP DBE consensus sequence (e.g., the OriP DBE consensus sequence SEQ ID NO: 51).
- the variant comprises 1 or 2 mismatches relative to the OriP DBE consensus sequence (e.g., the OriP DBE consensus sequence SEQ ID NO: 51).
- the variant comprises at least about 70%, about 75%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to the OriP DBE consensus sequence (e.g., the OriP DBE consensus sequence SEQ ID NO: 51). In some embodiments, the variant comprises at least about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98%, or about 99% identity to the OriP DBE consensus sequence (e.g., the OriP DBE consensus sequence SEQ ID NO: 51).
- the term “DBE of an NHP LCV,” interchangeably used with “NHP LCV DBE,” refers to a repeating sequence element present in a genome of an NHP LCV.
- the repeating sequence element is a DBE that binds to a DNA binding protein described herein (e.g., a DNA binding protein comprising a DBD of EBNA1 or a DBD of an EBNA1 homolog). Methods for identifying a repeating sequence element present in the genome of an NHP LCV are further described herein.
- the term “NHP LCV,” interchangeably used with “LCV of an NHP,” refers to a virus of the genus Lymphocryptovirus in the subfamily Gammaherpesvirinae, wherein the virus is identified as infecting an NHP.
- the term “NHP” or “non-human primate” refers to an animal of the order Primates.
- the NHP is a simian (i.e., an NHP of the infraorder Simiiformes).
- the NHP is selected from the group consisting of organisms classified under Integrated Taxonomic Information System (ITIS) Taxonomic Serial No.: 943778 (Simiiformes).
- the NHP is selected from a group consisting of species listed in Table 1.
- the ITIS refers to a public database (accessible via the world wide web: gbif.org) that provides taxonomic information on plants, animals, fungi, and microbes.
- the term “EBNA1” refers to wild-type EBNA1 unless otherwise specified.
- wild-type EBNA1 refers to the native protein encoded by the EBV genome that functions in replication and partitioning of the viral genome during a latent EBV infection.
- EBNA1 is about 641 amino acid residues in length. Amino acid and nucleotide sequence information for EBNA1 are accessible via one or more public databases using the identification number P03211 (UniProt) and gene ID 3783709 (NCBI).
- the portion of the EBV reference genome corresponding to EBNA1 is set forth by coordinates 95662 to 97587 (according to the EBV reference genome identified by NCBI reference sequence NC_007605.1).
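As a quick consistency check, the stated genome coordinates can be compared with the stated protein length. The sketch below assumes the coordinates are 1-based and inclusive and that the span ends with a stop codon; both are assumptions made here for illustration, not statements from the disclosure.

```python
# Coordinates quoted in the text for EBNA1 on the EBV reference genome.
start, end = 95662, 97587

# Assumption: 1-based, inclusive coordinates.
length_nt = end - start + 1

# Assumption: the span is a single open reading frame ending in a stop codon.
codons = length_nt // 3
residues = codons - 1

in_frame = (length_nt % 3 == 0)
```

Under these assumptions the span is 1926 nucleotides, i.e., 642 codons, which corresponds to 641 amino acid residues plus a stop codon and agrees with the length of about 641 residues given above.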
- the EBNA1 polypeptide comprises or consists of the amino acid sequence set forth in SEQ ID NO: 1.
- the EBNA1 polypeptide comprises or consists of an amino acid sequence encoded by a nucleotide sequence set forth in SEQ ID NO: 12.
- the term “functional fragment of an EBNA1 polypeptide” and “functional variant of an EBNA1 polypeptide” refer respectively to a truncated EBNA1 polypeptide or an altered EBNA1 polypeptide that maintains one or more functional properties of the wild-type EBNA1 polypeptide.
- the one or more functional properties of a wild-type EBNA1 polypeptide comprises (i) binding to an EBV DBE; (ii) binding to chromatin; (iii) transport into the cell nucleus; (iv) tethering of DBE-containing DNA to chromatin; (v) retention of DBE-containing DNA in the nucleus during or following mitosis; or (vi) a combination of (i)-(v).
- the one or more functional properties of an EBNA1 polypeptide described herein are determined according to methods further described herein.
- the one or more functional properties of a wild-type EBNA1 polypeptide result in a desired outcome when introduced to a cell in combination with a recombinant expression vector comprising a transgene and the EBV OriP.
- the desired outcome is improved or enhanced as compared to introducing the recombinant expression vector alone.
- the desired outcome is selected from (i) replication of the recombinant expression vector; (ii) episomal maintenance of the recombinant expression vector; (iii) nuclear transport of the recombinant expression vector; (iv) tethering of the recombinant expression vector to chromatin; (v) retention of the recombinant expression vector in the nucleus following mitosis; and (vi) a combination of (i)-(v). Methods for determining whether the desired outcome is achieved are further described herein.
- sequence homology between amino acid sequences or between nucleic acid sequences is defined based upon a shared ancestry. For example, in some embodiments, two nucleic acids have a shared ancestry due to a speciation event (orthologs) or a duplication event (paralogs).
- sequence homology between amino acid sequences or between nucleic acid sequences is defined based upon a high degree of sequence similarity. Substantial sequence similarity suggests that two sequences are related by divergent evolution from a common ancestor. Alignment of multiple sequences is performed to identify homologous regions. Methods to perform sequence alignment are known in the art, and further described herein.
- as used herein, the terms “sequence similarity,” “similarity,” “sequence identity,” and “identity” refer to the overall relatedness between nucleic acids or polypeptides. “Percent identity” refers to the number of identical nucleic acid residues or amino acids over a defined length in a given alignment of nucleic acid or polypeptide sequences.
- Percent similarity refers to the number of amino acids over a defined length in a given alignment that are identical or conservative amino acid substitutions. Calculation of the percent identity of two nucleic acid sequences, for example, can be performed by aligning the two sequences for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second nucleic acid sequences for optimal alignment and non-identical sequences can be disregarded for comparison purposes). The nucleotides at corresponding nucleotide positions are then compared. When a position in the first sequence is occupied by the same nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position.
- the percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap which needs to be introduced for optimal alignment of the two sequences.
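The percent identity calculation described above can be sketched for a pair of sequences that have already been aligned. Several conventions exist for how gaps enter the denominator; this sketch assumes one of them (identical positions divided by the total number of alignment columns, gap columns included). The function name, the gap character, and the example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over all alignment columns, counting gap columns as non-identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same number of columns")
    if not aligned_a:
        raise ValueError("alignment is empty")
    identical = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap
    )
    return 100.0 * identical / len(aligned_a)


# Placeholder alignment with one gap column and one mismatch (hypothetical sequences).
identity = percent_identity("ACGT-ACGTA", "ACGTTACCTA")
```

In this example 8 of the 10 alignment columns are identical, giving 80% identity. Producing the alignment itself is left to an alignment tool such as those listed in this section.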
- the percent similarity between two sequences is determined using a similarity-scoring matrix such as EBLOSUM62 (aka “BLOSUM 62,” accessible via the world wide web: ebi.ac.uk/Tools/psa/emboss_needle/).
- the similarity-scoring matrix provides a substitution matrix for sequence alignment of proteins for the purpose of identifying whether an amino acid substitution is conservative or non-conservative.
- the similarity-scoring matrix is a BLOSUM (Blocks Substitution Matrix).
- a BLOSUM is constructed based on local alignments of conserved regions of protein families, which are then counted based upon the relative frequency of amino acids and their substitution probabilities.
- a log-odds score for each of the 210 possible substitution pairs of 20 standard amino acids is tabulated and reflected in the matrix.
- the possible substitution pairs are arranged pair-wise in a matrix of (i) residues and (j) residues, with a BLOSUM score for a substitution of an (i)th residue with a (j)th residue.
- the scores range from +11 to -4.
- a higher BLOSUM score indicates the substitution is conservative and a lower BLOSUM score indicates the substitution is non-conservative.
- a conservative substitution of serine (S) with threonine (T) or substitution of leucine (L) with isoleucine (I) has a positive BLOSUM score (e.g., +1 for S to T and +2 for L to I).
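The figure of 210 substitution pairs follows from counting unordered pairs of the 20 standard amino acids, including each residue paired with itself (20 identical pairs plus 190 distinct pairs). The sketch below reproduces that count and shows a simple conservative/non-conservative check. The lookup table contains only the two scores quoted above, and treating any positive score as "conservative" is an assumption made for illustration; the names `QUOTED_SCORES` and `is_conservative` are hypothetical.

```python
from itertools import combinations_with_replacement

AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"  # one-letter codes of the 20 standard amino acids

# Unordered pairs, including each residue paired with itself.
pairs = list(combinations_with_replacement(AMINO_ACIDS, 2))

# Only the two scores quoted in the text; a full matrix would hold all 210 entries.
QUOTED_SCORES = {frozenset("ST"): 1, frozenset("LI"): 2}


def is_conservative(a, b, scores=QUOTED_SCORES):
    """Assumed rule: a positive score marks the substitution as conservative.

    Raises KeyError for pairs that are not in the supplied score table.
    """
    return scores[frozenset((a, b))] > 0
```

Because the table is keyed by unordered pairs, `is_conservative("S", "T")` and `is_conservative("T", "S")` give the same answer, mirroring the symmetry of the matrix.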
- Methods for generating a BLOSUM are further described in Henikoff, et al (1992) PNAS 89:10915.
- Methods to perform sequence alignment for determining percent identity or similarity are known in the art and described in Computational Molecular Biology, Lesk, A. M., ed., Oxford University Press, New York, 1988; Biocomputing: Informatics and Genome Projects, Smith, D.
- Exemplary algorithms for performing sequence alignment include, but are not limited to, the algorithm of Meyers and Miller (CABIOS, 1989, 4:11-17), which has been incorporated into the ALIGN program (version 2.0), the GAP program in the GCG program package (see Devereux et al., Nucleic Acids Research, 12(1): 387, 1984), BLASTP, BLASTN, and FASTA (see Altschul, S. F. et al., J. Molec. Biol., 215, 403, 1990).
- the term “chimeric” in reference to a DNA binding protein comprising an EBNA1 DBD refers to a polypeptide comprising (i) one or more chromatin binding domains and (ii) at least one EBNA1 DBD, wherein (i) and (ii) are mutually heterologous in that they do not occur together in the same arrangement in a wild-type EBNA1 polypeptide.
- the polypeptide comprises (i) and (ii) in an arrangement that does not occur in a wild-type EBNA1 polypeptide.
- the polypeptide comprises a different order, orientation, and/or spacing of (i) and (ii) as compared to domains having a substantially similar function that are present in a wild-type EBNA1 polypeptide.
- the chimeric DNA binding protein comprises (i) one or more EBNA1 chromatin binding domains and (ii) at least one EBNA1 DBD, wherein the arrangement of (i) and (ii) are different than domains having a substantially similar function that are present in a wild-type EBNA1 polypeptide.
- the chimeric DNA binding protein comprises (i) one or more chromatin binding domains and (ii) at least one EBNA1 DBD, wherein (i) and (ii) are mutually heterologous in that the one or more chromatin binding domains are not derived from an EBNA1 polypeptide.
- (i) and (ii) are operably linked, optionally via a linker.
- the chimeric DNA binding protein comprises one or more additional domains described herein, e.g., a nuclear localization signal, transactivation domain, or transcription factor binding domain.
- the polypeptide comprises (i) and (ii) in an arrangement that does not occur in the wild-type EBNA1 homolog.
- the polypeptide comprises a different order, orientation, and/or spacing of (i) and (ii) as compared to domains having a substantially similar function that are present in the wild-type EBNA1 homolog.
- the chimeric DNA binding protein comprises (i) a chromatin binding domain of the EBNA1 homolog and (ii) a DBD of the EBNA1 homolog, wherein the arrangement of (i) and (ii) are different than domains having a substantially similar function that are present in the wild-type EBNA1 homolog.
- the chimeric DNA binding protein comprises (i) a chromatin binding domain and (ii) a DBD of an EBNA1 homolog, wherein (i) and (ii) are mutually heterologous in that the chromatin binding domain is not derived from the EBNA1 homolog.
- (i) and (ii) are operably linked, optionally via a linker.
- the chimeric DNA binding protein comprises one or more additional domains described herein, e.g., a nuclear localization signal, transactivation domain, or transcription factor binding domain.
- the term “heterologous” refers to a substance coming from a source other than its native source.
- heterologous chromatin binding domain refers to a chromatin binding domain from a source other than the source of the DBD present in the DNA binding protein.
- the heterologous chromatin binding domain is a chromatin binding domain from a source other than the EBNA1 homolog.
- the term “at least a portion” or “fragment” of a nucleic acid or a polypeptide refers to a portion of a reference sequence having the minimal size characteristic for the said nucleic acid or polypeptide and up to the full-length reference sequence.
- the term “at least a portion” or “fragment” in reference to a wild-type EBNA1 refers to a portion of full-length EBNA1 comprising the EBNA1 DBD.
- the fragment is a truncated EBNA1 (e.g., EBNA1 truncated at the N-terminus, the C-terminus, and/or an internal site) comprising the EBNA1 DBD.
- the term “at least a portion” or “fragment” in reference to an EBNA1 homolog refers to a portion of the full-length EBNA1 homolog comprising the DBD.
- the fragment is a truncated EBNA1 homolog (e.g., the EBNA1 homolog truncated at the N-terminus, the C-terminus, and/or an internal site) comprising the DBD.
- variant of a nucleic acid or a polypeptide refers to a sequence variant of a reference sequence that maintains a desired biological function of the reference sequence.
- variants in reference to EBNA1 refers to an EBNA1 altered by nucleotide substitution, deletion, and/or insertion that maintains a biological function/activity of EBNA1.
- variants in reference to an EBNA1 homolog described herein refers to an EBNA1 homolog altered by nucleotide substitution, deletion, and/or insertion that maintains a biological function/activity of the wild-type EBNA1 homolog.
- biological function used interchangeably with “functional property” in reference to a wild-type EBNA1 or EBNA1 homolog refers to a desirable characteristic when introduced to a cell as part of a system described herein (e.g., a system comprising the wild-type EBNA1 or EBNA1 homolog, or a nucleic acid or recombinant expression vector encoding said EBNA1 or EBNA1 homolog, and a transgene-encoding recombinant expression vector comprising a DNA binding polynucleotide that binds the EBNA1 or EBNA1 homolog).
- the desirable characteristic results in enhanced expression (e.g., an increased level and/or duration of expression) of the transgene-encoding recombinant expression vector as compared to the recombinant expression vector introduced alone.
- the functional property is selected from (i) binding to a DNA binding polynucleotide described herein; (ii) binding to chromatin; (iii) transport into the cell nucleus; (iv) tethering of a DNA binding polynucleotide to chromatin; (v) retention of the DNA binding polynucleotide in the nucleus during or following mitosis; and (vi) a combination of (i)-(v).
- the functional property is determined according to a method further described herein.
- the functional properties of a wild-type EBNA1 homolog result in enhanced expression (e.g., an increased level and/or duration of expression) when introduced to a cell in combination with a recombinant expression vector comprising a transgene and a DBE of an EBV or NHP LCV.
- a “plasmid” refers to a circular double-stranded DNA loop into which additional DNA segments can be ligated.
- vector refers to a nucleic acid sequence capable of transporting another nucleic acid to which it has been linked for expression in a host cell.
- the term refers to a nucleic acid suitable for cloning and expression of a nucleotide sequence.
- certain vectors are capable of directing the expression of genes to which they are operatively linked.
- the term “recombinant expression vector” refers to a vector comprising a transgene and one or more additional components to enable expression of the transgene when introduced to a cell.
- the present disclosure provides a system for enhancing expression of a transgene in a cell or population of cells, the system comprising (i) an mRNA encoding a DNA binding protein, wherein the DNA binding protein comprises one or more chromatin binding domains and at least one EBNA1 DBD, and (ii) a recombinant expression vector comprising at least one transgene and a polynucleotide comprising one or more EBV DBEs.
- the present disclosure provides a system for enhancing expression of a transgene, the system comprising (i) a DNA binding protein, or a nucleic acid or a recombinant expression vector encoding the DNA binding protein, wherein the DNA binding protein comprises a DBD of an EBNA1 homolog, or a variant thereof or a fragment thereof, and a chromatin binding domain, wherein the EBNA1 homolog is of an NHP LCV, and (ii) a recombinant expression vector comprising a transgene and a DNA binding polynucleotide comprising a DNA binding element (DBE) of an Epstein-Barr virus (EBV) or a DBE of an NHP LCV.
- the DNA binding protein comprises a DBD of an EBNA1 homolog, or a variant thereof or a fragment thereof, and a chromatin binding domain, wherein the EBNA1 homolog is derived from an NHP LCV.
- the DNA binding protein has a first targeting function mediated by the DBD.
- the DNA binding protein has a first targeting function mediated by the at least one EBNA1 DBD.
- the DNA binding protein has a first targeting function mediated by the DBD of an EBNA1 homolog, or a variant or a fragment thereof.
- the first targeting function localizes and associates the DNA binding protein to the recombinant expression vector when a system described herein is introduced to a cell (e.g., a mammalian cell).
- the recombinant expression vector comprises at least one transgene for expression in the cell and a polynucleotide comprising one or more DBEs (e.g., one or more DBEs from an EBV OriP).
- the polynucleotide comprises an EBV DBE, or a fragment or a variant thereof.
- the DBD binds to the DNA binding polynucleotide with a binding affinity sufficient to mediate association of the DNA binding protein to the recombinant expression vector under physiological conditions.
- the DNA binding protein comprises at least one EBNA1 DBD and the DNA binding polynucleotide comprises an EBV DBE (e.g., an array of EBV DBEs).
- the at least one EBNA1 DBD binds to the polynucleotide with a binding affinity sufficient to mediate association of the DNA binding protein to the recombinant expression vector under physiological conditions.
- the DNA binding protein comprises at least one DBD of an EBNA1 homolog and the DNA binding polynucleotide comprises an EBV DBE (e.g., an array of EBV DBEs) or an NHP LCV DBE (e.g., an array of NHP LCV DBEs).
- the at least one DBD of the EBNA1 homolog binds to the polynucleotide with a binding affinity sufficient to mediate association of the DNA binding protein to the recombinant expression vector under physiological conditions.
- the DNA binding protein has a second targeting function mediated by the one or more chromatin binding domains.
- the second targeting function localizes and associates the DNA binding protein, or a complex of the DNA binding protein and the recombinant expression vector, to chromatin, a chromatin-associated structure, the nuclear lamina, and/or the nuclear matrix in the cell.
- Methods to evaluate targeting a chromatin-associated structure, the nuclear lamina, and/or the nuclear matrix in a cell are known in the art and further described herein.
- FIG. 1 provides a schematic illustrating exemplary functions of the DNA binding protein that result in increased expression of the at least one transgene present in the recombinant expression vector upon introducing a system described herein to a cell as compared to introducing the recombinant expression vector alone.
- the DNA binding protein increases nuclear uptake of the recombinant expression vector when a system described herein is introduced to a cell. In some embodiments, the DNA binding protein tethers the recombinant expression vector to chromatin when a system described herein is introduced to a cell. In some embodiments, the DNA binding protein increases retention of the recombinant expression vector in the nucleus during mitosis when a system described herein is introduced to a cell.
- the portion of the recombinant expression vector comprising a polynucleotide comprising one or more DBEs forms a complex with the DNA binding protein or a plurality of DNA binding proteins.
- formation of the complex results in enhanced expression of the system due to enhanced nuclear uptake, retention in the nucleus during cell mitosis, and/or increased tethering to chromatin.
- the systems described herein comprise a DNA binding protein, a nucleic acid (e.g., mRNA) encoding the DNA binding protein, or a recombinant expression vector comprising a nucleic acid encoding the DNA binding protein.
- the DNA binding protein comprises a chromatin binding domain and a DBD.
- the DBD comprises a DBD of EBNA1, or a variant or fragment thereof.
- the DBD comprises a DBD of an EBNA1 homolog, or a variant or fragment thereof, wherein the EBNA1 homolog is derived from an NHP LCV described herein.
- the disclosure provides a DNA binding protein comprising one or more chromatin binding domains and at least one EBNA1 DBD, a nucleic acid (e.g., mRNA) encoding the DNA binding protein, or a recombinant expression vector comprising a nucleic acid encoding the DNA binding protein.
- the disclosure provides an mRNA encoding a DNA binding protein comprising one or more chromatin binding domains and at least one EBNA1 DBD.
- the DNA binding protein comprises a full-length or truncated EBNA1 polypeptide.
- the EBNA1 polypeptide is a wild-type EBNA1 polypeptide.
- the EBNA1 polypeptide is a variant EBNA1 polypeptide.
- the variant comprises a modification (e.g., a deletion, an insertion, and/or a substitution) of a domain present in wild-type EBNA1 (e.g., a modification of the TA domain, the NLS domain, and/or the Gly-Arg repeat region).
- the DNA binding protein comprises a chimeric polypeptide comprising one or more chromatin binding domains and a polypeptide comprising at least one EBNA1 DBD.
- the DNA binding protein is a chimeric polypeptide comprising a chromatin binding domain (e.g., 1, 2, 3, or 4 chromatin binding domains) and an EBNA1 DBD, or a variant or a fragment thereof.
- the one or more chromatin binding domains are derived from an EBNA1 polypeptide. In some embodiments, the one or more chromatin binding domains are not from an EBNA1 polypeptide.
- the systems described herein comprise a DNA binding protein, or a nucleic acid or recombinant expression vector encoding the DNA binding protein, wherein the DNA binding protein comprises a DBD of an EBNA1 homolog, or a variant thereof or a fragment thereof, wherein the EBNA1 homolog is derived from an NHP LCV, and a chromatin binding domain.
- the DNA binding protein comprises one DBD.
- the DNA binding protein comprises more than one DBD.
- the DNA binding protein comprises one chromatin binding domain.
- the DNA binding protein comprises more than one chromatin binding domain.
- the DNA binding protein comprises a full-length or truncated EBNA1 homolog, wherein the EBNA1 homolog is of an NHP LCV described herein.
- the EBNA1 homolog is a wild-type EBNA1 homolog (i.e., a native EBNA1 homolog encoded by the NHP LCV genome).
- the EBNA1 homolog comprises a modification (e.g., a deletion, an insertion, and/or a substitution) relative to the wild-type EBNA1 homolog.
- the DNA binding protein is a chimeric polypeptide comprising a chromatin binding domain (e.g., 1, 2, 3, or 4 chromatin binding domains) and an EBNA1 homolog DBD, or a variant or a fragment thereof.
- the one or more chromatin binding domains are derived from an EBNA1 polypeptide.
- the one or more chromatin binding domains are derived from an EBNA1 homolog of an NHP LCV described herein.
- the one or more chromatin binding domains are not from an EBNA1 polypeptide or from an EBNA1 homolog.
- the disclosure provides an EBNA1 polypeptide, or a variant or a fragment thereof. In some embodiments, the disclosure provides an mRNA comprising an ORF encoding an EBNA1 polypeptide. In some embodiments, the EBNA1 polypeptide is a wild-type EBNA1 polypeptide. In some embodiments, the EBNA1 polypeptide has the same length or substantially the same length as the wild-type EBNA1 polypeptide.
- the EBNA1 polypeptide comprises an amino acid sequence having at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or 100% identity to SEQ ID NO: 1. In some embodiments, the EBNA1 polypeptide comprises an amino acid sequence having at least about 90%, about 95%, about 98%, about 99% identity to SEQ ID NO: 1. In some embodiments, the EBNA1 polypeptide comprises SEQ ID NO: 1.
- the EBNA1 polypeptide comprises an amino acid sequence encoded by a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 12. In some embodiments, the EBNA1 polypeptide comprises an amino acid sequence encoded by a nucleotide sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 12. In some embodiments, the EBNA1 polypeptide comprises an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 12.
- the disclosure provides an mRNA comprising an ORF encoding an EBNA1 polypeptide, wherein the ORF comprises a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 12.
- the ORF comprises a nucleotide sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 12.
- the ORF comprises SEQ ID NO: 12.
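The percent-identity thresholds recited above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned to equal length with `-` as the gap character, and it assumes identity is counted as matching columns divided by all aligned columns; the alignment method and the exact identity convention are not specified in this passage, so both are assumptions. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: '-' marks a gap, and identity is the number of matching
    columns divided by the number of columns in which at least one
    sequence has a residue.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    # Ignore columns where both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Toy sequences (hypothetical, not taken from any SEQ ID NO).
reference = "ACDEFGHIKL"
candidate = "ACDEFAHIKL"
print(percent_identity(reference, candidate))
```

A candidate would satisfy an "at least about 90% identity" embodiment under this convention when the returned value is 90 or higher.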
- the disclosure provides a functional fragment of an EBNA1 polypeptide and/or a functional variant of an EBNA1 polypeptide.
- the disclosure provides an mRNA comprising an ORF encoding a functional fragment of an EBNA1 polypeptide or a functional variant of an EBNA1 polypeptide.
- the one or more functional properties of wild-type EBNA1 polypeptide are mediated by one or more domains.
- the EBNA1 polypeptide comprises from N-terminus to C-terminus: an N-terminal domain (corresponding to amino acid residues 1 to about 39 of SEQ ID NO: 1); a Gly-Arg Repeat Region 1 (corresponding to residues about 33 to about 89 of SEQ ID NO: 1); a Gly-Ala Repeat Region 1 (corresponding to residues about 90 to about 324 of SEQ ID NO: 1); a Gly-Arg Repeat Region 2 (corresponding to residues about 325 to about 380 of SEQ ID NO: 1) comprising a looping domain (corresponding to residues about 325 to about 376 of SEQ ID NO: 1); a DBD (corresponding to residues about 459 to about 607 of SEQ ID NO: 1); and an acidic tail C terminal region (corresponding to residues about 608 to about 641 of SEQ ID NO: 1).
- the EBNA1 polypeptide comprises two chromatin-binding regions (“domain A” and “domain B”), a nuclear localization signal (NLS), and a DNA binding domain.
- domain A corresponds to residues about 33 to about 89 of SEQ ID NO: 1.
- domain B corresponds to residues about 325 to about 378 of SEQ ID NO: 1.
- the EBNA1 NLS corresponds to residues about 379 to about 386 of SEQ ID NO: 1.
- the EBNA1 DBD corresponds to residues about 459 to about 607 of SEQ ID NO: 1.
- the EBNA1 polypeptide comprises a transactivation domain (“TA domain”).
- the TA domain corresponds to residues about 393 to about 450 of SEQ ID NO: 1.
- the TA domain has been implicated as having one or more antigens that are immunogenic in humans.
- EBNA1 has been shown to contain an antigen that induces generation of antibodies capable of cross-reacting with human tissues, thereby inducing autoimmune responses that contribute to the undesirable disease or condition.
- antibodies generated against an antigen in the TA domain of EBNA1 cross-react with glial cell adhesion molecule (GlialCAM).
- the antigen identified corresponds to amino acid residue about 386 to about 405 of SEQ ID NO: 1.
- antibodies generated against a second antigen in the TA domain of EBNA1 cross-react with alpha-crystallin B (CRYAB).
- the second antigen corresponds to amino acid residue about 393 to about 412.
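The residue ranges given above can be collected into a small lookup to see which annotated regions a given epitope falls in. This is only an illustrative sketch: the coordinates are the approximate ("about") values stated in the text for SEQ ID NO: 1, and the dictionary and function names are hypothetical.

```python
# Approximate 1-based, inclusive residue ranges as stated in the text
# for SEQ ID NO: 1 (all boundaries are "about" values).
EBNA1_REGIONS = {
    "N-terminal domain": (1, 39),
    "domain A": (33, 89),
    "Gly-Ala Repeat Region 1": (90, 324),
    "domain B": (325, 378),
    "NLS": (379, 386),
    "TA domain": (393, 450),
    "DBD": (459, 607),
    "acidic tail": (608, 641),
}


def regions_overlapping(start: int, end: int) -> list[str]:
    """Names of annotated regions that overlap residues start..end."""
    return [name for name, (s, e) in EBNA1_REGIONS.items()
            if s <= end and start <= e]


# The two antigen ranges described above.
print(regions_overlapping(386, 405))
print(regions_overlapping(393, 412))
```

Under these approximate boundaries, the 386 to 405 range touches both the NLS and the TA domain, while the 393 to 412 range lies entirely within the TA domain.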
- the generation of cross-reactive antibodies in subjects with an EBV infection is thought to contribute to the etiology of certain diseases, such as MS (see Bar-Or, et al (2020) Trends Mol Med 26:296).
- deletion of the TA domain or portion thereof may reduce the capability of an EBNA1 polypeptide to cause chromosomal damage.
- genomic instability is associated with EBNA1 binding to genomic regions comprising a repeat array of DBEs with sequence similarity to a DBE of an EBV OriP, which is mitigated for EBNA1 having a deletion of the TA domain or portion thereof.
- an mRNA of the disclosure encoding an EBNA1 polypeptide comprising a deletion of the TA domain or a portion thereof (e.g., a portion corresponding to amino acid residues about 393 to about 412 of SEQ ID NO: 1, a portion corresponding to amino acid residues about 386 to about 405 of SEQ ID NO: 1, or a portion corresponding to amino acid residues about 394 to about 399 of SEQ ID NO: 1) has one or more desirable properties for use in vivo, including reduced risk of inducing a deleterious immune response (e.g., an autoimmune response) and/or genomic instability.
- the disclosure provides an mRNA comprising an ORF encoding a functional fragment of an EBNA1 polypeptide.
- the functional fragment comprises a truncated sequence shorter than wild-type EBNA1, wherein the truncated sequence comprises a portion of wild-type EBNA1 that retains its functional activity.
- the truncated sequence comprises a deletion of an amino terminal region of wild-type EBNA1.
- the truncated sequence comprises a deletion of a carboxy terminal region of wild-type EBNA1.
- the truncated sequence comprises a deletion of an internal region of wild-type EBNA1.
- the truncated sequence comprises a deletion of (i) the Gly-Ala Repeat Region 1 (corresponding to residues about 90 to about 324 of SEQ ID NO: 1) or a portion thereof (e.g., a sequence of about 230, 220, 210, 200, 190, 180, 170, 160, 150, 140, 130, 120, 110, 100, 90, 80, 70, 60, 50, 40, 30, 20, or 10 residues thereof); (ii) the NLS (corresponding to residues about 379 to about 386 of SEQ ID NO: 1) or a portion thereof (e.g., a sequence of about 10, 9, 8, 7, 6, or 5 residues thereof); (iii) the N-terminal domain (corresponding to amino acid residues 1 to about 39 of SEQ ID NO: 1) or a portion thereof (e.g., a sequence of about 38, 37, 36, 35, 34, 33, 32, 31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9,
- the truncated sequence is about 160 to about 300, about 160 to about 350, about 160 to about 400, about 160 to about 450, about 160 to about 500, about 300 to about 400, about 300 to about 450, about 300 to about 500, or about 400 to about 500 amino acids in length.
- the truncated sequence is about 160, about 170, about 180, about 190, about 200, about 210, about 220, about 230, about 240, about 250, about 260, about 270, about 280, about 290, about 300, about 310, about 320, about 330, about 340, about 350, about 360, about 370, about 380, about 390, about 400, about 410, about 420, about 430, about 440, about 450, about 460, about 470, about 480, about 490, or about 500 amino acids in length. In some embodiments, the truncated sequence is about 160 to about 400 amino acids in length.
- the truncated sequence is about 200 to about 400 amino acids in length.
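The deletions and resulting lengths described above can be sketched as simple string operations on a protein sequence. The sketch below uses 1-based, inclusive residue coordinates as in the text; the function names are hypothetical, and the 641-residue placeholder is an assumption based on the stated end of the acidic tail region, not the actual SEQ ID NO: 1.

```python
def delete_region(sequence: str, start: int, end: int) -> str:
    """Remove residues start..end (1-based, inclusive) from a sequence."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("region must lie within the sequence")
    return sequence[:start - 1] + sequence[end:]


def delete_regions(sequence: str, regions: list[tuple[int, int]]) -> str:
    """Remove several non-overlapping regions given in original coordinates.

    Deleting from the highest coordinates first keeps the remaining
    coordinates valid after each deletion.
    """
    for start, end in sorted(regions, reverse=True):
        sequence = delete_region(sequence, start, end)
    return sequence


# Placeholder of 641 residues standing in for a full-length polypeptide.
placeholder = "X" * 641
without_gly_ala = delete_region(placeholder, 90, 324)
print(len(without_gly_ala))
```

Removing the Gly-Ala Repeat Region 1 alone (about 235 residues) leaves roughly 406 residues, so reaching the shorter length ranges recited above would require additional deletions.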
- the truncated sequence comprises at least one EBNA1 chromatin-binding domain (e.g., EBNA1 domain A and/or EBNA1 domain B) or portion thereof and the EBNA1 DBD or a portion thereof, wherein the truncated sequence comprises a deletion of (i) the N-terminal domain (corresponding to amino acid residues 1 to about 39 of SEQ ID NO: 1) or a portion thereof (e.g., a sequence of about 38, 37, 36, 35, 34, 33, 32, 31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, or 5 residues thereof), (ii) the Gly-Ala Repeat Region 1 (corresponding to residues about 90 to about 324 of SEQ ID NO: 1) or a portion thereof (e.g., a sequence of about 230, 220, 210, 200,
- the truncated sequence comprises at least one chromatin-binding domain (e.g., domain A and/or domain B) or portion thereof and the EBNA1 DBD or a portion thereof. In some embodiments, the truncated sequence comprises domain A or a portion thereof and the EBNA1 DBD or a portion thereof. In some embodiments, the truncated sequence comprises an amino acid sequence corresponding to residues about 33 to about 89 of SEQ ID NO: 1 or a portion thereof and residue 459 to about 607 of SEQ ID NO: 1 or a portion thereof.
- the functional fragment of an EBNA1 polypeptide comprises SEQ ID NO: 14 or a portion thereof and SEQ ID NO: 18 or a portion thereof.
- the truncated sequence comprises domain B or a portion thereof and the EBNA1 DBD or a portion thereof.
- the truncated sequence comprises an amino acid sequence corresponding to residues about 325 to about 378 of SEQ ID NO: 1 or a portion thereof and an amino acid sequence corresponding to residue 459 to about 607 of SEQ ID NO: 1 or a portion thereof.
- the functional fragment of an EBNA1 polypeptide comprises SEQ ID NO: 16 or a portion thereof and SEQ ID NO: 18 or a portion thereof.
- the truncated sequence comprises domain A, domain B, and the EBNA1 DBD. In some embodiments, the truncated sequence comprises an amino acid sequence corresponding to residues about 33 to about 89 of SEQ ID NO: 1 or a portion thereof, an amino acid sequence corresponding to residues about 325 to about 378 of SEQ ID NO: 1 or a portion thereof, and an amino acid sequence corresponding to residue 459 to about 607 of SEQ ID NO: 1 or a portion thereof. In some embodiments, the functional fragment of an EBNA1 polypeptide comprises SEQ ID NO: 14 or a portion thereof, SEQ ID NO: 16 or a portion thereof, and SEQ ID NO: 18 or a portion thereof.
- the truncated sequence comprises at least one chromatin-binding domain (e.g., domain A and/or domain B), the NLS, and the DBD. In some embodiments, the truncated sequence comprises domain A, the NLS, and the DBD. In some embodiments, the truncated sequence comprises domain B, the NLS, and the DBD. In some embodiments, the truncated sequence comprises domain A, domain B, the NLS, and the DBD. In some embodiments, the truncated sequence comprises an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 3.
- the truncated sequence comprises an amino acid sequence set forth in SEQ ID NO: 3. In some embodiments, the truncated sequence comprises an amino acid sequence encoded by a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 24. In some embodiments, the truncated sequence comprises an amino acid sequence encoded by SEQ ID NO: 24. In some embodiments, the truncated sequence comprises an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 130.
- the truncated sequence comprises an amino acid sequence set forth in SEQ ID NO: 130. In some embodiments, the truncated sequence comprises an amino acid sequence encoded by a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 129. In some embodiments, the truncated sequence comprises an amino acid sequence encoded by SEQ ID NO: 129. In some embodiments, the truncated sequence comprises an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 133.
- the truncated sequence comprises an amino acid sequence set forth in SEQ ID NO: 133. In some embodiments, the truncated sequence comprises an amino acid sequence encoded by a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 132. In some embodiments, the truncated sequence comprises an amino acid sequence encoded by SEQ ID NO: 132. In some embodiments, the truncated sequence comprises an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 137.
- the truncated sequence comprises an amino acid sequence set forth in SEQ ID NO: 137. In some embodiments, the truncated sequence comprises an amino acid sequence encoded by a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 138. In some embodiments, the truncated sequence comprises an amino acid sequence encoded by SEQ ID NO: 138. In some embodiments, the truncated sequence comprises an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 143.
- the truncated sequence comprises an amino acid sequence set forth in SEQ ID NO: 143. In some embodiments, the truncated sequence comprises an amino acid sequence encoded by a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 144. In some embodiments, the truncated sequence comprises an amino acid sequence encoded by SEQ ID NO: 144. In some embodiments, the truncated sequence comprises an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 140.
- the truncated sequence comprises an amino acid sequence set forth in SEQ ID NO: 140. In some embodiments, the truncated sequence comprises an amino acid sequence encoded by a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 141. In some embodiments, the truncated sequence comprises an amino acid sequence encoded by SEQ ID NO: 141.
- the truncated sequence comprises a substitution of the EBNA1 NLS or a portion thereof (e.g., an EBNA1 NLS having the amino acid sequence of SEQ ID NO: 7 or a portion thereof) with a heterologous NLS (e.g., a heterologous NLS described herein, e.g., a c-myc NLS).
- the truncated sequence comprises an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 134.
- the truncated sequence comprises an amino acid sequence set forth in SEQ ID NO: 134.
- the truncated sequence comprises an amino acid sequence encoded by a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 135.
- the truncated sequence comprises an amino acid sequence encoded by SEQ ID NO: 135.
- the disclosure provides an mRNA comprising an ORF encoding a functional variant of an EBNA1 polypeptide or a portion thereof.
- the functional variant comprises the sequence of a full-length or truncated EBNA1 polypeptide with one or more alterations.
- the one or more alterations comprises substitution of one or more amino acid residues.
- the functional variant comprises SEQ ID NO: 1 with one or more conservative amino acid substitutions.
- the functional variant comprises a portion of SEQ ID NO: 1 with one or more conservative amino acid substitutions.
- the functional variant comprises SEQ ID NO: 3 with one or more conservative amino acid substitutions.
- the functional variant comprises SEQ ID NO: 130 with one or more conservative amino acid substitutions.
- the functional variant comprises SEQ ID NO: 131 with one or more conservative amino acid substitutions.
- the functional variant comprises SEQ ID NO: 137 with one or more conservative amino acid substitutions.
- the functional variant comprises SEQ ID NO: 140 with one or more conservative amino acid substitutions. In some embodiments, the functional variant comprises SEQ ID NO: 143 with one or more conservative amino acid substitutions. In some embodiments, the functional variant comprises domain A or a portion thereof and the EBNA1 DBD or a portion thereof, wherein domain A and/or the EBNA1 DBD comprise one or more alterations.
- the functional variant comprises domain A or a portion thereof and the EBNA1 DBD or a portion thereof, wherein domain A comprises an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to a contiguous sequence of at least about 55, about 50, about 45, about 40, about 35, or about 30 amino acid residues present in SEQ ID NO: 14.
- the functional variant comprises domain A or a portion thereof and the EBNA1 DBD or a portion thereof, wherein the DBD comprises an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to a contiguous sequence of at least about 145, 140, 130, 120, 110, 100, 90, 80, 70, 60, 50, 40, 30, 20, or 10 amino acid residues present in SEQ ID NO: 18.
- the functional variant comprises domain A or a portion thereof and the EBNA1 DBD or a portion thereof, wherein domain A comprises an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to a contiguous sequence of at least about 55, about 50, about 45, about 40, about 35, or about 30 amino acid residues present in SEQ ID NO: 14 and the DBD comprises an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to a contiguous sequence of at least about 145, 140, 130, 120, 110, 100, 90, 80, 70, 60, 50, 40, 30, 20, or 10 amino acid residues present in SEQ ID NO: 18.
- the functional variant comprises domain B or a portion thereof and the EBNA1 DBD or a portion thereof, wherein domain B and/or the EBNA1 DBD comprise one or more alterations.
- the functional variant comprises domain B or a portion thereof and the EBNA1 DBD or a portion thereof, wherein domain B comprises an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to a contiguous sequence of at least about 50, about 45, about 40, about 35, or about 30 amino acid residues present in SEQ ID NO: 16.
- the functional variant comprises domain B or a portion thereof and the EBNA1 DBD or a portion thereof, wherein the DBD comprises an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to a contiguous sequence of at least about 145, 140, 130, 120, 110, 100, 90, 80, 70, 60, 50, 40, 30, 20, or 10 amino acid residues present in SEQ ID NO: 18.
- the functional variant comprises domain B or a portion thereof and the EBNA1 DBD or a portion thereof, wherein domain B comprises an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to a contiguous sequence of at least about 50, about 45, about 40, about 35, or about 30 amino acid residues present in SEQ ID NO: 16 and the DBD comprises an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to a contiguous sequence of at least about 145, 140, 130, 120, 110, 100, 90, 80, 70, 60, 50, 40, 30, 20, or 10 amino acid residues present in SEQ ID NO: 18.
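The "identity to a contiguous sequence of at least about N residues" language in the embodiments above can be approximated with a sliding-window comparison. The sketch below is a simplification that compares ungapped windows only, which is an assumption; a real comparison would typically use a gapped alignment. The function name and toy sequences are hypothetical.

```python
def best_window_identity(candidate: str, reference: str, window: int) -> float:
    """Highest ungapped percent identity between any stretch of `window`
    contiguous residues in `reference` and any equally long stretch of
    `candidate`.
    """
    if window <= 0 or window > min(len(candidate), len(reference)):
        raise ValueError("window must fit inside both sequences")
    best = 0.0
    for i in range(len(reference) - window + 1):
        ref_win = reference[i:i + window]
        for j in range(len(candidate) - window + 1):
            cand_win = candidate[j:j + window]
            matches = sum(a == b for a, b in zip(ref_win, cand_win))
            best = max(best, 100.0 * matches / window)
    return best


# Toy sequences (hypothetical, not SEQ ID NO: 14, 16, or 18).
reference = "ACDEFGHIKLMNPQRS"
candidate = "WWCDEFGAIKLWW"
print(best_window_identity(candidate, reference, window=8))
```

In this toy case, the best 8-residue window differs from the reference at one position, so a threshold of "at least about 80%" would be met and "at least about 90%" would not.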
- the functional variant comprises domain A or a portion thereof, domain B or a portion thereof, and the EBNA1 DBD or a portion thereof, wherein domain A, domain B and/or the EBNA1 DBD comprise one or more alterations.
- the functional variant comprises domain A or a portion thereof, domain B or a portion thereof, and the EBNA1 DBD or a portion thereof, wherein (i) domain A comprises an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to a contiguous sequence of at least about 55, about 50, about 45, about 40, about 35, or about 30 amino acid residues present in SEQ ID NO: 15; (ii) domain B comprises an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to a contiguous sequence of at least about 50, about 45, about 40, about 35, or about 30 amino acid residues present in SEQ ID NO: 16; (iii) the DBD comprises an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to a contiguous sequence of at least about 145, 140, 130, 120,
- the DNA binding protein comprises a wild-type EBNA1 homolog, or a fragment thereof or a variant thereof, wherein the EBNA1 homolog is derived from an NHP LCV, and wherein the DNA binding protein comprises a DBD of the wild-type EBNA1 homolog, or a fragment thereof or a variant thereof.
- the EBNA1 homolog is identified in a public database such as GenBank. GenBank is a genetic sequence database that provides an annotated collection of publicly available DNA sequences (see Nucleic Acids Res (2013) 41:D36).
- Sequence data from GenBank is retrieved by searching for sequence identifiers and annotations with Entrez nucleotide (ncbi.nlm.nih.gov/nucleotide/), performing a search and alignment of GenBank sequences to a query sequence using BLAST (blast.ncbi.nlm.nih.gov/blast.cgi), or downloading sequences using NCBI e-utilities (ncbi.nlm.nih.gov/books/NBK25501/).
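As one illustration of the e-utilities route, the sketch below builds an EFetch request URL for a nucleotide record in FASTA format and, optionally, downloads it. The helper names are hypothetical. The accession NC_007605 (an EBV reference genome record) is used only as an example and is not cited in this document. Running `fetch_fasta` requires network access and should respect NCBI usage guidelines.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_fasta_url(accession: str, db: str = "nuccore") -> str:
    """Build an EFetch URL that returns a record as FASTA text."""
    params = {"db": db, "id": accession, "rettype": "fasta", "retmode": "text"}
    return f"{EFETCH_URL}?{urlencode(params)}"


def fetch_fasta(accession: str) -> str:
    """Download the FASTA record (network call; not executed here)."""
    with urlopen(efetch_fasta_url(accession)) as response:
        return response.read().decode("utf-8")


# Example accession chosen for illustration only.
print(efetch_fasta_url("NC_007605"))
```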
- the EBNA1 homolog is identified from an NHP LCV using methods further described herein.
- the EBNA1 homolog is derived from an NHP LCV.
- LCVs have been identified from infections of numerous NHP species, including those in the wild and held in captivity. LCV infection, as measured by seropositivity (detection of anti-LCV antibodies in serum), is greater than about 95% in Old World monkeys and greater than about 50% in New World monkeys (see Fogg, et al (2005) J. Virol 79:10069).
- the NHP is a simian (i.e., one of the infraorder Simiiformes).
- the NHP is identified by ITIS Taxonomic Serial No.: 943778 (Simiiformes).
- the NHP is a species selected from the group consisting of species listed in Table 1. Table 1: NHP species in the infraorder Simiiformes (ITIS Taxonomic Serial No.: 943778)
- the NHP is of the Superfamily Hominoidea (also referred to as apes). In some embodiments, the NHP is identified by ITIS Taxonomic Serial No.: 943782. In some embodiments, the NHP is a species selected from the group consisting of species listed in Table 2. Table 2: NHP species in the Superfamily Hominoidea (ITIS Taxonomic Serial No.: 943782) In some embodiments, the NHP is of the Superfamily Cercopithecoidea (also referred to as Old World Monkeys). In some embodiments, the NHP is identified by ITIS Taxonomic Serial No.: 943783.
- the NHP is a species selected from the group consisting of species listed in Table 3.
- Table 3: NHP species in the Superfamily Cercopithecoidea (ITIS Taxonomic Serial No.: 943783)
- the NHP is of the Parvorder Platyrrhini (also referred to as New World Monkeys).
- the NHP is selected from the group consisting of species set forth in Table 4.
- the NHP of the Parvorder Platyrrhini is of the Family Callitrichidae.
- the NHP of the Parvorder Platyrrhini is of the Family Cebidae.
- the NHP of the Parvorder Platyrrhini is of the Family Aotidae. In some embodiments, the NHP of the Parvorder Platyrrhini is of the Family Pitheciidae. In some embodiments, the NHP of the Parvorder Platyrrhini is of the Family Atelidae. Table 4: NHP species in the Parvorder Platyrrhini
- the NHP is of the Family Callitrichidae. In some embodiments, the NHP is identified by ITIS Taxonomic Serial No.: 572774. In some embodiments, the NHP is a species selected from Callimico goeldii, Callithrix aurita, Callithrix flaviceps, Callithrix geoffroyi, Callithrix jacchus, Callithrix kuhlii, Callithrix penicillata, Cebuella niveiventris, Cebuella pygmaea, Leontocebus cruzlimai, Leontocebus fuscicollis, Leontocebus fuscus, Leontocebus illigeri, Leontocebus lagonotus, Leontocebus leucogenys, Leontocebus nigricollis, Leontocebus nigrifrons, Leontocebus tripartitus, Leontocebus weddelli, Leontopithecus cais
- the NHP is of the Family Cebidae. In some embodiments, the NHP is identified by ITIS Taxonomic Serial No.: 180093. In some embodiments, the NHP is a species selected from Cebus aequatorialis, Cebus albifrons, Cebus brunneus, Cebus capucinus, Cebus castaneus, Cebus cesarae, Cebus cuscinus, Cebus imitator, Cebus kaapori, Cebus leucocephalus, Cebus malitiosus, Cebus olivaceus, Cebus unicolor, Cebus versicolor, Cebus yuracus, Saimiri boliviensis, Saimiri cassiquiarensis, Saimiri collinsi, Saimiri macrodon, Saimiri oerstedii, Saimiri sciureus, Saimiri ustus, Saimiri vanzolinii, Sapajus apella, Sapajus ca
- the NHP is of the Family Aotidae. In some embodiments, the NHP is identified by ITIS Taxonomic Serial No.: 943784. In some embodiments, the NHP is a species selected from Aotus azarae, Aotus brumbacki, Aotus griseimembra, Aotus jorgehernandezi, Aotus lemurinus, Aotus miconax, Aotus nancymai, Aotus nigriceps, Aotus trivirgatus, Aotus vociferans, and Aotus zonalis.
- the NHP is of the Family Pitheciidae. In some embodiments, the NHP is identified by ITIS Taxonomic Serial No.: 612144. In some embodiments, the NHP is a species selected from Cacajao amuna, Cacajao ayresi, Cacajao calvus, Cacajao hosomi, Cacajao melanocephalus, Cacajao novaesi, Cacajao rubicundus, Cacajao ucayalii, Callicebus barbarabrownae, Callicebus coimbrai, Callicebus melanochir, Callicebus nigrifrons, Callicebus personatus, Cheracebus lucifer, Cheracebus lugens, Cheracebus medemi, Cheracebus regulus, Cheracebus torquatus, Chiropotes albinasus, Chiropotes chiropotes, Chiropotes israelita, Chiropotes
- the NHP is of the Family Atelidae. In some embodiments, the NHP is identified by ITIS Taxonomic Serial No.: 943785. In some embodiments, the NHP is a species selected from Alouatta arctoidea, Alouatta belzebul, Alouatta caraya, Alouatta discolor, Alouatta guariba, Alouatta macconnelli, Alouatta nigerrima, Alouatta palliata, Alouatta pigra, Alouatta sara, Alouatta seniculus, Alouatta ululata, Ateles belzebuth, Ateles chamek, Ateles fusciceps, Ateles geoffroyi, Ateles hybridus, Ateles marginatus, Ateles paniscus, Brachyteles arachnoides, Brachyteles hypoxanthus, Lagothrix flavicauda, and Lagothrix lagothricha.
- the NHP is characterized as a host of a virus in the subfamily Gammaherpesvirinae. In some embodiments, the NHP is characterized as a host to a virus of the genus Lymphocryptovirus (LCV). In some embodiments, the NHP is characterized as a host to an LCV identified by the International Committee on Taxonomy of Viruses (ICTV). Exemplary LCVs identified by the ICTV include, but are not limited to, those listed in Table 5.
- ICTV: International Committee on Taxonomy of Viruses
- the NHP is characterized as a host of an LCV infection, but for which a virus obtained from the NHP has not been identified by the ICTV as belonging to the genus Lymphocryptovirus.
- the virus obtained from the NHP shares substantial sequence homology with EBV or an NHP LCV described herein.
- NHPs include, but are not limited to, Saguinus midas, Saimiri sciureus, Pithecia pithecia, Alouatta seniculus, Hylobates leucogenys, Hylobates lar, Semnopithecus entellus, Mandrillus sphinx, Colobus guereza, Piliocolobus badius, Cercocebus aterrimus, Macaca mulatta, Macaca fascicularis, Macaca fuscata, Macaca silenus, Macaca sylvanus, Macaca tibetana, Erythrocebus patas, Cebus albifrons, Ateles paniscus, or Callithrix penicillata (see, e.g., de Thoisy, et al (2003) J Virol 77:9009; Ehlers, et al (2003) J.
- the NHP is characterized as a host of an LCV infection, wherein a portion of the LCV genomic sequence is available (see, e.g., Ehlers, et al (2010) J Gen Virol 91:630-642).
- Exemplary NHPs include, but are not limited to, those listed in Table 6. Table 6: LCVs from NHPs having partially assembled genomes
- the method comprises obtaining a genomic sequence of an LCV of an NHP.
- the genomic sequence is identified in a public genome database, such as GenBank.
- the genomic sequence is identified by sequencing an infected cell obtained from an NHP.
- LCVs primarily infect B lymphocytes, which provide a source material from which viral DNA is obtained. While B lymphocytes may be harvested, isolated, and used directly for untargeted NGS sequencing, the relative sequencing depth needed would result in high cost.
- LCLs: lymphoblastoid cell lines
- An exemplary method to generate an LCL includes, but is not limited to, the following. Peripheral blood is collected from an infected NHP and mononuclear cells are isolated using standard methods (e.g., density centrifugation using Ficoll-Hypaque gradients).
- the cells are cultured for a number of weeks using standard methods (e.g., 10^6 cells per mL in RPMI-1640 medium for 6 weeks), to isolate a proliferative population highly enriched for LCV-transformed cells (e.g., defined as surviving and replicating under conditions in which non-transformed B lymphocytes would not substantially survive and replicate).
- monoclonal isolation of a proliferative population is performed.
- the method is performed as described for LCV from Gorilla (see Neubauer, et al (1979) J Virol 31:845), LCV from Chimpanzee (see Gerber et al (1976) J Virol 19:1090), and LCV from Marmoset (Deinhardt et al (1979) Primates Med 10:163).
- An exemplary method to isolate an LCV episome includes, but is not limited to, restriction digest of LCV DNA (e.g., LCV DNA isolated from an LCL), ligation thereof into a cosmid destination vector, and subsequent amplification.
- the method is performed as described for cloning LCVs from Marmoset (see Rivailler et al (2002) J Virol 76:12055 and Gyu Cho et al (2001) PNAS 98:1224) and from Rhesus (see Rivailler et al (2002) J Virol 76:421).
- a further exemplary method to isolate an LCV episome includes those used to isolate extrachromosomal circular DNA.
- the method comprises isolating nuclei, harvesting DNA from the nuclei (e.g., using a plasmid midi-prep kit), digesting single-stranded DNA (e.g., using Exo VII), and removing linear double stranded DNA (e.g., using ATP-dependent DNase).
- the method is one described in Gagne, Isolation of circular DNA from cell culture, world wide web: protocols.io/view/isolation-of-circular-dna-from-cell-culture-iwacfae; Quinn and Trevor (1997) BioTechniques 23:1044; Moller (2020) Methods Mol Biol 2119:165; Moller, et al (2016) J Vis Exp 110:54239.
- LCV DNA (e.g., LCV DNA harvested from an infected NHP B cell, an LCL, or an LCV episome)
- NGS: next generation sequencing
- LCV DNA is used to generate a whole genome sequencing library that is sequenced by NGS (see, e.g., methods for genomic sequencing of a lymphoblastoid cell line as described in Garcia-Perez, et al (2021) Nat Comm 12:3116).
- the LCV DNA is sequenced using a long read platform (i.e., Oxford Nanopore or PacBio HiFi).
- a long read platform provides the benefit of resolving sequence elements present in the LCV DNA (e.g., repetitive elements such as the Gly-Arg-rich region in the EBNA1 homolog or an array of DBEs).
- a long read platform provides the sequence for an EBNA1 homolog encoded by the LCV genome.
- the LCV sequencing data is assembled according to methods known in the art.
- the raw sequencing data is converted to a standard nucleotide sequence format (e.g., FASTQ or FASTA) using a base-calling approach suitable to the sequencing platform used to generate the data.
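Base-calling itself is platform-specific and is not shown here. As a small illustration of the downstream format handling only, the sketch below converts already base-called reads from FASTQ to FASTA, assuming the simple four-line-per-record FASTQ layout. The function name and the example reads are hypothetical.

```python
def fastq_to_fasta(fastq_text: str) -> str:
    """Convert four-line FASTQ records to FASTA, dropping quality strings."""
    lines = [ln for ln in fastq_text.splitlines() if ln.strip()]
    if len(lines) % 4 != 0:
        raise ValueError("expected four lines per FASTQ record")
    fasta_lines = []
    for i in range(0, len(lines), 4):
        header, seq, plus, qual = lines[i:i + 4]
        # Basic sanity checks on the record layout.
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed record at line {i + 1}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch at line {i + 1}")
        fasta_lines.append(">" + header[1:])
        fasta_lines.append(seq)
    return "\n".join(fasta_lines) + "\n"


example = "@read1\nACGT\n+\nIIII\n@read2\nGGCC\n+\n!!!!\n"
print(fastq_to_fasta(example), end="")
```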
- the LCV sequencing data is obtained from a long-read platform (i.e., Oxford Nanopore or PacBio HiFi) and the sequence of the EBNA1 homolog is determined without genome assembly.
- the LCV sequencing data is obtained from a long-read platform or a short-read platform, and a primary LCV genome assembly is generated.
- the LCV genome assembly is generated using a de novo assembler program known in the art.
- Exemplary de novo assembler programs include, but are not limited to, Velvet (see Zerbino (2010) Curr Protoc Bioinformatics Unit 11.5).
- the VelvetOptimiser script is used to automatically determine optimal assembly parameters for the generated raw read data.
- identification of a DBD of an EBNA1 homolog is achieved using a fully assembled or a partially assembled LCV genome, so long as the partially assembled LCV genome comprises a coding sequence for an EBNA1 homolog or fragment thereof comprising the DBD.
- the EBNA1 homolog in the LCV genome is determined using an alignment method. Methods for generating sequence alignments are known in the art.
- all open reading frames (ORFs) are identified by searching the LCV genome for all instances of ATG followed by an in-frame stop codon, with a minimum number of codons in-between (e.g., a minimum of about 200 codons).
- each ORF nucleotide sequence is converted to a corresponding amino acid sequence.
- Algorithms to extract ORFs from genomic sequence data are known in the art and include, but are not limited to, orfipy (see Singh and Wurtele (2021) Bioinformatics 37:3019).
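The ORF search described above can be sketched in a few lines of Python. This is a minimal illustration, not the method of orfipy or of the applicants: the function names `find_orfs`, `translate`, and `reverse_complement` are hypothetical, the standard genetic code is assumed, only the first ATG upstream of each stop codon is reported (rather than every ATG), and coordinates on the minus strand refer to the reverse-complemented sequence.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
STOPS = {c for c, a in CODON_TABLE.items() if a == "*"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (non-ACGT symbols are kept)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(nt):
    """Translate codon by codon; unknown codons (e.g. containing N) become 'X'."""
    return "".join(CODON_TABLE.get(nt[i:i + 3], "X") for i in range(0, len(nt) - 2, 3))


def find_orfs(genome, min_codons=200):
    """Yield (strand, start, end, protein) for ATG...stop ORFs in all six frames.

    start/end are 0-based, end-exclusive, and include the stop codon.
    min_codons counts the codons from the ATG up to (not including) the stop;
    the default of 200 follows the example threshold given in the text.
    """
    genome = genome.upper()
    for strand, seq in (("+", genome), ("-", reverse_complement(genome))):
        for frame in range(3):
            start = None
            for i in range(frame, len(seq) - 2, 3):
                codon = seq[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOPS:
                    if (i - start) // 3 >= min_codons:
                        yield strand, start, i + 3, translate(seq[start:i])
                    start = None


# Toy input (not LCV sequence), with a lowered threshold for demonstration.
for orf in find_orfs("CCATGGCTGCTGCTTAAGG", min_codons=2):
    print(orf)
```

The translated ORFs produced this way are the inputs that would then be aligned to wild-type EBNA1, for example with protein BLAST.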
- the ORF amino acid sequences obtained from the LCV genome are then aligned to wild-type EBNA1 using an alignment tool, e.g., protein BLAST. Performing a sequence alignment is within the skill of the ordinary artisan using, e.g., resources publicly available via the world wide web: blast.ncbi.nlm.nih.gov.
- the sequence alignment requires a full-length protein.
- a challenge in assembling EBNA1 homologs is the presence of sequence elements homologous to the Gly-Arg-rich region of wild-type EBNA1, which in some cases precludes simple full-length EBNA1 alignment from NGS data.
- TBLASTN is used. TBLASTN is a mode of BLAST that aligns a protein sequence query to a nucleotide reference sequence, wherein the nucleotide reference sequence is translated in all six frames and the regions of the genome nucleotide sequence are returned that have similarity to the DBD query.
- the returned genomic nucleotide sequences are curated by extending the codon sequence in both orientations until a full or partial ORF is identified.
- the protein sequence query is EBNA1 or a portion thereof (e.g., a DBD thereof).
- the protein sequence query is an EBNA1 homolog or portion thereof (e.g., a DBD thereof).
- the nucleotide reference sequence is an LCV genome, e.g., an LCV genome assembled according to a method described herein.
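As a rough sketch of how TBLASTN results might be post-processed, the following Python parses BLAST+ tabular output (`-outfmt 6`, default twelve columns) into genomic intervals that could then be extended to a full or partial ORF. The file names in the comment, the `Hit` class, the function name, and the E-value cutoff are all illustrative assumptions and are not taken from the source.

```python
from dataclasses import dataclass

# Assumed invocation (BLAST+ must be installed; file names are hypothetical):
#   tblastn -query ebna1_dbd.faa -subject lcv_assembly.fasta -outfmt 6 -evalue 1e-5
# Default column order of `-outfmt 6`:
FIELDS = ("qseqid sseqid pident length mismatch gapopen "
          "qstart qend sstart send evalue bitscore").split()


@dataclass
class Hit:
    contig: str
    start: int     # 1-based, lower genomic coordinate
    end: int       # 1-based, higher genomic coordinate
    strand: str    # "+" or "-"
    pident: float
    evalue: float


def parse_tblastn_tabular(text, max_evalue=1e-5):
    """Parse tabular TBLASTN output into Hit records, best E-value first."""
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        row = dict(zip(FIELDS, line.split("\t")))
        sstart, send = int(row["sstart"]), int(row["send"])
        evalue = float(row["evalue"])
        if evalue > max_evalue:
            continue
        # For translated subjects, sstart > send indicates a minus-strand frame.
        strand = "+" if sstart <= send else "-"
        hits.append(Hit(row["sseqid"], min(sstart, send), max(sstart, send),
                        strand, float(row["pident"]), evalue))
    return sorted(hits, key=lambda h: h.evalue)


# Made-up example rows, for illustration only.
sample = (
    "dbd\tcontig1\t62.5\t120\t45\t0\t1\t120\t5000\t5359\t1e-40\t150\n"
    "dbd\tcontig2\t40.0\t50\t30\t0\t10\t59\t900\t751\t2e-6\t45\n"
)
for hit in parse_tblastn_tabular(sample):
    print(hit)
```

Each returned interval marks a region of the genome with similarity to the DBD query; extending it codon by codon in both directions until a start or stop codon is reached corresponds to the curation step described above.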
- a partial ORF identified in the LCV genome is sufficient to determine the sequence of an EBNA1 homolog DBD encoded therein.
- a partial ORF is extended by generating primers to extend into and amplify the unknown sequence of the LCV genome using a known sequencing method (e.g., Sanger sequencing or long-read sequencing).
- Exemplary EBNA1 Homologs [00244]
- the EBNA1 homolog is derived from an LCV infecting an NHP of the infraorder Simiiformes.
- the EBNA1 homolog is derived from an LCV infecting an NHP of the superfamily Hominoidea.
- the EBNA1 homolog is derived from an LCV infecting an NHP of the superfamily Cercopithecoidea.
- the EBNA1 homolog is derived from an LCV identified in Table 5. In some embodiments, the EBNA1 homolog is derived from an LCV identified in Table 6. In some embodiments, the EBNA1 homolog is derived from an LCV infecting an NHP of the parvorder Platyrrhini. In some embodiments, the NHP of the parvorder Platyrrhini is of the family Callitrichidae. In some embodiments, the NHP of the parvorder Platyrrhini is of the family Cebidae. In some embodiments, the NHP of the parvorder Platyrrhini is of the family Aotidae. In some embodiments, the NHP of the parvorder Platyrrhini is of the family Pitheciidae.
- the NHP of the parvorder Platyrrhini is of the family Atelidae.
- the EBNA1 homolog comprises a sequence having at least about 50% similarity to a wild-type EBNA1 (e.g., a wild-type EBNA1 polypeptide as set forth in SEQ ID NO: 1). In some embodiments, the EBNA1 homolog comprises a sequence having at least about 50%, about 60%, about 70%, about 80%, about 90% or about 95% similarity to a wild-type EBNA1 (e.g., a wild-type EBNA1 polypeptide as set forth in SEQ ID NO: 1).
- the EBNA1 homolog comprises a sequence having at least about 50% to about 70% similarity to a wild-type EBNA1 (e.g., a wild-type EBNA1 polypeptide as set forth in SEQ ID NO: 1). In some embodiments, the EBNA1 homolog comprises a sequence having at least about 55% to about 65% similarity to a wild-type EBNA1 (e.g., a wild-type EBNA1 polypeptide as set forth in SEQ ID NO: 1).
- the EBNA1 homolog comprises a sequence having at least about 50% identity to a wild-type EBNA1 (e.g., a wild-type EBNA1 polypeptide as set forth in SEQ ID NO: 1). In some embodiments, the EBNA1 homolog comprises a sequence having at least about 50%, about 60%, about 70%, about 80%, about 90% or about 95% identity to a wild-type EBNA1 (e.g., a wild-type EBNA1 polypeptide as set forth in SEQ ID NO: 1).
- the EBNA1 homolog comprises a sequence having at least about 50% to about 70% identity to a wild-type EBNA1 (e.g., a wild-type EBNA1 polypeptide as set forth in SEQ ID NO: 1). In some embodiments, the EBNA1 homolog comprises a sequence having at least about 55% to about 65% identity to a wild-type EBNA1 (e.g., a wild-type EBNA1 polypeptide as set forth in SEQ ID NO: 1). [00247] In some embodiments, the EBNA1 homolog comprises a sequence lacking a T cell epitope present in EBNA1.
- the EBNA1 homolog comprises a DBD, wherein the DBD comprises a sequence lacking a T cell epitope present in EBNA1. In some embodiments, the EBNA1 homolog comprises a sequence having no more than one T cell epitope present in EBNA1. In some embodiments, the EBNA1 homolog comprises a DBD, wherein the DBD comprises a sequence having no more than one T cell epitope present in EBNA1. In some embodiments, the EBNA1 homolog comprises a sequence having no more than two T cell epitopes present in EBNA1. In some embodiments, the EBNA1 homolog comprises a DBD, wherein the DBD comprises a sequence having no more than two T cell epitopes present in EBNA1.
- the EBNA1 homolog comprises a sequence element having sequence homology to a T cell epitope present in EBNA1, wherein the sequence element comprises at least 1, 2, 3, 4, or 5 mismatches relative to the T cell epitope.
- a T cell epitope present in EBNA e.g., a T cell epitope present in the EBNA1 DBD
- Table 26 e.g., the T cell epitope present in EBNA1 is set forth in Table 26.
- the T cell epitope present in EBNA1 is selected from SEQ ID NOs: 221, 227, 233, 239, 246, 252, 258, 264, 270, and 276.
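The mismatch criterion for epitope-like sequence elements can be illustrated with a simple ungapped window scan. This is a sketch under stated assumptions: the function names are hypothetical, the comparison is a plain Hamming distance over windows of the epitope's length, and the epitope string used in the example is a placeholder, not one of the Table 26 sequences (which are not reproduced in this section).

```python
def min_mismatches(protein, epitope):
    """Return (fewest mismatches, 1-based window start) over all ungapped windows."""
    k = len(epitope)
    if k == 0 or len(protein) < k:
        raise ValueError("protein must be at least as long as a non-empty epitope")
    best = None
    for i in range(len(protein) - k + 1):
        distance = sum(a != b for a, b in zip(protein[i:i + k], epitope))
        if best is None or distance < best[0]:
            best = (distance, i + 1)
    return best


def lacks_epitope(protein, epitope, min_required_mismatches=1):
    """True if every window differs from the epitope by at least the given count."""
    return min_mismatches(protein, epitope)[0] >= min_required_mismatches


# Placeholder sequences for illustration only.
placeholder_epitope = "ACDEFGHIK"
print(min_mismatches("MMACDQFGHLKMM", placeholder_epitope))
print(lacks_epitope("MMACDQFGHLKMM", placeholder_epitope, min_required_mismatches=2))
```

A homolog sequence would be screened once per epitope of interest; raising `min_required_mismatches` from 1 to 5 mirrors the range of mismatch counts recited above.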
- the EBNA1 homolog comprises a sequence having at least about 50% similarity to EBNA1, wherein the sequence lacks a T cell epitope present in EBNA1. In some embodiments, the EBNA1 homolog comprises a sequence having at least about 50% identity to EBNA1, wherein the sequence lacks a T cell epitope present in EBNA1. In some embodiments, the EBNA1 homolog comprises a sequence having at least about 50% similarity to EBNA1, wherein the sequence comprises no more than one T cell epitope present in EBNA1.
- the EBNA1 homolog comprises a sequence having at least about 50% identity to EBNA1, wherein the sequence comprises no more than one T cell epitope present in EBNA1. In some embodiments, the EBNA1 homolog comprises a sequence having at least about 50% similarity to EBNA1, wherein the sequence comprises no more than two T cell epitopes present in EBNA1. In some embodiments, the EBNA1 homolog comprises a sequence having at least about 50% identity to EBNA1, wherein the sequence comprises no more than two T cell epitopes present in EBNA1.
- the EBNA1 homolog comprises a DBD, wherein the DBD comprises a sequence having at least about 50% similarity to an EBNA1 DBD described herein, wherein the sequence lacks a T cell epitope present in the EBNA1 DBD. In some embodiments, the EBNA1 homolog comprises a DBD, wherein the DBD comprises a sequence having at least about 50% identity to an EBNA1 DBD described herein, wherein the sequence lacks a T cell epitope present in the EBNA1 DBD.
- the EBNA1 homolog comprises a DBD, wherein the DBD comprises a sequence having at least about 50% similarity to an EBNA1 DBD described herein, wherein the sequence comprises no more than one T cell epitope present in the EBNA1 DBD. In some embodiments, the EBNA1 homolog comprises a DBD, wherein the DBD comprises a sequence having at least about 50% identity to an EBNA1 DBD described herein, wherein the sequence comprises no more than one T cell epitope present in the EBNA1 DBD.
- the EBNA1 homolog comprises a DBD, wherein the DBD comprises a sequence having at least about 50% similarity to an EBNA1 DBD described herein, wherein the sequence comprises no more than two T cell epitopes present in the EBNA1 DBD. In some embodiments, the EBNA1 homolog comprises a DBD, wherein the DBD comprises a sequence having at least about 50% identity to an EBNA1 DBD described herein, wherein the sequence comprises no more than two T cell epitopes present in the EBNA1 DBD.
- the EBNA1 homolog comprises an amino acid sequence having at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or 100% identity to SEQ ID NO: 282. In some embodiments, the EBNA1 homolog comprises an amino acid sequence having at least about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 282. In some embodiments, the EBNA1 homolog comprises SEQ ID NO: 282.
- the EBNA1 homolog comprises an amino acid sequence encoded by a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 320. In some embodiments, the EBNA1 homolog comprises an amino acid sequence encoded by a nucleotide sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 320. In some embodiments, the EBNA1 homolog comprises an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 320.
- the EBNA1 homolog comprises an amino acid sequence having at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or 100% identity to SEQ ID NO: 283. In some embodiments, the EBNA1 homolog comprises an amino acid sequence having at least about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 283. In some embodiments, the EBNA1 homolog comprises SEQ ID NO: 283.
- the EBNA1 homolog comprises an amino acid sequence encoded by a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 321. In some embodiments, the EBNA1 homolog comprises an amino acid sequence encoded by a nucleotide sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 321. In some embodiments, the EBNA1 homolog comprises an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 321.
- the EBNA1 homolog comprises an amino acid sequence having at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or 100% identity to SEQ ID NO: 194. In some embodiments, the EBNA1 homolog comprises an amino acid sequence having at least about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 194. In some embodiments, the EBNA1 homolog comprises SEQ ID NO: 194.
- the EBNA1 homolog comprises an amino acid sequence encoded by a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 205. In some embodiments, the EBNA1 homolog comprises an amino acid sequence encoded by a nucleotide sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 205. In some embodiments, the EBNA1 homolog comprises an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 205.
- the EBNA1 homolog comprises an amino acid sequence having at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or 100% identity to SEQ ID NO: 195. In some embodiments, the EBNA1 homolog comprises an amino acid sequence having at least about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 195. In some embodiments, the EBNA1 homolog comprises SEQ ID NO: 195.
- the EBNA1 homolog comprises an amino acid sequence encoded by a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 207. In some embodiments, the EBNA1 homolog comprises an amino acid sequence encoded by a nucleotide sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 207. In some embodiments, the EBNA1 homolog comprises an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 207.
- the EBNA1 homolog comprises an amino acid sequence having at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or 100% identity to SEQ ID NO: 196. In some embodiments, the EBNA1 homolog comprises an amino acid sequence having at least about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 196. In some embodiments, the EBNA1 homolog comprises SEQ ID NO: 196.
- the EBNA1 homolog comprises an amino acid sequence encoded by a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 209. In some embodiments, the EBNA1 homolog comprises an amino acid sequence encoded by a nucleotide sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 209. In some embodiments, the EBNA1 homolog comprises an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 209.
- the EBNA1 homolog comprises an amino acid sequence having at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or 100% identity to SEQ ID NO: 197. In some embodiments, the EBNA1 homolog comprises an amino acid sequence having at least about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 197. In some embodiments, the EBNA1 homolog comprises SEQ ID NO: 197.
- the EBNA1 homolog comprises an amino acid sequence encoded by a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 211. In some embodiments, the EBNA1 homolog comprises an amino acid sequence encoded by a nucleotide sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 211. In some embodiments, the EBNA1 homolog comprises an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 211.
- the EBNA1 homolog comprises an amino acid sequence having at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or 100% identity to SEQ ID NO: 311. In some embodiments, the EBNA1 homolog comprises an amino acid sequence having at least about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 311. In some embodiments, the EBNA1 homolog comprises SEQ ID NO: 311.
- the EBNA1 homolog comprises an amino acid sequence encoded by a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 313. In some embodiments, the EBNA1 homolog comprises an amino acid sequence encoded by a nucleotide sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 313. In some embodiments, the EBNA1 homolog comprises an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 313.
- the DNA binding protein comprises a DBD of an EBNA1 homolog described herein, or a variant thereof or a fragment thereof.
- the first, second, and third sequence motifs correspond to regions of the EBNA1 homolog having substantial sequence similarity to regions in wild-type EBNA1.
- w is 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, or 60.
- x is 6, 7, 8, 9, or 10.
- y is 4, 5, 6, 7, or 8.
- z is 48, 49, 50, 51, 52, or 53.
- the first sequence motif is at least about 5, 6, 7, 8, 9, 10, 11, or 12 amino acid residues in length. In some embodiments, the first sequence motif is about 10 amino acid residues in length.
- the first sequence motif comprises or consists of a sequence having at least about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% similarity to amino acid residues 514 to 523 of wild-type EBNA1 (e.g., amino acid residues 514 to 523 of SEQ ID NO: 1).
- the first sequence motif comprises or consists of a sequence having at least about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to amino acid residues 514 to 523 of wild-type EBNA1 (e.g., amino acid residues 514 to 523 of SEQ ID NO: 1).
- the first sequence motif comprises or consists of amino acid residues 514 to 523 of SEQ ID NO: 1. [00258] In some embodiments, the first sequence motif comprises or consists of a sequence having at least about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% similarity to amino acid residues 514 to 523 as defined in a consensus sequence set forth in Table 7. In some embodiments, the first sequence motif comprises or consists of a sequence having at least about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to amino acid residues 514 to 523 as defined in a consensus sequence set forth in Table 7.
- the first sequence motif comprises or consists of amino acid residues 514 to 523 as defined in a consensus sequence set forth in Table 7. [00259] In some embodiments, the first sequence motif comprises or consists of a sequence having at least about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% similarity to KX48X49X50YX51LRRX52 (SEQ ID NO: 357), wherein X48, X49, X50, X51, and X52 are defined as in consensus 1 set forth in Table 7.
- the first sequence motif comprises or consists of a sequence having at least about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% similarity to KX48X49X50YX51LRRX52 (SEQ ID NO: 284), wherein X48, X49, X50, X51, and X52 are defined as in consensus 2 set forth in Table 7.
- the first sequence motif comprises or consists of a sequence having at least about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to KX48X49X50YX51LRRX52 (SEQ ID NO: 357), wherein X48, X49, X50, X51, and X52 are defined as in consensus 1 set forth in Table 7.
- the first sequence motif comprises or consists of a sequence having at least about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to KX48X49X50YX51LRRX52 (SEQ ID NO: 284), wherein X48, X49, X50, X51, and X52 are defined as in consensus 2 set forth in Table 7.
- the first sequence motif comprises or consists of KX48X49X50YX51LRRX52 (SEQ ID NO: 357), wherein X48, X49, X50, X51, and X52 are defined as in consensus 1 set forth in Table 7.
- the first sequence motif comprises or consists of KX48X49X50YX51LRRX52 (SEQ ID NO: 284), wherein X48, X49, X50, X51, and X52 are defined as in consensus 2 set forth in Table 7.
- X48 is T, I, N, or W;
- X49 is C, S, or P;
- X50 is V, L, C, or I;
- X51 is N or S; and
- X52 is C, G, or A.
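One way to make the first-motif consensus concrete is to express it as a regular expression, with each variable position written as a character class. The sketch below assumes the residue options listed immediately above for X48 to X52; which Table 7 consensus those options belong to is not stated in this excerpt, and the names `FIRST_MOTIF` and `find_first_motif` are hypothetical.

```python
import re

# K X48 X49 X50 Y X51 L R R X52, using the residue options listed above.
FIRST_MOTIF = re.compile(r"K[TINW][CSP][VLCI]Y[NS]LRR[CGA]")

# Exemplary first-motif sequences recited in this section, keyed by SEQ ID NO.
EXAMPLES = {
    287: "KTSLYNLRRG",
    288: "KTCCYNLRRC",
    289: "KIPIYNLRRG",
    290: "KTSCYNLRRC",
    291: "KTCVYNLRRC",
    292: "KNSCYNLRRC",
    293: "KWPLYSLRRA",
}


def find_first_motif(protein):
    """Return (1-based start, matched text) for each consensus match in a protein."""
    return [(m.start() + 1, m.group()) for m in FIRST_MOTIF.finditer(protein)]


for seq_id, motif in EXAMPLES.items():
    print(seq_id, motif, bool(FIRST_MOTIF.fullmatch(motif)))
```

An exact pattern match corresponds to the "comprises or consists of" embodiments; the percent-similarity and percent-identity embodiments are broader and are not captured by a regular expression.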
- the first sequence motif comprises or consists of a sequence having at least about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% similarity to a sequence selected from KTSLYNLRRG (SEQ ID NO: 287), KTCCYNLRRC (SEQ ID NO: 288), KIPIYNLRRG (SEQ ID NO: 289), KTSCYNLRRC (SEQ ID NO: 290), KTCVYNLRRC (SEQ ID NO: 291), KNSCYNLRRC (SEQ ID NO: 292), and KWPLYSLRRA (SEQ ID NO: 293).
- the first sequence motif comprises or consists of a sequence having at least about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to a sequence selected from KTSLYNLRRG (SEQ ID NO: 287), KTCCYNLRRC (SEQ ID NO: 288), KIPIYNLRRG (SEQ ID NO: 289), KTSCYNLRRC (SEQ ID NO: 290), KTCVYNLRRC (SEQ ID NO: 291), KNSCYNLRRC (SEQ ID NO: 292), and KWPLYSLRRA (SEQ ID NO: 293).
- the first sequence motif comprises or consists of a sequence selected from KTSLYNLRRG (SEQ ID NO: 287), KTCCYNLRRC (SEQ ID NO: 288), KIPIYNLRRG (SEQ ID NO: 289), KTSCYNLRRC (SEQ ID NO: 290), KTCVYNLRRC (SEQ ID NO: 291), KNSCYNLRRC (SEQ ID NO: 292), and KWPLYSLRRA (SEQ ID NO: 293).
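The percent-identity thresholds can be illustrated on the ten-residue motifs themselves. The sketch below assumes an ungapped, position-by-position comparison of equal-length sequences, which is a simplification of what an alignment tool reports; percent similarity would additionally require a substitution matrix and is not computed here. The function names are hypothetical.

```python
def percent_identity(a, b):
    """Percent of positions with identical residues in two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def meets_identity(candidate, reference, threshold):
    """True if the candidate is at least `threshold` percent identical to the reference."""
    return percent_identity(candidate, reference) >= threshold


# Two of the exemplary first-motif sequences recited above.
seq_287 = "KTSLYNLRRG"
seq_288 = "KTCCYNLRRC"
print(percent_identity(seq_287, seq_288))    # 7 of 10 positions match
print(meets_identity(seq_287, seq_288, 60))
```

For a ten-residue motif each substitution changes identity by 10 percentage points, so thresholds such as 95%, 98%, or 99% are only met by an exact match at that length.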
- the second sequence motif is at least about 5, 6, 7, 8, 9, 10, 11, or 12 amino acid residues in length. In some embodiments, the second sequence motif is about 10 amino acid residues in length.
- the second sequence motif comprises or consists of a sequence having at least about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% similarity to amino acid residues 532 to 541 of wild-type EBNA1 (e.g., amino acid residues 532 to 541 of SEQ ID NO: 1).
- the second sequence motif comprises or consists of a sequence having at least about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to amino acid residues 532 to 541 of wild-type EBNA1 (e.g., amino acid residues 532 to 541 of SEQ ID NO: 1).
- the second sequence motif comprises or consists of amino acid residues 532 to 541 of SEQ ID NO: 1. [00262] In some embodiments, the second sequence motif comprises or consists of a sequence having at least about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% similarity to amino acid residues 532 to 541 of a consensus sequence listed in Table 7. In some embodiments, the second sequence motif comprises or consists of a sequence having at least about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to amino acid residues 532 to 541 of a consensus sequence listed in Table 7.
- the second sequence motif comprises or consists of amino acid residues 532 to 541 of a consensus sequence listed in Table 7.
- the second sequence motif comprises or consists of a sequence having at least about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% similarity to RX 61 X 62 X 63 LX 64 RLPX 65 (SEQ ID NO: 358), wherein X 61 ; X 62 ; X 63 ; X 64 ; and X 65 are as in consensus 1 as set forth in Table 7.
- the second sequence motif comprises or consists of a sequence having at least about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% similarity to RX 61 X 62 X 63 LX 64 RLPX 65 (SEQ ID NO: 285), wherein X 61 ; X 62 ; X 63 ; X 64 ; and X 65 are as in consensus 2 as set forth in Table 7.
- the second sequence motif comprises or consists of a sequence having at least about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to RX 61 X 62 X 63 LX 64 RLPX 65 (SEQ ID NO: 358), wherein X 61 ; X 62 ; X 63 ; X 64 ; and X 65 are as in consensus 1 as set forth in Table 7.
- the second sequence motif comprises or consists of a sequence having at least about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to RX 61 X 62 X 63 LX 64 RLPX 65 (SEQ ID NO: 285), wherein X 61 ; X 62 ; X 63 ; X 64 ; and X 65 are as in consensus 2 as set forth in Table 7.
- the second sequence motif comprises or consists of RX 61 X 62 X 63 LX 64 RLPX 65 (SEQ ID NO: 358), wherein X 61 ; X 62 ; X 63 ; X 64 ; and X 65 are as in consensus 1 as set forth in Table 7.
- the second sequence motif comprises or consists of RX 61 X 62 X 63 LX 64 RLPX 65 (SEQ ID NO: 285), wherein X 61 ; X 62 ; X 63 ; X 64 ; and X 65 are as in consensus 2 as set forth in Table 7.
- X 61 is A, L, S, or I; X 62 is T or S; X 63 is P or T; X 64 is G, S, or F; and X 65 is Y or F.
- X 61 is L, A, or S; X 62 is T; X 63 is P, or T; X 64 is S, or G; and X 65 is F, or Y; G is G.
- the second sequence motif comprises or consists of a sequence having at least about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% similarity to a sequence selected from RLTPLSRLPF (SEQ ID NO: 294), RATPLSRLPY (SEQ ID NO: 295), RSTTLGRLPY (SEQ ID NO: 296), RLTPLGRLPF (SEQ ID NO: 297), RATPLGRLPY (SEQ ID NO: 298), RLTPLSRLPY (SEQ ID NO: 299), and RISPLFRLPY (SEQ ID NO: 300).
- the second sequence motif comprises or consists of a sequence having at least about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to a sequence selected from RLTPLSRLPF (SEQ ID NO: 294), RATPLSRLPY (SEQ ID NO: 295), RSTTLGRLPY (SEQ ID NO: 296), RLTPLGRLPF (SEQ ID NO: 297), RATPLGRLPY (SEQ ID NO: 298), RLTPLSRLPY (SEQ ID NO: 299), and RISPLFRLPY (SEQ ID NO: 300).
- the second sequence motif comprises or consists of a sequence selected from RLTPLSRLPF (SEQ ID NO: 294), RATPLSRLPY (SEQ ID NO: 295), RSTTLGRLPY (SEQ ID NO: 296), RLTPLGRLPF (SEQ ID NO: 297), RATPLGRLPY (SEQ ID NO: 298), RLTPLSRLPY (SEQ ID NO: 299), and RISPLFRLPY (SEQ ID NO: 300).
- the third sequence motif is at least about 5, 6, 7, 8, 9, 10, 11, or 12 amino acid residues in length. In some embodiments, the third sequence motif is about 10 amino acid residues in length.
- the third sequence motif comprises or consists of a sequence having at least about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% similarity to amino acid residues 548 to 557 of wild-type EBNA1 (e.g., amino acid residues 548 to 557 of SEQ ID NO: 1).
- the third sequence motif comprises or consists of a sequence having at least about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to amino acid residues 548 to 557 of wild-type EBNA1 (e.g., amino acid residues 548 to 557 of SEQ ID NO: 1).
- the third sequence motif comprises or consists of amino acid residues 548 to 557 of SEQ ID NO: 1. [00266] In some embodiments, the third sequence motif comprises or consists of a sequence having at least about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% similarity to amino acid residues 548 to 557 of a consensus sequence listed in Table 7. In some embodiments, the third sequence motif comprises or consists of a sequence having at least about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to amino acid residues 548 to 557 of a consensus sequence listed in Table 7.
- the third sequence motif comprises or consists of amino acid residues 548 to 557 of a consensus sequence listed in Table 7.
- the third sequence motif comprises or consists of a sequence having at least about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% similarity to GPX 71 PX 72 PX 73 X 74 ES (SEQ ID NO: 359), wherein X 71 ; X 72 ; X 73 ; and X 74 are defined as in consensus 1 set forth in Table 7.
- the third sequence motif comprises or consists of a sequence having at least about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% similarity to GPX 71 PX 72 PX 73 X 74 ES (SEQ ID NO: 286), wherein X 71 ; X 72 ; X 73 ; and X 74 are defined as in consensus 2 set forth in Table 7.
- the third sequence motif comprises or consists of a sequence having at least about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to GPX 71 PX 72 PX 73 X 74 ES (SEQ ID NO: 359), wherein X 71 ; X 72 ; X 73 ; and X 74 are defined as in consensus 1 set forth in Table 7.
- the third sequence motif comprises or consists of a sequence having at least about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to GPX 71 PX 72 PX 73 X 74 ES (SEQ ID NO: 286), wherein X 71 ; X 72 ; X 73 ; and X 74 are defined as in consensus 2 set forth in Table 7.
- the third sequence motif comprises or consists of GPX 71 PX 72 PX 73 X 74 ES (SEQ ID NO: 359), wherein X 71 ; X 72 ; X 73 ; and X 74 are defined as in consensus 1 set forth in Table 7.
- the third sequence motif comprises or consists of GPX 71 PX 72 PX 73 X 74 ES (SEQ ID NO: 286), wherein X 71 ; X 72 ; X 73 ; and X 74 are defined as in consensus 2 set forth in Table 7.
- X 71 is Q or E; X 72 is G or T; X 73 is L, M, or I; and X 74 is R, K, M, or L.
- X 71 is Q, or E; X 72 is G, or T; X 73 is L, or M; and X 74 is R, K, or M.
- the third sequence motif comprises or consists of a sequence having at least about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% similarity to a sequence selected from GPQPGPLRES (SEQ ID NO: 301), GPQPGPLKES (SEQ ID NO: 302), GPQPGPMRES (SEQ ID NO: 303), GPEPTPLMES (SEQ ID NO: 304), and GPQPGPILES (SEQ ID NO: 305).
- the third sequence motif comprises or consists of a sequence having at least about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to a sequence selected from GPQPGPLRES (SEQ ID NO: 301), GPQPGPLKES (SEQ ID NO: 302), GPQPGPMRES (SEQ ID NO: 303), GPEPTPLMES (SEQ ID NO: 304), and GPQPGPILES (SEQ ID NO: 305).
- the third sequence motif comprises or consists of a sequence selected from GPQPGPLRES (SEQ ID NO: 301), GPQPGPLKES (SEQ ID NO: 302), GPQPGPMRES (SEQ ID NO: 303), GPEPTPLMES (SEQ ID NO: 304), and GPQPGPILES (SEQ ID NO: 305).
- w is 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, or 60.
- x is 6, 7, 8, 9, or 10.
- y is 4, 5, 6, 7, or 8.
- z is 48, 49, 50, 51, 52, or 53.
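The spacer lengths w, x, y, and z can be combined with the three motif consensuses into a single pattern for scanning a candidate protein. The sketch below rests on several assumptions: a linear arrangement of [Xaa1]w, first motif, [Xaa2]x, second motif, [Xaa3]y, third motif, [Xaa4]z, which is inferred from the wild-type residue ranges cited in this section; the broader of the two residue-option lists given for the second and third motifs; and hypothetical names throughout. The test protein is synthetic filler around exemplary motifs, not a real DBD.

```python
import re

# Character classes built from the residue options recited for each motif.
MOTIF1 = r"K[TINW][CSP][VLCI]Y[NS]LRR[CGA]"
MOTIF2 = r"R[ALSI][TS][PT]L[GSF]RLP[YF]"
MOTIF3 = r"GP[QE]P[GT]P[LMI][RKML]ES"

# Spacers: w = 48-60, x = 6-10, y = 4-8, z = 48-53 residues of any kind.
DBD_PATTERN = re.compile(
    rf"(?P<xaa1>.{{48,60}})(?P<m1>{MOTIF1})"
    rf"(?P<xaa2>.{{6,10}})(?P<m2>{MOTIF2})"
    rf"(?P<xaa3>.{{4,8}})(?P<m3>{MOTIF3})"
    rf"(?P<xaa4>.{{48,53}})"
)


def find_dbd_architecture(protein):
    """Return spacer lengths for the first match of the assumed layout, else None."""
    match = DBD_PATTERN.search(protein)
    if match is None:
        return None
    return {name: len(match.group(name)) for name in ("xaa1", "xaa2", "xaa3", "xaa4")}


# Synthetic example: alanine filler with spacer lengths 55, 8, 6, and 50.
synthetic = ("A" * 55 + "KTSLYNLRRG" + "A" * 8 + "RLTPLSRLPF"
             + "A" * 6 + "GPQPGPLRES" + "A" * 50)
print(find_dbd_architecture(synthetic))
```

Because the spacer quantifiers are greedy and bounded, the reported lengths for the outer regions reflect the longest permitted flank at the first matching position rather than a biologically defined domain boundary.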
- [Xaa1]w comprises or consists of an amino acid sequence having at least about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% similarity to amino acid residues 459 to 513 of wild-type EBNA1 (e.g., amino acid residues 459 to 513 of SEQ ID NO: 1).
- [Xaa1]w comprises or consists of an amino acid sequence having at least about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to amino acid residues 459 to 513 of wild-type EBNA1 (e.g., amino acid residues 459 to 513 of SEQ ID NO: 1).
- [Xaa1]w comprises or consists of amino acid residues 459 to 513 of wild-type EBNA1 (e.g., amino acid residues 459 to 513 of SEQ ID NO: 1).
- [Xaa1]w comprises or consists of an amino acid sequence having at least about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% similarity to amino acid residues 459 to 513 of a consensus sequence listed in Table 7.
- [Xaa1]w comprises or consists of an amino acid sequence having at least about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to amino acid residues 459 to 513 of a consensus sequence listed in Table 7.
- [Xaa1]w comprises or consists of amino acid residues 459 to 513 of a consensus sequence listed in Table 7.
- [Xaa1]w comprises or consists of X1X2X3GGX4X5X6X7X8RGX9X10X11X12X13X14X15KX16X17X18X19X20X21X22X23X24X25LLX26RX27X28X29X30X31X32TX33X34X35X36X37WX38X39X40X41X42X43X44X45X46X47 (SEQ ID NO: 360), wherein X1; X2; X3; X4; X5; X6; X7; X8; X9; X10; X11; X12; X13; X14; X15; X16; X17; X18; X19; X20; X21; X22; X23; X24
- [Xaa1]w comprises or consists of SEQ ID NO: 306, wherein X1; X2; X3; X4; X5; X6; X7; X8; X9; X10; X11; X12; X13; X14; X15; X16; X17; X18; X19; X20; X21; X22; X23; X24; X25; X26; X27; X28; X29; X30; X31; X32; X33; X34; X35; X36; X37; X38; X39; X40; X41; X42; X43; X44; X45; X46; and X47 are defined as in consensus 2 set forth in Table 7.
- X1 is R, G, P, or K
- X2 is K or P
- X3 is K or R
- X4 is W, V, or -
- X5 is F or -
- X6 is G, -, or Y
- X7 is K, -, R, or V
- X8 is H, -, R, or G
- X9 is Q, E, or C
- X10 is G or P
- X11 is G, A, or R
- X12 is S, K, R, Y, A, or G
- X13 is -, C, or G
- X14 is N, H, F, or S
- X15 is P, G, K, or -
- X16 is F or Y
- X17 is E, T, D, or Q
- X18 is N, T, K, G, or S
- X19 is I, T, M, or L
- X20 is A or G
- X38 is V, M, P, K, G, or C;
- X39 is A, N, F, Y, or C;
- X40 is G or A;
- X41 is V or L;
- X42 is F, M, L, or I;
- X43 is V, A, or I;
- X44 is Y or V;
- X45 is G or N;
- X46 is G, L, P, or Y;
- X47 is S, -, or C; and “-” is a deletion.
- X1 is R, G, K, or P;
- X2 is K, or P;
- X3 is R, or K;
- X4 is W, or -;
- X5 is F;
- X6 is G, or Y;
- X7 is R, K, or V;
- X8 is G, R, or H;
- X9 is Q, C, or E;
- X10 is G;
- X11 is G, or R;
- X12 is S, A, R, Y, or G;
- X13 is G, or -;
- X14 is N, S, or F;
- X15 is P, K, or -;
- X16 is F, or Y;
- X17 is E, D, or Q;
- X18 is N, T, G, S, or K;
- X19 is I, M, or L;
- X20 is A, or G;
- X 21 is
- [Xaa2]x comprises or consists of an amino acid sequence having at least about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% similarity to amino acid residues 524 to 531 of wild-type EBNA1 (e.g., amino acid residues 524 to 531 of SEQ ID NO: 1).
- [Xaa2]x comprises or consists of an amino acid sequence having at least about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to amino acid residues 524 to 531 of wild-type EBNA1 (e.g., amino acid residues 524 to 531 of SEQ ID NO: 1). In some embodiments, [Xaa2]x comprises or consists of amino acid residues 524 to 531 of wild-type EBNA1 (e.g., amino acid residues 524 to 531 of SEQ ID NO: 1).
- [Xaa2]x comprises or consists of an amino acid sequence having at least about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% similarity to amino acid residues 524 to 531 of a consensus sequence listed in Table 7.
- [Xaa2]x comprises or consists of an amino acid sequence having at least about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to amino acid residues 524 to 531 of a consensus sequence listed in Table 7.
- [Xaa2]x comprises or consists of amino acid residues 524 to 531 of a consensus sequence listed in Table 7.
- [Xaa2]x comprises or consists of X53X54X55X56X57X58X59X60 (SEQ ID NO: 361), wherein X53; X54; X55; X56; X57; X58; X59; and X60 are defined in consensus 1 as set forth in Table 7.
- [Xaa2]x comprises or consists of X53X54X55X56X57X58X59X60 (SEQ ID NO: 307), wherein X53; X54; X55; X56; X57; X58; X59; and X60 are defined in consensus 2 as set forth in Table 7.
- X53 is T, L, I, or M; X54 is A, G, or S; X55 is L, C, V, or I; X56 is A or C; X57 is I, A, V, or C; X58 is P or N; X59 is Q, E, W, or G; and X60 is C, V, or G.
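As an illustration only, the position-by-position definition of X53 to X60 can be encoded as a table of allowed residues and used to check whether an 8-residue peptide fits the motif. This is a minimal sketch: the function name `matches_xaa2_motif` and the peptides in the usage lines are hypothetical and are not sequences taken from the disclosure or from Table 7.

```python
# Allowed residues per position, transcribed from the X53-X60 definitions
# in the preceding item. The names below are illustrative assumptions.
XAA2_ALLOWED = [
    set("TLIM"),  # X53
    set("AGS"),   # X54
    set("LCVI"),  # X55
    set("AC"),    # X56
    set("IAVC"),  # X57
    set("PN"),    # X58
    set("QEWG"),  # X59
    set("CVG"),   # X60
]


def matches_xaa2_motif(peptide: str) -> bool:
    """Return True if the peptide has 8 residues and each one is allowed at its position."""
    peptide = peptide.upper()
    if len(peptide) != len(XAA2_ALLOWED):
        return False
    return all(res in allowed for res, allowed in zip(peptide, XAA2_ALLOWED))


# Hypothetical peptides, used only to exercise the check.
print(matches_xaa2_motif("TALAIPQC"))  # every position is in its allowed set
print(matches_xaa2_motif("TALAIPQK"))  # K is not allowed at X60
```

A per-position lookup keeps the check easy to audit against the written definition; the same table could also be compiled into a regular expression with character classes if pattern search within a longer sequence is needed.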
- [Xaa3]y comprises or consists of an amino acid sequence having at least about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% similarity to amino acid residues 542 to 547 of wild-type EBNA1 (e.g., amino acid residues 542 to 547 of SEQ ID NO: 1).
- [Xaa3]y comprises or consists of an amino acid sequence having at least about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to amino acid residues 542 to 547 of wild-type EBNA1 (e.g., amino acid residues 542 to 547 of SEQ ID NO: 1). In some embodiments, [Xaa3]y comprises or consists of amino acid residues 542 to 547 of wild-type EBNA1 (e.g., amino acid residues 542 to 547 of SEQ ID NO: 1).
- [Xaa3]y comprises or consists of an amino acid sequence having at least about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% similarity to amino acid residues 542 to 547 of a consensus sequence listed in Table 7. In some embodiments, [Xaa3]y comprises or consists of an amino acid sequence having at least about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to amino acid residues 542 to 547 of a consensus sequence listed in Table 7. In some embodiments, [Xaa3]y comprises or consists of amino acid residues 542 to 547 of a consensus sequence listed in Table 7.
- [Xaa3]y comprises or consists of GX66X67X68X69X70 (SEQ ID NO: 362), wherein X66; X67; X68; X69; and X70 are defined in consensus 1 as set forth in Table 7.
- [Xaa3]y comprises or consists of GX66X67X68X69X70 (SEQ ID NO: 308), wherein X66; X67; X68; X69; and X70 are defined in consensus 2 as set forth in Table 7.
- X66 is M, S, Y, H, I, or T; X67 is A, S, or T; X68 is P, F, or W; X69 is G or E; and X70 is P, T, A, or G.
- X66 is M, I, Y, T, or H; X67 is A, T, or S; X68 is P or W; X69 is G or E; and X70 is P, G, A, or T.
- [Xaa4]z comprises or consists of an amino acid sequence having at least about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% similarity to amino acid residues 558 to 607 of wild-type EBNA1 (e.g., amino acid residues 558 to 607 of SEQ ID NO: 1).
- [Xaa4]z comprises or consists of an amino acid sequence having at least about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to amino acid residues 558 to 607 of wild-type EBNA1 (e.g., amino acid residues 558 to 607 of SEQ ID NO: 1). In some embodiments, [Xaa4]z comprises or consists of amino acid residues 558 to 607 of wild-type EBNA1 (e.g., amino acid residues 558 to 607 of SEQ ID NO: 1).
- [Xaa4]z comprises or consists of an amino acid sequence having at least about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% similarity to amino acid residues 558 to 607 of a consensus sequence listed in Table 7.
- [Xaa4]z comprises or consists of an amino acid sequence having at least about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to amino acid residues 558 to 607 of a consensus sequence listed in Table 7.
- [Xaa4]z comprises or consists of amino acid residues 558 to 607 of a consensus sequence listed in Table 7.
- [Xaa4]z comprises or consists of X75X76X77X78FX79X80FX81X82X83X84X85X86X87X88X89X90X91X92X93X94X95X96X97X98X99X100X101PX102PX103X104X105X106X107VX108X109X110X111FX112X113X114X115X116X117LP (SEQ ID NO: 363), wherein X75; X76; X77; X78; X79; X80; X81; X82; X83; X84; X85; X86; X87; X88; X89; X90; X91; X92; X93; X94; X95; X96; X97; X98; X99; X100; X101
- X75 is I, G, S, T, or C; X76 is V, W, D, or E; X77 is C, or S; X78 is Y; X79 is M, L, or I; X80 is V, Y, or F; X81 is L, or V; X82 is Q, N, or P; X83 is T, S, or C; X84 is H, W, P, M, H, or G; X85 is I, P, E, L, or Q; X86 is F, or S; X87 is A, or G; X88 is E, or L; X89 is V, W, or C; X90 is L, or V; X91 is K; X92 is D, or Q; X93 is A, or C; X94 is I, or V; X95 is K, L, G, or R; X96 is D, or V; X97 is L, or Y;
- the DBD comprises or consists of a sequence as defined in consensus 1 set forth in Table 7. In some embodiments, the DBD comprises or consists of a sequence as defined in consensus 2 set forth in Table 7. In some embodiments, the DBD comprises or consists of SEQ ID NO: 310. In some embodiments, the DBD comprises or consists of SEQ ID NO: 364. [00283] In some embodiments, the DBD comprises or consists of a sequence having at least about 50% identity to a wild-type EBNA1 DBD. In some embodiments, the EBNA1 comprises a sequence comprising at least about 50%, about 60%, about 70%, about 80%, about 90% or about 95% identity to a wild-type EBNA1.
- the EBNA1 homolog comprises a sequence comprising at least about 50% to about 70% identity to a wild-type EBNA1. In some embodiments, the EBNA1 homolog comprises a sequence comprising at least about 55% to about 65% identity to a wild-type EBNA1. [00284] In some embodiments, the DBD comprises or consists of an amino acid sequence having at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or 100% identity to SEQ ID NO: 215.
- the DBD comprises or consists of an amino acid sequence having at least about 90%, about 95%, about 98%, about 99% identity to SEQ ID NO: 215. In some embodiments, the DBD comprises or consists of SEQ ID NO: 215. In some embodiments, the DBD comprises or consists of an amino acid sequence encoded by a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 349.
- the DBD comprises or consists of an amino acid sequence encoded by a nucleotide sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 349. In some embodiments, the DBD comprises or consists of an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 349. [00285] In some embodiments, the DBD comprises or consists of an amino acid sequence having at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or 100% identity to SEQ ID NO: 216.
- the DBD comprises or consists of an amino acid sequence having at least about 90%, about 95%, about 98%, about 99% identity to SEQ ID NO: 216. In some embodiments, the DBD comprises or consists of SEQ ID NO: 216. In some embodiments, the DBD comprises or consists of an amino acid sequence encoded by a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 350.
- the DBD comprises or consists of an amino acid sequence encoded by a nucleotide sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 350. In some embodiments, the DBD comprises or consists of an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 350. [00286] In some embodiments, the DBD comprises or consists of an amino acid sequence having at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or 100% identity to SEQ ID NO: 217.
- the DBD comprises or consists of an amino acid sequence having at least about 90%, about 95%, about 98%, about 99% identity to SEQ ID NO: 217. In some embodiments, the DBD comprises or consists of SEQ ID NO: 217. In some embodiments, the DBD comprises or consists of an amino acid sequence encoded by a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 351.
- the DBD comprises or consists of an amino acid sequence encoded by a nucleotide sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 351. In some embodiments, the DBD comprises or consists of an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 351. [00287] In some embodiments, the DBD comprises or consists of an amino acid sequence having at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or 100% identity to SEQ ID NO: 218.
- the DBD comprises or consists of an amino acid sequence having at least about 90%, about 95%, about 98%, about 99% identity to SEQ ID NO: 218. In some embodiments, the DBD comprises or consists of SEQ ID NO: 218. In some embodiments, the DBD comprises or consists of an amino acid sequence encoded by a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 352.
- the DBD comprises or consists of an amino acid sequence encoded by a nucleotide sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 352. In some embodiments, the DBD comprises or consists of an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 352. [00288] In some embodiments, the DBD comprises or consists of an amino acid sequence having at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or 100% identity to SEQ ID NO: 219.
- the DBD comprises or consists of an amino acid sequence having at least about 90%, about 95%, about 98%, about 99% identity to SEQ ID NO: 219. In some embodiments, the DBD comprises or consists of SEQ ID NO: 219. In some embodiments, the DBD comprises or consists of an amino acid sequence encoded by a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 353.
- the DBD comprises or consists of an amino acid sequence encoded by a nucleotide sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 353. In some embodiments, the DBD comprises or consists of an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 353. [00289] In some embodiments, the DBD comprises or consists of an amino acid sequence having at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or 100% identity to SEQ ID NO: 220.
- the DBD comprises or consists of an amino acid sequence having at least about 90%, about 95%, about 98%, about 99% identity to SEQ ID NO: 220. In some embodiments, the DBD comprises or consists of SEQ ID NO: 220. In some embodiments, the DBD comprises or consists of an amino acid sequence encoded by a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 354.
- the DBD comprises or consists of an amino acid sequence encoded by a nucleotide sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 354. In some embodiments, the DBD comprises or consists of an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 354. [00290] In some embodiments, the DBD comprises or consists of an amino acid sequence having at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or 100% identity to SEQ ID NO: 322.
- the DBD comprises or consists of an amino acid sequence having at least about 90%, about 95%, about 98%, about 99% identity to SEQ ID NO: 322. In some embodiments, the DBD comprises or consists of SEQ ID NO: 322. In some embodiments, the DBD comprises or consists of an amino acid sequence encoded by a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 323.
- the DBD comprises or consists of an amino acid sequence encoded by a nucleotide sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 323. In some embodiments, the DBD comprises or consists of an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 323. [00291] In some embodiments, the DBD is at least about 100, about 110, about 120, about 130, about 140, about 150, about 160, about 170, about 180, about 190, or about 200 amino acid residues in length. In some embodiments, the DBD is about 120 to about 180 amino acid residues in length.
- the DBD is about 130 to about 160 amino acid residues in length. In some embodiments, the DBD is about 140 to about 160 amino acid residues in length.
- Table 7: Consensus sequence for EBNA1 and Exemplary EBNA1 Homologs
- Chimeric DNA Binding Proteins [00292]
- the disclosure provides a chimeric DNA binding protein comprising one or more chromatin binding domains and at least one EBNA1 DBD, or a variant or a fragment thereof.
- the disclosure provides an mRNA comprising an ORF encoding a chimeric DNA binding protein comprising one or more chromatin binding domains and at least one EBNA1 DBD.
- the disclosure provides an mRNA comprising an ORF encoding a DNA binding protein comprising one or more chromatin-binding domains operably linked to at least one EBNA1 DBD.
- the DNA binding protein is chimeric, wherein the DNA binding protein comprises a DBD of an EBNA1 homolog described herein, or a variant or a fragment thereof, and a chromatin binding domain.
- the DNA binding protein comprises the DBD and a chromatin binding domain derived from the EBNA1 homolog, wherein the DBD and the chromatin binding domain are arranged in a different order, orientation, and/or spacing as compared to domains comprising a substantially similar function and/or sequence present in the EBNA1 homolog.
- the DNA binding protein comprises a DBD of an EBNA1 homolog, or a variant or a fragment thereof, and a chromatin binding domain, wherein the DBD and the chromatin binding domain are mutually heterologous (i.e., do not occur together in the wild-type EBNA1 homolog).
- the chromatin binding domain is operably linked to the DBD.
- the DNA binding protein is 100 to 500 residues, 100 to 600 residues, 100 to 700 residues, 100 to 800 residues, 100 to 900 residues, 100 to 1,000 residues, 200 to 500 residues, 200 to 600 residues, 200 to 700 residues, 200 to 800 residues, 200 to 900 residues, 200 to 1,000 residues, 300 to 500 residues, 300 to 600 residues, 300 to 700 residues, 300 to 800 residues, 300 to 900 residues, or 300 to 1,000 residues in length.
- the DNA binding protein comprises one or more chromatin binding domains described herein operably linked to a polypeptide comprising an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 8, wherein the DNA binding protein is at least about 100, about 110, about 120, about 130, about 140, about 150, about 160, about 170, about 180, about 190, or about 200 amino acid residues in length.
- the DNA binding protein comprises one or more chromatin binding domains described herein operably linked to a polypeptide comprising SEQ ID NO: 8, wherein the DNA binding protein is at least about 100, about 110, about 120, about 130, about 140, about 150, about 160, about 170, about 180, about 190, or about 200 amino acid residues in length.
- the one or more chromatin binding domains are selected from EBNA1 domain A or a portion thereof and EBNA1 domain B or a portion thereof.
- the DNA binding protein comprises one or more heterologous chromatin binding domains described herein.
- the DNA binding protein comprises a chromatin binding domain (e.g., 1, 2, 3, 4, or more chromatin binding domain(s)) described herein operably linked to a DBD of an EBNA1 homolog described herein, wherein the DNA binding protein is at least about 100, about 110, about 120, about 130, about 140, about 150, about 160, about 170, about 180, about 190, or about 200 amino acid residues in length.
- the chromatin binding domain is derived from the EBNA1 homolog.
- the EBNA1 homolog-derived chromatin binding domain is an arginine-glycine rich sequence (i.e., at least about 50% arginine-glycine content).
- the EBNA1 homolog-derived chromatin binding domain is about 30 to about 80 amino acids. In some embodiments, the EBNA1 homolog-derived chromatin binding domain is about 40 to about 70 amino acids. In some embodiments, the EBNA1 homolog-derived chromatin binding domain is at least 50% similar to CBD Domain A or CBD Domain B from EBNA1. In some embodiments, the EBNA1 homolog-derived chromatin binding domain is selected from Table 25. In some embodiments, the EBNA1 homolog-derived chromatin binding domain is at least 70% similar to a sequence selected from Table 25. In some embodiments, the chromatin binding domain comprises a portion of the EBNA1 homolog N-terminal to the DBD.
- the chromatin binding domain comprises an EBNA1 domain A or a portion thereof and/or an EBNA1 domain B or a portion thereof.
- the DNA binding protein comprises a heterologous chromatin binding domain described herein.
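The arginine-glycine content and length criteria described above for an EBNA1 homolog-derived chromatin binding domain can be sketched as a simple screen. This is an illustrative assumption rather than a method stated in the disclosure: "arginine-glycine content" is read here as the fraction of residues that are R or G, the "about" ranges are treated as hard cutoffs, and the two criteria (which appear as separate embodiments) are combined only for convenience. The function names and the toy sequences are hypothetical.

```python
def rg_fraction(seq: str) -> float:
    """Fraction of residues that are arginine (R) or glycine (G)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(res in "RG" for res in seq) / len(seq)


def is_rg_rich_cbd(seq: str, min_fraction: float = 0.5,
                   min_len: int = 30, max_len: int = 80) -> bool:
    """Screen a candidate domain by length window and R/G fraction (assumed cutoffs)."""
    return min_len <= len(seq) <= max_len and rg_fraction(seq) >= min_fraction


# Toy sequence, not from the disclosure: 40 R/G residues followed by 10 alanines.
toy = "RG" * 20 + "A" * 10
print(len(toy), rg_fraction(toy), is_rg_rich_cbd(toy))
```

The defaults mirror the "at least about 50%" and "about 30 to about 80 amino acids" values in the text; passing `min_len=40, max_len=70` would model the narrower range instead.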
- EBNA1 and EBNA1 Homolog Chromatin Binding Domains [00297]
- the DNA binding protein comprises (i) one or more chromatin binding domains selected from EBNA1 domain A or a portion thereof, EBNA1 domain B or a portion thereof, and a combination thereof; and (ii) at least one EBNA1 DBD, wherein (i) and (ii) are operably linked.
- the DNA binding protein comprises (i) one or more chromatin-binding domains comprising a sequence of linked amino acids comprising the formula N'-[A]-[L]-[B]-C', wherein A and B are each independently selected from EBNA1 domain A or a portion thereof and EBNA1 domain B or a portion thereof, wherein L, if present, is a spacer between A and B; and (ii) at least one EBNA1 DBD, wherein (i) and (ii) are operably linked.
- the DNA binding protein comprises (i) one or more chromatin-binding domains comprising a sequence of linked amino acids comprising the formula N'-[A]-[L]-[B]-C', wherein A is an EBNA1 domain A or portion thereof, wherein B is selected from EBNA1 domain A or a portion thereof and EBNA1 domain B or a portion thereof, wherein L, if present, is a spacer between A and B; and (ii) at least one EBNA1 DBD, wherein (i) and (ii) are operably linked.
- the DNA binding protein comprises (i) one or more chromatin-binding domains comprising a sequence of linked amino acids comprising the formula N'-[A]-[L]-[B]-C', wherein A is an EBNA1 domain B or portion thereof, wherein B is selected from EBNA1 domain A or a portion thereof and EBNA1 domain B or a portion thereof, wherein L, if present, is a spacer between A and B; and (ii) at least one EBNA1 DBD, wherein (i) and (ii) are operably linked.
- the DNA binding protein comprises (i) one or more chromatin-binding domains comprising a sequence of linked amino acids comprising the formula N'-[A]-[L]-[B]-C', wherein A is an EBNA1 domain A or portion thereof, wherein B is an EBNA1 domain A or a portion thereof, wherein L, if present, is a spacer between A and B; and (ii) at least one EBNA1 DBD, wherein (i) and (ii) are operably linked.
- the DNA binding protein comprises (i) one or more chromatin-binding domains comprising a sequence of linked amino acids comprising the formula N'-[A]-[L]-[B]-C', wherein A is an EBNA1 domain A or portion thereof, wherein B is an EBNA1 domain B or a portion thereof, wherein L, if present, is a spacer between A and B; and (ii) at least one EBNA1 DBD, wherein (i) and (ii) are operably linked.
- the DNA binding protein comprises from N-terminus to C-terminus: the one or more chromatin binding domains and an EBNA1 DBD.
- the DNA binding protein comprises from N-terminus to C-terminus: the EBNA1 DBD and the one or more chromatin binding domains. [00304] In some embodiments, the DNA binding protein further comprises one or more NLSs. In some embodiments, the DNA binding protein comprises at least one NLS N-terminal to the one or more chromatin binding domains and the EBNA1 DBD. In some embodiments, the DNA binding protein comprises at least one NLS C-terminal to the one or more chromatin binding domains and the EBNA1 DBD. In some embodiments, the NLS is inserted between the one or more chromatin binding domains and the EBNA1 DBD.
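The domain arrangements described above (a chromatin binding domain of the form [A]-[L]-[B] with an optional spacer, placed before or after the DBD, with an optional NLS at the N-terminus, at the C-terminus, or between the two) can be sketched as string concatenation in N-to-C order. This is a hypothetical illustration: the function names, parameter names, and bracketed placeholder tokens are assumptions and do not represent sequences or constructs from the disclosure.

```python
from typing import Optional


def build_cbd(a: str, b: str, spacer: str = "") -> str:
    """Concatenate [A]-[L]-[B] in N-to-C order; the spacer L may be absent."""
    return a + spacer + b


def assemble(cbd: str, dbd: str, nls: Optional[str] = None,
             nls_position: str = "N", cbd_first: bool = True) -> str:
    """Arrange CBD and DBD in N-to-C order, optionally adding one NLS.

    nls_position is "N" (N-terminal to both), "C" (C-terminal to both),
    or "between" (inserted between the CBD and the DBD).
    """
    parts = [cbd, dbd] if cbd_first else [dbd, cbd]
    if nls is None:
        return "".join(parts)
    if nls_position == "N":
        parts.insert(0, nls)
    elif nls_position == "C":
        parts.append(nls)
    elif nls_position == "between":
        parts.insert(1, nls)
    else:
        raise ValueError("nls_position must be 'N', 'C', or 'between'")
    return "".join(parts)


# Placeholder tokens stand in for real domain sequences.
cbd = build_cbd("[A]", "[B]", spacer="[L]")
print(assemble(cbd, "[DBD]", nls="[NLS]", nls_position="between"))
```

Replacing the placeholder tokens with actual one-letter amino acid strings would yield a candidate fusion sequence in the chosen order; the sketch places a single NLS only, whereas the text allows one or more.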
- the EBNA1 DBD comprises an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to a contiguous sequence of at least about 145, 140, 130, 120, 110, 100, 90, 80, 70, 60, 50, 40, 30, 20, or 10 amino acid residues present in SEQ ID NO: 18.
- the EBNA1 DBD comprises an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 18.
- the EBNA1 DBD comprises an amino acid sequence encoded by a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 93.
- the EBNA1 DBD comprises SEQ ID NO: 18.
- the EBNA1 DBD comprises an amino acid sequence encoded by SEQ ID NO: 93.
- the EBNA1 DBD comprises an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to an amino acid sequence corresponding to residues about 459 to about 607 of SEQ ID NO: 1.
- the EBNA1 DBD comprises an amino acid sequence corresponding to residues about 459 to about 607 of SEQ ID NO: 1.
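The identity thresholds recited above (for example, at least about 80% identity to SEQ ID NO: 18) presuppose a way of computing percent identity. The excerpt does not state the alignment method or the denominator, so the following is only a sketch under stated assumptions: the two sequences are supplied already aligned (equal length, "-" for gaps), and identity is the number of matching residue columns divided by the number of aligned columns. The function names and the short test strings are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pre-computed pairwise alignment.

    Assumption: denominator = aligned columns, skipping columns where both are gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if not (x == "-" and y == "-")
    ]
    if not columns:
        raise ValueError("alignment has no residue columns")
    matches = sum(x == y and x != "-" for x, y in columns)
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 80.0) -> bool:
    """Compare percent identity against a cutoff such as 80, 90, or 95."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy aligned strings, not sequences from the disclosure: 9 of 10 columns match.
print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))
```

Other conventions (for example, dividing by the length of the shorter sequence or of the reference) give different values for the same alignment, so the chosen convention should be stated whenever a threshold is applied.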
- the DNA binding protein comprises (i) a chromatin binding domain (e.g., 1, 2, 3, 4, or more chromatin binding domain(s)) comprising an EBNA1 domain A or a portion thereof and/or an EBNA1 domain B or a portion thereof; and (ii) a DBD of an EBNA1 homolog described herein, wherein (i) and (ii) are operably linked.
- the DNA binding protein comprises (i) a chromatin binding domain (e.g., 1, 2, 3, 4, or more chromatin binding domain(s)) comprising a homolog of an EBNA1 domain A or a portion thereof and/or a homolog of an EBNA1 domain B or a portion thereof; and (ii) a DBD of an EBNA1 homolog described herein, wherein (i) and (ii) are operably linked.
- the homolog of an EBNA1 domain A comprises an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to a sequence selected from SEQ ID NOs: 337-340 and 347.
- the homolog of an EBNA1 domain A comprises a sequence selected from SEQ ID NOs: 337-340 and 347.
- the homolog of an EBNA1 domain B comprises an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to a sequence selected from SEQ ID NOs: 341-346.
- the homolog of an EBNA1 domain B comprises a sequence selected from SEQ ID NOs: 341-346.
- the DNA binding protein comprises (i) a chromatin binding domain (e.g., 1, 2, 3, 4, or more chromatin binding domain(s)) comprising a sequence of linked amino acids comprising the formula N'-[A]-[L]-[B]-C', wherein A and B are each independently selected from an EBNA1 domain A or a portion thereof and an EBNA1 domain B or a portion thereof, wherein L, if present, is a spacer between A and B; and (ii) a DBD of an EBNA1 homolog described herein, wherein (i) and (ii) are operably linked.
- the DNA binding protein comprises (i) a chromatin binding domain (e.g., 1, 2, 3, 4, or more chromatin binding domain(s)) comprising a sequence of linked amino acids comprising the formula N'-[A]-[L]-[B]-C', wherein A is an EBNA1 domain A or a portion thereof, wherein B is selected from an EBNA1 domain A or a portion thereof and an EBNA1 domain B or a portion thereof, wherein L, if present, is a spacer between A and B; and (ii) a DBD of an EBNA1 homolog described herein, wherein (i) and (ii) are operably linked.
- the DNA binding protein comprises (i) a chromatin binding domain (e.g., 1, 2, 3, 4, or more chromatin binding domain(s)) comprising a sequence of linked amino acids comprising the formula N'-[A]-[L]-[B]-C', wherein A is an EBNA1 domain B or portion thereof, wherein B is selected from an EBNA1 domain A or a portion thereof and an EBNA1 domain B or a portion thereof, wherein L, if present, is a spacer between A and B; and (ii) a DBD of an EBNA1 homolog described herein, wherein (i) and (ii) are operably linked.
- the DNA binding protein comprises (i) a chromatin binding domain (e.g., 1, 2, 3, 4, or more chromatin binding domain(s)) comprising a sequence of linked amino acids comprising the formula N'-[A]-[L]-[B]-C', wherein A is an EBNA1 domain A or portion thereof, wherein B is an EBNA1 domain A or a portion thereof, wherein L, if present, is a spacer between A and B; and (ii) a DBD of an EBNA1 homolog described herein, wherein (i) and (ii) are operably linked.
- the DNA binding protein comprises (i) a chromatin binding domain (e.g., 1, 2, 3, 4, or more chromatin binding domain(s)) comprising a sequence of linked amino acids comprising the formula N'-[A]-[L]-[B]-C', wherein A is an EBNA1 domain A or portion thereof, wherein B is an EBNA1 domain B or a portion thereof, wherein L, if present, is a spacer between A and B; and (ii) a DBD of an EBNA1 homolog described herein, wherein (i) and (ii) are operably linked.
- the DNA binding protein comprises (i) a chromatin binding domain (e.g., 1, 2, 3, 4, or more chromatin binding domain(s)) comprising a sequence of linked amino acids comprising the formula N'-[A]-[L]-[B]-C', wherein A is an EBNA1 domain B or portion thereof, wherein B is an EBNA1 domain A or a portion thereof, wherein L, if present, is a spacer between A and B; and (ii) a DBD of an EBNA1 homolog described herein, wherein (i) and (ii) are operably linked.
- the DNA binding protein comprises (i) a chromatin binding domain (e.g., 1, 2, 3, 4, or more chromatin binding domain(s)) comprising a sequence of linked amino acids comprising the formula N ⁇ -[A]-[L]-[B]-C ⁇ , wherein A and B are each independently selected from a homolog of an EBNA1 domain A or a portion thereof and a homolog of an EBNA1 domain B or a portion thereof, wherein L, if present, is a spacer between A and B; and (ii) a DBD of an EBNA1 homolog described herein, wherein (i) and (ii) are operably linked.
- the DNA binding protein comprises (i) a chromatin binding domain (e.g., 1, 2, 3, 4, or more chromatin binding domain(s)) comprising a sequence of linked amino acids comprising the formula N′-[A]-[L]-[B]-C′, wherein A and B are each independently selected from an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to a sequence selected from SEQ ID NOs: 337-347, wherein L, if present, is a spacer between A and B; and (ii) a DBD of an EBNA1 homolog described herein, wherein (i) and (ii) are operably linked.
- the DNA binding protein comprises (i) a chromatin binding domain (e.g., 1, 2, 3, 4, or more chromatin binding domain(s)) comprising a sequence of linked amino acids comprising the formula N′-[A]-[L]-[B]-C′, wherein A and B are each independently selected from SEQ ID NOs: 337-347, wherein L, if present, is a spacer between A and B; and (ii) a DBD of an EBNA1 homolog described herein, wherein (i) and (ii) are operably linked.
- the DNA binding protein comprises from N-terminus to C-terminus: a chromatin binding domain (e.g., 1, 2, 3, 4, or more chromatin binding domain(s)) and a DBD of an EBNA1 homolog described herein.
- the DNA binding protein comprises from N-terminus to C-terminus: a DBD of an EBNA1 homolog described herein and a chromatin binding domain (e.g., 1, 2, 3, 4, or more chromatin binding domain(s)).
- the DNA binding protein further comprises an NLS.
- the DNA binding protein comprises an NLS N-terminal to the chromatin binding domain and the DBD.
- the DNA binding protein comprises an NLS C-terminal to the chromatin binding domain and the DBD. In some embodiments, the NLS is inserted between the chromatin binding domain and the DBD.
- the EBNA1 domain A comprises an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to a contiguous sequence of at least about 55, about 50, about 45, about 40, about 35, or about 30 amino acid residues present in SEQ ID NO: 14. In some embodiments the EBNA1 domain A comprises a contiguous sequence of at least about 55, about 50, about 45, about 40, about 35, or about 30 amino acid residues present in SEQ ID NO: 14.
- the EBNA1 domain A comprises an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 14. In some embodiments, the EBNA1 domain A comprises SEQ ID NO: 14. In some embodiments, the EBNA1 domain A comprises an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to an amino acid sequence corresponding to residues about 33 to about 89 of SEQ ID NO: 1. In some embodiments, the EBNA1 domain A comprises an amino acid sequence corresponding to residues about 33 to about 89 of SEQ ID NO: 1.
- the EBNA1 domain A comprises an amino acid sequence encoded by a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 89. In some embodiments, the EBNA1 domain A comprises an amino acid sequence encoded by SEQ ID NO: 89. [00325] In some embodiments the EBNA1 domain B comprises an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to a contiguous sequence of at least about 55, about 50, about 45, about 40, about 35, or about 30 amino acid residues present in SEQ ID NO: 16.
- the EBNA1 domain B comprises a contiguous sequence of at least about 55, about 50, about 45, about 40, about 35, or about 30 amino acid residues present in SEQ ID NO: 16. In some embodiments, the EBNA1 domain B comprises an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 16. In some embodiments, the EBNA1 domain B comprises SEQ ID NO: 16. In some embodiments, the EBNA1 domain B comprises an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to an amino acid sequence corresponding to residues about 328 to about 376 of SEQ ID NO: 1.
- the EBNA1 domain B comprises an amino acid sequence corresponding to residues about 328 to about 376 of SEQ ID NO: 1. In some embodiments, the EBNA1 domain B comprises an amino acid sequence encoded by a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 91. In some embodiments, the EBNA1 domain B comprises an amino acid sequence encoded by SEQ ID NO: 91.
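The residue ranges and identity thresholds recited above can be illustrated with a minimal Python sketch. It assumes residue numbering is 1-based and inclusive, and that the two sequences being compared have already been aligned to equal length; a real comparison would first run a sequence alignment tool, which is omitted here. The function names and toy sequences are hypothetical and are not taken from the sequence listing.

```python
def extract_residues(seq: str, start: int, end: int) -> str:
    """Return residues start..end (1-based, inclusive) of a protein sequence."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("residue range falls outside the sequence")
    return seq[start - 1:end]


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over two pre-aligned, equal-length sequences.

    Gap characters ('-') never count as matches. Aligning the sequences
    beforehand is assumed and not performed here.
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


# Toy data only; these are NOT sequences from the disclosure.
toy_reference = "MKRGSPAKRGRGRPKGSGAP"
toy_variant = "MKRGSPAKRGRGRPKGSGAA"  # one substitution at the last position

# Number of residues in an inclusive range such as "residues 33 to 89".
domain_length = 89 - 33 + 1
```

With these toy inputs, a single substitution in a 20-residue sequence gives 95% identity, which would satisfy an "at least about 95%" threshold but not an "at least about 98%" threshold.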
- the disclosure provides a DNA binding protein comprising one or more heterologous chromatin-binding domains operably linked to at least one EBNA1 DBD.
- the disclosure provides an mRNA comprising an ORF encoding a DNA binding protein comprising one or more heterologous chromatin-binding domains operably linked to at least one EBNA1 DBD.
- the DNA binding protein comprises a heterologous chromatin-binding domain (e.g., 1, 2, 3, 4, or more heterologous chromatin binding domain(s)) operably linked to a DBD of an EBNA1 homolog described herein.
- the one or more heterologous chromatin binding domains bind to the nuclear matrix (i.e., the network of fibers within the nucleus). In some embodiments, the one or more heterologous chromatin binding domains comprise a polypeptide of the nuclear matrix. In some embodiments, the one or more heterologous chromatin binding domains bind to euchromatin, heterochromatin, or both. In some embodiments, the one or more heterologous chromatin binding domains comprise a polypeptide that binds to human chromatin. In some embodiments, the one or more heterologous chromatin binding domains bind to human euchromatin, human heterochromatin, or both.
- the one or more heterologous chromatin binding domains comprise one or more elements of human chromatin. In some embodiments, the one or more heterologous chromatin binding domains binds to genomic DNA, a histone protein, a nucleosome, or a combination thereof. [00328] In some embodiments, the DNA binding protein comprises from N-terminus to C-terminus: the heterologous chromatin binding domain and the DBD. In some embodiments, the DNA binding protein comprises from N-terminus to C-terminus: the DBD and the heterologous chromatin binding domain.
- the DNA binding protein comprises from N-terminus to C-terminus: the one or more heterologous chromatin binding domains and an EBNA1 DBD. In some embodiments, the DNA binding protein comprises from N-terminus to C-terminus: the EBNA1 DBD and the one or more heterologous chromatin binding domains. [00330] In some embodiments, the DNA binding protein further comprises one or more NLSs. In some embodiments, the DNA binding protein comprises at least one NLS N-terminal to the one or more heterologous chromatin binding domains and the EBNA1 DBD.
- the DNA binding protein comprises at least one NLS C-terminal to the one or more heterologous chromatin binding domains and the EBNA1 DBD. In some embodiments, the NLS is inserted between the one or more heterologous chromatin binding domains and the EBNA1 DBD. [00331] In some embodiments, the DNA binding protein further comprises an NLS (e.g., 1, 2, 3, 4, or more NLSs). In some embodiments, the DNA binding protein comprises an NLS N-terminal to the heterologous chromatin binding domain and the DBD. In some embodiments, the DNA binding protein comprises an NLS C-terminal to the heterologous chromatin binding domain and the DBD.
- the NLS is inserted between the heterologous chromatin binding domain and the DBD.
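The domain orders described above (chromatin binding domain and DBD in either order, with an optional NLS at the N-terminus, at the C-terminus, or between the two) can be enumerated with a short Python sketch. The labels `CBD`, `DBD`, and `NLS` are placeholders for the domains, not sequences, and the function name is hypothetical; this is an illustration of the combinations, not a description of any particular construct.

```python
from itertools import product


def enumerate_layouts() -> list[str]:
    """List N-terminus to C-terminus layouts of CBD, DBD, and an optional NLS."""
    core_orders = [("CBD", "DBD"), ("DBD", "CBD")]
    nls_positions = ["none", "N-terminal", "between", "C-terminal"]
    layouts = []
    for (first, second), position in product(core_orders, nls_positions):
        if position == "none":
            parts = [first, second]
        elif position == "N-terminal":
            parts = ["NLS", first, second]
        elif position == "between":
            parts = [first, "NLS", second]
        else:  # C-terminal
            parts = [first, second, "NLS"]
        layouts.append("-".join(parts))
    return layouts


layouts = enumerate_layouts()
```

Two core orders combined with four NLS options give eight layouts for a single NLS; embodiments with multiple NLSs or multiple chromatin binding domains would extend the list.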
- the one or more heterologous chromatin binding domains are selected from a bromodomain, a PHD finger domain, a chromodomain, a MBT domain, a tudor domain, a PWWP domain, an ADD domain, a Zf-CW domain, an ankyrin repeat domain, a WD40 domain, and a combination thereof.
- the one or more heterologous chromatin binding domains comprises an AT-hook.
- the AT-hook is from HMGA1, HMGA2, AF-17, SETBP1, TTF-I interacting peptide 5, SC1, X box-binding regulatory factor, LIM/homeodomain protein LH-2, Retinoblastoma-binding protein 1, ELF3, DFS70, ZNF213, Peregrin, Methyl-CpG-binding protein 2, and MLLT10.
- the one or more heterologous chromatin binding domains comprises an AT-hook sequence motif.
- the AT-hook sequence motif comprises an amino acid sequence selected from any one of SEQ ID NOs: 180-191.
- the one or more heterologous chromatin binding domains comprise at least two AT-hook sequence motifs, wherein the at least two AT-hook sequence motifs are operably-linked. In some embodiments, the one or more heterologous chromatin binding domains comprises 2, 3, 4, 5, 6, 7, 8, 9, or 10 AT-hook sequence motifs, wherein the AT-hook sequence motifs are operably-linked. In some embodiments, the at least two AT-hook sequence motifs are the same. In some embodiments, the at least two AT-hook sequence motifs are different. In some embodiments, the at least two AT-hook sequence motifs comprise an amino acid sequence selected from any one of SEQ ID NOs: 188-191.
- the at least two AT-hook sequence motifs are operably linked by a linker.
- the linker is a peptide linker described herein or known in the art for linking peptide sequences.
- the linker is SEQ ID NO: 155.
- the one or more heterologous chromatin binding domains comprises an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 156.
- the one or more heterologous chromatin binding domains comprises SEQ ID NO: 156.
- the one or more heterologous chromatin binding domains comprises an HMGA1 chromatin binding domain.
- the HMGA1 chromatin binding domain comprises an amino acid sequence set forth in SEQ ID NO: 154.
- the one or more heterologous chromatin binding domains comprises an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 154.
- the one or more heterologous chromatin binding domains comprises SEQ ID NO: 154.
- the one or more heterologous chromatin binding domains comprises a histone protein or portion thereof.
- the one or more heterologous chromatin binding domains comprises an H1 protein or portion thereof.
- the H1 histone protein or portion thereof is an H1.1 protein or portion thereof, an H1.2 protein or portion thereof, an H1.3 protein or portion thereof, an H1.4 protein or portion thereof, an H1.5 protein or portion thereof, an H1.6 protein or portion thereof, an H1.7 protein or portion thereof, an H1.8 protein or a portion thereof, an H1.9 protein or a portion thereof, or an H1.10 protein or a portion thereof.
- the H1 protein comprises an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 22.
- the H1 protein comprises an amino acid sequence set forth in SEQ ID NO: 22. In some embodiments, the H1 protein comprises an amino acid sequence having at least about 40%, about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to an amino acid sequence selected from SEQ ID NOs: 95-97. In some embodiments, the H1 protein comprises an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to an amino acid sequence selected from SEQ ID NOs: 95-97. In some embodiments, the H1 protein comprises an amino acid sequence selected from SEQ ID NOs: 95-97.
- the one or more heterologous chromatin binding domains comprises an IL-33 chromatin-binding domain (CBD).
- the IL-33 CBD comprises an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 21.
- the IL-33 CBD comprises an amino acid sequence set forth in SEQ ID NO: 21.
- the one or more heterologous chromatin binding domains comprises a chromatin binding domain of a Kaposi’s sarcoma-associated herpesvirus (KSHV) latency-associated nuclear antigen (LANA).
- the LANA CBD comprises an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 20. In some embodiments, the LANA CBD comprises an amino acid sequence set forth in SEQ ID NO: 20.
- the one or more heterologous chromatin binding domains comprises a chromatin binding domain of a human papillomavirus (HPV) E2 protein.
- the E2 CBD comprises an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 23.
- the E2 CBD comprises SEQ ID NO: 23.
- the disclosure provides an mRNA comprising an ORF encoding a DNA binding protein comprising an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to an amino acid sequence selected from SEQ ID NOs: 7-11.
- the disclosure provides an mRNA comprising an ORF encoding a DNA binding protein comprising an amino acid sequence selected from SEQ ID NOs: 7-11.
- the disclosure provides an mRNA comprising an ORF encoding a DNA binding protein comprising an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to an amino acid sequence selected from SEQ ID NOs: 7-11, 150, and 153. In some embodiments, the disclosure provides an mRNA comprising an ORF encoding a DNA binding protein comprising an amino acid sequence selected from SEQ ID NOs: 7-11, 150, and 153.
- the disclosure provides an mRNA comprising an ORF encoding a DNA binding protein comprising an amino acid sequence encoded by a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to a nucleotide sequence selected from SEQ ID NOs: 25-29.
- the disclosure provides an mRNA comprising an ORF encoding a DNA binding protein comprising an amino acid sequence encoded by a nucleotide sequence selected from SEQ ID NOs: 25-29.
- the disclosure provides an mRNA comprising an ORF encoding a DNA binding protein comprising an amino acid sequence encoded by a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to a nucleotide sequence selected from SEQ ID NOs: 25-29, 149, and 152.
- the disclosure provides an mRNA comprising an ORF encoding a DNA binding protein comprising an amino acid sequence encoded by a nucleotide sequence selected from SEQ ID NOs: 25-29, 149, and 152.
- the mRNA comprises an ORF comprising a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to a nucleotide sequence selected from SEQ ID NOs: 25-29. In some embodiments, the mRNA comprises an ORF comprising a nucleotide sequence selected from SEQ ID NOs: 25-29.
- the mRNA comprises an ORF comprising a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to a nucleotide sequence selected from SEQ ID NOs: 25-29, 149, and 152.
- the mRNA comprises an ORF comprising a nucleotide sequence selected from SEQ ID NOs: 25-29, 149, and 152.
- the disclosure provides an mRNA comprising an ORF encoding a DNA binding protein comprising an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 10. In some embodiments, the disclosure provides an mRNA comprising an ORF encoding a DNA binding protein comprising SEQ ID NO: 10.
- the disclosure provides an mRNA comprising an ORF encoding a DNA binding protein comprising an amino acid sequence encoded by a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 26.
- the disclosure provides an mRNA comprising an ORF encoding a DNA binding protein comprising an amino acid sequence encoded by SEQ ID NO: 26.
- the mRNA comprises an ORF comprising a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 26.
- the mRNA comprises an ORF comprising SEQ ID NO: 26.
- the disclosure provides an mRNA comprising an ORF encoding a DNA binding protein comprising an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 9.
- the disclosure provides an mRNA comprising an ORF encoding a DNA binding protein comprising SEQ ID NO: 9.
- the disclosure provides an mRNA comprising an ORF encoding a DNA binding protein comprising an amino acid sequence encoded by a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 28.
- the disclosure provides an mRNA comprising an ORF encoding a DNA binding protein comprising an amino acid sequence encoded by SEQ ID NO: 28.
- the mRNA comprises an ORF comprising a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 28.
- the mRNA comprises an ORF comprising SEQ ID NO: 28.
- the disclosure provides an mRNA comprising an ORF encoding a DNA binding protein comprising an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 150.
- the disclosure provides an mRNA comprising an ORF encoding a DNA binding protein comprising SEQ ID NO: 150.
- the disclosure provides an mRNA comprising an ORF encoding a DNA binding protein comprising an amino acid sequence encoded by a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 149.
- the disclosure provides an mRNA comprising an ORF encoding a DNA binding protein comprising an amino acid sequence encoded by SEQ ID NO: 149.
- the mRNA comprises an ORF comprising a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 149.
- the mRNA comprises an ORF comprising SEQ ID NO: 149.
- the disclosure provides an mRNA comprising an ORF encoding a DNA binding protein comprising an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 153.
- the disclosure provides an mRNA comprising an ORF encoding a DNA binding protein comprising SEQ ID NO: 153. [00356] In some embodiments, the disclosure provides an mRNA comprising an ORF encoding a DNA binding protein comprising an amino acid sequence encoded by a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 152. In some embodiments, the disclosure provides an mRNA comprising an ORF encoding a DNA binding protein comprising an amino acid sequence encoded by SEQ ID NO: 152.
- the mRNA comprises an ORF comprising a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 152.
- the mRNA comprises an ORF comprising SEQ ID NO: 152.
- Linkers [00358]
- the one or more domains of a chimeric DNA binding protein are operably linked by direct linking end-to-end, wherein the one or more domains comprise a chromatin binding domain described herein and an EBNA1 DBD described herein. In some embodiments, the one or more domains of a chimeric DNA binding protein are operably linked by direct linking end-to-end, wherein the one or more domains comprise a chromatin binding domain described herein and a DBD of an EBNA1 homolog described herein. In some embodiments, one or more domains of a chimeric DNA binding protein described herein (e.g., one or more chromatin binding domains and an EBNA1 DBD) are operably linked by a linker.
- the one or more domains of a chimeric DNA binding protein are operably linked by a linker, wherein the one or more domains comprise a chromatin binding domain described herein and an EBNA1 DBD described herein. In some embodiments, the one or more domains of a chimeric DNA binding protein are operably linked by a linker, wherein the one or more domains comprise a chromatin binding domain described herein and a DBD of an EBNA1 homolog described herein. [00359] In some embodiments, the linker is a peptide linker. In some embodiments, the linker provides for a three-dimensional arrangement of domains in a chimeric protein described herein that enables the component domains to bind their respective targets.
- the linker is selected to permit the independent interaction of each domain with its intended target. In some embodiments, the linker is selected to permit an interaction of the chromatin binding domain with an intended target (e.g., chromatin) and of the DBD with an intended target (e.g., a DBE present in a recombinant expression vector described herein). For example, in some embodiments, the linker is selected to permit an interaction of the one or more chromatin binding domains with an intended target (e.g., chromatin) and of the EBNA1 DBD with an intended target (e.g., a DBE present in a recombinant expression vector described herein).
- the linker comprises a peptide sequence from a wild-type EBNA1 or wild-type EBNA1 homolog (e.g., a peptide sequence joining two domains in the wild-type EBNA1 or EBNA1 homolog).
- the linker comprises a peptide sequence from a wild-type EBNA1 polypeptide (e.g., a peptide sequence joining two domains in a wild-type EBNA1 polypeptide).
- the linker comprises a heterologous amino acid sequence. Linkers may be designed by modeling or identified by experimental trial.
- a linker joining two domains of a chimeric protein described herein spans a distance of less than about 10 Å, e.g., as determined by a method for protein structural characterization such as x-ray crystallography or NMR.
- the linker provides an arrangement of domains with an intervening distance of more than about 10 Å (e.g., about 10 Å, 20 Å, 30 Å, 40 Å, 50 Å, 60 Å, 70 Å, 80 Å, 90 Å, 100 Å, or higher).
- the linker comprises a folded domain.
- a linker comprising a folded domain provides a longer distance between a chromatin binding domain and a DBD of a chimeric DNA protein described herein. In some embodiments, the linker comprising a folded domain retains the chromatin binding domain and the DBD of a chimeric DNA protein in a particular configuration suitable for binding to their respective targets. In some embodiments, a linker comprising a folded domain provides a longer distance between one or more chromatin binding domains and an EBNA1 DBD of a chimeric DNA protein described herein.
- a linker comprising a folded domain provides a longer distance between one or more chromatin binding domains and an EBNA1 homolog DBD of a chimeric DNA protein described herein. In some embodiments, the linker comprising a folded domain retains the one or more chromatin binding domains and an EBNA1 DBD of a chimeric DNA protein in a particular configuration suitable for binding to their respective targets. In some embodiments, the linker comprising a folded domain retains the one or more chromatin binding domains and an EBNA1 homolog DBD of a chimeric DNA protein in a particular configuration suitable for binding to their respective targets. [00361] In some embodiments, the linker is a peptide linker.
- the peptide linker is a gly-ser linker.
- gly-ser linker refers to a peptide comprising glycine and serine residues.
- An exemplary gly-ser polypeptide linker comprises the amino acid sequence Ser(Gly4Ser)n (SEQ ID NO: 30).
- n=1.
- n=2.
- n=3, i.e., Ser(Gly4Ser)3 (SEQ ID NO: 31).
- n=4, i.e., Ser(Gly4Ser)4 (SEQ ID NO: 32).
- n=5.
- Another exemplary gly-ser polypeptide linker comprises the amino acid sequence Gly4SerGly3Ser (SEQ ID NO: 155).
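The gly-ser linker formulas above can be expanded into one-letter amino acid strings with a small Python sketch. The strings are derived only from the formulas as written in the text (Ser(Gly4Ser)n and Gly4SerGly3Ser) and have not been checked against the sequence listing; the function and constant names are hypothetical.

```python
def gly_ser_linker(n: int) -> str:
    """Expand Ser(Gly4Ser)n into a one-letter amino acid string."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return "S" + ("GGGG" + "S") * n


# Gly4SerGly3Ser written out residue by residue.
GLY4SER_GLY3SER = "G" * 4 + "S" + "G" * 3 + "S"

# Linkers for n = 1 through 5, keyed by n.
linkers = {n: gly_ser_linker(n) for n in range(1, 6)}
```

Each increment of n adds five residues, so the linker length is 5n + 1 residues; this gives a simple way to compare candidate linker lengths when the spacing between two domains is being tuned.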
- Additional domains [00362] In some embodiments, a DNA binding protein of the disclosure comprises one or more chromatin binding domains, an EBNA1 DBD, and at least one additional domain. In some embodiments, a DNA binding protein of the disclosure comprises a chromatin binding domain described herein, a DBD of an EBNA1 homolog described herein, and an additional domain.
- the at least one additional domain is an NLS. In some embodiments, the at least one additional domain is a ligand binding domain. In some embodiments, the at least one additional domain is a protein-binding domain. In some embodiments, the at least one additional domain is a transactivation domain. In some embodiments, the at least one additional domain is a transcription factor binding domain. [00363] In some embodiments, the one or more NLSs is selected from a monopartite NLS, a bipartite NLS, a non-classical NLS, and a combination thereof. [00364] In some embodiments, the one or more NLSs comprises an EBNA1 NLS.
- the EBNA1 NLS comprises an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 17. In some embodiments, the EBNA1 NLS comprises SEQ ID NO: 17. In some embodiments, the EBNA1 NLS comprises an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to an amino acid sequence corresponding to residues about 376 to about 386 of SEQ ID NO: 1. In some embodiments, the EBNA1 NLS comprises an amino acid sequence corresponding to residues about 376 to about 386 of SEQ ID NO: 1.
- the EBNA1 NLS comprises an amino acid sequence encoded by a nucleotide sequence having at least about 70%, about 75%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 92. In some embodiments, the EBNA1 NLS comprises an amino acid sequence encoded by SEQ ID NO: 92. [00365] In some embodiments, the one or more NLSs is selected from any known in the art or described herein. In some embodiments, the NLS is positioned in any portion of a DNA binding protein described herein, e.g., internal or operably linked to the N- or C-terminus.
- the NLS functions in trafficking the DNA binding protein into the nucleus upon introduction of an mRNA encoding the DNA binding protein to a cell.
- an NLS has a plurality of basic amino acids, referred to as a bipartite basic repeat (reviewed in Garcia-Bustos et al., Biochimica et Biophysica Acta (1991) 1071, 83-101).
- NLSs include: the NLS of the SV40 virus large T-antigen, having the amino acid sequence PKKKRKV (SEQ ID NO: 35); the NLS from nucleoplasmin (e.g., the nucleoplasmin bipartite NLS with the sequence KRPAATKKAGQAKKKK (SEQ ID NO: 36)); the c-myc NLS having the amino acid sequence PAAKRVKLD (SEQ ID NO: 37) or RQRRNELKRSP (SEQ ID NO: 38); the hRNPA1 M9 NLS having the sequence NQSSNFGPMKGGNFGGRSSGPYGGGGQYFAKPRNQGGY (SEQ ID NO: 39); the sequence RMRIZFKNKGKDTAELRRRRVEVSVELRKAKKDEQILKRRNV (SEQ ID NO: 40) of the IBB domain from importin-alpha; the sequences VSRKRPRP (SEQ ID NO: 41) and PPKKARED (SEQ ID NO: 42) of the myoma T protein; the sequence PQPKKKPL (SEQ ID NO: 43) of human p53; the sequence SALIKKKKKMAP (SEQ ID NO: 44) of mouse c-abl IV; the sequence
- the one or more NLSs are selected from a c-Myc NLS, SV40 NLS, a nucleoplasmin NLS, a 53BP1 NLS, an ING4 NLS, an IER5 NLS, and an ERK5 NLS.
- the DNA binding protein comprises one or more domains to facilitate their purification, e.g. "histidine tags" or a glutathione-S-transferase domain.
- the DNA binding protein comprises one or more epitope tags, wherein the one or more epitope tags comprise amino acid sequences recognized by known monoclonal antibodies for the detection of proteins within cells or the capture of proteins by antibodies in vitro.
- Delivery Format [00370]
- a system of the disclosure comprises the DNA binding protein as a polypeptide.
- the system comprises a nucleic acid (e.g., mRNA) encoding the DNA binding protein.
- the system comprises a recombinant expression vector comprising the nucleic acid encoding the DNA binding protein.
- the disclosure provides an mRNA comprising an open reading frame (ORF) encoding a DNA binding protein described herein and one or more additional components.
- the mRNA comprises a 5′ untranslated region (5′ UTR), an ORF encoding a DNA binding protein described herein, and a 3′ UTR.
- the mRNA further comprises a 5′ cap structure, a Kozak consensus sequence, and a polyA sequence (i.e., a polyadenylation signal).
- the Kozak consensus sequence is immediately upstream of and adjacent to the ORF.
- the Kozak consensus sequence facilitates binding of the mRNA to ribosomes, thereby promoting its translation.
- the mRNA is at least about 0.9kb, 1kb, 1.5kb, 2kb, 2.5kb, 3kb, 3.5kb, 4kb, 4.5kb, or 5kb in length.
- the mRNA is about 1kb to about 1.5kb, about 1kb to about 2kb, about 1kb to about 3kb, about 2kb to about 4kb, about 2kb to about 5kb, about 3kb to about 5kb, about 3kb to about 6kb, about 4kb to about 5kb, or about 4kb to about 6kb in length.
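The element order described above (5′ UTR, Kozak sequence immediately upstream of the ORF, ORF, 3′ UTR, polyA) can be sketched in Python as a simple string assembly with a length check. This is an illustrative sketch only: the `GCCACC` Kozak-like context, the modeling of the polyA sequence as a run of adenosines, the 100-nucleotide default, and the toy UTR and ORF strings are all assumptions, and the 5′ cap is not represented because it is a chemical structure rather than a templated sequence.

```python
KOZAK = "GCCACC"  # assumed Kozak-like context placed directly before the AUG
STOP_CODONS = {"UAA", "UAG", "UGA"}


def assemble_mrna(utr5: str, orf: str, utr3: str, polya_len: int = 100) -> str:
    """Concatenate 5' UTR, Kozak, ORF, 3' UTR, and a poly(A) run (RNA alphabet)."""
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    if not orf.startswith("AUG") or orf[-3:] not in STOP_CODONS:
        raise ValueError("ORF must start with AUG and end with a stop codon")
    return utr5 + KOZAK + orf + utr3 + "A" * polya_len


def length_kb(mrna: str) -> float:
    """Length of the transcript in kilobases."""
    return len(mrna) / 1000


# Toy inputs only; none of these strings come from the disclosure.
toy_utr5 = "G" * 50
toy_orf = "AUG" + "GCU" * 300 + "UAA"  # 906 nt, encodes Met followed by Ala repeats
toy_utr3 = "C" * 50

mrna = assemble_mrna(toy_utr5, toy_orf, toy_utr3, polya_len=100)
```

For the toy inputs the transcript is 1,112 nucleotides (about 1.1 kb), which falls within the "about 1kb to about 1.5kb" range listed above.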
- (A) 5′ UTR and 3′ UTR [00373] The 5′ UTR and 3′ UTR are non-coding regions of an mRNA that do not contribute to the encoded protein sequence.
- the UTRs can contribute to recruitment of ribosomes and efficiency of translation (see, e.g., Leppek et al. (2018) Nat Rev Mol Cell Biol 19:158).
- the 5′ UTR and/or 3′ UTR comprises one or more secondary structures and/or one or more sequence motifs that modulate translation (e.g., by modulating recruitment of ribosomes, promoting mRNA degradation, directing cellular localization).
- the 5′ UTR and/or 3′ UTR comprises a sequence of at least about 20, about 30, about 40, about 50, about 60, about 70, about 80, about 90, or about 100 nucleotides in length.
- the 5′ UTR and/or 3′ UTR comprises a sequence of about 20 to about 100, about 20 to about 150, about 20 to about 200, about 50 to about 100, about 50 to about 150, about 50 to about 200, about 50 to about 300, about 100 to about 200, about 100 to about 300, about 100 to about 400, about 100 to about 500, or about 100 to about 600 nucleotides in length.
- the 5′ UTR is about 50 to about 500 nucleotides in length.
- the 3′ UTR is about 50 to about 500 nucleotides in length.
- the 3′ UTR is 100 to 600 residues, 100 to 700 residues, 100 to 800 residues, 100 to 900 residues, 100 to 1,000 residues, 200 to 500 residues, 200 to 600 residues, 200 to 700 residues, 200 to 800 residues, 200 to 900 residues, 200 to 1,000 residues, 300 to 500 residues, 300 to 600 residues, 300 to 700 residues, 300 to 800 residues, 300 to 900 residues, or 300 to 1,000 residues in length.
- the 5′ UTR and/or 3′ UTR is naturally-occurring.
- the 5′ UTR and/or 3′ UTR is derived from a human gene.
- the 5′ UTR and/or 3′ UTR is one derived from an alpha-globin gene, a beta-globin gene, albumin, HSD17B4, or eukaryotic elongation factor 1a.
- the 5′ UTR and/or 3′ UTR is one derived from a virus.
- the virus is orthopoxvirus or cytomegalovirus.
- the 5′ UTR includes one or more elements that affect an mRNA's stability or translation.
- a 3′ UTR includes one or more of a polyA signal, a binding site for proteins that affect an mRNA's stability or location in a cell, and/or one or more binding sites for miRNAs.
- the mRNA comprises a 5′-terminal cap.
- a 5’ cap structure or cap species is a compound including two nucleoside moieties joined by a linker.
- the 5′ cap is selected from a naturally occurring cap, a non-naturally occurring cap, a cap analog, and an anti-reverse cap analog (ARCA).
- the cap comprises one or more modified nucleosides and/or internucleoside linkers.
- a natural mRNA cap may include a nucleotide (N) and a guanine (G) nucleotide methylated at the 7 position joined by a triphosphate linkage at their 5’ positions, e.g., m7G(5’)ppp(5’)N, commonly written as m7GpppN.
- This cap is a cap-0 where nucleotide N does not contain 2′OMe, or cap-1 where nucleotide N contains 2′OMe, or cap-2 where nucleotides N and N+1 contain 2′OMe.
- This cap may also be of the structure 3′-O-Me-m7GpppN as incorporated by the anti-reverse-cap analog (ARCA), and may also include similar cap-0, cap-1, and cap-2, etc., structures.
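The cap-0, cap-1, and cap-2 naming described above depends only on whether the first transcribed nucleotide (N) and the next nucleotide (N+1) carry a 2′-O-methyl group. A minimal sketch of that rule follows. The function name `classify_cap` and its boolean parameters are hypothetical, and the case where only N+1 is methylated is treated as undefined because the text does not address it.

```python
def classify_cap(n_has_2ome: bool, n_plus_1_has_2ome: bool) -> str:
    """Name the cap from the 2'-O-methylation state of nucleotides N and N+1."""
    if n_has_2ome and n_plus_1_has_2ome:
        return "cap-2"  # both N and N+1 contain 2'OMe
    if n_has_2ome:
        return "cap-1"  # only N contains 2'OMe
    if not n_plus_1_has_2ome:
        return "cap-0"  # N does not contain 2'OMe
    # Assumption: methylation of N+1 without N is not covered by the text.
    raise ValueError("2'OMe on N+1 without 2'OMe on N is not defined here")


print(classify_cap(True, False))
```

Under this rule, a cap such as m7G(5′)ppp(5′)(2′OMeA)pG, which carries 2′OMe on the first nucleotide only, would be named cap-1.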
- the 5’ cap is m7G(5’)ppp(5’)(2’OMeA)pG.
- Other examples of 5′ caps include, but are not limited to, m7G(5′)ppp(5′)A, G(5′)ppp(5′)A, and G(5′)ppp(5′)G.
- the 5' cap is a CleanCap® (TriLink Biotechnologies) capping structure.
- Non-limiting examples of CleanCap® capping structures include CleanCap® Reagent GG (m7G(5')ppp(5')(2'OMeG)pG), CleanCap® Reagent AU (m7G(5')ppp(5')(2'OMeA)pU), and CleanCap® Reagent AG (m7(3'OMeG)(5')ppp(5')(2'OMeA)pG).
- the 5' cap may regulate nuclear export; prevent degradation by exonucleases; promote translation; and promote 5' proximal intron excision.
- Stabilizing elements for caps include phosphorothioate linkages, boranophosphate modifications, and methylene bridges.
- caps may also contain a non-nucleic acid entity that acts as the binding element for eukaryotic translation initiation factor 4E, eIF4E.
- (C) Polyadenylation signal
- the mRNA comprises a polyA tail.
- the polyA tail serves to improve the stability of the mRNA.
- the polyA tail promotes or increases nuclear export of the mRNA.
- the polyA tail promotes or increases translation of the mRNA.
- the polyA signal is operably linked to the 3′ terminus of the mRNA.
- the polyA signal is directly appended to the 3′ terminus of the 3′ UTR of the mRNA.
- the polyA signal comprises entirely adenosine nucleosides, analogs, or derivatives thereof.
- the polyA tail includes about 10 to about 300 adenosine nucleotides. In some embodiments, the polyA tail includes about 10 to about 200 adenosine nucleotides. In some embodiments, the polyA tail includes about 10 to about 150 adenosine nucleotides. In some embodiments, the polyA tail includes about 10 to about 125 adenosine nucleotides.
- the polyA tail includes about 10 to about 100 adenosine nucleotides. In some embodiments, the polyA tail includes about 10 to about 50 adenosine nucleotides.
- the polyA tail comprises modifications to prevent exonuclease degradation (e.g., phosphorothioate linkages and/or modifications to the nucleobase).
- the polyA tail comprises a 3′ cap comprising one or more modified nucleobases and/or synthetic moieties.
- the disclosure provides an mRNA comprising an ORF encoding a DNA binding protein described herein, wherein the mRNA comprises one or more modifications suitable for delivery, tolerability, and stability within cells, e.g., following in vivo or in vitro administration.
- the mRNA comprises one or more modifications selected from a modified sugar moiety, a modified internucleoside linkage, a modified nucleoside, a modified nucleotide, and a combination thereof.
- nucleoside refers to a molecule comprising a purine or pyrimidine base covalently linked to a ribose or deoxyribose sugar (e.g., adenosine, guanosine, cytidine, uridine, and thymidine).
- nucleotide refers to a nucleoside comprising one or more phosphate groups joined in ester linkages to the sugar moiety (e.g., nucleoside monophosphates, diphosphates, and triphosphates).
- polynucleotide refers to a polymer of nucleotides/nucleosides joined together, e.g., by a phosphodiester linkage between 5′ and 3′ carbon atoms.
- internucleoside linkage refers to a linkage joining one nucleotide/nucleoside unit of a polynucleotide to another nucleotide/nucleoside unit.
- the linkage is a phosphodiester linkage.
- the linkage is a modified linkage, e.g., a phosphorothioate linkage.
- the one or more modifications provide an mRNA exhibiting one or more of the following properties: reduced immunogenicity, nuclease resistance, improved cell uptake, increased half-life, increased translation efficiency, and/or lack of toxicity to cells or mammals, e.g., following contact with cells in vivo, ex vivo, or in vitro.
- the mRNA comprises one or more nucleotide modification and/or nucleoside modifications having reduced immune stimulation properties, e.g., stimulation of innate immune pathways, by exogenous mRNA (see, e.g., Kariko, K, et al (2005) IMMUNITY 23:165; Anderson, et al (2011) NUCLEIC ACIDS RES 39:9329; Warren et al (2010) CELL STEM CELL 7:618).
- the mRNA comprises a chemical modification of one or more nucleosides/nucleotides.
- one or more uridines of the mRNA are chemically-modified or replaced with a chemically-modified nucleoside.
- the chemically-modified nucleoside is selected from: pseudouridine, N1-methylpseudouridine, and 5-methoxyuridine.
- the chemically-modified nucleoside is any one described in US Pub No. 2020/0172935; US Pat No. 10,881,730; or WO/2020/056304, each of which is incorporated by reference herein.
- 100% of the uridines of the mRNA are chemically-modified.
- about 95% of the uridines of the mRNA are chemically-modified. In some embodiments, about 90% of the uridines of the mRNA are chemically-modified. In some embodiments, about 85% of the uridines of the mRNA are chemically-modified. In some embodiments, about 80% of the uridines of the mRNA are chemically-modified.
- 100% of the uridines of the mRNA are chemically-modified and/or replaced with N1-methylpseudouridine. In some embodiments, about 95% of the uridines of the mRNA are chemically-modified and/or replaced with N1-methylpseudouridine.
- the modified nucleobase is N1-methylpseudouridine, and the mRNA of the disclosure is fully modified with N1-methylpseudouridine.
- N1-methylpseudouridine represents from 75% to 100% of the uracils in the mRNA. In some embodiments, N1-methylpseudouridine represents 100% of the uracils in the mRNA.
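The percentages of modified uridines recited above can be made concrete with a short calculation. The sketch below assumes an mRNA written as a string in which `U` marks an unmodified uridine and a placeholder symbol (`Y` here) marks a modified one, such as N1-methylpseudouridine. The function name `percent_modified_uridines` and the `Y` convention are hypothetical and are used only for illustration.

```python
def percent_modified_uridines(seq: str, modified_symbol: str = "Y") -> float:
    """Percentage of uridine positions that carry the modified nucleoside."""
    unmodified = seq.count("U")
    modified = seq.count(modified_symbol)
    total = unmodified + modified
    if total == 0:
        raise ValueError("sequence contains no uridine positions")
    return 100.0 * modified / total


# Placeholder sequence: three modified positions (Y) and one unmodified (U).
print(percent_modified_uridines("AYGCYUAY"))
```

A fully modified mRNA in this representation contains no `U` characters, so the function returns 100.0.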
- an mRNA of the disclosure is modified in the coding region (e.g., an open reading frame encoding a DNA binding protein described herein). In some embodiments, the mRNA is modified in regions besides a coding region. For example, in some embodiments, a 5' UTR and/or a 3' UTR are provided, wherein either or both may independently contain one or more different nucleoside modifications.
- a system of the disclosure comprises (i) a DNA binding protein described herein, or a nucleic acid or a recombinant expression vector encoding the DNA binding protein; and (ii) a recombinant expression vector described herein comprising a transgene and a DNA binding polynucleotide.
- the mRNA encodes a DNA binding protein comprising EBNA1, or a variant or a fragment thereof.
- the mRNA encodes a fragment of EBNA1, wherein the fragment comprises an EBNA1 DBD, or a variant thereof.
- the mRNA encodes a chimeric DNA binding protein described herein.
- the chimeric DNA binding protein comprises an EBNA1 DBD and a heterologous chromatin binding domain described herein.
- the chimeric DNA binding protein comprises a variant of an EBNA1 DBD, wherein the variant has substantially equivalent binding to an EBV OriP as compared to a wild-type EBNA1 DBD, and a heterologous chromatin binding domain described herein.
- the chimeric DNA binding protein comprises a fragment of an EBNA1 DBD, wherein the fragment has substantially equivalent binding to an EBV OriP as compared to a wild-type EBNA1 DBD, and a heterologous chromatin binding domain described herein.
- the mRNA comprises an ORF encoding an EBNA1 DBD described herein.
- the mRNA comprises an ORF comprising (i) a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 89; and (ii) a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 93, wherein (i) and (ii) are operably linked.
- the mRNA comprises an ORF comprising (i) a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 91; and (ii) a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 93, wherein (i) and (ii) are operably linked.
- the mRNA comprises an ORF comprising (i) a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 89; (ii) a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 91; and (iii) a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 93, wherein (i), (ii), and (iii) are operably linked.
- the mRNA comprises an ORF comprising a nucleotide sequence encoding an amino acid sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 3.
- the ORF comprises a nucleotide sequence encoding an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, or about 99% identity to SEQ ID NO: 3 [C211 mRNA].
- the ORF comprises a nucleotide sequence encoding SEQ ID NO: 3.
- the ORF comprises a nucleotide sequence having at least about 70%, about 75%, 80%, about 85%, about 90%, about 95%, or about 99% identity to SEQ ID NO: 24. In some embodiments, the ORF comprises SEQ ID NO: 24.
- the mRNA comprises an ORF comprising a nucleotide sequence encoding an amino acid sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 7.
- the ORF comprises a nucleotide sequence encoding an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, or about 99% identity to SEQ ID NO: 7 [C558 mRNA]. In some embodiments, the ORF comprises a nucleotide sequence encoding SEQ ID NO: 7. In some embodiments, the ORF comprises a nucleotide sequence having at least about 70%, about 75%, 80%, about 85%, about 90%, about 95%, or about 99% identity to SEQ ID NO: 25. In some embodiments, the ORF comprises SEQ ID NO: 25.
- the mRNA comprises an ORF comprising a nucleotide sequence encoding an amino acid sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 10.
- the ORF comprises a nucleotide sequence encoding an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, or about 99% identity to SEQ ID NO: 10 [C581 mRNA].
- the ORF comprises a nucleotide sequence encoding SEQ ID NO: 10.
- the ORF comprises a nucleotide sequence having at least about 70%, about 75%, 80%, about 85%, about 90%, about 95%, or about 99% identity to SEQ ID NO: 26. In some embodiments, the ORF comprises SEQ ID NO: 26.
- the mRNA comprises an ORF comprising a nucleotide sequence encoding an amino acid sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 8.
- the ORF comprises a nucleotide sequence encoding an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, or about 99% identity to SEQ ID NO: 8 [C569 mRNA]. In some embodiments, the ORF comprises a nucleotide sequence encoding SEQ ID NO: 8. In some embodiments, the ORF comprises a nucleotide sequence having at least about 70%, about 75%, 80%, about 85%, about 90%, about 95%, or about 99% identity to SEQ ID NO: 27. In some embodiments, the ORF comprises SEQ ID NO: 27.
- the mRNA comprises an ORF comprising a nucleotide sequence encoding an amino acid sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 9.
- the ORF comprises a nucleotide sequence encoding an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, or about 99% identity to SEQ ID NO: 9 [C577 mRNA].
- the ORF comprises a nucleotide sequence encoding SEQ ID NO: 9.
- the ORF comprises a nucleotide sequence having at least about 70%, about 75%, 80%, about 85%, about 90%, about 95%, or about 99% identity to SEQ ID NO: 28. In some embodiments, the ORF comprises SEQ ID NO: 28.
- the mRNA comprises an ORF comprising a nucleotide sequence encoding an amino acid sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 130.
- the ORF comprises a nucleotide sequence encoding an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, or about 99% identity to SEQ ID NO: 130 [C804 mRNA]. In some embodiments, the ORF comprises a nucleotide sequence encoding SEQ ID NO: 130. In some embodiments, the ORF comprises a nucleotide sequence having at least about 70%, about 75%, 80%, about 85%, about 90%, about 95%, or about 99% identity to SEQ ID NO: 129. In some embodiments, the ORF comprises SEQ ID NO: 129.
- the mRNA comprises an ORF comprising a nucleotide sequence encoding an amino acid sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 133.
- the ORF comprises a nucleotide sequence encoding an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, or about 99% identity to SEQ ID NO: 133 [C805 mRNA].
- the ORF comprises a nucleotide sequence encoding SEQ ID NO: 133.
- the ORF comprises a nucleotide sequence having at least about 70%, about 75%, 80%, about 85%, about 90%, about 95%, or about 99% identity to SEQ ID NO: 132. In some embodiments, the ORF comprises SEQ ID NO: 132.
- the mRNA comprises an ORF comprising a nucleotide sequence encoding an amino acid sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 136.
- the ORF comprises a nucleotide sequence encoding an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, or about 99% identity to SEQ ID NO: 136 [C806 mRNA]. In some embodiments, the ORF comprises a nucleotide sequence encoding SEQ ID NO: 136. In some embodiments, the ORF comprises a nucleotide sequence having at least about 70%, about 75%, 80%, about 85%, about 90%, about 95%, or about 99% identity to SEQ ID NO: 134. In some embodiments, the ORF comprises SEQ ID NO: 134.
- the mRNA comprises an ORF comprising a nucleotide sequence encoding an amino acid sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 139.
- the ORF comprises a nucleotide sequence encoding an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, or about 99% identity to SEQ ID NO: 139 [C807 mRNA].
- the ORF comprises a nucleotide sequence encoding SEQ ID NO: 139.
- the ORF comprises a nucleotide sequence having at least about 70%, about 75%, 80%, about 85%, about 90%, about 95%, or about 99% identity to SEQ ID NO: 138. In some embodiments, the ORF comprises SEQ ID NO: 138.
- the mRNA comprises an ORF comprising a nucleotide sequence encoding an amino acid sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 142.
- the ORF comprises a nucleotide sequence encoding an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, or about 99% identity to SEQ ID NO: 142 [C778 mRNA]. In some embodiments, the ORF comprises a nucleotide sequence encoding SEQ ID NO: 142. In some embodiments, the ORF comprises a nucleotide sequence having at least about 70%, about 75%, 80%, about 85%, about 90%, about 95%, or about 99% identity to SEQ ID NO: 141. In some embodiments, the ORF comprises SEQ ID NO: 141.
- the mRNA comprises an ORF comprising a nucleotide sequence encoding an amino acid sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 145.
- the ORF comprises a nucleotide sequence encoding an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, or about 99% identity to SEQ ID NO: 145 [C779 mRNA].
- the ORF comprises a nucleotide sequence encoding SEQ ID NO: 145.
- the ORF comprises a nucleotide sequence having at least about 70%, about 75%, 80%, about 85%, about 90%, about 95%, or about 99% identity to SEQ ID NO: 144. In some embodiments, the ORF comprises SEQ ID NO: 144.
- the mRNA comprises an ORF comprising a nucleotide sequence encoding an amino acid sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 150.
- the ORF comprises a nucleotide sequence encoding an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, or about 99% identity to SEQ ID NO: 150 [C570 mRNA]. In some embodiments, the ORF comprises a nucleotide sequence encoding SEQ ID NO: 150. In some embodiments, the ORF comprises a nucleotide sequence having at least about 70%, about 75%, 80%, about 85%, about 90%, about 95%, or about 99% identity to SEQ ID NO: 149. In some embodiments, the ORF comprises SEQ ID NO: 149.
- the mRNA comprises an ORF comprising a nucleotide sequence encoding an amino acid sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 153.
- the ORF comprises a nucleotide sequence encoding an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, or about 99% identity to SEQ ID NO: 153 [C566 mRNA].
- the ORF comprises a nucleotide sequence encoding SEQ ID NO: 153.
- the ORF comprises a nucleotide sequence having at least about 70%, about 75%, 80%, about 85%, about 90%, about 95%, or about 99% identity to SEQ ID NO: 152. In some embodiments, the ORF comprises SEQ ID NO: 152.
- the mRNA encodes an EBNA1 homolog described herein. In some embodiments, the mRNA encodes a fragment of an EBNA1 homolog described herein, wherein the fragment comprises a DBD of the EBNA1 homolog.
- the mRNA encodes a variant of an EBNA1 homolog described herein, wherein the variant comprises a DBD having substantially equivalent binding to the DNA binding polynucleotide as compared to a DBD of the wild-type EBNA1 homolog. In some embodiments, the mRNA encodes a DBD of an EBNA1 homolog described herein. In some embodiments, the mRNA encodes a variant of a DBD described herein (e.g., a DBD of an EBNA1 homolog described herein), wherein the variant has substantially equivalent binding to the DNA binding polynucleotide as compared to the wild-type DBD.
- the mRNA encodes a fragment of a DBD described herein (e.g., a DBD of an EBNA1 homolog described herein), wherein the fragment has substantially equivalent binding to the DNA binding polynucleotide as compared to the wild-type DBD.
- the mRNA encodes a chimeric DNA binding protein described herein.
- the chimeric DNA binding protein comprises a DBD of an EBNA1 homolog described herein and a heterologous chromatin binding domain described herein.
- the chimeric DNA binding protein comprises a variant of a DBD described herein (e.g., a DBD of an EBNA1 homolog described herein), wherein the variant has substantially equivalent binding to the DNA binding polynucleotide as compared to the wild-type DBD, and a heterologous chromatin binding domain described herein.
- the chimeric DNA binding protein comprises a fragment of a DBD described herein (e.g., a DBD of an EBNA1 homolog described herein), wherein the fragment has substantially equivalent binding to the DNA binding polynucleotide as compared to the wild-type DBD, and a heterologous chromatin binding domain described herein.
- the mRNA comprises an ORF encoding a DBD of an EBNA1 homolog described herein.
- the ORF comprises a nucleotide sequence encoding an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, or about 99% identity to SEQ ID NO: 215.
- the ORF comprises a nucleotide sequence encoding SEQ ID NO: 215.
- the ORF comprises a nucleotide sequence having at least about 70%, about 75%, 80%, about 85%, about 90%, about 95%, or about 99% identity to SEQ ID NO: 349.
- the ORF comprises SEQ ID NO: 349. In some embodiments, the ORF comprises a nucleotide sequence having at least about 70%, about 75%, 80%, about 85%, about 90%, about 95%, or about 99% identity to SEQ ID NO: 201. In some embodiments, the ORF comprises SEQ ID NO: 201.
- the ORF comprises a nucleotide sequence encoding an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, or about 99% identity to SEQ ID NO: 216. In some embodiments, the ORF comprises a nucleotide sequence encoding SEQ ID NO: 216.
- the ORF comprises a nucleotide sequence having at least about 70%, about 75%, 80%, about 85%, about 90%, about 95%, or about 99% identity to SEQ ID NO: 350. In some embodiments, the ORF comprises SEQ ID NO: 350. In some embodiments, the ORF comprises a nucleotide sequence having at least about 70%, about 75%, 80%, about 85%, about 90%, about 95%, or about 99% identity to SEQ ID NO: 203. In some embodiments, the ORF comprises SEQ ID NO: 203.
- the ORF comprises a nucleotide sequence encoding an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, or about 99% identity to SEQ ID NO: 217. In some embodiments, the ORF comprises a nucleotide sequence encoding SEQ ID NO: 217. In some embodiments, the ORF comprises a nucleotide sequence having at least about 70%, about 75%, 80%, about 85%, about 90%, about 95%, or about 99% identity to SEQ ID NO: 351. In some embodiments, the ORF comprises SEQ ID NO: 351.
- the ORF comprises a nucleotide sequence having at least about 70%, about 75%, 80%, about 85%, about 90%, about 95%, or about 99% identity to SEQ ID NO: 205. In some embodiments, the ORF comprises SEQ ID NO: 205.
- the ORF comprises a nucleotide sequence encoding an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, or about 99% identity to SEQ ID NO: 218. In some embodiments, the ORF comprises a nucleotide sequence encoding SEQ ID NO: 218.
- the ORF comprises a nucleotide sequence having at least about 70%, about 75%, 80%, about 85%, about 90%, about 95%, or about 99% identity to SEQ ID NO: 352. In some embodiments, the ORF comprises SEQ ID NO: 352. In some embodiments, the ORF comprises a nucleotide sequence having at least about 70%, about 75%, 80%, about 85%, about 90%, about 95%, or about 99% identity to SEQ ID NO: 207. In some embodiments, the ORF comprises SEQ ID NO: 207.
- the ORF comprises a nucleotide sequence encoding an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, or about 99% identity to SEQ ID NO: 219. In some embodiments, the ORF comprises a nucleotide sequence encoding SEQ ID NO: 219. In some embodiments, the ORF comprises a nucleotide sequence having at least about 70%, about 75%, 80%, about 85%, about 90%, about 95%, or about 99% identity to SEQ ID NO: 353. In some embodiments, the ORF comprises SEQ ID NO: 353.
- the ORF comprises a nucleotide sequence having at least about 70%, about 75%, 80%, about 85%, about 90%, about 95%, or about 99% identity to SEQ ID NO: 209. In some embodiments, the ORF comprises SEQ ID NO: 209.
- the ORF comprises a nucleotide sequence encoding an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, or about 99% identity to SEQ ID NO: 220. In some embodiments, the ORF comprises a nucleotide sequence encoding SEQ ID NO: 220.
- the ORF comprises a nucleotide sequence having at least about 70%, about 75%, 80%, about 85%, about 90%, about 95%, or about 99% identity to SEQ ID NO: 354. In some embodiments, the ORF comprises SEQ ID NO: 354. In some embodiments, the ORF comprises a nucleotide sequence having at least about 70%, about 75%, 80%, about 85%, about 90%, about 95%, or about 99% identity to SEQ ID NO: 211. In some embodiments, the ORF comprises SEQ ID NO: 211.
- the ORF comprises a nucleotide sequence encoding an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, or about 99% identity to SEQ ID NO: 322. In some embodiments, the ORF comprises a nucleotide sequence encoding SEQ ID NO: 322. In some embodiments, the ORF comprises a nucleotide sequence having at least about 70%, about 75%, 80%, about 85%, about 90%, about 95%, or about 99% identity to SEQ ID NO: 323. In some embodiments, the ORF comprises SEQ ID NO: 323.
- the ORF comprises a nucleotide sequence having at least about 70%, about 75%, 80%, about 85%, about 90%, about 95%, or about 99% identity to SEQ ID NO: 323. In some embodiments, the ORF comprises SEQ ID NO: 323.
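The identity thresholds recited above presuppose a way of computing percent identity between two sequences. The following is a minimal sketch under stated assumptions: the two sequences are already aligned to equal length with `-` as the gap character, and identity is counted as matching columns divided by all alignment columns. Other conventions exist, and the disclosure may define identity differently elsewhere. The function names and example sequences are hypothetical placeholders.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pre-computed pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both sequences carry a gap.
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == "-" and y == "-")]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold_pct: float) -> bool:
    """True if the aligned pair has at least threshold_pct identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold_pct


# Placeholder sequences differing at one of nine positions.
print(meets_identity_threshold("AUGGCUUAA", "AUGGCAUAA", 85.0))
```

The same function applies to aligned amino acid sequences, since it compares characters column by column without assuming an alphabet.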
- Recombinant Expression Vector
- the disclosure provides a recombinant expression vector comprising at least one transgene and a polynucleotide comprising one or more DBEs that bind to a DNA binding protein described herein. In some embodiments, the DBE binds to a DNA binding protein comprising an EBNA1 DBD, or a fragment or variant thereof.
- the DBE binds to a DNA binding protein comprising a DBD of an EBNA1 homolog, or a fragment or a variant thereof.
- the DBE is an EBV DBE or a fragment or a variant thereof.
- the polynucleotide comprises one or more DBEs of an EBV OriP sequence, wherein the polynucleotide comprises an arrangement of the one or more DBEs that is distinct from their arrangement in a wild-type EBV OriP sequence.
- the one or more DBEs are arranged such that they bind to a DNA binding protein described herein.
- the DBE is an NHP LCV DBE or a fragment or a variant thereof.
- the polynucleotide comprises one or more DBEs of an NHP LCV, wherein the polynucleotide comprises an arrangement of the one or more DBEs that is distinct from their arrangement in a genome of the NHP LCV.
- the one or more DBEs are operably linked.
- the one or more DBEs are contiguous.
- the one or more DBEs are interspersed with a nucleotide spacer.
- the recombinant expression vector is a DNA, e.g., a linear or circular single-stranded or double-stranded DNA.
- the recombinant expression vector is a linear single-stranded DNA. In some embodiments, the recombinant expression vector is a linear double-stranded DNA. In some embodiments, the recombinant expression vector is a linear, covalently closed double-stranded DNA. In some embodiments, the recombinant expression vector is a circular single-stranded DNA. In some embodiments, the recombinant expression vector is a circular double-stranded DNA. In some embodiments, the recombinant expression vector is a plasmid.
- the recombinant expression vector comprises a nucleic acid sequence comprising at least one transgene, wherein the at least one transgene encodes a protein to be expressed in the open reading frame of that gene.
- the recombinant expression vector comprises a nucleic acid sequence comprising a transgene (e.g., 1, 2, 3, 4, or more transgene(s)), wherein the transgene encodes a protein to be expressed in the open reading frame of that gene.
- the recombinant expression vector comprises one or more additional elements for regulating expression of the at least one transgene.
- the recombinant expression vector comprises an additional element (e.g., 1, 2, 3, 4 or more additional element(s)) for regulating expression of the transgene.
- the disclosure provides a recombinant expression vector comprising at least one transgene, a polynucleotide comprising one or more DNA binding elements (DBEs) of a DNA binding protein described herein, and one or more polynucleotides to enable and/or regulate expression of the transgene when introduced to a cell.
- the one or more polynucleotides comprise a promoter and/or enhancer.
- the one or more regulatory elements are operably linked to the transgene to enable expression.
- a transgene-encoding recombinant expression vector of the disclosure comprises a DNA binding polynucleotide (e.g., 1, 2, 3, 4 or more DNA binding polynucleotide(s)) comprising an array of sequence elements, wherein each sequence element comprises a DBE that binds a DNA binding protein described herein, or a fragment or a variant thereof.
- a recombinant expression vector of the disclosure comprises a polynucleotide comprising one or more DBEs that bind to the EBNA1 DBD of a DNA binding protein described herein.
- the one or more DBEs comprise an EBV OriP DBE or a functional variant thereof.
- the polynucleotide comprises a single EBV OriP DBE or a functional variant thereof.
- the polynucleotide comprises more than one EBV OriP DBE or functional variants thereof.
- the EBV viral genome comprises a region of about 1.7kb that is termed the OriP.
- wild-type EBV OriP refers to the native OriP present in the EBV genome that functions in viral replication, episomal maintenance, and other latent viral life cycle functions.
- Publicly available databases provide sequence information for the EBV genome.
- the sequence for a representative EBV reference genome is accessible via the NCBI Reference Sequence: NC_007605.1.
- the portion of the EBV reference genome (NC_007605.1) corresponding to the OriP is set forth by coordinates about 7315 to about 9312 (the Family of Repeats corresponding to coordinates about 7421 to about 8030, and the Dyad Symmetry corresponding to coordinates about 9021 to about 9135).
- the OriP corresponds to nucleotides about 107 to about 1821 of SEQ ID NO: 2.
- the OriP comprises two distinct binding sites for the EBNA1 DBD. The first is referred to as “the family of repeats” or “FR” and the second is referred to as the “Dyad Symmetry” or “DS.”
- the FR comprises a nucleotide sequence corresponding to nucleotides about 107 to about 731 of SEQ ID NO: 2.
- the DS comprises a nucleotide sequence corresponding to nucleotides about 1707 to about 1821 of SEQ ID NO: 2.
- the FR comprises a series of repeats of a 30 bp sequence (referred to herein as the “FR repeat”), comprising (i) a palindromic 12 bp DBE (referred to herein as the “OriP DBE”); and (ii) an 18 bp spacer (referred to herein as the “FR spacer”).
- the FR comprises 21 FR repeats.
- the DS comprises a series of repeats of the OriP DBE.
- the DS comprises 4 OriP DBEs. Exemplary OriP DBE nucleotide sequences are provided in Table 8 and exemplary FR spacer sequences are provided in Table 9.
- An exemplary FR repeat sequence combines (i) a nucleotide sequence from Table 8 linked to (ii) a nucleotide sequence from Table 9, wherein the 3′ terminus of (i) is linked to the 5′ terminus of (ii).
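The lengths stated above can be cross-checked with a little arithmetic. The following is a minimal sketch, assuming the SEQ ID NO: 2 coordinates are 1-based and inclusive (the text only says "about", so small discrepancies are expected); the helper name `span_length` is hypothetical.

```python
# Lengths stated in the text, in base pairs.
DBE_LEN = 12        # palindromic OriP DBE
SPACER_LEN = 18     # FR spacer
N_FR_REPEATS = 21   # number of FR repeats in the FR


def span_length(start: int, end: int) -> int:
    """Length of a coordinate range, ASSUMING 1-based inclusive coordinates."""
    return end - start + 1


fr_repeat_len = DBE_LEN + SPACER_LEN                 # one FR repeat
fr_len_from_repeats = N_FR_REPEATS * fr_repeat_len   # 21 repeats end to end
fr_len_from_coords = span_length(107, 731)           # FR in SEQ ID NO: 2
ds_len_from_coords = span_length(1707, 1821)         # DS in SEQ ID NO: 2
orip_len_from_coords = span_length(107, 1821)        # whole OriP in SEQ ID NO: 2

print(fr_repeat_len, fr_len_from_repeats, fr_len_from_coords)
print(ds_len_from_coords, orip_len_from_coords)
```

Under these assumptions, 21 repeats of 30 bp give 630 bp, close to the roughly 625 bp FR span, and the OriP span of 1715 bp is consistent with "about 1.7 kb".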
- Table 8 Exemplary Nucleotide Sequences for OriP DBE
- Table 9 Exemplary Nucleotide Sequences for FR spacer
- the DBE comprises or consists of a sequence having at least about 70%, about 75%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to a sequence set forth in Table 8.
- the DBE comprises or consists of a sequence having 1, 2, 3, 4, 5, or more mismatches relative to a sequence set forth in Table 8. In some embodiments, the DBE comprises or consists of a sequence set forth in Table 8.
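The mismatch counts and percent-identity thresholds above can be related with a simple calculation. This is a minimal sketch, assuming an ungapped, equal-length comparison (the text does not specify an alignment method). The 12-nt reference below is the first 12 nucleotides of the FR repeat given elsewhere in this document as SEQ ID NO: 68; whether it matches a Table 8 entry is not asserted here, and the variant sequences are hypothetical.

```python
def count_mismatches(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length for this ungapped comparison")
    return sum(x != y for x, y in zip(a.upper(), b.upper()))


def percent_identity(a: str, b: str) -> float:
    """Ungapped percent identity between two equal-length sequences."""
    return 100.0 * (len(a) - count_mismatches(a, b)) / len(a)


reference = "TAGCATATGCTA"      # placeholder 12-nt element (see lead-in)
one_mismatch = "TAGCATATGCTG"   # hypothetical variant, 1 substitution
two_mismatch = "CAGCATATGCTG"   # hypothetical variant, 2 substitutions

for variant in (one_mismatch, two_mismatch):
    print(count_mismatches(reference, variant),
          round(percent_identity(reference, variant), 1))
```

For a 12-nt element, one mismatch corresponds to about 91.7% identity and two mismatches to about 83.3%, so on short elements the "about 95%" and higher thresholds effectively require an exact match.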
- the DNA binding polynucleotide comprises an EBV DBE described herein (e.g., 1-50 EBV DBEs described herein), or a fragment or a variant thereof.
- the EBV DBE comprises SEQ ID NO: 51.
- the EBV DBE comprises a mismatch (e.g., 1, 2, 3, 4, or 5 mismatches) relative to SEQ ID NO: 51.
- the EBV DBE comprises a mismatch (e.g., 1, 2, 3, 4, or 5 mismatches) relative to a sequence set forth in Table 8.
- the EBV DBE comprises a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 51.
- the EBV DBE comprises a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to any one sequence set forth in Table 8.
- the EBV DBE comprises a nucleotide sequence set forth in SEQ ID NO: 51. In some embodiments, the EBV DBE comprises a nucleotide sequence set forth in Table 8.
(ii) DBE of an NHP LCV
- the DBE is derived from an NHP LCV. Methods to identify a DBE in the genome of an NHP LCV are known in the art. In some embodiments, the LCV genome is sequenced and assembled according to methods described herein. In some embodiments, the LCV genome is sequenced using a short read method (e.g., Illumina-based sequencing).
- an LCV genome sequenced using a short read method will lack repeating sequence elements in a first sequence assembly, whereas an LCV genome generated using a long read method will comprise repeating sequence elements in the first sequence assembly.
- the LCV genome is sequenced using a short read method and unknown portions of the genome are further extended by generating primers near the ends of the known part of the genome and obtaining the sequence of the unknown region using known sequencing methods (e.g., Sanger sequencing).
- the unknown portion of the genome is amplified by PCR and sequenced using nanopore sequencing.
- the assembled LCV genome is searched for a repeating sequence element comprising substantial sequence identity to an EBV DBE or a DBE described herein.
- a repeat-finding software is used to identify a repeating sequence element in the assembled LCV genome.
- An exemplary repeat-finding software is tandem repeats finder (see Benson (1999) Nucleic Acids Res 27:573).
- a motif discovery tool is used to identify a repeating sequence element in the assembled LCV genome.
- An exemplary motif discovery tool is MEME (available via meme-suite.org/meme/).
- the assembled LCV genome is searched for a palindromic sequence or a substantially palindromic sequence using the motif discovery tool.
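The search described above can be illustrated with a naive k-mer scan. This is only a sketch of the general idea and is not the algorithm used by Tandem Repeats Finder or MEME; the function names, the toy "genome", and the choice of k are assumptions for illustration. The toy sequence is built from the FR repeat given elsewhere in this document as SEQ ID NO: 68, flanked by arbitrary bases.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def repeated_palindromic_kmers(genome: str, k: int, min_count: int = 2) -> dict:
    """Return k-mers that equal their own reverse complement and occur at least min_count times."""
    genome = genome.upper()
    counts = Counter(genome[i:i + k] for i in range(len(genome) - k + 1))
    return {kmer: n for kmer, n in counts.items()
            if n >= min_count and kmer == reverse_complement(kmer)}


# Toy input: three copies of a 30-nt repeat with arbitrary flanking bases.
FR_REPEAT = "TAGCATATGCTACCCAGATATAGATTAGGA"
toy_genome = "GGCC" + FR_REPEAT * 3 + "TTAA"

hits = repeated_palindromic_kmers(toy_genome, k=12, min_count=3)
print(hits)
```

A real assembly would call for a dedicated repeat finder or motif discovery tool, which tolerate mismatches between repeat copies; this exact-match scan does not.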
- the DBE is a repeating sequence element identified in the genome of an NHP LCV described herein.
- the repeating sequence element is a palindrome. In some embodiments, the repeating sequence element is substantially palindromic. In some embodiments, at least about 50%, about 60%, about 70%, about 80%, or about 90% of the nucleotides in the repeating sequence element are palindromic. In some embodiments, the repeating sequence element is about 5 to about 50 nucleotides in length. In some embodiments, the repeating sequence element is about 5 to about 40 nucleotides in length. In some embodiments, the repeating sequence element is about 5 to about 30 nucleotides in length. In some embodiments, the repeating sequence element is about 5 to about 20 nucleotides in length.
- the repeating sequence element is about 5, 6, 7, 8, 9, 10, 11, 12, 13, 14 or 15 nucleotides in length.
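One way to quantify "substantially palindromic" is sketched below. It assumes, as an interpretation not stated in the text, that a nucleotide counts as palindromic when it matches the base at the same position of the sequence's reverse complement. The function names are hypothetical; the first example sequence is the 18-nt element listed under item (iii) for Callitrichine gammaherpesvirus 3, and the last is a hypothetical single-substitution variant.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def palindromic_fraction(seq: str) -> float:
    """Fraction of positions that match the reverse complement at the same position."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    rc = reverse_complement(seq)
    return sum(a == b for a, b in zip(seq, rc)) / len(seq)


perfect = "TTTGTTGGCGCCAACAAA"   # 18-nt element from item (iii)
perfect_12 = "TAGCATATGCTA"      # first 12 nt of SEQ ID NO: 68
variant_12 = "TAGCATATGCTG"      # hypothetical: one substitution at the 3' end

for s in (perfect, perfect_12, variant_12):
    print(s, round(palindromic_fraction(s), 3))
```

Under this definition a single substitution breaks two positions (the changed base and its mirror partner), and the central base of an odd-length sequence can never match its own complement.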
- the repeating sequence element, a variant thereof (e.g., a variant comprising 1, 2, 3, or 4 mismatches relative to the repeating sequence element), or a fragment thereof (e.g., a fragment comprising a deletion of 1, 2, 3, or 4 nucleotides at the 5′ end, internally, or at the 3′ end)
- the DBE is a repeating sequence element identified in the genome of Callitrichine gammaherpesvirus 3.
- the repeating sequence element identified in the genome of Callitrichine gammaherpesvirus 3 is selected from (i) CGCCAACAAACGTTG (SEQ ID NO: 317), a nucleotide sequence having 1, 2, 3, or 4 mismatches relative to SEQ ID NO: 317, or a nucleotide sequence having at least about 70%, about 75%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 317; (ii) CAACACCCAGTCACGCAGTCTCAAGGGTCCT (SEQ ID NO: 318), a nucleotide sequence having 1, 2, 3, or 4 mismatches relative to SEQ ID NO: 318, or a nucleotide sequence having at least about 70%, about 75%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 318; (iii) TTTGTTGGCGCCAACAAA (SEQ ID NO:
- the DBE comprises or consists of a sequence having at least about 70%, about 75%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to a sequence selected from SEQ ID NOs: 317-319, 263, and 336. In some embodiments, the DBE comprises or consists of a sequence having 1, 2, 3, 4, 5, or more mismatches relative to a sequence selected from SEQ ID NOs: 317-319, 263, and 336. In some embodiments, the DBE comprises or consists of a sequence selected from SEQ ID NOs: 317-319, 263, and 336.
- the disclosure provides a recombinant expression vector comprising a transgene (e.g., 1, 2, 3, 4, or more transgene(s)) and a DNA binding polynucleotide (e.g., 1, 2, 3, 4, or more DNA binding polynucleotide(s)).
- the DNA binding polynucleotide comprises an array of sequence motifs (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, or more operably linked sequence motifs), wherein each sequence motif comprises a DBE that binds to a DNA binding protein described herein, and/or a fragment thereof or a variant thereof (e.g., a fragment of the DBE or a variant having 1 or 2 mismatches relative to the DBE, wherein the fragment or the variant retains specific binding to the DNA binding protein of the system).
- the recombinant expression vector comprises one DNA binding polynucleotide.
- the recombinant expression vector comprises more than one DNA binding polynucleotide (e.g., 2, 3, 4, 5, or 6 operably linked DNA binding polynucleotides).
- the DNA binding protein binds to the recombinant expression vector (or the portion thereof comprising the DNA binding polynucleotide) with a binding affinity (KD) of 10 pM to 10 µM.
- the DNA binding protein binds to the recombinant expression vector (or the portion thereof comprising the DNA binding polynucleotide) with an apparent binding affinity (KD apparent) that is about 10-fold to about 1,000,000-fold higher than its binding affinity (KD) to a single sequence motif (e.g., the DBE, or a fragment or a variant thereof) of the DNA binding polynucleotide.
- sequence motifs are arranged such that the DNA binding polynucleotide binds to a plurality of DNA binding proteins.
- the sequence motifs are arranged to provide multiple binding sites for the DNA binding protein, thereby yielding a DNA binding polynucleotide that binds to a plurality of DNA binding proteins (e.g., as compared to a DNA binding polynucleotide comprising a single sequence motif).
- the arrangement of the sequence motifs to provide multiple binding sites for the DNA binding protein results in increased propensity to form a complex comprising the DNA binding polynucleotide and a plurality of DNA binding proteins, e.g., as compared to a DNA binding polynucleotide comprising a single sequence motif.
- the sequence motifs of the array are operably linked.
- the sequence motifs of the array are contiguous. In some embodiments, the sequence motifs of the array are interspersed with a spacer sequence. In some embodiments, the sequence element comprises a DBE of EBV described herein, or a fragment or a variant thereof. In some embodiments, the sequence element comprises a DBE of an NHP LCV described herein, or a fragment or a variant thereof.
- the sequence elements of the array are arranged in such a manner (e.g., having a number, orientation, sequence similarity, and/or spacing) that the DNA binding polynucleotide binds to a DNA binding protein described herein, e.g., as determined by a method of measuring binding interactions described herein.
- the array comprises at least 2, at least 3, or at least 4 sequence elements.
- the array comprises about 4 to about 60, about 4 to about 50, about 4 to about 40, about 4 to about 30, about 4 to about 20, about 4 to about 10, about 8 to about 60, about 8 to about 50, about 8 to about 40, about 8 to about 30, about 8 to about 20, about 8 to about 10, about 12 to about 60, about 12 to about 50, about 12 to about 40, about 12 to about 30, about 12 to about 20, about 16 to about 60, about 16 to about 50, about 16 to about 40, about 16 to about 30, about 16 to about 20, about 20 to about 60, about 20 to about 50, about 20 to about 40, or about 20 to about 30 sequence elements.
- the array comprises about 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or 25 sequence elements.
- the array comprises about 4 to about 20 sequence elements. In some embodiments, the array comprises about 10 to about 20 sequence elements. In some embodiments, the DNA binding polynucleotide comprises about 20 sequence elements. In some embodiments, each sequence element comprises a DBE described herein (e.g., a DBE of EBV or a DBE of an NHP LCV), or a fragment or a variant thereof. In some embodiments, the sequence elements are the same. In some embodiments, the sequence elements are different. In some embodiments, the sequence elements are linked by a phosphate linkage or an analog thereof (e.g., a phosphorothioate linkage). In some embodiments, the sequence elements are linked by a spacer sequence described herein.
- the array comprises, from 5′ to 3′, a 5′ sequence element and a 3′ sequence element, wherein the 5′ sequence element and the 3′ sequence element are operably linked (e.g., by a phosphate linkage or by a spacer sequence), and wherein the 5′ sequence element and the 3′ sequence element are selected from the DBE and a fragment or a variant thereof.
- the array comprises, from 5′ to 3′, a 5′ sequence element, an internal sequence element, and a 3′ sequence element, wherein the 5′ sequence element, the internal sequence element, and the 3′ sequence element are operably linked (e.g., by a phosphate linkage or by a spacer sequence), wherein the 5′ sequence element, the internal sequence element, and the 3′ sequence element are a DBE described herein (e.g., a DBE of EBV or a DBE of an NHP LCV), a fragment thereof, and/or a variant thereof.
- the array comprises, from 5′ to 3′, a 5′ sequence element, at least two internal sequence elements, and a 3′ sequence element, wherein the 5′ sequence element, the at least two internal sequence elements, and the 3′ sequence element are operably linked (e.g., by a phosphate linkage or by a spacer sequence), and wherein the 5′ sequence element, the at least two internal sequence elements, and the 3′ sequence element are a DBE described herein (e.g., a DBE of EBV or a DBE of an NHP LCV), a fragment thereof, and/or a variant thereof.
- the array comprises, from 5′ to 3′, a 5′ sequence element, about 2 to about 48 internal sequence elements, and a 3′ sequence element, wherein the 5′ sequence element, the internal sequence elements, and the 3′ sequence element are operably linked (e.g., by a phosphate linkage or by a spacer sequence), and wherein the 5′ sequence element, the internal sequence elements, and the 3′ sequence element are a DBE described herein (e.g., a DBE of EBV or a DBE of an NHP LCV), a fragment thereof, and/or a variant thereof.
- the array comprises, from 5′ to 3′, a 5′ sequence element, about 2 to about 18 internal sequence elements, and a 3′ sequence element, wherein the 5′ sequence element, the internal sequence elements, and the 3′ sequence element are operably linked (e.g., by a phosphate linkage or by a spacer sequence), and wherein the 5′ sequence element, the internal sequence elements, and the 3′ sequence element are a DBE described herein (e.g., a DBE of EBV or a DBE of an NHP LCV), a fragment thereof, and/or a variant thereof.
- the array comprises, from 5′ to 3′, a 5′ sequence element, about 8 to about 15 internal sequence elements, and a 3′ sequence element, wherein the 5′ sequence element, the internal sequence elements, and the 3′ sequence element are operably linked (e.g., by a phosphate linkage or by a spacer sequence), and wherein the 5′ sequence element, the internal sequence elements, and the 3′ sequence element are selected from a DBE described herein (e.g., a DBE of EBV or a DBE of an NHP LCV), a fragment thereof, and/or a variant thereof.
- the sequence elements are the same.
- the sequence elements are each from a DBE described herein (e.g., a DBE of EBV or a DBE of an NHP LCV).
- the sequence elements are each a fragment of the DBE (e.g., a fragment comprising 1 or 2 deletions at the 5′ end, the 3′ end, or an internal region relative to the DBE).
- the sequence elements are each a variant of the DBE (e.g., a variant comprising 1 or 2 mismatches relative to the DBE or having at least about 80%, about 85%, about 90%, about 95%, or about 99% sequence identity to the DBE).
- a majority (i.e., more than 50%) of the sequence elements are the same.
- the majority (i.e., more than 50%) of the sequence elements are a DBE described herein (e.g., a DBE of EBV or a DBE of an NHP LCV), wherein if the majority is less than 100%, the remaining sequence elements are each individually selected from a fragment of the DBE (e.g., a fragment comprising 1 or 2 deletions at the 5′ end, the 3′ end, or an internal region relative to the DBE) and a variant of the DBE (e.g., a variant comprising 1 or 2 mismatches relative to the DBE or having at least about 80%, about 85%, about 90%, about 95%, or about 99% sequence identity to the DBE).
- a proportion (e.g., about 20% to about 100%) of the sequence elements are the same.
- a proportion (e.g., about 20% to about 100%) of the sequence elements are a DBE described herein (e.g., a DBE of EBV or a DBE of an NHP LCV), wherein if the proportion is less than 100%, the remaining sequence elements are each individually selected from a fragment of the DBE (e.g., a fragment comprising 1 or 2 deletions at the 5′ end, the 3′ end, or an internal region relative to the DBE) and a variant of the DBE (e.g., a variant comprising 1 or 2 mismatches relative to the DBE or having at least about 80%, about 85%, about 90%, about 95%, or about 99% sequence identity to the DBE).
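The "majority" and "proportion" conditions above can be checked on a concrete array as follows. This is a minimal sketch: the function name is hypothetical, the reference element is a placeholder (the first 12 nt of the FR repeat given elsewhere in this document as SEQ ID NO: 68), and the variant is a hypothetical single-mismatch sequence.

```python
def fraction_matching(elements: list, reference: str) -> float:
    """Fraction of sequence elements that are identical to the reference element."""
    if not elements:
        raise ValueError("array has no sequence elements")
    ref = reference.upper()
    return sum(e.upper() == ref for e in elements) / len(elements)


REFERENCE = "TAGCATATGCTA"   # placeholder DBE (see lead-in)
VARIANT = "TAGCATATGCTG"     # hypothetical variant with 1 mismatch

# A 20-element array: 15 copies of the reference and 5 copies of the variant.
array = [REFERENCE] * 15 + [VARIANT] * 5

share = fraction_matching(array, REFERENCE)
is_majority = share > 0.5                    # "more than 50%"
in_proportion_range = 0.20 <= share <= 1.0   # "about 20% to about 100%"
print(share, is_majority, in_proportion_range)
```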
- the array comprises about 4 to about 50 sequence elements, wherein the sequence elements are the same. In some embodiments, the array comprises about 10 to about 50 sequence elements, wherein the sequence elements are the same. In some embodiments, the array comprises about 10 to about 40 sequence elements, wherein the sequence elements are the same. In some embodiments, the array comprises about 10 to about 30 sequence elements, wherein the sequence elements are the same. In some embodiments, the array comprises about 10 to about 20 sequence elements, wherein the sequence elements are the same. In some embodiments, the array comprises about 15 to about 25 sequence elements, wherein the sequence elements are the same.
- the sequence elements each comprise a DBE described herein (e.g., a DBE of EBV or a DBE of an NHP LCV). In some embodiments, the sequence elements each comprise a fragment of a DBE described herein (e.g., a DBE of EBV or a DBE of an NHP LCV). In some embodiments, the sequence elements each comprise a fragment of a DBE described herein (e.g., a DBE of EBV or a DBE of an NHP LCV), the fragment comprising one or more deletions at the 5′ end, the 3′ end, and/or an internal region relative to the DBE.
- the sequence elements each comprise a fragment of a DBE described herein (e.g., a DBE of EBV or a DBE of an NHP LCV), the fragment comprising 1 or 2 deletions at the 5′ end, the 3′ end, or an internal region relative to the DBE.
- the sequence elements each comprise a variant of a DBE described herein (e.g., a DBE of EBV or a DBE of an NHP LCV), the variant comprising 1 or 2 mismatches relative to the DBE.
- the sequence elements each comprise a variant of the DBE that binds the DNA binding protein, the variant having at least about 80%, about 85%, about 90%, about 95%, or about 99% sequence identity to the DBE.
- the sequence elements are consecutive (i.e., directly linked by a phosphate linkage or an analog thereof (e.g., a phosphorothioate linkage)).
- the sequence elements are operably linked by a linker.
- the linker comprises a nucleotide spacer sequence.
- the nucleotide spacer sequence is at least 1, 2, 3, 4, or 5 nucleotide(s) in length.
- the nucleotide spacer sequence is up to 65, 60, 50, 45, 40, 35, 30, 25, 20, 15, or 10 nucleotides in length.
- the nucleotide spacer sequence is 1-60 nucleotides, 1-50 nucleotides, 1-40 nucleotides, 1-30 nucleotides, 1-20 nucleotides, 1-15 nucleotides, 5-60 nucleotides, 5-50 nucleotides, 5-40 nucleotides, 5-30 nucleotides, 5-20 nucleotides, 5-15 nucleotides, 10-60 nucleotides, 10-50 nucleotides, 10-40 nucleotides, 10-30 nucleotides, 10-20 nucleotides, 10-15 nucleotides, 15-60 nucleotides, 15-50 nucleotides, 15-40 nucleotides, 15-30 nucleotides, 15-20 nucleotides, 20-60 nucleotides, 20-50 nucle
- the nucleotide spacer sequence is 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, or 40 nucleotides.
- the nucleotide spacer sequence has AT-content of at least about 50%, about 55%, about 60%, about 65%, about 70%, about 75%, about 80%, about 85%, about 90%, or about 95%.
- the nucleotide spacer sequence comprises SEQ ID NO: 61.
- the nucleotide spacer sequence comprises a sequence having at least about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98%, or about 99% identity.
- the nucleotide spacer sequence is set forth in Table 9.
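The AT-content criterion for a spacer can be computed directly. The sketch below assumes AT-content means the fraction of A and T bases over the spacer length; the example spacer is the last 18 nucleotides of the FR repeat given elsewhere in this document as SEQ ID NO: 68, on the assumption that the repeat splits into a 12-nt DBE followed by an 18-nt spacer. It is not asserted to be SEQ ID NO: 61 or a Table 9 entry.

```python
def at_content(seq: str) -> float:
    """Fraction of bases in the sequence that are A or T."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "AT" for base in seq) / len(seq)


FR_REPEAT = "TAGCATATGCTACCCAGATATAGATTAGGA"   # SEQ ID NO: 68 as given in the text
spacer = FR_REPEAT[12:]                        # ASSUMED 18-nt spacer portion

print(len(spacer), round(100 * at_content(spacer), 1))
```

For this example spacer the value is 11 of 18 bases, or about 61%, which would satisfy the "at least about 50%" through "at least about 60%" thresholds but not the higher ones.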
- the disclosure provides a recombinant expression vector comprising a polynucleotide comprising at least one transgene and an array of DBEs, wherein the sequence of each DBE in the array is selected from an OriP DBE consensus sequence, a fragment of an OriP DBE consensus sequence, and a variant of an OriP DBE consensus sequence.
- the OriP DBE consensus sequence is calculated from the alignment of the DBE repeat units in the OriP FR and/or OriP DS. In some embodiments, the OriP DBE consensus sequence is set forth in SEQ ID NO: 51. In some embodiments, the DBEs in the array are the same sequence. In some embodiments, the DBEs in the array are not the same sequence. In some embodiments, a majority (i.e., more than 50%) of the DBEs in the array are the same sequence. In some embodiments, a portion (e.g., greater than about 20%) of the DBEs in the array are the same sequence.
- the DBEs in the array are consecutive (i.e., directly linked by a phosphate linkage or an analog thereof (e.g., a phosphorothioate linkage)).
- the DBEs in the array are operably linked by a linker.
- the linker comprises a nucleotide spacer sequence.
- the nucleotide spacer sequence is at least 1, 2, 3, 4, or 5 nucleotide(s) in length.
- the nucleotide spacer sequence is up to 65, 60, 50, 45, 40, 35, 30, 25, 20, 15, or 10 nucleotides in length.
- the nucleotide spacer sequence is 1-60 nucleotides, 1-50 nucleotides, 1-40 nucleotides, 1-30 nucleotides, 1-20 nucleotides, 1-15 nucleotides, 5-60 nucleotides, 5-50 nucleotides, 5-40 nucleotides, 5-30 nucleotides, 5-20 nucleotides, 5-15 nucleotides, 10-60 nucleotides, 10-50 nucleotides, 10-40 nucleotides, 10-30 nucleotides, 10-20 nucleotides, 10-15 nucleotides, 15-60 nucleotides, 15-50 nucleotides, 15-40 nucleotides, 15-30 nucleotides, 15-20 nucleotides, 20-60 nucleotides, 20-50 nucleotides, 20-40 nucleotides, or 20-30 nucleotides.
- the nucleotide spacer sequence is 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, or 40 nucleotides.
- the nucleotide spacer sequence has AT-content of at least about 50%, about 55%, about 60%, about 65%, about 70%, about 75%, about 80%, about 85%, about 90%, or about 95%.
- the nucleotide spacer sequence comprises SEQ ID NO: 61.
- the nucleotide spacer sequence comprises a sequence having at least about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98%, or about 99% identity. In some embodiments, the nucleotide spacer sequence is set forth in Table 9.
- the DNA binding polynucleotide comprises a sequence according to the formula 5′-([D1]-[L1])-([Dn]-[Ln])x-3′, wherein [D1] and [Dn] each represent a sequence element, wherein the sequence element comprises a DBE described herein (e.g., a DBE of EBV or a DBE of an NHP LCV), a fragment of the DBE (e.g., a fragment comprising about 1 or 2 deletions at the 5′ end, the 3′ end, or an internal region of the DBE), and/or a variant of the DBE (e.g., a variant comprising 1 or 2 mismatches relative to the DBE or having at least about 80%, about 85%, about 90%, about 95%, or about 99% sequence identity to the DBE), wherein [L1] and [Ln] each represent a linker to the 3′ adjacent DBE selected from a phosphate linkage or
- ([D1]-[L1]) and ([Dn]-[Ln]) each have a length of about 20-50 nucleotides. In some embodiments, ([D1]-[L1]) and ([Dn]-[Ln]) each have a length of about 20-40 nucleotides. In some embodiments, ([D1]-[L1]) and ([Dn]-[Ln]) each have a length of about 20-30 nucleotides. In some embodiments, ([D1]-[L1]) and ([Dn]-[Ln]) each have a length of 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 nucleotides.
- the DNA binding polynucleotide comprises a sequence according to the formula 5′-([D1]-[L1])-([Dn]-[Ln])x-3′, wherein [D1] and [Dn] each represent a DBE selected from an OriP DBE consensus sequence (e.g., SEQ ID NO: 51), a fragment of an OriP DBE consensus sequence (e.g., a fragment comprising about 1 or 2 deletions at the 5′ end, the 3′ end, or an internal region of SEQ ID NO: 51), and/or a variant of an OriP DBE consensus sequence (e.g., a variant comprising 1 or 2 mismatches relative to SEQ ID NO: 51 or having at least about 80%, about 85%, about 90%, about 95%, or about 99% sequence identity to SEQ ID NO: 51), wherein [L1] and [Ln] each represent a linker to the 3′ adjacent DBE selected from a phosphate
- ([D1]-[L1]) and ([Dn]-[Ln]) each have a length of about 20-50 nucleotides. In some embodiments, ([D1]-[L1]) and ([Dn]-[Ln]) each have a length of about 20-40 nucleotides. In some embodiments, ([D1]-[L1]) and ([Dn]-[Ln]) each have a length of about 20-30 nucleotides. In some embodiments, ([D1]-[L1]) and ([Dn]-[Ln]) each have a length of 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 nucleotides.
- the DNA binding polynucleotide comprises a DBE described herein, wherein the DBE is an EBV DBE or an NHP LCV DBE described herein. In some embodiments, the DNA binding polynucleotide comprises more than one DBE described herein. In some embodiments, the DNA binding polynucleotide comprises 1 to 50, 1 to 40, 1 to 30, 1 to 20, or 1 to 10 DBEs described herein. In some embodiments, the DNA binding polynucleotide comprises 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1 DBEs described herein.
- the disclosure provides a recombinant expression vector comprising a polynucleotide comprising at least one transgene and one or more OriP DBEs.
- the one or more OriP DBEs each comprise a consensus sequence as set forth in SEQ ID NO: 51.
- the one or more OriP DBEs each comprises a nucleotide sequence comprising one or more mismatches (e.g., 1, 2, 3, 4, or 5 mismatches) relative to SEQ ID NO: 51.
- the one or more OriP DBEs each comprises a nucleotide sequence comprising one or more mismatches (e.g., 1, 2, 3, 4, or 5 mismatches) relative to any one sequence set forth in Table 8.
- the one or more OriP DBEs each comprises a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 51.
- the one or more OriP DBEs each comprises a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to any one sequence set forth in Table 8.
- the one or more OriP DBEs each comprises a nucleotide sequence set forth in SEQ ID NO: 51.
- the one or more OriP DBEs each comprises a nucleotide sequence set forth in Table 8.
- the polynucleotide comprises one OriP DBE described herein. In some embodiments, the polynucleotide comprises more than one OriP DBE described herein.
- the polynucleotide comprises 1 to 50, 1 to 40, 1 to 30, 1 to 20, or 1 to 10 OriP DBEs described herein. In some embodiments, the polynucleotide comprises 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1 OriP DBEs described herein. In some embodiments, the polynucleotide comprises an array of at least about 4 to about 50 DBEs, wherein each DBE comprises a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to any sequence set forth in Table 8.
- the DNA binding polynucleotide comprises at least about 4 to about 50 DBEs, wherein each DBE independently comprises a nucleotide sequence set forth in Table 8. In some embodiments, the DNA binding polynucleotide comprises at least about 4 to about 50 DBEs, wherein each DBE independently comprises a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to a sequence set forth in SEQ ID NOs: 317-319, 335, and 336.
- the DNA binding polynucleotide comprises at least about 4 to about 50 DBEs, wherein each DBE independently comprises a sequence selected from SEQ ID NOs: 317-319, 335, and 336.
- the polynucleotide comprises one or more FR repeat sequences.
- the FR repeat sequence comprises an OriP DBE described herein operably linked to an FR spacer.
- the FR spacer comprises a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 61.
- the FR spacer comprises a nucleotide sequence comprising SEQ ID NO: 61. In some embodiments, the FR spacer comprises a nucleotide sequence comprising one or more mismatches (e.g., 1, 2, 3, 4, or 5 mismatches) relative to SEQ ID NO: 61. In some embodiments, the FR spacer comprises a nucleotide sequence set forth in Table 9.
- the polynucleotide comprises one or more FR repeat sequences, wherein the one or more FR repeat sequences each comprises a nucleotide sequence comprising from 5′ to 3′: (i) a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to a nucleotide sequence set forth in Table 8; and (ii) a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to a nucleotide sequence set forth in Table 9, wherein (i) and (ii) are operably linked.
- the one or more FR repeat sequences each comprises a nucleotide sequence comprising from 5′ to 3′: (i) a nucleotide sequence set forth in Table 8; and (ii) a nucleotide sequence set forth in Table 9, wherein (i) and (ii) are operably linked. In some embodiments, (i) and (ii) are directly linked. In some embodiments, the one or more FR repeat sequences each comprises TAGCATATGCTACCCAGATATAGATTAGGA (SEQ ID NO: 68).
- the one or more FR repeat sequences each comprises a nucleotide sequence comprising one or more mismatches (e.g., 1, 2, 3, 4, or 5 mismatches) relative to SEQ ID NO: 68. In some embodiments, the one or more FR repeat sequences each comprises a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 68.
- the polynucleotide comprises a sequence represented by the formula: 5′-(C-D)n-3′, wherein (i) C comprises a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to any one nucleotide sequence set forth in Table 8; (ii) D comprises a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to any one nucleotide sequence set forth in Table 9; and (iii) n is an integer of 1 to 50.
- C comprises a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 51
- D comprises a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to any one nucleotide sequence set forth in Table 9, and n is 1-50.
- C comprises a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 51
- D comprises a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 61
- n is 1-50.
- C comprises a nucleotide sequence having one or more mismatches (e.g., 1, 2, 3, 4, or 5 mismatches) relative to SEQ ID NO: 51
- D comprises a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to any one nucleotide sequence set forth in Table 9, and n is 1-50.
- C comprises a nucleotide sequence having one or more mismatches (e.g., 1, 2, 3, 4, or 5 mismatches) relative to SEQ ID NO: 51
- D comprises a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 61
- n is 1-50.
- C comprises SEQ ID NO: 51
- D comprises a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to any one nucleotide sequence set forth in Table 2, and n is 1-50.
- C comprises SEQ ID NO: 51
- D comprises SEQ ID NO: 61
- n is 1-50.
- n is 1-40.
- n is 1-30.
- n is 1-25 or 25, 24, 23, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1.
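The 5′-(C-D)n-3′ formula can be sketched as simple string concatenation. The sketch assumes C and D are directly linked with no spacer, and the C and D strings below are placeholders invented for illustration; they are not SEQ ID NO: 51 or SEQ ID NO: 61, which are not reproduced in this section.

```python
def build_repeat_array(c: str, d: str, n: int) -> str:
    """Return 5'-(C-D)n-3' as a single string, with C and D directly linked.

    n is restricted to 1-50, the broadest range recited in the text.
    """
    if not 1 <= n <= 50:
        raise ValueError("n must be an integer from 1 to 50")
    return (c + d) * n


# Hypothetical placeholder sequences, for illustration only.
C_PLACEHOLDER = "ACGTACGTACGTAC"
D_PLACEHOLDER = "TTGACCAATTGGCA"

array = build_repeat_array(C_PLACEHOLDER, D_PLACEHOLDER, n=3)
print(len(array))  # 3 * (len(C) + len(D))
```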
- the disclosure provides a recombinant expression vector comprising a polynucleotide comprising at least one transgene and a nucleotide sequence comprising an FR region or portion thereof.
- the FR region corresponds to nucleotides about 107 to about 731 of SEQ ID NO: 2.
- the FR region comprises SEQ ID NO: 69.
- the polynucleotide comprises at least one transgene and a nucleotide sequence comprising an FR region or portion thereof, wherein the nucleotide sequence has at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to about 50 to about 600 contiguous nucleotides present in the nucleotide sequence corresponding to nucleotides about 107 to about 731 of SEQ ID NO: 2.
- the polynucleotide comprises at least one transgene and a nucleotide sequence comprising an FR region or portion thereof, wherein the nucleotide sequence has at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to about 50 to about 600 contiguous nucleotides present in the nucleotide sequence of SEQ ID NO: 69.
- the nucleotide sequence has at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to about 50, about 60, about 70, about 80, about 90, about 100, about 150, about 200, about 250, about 300, about 350, about 400, about 450, about 500, or about 600 contiguous nucleotides present in the nucleotide sequence corresponding to nucleotides about 107 to about 731 of SEQ ID NO: 2.
- the nucleotide sequence has at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to about 50, about 60, about 70, about 80, about 90, about 100, about 150, about 200, about 250, about 300, about 350, about 400, about 450, about 500, or about 600 contiguous nucleotides present in the nucleotide sequence of SEQ ID NO: 69.
- the nucleotide sequence has at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to the nucleotide sequence corresponding to nucleotides about 107 to about 731 of SEQ ID NO: 2. In some embodiments, the nucleotide sequence has 100% identity to the nucleotide sequence corresponding to nucleotides about 107 to about 731 of SEQ ID NO: 2. In some embodiments, the nucleotide sequence has at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to the nucleotide sequence of SEQ ID NO: 69.
- the nucleotide sequence has 100% identity to the nucleotide sequence of SEQ ID NO: 69.
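Identity "to about 50 to about 600 contiguous nucleotides present in" a region can be read as comparing a candidate against every contiguous window of the region. The following is a minimal sketch under that reading, assuming an ungapped comparison and a window equal in length to the candidate; the function name and the synthetic region are hypothetical, and a real analysis would more likely use an alignment tool.

```python
def best_window_identity(candidate: str, region: str) -> float:
    """Highest ungapped percent identity between the candidate and any
    contiguous window of the region having the candidate's length."""
    k = len(candidate)
    if k == 0 or k > len(region):
        raise ValueError("candidate must be non-empty and no longer than the region")
    best = 0.0
    for i in range(len(region) - k + 1):
        window = region[i:i + k]
        matches = sum(a == b for a, b in zip(candidate, window))
        best = max(best, 100.0 * matches / k)
    return best


# Synthetic 300-nt stand-in for a region such as SEQ ID NO: 69 (not the real sequence).
region = "".join("ACGT"[(i * i + i // 3) % 4] for i in range(300))
exact = region[10:60]  # 50 contiguous nucleotides taken from the region
print(best_window_identity(exact, region))
```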
- the disclosure provides a recombinant expression vector comprising a polynucleotide comprising at least one transgene and a nucleotide sequence comprising a DS region or portion thereof.
- the DS region corresponds to nucleotides about 1707 to about 1821 of SEQ ID NO: 2.
- the DS region comprises SEQ ID NO: 70.
- the polynucleotide comprises at least one transgene and a nucleotide sequence comprising a DS region or portion thereof, wherein the nucleotide sequence has at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to about 50 to about 115 contiguous nucleotides present in the nucleotide sequence corresponding to nucleotides about 1707 to about 1821 of SEQ ID NO: 2.
- the polynucleotide comprises at least one transgene and a nucleotide sequence comprising a DS region or portion thereof, wherein the nucleotide sequence has at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to about 50 to about 115 contiguous nucleotides present in the nucleotide sequence of SEQ ID NO: 70.
- the nucleotide sequence has at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to about 50, about 60, about 70, about 80, about 90, about 100, or about 110 contiguous nucleotides present in the nucleotide sequence corresponding to nucleotides about 1707 to about 1821 of SEQ ID NO: 2.
- the nucleotide sequence has at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to about 50, about 60, about 70, about 80, about 90, about 100, or about 110 contiguous nucleotides present in the nucleotide sequence of SEQ ID NO: 70. In some embodiments, the nucleotide sequence has at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to the nucleotide sequence corresponding to nucleotides about 1707 to about 1821 of SEQ ID NO: 2.
- the nucleotide sequence has 100% identity to the nucleotide sequence corresponding to nucleotides about 1707 to about 1821 of SEQ ID NO: 2. In some embodiments, the nucleotide sequence has at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to the nucleotide sequence of SEQ ID NO: 70. In some embodiments, the nucleotide sequence has 100% identity to the nucleotide sequence of SEQ ID NO: 70.
- the disclosure provides a recombinant expression vector comprising a polynucleotide comprising at least one transgene and a nucleotide sequence comprising (i) a nucleotide sequence comprising an FR region or a portion thereof described herein, and (ii) a nucleotide sequence comprising a DS region or portion thereof described herein, wherein (i) and (ii) are operably linked.
- (i) and (ii) are directly linked.
- (i) and (ii) are linked by a nucleotide spacer sequence.
- (i) is upstream of (ii).
- (ii) is upstream of (i).
- the disclosure provides a recombinant expression vector comprising a polynucleotide comprising at least one transgene and a nucleotide sequence comprising a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to about 50 to about 1700 contiguous nucleotides present in the nucleotide sequence corresponding to nucleotides about 107 to about 1821 of SEQ ID NO: 2.
- the nucleotide sequence has at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to about 50, about 60, about 70, about 80, about 90, about 100, about 200, about 300, about 400, about 500, about 600, about 700, about 800, about 900, about 1000, about 1100, about 1200, about 1300, about 1400, about 1500, about 1600, or about 1700 contiguous nucleotides present in the nucleotide sequence corresponding to nucleotides about 107 to about 1821 of SEQ ID NO: 2.
- the nucleotide sequence has at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to the nucleotide sequence corresponding to nucleotides about 107 to about 1821 of SEQ ID NO: 2. In some embodiments, the nucleotide sequence comprises the nucleotide sequence corresponding to nucleotides about 107 to about 1821 of SEQ ID NO: 2.
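The coordinate ranges recited above can be checked against the stated contiguous-nucleotide limits with a short sketch. It assumes the positions are 1-based and inclusive and treats the "about" values as exact; both are assumptions for illustration. The helper names and the synthetic sequence standing in for SEQ ID NO: 2 are hypothetical.

```python
# Coordinates as recited in the text (assumed 1-based, inclusive).
REGIONS = {
    "FR": (107, 731),
    "DS": (1707, 1821),
    "OriP": (107, 1821),
}


def region_length(start: int, end: int) -> int:
    """Length of a 1-based inclusive interval."""
    return end - start + 1


def extract_region(sequence: str, start: int, end: int) -> str:
    """Slice a 1-based inclusive interval out of a sequence string."""
    if start < 1 or end > len(sequence) or start > end:
        raise ValueError("coordinates fall outside the sequence")
    return sequence[start - 1:end]


for name, (start, end) in REGIONS.items():
    print(name, region_length(start, end))

# Synthetic 2000-nt stand-in for SEQ ID NO: 2 (not the real sequence).
placeholder = "ACGT" * 500
fr = extract_region(placeholder, *REGIONS["FR"])
```

Under these assumptions the FR spans 625 nucleotides, the DS spans 115, and the full 107-1821 interval spans 1715, which is consistent with the upper bounds of about 600, about 115, and about 1700 contiguous nucleotides used in the text.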
- the disclosure provides a recombinant expression vector comprising a polynucleotide comprising at least one transgene and a nucleotide sequence comprising a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to about 50 to about 1700 contiguous nucleotides present in the nucleotide sequence of SEQ ID NO: 71.
- the nucleotide sequence has at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to about 50, about 60, about 70, about 80, about 90, about 100, about 200, about 300, about 400, about 500, about 600, about 700, about 800, about 900, about 1000, about 1100, about 1200, about 1300, about 1400, about 1500, about 1600, or about 1700 contiguous nucleotides present in the nucleotide sequence of SEQ ID NO: 71.
- the nucleotide sequence has at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to the nucleotide sequence of SEQ ID NO: 71. In some embodiments, the nucleotide sequence comprises the nucleotide sequence of SEQ ID NO: 71. [00477] In some embodiments, the disclosure provides a recombinant expression vector comprising a polynucleotide comprising at least one transgene and a fragment of an EBV OriP.
- a “fragment of an EBV OriP” refers to a truncated wild-type EBV OriP (e.g., a wild-type EBV OriP comprising a truncation at the 5′ end, the 3′ end, and/or an internal region).
- the fragment of an EBV OriP comprises a truncated sequence shorter than wild-type EBV OriP (e.g., wild-type EBV OriP having a sequence corresponding to nucleotides 107-1821 of SEQ ID NO: 2).
- the truncated sequence comprises a deletion at the 5′ end of wild-type EBV OriP (e.g., wild-type EBV OriP having a sequence corresponding to nucleotides 107-1821 of SEQ ID NO: 2). In some embodiments, the truncated sequence comprises a deletion at the 3′ end of wild-type EBV OriP (e.g., wild-type EBV OriP having a sequence corresponding to nucleotides 107-1821 of SEQ ID NO: 2).
- the truncated sequence comprises a deletion at an internal region of wild-type EBV OriP (e.g., wild-type EBV OriP having a sequence corresponding to nucleotides 107-1821 of SEQ ID NO: 2).
- the truncated sequence comprises a deletion of the DS (e.g., the DS having a sequence corresponding to nucleotides 1707-1821 of SEQ ID NO: 2) or a portion thereof.
- the truncated sequence comprises a deletion of the FR (e.g., the FR having a sequence corresponding to nucleotides 107-731 of SEQ ID NO: 2) or a portion thereof.
- the truncated sequence comprises a deletion of the DS (e.g., the DS having a sequence corresponding to nucleotides 1707-1821 of SEQ ID NO: 2) and a deletion of a portion of the FR (e.g., a portion of the FR having a sequence corresponding to nucleotides 107-731 of SEQ ID NO: 2).
- the polynucleotide is about 0.1 kb, about 0.2 kb, about 0.3 kb, about 0.4 kb, about 0.5 kb, about 0.6 kb, about 0.7 kb, about 0.8 kb, about 0.9 kb, about 1 kb, about 1.2 kb, about 1.3 kb, about 1.4 kb, about 1.5 kb, about 1.6 kb, about 1.7 kb, about 1.8 kb, about 1.9 kb, or about 2 kb in length.
- Transgene [00479] In some embodiments, the disclosure provides a recombinant expression vector encoding a single transgene.
- the recombinant expression vector encodes more than one transgene.
- the transgene comprises a nucleotide sequence that encodes genetic information for the synthesis of an RNA or polypeptide.
- the transgene encodes a polypeptide or a non-coding RNA (ncRNA).
- expression of the transgene initiates and/or regulates a biological response in the host cell.
- the product of the transgene functions to inhibit expression of one or more genes.
- the product of the transgene functions to regulate cell signaling.
- the product of the transgene functions to regulate cell-cell interactions.
- the product of the transgene functions to regulate a metabolic process. In some embodiments, the product of the transgene functions to regulate cell death. In some embodiments, the product of the transgene functions to regulate cell motility. In some embodiments, the transgene encodes a transcriptional and/or translational product for replacement of a defective gene in a host cell. [00482] In some embodiments, the transgene encodes an antigen, ribozyme, enzyme, peptide, structural protein, structural RNA, shRNA, siRNA, miRNA, gRNA/sgRNA (guide RNA/single guide RNA for CRISPR-based gene editing), a transcription factor, or a signaling molecule.
- the transgene encodes a nucleic acid having a specific function, such as binding a target molecule or catalyzing a specific reaction.
- the transgene encodes a nucleic acid that acts as an effector, inhibitor, modulator, and stimulator of a specific activity possessed by a target molecule, or a de novo activity.
- the nucleic acid interacts with a macromolecule, e.g., a DNA, RNA, polypeptides, or carbohydrate chains.
- the nucleic acid interacts with the mRNA or the genomic DNA encoding a target polypeptide.
- the nucleic acid interacts with the target polypeptide itself. In some embodiments, the nucleic acid interacts with a target nucleic acid based on sequence homology. In some embodiments, the specific recognition between the nucleic acid molecule and the target nucleic acid is based on the formation of tertiary structure that allows specific recognition to take place. In some embodiments, the transgene encodes a nucleic acid for reducing expression or function of a target protein. [00484] In some embodiments, the nucleic acid is selected from a ribosomal RNA, a transfer RNA, an immunostimulatory RNA, a guide RNA, and a small RNA.
- the nucleic acid is an antisense molecule, siRNA, miRNA, aptamers, ribozymes, triplex forming molecules, RNAi, or guide RNA.
- the transgene sequence encodes a rRNA.
- the transgene sequence encodes a siRNA.
- the transgene sequence encodes a shRNA.
- the transgene sequence encodes an miRNA.
- the transgene sequence encodes a tRNA.
- the small RNA is selected from an antisense oligonucleotide, a small interfering RNA, a short hairpin RNA, a microRNA, a small nucleolar RNA, and a small nuclear RNA.
- the transgene encodes an antisense oligonucleotide.
- Antisense oligonucleotides are designed to interact with a target nucleic acid molecule through either canonical or non-canonical base pairing. The interaction of the antisense oligonucleotide and the target molecule is designed to promote the destruction of the target molecule through, for example, RNAse H mediated RNA-DNA hybrid degradation.
- the antisense oligonucleotide is designed to interrupt a processing function that normally would take place on the target molecule, such as transcription or replication.
- the transgene encodes an aptamer.
- Aptamers are molecules that interact with a target molecule.
- aptamers are small nucleic acids ranging from 15-50 bases in length that fold into defined secondary and tertiary structures, such as stem-loops or G-quartets. Aptamers can bind small molecules as well as large molecules.
- the transgene encodes a ribozyme.
- Ribozymes are nucleic acid molecules that are capable of catalyzing a chemical reaction, either intra-molecularly or inter-molecularly.
- the transgene encodes a ribozyme found in natural systems, such as a hammerhead ribozyme.
- the transgene encodes a ribozyme that is not found in a natural system, but which has been engineered to catalyze specific reactions de novo.
- the ribozyme cleaves RNA or DNA substrates. Ribozymes typically cleave nucleic acid substrates through recognition and binding of the target substrate with subsequent cleavage.
- the transgene encodes a triplex forming oligonucleotide molecule.
- Triplex forming functional nucleic acid molecules are molecules that can interact with either double-stranded or single-stranded nucleic acid.
- when triplex molecules interact with a target region, a structure called a triplex is formed in which there are three strands of DNA forming a complex dependent on both Watson-Crick and Hoogsteen base-pairing.
- the transgene encodes an external guide sequence (EGS).
- RNAse P aids in processing transfer RNA (tRNA) within a cell.
- Bacterial RNAse P can be recruited to cleave virtually any RNA sequence by using an EGS that causes the target RNA:EGS complex to mimic the natural tRNA substrate.
- EGS/RNAse P-directed cleavage of RNA can be utilized to cleave desired targets within eukaryotic cells.
- the transgene encodes a nucleic acid capable of inducing gene silencing through RNA interference (siRNA).
- Expression of a target gene can be effectively silenced in a highly specific manner through RNA interference.
- An RNA polynucleotide with interference activity of a given gene will down-regulate the gene by causing degradation of the specific messenger RNA (mRNA) with the corresponding complementary sequence and preventing the production of protein (see Sledz and Williams, Blood, 106(3):787-794 (2005)).
- When an RNA molecule forms complementary Watson-Crick base pairs with an mRNA, it induces mRNA cleavage by accessory proteins.
- the source of the RNA can be viral infection, transcription, or introduction from exogenous sources.
- double stranded RNA (dsRNA) is processed into double stranded small interfering RNAs (siRNAs) 21-23 nucleotides in length that contain 2 nucleotide overhangs on the 3' ends.
- a siRNA triggers the specific degradation of homologous RNA molecules, such as mRNAs, within the region of sequence identity between both the siRNA and the target RNA.
- Sequence specific gene silencing can be achieved in mammalian cells using synthetic, short double-stranded RNAs that mimic the siRNAs produced by the enzyme dicer (Elbashir, et al, Nature, 411:494-498 (2001)) (Ui-Tei, et al, FEBS Lett, 479:79-82 (2000)).
- siRNA can be chemically or in vitro-synthesized or can be the result of short double-stranded hairpin-like RNAs (shRNAs) that are processed into siRNAs inside the cell.
- WO 02/44321 describes siRNAs capable of sequence-specific degradation of target mRNAs when base-paired with 3' overhanging ends, and is herein specifically incorporated by reference for the method of making these siRNAs.
- Synthetic siRNAs are generally designed using algorithms and a conventional DNA/RNA synthesizer.
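The siRNA geometry described above (strands of 21-23 nucleotides with 2-nucleotide 3' overhangs) can be sketched as follows. This is an illustrative simplification rather than a design algorithm from the disclosure: the function names, the 19-21 nucleotide duplex core, the "UU" overhang, and the example target are all assumptions, and no target-site selection rules are applied.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement_rna(rna: str) -> str:
    """Reverse complement of an RNA string written 5' to 3'."""
    return rna.translate(COMPLEMENT)[::-1]


def design_sirna(target_core: str, overhang: str = "UU") -> tuple[str, str]:
    """Return (sense, antisense) strands, each written 5' to 3'.

    target_core is the mRNA stretch (19-21 nt) that forms the duplex;
    each strand carries a 2-nt 3' overhang, giving strands of 21-23 nt.
    """
    core = target_core.upper().replace("T", "U")
    if not 19 <= len(core) <= 21 or set(core) - set("ACGU"):
        raise ValueError("core must be 19-21 nt of A, C, G, U")
    if len(overhang) != 2:
        raise ValueError("overhang must be 2 nt")
    sense = core + overhang
    antisense = reverse_complement_rna(core) + overhang
    return sense, antisense


# Hypothetical 19-nt target core, for illustration only.
sense, antisense = design_sirna("GCAUGCAUGCAUGCAUGCA")
print(sense)
print(antisense)
```

The antisense strand, minus its overhang, is the reverse complement of the target core, so it can base-pair with the corresponding mRNA region.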
- the transgene encodes one or more siRNAs.
- the transgene encodes an shRNA or miRNA.
- the transgene encodes a polypeptide.
- the polypeptide is selected from an intracellular polypeptide, a secreted polypeptide, a membrane-bound polypeptide, and a transmembrane polypeptide.
- the polypeptide is selected from a hormone, an antibiotic, an enzyme, a signaling protein, a structural protein, an antibody or antigen binding portion thereof, or a receptor.
- the transgene encodes an antibody.
- the transgene encodes a fragment of an antibody, e.g., one that retains antigen binding capabilities.
- the transgene encodes a light chain of an antibody.
- the transgene encodes a heavy chain of an antibody.
- the transgene encodes a VH.
- the transgene encodes a VL.
- the transgene encodes a VH.
- the transgene encodes a Fab. In some embodiments, the transgene encodes a scFv. [00496] In some embodiments, the transgene encodes an immunomodulatory polypeptide or RNA. In some embodiments, the immunomodulatory polypeptide or RNA functions to stimulate an immune response. In some embodiments, the immunomodulatory polypeptide or RNA functions to dampen an immune response. In some embodiments, the immunomodulatory polypeptide or RNA functions to enhance the antitumor activity of lymphocytes. In some embodiments, the immunomodulatory polypeptide or RNA functions to enhance the immunogenicity of tumor cells.
- the transgene encodes an immunomodulatory polypeptide for inducing or modulating an immune response (e.g., a CD8+ T cell response; a vaccine response).
- the transgene encodes an immunomodulatory polypeptide selected from a cytokine, a chemokine, an immune cell activator, a multispecific immune cell engager, an antibody or antigen-binding fragment, or a TME modulator.
- the immunomodulatory polypeptide is a multispecific immune cell engager.
- the multispecific immune cell engager is a bispecific T cell engager.
- the immunomodulatory polypeptide is an antibody or antigen binding fragment thereof.
- the antibody or antigen binding fragment thereof is a tumor-targeting antibody or antigen binding fragment thereof.
- the antibody or antigen binding fragment thereof is an immune checkpoint inhibitor. T cell activation and effector functions are balanced by co-stimulatory and inhibitory signals, referred to as "immune checkpoints." Inhibitory ligands and receptors that regulate T cell effector functions are overexpressed on tumor cells. Subsequently, agonists of co-stimulatory receptors or antagonists of inhibitory signals result in the amplification of antigen-specific T cell responses.
- Immune checkpoint inhibitors enhance endogenous anti-tumor activity by blocking immune checkpoints.
- the immune checkpoint inhibitor is an antagonist of inhibitory signals, e.g., an antibody or antigen binding fragment that targets, for example, PD-1, PD-L1, CTLA-4, LAG3, B7-H3, B7-H4, or TIM3. These ligands and receptors are reviewed in Pardoll, D., Nat. Rev. Cancer 12:252-264, 2012. [00502]
- the antibody or antigen binding fragment is immunostimulatory.
- the antibody or antigen binding fragment is selected from anti-CD28, anti-CD3, or single chain/antibody fragments of these molecules. [00503]
- the immunomodulatory polypeptide is a cytokine.
- Cytokines are a class of small proteins (e.g., about 5 to about 20 kDa) that are released by cells and affect the behavior of cells via cell signaling. Cytokines are produced by various cell types, including, without limitation, immune cells such as macrophages, B lymphocytes, T lymphocytes, and mast cells, and endothelial cells, fibroblasts, and a variety of stromal cells. A cytokine may be produced by more than one cell type.
- Cytokines include, without limitation, chemokines, interferons, interleukins, lymphokines, and tumour necrosis factor.
- the cytokine comprises IL-12, IL-15, IL-15Ra, IL-18, IL-2, CCL5, CXCL9, CXCL10, GM-CSF, IFN-gamma, IFN-alpha, FLT3-ligand, TNF-alpha, CD40L, or fragments or variants thereof.
- the transgene encodes a vaccine antigen.
- An antigen can include any protein or peptide that is foreign to the subject organism.
- Preferred antigens can be presented at the surface of antigen presenting cells (APC) of a subject for surveillance by immune effector cells, such as leucocytes expressing the CD4 receptor (CD4 T cells) and Natural Killer (NK) cells.
- the antigen is of viral, bacterial, protozoan, fungal, or animal origin.
- the antigen is a cancer antigen.
- Cancer antigens can be antigens expressed only on tumor cells and/or required for tumor cell survival. Certain antigens are recognized by those skilled in the art as immuno-stimulatory (i.e., stimulate effective immune recognition) and provide effective immunity to the organism or molecule from which they derive.
- the antigen is a viral antigen.
- the antigen is a bacterial antigen. In some embodiments, the antigen is an allergen or environmental antigen. In some embodiments, the antigen is a tumor antigen.
- the transgene encodes one or more reprogramming factors or transdifferentiation factors.
- by reprogramming factors it is meant factors, e.g. proteins, RNAs, etc., for example, Oct3/4, Sox2, Klf4, c-Myc, Nanog, Lin-28, miR302/367, that reprogram somatic cells to become induced pluripotent stem cells (iPS cells), e.g. human iPS cells.
- transdifferentiation factors it is meant factors, e.g.
- the transgene encodes a polypeptide or RNA that directs the development of stem or progenitor cells into desired cell fates.
- the transgene encodes a protein of a genome editing system (for example, an RNA-guided nuclease such as a Cas9 protein, a zinc finger nuclease or a TALEN).
- a recombinant expression vector of the disclosure comprises a nucleic acid sequence in a form suitable for expression of at least one transgene in a host cell.
- the recombinant expression vector comprises one or more regulatory sequences, selected on the basis of the host cells to be used for expression, which are operably linked to the at least one transgene to be expressed.
- "operably linked" is intended to mean that the at least one transgene is linked to the regulatory sequence(s) in a manner which allows for expression of the at least one transgene.
- the term "regulatory sequence” is intended to include promoters, enhancers and other expression control elements (e.g., polyadenylation signals). Such regulatory sequences are described, for example, in Goeddel; Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. (1990). It will be appreciated by those skilled in the art that the design of the expression vector can depend on such factors as the choice of the host cell to be transfected, the level of expression of protein desired, and the like.
- the recombinant expression vector comprises a replicon and control sequences that are derived from species compatible with the host cells and are used in connection with these hosts.
- Expression vectors for use in mammalian cells ordinarily include an origin of replication, a promoter located in front of the gene to be expressed, along with any necessary ribosome binding sites, RNA splice sites, polyadenylation site, and transcriptional terminator sequences.
- a recombinant expression vector of the disclosure comprises a promoter.
- the term "promoter" refers to an untranslated sequence located upstream (5') to the start codon of a transgene (e.g., within about 100 to 1000 bp) that modulates the transcription and translation of the transgene to which it is operably linked.
- the promoter is operably linked to the at least one transgene.
- transcription of the transgene is initiated and regulated by the promoter to which it is operably linked.
- an expression cassette comprising a promoter operably linked to a transgene will express the transgene if the RNA synthesis initiates at the promoter.
- the promoters and corresponding protein or polypeptide expression may be ubiquitous, wherein transcription is active in a wide range of cells, tissues and species or cell-type specific, tissue-specific, or species specific. [00510] Such promoters typically fall into two classes, inducible and constitutive.
- Inducible promoters are promoters that initiate increased levels of transcription from DNA under their control in response to some change in the extracellular environment or culture conditions, e.g., the presence or absence of a nutrient, drug, change in temperature, or change in expression of a protein in a cell, e.g. the tetracycline-inducible promoters. Constitutive, or ubiquitously acting, promoters are always active, e.g. the CMV-β-actin promoter/enhancer. In some embodiments, the promoter is constitutive. A “constitutive promoter” refers to one that is continually active. In some embodiments, the promoter is inducible.
- the term "tissue-specific promoter" refers to a promoter that has activity in a target cell type and/or target tissue. Tissue-specific promoters are known in the art, see, e.g., Zheng, et al (2009) Methods Mol Biol 434:205.
- a recombinant expression vector of the disclosure comprising a transgene operably linked to a tissue-specific promoter has reduced expression in a non-target tissue as compared to the target tissue or as compared to expression in the non-target tissue of a recombinant expression vector comprising the transgene operably linked to a control promoter.
- the promoter is a cancer-specific promoter.
- a “cancer-specific promoter” refers to a promoter that has activity in cancer cells as compared to non-cancer cells, but without specificity to cancer of a particular tissue.
- a recombinant expression vector of the disclosure comprising a transgene operably linked to a cancer-specific promoter has increased expression in cancer cells as compared to non-cancer cells.
- cancer-specific promoters include, but are not limited to, promoters of genes selected from hTERT, EGFR, Her2/Neu, VEGFR, folate receptor, CD71, tumor resistance antigen 1-60, cyclooxygenase, cytokeratin 18, cytokeratin 19, and survivin (see, e.g., Montano-Samaniego, et al (2020) Front Oncol 10:605380).
- the promoter is a tumor-specific promoter.
- a “tumor-specific promoter” refers to a promoter that has activity in cancer cells as compared to non-cancer cells, but with specificity for certain types of cancer cells.
- Exemplary tumor-specific promoters are known in the art and include, but are not limited to, promoters of genes selected from alpha-fetoprotein, thyroid transcription factor 1 (TTF-1), glypican-3 protein (GPC3), human secretory leukocyte protease inhibitor (hSLPI), ERBB2, Mucin 1 (MUC1), L-plastin, α-lactalbumin (LALBA), cyclooxygenase 2 (COX2), epithelial glycoprotein (EPG2), A33, uPAR, carcinoembryonic antigen (CEA), breast cancer 1 (BRCA1) and BRCA2 (see, e.g., Montano-Samaniego, et al (2020) Front Oncol 10:605380).
- the promoter is an alpha-fetoprotein promoter.
- Transcription by higher eukaryotes of transgenes in expression cassettes may be increased by inserting an enhancer sequence into the vector.
- Enhancers are cis-acting elements of DNA, usually from about 10 to 300 bp, which act on a promoter to increase its transcription. Enhancers are relatively orientation- and position-independent, having been found 5' and 3' to the transcription unit, within an intron, as well as within the coding sequence itself.
- the recombinant expression vector comprises an enhancer. In some embodiments, the enhancer is contiguous with the promoter sequence.
- the enhancer and the promoter are operably linked by a nucleotide spacer. Enhancer sequences influence promoter-dependent gene expression.
- the enhancer is a human CMV enhancer.
- the enhancer is a mouse CMV enhancer.
- Any promoter region or promoter sequence is suitable for use in the recombinant expression vectors of the disclosure, so long as the promoter region promotes expression of the at least one transgene.
- the promoter is a CAG promoter.
- the promoter is a CMV promoter.
- the promoter is a human EF1alpha promoter.
- Expression cassettes may also contain sequences necessary for the termination of transcription and for stabilizing the mRNA. Such sequences are commonly available from the 5' and, occasionally 3', untranslated regions of eukaryotic or viral DNAs or cDNAs. These regions contain nucleotide segments transcribed as polyadenylated fragments in the untranslated portion of the mRNA encoding the transgene of interest.
- the recombinant expression vector comprises a polynucleotide comprising a polyadenylation sequence/polyA signal. In some embodiments, the polyadenylation sequence/polyA signal is any one described herein or known in the art.
- the polyadenylation sequence is a bovine growth hormone polyA signal.
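The ordering of cassette elements described above (a promoter located in front of the transgene, followed by a polyadenylation signal, with an optional enhancer) can be represented as a small data structure. This is a hedged sketch only: the class, the function, and the element labels are hypothetical, the transgene name is a placeholder, and the check covers just the 5′ to 3′ order of promoter, transgene, and polyA signal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    kind: str   # e.g. "enhancer", "promoter", "transgene", "polyA"
    name: str   # descriptive label only; no sequence is stored


def validate_cassette(elements: list[Element]) -> bool:
    """Check that a promoter precedes the transgene and a polyA signal follows it."""
    kinds = [e.kind for e in elements]
    for required in ("promoter", "transgene", "polyA"):
        if required not in kinds:
            raise ValueError(f"cassette is missing a {required}")
    if not kinds.index("promoter") < kinds.index("transgene") < kinds.index("polyA"):
        raise ValueError("expected order: promoter, transgene, polyA (5' to 3')")
    return True


# Element choices taken from options named in the text; the transgene is a placeholder.
cassette = [
    Element("enhancer", "human CMV enhancer"),
    Element("promoter", "CAG promoter"),
    Element("transgene", "hypothetical transgene"),
    Element("polyA", "bovine growth hormone polyA signal"),
]
print(validate_cassette(cassette))
```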
- the recombinant expression vector comprises an intronic sequence operably linked to the same promoter that will mediate the expression of the transgene.
- the intronic sequence is located in any configuration relative to the transgene.
- the intronic sequence may be located upstream, or 5', of the transgene, i.e. between the promoter and the initiation codon for the transgene.
- the intronic sequence may be located within the transgene, i.e. flanked by two exons of the transgene.
- the intronic sequence may be located downstream of the transgene, e.g.
- nucleic acid refers to both RNA and DNA, including mRNA, cDNA, genomic DNA, synthetic (e.g., chemically synthesized) DNA, and nucleic acid analogs. Methods of making nucleic acids are known in the art.
- Isolated nucleic acids can be chemically synthesized, either as a single nucleic acid molecule (e.g., using automated DNA synthesis in the 3′ to 5′ direction using phosphoramidite technology) or as a series of oligonucleotides.
- One or more pairs of long oligonucleotides e.g., >100 nucleotides
- synthetic and/or chemical synthesis methods or a combination thereof.
- Enzymatic (IVT), solid-phase, liquid-phase, combined synthetic methods, small region synthesis, and ligation methods are utilized.
- one or more nucleic acids (e.g., mRNA and/or recombinant expression vectors) of the disclosure are synthesized by enzymatic methods (e.g., in vitro transcription, IVT).
- one or more nucleic acids (e.g., mRNA and/or recombinant expression vectors) of the disclosure are made using IVT enzymatic synthesis methods. Methods of making polynucleotides by IVT are known in the art and are described in International Application PCT/US2013/30062.
- RNAs of greater length are produced from two or more molecules that are ligated together.
- nucleic acid modifications such as those described herein, are introduced during or after chemical synthesis and/or enzymatic generation of the nucleic acids, e.g., modifications that enhance stability, reduce the likelihood or degree of innate immune response, and/or enhance other attributes, as described in the art.
- non-natural modified nucleobases are introduced into a nucleic acid (e.g., mRNA and/or recombinant expression vectors) of the disclosure, during synthesis or post-synthesis.
- modifications are on internucleoside linkages, purine or pyrimidine bases, or sugar.
- the modification is introduced at the terminus of a polynucleotide, by chemical synthesis or with a polymerase enzyme. Examples of modified nucleic acids and their synthesis are disclosed in PCT application No. PCT/US2012/058519.
- a recombinant expression vector described herein is expressed in a host cell using standard techniques of molecular biology. Standard recombinant DNA and molecular cloning techniques used herein are well known in the art and are described in Sambrook, J., Fritsch, E.F. and Maniatis, T., Molecular Cloning: A Laboratory Manual, 2nd ed.; Cold Spring Harbor Laboratory: Cold Spring Harbor, N.Y., (1989) and by Silhavy, T.J., Bennan, M.L.
- the method is used to demonstrate a binding interaction between a DNA-binding protein described herein and a recombinant expression vector described herein comprising a polynucleotide comprising one or more DBEs for binding the DNA binding protein.
- the method is used to measure transfection efficiency (e.g., in vitro or in vivo transfection efficiency) of a system described herein comprising a DNA binding protein described herein, or a nucleic acid or a recombinant expression vector encoding the DNA binding protein, and a recombinant expression vector comprising a DNA binding polynucleotide described herein that binds to the DNA binding protein, e.g., as compared to the recombinant expression vector alone.
- the method is used to measure transfection efficiency (e.g., in vitro or in vivo transfection efficiency) of a system described herein comprising an mRNA encoding a DNA binding protein and a recombinant expression vector comprising a polynucleotide comprising one or more DBEs for binding the DNA binding protein, e.g., as compared to the recombinant expression vector alone.
- Characterization of Binding Interactions
- the disclosure provides an mRNA encoding a DNA binding protein, wherein the DNA binding protein has a binding affinity to a target.
- the DNA binding protein comprises a chromatin binding domain and a DBD described herein (e.g., a DBD of an EBNA1 homolog described herein), wherein the chromatin binding domain binds to a component of the nuclear matrix (e.g., chromatin, a nuclear matrix protein, genomic DNA, histone, nucleosome, or combination thereof) and the DBD binds to a DNA binding polynucleotide described herein.
- the DNA binding protein comprises one or more chromatin binding domains and at least one EBNA1 DBD, wherein the one or more chromatin binding domains bind to one or more components of the nuclear matrix (e.g., chromatin, a nuclear matrix protein, genomic DNA, histone, nucleosome, or combination thereof) and the EBNA1 DBD binds to a polynucleotide comprising one or more EBV DBEs described herein.
- binding interactions between the chromatin binding domain and the DBD described herein and their respective target are determined using any method of structural characterization known in the art.
- binding interactions between the one or more chromatin binding domains and the at least one EBNA1 DBD of a DNA binding protein described herein and their respective targets are determined using any method of structural characterization known in the art.
- a representation, or model, of the three dimensional structure of a multi-component complex structure, for which a crystal has been produced can be determined using techniques which include molecular replacement or SIR/MIR (single/multiple isomorphous replacement) (see, e.g., Brunger (1997), Meth.
- AMoRe/Mosflm Navaza (1994), Acta Cryst. A50: 157-163; CCP4 (1994), Acta Cryst. D50: 760-763
- XPLOR see, Brunger et al. (1992), X-PLOR Version 3.1.
- a method of measuring binding affinity is used to characterize binding between a DNA binding protein described herein and a DNA binding polynucleotide. In some embodiments, a method of measuring binding affinity is used to characterize binding between a DNA binding protein described herein and a polynucleotide comprising at least one EBV DBE described herein.
- a number of well-characterized assays are available for determining the binding affinity, usually expressed as dissociation constant, for DNA-binding proteins and the cognate DNA sequences to which they bind.
- assays usually require the preparation of purified protein and binding site (usually a synthetic oligonucleotide) of known concentration and specific activity. Examples include electrophoretic mobility-shift assays, DNaseI protection or "footprinting", and filter-binding. These assays can also be used to estimate the association and dissociation rate constants. These values may be determined with greater precision using a BIAcore instrument.
- the synthetic oligonucleotide is bound to the assay "chip," and purified DNA-binding protein is passed through the flow-cell. Binding of the protein to the DNA immobilized on the chip is measured as an increase in refractive index.
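The relationship between the dissociation constant and the rate constants mentioned above can be sketched as follows. This is a minimal illustration assuming a simple one-site binding model in which free protein concentration is approximately equal to total protein concentration (i.e., the DNA probe is present at a concentration well below the dissociation constant). The function names and the numeric values are hypothetical and are not taken from the disclosure.

```python
def fraction_bound(protein_conc_nM: float, kd_nM: float) -> float:
    """Fraction of DNA probe bound at equilibrium for a one-site model.

    Assumes free protein ~ total protein (probe concentration << Kd).
    """
    if protein_conc_nM < 0 or kd_nM <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return protein_conc_nM / (kd_nM + protein_conc_nM)


def kd_from_rates(k_off_per_s: float, k_on_per_M_per_s: float) -> float:
    """Dissociation constant (in M) from dissociation and association rate constants."""
    if k_on_per_M_per_s <= 0:
        raise ValueError("association rate constant must be > 0")
    return k_off_per_s / k_on_per_M_per_s


if __name__ == "__main__":
    # Hypothetical titration: when the protein concentration equals Kd,
    # half of the probe is bound.
    for conc in (1.0, 10.0, 100.0):
        print(conc, round(fraction_bound(conc, kd_nM=10.0), 3))
```

Under this model, the protein concentration at which half of the probe is shifted in a mobility-shift titration gives an estimate of the dissociation constant.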
- a chromatin binding measurement is used to characterize binding between the one or more chromatin binding domains of a DNA binding protein described herein and one or more components of the nuclear matrix.
- the chromatin binding assay is performed by introducing to a cell: (i) a DNA binding protein described herein or mRNA encoding the DNA binding protein, wherein the DNA binding protein is operably linked to a fluorescent reporter protein described herein (e.g., GFP), and (ii) a nuclear DNA stain (e.g., DAPI, Hoechst DNA stain); and measuring co-localization of the DNA binding protein with the nuclear stain using fluorescence microscopy, wherein increased co-localization of signal from the fluorescent reporter protein with signal from the nuclear stain indicates increased chromatin binding (see, e.g., method described in Schneider, et al (2013) PNAS 110:9487).
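One way to quantify the co-localization described above is a Pearson correlation between per-pixel intensities of the two channels. The disclosure does not specify a metric, so the choice of Pearson correlation, the function name, and the example intensities below are assumptions made purely for illustration.

```python
from math import sqrt
from typing import Sequence


def pearson_colocalization(reporter: Sequence[float], stain: Sequence[float]) -> float:
    """Pearson correlation between reporter and nuclear-stain pixel intensities.

    Both sequences hold intensities for the same pixels in the same order.
    Values near +1 indicate strong co-localization.
    """
    if len(reporter) != len(stain) or len(reporter) < 2:
        raise ValueError("channels must have equal length of at least 2 pixels")
    n = len(reporter)
    mean_r = sum(reporter) / n
    mean_s = sum(stain) / n
    cov = sum((r - mean_r) * (s - mean_s) for r, s in zip(reporter, stain))
    var_r = sum((r - mean_r) ** 2 for r in reporter)
    var_s = sum((s - mean_s) ** 2 for s in stain)
    if var_r == 0 or var_s == 0:
        raise ValueError("a channel with constant intensity has no defined correlation")
    return cov / sqrt(var_r * var_s)


if __name__ == "__main__":
    # Hypothetical flattened pixel intensities for a GFP channel and a DAPI channel.
    gfp = [10.0, 50.0, 200.0, 180.0, 15.0]
    dapi = [12.0, 60.0, 220.0, 170.0, 10.0]
    print(round(pearson_colocalization(gfp, dapi), 3))
```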
- a chromatin binding measurement is performed using a chromatin immunoprecipitation (ChIP) assay, wherein the DNA binding protein or mRNA encoding the DNA binding protein is introduced to a cell culture, the cell culture is cross-linked (e.g., by treatment with formaldehyde), chromatin is digested to generate sheared complexes (e.g., by shearing mechanically or enzymatically), the sheared complexes are immunoprecipitated using an antibody or antigen binding fragment specific to the DNA binding protein or an epitope tag fused to the DNA binding protein, and the immunoprecipitated DNA complexes are characterized using, for example, real time PCR, sequencing or microarray hybridization.
- a system of the disclosure comprising a DNA binding protein described herein, or a nucleic acid or a recombinant expression vector encoding the DNA binding protein, and a recombinant expression vector described herein comprising a transgene and a DNA binding polynucleotide is characterized by measuring transfection efficiency in vitro or in vivo.
- a system of the disclosure comprising an mRNA described herein and a recombinant expression vector described herein is characterized by measuring transfection efficiency in vitro or in vivo.
- the system is introduced to a cell or a population of cells in vitro.
- Cells according to the present disclosure include any cell into which foreign nucleic acids can be introduced and expressed as described herein. It is to be understood that the basic concepts of the present disclosure described herein are not limited by cell type.
- Cells according to the present disclosure include eukaryotic cells, prokaryotic cells, animal cells, plant cells, insect cells, fungal cells, archaeal cells, eubacterial cells, a virion, a virosome, a virus-like particle, a parasitic microbe, an infectious protein and the like.
- Cells include eukaryotic cells such as yeast cells, plant cells, and animal cells. Other suitable cells are known to those skilled in the art.
- the cell is a proliferating or dividing cell.
- the cell is a non-dividing or non-proliferating cell.
- a cell proliferation marker assay comprises detection of cell proliferation using an antibody or antigen binding fragment thereof that binds an antigen present in actively dividing cells, such as an antibody or antigen binding fragment that binds Ki-67.
- a DNA synthesis assay is performed by measuring incorporation of 3H-thymidine in cells using a scintillation counter or by measuring incorporation of bromodeoxyuridine (BrdU) into cells using an anti-BrdU antibody and detection by immunohistochemistry, intracellular ELISA, or flow cytometry.
- the system is introduced to the cell or population of cells by any conventional transfection procedure.
- “transformation” and “transfection” are intended to refer to a variety of art-recognized techniques for introducing foreign nucleic acid into a host cell, including calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated transfection, lipofection (e.g., using commercially available reagents such as, for example, LIPOFECTIN® (Invitrogen Corp., San Diego, CA), LIPOFECTAMINE® (Invitrogen), FUGENE® (Roche Applied Science, Basel, Switzerland), JETPEI™ (Polyplus-transfection Inc., New York, NY), EFFECTENE® (Qiagen, Valencia, CA), DREAMFECT™ (OZ Biosciences, France) and the like), or electroporation (e.g., in vivo electroporation).
- Suitable methods for transforming or transfecting host cells can be found in Sambrook, et al. (Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989), and other laboratory manuals.
- the system is introduced by transfection, transduction, injection, microinjection, gene gun, nucleofection, nanoparticle bombardment, transformation, conjugation, by application of the nucleic acid in a gel, oil, or cream, by electroporation, using lipid-based transfection reagents, or by any other suitable transfection method.
- the transfection is performed over an appropriate time period to enable expression of the system components.
- expression of a transgene is used to determine the efficiency of transfection.
- the transgene encodes a detectable marker. Examples of detectable markers include various radioactive moieties, enzymes, prosthetic groups, fluorescent markers, luminescent markers, bioluminescent markers, metal particles, protein-protein binding pairs, protein-antibody binding pairs and the like. Detectable markers are commercially available from a variety of sources.
- the detectable marker is a reporter protein, e.g., a fluorescent protein, a bioluminescent protein, or an enzyme.
- detectable fluorescent proteins include, but are not limited to, yellow fluorescent protein (YFP), green fluorescent protein (GFP), cyan fluorescent protein (CFP), umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride, phycoerythrin and the like.
- detectable bioluminescent proteins include, but are not limited to, luciferase (e.g., bacterial, firefly, click beetle and the like), luciferin, aequorin and the like.
- detectable enzyme systems include, but are not limited to, galactosidases, glucuronidases, phosphatases, peroxidases, cholinesterases and the like.
- Biotin, or a derivative thereof may also be used as a detectable label, and subsequently bound by a detectably labeled avidin/streptavidin derivative (e.g. phycoerythrin-conjugated streptavidin), or a detectably labeled anti-biotin antibody.
- Digoxigenin may be expressed and subsequently bound by a detectably labeled anti-digoxigenin antibody (e.g. fluoresceinated anti-digoxigenin).
- any member of a conjugate pair may be incorporated into a detection oligonucleotide provided that a detectably labeled conjugate partner can be bound to permit detection.
- the term antibody refers to an antibody molecule of any class, or any sub-fragment thereof, such as an Fab.
- suitable labels for detection include one or more protein tags.
- protein tag refers to a heterologous polypeptide sequence linked to a polymerase of the invention.
- Protein tags include, but are not limited to, Avi tag (GLNDIFEAQKIEWHE) (SEQ ID NO: 72), calmodulin tag (KRRWKKNFIAVSAANRFKKISSSGAL) (SEQ ID NO: 73), FLAG tag (DYKDDDDK) (SEQ ID NO: 74), HA tag (YPYDVPDYA) (SEQ ID NO: 75), His tag (HHHHHH) (SEQ ID NO: 76), Myc tag (EQKLISEEDL) (SEQ ID NO: 77), S tag (KETAAAKFERQHMDS) (SEQ ID NO: 78), SBP tag (MDEKTTGWRGGHVVEGLAGELEQLRARLEHHPQGQREP) (SEQ ID NO: 79), Softag 1 (SLAELLNAGLGGS) (SEQ ID NO: 80), Softag 3 (TQDPSRVG) (SEQ ID NO: 81), V5 tag (GKPIPNPLLGLDST) (SEQ ID NO: 82),
- the detectable marker is detected using a microscope, a spectrophotometer, a tube luminometer or plate luminometer, x-ray film, magnetic fields, a scintillator, a fluorescence activated cell sorting (FACS) apparatus, a microfluidics apparatus, a bead-based apparatus or the like.
- the detectable marker is detected by flow cytometry.
- the detectable marker is detected by spectroscopy.
- detection of the reporter protein is used to determine a level of expression of the transgene.
- expression of the transgene is compared between a test cell culture contacted with a system described herein and a control cell culture contacted with a recombinant expression vector only.
- the amount of protein produced may be measured by an enzyme-linked immunosorbent assay (ELISA).
- the amount of protein produced may be measured by Western blot analysis.
- the amount of protein produced may be measured by immunostaining.
- the amount of protein produced may be measured by time-resolved Förster Resonance Energy Transfer (TR-FRET).
- the amount of protein produced may be measured by immunohistochemistry (IHC).
- the level of expression is measured by more than one of these or other methods.
- expression of the transgene when introduced to a cell or population of cells using a system described herein is increased compared to its introduction using a recombinant expression vector alone. In some embodiments, expression of the transgene when introduced to a cell or population of cells using a system described herein is increased by at least about 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, or 100-fold compared to expression of the transgene introduced using the recombinant expression vector alone.
- expression of the transgene when introduced to a cell or population of cells using a system described herein is increased by at least about 40-fold, about 45-fold, about 50-fold, about 55-fold, about 60-fold, about 65-fold, about 70-fold, about 75-fold, about 80-fold, about 85-fold, about 90-fold, about 95-fold, about 100-fold, about 110-fold, about 120-fold, about 130-fold, about 140-fold, about 150-fold, about 160-fold, about 170-fold, about 180-fold, about 190-fold, or about 200-fold compared to expression of the transgene introduced using the recombinant expression vector alone.
- expression of the transgene when introduced to a cell or population of cells using a system described herein is increased by at least about 2-fold, about 3-fold, about 4-fold, about 5-fold, about 6-fold, about 7-fold, about 8-fold, about 9-fold, or about 10-fold as compared to introducing the system to the cell or population of cells in the presence of a mitosis inhibitor (e.g., aphidicolin).
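The fold-increase comparisons above can be sketched as a simple ratio of reporter signal in the test culture to reporter signal in the vector-only control. The background subtraction step, the function names, and the example readings are assumptions for illustration; the disclosure does not prescribe a particular calculation.

```python
def fold_increase(system_signal: float, vector_only_signal: float,
                  background: float = 0.0) -> float:
    """Ratio of background-corrected reporter signal: system vs. vector alone."""
    test = system_signal - background
    control = vector_only_signal - background
    if control <= 0:
        raise ValueError("control signal must exceed background")
    return test / control


def meets_threshold(fold: float, threshold: float) -> bool:
    """True if the observed fold increase is at least the stated threshold."""
    return fold >= threshold


if __name__ == "__main__":
    # Hypothetical luminometer readings in relative light units.
    fold = fold_increase(system_signal=5000.0, vector_only_signal=100.0)
    print(fold, meets_threshold(fold, 40.0))
```

The same ratio applies to the comparison against cells treated with a mitosis inhibitor, with the inhibitor-treated culture taking the place of the vector-only control.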
- an mRNA of the disclosure encoding a DNA binding protein comprising an EBNA1 DNA binding domain has reduced immunogenicity compared to an mRNA encoding a wild-type EBNA1 protein.
- a DNA binding protein of the disclosure comprising an EBNA1 DNA binding domain has reduced immunogenicity compared to a wild-type EBNA1 protein.
- a DNA binding protein comprising an EBNA1 homolog, or a fragment or a variant thereof has reduced immunogenicity compared to an EBNA1 protein.
- a DNA binding protein of the disclosure comprising a DBD of an EBNA1 homolog described herein has reduced immunogenicity compared to a DNA binding protein comprising an EBNA1 DBD.
- a recombinant expression vector of the disclosure comprising a polynucleotide comprising one or more DBEs of an EBV OriP has reduced immunogenicity as compared to a recombinant expression vector comprising an EBV OriP.
- a polynucleotide comprising one or more DBEs of an EBV OriP has reduced immunogenicity as compared to a polynucleotide comprising an EBV OriP.
- the method comprises measuring immunogenicity of a system described herein or component thereof, or a delivery system described herein or component thereof, using an in vitro assay.
- the in vitro assay comprises an HLA binding assay.
- the method comprises identifying one or more MHC Type I (MHC I) epitopes in a DNA binding protein described herein using a computational approach, generating the peptide sequence(s) of the one or more MHC I epitopes, and assessing the peptide sequence(s) for binding to one or more HLA allotypes (see, e.g., Steere, et al (2006) J Exp Med 203:961). Computational approaches to predict MHC I epitopes are known in the art. See Schaap-Johnson, et al (2021) Front Immunol 12:712488.
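The peptide-generation step above can be sketched as a sliding window over the protein sequence. This only enumerates candidate peptides; it does not score them, which is the job of the prediction tools referenced in the text. The window lengths of 8 to 11 residues reflect the typical size of MHC I ligands, and the function name and the placeholder sequence are hypothetical.

```python
from typing import List, Sequence, Tuple


def candidate_peptides(protein_sequence: str,
                       lengths: Sequence[int] = (8, 9, 10, 11)) -> List[Tuple[int, str]]:
    """Enumerate all overlapping peptides of the given lengths.

    Returns (1-based start position, peptide) pairs, grouped by length.
    """
    seq = protein_sequence.strip().upper()
    peptides = []
    for k in lengths:
        for start in range(len(seq) - k + 1):
            peptides.append((start + 1, seq[start:start + k]))
    return peptides


if __name__ == "__main__":
    # Placeholder sequence for illustration only; not a sequence from the disclosure.
    for position, peptide in candidate_peptides("ACDEFGHIKL", lengths=(9,)):
        print(position, peptide)
```

Each enumerated peptide would then be passed to a predictor or tested in an HLA binding assay, and the number of binders compared between the DNA binding protein and wild-type EBNA1.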
- a DNA binding protein described herein comprising an EBNA1 DBD has fewer predicted MHC I epitopes than a wild-type EBNA1 polypeptide. In some embodiments, a DNA binding protein described herein comprising an EBNA1 DBD has fewer MHC I epitopes that bind an HLA allotype as compared to a wild-type EBNA1 polypeptide as measured using an HLA binding assay.
- In some embodiments, the in vitro assay comprises a cell-based immunogenicity assay.
- the method comprises contacting a cell population (e.g., primary hepatocytes, primary splenocytes, immortalized cancer cell lines, or PBMCs) with a system described herein or component thereof, or a delivery system described herein or component thereof, co-culturing the contacted population with HLA-matched PBMCs or isolated T cells (e.g., pan-T cells or memory T cells), and measuring one or more markers of T cell activation (e.g., cytokine production, expansion).
- the in vitro assay comprises a PBMC assay.
- the method comprises isolating a population of PBMCs from a donor, contacting the population with a system described herein or component thereof, or a delivery system described herein or component thereof, co-culturing the contacted population with HLA-matched T cells, and measuring one or more markers of T cell activation (e.g., cytokine production, expansion).
- a population of PBMCs contacted with a system described herein results in lower T cell activation as compared to a control population of PBMCs.
- a population of PBMCs contacted with an mRNA encoding a DNA binding protein described herein results in lower T cell activation as compared to a control population of PBMCs contacted with an mRNA encoding a wild-type EBNA1.
- a population of PBMCs contacted with a DNA binding protein described herein results in lower T cell activation as compared to a control population of PBMCs contacted with a wild-type EBNA1.
- a population of PBMCs contacted with a recombinant expression vector described herein results in lower T cell activation as compared to a control population of PBMCs contacted with a recombinant expression vector comprising an EBV OriP.
- a population of PBMCs contacted with a recombinant expression vector described herein results in lower T cell activation as compared to a control population of PBMCs contacted with a recombinant expression vector lacking a transgene.
- the in vitro assay comprises a MAPPs assay.
- the method comprises contacting a population of antigen presenting cells with a DNA binding protein described herein or an mRNA encoding a DNA binding protein and determining the identity of antigens presented by HLA molecules present on the population using, e.g., LC/MS.
- a DNA binding protein described herein has fewer antigenic peptides as compared to a wild-type EBNA1.
- Delivery Vehicles
- the disclosure provides nanoparticle compositions (e.g., lipid nanoparticles (LNPs), polymeric nanoparticles) comprising a DNA binding protein, a nucleic acid, an mRNA and/or a recombinant expression vector described herein.
- the disclosure provides nanoparticle compositions (e.g., lipid nanoparticles (LNPs), polymeric nanoparticles) comprising an mRNA and/or a recombinant expression vector described herein.
- the mRNA and the recombinant expression vector are formulated together in the same nanoparticle.
- the mRNA and the recombinant expression vector are formulated in separate nanoparticles.
- a nanoparticle composition comprises a lipid.
- Lipid nanoparticles include, but are not limited to, liposomes and micelles.
- Nanoparticles are ultrafine particles typically ranging from 1 to 100 nanometers (nm), and up to 500 nm, in size with a surrounding interfacial layer and often exhibiting a size-related or size-dependent property. Nanoparticle compositions are myriad and encompass lipid nanoparticles (LNPs), liposomes (e.g., lipid vesicles), and lipoplexes.
- a nanoparticle composition can be a liposome having a lipid bilayer with a diameter of 500 nm or less.
- nanoparticle compositions are vesicles including one or more lipid bilayers.
- a nanoparticle composition includes two or more concentric bilayers separated by aqueous compartments.
- Lipid bilayers can be functionalized and/or crosslinked to one another.
- Lipid bilayers can include one or more ligands, proteins, or channels.
- the system or system components (e.g., mRNA and/or a recombinant expression vector described herein) are formulated in lipid nanoparticles having a diameter from about 10 to about 100 nm.
- the nanoparticles have a diameter from about 10 to 500 nm.
- the nanoparticle has a diameter greater than 100 nm.
- the largest dimension of a nanoparticle composition is 1 µm or shorter (e.g., 1 µm, 900 nm, 800 nm, 700 nm, 600 nm, 500 nm, 400 nm, 300 nm, 200 nm, 175 nm, 150 nm, 125 nm, 100 nm, 75 nm, 50 nm, or shorter).
- a nanoparticle composition can be relatively homogenous.
- a polydispersity index can be used to indicate the homogeneity of a nanoparticle composition, e.g., the particle size distribution of the nanoparticle composition.
- a small (e.g., less than 0.3) polydispersity index generally indicates a narrow particle size distribution.
- a nanoparticle composition can have a polydispersity index from about 0 to about 0.25, such as 0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.10, 0.11, 0.12, 0.13, 0.14, 0.15, 0.16, 0.17, 0.18, 0.19, 0.20, 0.21, 0.22, 0.23, 0.24, or 0.25.
- the polydispersity index of a nanoparticle composition disclosed herein can be from about 0.10 to about 0.20.
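As a rough illustration of how a polydispersity index relates to the particle size distribution, the sketch below uses the common approximation PDI = (standard deviation / mean diameter)^2. This is an assumption: instruments based on dynamic light scattering typically derive the index from a cumulants fit rather than from individual particle diameters, and the disclosure does not state which calculation is used. The function name and example diameters are hypothetical.

```python
from statistics import mean, pstdev
from typing import Sequence


def polydispersity_index(diameters_nm: Sequence[float]) -> float:
    """Approximate PDI as (population standard deviation / mean diameter) squared."""
    if len(diameters_nm) < 2:
        raise ValueError("at least two particle diameters are required")
    average = mean(diameters_nm)
    if average <= 0:
        raise ValueError("mean diameter must be positive")
    return (pstdev(diameters_nm) / average) ** 2


if __name__ == "__main__":
    # Hypothetical particle diameters in nm.
    narrow = [95.0, 100.0, 105.0, 98.0, 102.0]
    broad = [40.0, 100.0, 180.0, 70.0, 150.0]
    print(round(polydispersity_index(narrow), 4))
    print(round(polydispersity_index(broad), 4))
```

A tightly clustered population gives a value near 0, while a broad population gives a larger value, consistent with the statement that a small index indicates a narrow size distribution.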
- the zeta potential of a nanoparticle composition can be used to indicate the electrokinetic potential of the composition.
- the zeta potential can describe the surface charge of a nanoparticle composition.
- Nanoparticle compositions with relatively low charges, positive or negative, are generally desirable, as more highly charged species can interact undesirably with cells, tissues, and other elements in the body.
- the zeta potential of a nanoparticle composition disclosed herein can be from about -10 mV to about +20 mV.
- the term “encapsulation efficiency” of a polynucleotide describes the amount of the polynucleotide that is encapsulated by or otherwise associated with a nanoparticle composition after preparation, relative to the initial amount provided.
- encapsulation can refer to complete, substantial, or partial enclosure, confinement, surrounding, or encasement. Encapsulation efficiency is desirably high (e.g., close to 100%). The encapsulation efficiency can be measured, for example, by comparing the amount of the polynucleotide in a solution containing the nanoparticle composition before and after breaking up the nanoparticle composition with one or more organic solvents or detergents.
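The before-and-after comparison described above can be sketched as follows, assuming that the amount measured before disruption represents free (unencapsulated) polynucleotide and the amount measured after disruption represents total polynucleotide in the sample. Note that this yields efficiency relative to the total recovered in the sample rather than to the amount initially provided. The function name and example values are hypothetical.

```python
def encapsulation_efficiency(free_before_lysis: float, total_after_lysis: float) -> float:
    """Percent of polynucleotide encapsulated, from measurements before and after lysis.

    free_before_lysis: polynucleotide detected with intact particles (unencapsulated).
    total_after_lysis: polynucleotide detected after disrupting the particles.
    Both values must be in the same units.
    """
    if total_after_lysis <= 0:
        raise ValueError("total amount must be positive")
    if free_before_lysis < 0 or free_before_lysis > total_after_lysis:
        raise ValueError("free amount must be between 0 and the total amount")
    return 100.0 * (total_after_lysis - free_before_lysis) / total_after_lysis


if __name__ == "__main__":
    # Hypothetical measurements in ng/mL.
    print(encapsulation_efficiency(free_before_lysis=5.0, total_after_lysis=100.0))
```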
- the nanoparticle composition comprises an mRNA described herein.
- the nanoparticle composition comprises a recombinant expression vector described herein.
- the nanoparticle composition comprises an mRNA and a recombinant expression vector described herein.
- the disclosure provides lipid-based nanoparticle (LNP) compositions comprising: (a) one or more nucleic acid molecules (e.g., mRNA and/or recombinant expression vector) described herein; and (b) one or more lipid moieties selected from amino lipids, helper lipids, structural lipids, phospholipids, ionizable lipids, PEG lipids, lipoid, and cholesterol or cholesterol derivatives.
- the disclosure provides lipid-based nanoparticle (LNP) compositions comprising: (a) one or more nucleic acid molecules (e.g., mRNA and/or recombinant expression vector) described herein; and (b) one or more lipid moieties selected from the group consisting of ionizable lipids, amino lipids, anionic lipids, neutral lipids, amphipathic lipids, helper lipids, structural lipids, PEG lipids, and lipoids, and optionally (c) targeting moieties.
- a suitable delivery vehicle is a lipid nanoparticle (LNP).
- LNPs are typically characterized as microscopic vesicles having an interior aqueous space sequestered from an outer medium by a membrane of one or more bilayers.
- Bilayer membranes of liposomes are typically formed by amphiphilic molecules, such as lipids of synthetic or natural origin that comprise spatially separated hydrophilic and hydrophobic domains.
- Lipid Formulations
- (i) Ionizable lipids
- the LNP composition disclosed herein comprises one or more ionizable lipids.
- the term “ionizable lipid” has its ordinary meaning in the art and may refer to a lipid comprising one or more charged moieties.
- an ionizable lipid may be positively charged or negatively charged.
- the one or more ionizable lipids are selected from the group consisting of 3-(didodecylamino)-N1,N1,4-tridodecyl-1-piperazineethanamine (KL10), N1-[2-(didodecylamino)ethyl]-N1,N4,N4-tridodecyl-1,4-piperazinediethanamine (KL22), 14,25-ditridecyl-15,18,21,24-tetraaza-octatriacontane (KL25), 1,2-dilinoleyloxy-N,N-dimethylaminopropane (DLin-DMA), 2,2-dilinoleyl-4-dimethyla
- the ionizable lipid may be selected from, but not limited to, an ionizable lipid described in International Publication Nos. WO2013086354 and WO2013116126.
- the lipid nanoparticle may include one or more (e.g., 1, 2, 3, 4, 5, 6, 7, or 8) cationic and/or ionizable lipids.
- Such cationic and/or ionizable lipids include, but are not limited to, 3-(didodecylamino)-N1,N1,4-tridodecyl-1-piperazineethanamine (KL10), N1-[2-(didodecylamino)ethyl]-N1,N4,N4-tridodecyl-1,4-piperazinediethanamine (KL22), 14,25-ditridecyl-15,18,21,24-tetraaza-octatriacontane (KL25), 1,2-dilinoleyloxy-N,N-dimethylaminopropane (DLin-DMA), 2,2-dilinoleyl-4-dimethylaminomethyl-[1,3]-dioxolane (DLin-K-DMA), heptatriaconta-6,9,28,31-tetraen-19-yl 4-(di
- the LNP compositions disclosed herein comprise one or more amino lipids.
- “amino lipid” and “cationic lipid” are used interchangeably herein to include those lipids and salts thereof having one, two, three, or more fatty acid or fatty alkyl chains and a pH-titratable amino head group (e.g., an alkylamino or dialkylamino head group).
- the cationic lipid is typically protonated (i.e., positively charged) at a pH below the pKa of the cationic lipid and is substantially neutral at a pH above the pKa.
- the cationic lipids can also be termed titratable cationic lipids.
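The pH dependence of protonation described above follows the Henderson-Hasselbalch relationship, under which the protonated fraction of a titratable amine is 1 / (1 + 10^(pH - pKa)). The sketch below illustrates this; the function name and the pKa value of 6.4 are hypothetical and are not taken from the disclosure.

```python
def fraction_protonated(ph: float, pka: float) -> float:
    """Fraction of a titratable amine head group that is protonated at a given pH.

    Derived from the Henderson-Hasselbalch equation for a single ionizable group.
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


if __name__ == "__main__":
    # Hypothetical lipid with a pKa of 6.4, evaluated at an acidic pH,
    # at the pKa itself, and at physiological pH.
    for ph in (4.0, 6.4, 7.4):
        print(ph, round(fraction_protonated(ph, pka=6.4), 3))
```

Below the pKa the lipid is mostly protonated and positively charged, and above the pKa it is mostly neutral, matching the behavior stated in the text.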
- the one or more cationic lipids include: a protonatable tertiary amine (e.g., pH-titratable) head group; alkyl chains, wherein each alkyl chain independently has 0 to 3 (e.g., 0, 1, 2, or 3) double bonds; and ether, ester, or ketal linkages between the head group and alkyl chains.
- Such cationic lipids include, but are not limited to, DSDMA, DODMA, DOTMA, DLinDMA, DLenDMA, γ-DLenDMA, DLin-K-DMA, DLin-K-C2-DMA (also known as DLin-C2K-DMA, XTC2, and C2K), DLin-K-C3-DMA, DLin-K-C4-DMA, DLen-C2K-DMA, γ-DLen-C2-DMA, C12-200, cKK-E12, cKK-A12, cKK-O12, DLin-MC2-DMA (also known as MC2), and DLin-MC3-DMA (also known as MC3).
- Anionic Lipids suitable for use in lipid nanoparticles include, but are not limited to, phosphatidylglycerol, cardiolipin, diacylphosphatidylserine, diacylphosphatidic acid, N-dodecanoyl phosphatidylethanolamine, N-succinyl phosphatidylethanolamine, N-glutaryl phosphatidylethanolamine, lysylphosphatidylglycerol, and other anionic modifying groups joined to neutral lipids.
- Neutral Lipids suitable for use in lipid nanoparticles include, but are not limited to, diacylphosphatidylcholine, diacylphosphatidylethanolamine, ceramide, sphingomyelin, dihydrosphingomyelin, cephalin, sterols (e.g., cholesterol) and cerebrosides.
- the lipid nanoparticle comprises cholesterol.
- Lipids having a variety of acyl chain groups of varying chain length and degree of saturation are available or may be isolated or synthesized by well-known techniques.
- lipids having mixtures of saturated and unsaturated fatty acid chains and cyclic regions can be used.
- the neutral lipids used in the disclosure are DOPE, DSPC, DPPC, POPC, or any related phosphatidylcholine.
- the neutral lipid may be composed of sphingomyelin, dihydrosphingomyelin, or phospholipids with other head groups, such as serine and inositol.
- Amphipathic Lipids
- amphipathic lipids are included in nanoparticles.
- amphipathic lipids suitable for use in nanoparticles include, but are not limited to, sphingolipids, phospholipids, fatty acids, and amino lipids.
- the lipid composition of the pharmaceutical composition disclosed herein can comprise one or more phospholipids, for example, one or more saturated or (poly)unsaturated phospholipids or a combination thereof.
- phospholipids comprise a phospholipid moiety and one or more fatty acid moieties.
- a phospholipid moiety can be selected, for example, from the non-limiting group consisting of phosphatidyl choline, phosphatidyl ethanolamine, phosphatidyl glycerol, phosphatidyl serine, phosphatidic acid, 2-lysophosphatidyl choline, and a sphingomyelin.
- a fatty acid moiety can be selected, for example, from the non-limiting group consisting of lauric acid, myristic acid, myristoleic acid, palmitic acid, palmitoleic acid, stearic acid, oleic acid, linoleic acid, alpha-linolenic acid, erucic acid, phytanoic acid, arachidic acid, arachidonic acid, eicosapentaenoic acid, behenic acid, docosapentaenoic acid, and docosahexaenoic acid.
- Particular amphipathic lipids can facilitate fusion to a membrane.
- a cationic phospholipid can interact with one or more negatively charged phospholipids of a membrane (e.g., a cellular or intracellular membrane). Fusion of a phospholipid to a membrane can allow one or more elements (e.g., a therapeutic agent) of a lipid-containing composition (e.g., LNPs) to pass through the membrane permitting, e.g., delivery of the one or more elements to a target tissue.
- Non-natural amphipathic lipid species including natural species with modifications and substitutions including branching, oxidation, cyclization, and alkynes are also contemplated.
- a phospholipid can be functionalized with or cross-linked to one or more alkynes (e.g., an alkenyl group in which one or more double bonds is replaced with a triple bond).
- an alkyne group can undergo a copper-catalyzed cycloaddition upon exposure to an azide.
- Such reactions can be useful in functionalizing a lipid bilayer of a nanoparticle composition to facilitate membrane permeation or cellular recognition or in conjugating a nanoparticle composition to a useful component such as a targeting or imaging moiety (e.g., a dye).
- Phospholipids include, but are not limited to, glycerophospholipids such as phosphatidylcholines, phosphatidylethanolamines, phosphatidylserines, phosphatidylinositols, phosphatidylglycerols, and phosphatidic acids. Phospholipids also include phosphosphingolipids, such as sphingomyelin. In some embodiments, the LNP composition disclosed herein comprises one or more phospholipids.
- the phospholipid is selected from the group consisting of 1,2-dilinoleoyl-sn-glycero-3-phosphocholine (DLPC), 1,2-dimyristoyl-sn-glycero-phosphocholine (DMPC), 1,2-dioleoyl-sn-glycero-3-phosphocholine (DOPC), 1,2-dipalmitoyl-sn-glycero-3-phosphocholine (DPPC), 1,2-distearoyl-sn-glycero-3-phosphocholine (DSPC), 1,2-diundecanoyl-sn-glycero-phosphocholine (DUPC), 1-palmitoyl-2-oleoyl-sn-glycero-3-phosphocholine (POPC), 1,2-di-O-octadecenyl-sn-glycero-3-phosphocholine (18:0 Di
- Helper lipids
- the LNP composition disclosed herein comprises one or more helper lipids.
- helper lipid refers to lipids that enhance transfection (e.g., transfection of an LNP comprising an mRNA that encodes a DNA binding protein and/or a recombinant expression vector comprising a transgene and at least one DNA binding element described herein).
- there are no specific limitations concerning the helper lipids of the LNP compositions disclosed herein. Without being bound to any particular theory, it is believed that the mechanism by which the helper lipid enhances transfection includes enhancing particle stability. In some embodiments, the helper lipid enhances membrane fusogenicity.
- the helper lipid of the LNP compositions disclosed herein can be any helper lipid known in the art.
- helper lipids suitable for the compositions and methods include steroids, sterols, and alkyl resorcinols.
- Particular helper lipids suitable for use in the present disclosure include, but are not limited to, saturated phosphatidylcholine (PC) such as distearoyl-PC (DSPC) and dipalmitoyl-PC (DPPC), dioleoylphosphatidylethanolamine (DOPE), 1,2-dioleoyl-sn-glycero-3-phosphocholine (DOPC), 1,2-dilinoleoyl-sn-glycero-3-phosphocholine (DLPC), cholesterol, 5-heptadecylresorcinol, and cholesterol hemisuccinate.
- the helper lipid of the LNP composition includes cholesterol.
- Structural lipids
- the LNP composition disclosed herein comprises one or more structural lipids.
- structural lipid refers to sterols and also to lipids containing sterol moieties. Without being bound to any particular theory, it is believed that the incorporation of structural lipids into the LNPs mitigates aggregation of other lipids in the particle.
- Structural lipids can be selected from the group including but not limited to, cholesterol, fecosterol, sitosterol, ergosterol, campesterol, stigmasterol, brassicasterol, tomatidine, tomatine, ursolic acid, alpha-tocopherol, hopanoids, phytosterols, steroids, and mixtures thereof.
- the structural lipid is a sterol.
- “sterols” are a subgroup of steroids consisting of steroid alcohols.
- the structural lipid is a steroid.
- the structural lipid is cholesterol.
- the structural lipid is an analog of cholesterol.
- the lipid component of a lipid nanoparticle composition may include one or more molecules comprising polyethylene glycol, such as PEG or PEG-modified lipids.
- the LNP composition disclosed herein comprises one or more polyethylene glycol (PEG) lipids.
- PEG-lipid refers to polyethylene glycol (PEG)-modified lipids. Such lipids are also referred to as PEGylated lipids.
- PEG-lipids include PEG-modified phosphatidylethanolamine and phosphatidic acid, PEG-ceramide conjugates (e.g., PEG-CerC14 or PEG-CerC20), PEG-modified dialkylamines and PEG-modified 1,2-diacyloxypropan-3-amines.
- a PEG lipid can be PEG-c-DOMG, PEG-DMG, PEG-DLPE, PEG-DMPE, PEG-DPPC, or a PEG-DSPE lipid.
- the PEG-lipid includes, but is not limited to, 1,2-dimyristoyl-sn-glycerol methoxypolyethylene glycol (PEG-DMG), 1,2-distearoyl-sn-glycero-3-phosphoethanolamine-N-[amino(polyethylene glycol)] (PEG-DSPE), PEG-distearyl glycerol (PEG-DSG), PEG-dipalmitoleyl, PEG-dioleyl, PEG-distearyl, PEG-diacylglycamide (PEG-DAG), PEG-dipalmitoyl phosphatidylethanolamine (PEG-DPPE), or PEG-1,2-dimyristyloxypropyl-3-amine (PEG-c-DMA).
- the PEG-lipid is selected from the group consisting of a PEG-modified phosphatidylethanolamine, a PEG-modified phosphatidic acid, a PEG-modified ceramide, a PEG-modified dialkylamine, a PEG-modified diacylglycerol, a PEG-modified dialkylglycerol, and mixtures thereof.
- the lipid moiety of the PEG-lipids includes those having lengths of from about C14 to about C22, preferably from about C14 to about C20.
- a PEG moiety, for example an mPEG-NH2, has a size of about 1000, 2000, 5000, 10,000, 15,000 or 20,000 daltons.
- the PEG-lipid is PEG2k-DMG.
- the one or more PEG lipids of the LNP composition comprises PEG-DMPE.
- the one or more PEG lipids of the LNP composition comprises PEG-DMG.
- an LNP composition comprises one or more nucleic acid molecules described herein.
- the LNP composition comprises an mRNA described herein (e.g., an mRNA encoding a DNA binding protein).
- the LNP compositions comprise a recombinant expression vector described herein (e.g., a recombinant expression vector comprising a transgene and at least one DNA binding element).
- the LNP composition comprises an mRNA described herein and a recombinant expression vector described herein (e.g., an mRNA encoding a DNA binding protein and/or a recombinant expression vector comprising a transgene and at least one DNA binding element).
- the ratio between the lipid components and the nucleic acid molecules (e.g., mRNA and/or recombinant expression vector) of the LNP composition is sufficient for (i) formation of LNPs with desired characteristics, e.g., size, charge, and (ii) delivery of a sufficient dose of nucleic acid at a dose of the lipid component(s) that is tolerable for in vivo administration as readily ascertained by one of skill in the art.
- a nanoparticle (e.g., a lipid nanoparticle) has a targeting moiety that is specific to a cell type and/or tissue type.
- a nanoparticle may be targeted to a particular cell, tissue, and/or organ using a targeting moiety.
- a nanoparticle comprises a targeting moiety.
- targeting moieties include ligands, cell surface receptors, glycoproteins, vitamins (e.g., riboflavin) and antibodies (e.g., full-length antibodies, antibody fragments (e.g., Fv fragments, single chain Fv (scFv) fragments, Fab’ fragments, or F(ab’)2 fragments), single domain antibodies, camelid antibodies and fragments thereof, human antibodies and fragments thereof, monoclonal antibodies, and multispecific antibodies (e.g., bispecific antibodies)).
- the targeting moiety may be a polypeptide.
- the targeting moiety may include the entire polypeptide (e.g., peptide or protein) or fragments thereof.
- a targeting moiety is typically positioned on the outer surface of the nanoparticle in such a manner that the targeting moiety is available for interaction with the target, for example, a cell surface receptor.
- a variety of different targeting moieties and methods are known and available in the art, including those described, e.g., in Sapra et al., Prog. Lipid Res. 42(5):439-62, 2003 and Abra et al., J. Liposome Res. 12:1-3, 2002.
- a lipid nanoparticle may include a surface coating of hydrophilic polymer chains, such as polyethylene glycol (PEG) chains (see, e.g., Allen et al., Biochimica et Biophysica Acta 1237: 99-108, 1995; DeFrees et al., Journal of the American Chemical Society 118: 6101-6104, 1996; Blume et al., Biochimica et Biophysica Acta 1149: 180-184, 1993; Klibanov et al., Journal of Liposome Research 2: 321-334, 1992; U.S. Pat. No.
- a targeting moiety for targeting the lipid nanoparticle is linked to the polar head group of lipids forming the nanoparticle.
- the targeting moiety is attached to the distal ends of the PEG chains forming the hydrophilic polymer coating (see, e.g., Klibanov et al., Journal of Liposome Research 2: 321-334, 1992; Kirpotin et al., FEBS Letters 388: 115-118, 1996).
- Standard methods for coupling the targeting moiety or moieties may be used.
- phosphatidylethanolamine which can be activated for attachment of targeting moieties, or derivatized lipophilic compounds, such as lipid-derivatized bleomycin, can be used.
- Antibody-targeted liposomes can be constructed using, for instance, liposomes that incorporate protein A (see, e.g., Renneisen et al., J. Biol. Chem., 265:16337-16342, 1990 and Leonetti et al., Proc. Natl. Acad. Sci. (USA), 87:2448-2451, 1990).
- Other examples of antibody conjugation are disclosed in U.S. Pat. No. 6,027,726.
- targeting moieties can also include other polypeptides that are specific to cellular components, including antigens associated with neoplasms or tumors.
- Polypeptides used as targeting moieties can be attached to the liposomes via covalent bonds (see, for example, Heath, Covalent Attachment of Proteins to Liposomes, 149 Methods in Enzymology 111-119 (Academic Press, Inc. 1987)).
- Other targeting methods include the biotin-avidin system.
- a lipid nanoparticle includes a targeting moiety that targets the lipid nanoparticle to a cell including, but not limited to, hepatocytes, colon cells, epithelial cells, hematopoietic cells, endothelial cells, lung cells, bone cells, stem cells, mesenchymal cells, neural cells, cardiac cells, adipocytes, vascular smooth muscle cells, cardiomyocytes, skeletal muscle cells, beta cells, pituitary cells, synovial lining cells, ovarian cells, testicular cells, fibroblasts, B cells, T cells, reticulocytes, leukocytes, granulocytes, and tumor cells (including primary tumor cells and metastatic tumor cells).
- the targeting moiety targets the lipid nanoparticle to a hepatocyte.
- Lipidoids
- the lipid nanoparticles described herein may be lipidoid-based. The synthesis of lipidoids has been extensively described and formulations containing these compounds are particularly suited for delivery of polynucleotides (see Mahon et al., Bioconjug Chem. 2010 21:1448-1454; Schroeder et al., J Intern Med. 2010 267:9-21; Akinc et al., Nat. Biotechnol. 2008 26:561-569; Love et al., Proc Natl Acad Sci USA.
- complexes, micelles, liposomes or particles can be prepared containing these lipidoids and therefore, result in an effective delivery of a system or system components (e.g., mRNA and/or a recombinant expression vector) described herein, as determined by, for example, the expression and/or activity of the transgene encoded by a recombinant expression vector described herein, following the injection via localized and systemic routes of administration.
- Pharmaceutical compositions comprising lipidoid complexes can be administered by various means disclosed herein.
- lipidoid formulations for intramuscular or subcutaneous routes may vary significantly depending on the target cell type and the ability of formulations to diffuse through the extracellular matrix into the blood stream. While a particle size of less than 150 nm may be desired for effective hepatocyte delivery due to the size of the endothelial fenestrae (see e.g., Akinc et al., Mol Ther. 2009 17:872-879), use of lipidoid oligonucleotides to deliver the formulation to other cell types including, but not limited to, endothelial cells, myeloid cells, and muscle cells may not be similarly size-limited.
- lipidoid formulations may have a similar component molar ratio.
- Different ratios of lipidoids and other components including, but not limited to, a neutral lipid (e.g., diacylphosphatidylcholine), cholesterol, a PEGylated lipid (e.g., PEG-DMPE), and a fatty acid (e.g., an omega-3 fatty acid) may be used to optimize the formulation of the mRNA or system for delivery to different cell types including, but not limited to, hepatocytes, myeloid cells, muscle cells, etc.
- Exemplary lipidoids include, but are not limited to, DLin-DMA, DLin-K-DMA, DLin-KC2-DMA, 98N12-5, C12-200 (including variants and derivatives), DLin-MC3-DMA and analogs thereof.
- lipidoid formulations for the localized delivery of nucleic acids to cells may not require all of the formulation components which may be required for systemic delivery, and as such may comprise the lipidoid and the system or system components (e.g., mRNA and/or a recombinant expression vector) described herein.
- combinations of different lipidoids may be used to improve the efficacy of a system or system components (e.g., mRNA and/or a recombinant expression vector) described herein.
- a system or system components (e.g., mRNA and/or a recombinant expression vector) described herein may be formulated by mixing the system, or individual components of the system, with the lipidoid at a set ratio prior to addition to cells. In vivo formulations may require the addition of extra ingredients to facilitate circulation throughout the body. After formation of the particle, the system, or individual components of the system, are added and allowed to integrate with the complex.
- the encapsulation efficiency is determined using a standard dye exclusion assay.
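The disclosure does not give a formula for the dye exclusion readout. The following is a minimal Python sketch of one common way such an assay is reduced to a percentage, assuming a membrane-impermeant nucleic acid dye whose signal is read once on intact particles and once after the particles are disrupted. The function name, the two-reading design, and the example readings are assumptions for illustration only.

```python
def encapsulation_efficiency(signal_intact: float, signal_lysed: float) -> float:
    """Percent of nucleic acid that is inaccessible to the dye.

    signal_intact: background-corrected dye signal from intact particles
                   (only unencapsulated nucleic acid is accessible).
    signal_lysed:  background-corrected dye signal after particle disruption
                   (all nucleic acid is accessible).
    """
    if signal_lysed <= 0:
        raise ValueError("signal_lysed must be positive")
    if not 0 <= signal_intact <= signal_lysed:
        raise ValueError("signal_intact must lie between 0 and signal_lysed")
    return 100.0 * (1.0 - signal_intact / signal_lysed)


if __name__ == "__main__":
    # Hypothetical readings in arbitrary fluorescence units.
    print(encapsulation_efficiency(signal_intact=120.0, signal_lysed=1200.0))
```

Both readings are assumed to fall in the linear range of the dye; a standard curve would be needed otherwise.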
- In vivo delivery of the system or system components (e.g., mRNA and/or a recombinant expression vector) described herein may be affected by many parameters, including, but not limited to, the formulation composition, nature of particle PEGylation, degree of loading, oligonucleotide to lipid ratio, and biophysical parameters such as particle size (Akinc et al., Mol Ther. 2009 17:872-879; herein incorporated by reference in its entirety).
- Formulations with the different lipidoids including, but not limited to, penta[3-(1-laurylaminopropionyl)]-triethylenetetramine hydrochloride (TETA-5LAP; aka 98N12-5, see Murugaiah et al., Analytical Biochemistry, 401:61 (2010)), C12-200 (including derivatives and variants), MD1, DLin-DMA, DLin-K-DMA, DLin-KC2-DMA and DLin-MC3-DMA can be tested for in vivo activity.
- the lipidoid referred to herein as “98N12-5” is disclosed by Akinc et al., Mol Ther. 2009 17:872-879.
- the lipidoid referred to herein as “C12-200” is disclosed by Love et al., Proc Natl Acad Sci USA. 2010 107:1864-1869 and Liu and Huang, Molecular Therapy. 2010 669-670.
- the ability of a lipidoid-formulated system or system components (e.g., mRNA and/or a recombinant expression vector) described herein to alter expression of a transgene in vitro or in vivo can be determined by any technique known in the art or described herein.
- Other Components
- the nanoparticles disclosed herein can include one or more components in addition to those described above.
- the lipid composition can include one or more permeability enhancer molecules, carbohydrates, polymers, surface altering agents (e.g., surfactants), or other components.
- a permeability enhancer molecule can be a molecule described by U.S. Patent Application Publication No. 2005/0222064.
- Carbohydrates can include simple sugars (e.g., glucose) and polysaccharides (e.g., glycogen and derivatives and analogs thereof).
- LNPs of the present disclosure, in which a system or system components (e.g., mRNA and/or a recombinant expression vector) described herein is entrapped within the lipid portion of the particle and is protected from degradation, can be formed by any method known in the art including, but not limited to, a continuous mixing method, a direct dilution process, and an in-line dilution process. Additional techniques and methods suitable for the preparation of the LNPs described herein include coacervation, microemulsions, supercritical fluid technologies, and phase-inversion temperature (PIT) techniques.
- the LNPs of the present disclosure are produced via a continuous mixing method, e.g., a process that includes providing an aqueous solution of a system or system components (e.g., mRNA and/or a recombinant expression vector) described herein in a first reservoir, providing an organic lipid solution in a second reservoir (wherein the lipids present in the organic lipid solution are solubilized in an organic solvent, e.g., a lower alkanol such as ethanol), and mixing the aqueous solution with the organic lipid solution such that the organic lipid solution mixes with the aqueous solution so as to substantially instantaneously produce a lipid vesicle (e.g., liposome) encapsulating the system or system components (e.g., mRNA and/or a recombinant expression vector) described herein.
- the LNPs of the present disclosure are produced via a direct dilution process that includes forming a lipid vesicle (e.g., liposome) solution and immediately and directly introducing the lipid vesicle solution into a collection vessel containing a controlled amount of dilution buffer.
- the collection vessel includes one or more elements configured to stir the contents of the collection vessel to facilitate dilution.
- the amount of dilution buffer present in the collection vessel is substantially equal to the volume of lipid vesicle solution introduced thereto.
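The effect of an equal-volume dilution on the organic solvent content can be sketched with simple arithmetic. The following Python example assumes ideal, additive mixing volumes and a solvent-free dilution buffer. The function name and the starting solvent fraction of 0.40 are hypothetical and are not values stated in this disclosure.

```python
def diluted_solvent_fraction(
    initial_fraction: float, vesicle_volume: float, buffer_volume: float
) -> float:
    """Volume fraction of organic solvent after adding solvent-free buffer.

    Assumes volumes are additive and both volumes use the same unit.
    """
    if not 0.0 <= initial_fraction <= 1.0:
        raise ValueError("initial_fraction must be between 0 and 1")
    if vesicle_volume <= 0 or buffer_volume < 0:
        raise ValueError("volumes must be positive (buffer may be zero)")
    return initial_fraction * vesicle_volume / (vesicle_volume + buffer_volume)


if __name__ == "__main__":
    # Buffer volume substantially equal to the vesicle solution volume.
    print(diluted_solvent_fraction(0.40, vesicle_volume=10.0, buffer_volume=10.0))
```

With equal volumes the solvent fraction is halved, whatever the starting value.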
- the LNPs of the present disclosure are produced via an in-line dilution process in which a third reservoir containing dilution buffer is fluidly coupled to a second mixing region.
- the lipid vesicle (e.g., liposome) solution formed in a first mixing region is immediately and directly mixed with dilution buffer in the second mixing region.
- provided liposomes comprise one or more cholesterol-based lipids. In some embodiments, provided liposomes comprise one or more polyethylene glycol (PEG)-modified lipids. In some embodiments, liposomes may comprise one or more cationic lipids that have a net positive charge at a selected pH.
- the disclosure provides a system comprising (i) an mRNA described herein comprising an ORF encoding a DNA binding protein comprising one or more chromatin-binding domains operably linked to an EBNA1 DNA-binding domain, and (ii) a recombinant expression vector described herein comprising at least one transgene operably linked to a polynucleotide comprising one or more EBV DBEs.
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein comprising an EBNA1 polypeptide described herein, and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising a nucleotide sequence comprising one or more EBV DBEs described herein.
- the EBNA1 polypeptide comprises an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 1.
- the EBNA1 polypeptide comprises or consists of SEQ ID NO: 1.
- the ORF comprises a nucleotide sequence having about 70%, about 75%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 12. In some embodiments, the ORF comprises or consists of SEQ ID NO: 12.
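The identity thresholds recited above can be illustrated with a short calculation. The following Python sketch assumes the two sequences have already been aligned to equal length, with `-` marking gaps, and counts identity as matching positions divided by alignment length. This is only one of several conventions; the disclosure may define identity differently elsewhere, and the function name and toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over a pairwise alignment of equal-length strings.

    A column counts as identical only if both residues match and neither
    is a gap character ('-').
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment must not be empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    reference = "MKTAYIAKQR"  # toy sequence, not a SEQ ID NO from the disclosure
    variant = "MKTAYLAKQR"    # one substitution relative to the toy reference
    identity = percent_identity(reference, variant)
    print(identity, identity >= 80.0)
```

In practice the alignment itself would come from a dedicated alignment tool; this sketch covers only the final counting step.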
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein, wherein the DNA binding protein comprises (a) one or more chromatin binding domains each comprising an amino acid sequence selected from (I) an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to the amino acid sequence set forth in SEQ ID NO: 14; (II) an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to the amino acid sequence set forth in SEQ ID NO: 16; (III) an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to the amino acid sequence set forth in SEQ ID NO: 22; (IV) an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to the amino acid sequence set forth
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein, wherein the DNA binding protein comprises (a) one or more chromatin binding domains each comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to the amino acid sequence set forth in SEQ ID NO: 14; and (b) a DBD comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to an amino acid sequence set forth in SEQ ID NO: 18, wherein the one or more chromatin binding domains and the DBD are operably linked; and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising a nucleotide sequence comprising one or more EBV DBEs described herein.
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein, wherein the DNA binding protein comprises (a) one or more chromatin binding domains each comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to the amino acid sequence set forth in SEQ ID NO: 16; and (b) a DBD comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to an amino acid sequence set forth in SEQ ID NO: 18, wherein the one or more chromatin binding domains and the DBD are operably linked; and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising a nucleotide sequence comprising one or more EBV DBEs described herein.
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein, wherein the DNA binding protein comprises (a) one or more chromatin binding domains each comprising an amino acid sequence selected from (I) an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to the amino acid sequence set forth in SEQ ID NO: 14 and (II) an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to the amino acid sequence set forth in SEQ ID NO: 16; and (b) a DBD comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to an amino acid sequence set forth in SEQ ID NO: 18, wherein the one or more chromatin binding domains and the DBD are operably linked; and (ii) a recomb
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein, wherein the DNA binding protein comprises (a) one or more chromatin binding domains each comprising an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to the amino acid sequence set forth in SEQ ID NO: 22; and (b) a DBD comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to an amino acid sequence set forth in SEQ ID NO: 18, wherein the one or more chromatin binding domains and the DBD are operably linked; and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising a nucleotide sequence comprising one or more EBV DBEs described herein.
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein, wherein the DNA binding protein comprises (a) one or more chromatin binding domains each comprising an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to the amino acid sequence set forth in SEQ ID NO: 21; and (b) a DBD comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to an amino acid sequence set forth in SEQ ID NO: 18, wherein the one or more chromatin binding domains and the DBD are operably linked; and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising a nucleotide sequence comprising one or more EBV DBEs described herein.
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein, wherein the DNA binding protein comprises (a) one or more chromatin binding domains each comprising an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to the amino acid sequence set forth in SEQ ID NO: 20; and (b) a DBD comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to an amino acid sequence set forth in SEQ ID NO: 18, wherein the one or more chromatin binding domains and the DBD are operably linked; and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising a nucleotide sequence comprising one or more EBV DBEs described herein.
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein, wherein the DNA binding protein comprises (a) one or more chromatin binding domains each comprising an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to the amino acid sequence set forth in SEQ ID NO: 23; and (b) a DBD comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to an amino acid sequence set forth in SEQ ID NO: 18, wherein the one or more chromatin binding domains and the DBD are operably linked; and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising a nucleotide sequence comprising one or more EBV DBEs described herein.
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein, wherein the DNA binding protein comprises (a) one or more chromatin binding domains each comprising an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to the amino acid sequence set forth in SEQ ID NO: 156; and (b) a DBD comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to an amino acid sequence set forth in SEQ ID NO: 18, wherein the one or more chromatin binding domains and the DBD are operably linked; and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising a nucleotide sequence comprising one or more EBV DBEs described herein.
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein, wherein the DNA binding protein comprises (a) one or more chromatin binding domains each comprising an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to the amino acid sequence set forth in SEQ ID NO: 154; and (b) a DBD comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to an amino acid sequence set forth in SEQ ID NO: 18, wherein the one or more chromatin binding domains and the DBD are operably linked; and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising a nucleotide sequence comprising one or more EBV DBEs described herein.
- the DNA binding protein comprises, from N-terminus to C-terminus: one or more chromatin binding domains and the DBD. In some embodiments, the DNA binding protein comprises, from N-terminus to C-terminus: the DBD and one or more chromatin binding domains. In some embodiments, the DNA binding protein further comprises an NLS described herein (e.g., an EBNA1 NLS, a c-myc NLS, a nucleoplasmin NLS, or an SV40 NLS). In some embodiments, the NLS comprises an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 17.
- the NLS is positioned at the N-terminus, the C-terminus, or between the one or more chromatin binding domains and the DBD.
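The domain orders and NLS positions described above can be enumerated in a short sketch. This is only an illustration: the labels `CBD`, `DBD`, and `NLS` and the function name `arrangements` are hypothetical shorthand introduced here, and the sketch assumes the chromatin binding domains stay contiguous.

```python
def arrangements(n_cbd: int = 1) -> list[str]:
    """List N-to-C domain layouts for a given number of chromatin binding domains.

    Assumption (not from the disclosure): the chromatin binding domains (CBD)
    are kept contiguous, and exactly one NLS is placed at the N-terminus,
    between the CBD block and the DBD, or at the C-terminus.
    """
    if n_cbd < 1:
        raise ValueError("at least one chromatin binding domain is required")
    cbds = ["CBD"] * n_cbd
    # Two core orders: CBD(s) then DBD, or DBD then CBD(s).
    cores = [(cbds, ["DBD"]), (["DBD"], cbds)]
    layouts = []
    for first, second in cores:
        layouts.append("-".join(["NLS"] + first + second))   # NLS at N-terminus
        layouts.append("-".join(first + ["NLS"] + second))   # NLS between blocks
        layouts.append("-".join(first + second + ["NLS"]))   # NLS at C-terminus
    return layouts


if __name__ == "__main__":
    for layout in arrangements(1):
        print(layout)
```

With one chromatin binding domain this yields six layouts, three NLS positions for each of the two domain orders.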
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein comprising an EBNA1 polypeptide comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 3, and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising a nucleotide sequence comprising one or more EBV DBEs described herein.
- the ORF comprises a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 24.
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein comprising an EBNA1 polypeptide comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 7, and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising a nucleotide sequence comprising one or more EBV DBEs described herein.
- the ORF comprises a nucleotide sequence having about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 25.
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein comprising an EBNA1 polypeptide comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 8, and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising a nucleotide sequence comprising one or more EBV DBEs described herein.
- the ORF comprises a nucleotide sequence having about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 27.
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein comprising an EBNA1 polypeptide comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 9, and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising a nucleotide sequence comprising one or more EBV DBEs described herein.
- the ORF comprises a nucleotide sequence having about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 28.
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein comprising an EBNA1 polypeptide comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 10, and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising a nucleotide sequence comprising one or more EBV DBEs described herein.
- the ORF comprises a nucleotide sequence having about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 26.
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein comprising an EBNA1 polypeptide comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 11, and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising a nucleotide sequence comprising one or more EBV DBEs described herein.
- the ORF comprises a nucleotide sequence having about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 29.
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein comprising an EBNA1 polypeptide comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 130, and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising a nucleotide sequence comprising one or more EBV DBEs described herein.
- the ORF comprises a nucleotide sequence having about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 129.
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein comprising an EBNA1 polypeptide comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 133, and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising a nucleotide sequence comprising one or more EBV DBEs described herein.
- the ORF comprises a nucleotide sequence having about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 132.
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein comprising an EBNA1 polypeptide comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 136, and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising a nucleotide sequence comprising one or more EBV DBEs described herein.
- the ORF comprises a nucleotide sequence having about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 135.
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein comprising an EBNA1 polypeptide comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 139, and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising a nucleotide sequence comprising one or more EBV DBEs described herein.
- the ORF comprises a nucleotide sequence having about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 138.
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein comprising an EBNA1 polypeptide comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 142, and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising a nucleotide sequence comprising one or more EBV DBEs described herein.
- the ORF comprises a nucleotide sequence having about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 141.
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein comprising an EBNA1 polypeptide comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 145, and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising a nucleotide sequence comprising one or more EBV DBEs described herein.
- the ORF comprises a nucleotide sequence having about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 144.
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein comprising an EBNA1 polypeptide comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 150, and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising a nucleotide sequence comprising one or more EBV DBEs described herein.
- the ORF comprises a nucleotide sequence having about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 149.
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein comprising an EBNA1 polypeptide comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 153, and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising a nucleotide sequence comprising one or more EBV DBEs described herein.
- the ORF comprises a nucleotide sequence having about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 152.
- the polynucleotide comprises a nucleotide sequence comprising 1 to 50, 1 to 40, 1 to 30, 1 to 20, or 1 to 10 EBV DBEs. In some embodiments, the polynucleotide comprises a nucleotide sequence comprising at least 4 EBV DBEs.
- the polynucleotide comprises a nucleotide sequence comprising 4 to 50, 4 to 40, 4 to 30, 4 to 20, or 4 to 10 EBV DBEs.
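As a rough illustration of the element-count ranges above, the following sketch counts exact, non-overlapping copies of an element in a polynucleotide and checks the count against a range such as 4 to 50. The motif `GATTACA` is a hypothetical placeholder, not an EBV DBE sequence from Table 8, and exact matching is a simplifying assumption: elements with less than 100% identity to a reference would need an alignment-based search instead.

```python
def count_elements(sequence: str, element: str) -> int:
    """Count exact, non-overlapping occurrences of `element` in `sequence`."""
    if not element:
        raise ValueError("element must be a non-empty string")
    # str.count returns the number of non-overlapping occurrences.
    return sequence.upper().count(element.upper())


def in_range(count: int, low: int, high: int) -> bool:
    """Return True if `count` lies in the inclusive range [low, high]."""
    return low <= count <= high


if __name__ == "__main__":
    placeholder = "GATTACA"  # hypothetical motif, NOT a real DBE sequence
    construct = "TT".join([placeholder] * 5)  # five copies joined by a spacer
    n = count_elements(construct, placeholder)
    print(n, in_range(n, 4, 50))
```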
- the one or more EBV DBEs each comprise a nucleotide sequence having about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to a nucleotide sequence listed in Table 8 (e.g., SEQ ID NO: 51).
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein comprising an EBNA1 polypeptide described herein, and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising a nucleotide sequence comprising one or more EBV DBEs described herein (e.g., 1 to 50, 1 to 40, 1 to 30, 1 to 20, or 1 to 10 EBV DBEs), wherein the one or more EBV DBEs each comprise a nucleotide sequence having about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to a nucleotide sequence listed in Table 8 (e.g., SEQ ID NO: 51).
- the EBNA1 polypeptide comprises an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 1. In some embodiments, the EBNA1 polypeptide comprises or consists of SEQ ID NO: 1. In some embodiments, the ORF comprises a nucleotide sequence having about 70%, about 75%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 12. In some embodiments, the ORF comprises or consists of SEQ ID NO: 12.
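The percent-identity thresholds recited above can be illustrated with a minimal sketch. It assumes that identity is computed over an already prepared pairwise alignment (two equal-length strings with `-` for gaps) as matching columns divided by aligned columns; how identity is actually determined for the disclosed sequences may differ. The function names and the example strings are hypothetical and are not taken from any SEQ ID NO.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over a pre-computed pairwise alignment.

    Assumptions: both strings have equal length, '-' marks a gap, columns
    where both sequences have a gap are ignored, and a gap never counts
    as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment has no usable columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_identity(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """Return True if the identity is at least `threshold` percent."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy strings for illustration only.
    print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
    print(meets_identity("ACGTACGTAC", "ACGTACGTAA", 80.0))
```

The sketch treats each threshold as a hard cutoff, so it does not model the tolerance implied by "about".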
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein, wherein the DNA binding protein comprises (a) one or more chromatin binding domains each comprising an amino acid sequence selected from (I) an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to the amino acid sequence set forth in SEQ ID NO: 14; (II) an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to the amino acid sequence set forth in SEQ ID NO: 16; (III) an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to the amino acid sequence set forth in SEQ ID NO: 22; (IV) an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to the amino acid sequence set forth
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein, wherein the DNA binding protein comprises (a) one or more chromatin binding domains each comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to the amino acid sequence set forth in SEQ ID NO: 14; and (b) a DBD comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to an amino acid sequence set forth in SEQ ID NO: 18, wherein the one or more chromatin binding domains and the DBD are operably linked; and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising a nucleotide sequence comprising one or more EBV DBEs described herein (e.g.,
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein, wherein the DNA binding protein comprises (a) one or more chromatin binding domains each comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to the amino acid sequence set forth in SEQ ID NO: 16; and (b) a DBD comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to an amino acid sequence set forth in SEQ ID NO: 18, wherein the one or more chromatin binding domains and the DBD are operably linked; and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising a nucleotide sequence comprising one or more EBV DBEs described herein (e.g., 1
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein, wherein the DNA binding protein comprises (a) one or more chromatin binding domains each comprising an amino acid sequence selected from (I) an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to the amino acid sequence set forth in SEQ ID NO: 14 and (II) an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to the amino acid sequence set forth in SEQ ID NO: 16; and (b) a DBD comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to an amino acid sequence set forth in SEQ ID NO: 18, wherein the one or more chromatin binding domains and the DBD are operably linked; and (ii) a recomb
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein, wherein the DNA binding protein comprises (a) one or more chromatin binding domains each comprising an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to the amino acid sequence set forth in SEQ ID NO: 22; and (b) a DBD comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to an amino acid sequence set forth in SEQ ID NO: 18, wherein the one or more chromatin binding domains and the DBD are operably linked; and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising a nucleotide sequence comprising one or more EBV DBEs described herein (e.g., a recombinant
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein, wherein the DNA binding protein comprises (a) one or more chromatin binding domains each comprising an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to the amino acid sequence set forth in SEQ ID NO: 21; and (b) a DBD comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to an amino acid sequence set forth in SEQ ID NO: 18, wherein the one or more chromatin binding domains and the DBD are operably linked; and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising a nucleotide sequence comprising one or more EBV DBEs described herein (e.g., a recombinant
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein, wherein the DNA binding protein comprises (a) one or more chromatin binding domains each comprising an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to the amino acid sequence set forth in SEQ ID NO: 20; and (b) a DBD comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to an amino acid sequence set forth in SEQ ID NO: 18, wherein the one or more chromatin binding domains and the DBD are operably linked; and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising a nucleotide sequence comprising one or more EBV DBEs described herein (e.g., a recombinant
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein, wherein the DNA binding protein comprises (a) one or more chromatin binding domains each comprising an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to the amino acid sequence set forth in SEQ ID NO: 23; and (b) a DBD comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to an amino acid sequence set forth in SEQ ID NO: 18, wherein the one or more chromatin binding domains and the DBD are operably linked; and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising a nucleotide sequence comprising one or more EBV DBEs described herein (e.g., a recombinant
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein, wherein the DNA binding protein comprises (a) one or more chromatin binding domains each comprising an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to the amino acid sequence set forth in SEQ ID NO: 156; and (b) a DBD comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to an amino acid sequence set forth in SEQ ID NO: 18, wherein the one or more chromatin binding domains and the DBD are operably linked; and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising a nucleotide sequence comprising one or more EBV DBEs described herein (e.
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein, wherein the DNA binding protein comprises (a) one or more chromatin binding domains each comprising an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to the amino acid sequence set forth in SEQ ID NO: 154; and (b) a DBD comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to an amino acid sequence set forth in SEQ ID NO: 18, wherein the one or more chromatin binding domains and the DBD are operably linked; and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising a nucleotide sequence comprising one or more EBV DBEs described herein (e.
- the DNA binding protein comprises, from N-terminus to C-terminus: one or more chromatin binding domains and the DBD. In some embodiments, the DNA binding protein comprises, from N-terminus to C-terminus: the DBD and one or more chromatin binding domains. In some embodiments, the DNA binding protein further comprises an NLS described herein (e.g., an EBNA1 NLS, a c-myc NLS, a nucleoplasmin NLS, or an SV40 NLS). In some embodiments, the NLS comprises an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 17.
- the NLS is positioned at the N-terminus, the C-terminus, or between the one or more chromatin binding domains and the DBD.
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein comprising an EBNA1 polypeptide comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 3, and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising a nucleotide sequence comprising one or more EBV DBEs described herein (e.g., 1 to 50, 1 to 40, 1 to 30, 1 to 20, 1 to 10 EBV DBEs), wherein the one or more EBV DBEs each comprise a nucleotide sequence having about 70%, about 80%, about 85%
- the ORF comprises a nucleotide sequence having at least about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 24.
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein comprising an EBNA1 polypeptide comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 7, and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising a nucleotide sequence comprising one or more EBV DBEs described herein (e.g., 1 to 50, 1 to 40, 1 to 30, 1 to 20, 1 to 10 EBV DBEs), wherein the one or more EBV DBEs each comprise
- the ORF comprises a nucleotide sequence having about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 25.
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein comprising an EBNA1 polypeptide comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 8, and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising a nucleotide sequence comprising one or more EBV DBEs described herein (e.g., 1 to 50, 1 to 40, 1 to 30, 1 to 20, 1 to 10 EBV DBEs), wherein the one or more EBV DBEs
- the ORF comprises a nucleotide sequence having about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 27.
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein comprising an EBNA1 polypeptide comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 9, and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising a nucleotide sequence comprising one or more EBV DBEs described herein (e.g., 1 to 50, 1 to 40, 1 to 30, 1 to 20, 1 to 10 EBV DBEs), wherein the one or more EBV DBEs
- the ORF comprises a nucleotide sequence having about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 28.
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein comprising an EBNA1 polypeptide comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 10, and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising a nucleotide sequence comprising one or more EBV DBEs described herein (e.g., 1 to 50, 1 to 40, 1 to 30, 1 to 20, 1 to 10 EBV DBEs), wherein the one or more EBV DBEs
- the ORF comprises a nucleotide sequence having about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 26.
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein comprising an EBNA1 polypeptide comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 11, and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising a nucleotide sequence comprising one or more EBV DBEs described herein (e.g., 1 to 50, 1 to 40, 1 to 30, 1 to 20, 1 to 10 EBV DBEs), wherein the one or more EBV DBEs
- the ORF comprises a nucleotide sequence having about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 29.
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein comprising an EBNA1 polypeptide comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 130, and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising a nucleotide sequence comprising one or more EBV DBEs described herein (e.g., 1 to 50, 1 to 40, 1 to 30, 1 to 20, 1 to 10 EBV DBEs), wherein the one or more EBV DBE
- the ORF comprises a nucleotide sequence having about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 129.
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein comprising an EBNA1 polypeptide comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 133, and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising a nucleotide sequence comprising one or more EBV DBEs described herein (e.g., 1 to 50, 1 to 40, 1 to 30, 1 to 20, 1 to 10 EBV DBEs), wherein the one or more EBV
- the ORF comprises a nucleotide sequence having about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 132.
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein comprising an EBNA1 polypeptide comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 136, and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising a nucleotide sequence comprising one or more EBV DBEs described herein (e.g., 1 to 50, 1 to 40, 1 to 30, 1 to 20, 1 to 10 EBV DBEs), wherein the one or more EBV
- the ORF comprises a nucleotide sequence having about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 135.
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein comprising an EBNA1 polypeptide comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 139, and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising a nucleotide sequence comprising one or more EBV DBEs described herein (e.g., 1 to 50, 1 to 40, 1 to 30, 1 to 20, 1 to 10 EBV DBEs), wherein the one or more EBV D
- the ORF comprises a nucleotide sequence having about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 138.
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein comprising an EBNA1 polypeptide comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 142, and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising a nucleotide sequence comprising one or more EBV DBEs described herein (e.g., 1 to 50, 1 to 40, 1 to 30, 1 to 20, 1 to 10 EBV DBEs), wherein the one or more EBV
- the ORF comprises a nucleotide sequence having about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 141.
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein comprising an EBNA1 polypeptide comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 145, and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising a nucleotide sequence comprising one or more EBV DBEs described herein (e.g., 1 to 50, 1 to 40, 1 to 30, 1 to 20, 1 to 10 EBV DBEs), wherein the one or more EBV D
- the ORF comprises a nucleotide sequence having about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 144.
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein comprising an EBNA1 polypeptide comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 150, and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising a nucleotide sequence comprising one or more EBV DBEs described herein (e.g., 1 to 50, 1 to 40, 1 to 30, 1 to 20, 1 to 10 EBV DBEs), wherein the one or more EBV D
- the ORF comprises a nucleotide sequence having about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 149.
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein comprising an EBNA1 polypeptide comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 153, and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising a nucleotide sequence comprising one or more EBV DBEs described herein (e.g., 1 to 50, 1 to 40, 1 to 30, 1 to 20, 1 to 10 EBV DBEs), wherein the one or more EBV
- the ORF comprises a nucleotide sequence having about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 152.
- the polynucleotide comprises one or more FR repeat sequences (e.g., 1 to 50, 1 to 40, 1 to 30, 1 to 20, 1 to 10 FR repeat sequences).
- the one or more FR repeat sequences each comprise a nucleotide sequence having at least about 50% identity (e.g., about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity) to a nucleotide sequence set forth in Table 8 operably linked to a nucleotide sequence having at least about 50% identity (e.g., about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity) to a nucleotide sequence set forth in Table 9.
- the one or more FR repeat sequences comprise a nucleotide sequence having at least about 50% identity (e.g., about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity) to SEQ ID NO: 68.
- the polynucleotide comprises (i) a nucleotide sequence having at least about 50% identity (e.g., about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity) to the nucleotide sequence of SEQ ID NO: 69; (ii) a nucleotide sequence having at least about 50% identity (e.g., about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity) to the nucleotide sequence of SEQ ID NO: 70; or (iii) a combination of (i) and (ii).
- the polynucleotide comprises (i) operably linked to (ii). In some embodiments, (i) is upstream of (ii). In some embodiments, (ii) is upstream of (i). [00654] In some embodiments, the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein comprising an EBNA1 polypeptide described herein, and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising (a) a nucleotide sequence having at least about 50% identity (e.g., about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity) to the nucleotide sequence of SEQ ID NO: 69; (b) a nucleotide sequence having at least about 50% identity (e.g., about 70%, about 80%,
- the EBNA1 polypeptide comprises an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 1. In some embodiments, the EBNA1 polypeptide comprises or consists of SEQ ID NO: 1. In some embodiments, the ORF comprises a nucleotide sequence having about 70%, about 75%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 12. In some embodiments, the ORF comprises or consists of SEQ ID NO: 12.
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein, wherein the DNA binding protein comprises (a) one or more chromatin binding domains each comprising an amino acid sequence selected from (I) an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to the amino acid sequence set forth in SEQ ID NO: 14; (II) an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to the amino acid sequence set forth in SEQ ID NO: 16; (III) an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to the amino acid sequence set forth in SEQ ID NO: 22; (IV) an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to the amino acid sequence set forth
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein, wherein the DNA binding protein comprises (a) one or more chromatin binding domains each comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to the amino acid sequence set forth in SEQ ID NO: 14; and (b) a DBD comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to an amino acid sequence set forth in SEQ ID NO: 18, wherein the one or more chromatin binding domains and the DBD are operably linked; and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising (a) a nucleotide sequence having at least about 50% identity (e.g., about 70%,
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein, wherein the DNA binding protein comprises (a) one or more chromatin binding domains each comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to the amino acid sequence set forth in SEQ ID NO: 16; and (b) a DBD comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to an amino acid sequence set forth in SEQ ID NO: 18, wherein the one or more chromatin binding domains and the DBD are operably linked; and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising (a) a nucleotide sequence having at least about 50% identity (e.g., about 70%,
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein, wherein the DNA binding protein comprises (a) one or more chromatin binding domains each comprising an amino acid sequence selected from (I) an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to the amino acid sequence set forth in SEQ ID NO: 14 and (II) an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to the amino acid sequence set forth in SEQ ID NO: 16; and (b) a DBD comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to an amino acid sequence set forth in SEQ ID NO: 18, wherein the one or more chromatin binding domains and the DBD are operably linked; and (ii) a recomb
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein, wherein the DNA binding protein comprises (a) one or more chromatin binding domains each comprising an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to the amino acid sequence set forth in SEQ ID NO: 22; and (b) a DBD comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to an amino acid sequence set forth in SEQ ID NO: 18, wherein the one or more chromatin binding domains and the DBD are operably linked; and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising (a) a nucleotide sequence having at least about 50% identity (e.g., about
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein, wherein the DNA binding protein comprises (a) one or more chromatin binding domains each comprising an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to the amino acid sequence set forth in SEQ ID NO: 21; and (b) a DBD comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to an amino acid sequence set forth in SEQ ID NO: 18, wherein the one or more chromatin binding domains and the DBD are operably linked; and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising (a) a nucleotide sequence having at least about 50% identity (e.g., about 70%
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein, wherein the DNA binding protein comprises (a) one or more chromatin binding domains each comprising an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to the amino acid sequence set forth in SEQ ID NO: 20; and (b) a DBD comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to an amino acid sequence set forth in SEQ ID NO: 18, wherein the one or more chromatin binding domains and the DBD are operably linked; and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising (a) a nucleotide sequence having at least about 50% identity (e.g., about
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein, wherein the DNA binding protein comprises (a) one or more chromatin binding domains each comprising an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to the amino acid sequence set forth in SEQ ID NO: 23; and (b) a DBD comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to an amino acid sequence set forth in SEQ ID NO: 18, wherein the one or more chromatin binding domains and the DBD are operably linked; and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising (a) a nucleotide sequence having at least about 50% identity (e.g., about
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein, wherein the DNA binding protein comprises (a) one or more chromatin binding domains each comprising an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to the amino acid sequence set forth in SEQ ID NO: 156; and (b) a DBD comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to an amino acid sequence set forth in SEQ ID NO: 18, wherein the one or more chromatin binding domains and the DBD are operably linked; and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising (a) a nucleotide sequence having at least about 50% identity (e.g.,
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein, wherein the DNA binding protein comprises (a) one or more chromatin binding domains each comprising an amino acid sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to the amino acid sequence set forth in SEQ ID NO: 154; and (b) a DBD comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to an amino acid sequence set forth in SEQ ID NO: 18, wherein the one or more chromatin binding domains and the DBD are operably linked; and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising (a) a nucleotide sequence having at least about 50% identity (e.g.,
- the DNA binding protein comprises from N-terminus to C-terminus: one or more chromatin binding domains and the DBD. In some embodiments, the DNA binding protein comprises from N-terminus to C-terminus: the DBD and one or more chromatin binding domains. In some embodiments, the DNA binding protein further comprises an NLS described herein (e.g., an EBNA1 NLS, a c-myc NLS, a nucleoplasmin NLS, an SV40 NLS). In some embodiments, the NLS comprises an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 17.
- the NLS is positioned at the N-terminus, the C-terminus, or between the one or more chromatin binding domains and the DBD.
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein comprising an EBNA1 polypeptide comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 3, and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising (a) a nucleotide sequence having at least about 50% identity (e.g., about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity) to the nucleotide sequence of SEQ ID NO: 69; (b) a nucleotide
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein comprising an EBNA1 polypeptide comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 7, and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising (a) a nucleotide sequence having at least about 50% identity (e.g., about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity) to the nucleotide sequence of SEQ ID NO: 69; (b) a nucleotide sequence having at least about 50% identity (e.g., about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or
- the ORF comprises a nucleotide sequence having about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 25.
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein comprising an EBNA1 polypeptide comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 8, and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising (a) a nucleotide sequence having at least about 50% identity (e.g., about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity) to the nucleot
- the ORF comprises a nucleotide sequence having about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 27.
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein comprising an EBNA1 polypeptide comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 9, and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising (a) a nucleotide sequence having at least about 50% identity (e.g., about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity) to the nucleot
- the ORF comprises a nucleotide sequence having about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 28.
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein comprising an EBNA1 polypeptide comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 10, and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising (a) a nucleotide sequence having at least about 50% identity (e.g., about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity) to the nucleot
- the ORF comprises a nucleotide sequence having about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 26.
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein comprising an EBNA1 polypeptide comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 11, and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising (a) a nucleotide sequence having at least about 50% identity (e.g., about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity) to the nucleot
- the ORF comprises a nucleotide sequence having about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 29.
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein comprising an EBNA1 polypeptide comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 130, and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising (a) a nucleotide sequence having at least about 50% identity (e.g., about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity) to the nucleo
- the ORF comprises a nucleotide sequence having about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 129.
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein comprising an EBNA1 polypeptide comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 133, and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising (a) a nucleotide sequence having at least about 50% identity (e.g., about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity) to the nucleotide.
- the ORF comprises a nucleotide sequence having about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 132.
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein comprising an EBNA1 polypeptide comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 136, and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising (a) a nucleotide sequence having at least about 50% identity (e.g., about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity) to the nucleotide.
- the ORF comprises a nucleotide sequence having about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 135.
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein comprising an EBNA1 polypeptide comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 139, and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising (a) a nucleotide sequence having at least about 50% identity (e.g., about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity) to the nucle
- the ORF comprises a nucleotide sequence having about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 138.
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein comprising an EBNA1 polypeptide comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 142, and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising (a) a nucleotide sequence having at least about 50% identity (e.g., about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity) to the nucleotide.
- the ORF comprises a nucleotide sequence having about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 141.
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein comprising an EBNA1 polypeptide comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 145, and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising (a) a nucleotide sequence having at least about 50% identity (e.g., about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity) to the nucleotide.
- the ORF comprises a nucleotide sequence having about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 144.
- the disclosure provides a system comprising (i) an mRNA comprising an ORF encoding a DNA binding protein comprising an EBNA1 polypeptide comprising an amino acid sequence having about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 150, and (ii) a recombinant expression vector described herein comprising at least one transgene described herein operably linked to a polynucleotide comprising (a) a nucleotide sequence having at least about 50% identity (e.g., about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity) to the nucle
- the ORF comprises a nucleotide sequence having about 50%, about 60%, about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity to SEQ ID NO: 149.
- the polynucleotide comprises a nucleotide sequence having at least about 50% identity (e.g., about 70%, about 80%, about 85%, about 90%, about 95%, about 98%, about 99%, or about 100% identity) to SEQ ID NO: 71.
- the polynucleotide comprises a FR of an EBV OriP or portion thereof.
- the recombinant expression vector lacks a DS of an EBV OriP.
- the mRNA comprises from 5′ to 3′: a 5′ UTR described herein, the ORF encoding a DNA binding protein, and a 3′ UTR described herein.
- the mRNA comprises from 5′ to 3′: a 5′ cap described herein, a 5′ UTR described herein, the ORF encoding a DNA binding protein, and a 3′ UTR described herein.
- the mRNA comprises one or more modifications.
- the mRNA comprises one or more modified uridines (e.g., a modified uridine selected from pseudouridine, N1-methylpseudouridine, and 5-methoxyuridine).
- the mRNA is fully modified with pseudouridine.
- the mRNA is fully modified with N1-methylpseudouridine.
- the mRNA is fully modified with 5-methoxyuridine.
- the recombinant expression vector is a DNA.
- the recombinant expression vector is a plasmid DNA.
- the recombinant expression vector comprises a promoter described herein operably linked to an open reading frame comprising the at least one transgene and the polynucleotide.
- the promoter is a CMV promoter (e.g., a human CMV promoter).
- the promoter is an EF1 promoter (e.g., a human EF1 promoter).
- the promoter is a cancer specific promoter.
- the promoter is a tumor specific promoter.
- the recombinant expression vector further comprises an enhancer.
- the enhancer is a CMV enhancer (e.g., a human CMV enhancer or a mouse CMV enhancer).
- the disclosure provides a composition comprising the system.
- the composition comprises a total amount of nucleic acid (i.e., mRNA + recombinant expression vector), wherein the mRNA is about 10% to about 20%, about 10% to about 30%, about 10% to about 40%, about 10% to about 50%, about 10% to about 60%, about 10% to about 70%, about 20% to about 30%, about 20% to about 40%, about 20% to about 50%, about 20% to about 60%, about 20% to about 70%, about 30% to about 40%, about 30% to about 50%, about 30% to about 60%, or about 30% to about 70% of the total nucleic acid.
- the mRNA is about 20%, about 30%, about 40%, about 45%, about 50%, about 55%, about 60%, about 65%, or about 70% of the total nucleic acid. In some embodiments, the mRNA is about 20% of the total nucleic acid. In some embodiments, the mRNA is about 30% of the total nucleic acid. In some embodiments, the mRNA is about 40% of the total nucleic acid. In some embodiments, the mRNA is about 50% of the total nucleic acid. In some embodiments, the mRNA is about 60% of the total nucleic acid. In some embodiments, the mRNA is about 70% of the total nucleic acid.
- the mRNA and the recombinant expression vector are present at a 1:1 w/w ratio.
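The relationship between the mRNA percentage of total nucleic acid and a w/w ratio can be illustrated with a short calculation. The following is only a sketch: the function names and the microgram masses are hypothetical and are not part of the disclosure. It shows that a 1:1 w/w ratio corresponds to the mRNA being 50% of the total nucleic acid.

```python
import math


def mrna_fraction(mrna_mass_ug: float, vector_mass_ug: float) -> float:
    """Return the mRNA share of total nucleic acid (mRNA + vector) by weight."""
    total = mrna_mass_ug + vector_mass_ug
    if total <= 0:
        raise ValueError("total nucleic acid mass must be positive")
    return mrna_mass_ug / total


def masses_for_fraction(total_ug: float, mrna_frac: float) -> tuple[float, float]:
    """Split a total nucleic acid mass into (mRNA, vector) masses for a target mRNA fraction."""
    if not 0.0 <= mrna_frac <= 1.0:
        raise ValueError("mrna_frac must lie between 0 and 1")
    mrna = total_ug * mrna_frac
    return mrna, total_ug - mrna


# Hypothetical masses: equal weights of mRNA and vector (1:1 w/w).
print(mrna_fraction(1.0, 1.0))

# Hypothetical target: mRNA at about 30% of 10 ug total nucleic acid.
print(masses_for_fraction(10.0, 0.30))
```

Under these assumptions, any of the recited percentages can be converted to component masses in the same way.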
- the composition comprises an LNP comprising the mRNA.
- the composition comprises an LNP comprising the recombinant expression vector.
- the composition comprises an LNP comprising the mRNA and the recombinant expression vector.
- the LNP comprises an ionizable amino lipid described herein, a phospholipid described herein, a structural lipid described herein, and a PEG-lipid described herein.
- expression of the at least one transgene is increased by at least about 5-fold, about 10-fold, about 20-fold, about 30-fold, about 40-fold, about 45-fold, about 50-fold, about 55-fold, about 60-fold, about 65-fold, about 70-fold, about 75-fold, about 80-fold, about 85-fold, about 90-fold, about 95-fold, about 100-fold, about 110-fold, about 120-fold, about 130-fold, about 140-fold, about 150-fold, about 160-fold, about 170-fold, about 180-fold, about 190-fold, or about 200-fold as compared to introducing a control system or composition to the cell (e.g., a system or composition comprising the recombinant expression vector alone).
- the disclosure provides a system described herein, or a delivery system (e.g., an LNP) described herein comprising the system or a component thereof.
- the disclosure provides a system comprising (i) an mRNA described herein comprising an ORF encoding a DNA binding protein comprising one or more chromatin-binding domains operably linked to a DBD of an EBNA1 homolog, wherein the EBNA1 homolog is of a NHP LCV described herein, and (ii) a recombinant expression vector described herein comprising a transgene operably linked to a DNA binding polynucleotide comprising a DBE of an EBV, or a variant thereof, and/or a DBE of an NHP LCV, or a variant thereof.
- the LCV is selected from gorilline gammaherpesvirus 1, macacine gammaherpesvirus 4, macacine gammaherpesvirus 10, macacine gammaherpesvirus 13, panine gammaherpesvirus 1, papiine gammaherpesvirus 1, and pongine gammaherpesvirus 2.
- the first sequence motif has at least 80% similarity to a sequence selected from KTSLYNLRRG (SEQ ID NO: 287), KTCCYNLRRC (SEQ ID NO: 288), KIPIYNLRRG (SEQ ID NO: 289), KTSCYNLRRC (SEQ ID NO: 290), KTCVYNLRRC (SEQ ID NO: 291), and KNSCYNLRRC (SEQ ID NO: 113).
- the first sequence motif has at least 80% identity to a sequence selected from KTSLYNLRRG (SEQ ID NO: 287), KTCCYNLRRC (SEQ ID NO: 288), KIPIYNLRRG (SEQ ID NO: 289), KTSCYNLRRC (SEQ ID NO: 290), KTCVYNLRRC (SEQ ID NO: 291), and KNSCYNLRRC (SEQ ID NO: 113).
- the first sequence motif comprises or consists of a sequence selected from KTSLYNLRRG (SEQ ID NO: 287), KTCCYNLRRC (SEQ ID NO: 288), KIPIYNLRRG (SEQ ID NO: 289), KTSCYNLRRC (SEQ ID NO: 290), KTCVYNLRRC (SEQ ID NO: 291), and KNSCYNLRRC (SEQ ID NO: 113).
- the second sequence motif has at least 80% similarity to a sequence selected from RLTPLSRLPF (SEQ ID NO: 294), RATPLSRLPY (SEQ ID NO: 295), RSTTLGRLPY (SEQ ID NO: 296), RLTPLGRLPF (SEQ ID NO: 297), RATPLGRLPY (SEQ ID NO: 298), and RLTPLSRLPY (SEQ ID NO: 299).
- the second sequence motif has at least 80% identity to a sequence selected from RLTPLSRLPF (SEQ ID NO: 294), RATPLSRLPY (SEQ ID NO: 295), RSTTLGRLPY (SEQ ID NO: 296), RLTPLGRLPF (SEQ ID NO: 297), RATPLGRLPY (SEQ ID NO: 298), and RLTPLSRLPY (SEQ ID NO: 299).
- the second sequence motif comprises or consists of a sequence selected from RLTPLSRLPF (SEQ ID NO: 294), RATPLSRLPY (SEQ ID NO: 295), RSTTLGRLPY (SEQ ID NO: 296), RLTPLGRLPF (SEQ ID NO: 297), RATPLGRLPY (SEQ ID NO: 298), and RLTPLSRLPY (SEQ ID NO: 299).
- the third sequence motif has at least 80% similarity to a sequence selected from GPQPGPLRES (SEQ ID NO: 301), GPQPGPLKES (SEQ ID NO: 302), GPQPGPMRES (SEQ ID NO: 303), and GPEPTPLMES (SEQ ID NO: 304). In some embodiments, the third sequence motif has at least 80% identity to a sequence selected from GPQPGPLRES (SEQ ID NO: 301), GPQPGPLKES (SEQ ID NO: 302), GPQPGPMRES (SEQ ID NO: 303), and GPEPTPLMES (SEQ ID NO: 304).
- the third sequence motif comprises or consists of a sequence selected from GPQPGPLRES (SEQ ID NO: 301), GPQPGPLKES (SEQ ID NO: 302), GPQPGPMRES (SEQ ID NO: 303), and GPEPTPLMES (SEQ ID NO: 304).
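By way of non-limiting illustration, the "at least 80% identity" criterion recited above for the first sequence motif could be checked as in the following minimal sketch. It assumes an ungapped, position-by-position comparison of equal-length sequences, which the text above does not specify; the function names `percent_identity` and `matching_references` are hypothetical. Similarity scoring, which would need a substitution matrix, is not covered.

```python
# Reference first-motif sequences copied from the text above, keyed by SEQ ID NO.
FIRST_MOTIFS = {
    287: "KTSLYNLRRG",
    288: "KTCCYNLRRC",
    289: "KIPIYNLRRG",
    290: "KTSCYNLRRC",
    291: "KTCVYNLRRC",
    113: "KNSCYNLRRC",
}


def percent_identity(a: str, b: str) -> float:
    """Ungapped percent identity between two equal-length sequences.

    Assumption: identity is counted position by position with no alignment gaps.
    """
    if len(a) != len(b):
        raise ValueError("sequences must have equal length for ungapped comparison")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def matching_references(candidate: str, threshold: float = 80.0) -> list[int]:
    """Return the SEQ ID NOs whose identity to the candidate meets the threshold."""
    return [
        seq_id
        for seq_id, ref in FIRST_MOTIFS.items()
        if len(ref) == len(candidate) and percent_identity(candidate, ref) >= threshold
    ]
```

For a 10-residue motif, an 80% threshold under this assumption tolerates at most two substituted positions.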
- [Xaa1]w comprises or consists of X1X2X3GGX4X5X6X7X8RGX9X10X11X12X13X14X15KX16X17X18X19X20X21X22X23X24X25LLX26RX27X28X29X30X31X32TX33X34X35X36X37WX38X39X40X41X42X43X44X45X46X47 (SEQ ID NO: 360), wherein X1; X2; X3; X4; X5; X6; X7; X8; X9; X10; X11; X12; X13; X14; X15; X16; X17; X18; X19; X20; X21; X22; X23; X24
- [Xaa2]x comprises or consists of X53X54X55X56X57X58X59X60 (SEQ ID NO: 361), wherein X53; X54; X55; X56; X57; X58; X59; and X60 are defined in consensus 1 as set forth in Table 7.
- [Xaa3]y comprises or consists of GX66X67X68X69X70 (SEQ ID NO: 362), wherein X66; X67; X68; X69; and X70 are defined in consensus 1 as set forth in Table 7.
- [Xaa4]z comprises or consists of X75X76X77X78FX79X80FX81X82X83X84X85X86X87X88X89X90X91X92X93X94X95X96X97X98X99X100X101PX102PX103X104X105X106X107VX108X109X110X111FX112X113X114X115X116X117LP (SEQ ID NO: 363), wherein X75; X76; X77; X78; X79; X80; X81; X82; X83; X84; X85; X86; X87; X88; X89; X90; X91; X92; X93; X94; X95; X96; X97; X98; X99; X100; X101; X102; X103; X104; X105;
- [Xaa1]w comprises or consists of X1X2X3GGX4X5X6X7X8RGX9X10X11X12X13X14X15KX16X17X18X19X20X21X22X23X24X25LLX26RX27X28X29X30X31X32TX33X34X35X36X37WX38X39X40X41X42X43X44X45X46X47 (SEQ ID NO: 306), wherein X1; X2; X3; X4; X5; X6; X7; X8; X9; X10; X11; X12; X13; X14; X15; X16; X17; X18; X19; X20; X21; X22; X23; X24
- [Xaa2]x comprises or consists of X53X54X55X56X57X58X59X60 (SEQ ID NO: 307), wherein X53; X54; X55; X56; X57; X58; X59; and X60 are defined in consensus 2 as set forth in Table 7.
- [Xaa3]y comprises or consists of GX66X67X68X69X70 (SEQ ID NO: 308), wherein X66; X67; X68; X69; and X70 are defined in consensus 2 as set forth in Table 7.
- [Xaa4]z comprises or consists of X75X76X77X78FX79X80FX81X82X83X84X85X86X87X88X89X90X91X92X93X94X95X96X97X98X99X100X101PX102PX103X104X105X106X107VX108X109X110X111FX112X113X114X115X116X117LP (SEQ ID NO: 309), wherein X75; X76; X77; X78; X79; X80; X81; X82; X83; X84; X85; X86; X87; X88; X89; X90; X91; X92; X93; X94; X95; X96; X97; X98; X
- the DBD comprises or consists of a sequence as defined in consensus 1 set forth in Table 7. In some embodiments, the DBD comprises or consists of SEQ ID NO: 364. In some embodiments, the DBD comprises or consists of a sequence as defined in consensus 2 set forth in Table 7. In some embodiments, the DBD comprises or consists of SEQ ID NO: 310. [00695] In some embodiments, the DBD comprises or consists of an amino acid sequence having at least about 90%, about 95%, about 98%, about 99% identity to SEQ ID NO: 215. In some embodiments, the DBD comprises or consists of SEQ ID NO: 215.
- the ORF comprises or consists of a nucleotide sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 349. In some embodiments, the ORF comprises or consists of SEQ ID NO: 349. [00696] In some embodiments, the DBD comprises or consists of an amino acid sequence having at least about 90%, about 95%, about 98%, about 99% identity to SEQ ID NO: 216. In some embodiments, the DBD comprises or consists of SEQ ID NO: 216.
- the ORF comprises or consists of a nucleotide sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 350. In some embodiments, the ORF comprises or consists of SEQ ID NO: 350. [00697] In some embodiments, the DBD comprises or consists of an amino acid sequence having at least about 90%, about 95%, about 98%, about 99% identity to SEQ ID NO: 217. In some embodiments, the DBD comprises or consists of SEQ ID NO: 217.
- the ORF comprises or consists of a nucleotide sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 351. In some embodiments, the ORF comprises or consists of SEQ ID NO: 351. [00698] In some embodiments, the DBD comprises or consists of an amino acid sequence having at least about 90%, about 95%, about 98%, about 99% identity to SEQ ID NO: 218. In some embodiments, the DBD comprises or consists of SEQ ID NO: 218.
- the ORF comprises or consists of a nucleotide sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 352. In some embodiments, the ORF comprises or consists of SEQ ID NO: 352. [00699] In some embodiments, the DBD comprises or consists of an amino acid sequence having at least about 90%, about 95%, about 98%, about 99% identity to SEQ ID NO: 219. In some embodiments, the DBD comprises or consists of SEQ ID NO: 219.
- the ORF comprises or consists of a nucleotide sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 353. In some embodiments, the ORF comprises or consists of SEQ ID NO: 353. [00700] In some embodiments, the DBD comprises or consists of an amino acid sequence having at least about 90%, about 95%, about 98%, about 99% identity to SEQ ID NO: 220. In some embodiments, the DBD comprises or consists of SEQ ID NO: 220.
- the ORF comprises or consists of a nucleotide sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 354. In some embodiments, the ORF comprises or consists of SEQ ID NO: 354. [00701] In some embodiments, the chromatin binding domain is described herein. In some embodiments, the DNA binding protein comprises from N-terminus to C-terminus: the chromatin binding domain and the DBD. In some embodiments, the DNA binding protein comprises from N-terminus to C-terminus: the DBD and the chromatin binding domain.
- the DNA binding protein further comprises an NLS described herein (e.g., an EBNA1 NLS, a c-myc NLS, a nucleoplasmin NLS, an SV40 NLS).
- the NLS is positioned at the N-terminus, the C-terminus, or at an internal region of the DNA binding protein.
- the DNA binding protein comprises an amino acid sequence having at least about 90%, about 95%, about 98%, about 99% identity to SEQ ID NO: 282.
- the DNA binding protein comprises SEQ ID NO: 282.
- the ORF comprises a nucleotide sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 320. In some embodiments, the ORF comprises SEQ ID NO: 320. [00703] In some embodiments, the DNA binding protein comprises an amino acid sequence having at least about 90%, about 95%, about 98%, about 99% identity to SEQ ID NO: 192. In some embodiments, the DNA binding protein comprises SEQ ID NO: 192. In some embodiments, the ORF comprises a nucleotide sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 201.
- the ORF comprises SEQ ID NO: 201.
- the DNA binding protein comprises an amino acid sequence having at least about 90%, about 95%, about 98%, about 99% identity to SEQ ID NO: 283.
- the DNA binding protein comprises SEQ ID NO: 283.
- the ORF comprises a nucleotide sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 321.
- the ORF comprises SEQ ID NO: 321.
- the DNA binding protein comprises an amino acid sequence having at least about 90%, about 95%, about 98%, about 99% identity to SEQ ID NO: 193.
- the DNA binding protein comprises SEQ ID NO: 193.
- the ORF comprises a nucleotide sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 203.
- the ORF comprises SEQ ID NO: 203.
- the DNA binding protein comprises an amino acid sequence having at least about 90%, about 95%, about 98%, about 99% identity to SEQ ID NO: 194.
- the DNA binding protein comprises SEQ ID NO: 194.
- the ORF comprises a nucleotide sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 205. In some embodiments, the ORF comprises SEQ ID NO: 205. [00707] In some embodiments, the DNA binding protein comprises an amino acid sequence having at least about 90%, about 95%, about 98%, about 99% identity to SEQ ID NO: 195. In some embodiments, the DNA binding protein comprises SEQ ID NO: 195. In some embodiments, the ORF comprises a nucleotide sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 207.
- the ORF comprises SEQ ID NO: 207.
- the DNA binding protein comprises an amino acid sequence having at least about 90%, about 95%, about 98%, about 99% identity to SEQ ID NO: 196.
- the DNA binding protein comprises SEQ ID NO: 196.
- the ORF comprises a nucleotide sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 209. In some embodiments, the ORF comprises SEQ ID NO: 209.
- the DNA binding protein comprises an amino acid sequence having at least about 90%, about 95%, about 98%, about 99% identity to SEQ ID NO: 197. In some embodiments, the DNA binding protein comprises SEQ ID NO: 197. In some embodiments, the ORF comprises a nucleotide sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 211. In some embodiments, the ORF comprises SEQ ID NO: 211.
- the EBV DBE comprises TAGCATATGCTA (SEQ ID NO: 51), a nucleotide sequence having 1, 2, 3, or 4 mismatches relative to SEQ ID NO: 51, or a nucleotide sequence having at least 80% identity to SEQ ID NO: 51.
- the DNA binding polynucleotide comprises at least 4 DBEs, and wherein the DBEs are the same or different.
- the DNA binding polynucleotide comprises 4 to 50, 4 to 40, 4 to 30, 4 to 20, 4 to 10, or 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, or 4 DBEs.
- the at least 4 DBEs are contiguous.
- the at least 4 DBEs are operably linked via a spacer sequence.
- the spacer sequence is about 1-56 nucleotides in length. In some embodiments, the spacer sequence is 25-35 nucleotides in length. In some embodiments, the spacer sequence comprises an AT-content of at least about 50%. In some embodiments, each DBE comprises a 3′ spacer sequence, and wherein the length of the DBE and the 3′ spacer sequence is about 20-50 nucleotides.
- the DNA binding polynucleotide comprises a sequence represented by the formula 5′-[D1]-[L1]-[D2]-[L2]-[D3]-[L3]-([Dn]-[Ln])x-3′, wherein [D1], [D2], [D3], and [Dn] each comprise TAGCATATGCTA (SEQ ID NO: 51), a nucleotide sequence having 1, 2, 3, or 4 mismatches relative to TAGCATATGCTA (SEQ ID NO: 51), or a nucleotide sequence having at least 80% identity to TAGCATATGCTA (SEQ ID NO: 51); wherein [L1], [L2], [L3], and [Ln] are each selected from: a phosphate linkage and a spacer sequence of 1-56 nucleotides, and wherein x indicates the number of ([Dn]-[Ln]) units in the sequence and is an integer of 1-47
- x is 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17.
- [D1], [D2], [D3], and [Dn] are the same or different.
- the spacer sequence is 25-35 nucleotides in length.
- [D1]-[L1], [D2]-[L2], [D3]-[L3], and/or [Dn]-[Ln] have a length of about 20 to about 50 nucleotides.
- the spacer sequence comprises an AT-content of greater than 50%.
- the DNA binding polynucleotide comprises SEQ ID NO: 69, or a nucleotide sequence having at least about 70% identity to SEQ ID NO: 69. In some embodiments, the DNA binding polynucleotide comprises SEQ ID NO: 71, or a nucleotide sequence having at least about 70% identity to SEQ ID NO: 71.
- the DNA binding polynucleotide comprises a family of repeats (FR) of the EBV OriP or a portion of the FR, and wherein the recombinant expression vector lacks a dyad symmetry (DS) of the EBV OriP.
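As a non-limiting sketch, the mismatch criterion for a DBE, the AT-content of a spacer, and the repeat formula 5′-[D1]-[L1]-[D2]-[L2]-[D3]-[L3]-([Dn]-[Ln])x-3′ recited above could be expressed as follows. The spacer used here is a hypothetical 30-nucleotide placeholder and is not a sequence from the disclosure; the function names are likewise hypothetical, and the sketch assumes every unit uses the same DBE and the same spacer, although the text allows them to differ.

```python
EBV_DBE = "TAGCATATGCTA"  # SEQ ID NO: 51, as given in the text above

# Hypothetical placeholder spacer (30 nt, AT-rich); NOT taken from the disclosure.
EXAMPLE_SPACER = "AT" * 9 + "GCGC" + "TA" * 4


def mismatches(candidate: str, reference: str = EBV_DBE) -> int:
    """Count mismatched positions between a candidate DBE and the reference."""
    if len(candidate) != len(reference):
        raise ValueError("candidate and reference must have equal length")
    return sum(1 for x, y in zip(candidate, reference) if x != y)


def at_content(seq: str) -> float:
    """Fraction of A and T bases in a sequence (0.0 for an empty sequence)."""
    if not seq:
        return 0.0
    return sum(1 for base in seq if base in "AT") / len(seq)


def build_dbe_array(x: int, dbe: str = EBV_DBE, spacer: str = EXAMPLE_SPACER) -> str:
    """Build [D1]-[L1]-[D2]-[L2]-[D3]-[L3]-([Dn]-[Ln])x as one 5'-to-3' string.

    An empty spacer stands in for a plain phosphate linkage (contiguous DBEs).
    """
    if not 1 <= x <= 47:
        raise ValueError("x must be an integer of 1-47")
    units = 3 + x  # three fixed units plus x repeated units
    return (dbe + spacer) * units
```

With x = 1 the array contains the minimum of four DBEs; each DBE-plus-spacer unit in this example is 42 nucleotides long, inside the recited range of about 20 to about 50 nucleotides.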
- the disclosure provides a system comprising (i) an mRNA described herein comprising an ORF encoding a DNA binding protein comprising (a) a chromatin-binding domain, and (b) a DBD of an EBNA1 homolog, wherein the EBNA1 homolog is of an NHP LCV, wherein (i)(a) and (i)(b) are operably-linked, and (ii) a recombinant expression vector comprising (a) a transgene, and (b) a DNA binding polynucleotide, wherein the DNA binding polynucleotide comprises a DBE of an NHP LCV, or a variant thereof, wherein the DBE is (I) CGCCAACAAACGTTG (SEQ ID NO: 317), a nucleotide sequence having 1, 2, 3, or 4 mismatches relative to SEQ ID NO: 317, or a nucleotide sequence having at least 80% identity to SEQ ID
- the DNA binding polynucleotide comprises at least 4 DBEs, and wherein the DBEs are the same or different.
- the DNA binding polynucleotide comprises 4 to 50, 4 to 40, 4 to 30, 4 to 20, 4 to 10, or 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, or 4 DBEs.
- the at least 4 DBEs are contiguous.
- the at least 4 DBEs are operably linked via a spacer sequence.
- the spacer sequence is about 1-56 nucleotides in length. In some embodiments, the spacer sequence is 25-35 nucleotides in length.
- the spacer sequence comprises an AT-content of at least about 50%.
- each DBE comprises a 3′ spacer sequence, and wherein the length of the DBE and the 3′ spacer sequence is about 20-50 nucleotides.
- the DNA binding polynucleotide comprises a sequence represented by the formula 5′-[D1]-[L1]-[D2]-[L2]-[D3]-[L3]-([Dn]-[Ln])x-3′, wherein [D1], [D2], [D3], and [Dn] each comprise the DBE or the variant thereof; wherein [L1], [L2], [L3], and [Ln] are each selected from: a phosphate linkage and a spacer sequence of 1-56 nucleotides, and wherein x indicates the number of ([Dn]-[Ln]) units in the sequence and is an integer of 1-47, and wherein (ii)(a) and (b) are operably-linked.
- x is 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or 17.
- [D1], [D2], [D3], and [Dn] are the same or different.
- the spacer sequence is 25-35 nucleotides in length.
- [D1]-[L1], [D2]-[L2], [D3]-[L3], and/or [Dn]-[Ln] have a length of about 20 to about 50 nucleotides.
- the spacer sequence comprises an AT-content of greater than 50%.
- the DNA binding polynucleotide comprises a nucleotide sequence having at least about 70% identity to SEQ ID NO: 316. In some embodiments, the DNA binding polynucleotide comprises SEQ ID NO: 316. [00715] In some embodiments, the DBD comprises or consists of an amino acid sequence having at least about 90%, about 95%, about 98%, about 99% identity to SEQ ID NO: 322. In some embodiments, the DBD comprises or consists of SEQ ID NO: 322.
- the ORF comprises or consists of a nucleotide sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 323. In some embodiments, the ORF comprises or consists of SEQ ID NO: 323. [00716] In some embodiments, the chromatin binding domain is described herein. In some embodiments, the DNA binding protein comprises from N-terminus to C-terminus: the chromatin binding domain and the DBD. In some embodiments, the DNA binding protein comprises from N-terminus to C-terminus: the DBD and the chromatin binding domain.
- the DNA binding protein further comprises an NLS described herein (e.g., an EBNA1 NLS, a c-myc NLS, a nucleoplasmin NLS, an SV40 NLS).
- the NLS is positioned at the N-terminus, the C-terminus, or at an internal region of the DNA binding protein.
- the DNA binding protein comprises an amino acid sequence having at least about 90%, about 95%, about 98%, about 99% identity to SEQ ID NO: 311. In some embodiments, the DNA binding protein comprises SEQ ID NO: 311.
- the ORF comprises a nucleotide sequence having at least about 80%, about 85%, about 90%, about 95%, about 98%, or about 99% identity to SEQ ID NO: 313. In some embodiments, the ORF comprises SEQ ID NO: 313. [00718] In some embodiments, the mRNA comprises from 5′ to 3′: a 5′ UTR described herein, the ORF encoding a DNA binding protein, and a 3′ UTR described herein. In some embodiments, the mRNA comprises from 5′ to 3′: a 5′ cap described herein, a 5′ UTR described herein, the ORF encoding a DNA binding protein, and a 3′ UTR described herein.
- the mRNA comprises one or more modifications.
- the mRNA comprises one or more modified uridines (e.g., a modified uridine selected from pseudouridine, N1-methylpseudouridine, and 5-methoxyuridine).
- the mRNA is fully modified with pseudouridine.
- the mRNA is fully modified with N1-methylpseudouridine.
- the mRNA is fully modified with 5-methoxyuridine.
- the recombinant expression vector is a DNA.
- the recombinant expression vector is a plasmid DNA.
- the recombinant expression vector comprises a promoter described herein operably linked to an open reading frame comprising the transgene and the DNA binding polynucleotide.
- the promoter is a CMV promoter (e.g., a human CMV promoter).
- the promoter is an EF1 promoter (e.g., a human EF1 promoter).
- the promoter is a cancer specific promoter.
- the promoter is a tumor specific promoter.
- the recombinant expression vector further comprises an enhancer.
- the enhancer is a CMV enhancer (e.g., a human CMV enhancer or a mouse CMV enhancer).
- the disclosure provides a composition comprising the system.
- the composition comprises a total amount of nucleic acid (i.e., mRNA + recombinant expression vector), wherein the mRNA is about 10% to about 20%, about 10% to about 30%, about 10% to about 40%, about 10% to about 50%, about 10% to about 60%, about 10% to about 70%, about 20% to about 30%, about 20% to about 40%, about 20% to about 50%, about 20% to about 60%, about 20% to about 70%, about 30% to about 40%, about 30% to about 50%, about 30% to about 60%, or about 30% to about 70% of the total nucleic acid.
- the mRNA is about 20%, about 30%, about 40%, about 45%, about 50%, about 55%, about 60%, about 65%, or about 70% of the total nucleic acid. In some embodiments, the mRNA is about 20% of the total nucleic acid. In some embodiments, the mRNA is about 30% of the total nucleic acid. In some embodiments, the mRNA is about 40% of the total nucleic acid. In some embodiments, the mRNA is about 50% of the total nucleic acid. In some embodiments, the mRNA is about 60% of the total nucleic acid. In some embodiments, the mRNA is about 70% of the total nucleic acid.
- the mRNA and the recombinant expression vector are present at a 1:1 w/w ratio.
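The relationship between component masses and the mRNA share of total nucleic acid described above can be sketched as follows. This is an illustration only: it assumes the percentages are computed by weight (consistent with the 1:1 w/w ratio recited above, though the basis for the other percentages is not stated), and the function names are hypothetical.

```python
def mrna_percent_of_total(mrna_mass: float, vector_mass: float) -> float:
    """Percent of total nucleic acid (mRNA + recombinant expression vector) that is mRNA.

    Assumption: both masses are in the same unit and percentages are by weight.
    """
    total = mrna_mass + vector_mass
    if total <= 0:
        raise ValueError("total nucleic acid mass must be positive")
    return 100.0 * mrna_mass / total


def masses_for_percent(total_mass: float, mrna_percent: float) -> tuple[float, float]:
    """Split a total nucleic acid mass into (mRNA mass, vector mass) for a target mRNA percent."""
    if not 0.0 <= mrna_percent <= 100.0:
        raise ValueError("mrna_percent must be between 0 and 100")
    mrna_mass = total_mass * mrna_percent / 100.0
    return mrna_mass, total_mass - mrna_mass
```

Under this assumption, a 1:1 w/w ratio corresponds to the mRNA being 50% of the total nucleic acid.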
- the composition comprises an LNP comprising the mRNA.
- the composition comprises an LNP comprising the recombinant expression vector.
- the composition comprises an LNP comprising the mRNA and the recombinant expression vector.
- the LNP comprises an ionizable amino lipid described herein, a phospholipid described herein, a structural lipid described herein, and a PEG-lipid described herein.
- expression of the at least one transgene is increased by at least about 5-fold, about 10-fold, about 20-fold, about 30-fold, about 40-fold, about 45-fold, about 50-fold, about 55-fold, about 60-fold, about 65-fold, about 70-fold, about 75-fold, about 80-fold, about 85-fold, about 90-fold, about 95-fold, about 100-fold, about 110-fold, about 120-fold, about 130-fold, about 140-fold, about 150-fold, about 160-fold, about 170-fold, about 180-fold, about 190-fold, or about 200-fold as compared to introducing a control system or composition to the cell (e.g., a system or composition comprising the recombinant expression vector alone).
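A fold increase of the kind recited above is the ratio of transgene expression with the system to expression with the control. The sketch below assumes both values are measured in the same units by the same assay; the function names and any input values are hypothetical and are not data from the disclosure.

```python
def fold_increase(system_expression: float, control_expression: float) -> float:
    """Ratio of expression with the system to expression with the control.

    Assumption: both measurements use the same assay and units,
    and the control signal is above zero.
    """
    if control_expression <= 0:
        raise ValueError("control expression must be positive")
    return system_expression / control_expression


def meets_fold_threshold(
    system_expression: float, control_expression: float, threshold: float
) -> bool:
    """True if the fold increase is at least the stated threshold (e.g., 5.0 for 5-fold)."""
    return fold_increase(system_expression, control_expression) >= threshold
```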
- compositions comprising a system, a system component (e.g., an mRNA or recombinant expression vector) described herein, or a delivery system described herein, and a pharmaceutically acceptable carrier.
- a pharmaceutically acceptable carrier may be a vehicle approved by a regulatory agency of the Federal or a state government or listed in the U.S. Pharmacopeia or other generally recognized pharmacopeia for use in mammals, such as humans.
- carrier refers to a diluent, adjuvant, or excipient with which a system, system component (e.g., an mRNA or recombinant expression vector), or delivery system described herein is formulated for administration to a subject (e.g., a mammal).
- lipids, e.g., liposomes, e.g., liposome dendrimers
- liquids such as water and oils, including those of petroleum, animal, vegetable or synthetic origin, such as peanut oil, soybean oil, mineral oil, sesame oil and the like, saline; gum acacia, gelatin, starch paste, talc, keratin, colloidal silica, urea, and the like.
- compositions may be formulated into preparations in solid, semi-solid, liquid or gaseous forms, such as tablets, capsules, powders, granules, ointments, solutions, suppositories, injections, inhalants, gels, microspheres, and aerosols.
- administration of the system, system component (e.g., an mRNA or recombinant expression vector), or delivery system described herein can be achieved in various ways, including oral, buccal, rectal, parenteral, intraperitoneal, intradermal, transdermal, intratracheal, etc., administration.
- compositions can include, depending on the formulation desired, pharmaceutically-acceptable, non-toxic carriers or diluents, which are defined as vehicles commonly used to formulate pharmaceutical compositions for animal or human administration.
- the diluent is selected so as not to affect the biological activity of the combination. Examples of such diluents are distilled water, buffered water, physiological saline, PBS, Ringer's solution, dextrose solution, and Hank's solution.
- the pharmaceutical composition or formulation can include other carriers, adjuvants, or non-toxic, nontherapeutic, nonimmunogenic stabilizers, excipients and the like.
- the compositions can also include additional substances to approximate physiological conditions, such as pH adjusting and buffering agents, toxicity adjusting agents, wetting agents and detergents.
- the composition can also include any of a variety of stabilizing agents, such as, for example, an antioxidant.
- the nucleic acids of a composition can also be complexed with molecules that enhance their in vivo attributes. Such molecules include, for example, carbohydrates, polyamines, amino acids, other peptides, ions (e.g., sodium, potassium, calcium, magnesium, manganese), and lipids.
- compositions can be administered for prophylactic and/or therapeutic treatments.
- Toxicity and therapeutic efficacy of the active ingredient can be determined according to standard pharmaceutical procedures in cell cultures and/or experimental animals, including, for example, determining the LD50 (the dose lethal to 50% of the population) and the ED50 (the dose therapeutically effective in 50% of the population).
- the dose ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the ratio LD50/ED50.
- compositions intended for in vivo use are usually sterile.
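The therapeutic index described above, expressed as the ratio LD50/ED50, can be computed as in this minimal sketch. The function name is hypothetical, and any dose values passed in are placeholders rather than data from the disclosure.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Therapeutic index as the ratio LD50/ED50.

    LD50: dose lethal to 50% of the population.
    ED50: dose therapeutically effective in 50% of the population.
    Both doses must be in the same units.
    """
    if ld50 < 0:
        raise ValueError("LD50 must not be negative")
    if ed50 <= 0:
        raise ValueError("ED50 must be positive")
    return ld50 / ed50
```

A larger ratio indicates a wider margin between the effective dose and the toxic dose.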
Landscapes
- Health & Medical Sciences (AREA)
- Genetics & Genomics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Chemical & Material Sciences (AREA)
- Organic Chemistry (AREA)
- Engineering & Computer Science (AREA)
- Wood Science & Technology (AREA)
- Biomedical Technology (AREA)
- Biotechnology (AREA)
- General Engineering & Computer Science (AREA)
- Zoology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Biophysics (AREA)
- Molecular Biology (AREA)
- Biochemistry (AREA)
- General Health & Medical Sciences (AREA)
- Microbiology (AREA)
- Plant Pathology (AREA)
- Physics & Mathematics (AREA)
- Virology (AREA)
- Gastroenterology & Hepatology (AREA)
- Medicinal Chemistry (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
- Peptides Or Proteins (AREA)
Abstract
The present disclosure provides a non-viral expression system for enhancing expression of a transgene in a cell, the system comprising an mRNA encoding a DNA binding protein comprising one or more chromatin-binding domains and a DNA binding domain (DBD) of an Epstein-Barr nuclear antigen 1 (EBNA1) polypeptide or of an EBNA1 homolog of a non-human primate (NHP) lymphocryptovirus (LCV), and a recombinant expression vector comprising a transgene and a polynucleotide comprising at least one DNA binding element (DBE) of an Epstein-Barr virus (EBV) origin of replication (OriP) or a DBE of an NHP LCV. The disclosure further provides delivery vehicles comprising the non-viral expression systems described herein, and in vitro, ex vivo, and in vivo methods for increasing expression of the transgene in a cell using the system and the delivery systems. The disclosure also provides methods of selectively expressing a transgene in a target tissue and/or a target cell population using the systems and delivery systems described herein.
Applications Claiming Priority (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US202363479531P | 2023-01-11 | 2023-01-11 | |
| US202363580677P | 2023-09-05 | 2023-09-05 | |
| US202363599484P | 2023-11-15 | 2023-11-15 | |
| PCT/US2024/011286 WO2024151877A2 (fr) | 2023-01-11 | 2024-01-11 | Systèmes d'expression non virale et leurs procédés d'utilisation |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| EP4649087A2 (fr) | 2025-11-19 |
Family
ID=89977949
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| EP24705886.0A Pending EP4649087A2 (fr) | 2023-01-11 | 2024-01-11 | Systèmes d'expression non virale et leurs procédés d'utilisation |
Country Status (2)
| Country | Link |
|---|---|
| EP (1) | EP4649087A2 (fr) |
| WO (1) | WO2024151877A2 (fr) |
Families Citing this family (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2025233870A1 (fr) * | 2024-05-08 | 2025-11-13 | Engage Biologics Inc. | Systèmes d'expression d'adn non viral à immunogénicité réduite et procédés d'utilisation associés |
Family Cites Families (16)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5013556A (en) | 1989-10-20 | 1991-05-07 | Liposome Technology, Inc. | Liposomes with enhanced circulation time |
| US6027726A (en) | 1994-09-30 | 2000-02-22 | Inex Phamaceuticals Corp. | Glycosylated protein-liposome conjugates and methods for their preparation |
| US6417002B1 (en) * | 1999-02-11 | 2002-07-09 | Pharmacopeia, Inc. | Method for maintenance and selection of episomes |
| AU2002221731A1 (en) * | 2000-10-23 | 2002-05-06 | Mermaid Pharmaceuticals Gmbh | Method for generating transgenic fish embryos using an episomal vector system |
| CA2429814C (fr) | 2000-12-01 | 2014-02-18 | Thomas Tuschl | Petites molecules d'arn mediant l'interference arn |
| US20050222064A1 (en) | 2002-02-20 | 2005-10-06 | Sirna Therapeutics, Inc. | Polycationic compositions for cellular delivery of polynucleotides |
| EP1519714B1 (fr) | 2002-06-28 | 2010-10-20 | Protiva Biotherapeutics Inc. | Appareil liposomal et procedes de fabrication |
| DE10328289B3 (de) | 2003-06-23 | 2005-01-05 | Enginion Ag | Arbeitsmedium für Dampfkreisprozesse |
| CN101267805A (zh) | 2005-07-27 | 2008-09-17 | 普洛体维生物治疗公司 | 制造脂质体的系统和方法 |
| US8691750B2 (en) | 2011-05-17 | 2014-04-08 | Axolabs Gmbh | Lipids and compositions for intracellular delivery of biologically active compounds |
| WO2013086354A1 (fr) | 2011-12-07 | 2013-06-13 | Alnylam Pharmaceuticals, Inc. | Lipides biodégradables pour l'administration d'agents actifs |
| WO2013116126A1 (fr) | 2012-02-01 | 2013-08-08 | Merck Sharp & Dohme Corp. | Nouveaux lipides cationiques biodégradables de faible masse moléculaire pour la délivrance d'oligonucléotides |
| WO2017181107A2 (fr) | 2016-04-16 | 2017-10-19 | Ohio State Innovation Foundation | Arnm de cpf1 modifié, arn-guide modifié et leurs utilisations |
| BR112019015797A2 (pt) | 2017-02-01 | 2020-03-17 | Modernatx, Inc. | Composições de mrna terapêuticas imunomoduladoras que codificam peptídeos de mutação de oncogene de ativação |
| JP7554670B2 (ja) * | 2017-10-11 | 2024-09-20 | フェイト セラピューティクス,インコーポレイテッド | 一時的かつ一過性プラスミドベクター発現システムを用いる細胞のリプログラミング |
| JP7556848B2 (ja) | 2018-09-14 | 2024-09-26 | モデルナティエックス インコーポレイテッド | mRNA治療薬を使用したがんを治療するための方法及び組成物 |
2024
- 2024-01-11 EP EP24705886.0A patent/EP4649087A2/fr active Pending
- 2024-01-11 WO PCT/US2024/011286 patent/WO2024151877A2/fr not_active Ceased
Also Published As
| Publication number | Publication date |
|---|---|
| WO2024151877A2 (fr) | 2024-07-18 |
| WO2024151877A3 (fr) | 2024-08-22 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| AU2021204763B2 (en) | Compounds and compositions for intracellular delivery of therapeutic agents | |
| KR102858623B1 (ko) | 신규 인공 핵산 분자 | |
| JP2021138741A (ja) | 適応免疫応答を誘導するためのヌクレオシド修飾rna | |
| CA3055653A1 (fr) | Formulation de nanoparticules lipidiques | |
| WO2021262909A2 (fr) | Compositions de lnp comprenant des agents thérapeutiques à base d'arnm à demi-vie prolongée | |
| US20230248818A1 (en) | Nucleoside-modified RNA for Inducing an Immune Response Against SARS-CoV-2 | |
| US11725035B2 (en) | Methods of treating a disorder associated with with insufficient stimulator of interferon genes (STING) activity | |
| Leblanc et al. | Analysis of the interactions between HIV-1 and the cellular prion protein in a human cell line | |
| US20140187752A1 (en) | Antigenic compositions and use of same in the targeted delivery of nucleic acids | |
| JP2024019460A (ja) | C型肝炎ウイルスに対するヌクレオシド修飾mRNA-脂質ナノ粒子系統ワクチン | |
| WO2024151877A2 (fr) | Systèmes d'expression non virale et leurs procédés d'utilisation | |
| EP3965830A1 (fr) | Microarn de cellules immunitaires exprimés de manière différentielle pour la régulation de l'expression de protéines | |
| WO2023183550A2 (fr) | Acides ribonucléiques messagers à demi-vie allongée | |
| EP4479085A1 (fr) | Arnm codant pour des vaccins anticancéreux contre les points de contrôle et leurs utilisations | |
| CN117487857A (zh) | 一种增强抗原Survivin和/或人表皮生长因子受体2的修饰核苷酸组合及其应用 | |
| KR20240009952A (ko) | 인플루엔자 바이러스 핵산 지질 입자 백신 | |
| WO2025233870A1 (fr) | Systèmes d'expression d'adn non viral à immunogénicité réduite et procédés d'utilisation associés | |
| EP4735033A2 (fr) | Tat codé par arnm à cytotoxicité atténuée pour inversion de latence du vih et du siv | |
| US20230330218A1 (en) | Hepatitis c virus modified e2 glycoprotein and uses thereof as vaccines | |
| WO2024178413A1 (fr) | Compositions de nanoparticules et leurs utilisations pour la réactivation du vih latent | |
| EP4615959A1 (fr) | Nouveaux marqueurs de dégradation inductibles par les médicaments | |
| CN117083383A (zh) | 经修饰的trem的组合物及其用途 | |
| KR20210121767A (ko) | 스플라이스좀 관련 단백질인 ik의 용도 | |
| KR20180095694A (ko) | 핵산 올리고머 및 이의 용도 | |
| HK1148310A (en) | Antigenic compositions and use of same in the targeted delivery of nucleic acids |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: UNKNOWN |
| | STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE |
| | PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase | Free format text: ORIGINAL CODE: 0009012 |
| | STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE |
| | 17P | Request for examination filed | Effective date: 20250806 |
| | AK | Designated contracting states | Kind code of ref document: A2 Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC ME MK MT NL NO PL PT RO RS SE SI SK SM TR |
| | DAV | Request for validation of the european patent (deleted) | |
| | DAX | Request for extension of the european patent (deleted) | |