OA21931A

OA21931A - Antisense compounds and methods for targeting CUG repeats.

Info

Publication number: OA21931A
Application number: OA1202300508
Authority: OA
Inventors: Hugh Durrant-Whyte; Xiulong SHEN; Alistair GILLESPIE; Ziqing QIAN
Original assignee: Entrada Therapeutics, Inc
Priority date: 2021-06-23
Filing date: 2022-06-22
Publication date: 2025-07-14

Abstract

Compounds comprising a cyclic peptide, such as a cyclic cell penetrating peptide, and an antisense compound are provided. The antisense compound binds to a gene having an expanded CTG repeat or a gene transcript having an expanded CUG repeat. The compounds can be delivered to subjects to treat diseases associated with expanded CTG-CUG repeats, such as myotonic dystrophy type 1 (DM1), spinocerebellar ataxia-8 (SCA8), and Huntington disease like-2 (HDL2).

Description

ANTISENSE COMPOUNDS AND METHODS FOR TARGETING CUG REPEATS

CROSS REFERENCE TO RELATED APPLICATIONS

This application claims the benefit of U.S. Provisional Application Serial Nos. 63/213,900, filed on June 23, 2021; 63/239,847, filed on September 1, 2021; 63/290,892, filed on December 17, 2021; 63/305,071, filed on January 31, 2022; 63/314,369, filed on February 26, 2022; 63/316,634, filed on March 4, 2022; 63/317,856, filed on March 8, 2022; 63/326,201, filed on March 31, 2022; 63/327,179, filed on April 4, 2022; 63/339,250, filed on May 6, 2022; 63/362,295, filed on March 31, 2022; 63/239,671, filed on September 1, 2021; 63/290,960, filed on December 17, 2021; 63/298,565, filed on January 11, 2022; and 63/268,577, filed on February 25, 2022.

FIELD

The present disclosure relates to compounds, compositions, and methods for modulating the activity and/or levels of genes that include expanded nucleotide repeats, in particular expanded CTG-CUG repeats. The compounds and compositions containing the same may be used to treat diseases associated with genes that include an expanded nucleotide repeat, in particular expanded CTG-CUG repeats.

INTRODUCTION

Several diseases are associated with genes that have expanded nucleotide repeats, that is, a greater number of nucleotide repeats than is observed in a healthy phenotype. The expanded repeat may cause aggregation and/or nucleation of the expanded repeat containing transcript and/or cause nucleation of proteins that bind to the expanded repeat containing transcript. The expanded repeats may result in some proteins, such as pre-mRNA processing proteins, being sequestered on the repeat, thus inhibiting the proteins from performing their normal functions, such as processing pre-mRNA transcripts of other genes that do not contain the expanded repeat.

There are several diseases associated with genes having expanded CTG-CUG trinucleotide repeats (CTG refers to the DNA repeat and CUG refers to the corresponding RNA repeat that occurs upon transcription). Diseases associated with genes having expanded CTG-CUG trinucleotide repeats include, but are not limited to, myotonic dystrophy type 1 (DM1), Spinocerebellar Ataxia-8 (SCA8), Huntington’s disease like-2 (HDL2), and Fuchs' endothelial corneal dystrophy (FECD).

Myotonic dystrophy type 1 (DM1), the most common cause of muscular dystrophy in adults, affecting 1 in 8500 individuals worldwide, is associated with a gene that has an expanded trinucleotide repeat (Lee and Cooper. (2009) “Pathogenic mechanisms of myotonic dystrophy,” Biochem Soc Trans. 37(06): 10.1042/BST0371281). DM1 is a disorder that affects skeletal and smooth muscle, as well as the eye, heart, endocrine system, and central nervous system. DM1 is caused by abnormal expansion of a CTG trinucleotide repeat in the non-coding region of the gene encoding Dystrophia Myotonica Protein Kinase (DMPK). The CTG expansion lies within a region corresponding to the 3' untranslated region (3'-UTR) of the DMPK mRNA. Whereas the DMPK gene in healthy individuals contains between 5 and 40 CTG trinucleotide repeats, patients with DM1 have from 50 up to several thousand CTG trinucleotide repeats. CTG trinucleotide repeat expansion results in global deregulation of gene expression in affected individuals due to nucleation of some regulatory RNA-binding proteins on the CUG expansion in the 3'-UTR, rendering the RNA-binding proteins, such as muscleblind-like proteins (MBNL1-3), unable to perform their normal cellular function. The nucleated RNA-binding proteins are not available to bind and affect translation of other mRNA transcripts. These CUG-expanded mRNA-protein aggregates form distinct nuclear foci. The activity of additional splicing factors, such as CUGBP Elav-like family member 1 (CELF1), is also disrupted, leading to the mis-splicing of a large number of downstream gene transcripts associated with symptoms of DM1. Disease severity increases and age of onset decreases with an increasing number of repeats (Pettersson et al. (2015) “Molecular mechanisms in DM1 — a focus on foci.” Nucleic Acids Res. 43(4):2433-2441).
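The repeat-length ranges stated above can be expressed as a simple lookup. The following is a minimal illustrative sketch only: the function name and the returned labels are hypothetical, and repeat counts that fall outside the two stated ranges (fewer than 5, or 41 to 49) are reported as not characterized because the text does not assign them to either group.

```python
def classify_dmpk_ctg_repeats(repeat_count: int) -> str:
    """Classify a DMPK CTG repeat count using only the ranges stated in the text.

    The labels are illustrative placeholders, not clinical categories.
    """
    if repeat_count < 0:
        raise ValueError("repeat_count must be non-negative")
    if 5 <= repeat_count <= 40:
        # Range described for the DMPK gene in healthy individuals.
        return "healthy range"
    if repeat_count >= 50:
        # Range described for patients with DM1 (50 up to several thousand).
        return "DM1 range"
    # Counts below 5 or between 41 and 49 are not characterized in the text.
    return "not characterized"


if __name__ == "__main__":
    for n in (12, 45, 50, 2000):
        print(n, classify_dmpk_ctg_repeats(n))
```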

The CUG trinucleotide repeats in the 3' untranslated region of DMPK mRNA form imperfect stable hairpin structures that accumulate in the cell nucleus in small ribonuclear complexes or microscopically visible inclusions, and impair the function of proteins implicated in transcription, splicing or RNA export. Although DMPK genes with expanded CTG repeats are transcribed into mRNA, the mutant transcripts are sequestered in the nucleus as aggregates (foci), which results in a decrease in cytoplasmic DMPK mRNA levels. These aggregations lead to the deregulation of the alternative splicing of many different transcripts due to sequestration of two RNA-binding proteins: MBNL1 (muscleblind-like 1) and CUGBP1 (CUG-binding protein 1), resulting in loss of function of MBNL1 and upregulation of CUGBP1 (Lee and Cooper. (2009) “Pathogenic mechanisms of myotonic dystrophy,” Biochem Soc Trans. 37(06): 10.1042/BST0371281). MBNL1 and CUGBP-ETR-3 like factor 1 (CELF1) are developmental regulators of splicing events during the fetal to adult transition, and modification of their activities in DM1 leads to expression of a fetal splicing pattern in adult tissues. The downstream impact of decreased MBNL1 and increased CELF1 levels includes disruption of alternative splicing, mRNA translation and mRNA decay for transcripts such as cardiac troponin T (cTNT), insulin receptor (INSR), muscle-specific chloride ion channel (CLCN1) and sarcoplasmic/endoplasmic reticulum calcium ATPase 1 (ATP2A1), in addition to MBNL1 (Konieczny et al. (2017) “Myotonic dystrophy: candidate small molecule therapeutics.” Drug Discov Today. 22(11): 1740-174).

Possible therapeutic approaches to treat DM1, or other diseases associated with expanded CTG-CUG repeats, include the use of therapeutic oligonucleotide containing compounds. However, a major problem associated with the use of oligonucleotide compounds in therapeutics is their limited ability to gain access to the intracellular compartment when administered systemically. Intracellular delivery of oligonucleotide compounds can be facilitated by use of carrier systems such as polymers, cationic liposomes or by chemical modification of the construct, for example by the covalent attachment of cholesterol molecules. However, intracellular delivery efficiency of oligonucleotide compounds remains low. Improved delivery systems are still required to increase the potency of these compounds. There is an unmet need for effective compositions to deliver therapeutic oligonucleotide compounds to intracellular compartments to treat diseases that are caused by expanded CTG-CUG repeats, such as DM1.

SUMMARY

Compounds, compositions, and methods for treating a disease associated with an expanded CTG-CUG repeat are described herein. In embodiments, this disclosure relates to compounds that include an antisense compound (AC) and a cyclic peptide, such as a cyclic cell penetrating peptide (cCPP). In embodiments, the AC binds to a gene or gene transcript comprising an expanded CUG repeat. In embodiments, the cyclic peptide facilitates intracellular localization of the AC. The compounds may comprise an endosomal escape vehicle (EEV). The EEV may comprise the cyclic peptide and an exocyclic peptide.

In embodiments, provided herein is a compound comprising: (a) at least one cyclic peptide and (b) an antisense compound (AC) that is complementary to a target nucleotide. In embodiments, the target nucleotide comprises at least one expanded CUG or CTG repeat. In embodiments, the target nucleotide is a gene that comprises at least one expanded CTG repeat. In embodiments, the target nucleotide is RNA that comprises at least one expanded CUG repeat. In embodiments, the RNA that comprises at least one expanded CUG repeat is a pre-mRNA sequence. In embodiments, the expanded CUG repeat corresponds to an expanded CTG repeat in a gene from which the pre-mRNA is transcribed. In embodiments, the antisense compound binds to the expanded CTG repeat or the expanded CUG repeat. In embodiments, the AC comprises 5-40 CAG repeats (e.g., 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, or 40 repeats). In embodiments, the AC comprises a sequence of a nucleotide listed in Table 2, Table 10, or Table 11. In embodiments, the AC comprises a sequence of a nucleotide listed in Table 2.
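The complementarity between a CAG-repeat antisense sequence and a CUG-repeat target can be checked with a few lines of code. The sketch below is illustrative only: it treats both strands as plain RNA-alphabet strings, ignores backbone chemistry, and uses hypothetical function names that do not come from the disclosure.

```python
# Watson-Crick pairing for an RNA-alphabet string.
RNA_COMPLEMENT = {"A": "U", "U": "A", "C": "G", "G": "C"}


def reverse_complement_rna(seq: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(seq))


def cag_repeat_antisense(n_repeats: int) -> str:
    """Build a (CAG)n sequence, restricted to the 5-40 repeats stated in the text."""
    if not 5 <= n_repeats <= 40:
        raise ValueError("n_repeats must be between 5 and 40")
    return "CAG" * n_repeats


def pairs_with_cug_repeat(antisense: str, n_repeats: int) -> bool:
    """True if `antisense` is the exact reverse complement of (CUG)n."""
    target = "CUG" * n_repeats
    return reverse_complement_rna(target) == antisense


if __name__ == "__main__":
    ac = cag_repeat_antisense(7)
    print(ac, pairs_with_cug_repeat(ac, 7))
```

Because the reverse complement of CUG is CAG, a run of n CAG units pairs in register with a run of n CUG units.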

In embodiments, the AC comprises at least one modified nucleotide or nucleic acid selected from a phosphorothioate (PS) nucleotide, a phosphorodiamidate morpholino (PMO) nucleotide, a locked nucleic acid (LNA), a peptide nucleic acid (PNA), a nucleotide comprising a 2’-O-methyl (2’-OMe) modified backbone, a 2’-O-methoxy-ethyl (2’-MOE) nucleotide, a 2',4' constrained ethyl (cEt) nucleotide, a 2'-deoxy-2'-fluoro-beta-D-arabinonucleic acid (2'F-ANA), and combinations thereof. In embodiments, the AC comprises a PMO nucleotide.

In embodiments, compounds are provided that include a cyclic peptide having 6 to 12 amino acids, wherein at least two amino acids of the cyclic peptide are charged amino acids, at least two amino acids of the cyclic peptide are aromatic hydrophobic amino acids and at least two amino acids of the cyclic peptide are uncharged, non-aromatic amino acids. In embodiments, the antisense compound (AC) is complementary to at least a portion of an expanded CUG repeat in a target mRNA sequence. In embodiments, the AC is a phosphorodiamidate morpholino (PMO) nucleotide.

In embodiments, at least two charged amino acids of the cyclic peptide are arginine. In embodiments, at least two aromatic, hydrophobic amino acids of the cyclic peptide are phenylalanine, naphthylalanine (3-Naphth-2-yl-alanine), or a combination thereof. In embodiments, at least two uncharged, non-aromatic amino acids of the cyclic peptide are citrulline, glycine, or a combination thereof. In embodiments, the compound is a cyclic peptide having 6 to 12 amino acids wherein two amino acids of the cyclic peptide are arginine, at least two amino acids are aromatic, hydrophobic amino acids selected from phenylalanine, naphthylalanine, and combinations thereof, and at least two amino acids are uncharged, non-aromatic amino acids selected from citrulline, glycine, and combinations thereof.
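The composition criteria above (6 to 12 amino acids, with at least two residues in each of three classes) lend themselves to a simple check. The following is a hedged sketch, not a method from the disclosure: the three-letter residue codes, the function name, and both example peptides are hypothetical, and only the residues named in the text are assigned to a class.

```python
# Residue classes limited to the amino acids named in the text.
CHARGED = {"Arg"}
AROMATIC_HYDROPHOBIC = {"Phe", "Nal"}      # Nal = naphthylalanine
UNCHARGED_NON_AROMATIC = {"Cit", "Gly"}    # Cit = citrulline


def meets_composition_criteria(residues: list[str]) -> bool:
    """Check ring size (6-12) and at least two residues from each class."""
    if not 6 <= len(residues) <= 12:
        return False
    n_charged = sum(r in CHARGED for r in residues)
    n_aromatic = sum(r in AROMATIC_HYDROPHOBIC for r in residues)
    n_uncharged = sum(r in UNCHARGED_NON_AROMATIC for r in residues)
    return n_charged >= 2 and n_aromatic >= 2 and n_uncharged >= 2


if __name__ == "__main__":
    # Hypothetical residue lists chosen only to exercise the check.
    print(meets_composition_criteria(["Phe", "Nal", "Arg", "Arg", "Cit", "Gly"]))
    print(meets_composition_criteria(["Phe", "Nal", "Arg", "Arg", "Arg"]))
```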

In embodiments, the compound comprises an endosomal escape vehicle comprising a cyclic peptide and an exocyclic peptide (EP). In embodiments, the EP is conjugated to a linker at an amino group. The linker may be a linker as described herein. In embodiments, the EP is conjugated to the cyclic peptide via the linker. In embodiments, the EP is conjugated to the AC via the linker. In embodiments, the EP is conjugated to the linker that conjugates the AC to the cyclic peptide.

In embodiments, the EP comprises from 2 to 10 amino acids. In embodiments, the EP comprises from 4 to 8 amino acid residues. In embodiments, the EP comprises 1 or 2 amino acids comprising a side chain comprising a guanidine group, or a protonated form thereof. In embodiments, the EP comprises 1, 2, 3, or 4 lysine residues. In embodiments, the amino group on the side chain of each lysine residue is substituted with a trifluoroacetyl (-COCF3) group, allyloxycarbonyl (Alloc), 1-(4,4-dimethyl-2,6-dioxocyclohexylidene)ethyl (Dde), or (4,4-dimethyl-2,6-dioxocyclohex-1-ylidene-3)-methylbutyl (ivDde) group. In embodiments, the EP comprises at least 2 amino acid residues with a hydrophobic side chain. In embodiments, the amino acid residue with a hydrophobic side chain is selected from valine, proline, alanine, leucine, isoleucine, and methionine. In embodiments, the exocyclic peptide comprises one of the following sequences: PKKKRKV; KR; RR; KKK; KGK; KBK; KBR; KRK; KRR; RKK; RRR; KKKK; KKRK; KRKK; KRRK; RKKR; RRRR; KGKK; KKGK; KKKKK; KKKRK; KBKBK; KKKRKV; PGKKRKV; PKGKRKV; PKKGRKV; PKKKGKV; PKKKRGV; or PKKKRKG. In embodiments, the exocyclic peptide consists of one of the following sequences: PKKKRKV; KR; RR; KKK; KGK; KBK; KBR; KRK; KRR; RKK; RRR; KKKK; KKRK; KRKK; KRRK; RKKR; RRRR; KGKK; KKGK; KKKKK; KKKRK; KBKBK; KKKRKV; PGKKRKV; PKGKRKV; PKKGRKV; PKKKGKV; PKKKRGV; or PKKKRKG. In embodiments, the exocyclic peptide has the structure: Ac-P-K-K-K-R-K-V-.
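The distinction between an exocyclic peptide that "comprises" a listed sequence and one that "consists of" a listed sequence can be illustrated with a small lookup. This is a minimal sketch under stated assumptions: sequences are handled as one-letter strings exactly as written in the list, and the function names are hypothetical.

```python
# Exocyclic peptide sequences exactly as listed in the text (one-letter codes).
LISTED_EP_SEQUENCES = {
    "PKKKRKV", "KR", "RR", "KKK", "KGK", "KBK", "KBR", "KRK", "KRR", "RKK",
    "RRR", "KKKK", "KKRK", "KRKK", "KRRK", "RKKR", "RRRR", "KGKK", "KKGK",
    "KKKKK", "KKKRK", "KBKBK", "KKKRKV", "PGKKRKV", "PKGKRKV", "PKKGRKV",
    "PKKKGKV", "PKKKRGV", "PKKKRKG",
}


def consists_of_listed_sequence(ep: str) -> bool:
    """True if the EP is exactly one of the listed sequences."""
    return ep in LISTED_EP_SEQUENCES


def comprises_listed_sequence(ep: str) -> bool:
    """True if the EP contains at least one listed sequence as a substring."""
    return any(listed in ep for listed in LISTED_EP_SEQUENCES)


def count_lysines(ep: str) -> int:
    """Count lysine (K) residues in a one-letter sequence."""
    return ep.count("K")


if __name__ == "__main__":
    ep = "PKKKRKV"
    print(consists_of_listed_sequence(ep), len(ep), count_lysines(ep))
```

For PKKKRKV, the length is 7 and the lysine count is 4, which is within the 2 to 10 amino acid and 1 to 4 lysine ranges stated above.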

In embodiments, the cyclic peptide comprises 4 to 12 amino acids. In embodiments, the cyclic peptide comprises 6 to 12 amino acids. In embodiments, at least two amino acids of the cyclic peptide are charged amino acids, at least two amino acids of the cyclic peptide are aromatic hydrophobic amino acids and at least two amino acids of the cyclic peptide are uncharged, non-aromatic amino acids. In embodiments, at least two charged amino acids of the cyclic peptide are arginine, at least two aromatic hydrophobic amino acids of the cyclic peptide are phenylalanine, naphthylalanine, or combinations thereof, and at least two uncharged, non-aromatic amino acids are citrulline, glycine, or combinations thereof.

In embodiments, the cyclic peptide has 4 to 12 amino acids, wherein at least two amino acids are arginine and at least two amino acids comprise a hydrophobic side chain, provided that the cyclic peptide is not a cyclic peptide having a sequence of SEQ ID NO: 89-117. In embodiments, the cyclic peptide is not a cyclic peptide having a sequence of SEQ ID NO: 89-117.

CPP sequences and SEQ ID NOs
FΦRRRQ	89	RRFRΦRQ	99	FΦRRRRQK	109
FΦRRRC	90	FRRRRΦQ	100	FΦRRRRQC	110
FΦRRRU	91	rRFRΦRQ	101	fΦRrRrRQ	111
RRRΦFQ	92	RRΦFRRQ	102	FΦRRRRRQ	112
RRRRΦF	93	CRRRRFWQ	103	RRRRΦFDΩC	113
FΦRRRR	94	FfΦRrRrQ	104	FΦRRR	114
FφrRrRq	95	FFΦRRRRQ	105	FWRRR	115
FφrRrRQ	96	RFRFRΦRQ	106	RRRΦF	116
FΦRRRRQ	97	URRRRFWQ	107	RRRWF	117
fΦRrRrQ	98	CRRRRFWQ	108

where F is L-phenylalanine, f is D-phenylalanine, Φ is L-3-(2-naphthyl)-alanine, φ is D-3-(2-naphthyl)-alanine, R is L-arginine, r is D-arginine, Q is L-glutamine, q is D-glutamine, C is L-cysteine, U is L-selenocysteine, W is L-tryptophan, K is L-lysine, D is L-aspartic acid, and Ω is L-norleucine.

In embodiments, the cyclic peptide has the following structure:

or a protonated form thereof, wherein:

R1, R2, and R3 are each independently H or an aromatic or heteroaromatic side chain of an amino acid;

at least one of R1, R2, and R3 is an aromatic or heteroaromatic side chain of an amino acid;

R4, R5, R6, R7 are independently H or an amino acid side chain;

at least one of R4, R5, R6, R7 is the side chain of 3-guanidino-2-aminopropionic acid, 4-guanidino-2-aminobutanoic acid, arginine, homoarginine, N-methylarginine, N,N-dimethylarginine, 2,3-diaminopropionic acid, 2,4-diaminobutanoic acid, lysine, N-methyllysine, N,N-dimethyllysine, N-ethyllysine, N,N,N-trimethyllysine, 4-guanidinophenylalanine, citrulline, N,N-dimethyllysine, β-homoarginine, 3-(1-piperidinyl)alanine;

AAsc is an amino acid side chain to which the antisense compound is conjugated; and

q is 1, 2, 3 or 4.

In embodiments, at least one of R4, R5, R6, R7 is independently an uncharged, non-aromatic side chain of an amino acid. In embodiments, at least one of R4, R5, R6, R7 is independently H or a side chain of citrulline.

In embodiments, the cyclic peptide has the structure of Formula I:

(I), or a protonated form thereof, wherein:

R1, R2, and R3 are each independently H or an amino acid residue having a side chain comprising an aromatic group;

at least one of R1, R2, and R3 is an aromatic or heteroaromatic side chain of an amino acid;

R4 and R7 are independently H or an amino acid side chain;

AAsc is an amino acid side chain to which the antisense compound is conjugated;

q is 1, 2, 3 or 4; and each m is independently an integer of 0, 1, 2, or 3.

In embodiments, the cyclic peptide of Formula (I) has one of the following structures:

(I-a), or a protonated form thereof;

or a protonated form thereof;

or a protonated form thereof.

In embodiments, the cyclic peptide of Formula (I) has the following structure:

or a protonated form thereof.

In embodiments, the compound has a structure of Formula C:

(C), or a protonated form or salt thereof, wherein:

R1, R2, and R3 are each independently H or a side chain comprising an aryl or heteroaryl group, wherein at least one of R1, R2, and R3 is a side chain comprising an aryl or heteroaryl group;

R4 and R7 are independently H or an amino acid side chain;

EP is the exocyclic peptide;

each m is independently an integer from 0-3;

n is an integer from 0-2;

x’ is an integer from 1-23;

y is an integer from 1-5;

q is an integer from 1-4;

z’ is an integer from 1-23; and

Cargo is the antisense compound.

In embodiments, the compound has one of the following structures:

or a protonated form or salt thereof, wherein EP is the exocyclic peptide, and oligonucleotide is the antisense compound.

In embodiments, the oligonucleotide of the compound of Formula (C-1), (C-2), (C-3), or (C-4) comprises the following sequence: 5’-CAG CAG CAG CAG CAG CAG CAG-3’.

In embodiments, the EP of the compound of Formula (C-1), (C-2), (C-3), or (C-4) comprises the following sequence: PKKKRKV.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic showing multiple strategies for targeting CUG repeats in mRNA.

FIG. 2 shows modified nucleotides used in antisense oligonucleotides described herein. Structures 1-3 (1 = phosphorothioate; 2 = (SC5'-Rp)-α,β-CNA; 3 = PMO) are phosphate backbone modifications; 4 (2-thio-dT) is a base modification; 5-8 (5 = 2’-OMe-RNA; 6 = 2’-O-MOE-RNA; 7 = 2’F-RNA; 8 = 2’F-ANA) are 2’ sugar modifications; 9-11 (9 = LNA; 10 = (S)-cEt; 11 = tcDNA) are constrained nucleotides; 12-14 (12 = FHNA; 13 = (S)-5’-C-methyl; 14 = UNA) are additional sugar modifications; 15-18 (15 = E-VP; 16 = methyl phosphonate; 17 = 5’ phosphorothioate; 18 = (S)-5'-C-methyl with phosphate) are 5’ phosphate stabilization modifications; and 19 is a morpholino sugar. Reformatted from Khvorova, A., et al., Nat. Biotechnol. 2017 Mar; 35(3): 238-248.

FIGS. 3A-3D illustrate conjugation chemistries for connecting an AC to a cyclic cell penetrating peptide. FIG. 3A shows the amide bond formation between peptides with a carboxylic acid group or with a TFP activated ester and primary amine residues at the 5’ end of an AC. FIG. 3B shows the conjugation of a secondary amine or primary amine modified AC at the 3’ end and a peptide-TFP ester through amide bond formation. FIG. 3C shows the conjugation of a peptide-azide to the 5’ cyclooctyne modified AC via copper-free azide-alkyne cycloaddition. FIG. 3D demonstrates another exemplary conjugation between 3’ modified cyclooctyne ACs or 3’ modified azide ACs and a CPP containing a linker-azide or linker-alkyne/cyclooctyne moiety, via a copper-free azide-alkyne cycloaddition or copper-catalyzed azide-alkyne cycloaddition, respectively (click reaction).

FIG. 4 shows the conjugation chemistry for connecting an AC and CPP with an additional linker modality containing a polyethylene glycol (PEG) moiety.

FIGS. 5A-5D provide structures of the adenine (5A), cytosine (5B), guanine (5C), and thymine (5D) morpholino subunit monomers that may be used to synthesize phosphorodiamidate-linked morpholino oligomers (PMOs).

FIG. 6A-6F show RT-PCR analysis of alternative RNA splicing events (e.g., exon inclusion or exclusion) of MBNL1 (exon 5; FIG. 6A, 6B, 6E) and CLASP1 (exon 19; FIG. 6B, 6D, 6F) 24 hours (6A-6B) and 48 hours (6C-6D) after HeLa-480 cells were treated with 1 μM, 3 μM, or 10 μM of various PMOs or PMO-EEV compounds using the Endo-Porter transfection agent (6A-6C) or without the Endo-Porter agent (6E-6F). The parental HeLa cell line and the HeLa-480 cell line treated with (6A-6D) or without (6E-6F) the Endo-Porter agent were included as controls.

FIG. 7A-7B show RT-PCR analysis of alternative RNA splicing events (e.g., exon inclusion or exclusion) of MBNL1 (exon 5; FIG. 7A) and CLASP1 (exon 19; FIG. 7B) 48 hours after DM1 myoblasts were treated with 1 μM of various PMO or PMO-EEV compounds without the Endo-Porter transfection reagent. Two controls, DM-04 without Endo-Porter and DM-05 without Endo-Porter, were included.

FIG. 8A-8D show RT-PCR analysis of the alternative RNA splicing events (e.g., exon inclusion or exclusion) of Atp2a1 (exon 22; FIG. 8A), Nfix (exon 7; FIG. 8B), Clcn1 (exon 7a; FIG. 8C) and Mbnl1 (exon 5; FIG. 8D) from gastrocnemius muscle tissues one week after HSA-LR (DM1 mouse model) mice were treated with a PMO, 20 mpk PMO-EEV 221-1106, or 40 mpk PMO-EEV 221-1106. FVB/NJ (wild type inbred mouse) and HSA-LR (without treatment) mice were included as control groups.

FIG. 9A-9D show RT-PCR analysis of the alternative RNA splicing events (e.g., exon inclusion or exclusion) of Atp2a1 (exon 22; FIG. 9A), Nfix (exon 7; FIG. 9B), Clcn1 (exon 7a; FIG. 9C) and Mbnl1 (exon 5; FIG. 9D) from quadricep muscle tissues one week after HSA-LR (DM1 mouse model) mice were treated with a PMO, 20 mpk PMO-EEV 221-1106, or 40 mpk PMO-EEV 221-1106. FVB/NJ (wild type inbred mouse) and HSA-LR (without treatment) mice were included as control groups.

FIG. 10A-10D show RT-PCR analysis of the alternative RNA splicing events (e.g., exon inclusion or exclusion) of Atp2a1 (exon 22; FIG. 10A), Nfix (exon 7; FIG. 10B), Clcn1 (exon 7a; FIG. 10C) and Mbnl1 (exon 5; FIG. 10D) from tibialis anterior muscle tissues one week after HSA-LR (DM1 mouse model) mice were treated with a PMO, 20 mpk PMO-EEV 221-1106, or 40 mpk PMO-EEV 221-1106. FVB/NJ (wild type inbred mouse) and HSA-LR (without treatment) mice were included as control groups.

FIG. 11A-11F show RT-PCR analysis of alternative RNA splicing events (e.g., exon inclusion or exclusion) of MBNL1 (exon 5, FIG. 11A), SOS1 (exon 25, FIG. 11B), IR (exon 11, FIG. 11C), DMD (exon 78, FIG. 11D), BIN1 (exon 11, FIG. 11E) and LDB3 (exon 11, FIG. 11F) after DM1 patient derived muscle cells were treated with different concentrations (10 μM, 3 μM, 1 μM, 0.3 μM) of DMPK CUG-targeting EEV-PMOs (CUG^exp 197-777 and CUG^exp 221-1106). Muscle cells from two groups, healthy people (negative control) and DM1 patients (positive control), were tested for the alternative RNA splicing events as controls. All data were collected from three individual experiments (n=3). A t-test of treated versus untreated DM1 myotubes was conducted; * p < 0.05; ** p < 0.01; *** p < 0.001.

FIG. 12A-12F show RT-PCR analysis of the alternative RNA splicing events (e.g., exon inclusion or exclusion) of MBNL1 (exon 5, FIG. 12A), SOS1 (exon 25, FIG. 12B), INSR (exon 11, FIG. 12C), DMD (exon 78, FIG. 12D), BIN1 (exon 11, FIG. 12E) and LDB3 (exon 11, FIG. 12F) after patient derived DM1 myoblasts and myotubes were treated with 10 μM, 3 μM, or 1 μM of DMPK CUG-targeting EEV-PMO 197-777. Healthy patient cells and DM1 cells were used as controls. All data were collected from three individual experiments (n=3). A t-test of treated versus untreated DM1 myotubes was conducted; * p < 0.05; ** p < 0.01; *** p < 0.001.

FIG. 13A-13B show the relative levels of mRNA after HSA-LR mice were treated with various concentrations of PMO-EEV 221-1120. FIG. 13A shows the relative mRNA level for the gastrocnemius, triceps, tibialis anterior, and diaphragm. FIG. 13B shows the relative mRNA levels in the diaphragm.

FIG. 14A-14D show the relative levels of mRNA in the quadricep (14A), gastrocnemius (14B), tricep (14C) and tibialis anterior (14D) tissues after HSA-LR mice were treated with various concentrations of PMO-EEV 221-1120.

FIG. 15A-15D show the mouse DM1 splicing index (mDSI) for various genes in quadricep (FIG. 15A), gastrocnemius (FIG. 15B), tricep (FIG. 15C), and tibialis anterior (FIG. 15D) tissues after HSA-LR mice were treated with various concentrations of PMO-EEV 221-1120.

FIGS. 16A-16C show the prevalence of RNA foci in the tibialis anterior muscle after HSA-LR mice were either untreated or treated with EEV-PMO 221-1120 (EEV-PMO-DM1-3; DM1-3). FIG. 16A-16B show images of tibialis anterior muscle tissue stained for RNA CUG foci (red) and nuclei (blue). FIG. 16C is a plot quantifying the percent of nuclei that have CUG foci from data associated with the images in FIG. 16A-16B.

FIG. 17A-17F are plots showing a dose-dependent response for drug levels in the quadricep (17A), tricep (17B), heart (17C), gastrocnemius (17D), tibialis anterior (17E), diaphragm (17F), brain (17H), liver (17I), and kidney (17J) tissues after HSA-LR mice were treated with various concentrations of EEV-PMO-DM1-3. FIG. 17K shows drug exposure of various tissues at a 60 mpk dosage level.

FIG. 18 shows a dose-dependent myotonia reduction in HSA-LR mice 7 days after treatment with EEV-PMO-DM1-3 at 15, 30, 60 and 90 mpk.

FIG. 19A-19D are plots showing the results of a principal component analysis comparing gene expression in un-diseased mice (WT), DM1 mice (HSA-LR), and HSA-LR mice treated with PMO-EEV 221-1120. FIG. 19A and 19C are plots showing three principal components and FIG. 19B and 19D are plots showing two principal components.

FIG. 20A-20B show heatmaps of differentially expressed genes between un-diseased mice (WT), DM1 mice (HSA-LR), and HSA-LR mice treated with 60 mpk PMO-EEV 221-1120. FIG. 20A is a clustered heatmap showing 513 differentially expressed genes. FIG. 20B is a clustered heatmap showing 40 genes that are known to have CTG-CUG repeats.

FIG. 21 is a volcano plot showing the global transcriptional change across the untreated HSA-LR mice and mice treated with PMO-EEV 221-1120.

FIG. 22A-22E are plots showing the result of a principal component analysis for the Scube2 (22A), Greb1 (22B), Ttc7 (22C), Txlnb(CUG)9 (22D), and Ndrg3 (22E) genes from undiseased mice, HSA-LR mice, and HSA-LR mice treated with PMO-EEV 221-1120.

FIG. 23A-23D show RNA sequencing (RNAseq) data for Atp2a1 (23A; exon 22 is boxed), Clcn1 (23B; exon 7a is boxed), Nfix (23C; exon 7 is boxed), and Mbnl1 (23D; exon 5 is boxed) for undiseased mice (WT-saline), HSA-LR mice (HSA-LR saline), and HSA-LR mice treated with PMO-EEV 221-1120. Two reads are shown for each treatment group.

FIG. 24 shows the percent spliced index (PSI) of individual exons for various genes of interest for undiseased mice (WT-saline), HSA-LR mice (HSA-LR saline), and HSA-LR mice treated with PMO-EEV 221-1120.

FIG. 25A-25D show the drug levels in HSA-LR mice treated with 80 mpk (60 mpk oligo, 80 mpk whole drug) EEV-PMO-DM1-3 after 1 week to 4 weeks in the tibialis anterior (25A), gastrocnemius (25B), triceps (25C) and quadricep (25D) tissues.

FIG. 26A-26D show the drug levels in mice after HSA-LR mice were treated with a single 80 mpk dose of EEV-PMO-DM1-3. FIG. 26A-26B show the drug levels in the liver from 1 week to 12 weeks post treatment. FIG. 26C-26D show the drug levels in the kidney from 1 week to 12 weeks post treatment.

FIG. 27A-27C are plots showing the level of exon inclusion for MBNL1 (exon 5; 27A), SOS1 (exon 25; 27B), and NFIX (exon 7; 27C) after DM1 patient derived muscle cells were treated with 30 μM EEV-PMO-DM1-3.

FIG. 28A-28C show that EEV-PMO-DM1-3 reduces CUG nuclear foci (green) in the nucleus (blue) in DM1 patient-derived muscle cells. FIG. 28A-28B are images of DM1 patient derived muscle cells that are either untreated or treated with EEV-PMO-DM1-3. FIG. 28C is the quantification of the number of CUG foci per nucleus for data associated with the images in FIG. 28A.

FIG. 29A-29B show the raw data (29A) and the normalized data (29B) of a CELLTITER-GLO luminescent viability assay where RPTEC cells were treated with various concentrations of PMO-DM1 or EEV-PMO-DM1-3. Melittin was used as a positive control.

FIGS. 30A-30C show images depicting RNA CUG repeat foci in DM1 patient-derived cells (30A) and DM1 patient-derived cells treated with an EEV-PMO 221-1113 (30B). Cells were stained for nuclei (blue; Hoechst) and RNA CUG foci (green). FIG. 30C is a plot of the CUG RNA foci per nuclear area for data associated with the images of FIG. 30A-30B.

FIG. 31A-31B show the prevalence of RNA CUG7 foci in HeLa cells, untreated HeLa480 cells, and HeLa480 cells treated with EEV-PMO 221-1113. FIG. 31A shows images of cells stained for RNA CUG7 foci (green) and nuclei (blue). FIG. 31B is a plot quantifying the CUG7 foci per nuclear area for data associated with the images in FIG. 31A.

FIGS. 32A-32C are plots showing the percent inclusion of exon 5 in MBNL1 (32A), exon 25 in SOS1 (32B), and exon 7 in NFIX (32C) after DM1 patient-derived cells were treated with 30 μM of EEV-PMO 221-1113.

FIGS. 33A-33E show RT-PCR analysis of the alternative RNA splicing events (e.g., exon inclusion) of MBNL1 (exon 5; 33A), SOS1 (exon 25; 33B), CLASP1 (exon 19, 33C), NFIX (exon 7, 33D), and INSR (exon 11, 33E) after DM1 patient derived muscle cells were treated with various concentrations of PMO-EEV 221-1113. A t-test was used to determine significance; * p < 0.05; ** p < 0.01; *** p < 0.001.

FIGS. 34A-34D show RT-PCR analysis of the alternative RNA splicing events (e.g., exon inclusion) of Atp2a1 (exon 22, 34A), Nfix (exon 7, 34B), Clcn1 (exon 7a, 34C), and Mbnl1 (exon 5, 34D) in the gastrocnemius tissue of mice treated with various concentrations of PMO 221 or EEV-PMO 221-1106.

FIGS. 35A-35C show RT-PCR analysis of the alternative RNA splicing events (e.g., exon inclusion or exclusion) of Mbnl1 (exon 5, 35A), Nfix (exon 7, 35B), and Atp2a1 (exon 22, 35C) in the tibialis anterior tissue of HSA-LR mice treated with either PMO-EEV 0221-1121 (21-mer) or PMO-EEV 0325-1121 (24-mer).

FIGS. 36A-36C show RT-PCR analysis of the alternative RNA splicing events (e.g., exon inclusion) of Mbnl1 (exon 5, 36A), Nfix (exon 7, 36B), and Atp2a1 (exon 22, 36C) in the gastrocnemius tissue of HSA-LR mice treated with either PMO-EEV 0221-1121 (21-mer) or PMO-EEV 0325-1121 (24-mer).

FIG. 37 shows PMO-0221a, the major metabolite of PMO-EEV 220-1120 detected in vivo.

FIGS. 38A-38B show the percent exon inclusion in the tibialis anterior (38A) and the gastrocnemius (38B) for MBNL1 (exon 5) after HeLa480 cells were treated with various concentrations of EEV-PMO 221-1120.

FIGS. 39A-39B show the percent exon inclusion in the tibialis anterior (39A) and the gastrocnemius (39B) for NFIX (exon 7) after HeLa480 cells were treated with various concentrations of EEV-PMO 221-1120.

FIGS. 40A-40B show the percent exon inclusion in the tibialis anterior (40A) and the gastrocnemius (40B) for Atp2a1 (exon 22) after HeLa480 cells were treated with various concentrations of EEV-PMO 221-1120.

FIG. 41A shows images depicting RNA CUG repeat foci in HeLa480 cells after treatment with various concentrations of EEV-PMO 221-1120. FIG. 41B is a plot of the RNA foci per nuclear area for data associated with the images of FIG. 41A.

FIGS. 42A-42D show the relative r(CUG480) repeat mRNA levels (42A), the relative DMPK mRNA levels (42B), percent exon 5 inclusion of MBNL1 (42C), and percent exon 25 inclusion in SOS1 (42D) in HeLa480 cells after treatment with various concentrations of EEV-PMO 221-1120.

FIG. 43 is a bar chart showing examples of genes expressed in muscle tissue that are known to have CTG-CUG repeats.

FIGS. 44A-44D show phenotypic myotonia reduction in the HSA-LR mouse model treated with 20 mpk PMO-EEV 221-1106. FIGS. 44A and 44C show plots of relaxation. FIG. 44B shows an example raw force trace. FIG. 44D shows representative electromyography traces.

DETAILED DESCRIPTION

Compounds

In embodiments, compounds are provided that modulate the level and/or activity of a gene transcript having an expanded CUG trinucleotide repeat. In embodiments, the compounds of the present disclosure include at least one cyclic cell penetrating peptide (cCPP) and a therapeutic moiety (TM). The cCPP facilitates entry of the TM into the cell. In embodiments, the compound includes an endosomal escape vehicle (EEV) that comprises the cCPP and an exocyclic peptide (EP). The cCPP or the EEV may permit the TM to enter the cytosol or a cellular compartment to interact with the target transcript.

Therapeutic Moieties

Generally, the TM is the effector moiety that elicits a response. In embodiments, the TM elicits a response by modulating the expression, activity, and/or level of a target transcript and/or a target protein. In embodiments, the target transcript includes an expanded CUG trinucleotide repeat. In embodiments, the TM modulates the level of a target transcript and/or target protein within a cell. In embodiments, the TM decreases the level of the target transcript and/or target protein within a cell.

In embodiments, the TM modulates the activity of the target transcript by reducing the affinity between the target transcript and one or more proteins that bind to the target transcript. By reducing the affinity between the target transcript and the one or more proteins, the TM may effectively modulate the activity of the one or more proteins that would otherwise be associated with the target transcript. For example, if the one or more proteins are not bound to the target transcript, they are available to carry out their functions on other molecules. For example, if the one or more proteins are involved in pre-mRNA processing, reducing the affinity of the one or more proteins for a transcript comprising an expanded CUG repeat may allow the one or more proteins to process pre-mRNA transcripts that do not comprise expanded CUG repeats. As such, the TM may modulate the activity, expression, and/or levels of the downstream genes (genes that do not contain the expanded CTG repeat) that are regulated by the one or more proteins whose interaction with the target transcript is disrupted by the TM.

In embodiments, the TM comprises an oligonucleotide, a peptide, an antibody, and/or a small molecule. The class and identity of the TM depends on the mechanism being used to modulate the level and/or activity of the target transcript that includes an expanded CUG trinucleotide repeat.

Antisense compound

In various embodiments, the compounds disclosed herein comprise a cell penetrating peptide (CPP) conjugated to an antisense compound (AC).

The term “antisense compound” refers to an oligonucleotide sequence that is complementary, or at least partially complementary, to a target nucleotide sequence. An AC is an oligonucleotide that includes natural DNA bases, modified DNA bases, natural RNA bases, modified RNA bases, natural RNA sugars, modified RNA sugars, natural DNA sugars, modified DNA sugars, natural internucleoside linkages, modified internucleoside linkages, or any combination thereof. ACs include, but are not limited to, antisense oligonucleotides, RNAi, microRNA, antagomirs, aptamers, ribozymes, immunostimulatory oligonucleotides, decoy oligonucleotides, supermir, miRNA mimics, miRNA inhibitors, U1 adapters, and combinations thereof.

In embodiments, the AC includes a nucleotide sequence that is at least partially complementary to a target transcript that has an expanded CUG trinucleotide repeat. In embodiments, the AC includes a nucleotide sequence that is at least partially complementary to an expanded CUG trinucleotide repeat in a target mRNA sequence. Several diseases are associated with expanded CUG trinucleotide repeats, for example, myotonic dystrophy type 1 (DM1), Fuchs' Endothelial Corneal Dystrophy (FECD), Spinocerebellar Ataxia-8 (SCA8), and Huntington’s Disease-Like 2 (HDL2). Table 1 provides examples of nucleotide repeat disorders, and characteristics of genes with expanded nucleotide repeats associated with such disorders. The following documents describe exemplary oligonucleotides for treating tandem repeat diseases and are incorporated by reference herein in their entirety: Zain et al. Neurotherapeutics. 2019; 16(2): 248-262; Zarouchlioti et al. Am J Hum Genet. 2018; 102(4):528-539; Fautsch et al. Prog Retin Eye Res. 2021; 81:100883.

Table 1: Diseases associated with expanded CUG trinucleotide repeats

Disease (abbreviation)	Gene	Normal repeat length	Expanded repeat length	Gene product	Repeat sequence	Location of repeat
DM1	DMPK	5-35	>50	Dystrophia myotonica protein kinase	CTG-CAG	3’ UTR
FECD	TCF4	<30	>40	Transcription factor 4	CTG-CAG	Intron 3
SCA8	ATXN8OS and/or ATXN8	15-50	>50	Ataxin 8 and ataxin 8 opposite strand	CTG-CAG	3’ UTR
HDL2	JPH3	6-27	>40	Junctophilin 3	CTG-CAG	3’ UTR
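
As a rough illustration of how the thresholds in Table 1 can be read, the following Python sketch encodes the listed repeat ranges and labels a repeat count as normal or expanded. The names `RepeatRange`, `TABLE_1`, and `classify` are hypothetical and are not part of the disclosure. Counts that fall between the listed normal and expanded ranges are reported as unclassified rather than guessed.

```python
from typing import NamedTuple, Optional


class RepeatRange(NamedTuple):
    gene: str
    normal_min: Optional[int]  # None when Table 1 gives only an upper bound
    normal_max: int            # inclusive upper bound of the normal range
    expanded_above: int        # counts strictly greater than this are expanded


# Values transcribed from Table 1 (illustrative encoding only).
TABLE_1 = {
    "DM1": RepeatRange("DMPK", 5, 35, 50),
    "FECD": RepeatRange("TCF4", None, 29, 40),  # Table 1 lists "<30"
    "SCA8": RepeatRange("ATXN8OS/ATXN8", 15, 50, 50),
    "HDL2": RepeatRange("JPH3", 6, 27, 40),
}


def classify(disease: str, repeats: int) -> str:
    """Label a repeat count using only the ranges listed in Table 1."""
    r = TABLE_1[disease]
    if repeats > r.expanded_above:
        return "expanded"
    low = r.normal_min if r.normal_min is not None else 0
    if low <= repeats <= r.normal_max:
        return "normal"
    return "outside the ranges listed in Table 1"


print(classify("DM1", 480))
print(classify("DM1", 20))
```

The lookup is keyed by disease abbreviation because Table 1 is organized that way; keying by gene would work equally well.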

In embodiments, the AC includes a nucleotide sequence that is at least partially complementary to a nucleotide sequence that is within a target mRNA transcript that includes an expanded CTG-CUG trinucleotide repeat. In embodiments, the AC includes a nucleotide sequence that is at least partially complementary to an expanded CTG-CUG trinucleotide repeat in a target mRNA transcript.

In embodiments, the AC includes a nucleotide sequence that is at least partially complementary to a nucleotide sequence that is within a DMPK1 target transcript that includes an expanded CTG-CUG trinucleotide repeat. In embodiments, the AC includes a nucleotide sequence that is at least partially complementary to a nucleotide sequence that is within a TCF4 target transcript that includes an expanded CTG-CUG trinucleotide repeat. In embodiments, the AC includes a nucleotide sequence that is at least partially complementary to a nucleotide sequence that is within an ATXN8OS/ATXN8 target transcript that includes an expanded CTG-CUG trinucleotide repeat. In embodiments, the AC includes a nucleotide sequence that is at least partially complementary to a nucleotide sequence that is within a JPH3 target transcript that includes an expanded CTG-CUG trinucleotide repeat.

In embodiments, the AC includes a nucleotide sequence that is at least partially complementary to a trinucleotide repeat in a 3’UTR of a target mRNA transcript. In embodiments, the AC includes a nucleotide sequence that is at least partially complementary to an expanded CTG-CUG trinucleotide repeat in a 3’UTR of a DMPK1 target transcript. In embodiments, the AC includes a nucleotide sequence that is at least partially complementary to an expanded CTG-CUG trinucleotide repeat in a 3’UTR of an ATXN8OS/ATXN8 target transcript. In embodiments, the AC includes a nucleotide sequence that is at least partially complementary to an expanded CTG-CUG trinucleotide repeat in a 3’UTR of a JPH3 target transcript.

In embodiments, the AC includes a nucleotide sequence that is at least partially complementary to trinucleotide repeats, such as CTG-CUG repeats. In embodiments, the target nucleotide sequence comprises at least one expanded trinucleotide repeat (e.g., CTG-CUG repeats). In embodiments, the target nucleotide sequence comprises at least 40, at least 45, at least 50, at least 60, at least 70, at least 80, at least 90, at least 100, at least 150, at least 200, at least 300, at least 400, at least 500, at least 600, at least 700, at least 800, at least 900, at least 1000, or at least 2000 CTG-CUG trinucleotide repeats. In embodiments, the expanded trinucleotide repeat is in the 3’UTR of the target nucleotide sequence.
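
The repeat counts recited above can be checked against a transcript sequence by counting contiguous repeat units. The following is a minimal sketch, assuming the transcript is supplied as a plain string of A, C, G, and U (or T) characters. The function name `longest_repeat_run` and the example sequence are illustrative only and do not come from the disclosure.

```python
import re


def longest_repeat_run(transcript: str, unit: str = "CUG") -> int:
    """Count the repeat units in the longest contiguous run of `unit`."""
    # Treat DNA (CTG) and RNA (CUG) spellings alike by converting T to U.
    seq = transcript.upper().replace("T", "U")
    unit = unit.upper().replace("T", "U")
    # Each match is one uninterrupted run of whole repeat units.
    runs = re.findall(f"(?:{re.escape(unit)})+", seq)
    return max((len(run) // len(unit) for run in runs), default=0)


# Hypothetical 3' UTR fragment carrying 60 contiguous CUG repeats.
example = "GGAAUC" + "CUG" * 60 + "GGGGAUCA"
count = longest_repeat_run(example)
print(count, count >= 40)
```

Because the regular expression matches non-overlapping runs, the sketch is suited to units such as CUG that cannot overlap themselves; a unit like AAA would need a different scan.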

In embodiments, the AC includes a nucleotide sequence that is at least partially complementary to, and may hybridize with, at least a portion of the contiguous expanded trinucleotide repeats present in the target transcript. In embodiments, the AC includes a nucleotide sequence that is at least partially complementary to, and may hybridize with, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 16, at least 17, at least 18, at least 19, at least 20, and up to 50, up to 100, up to 150, up to 200, up to 300, up to 400, up to 500, up to 600, up to 700, up to 800, up to 900, up to 1000, or up to 2000 trinucleotide repeats in a target transcript. In embodiments, the AC includes a nucleotide sequence that is at least partially complementary to, and may hybridize with, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 or 20 trinucleotide repeats in a target transcript. In embodiments, the AC includes a nucleotide sequence that is at least partially complementary to, and may hybridize with, 5 to 10 trinucleotide repeats in a target transcript. In embodiments, the AC includes a nucleotide sequence that is at least partially complementary to, and may hybridize with, 5 to 9 trinucleotide repeats in a target transcript. In embodiments, the AC includes a nucleotide sequence that is at least partially complementary to, and may hybridize with, 5 to 8 trinucleotide repeats in a target transcript. In embodiments, the AC includes a nucleotide sequence that is at least partially complementary to, and may hybridize with, 5 to 7 trinucleotide repeats in a target transcript. In embodiments, the AC includes a nucleotide sequence that is at least partially complementary to, and may hybridize with, 5 to 6 trinucleotide repeats in a target transcript. In embodiments, the AC includes a nucleotide sequence that is at least partially complementary to, and may hybridize with, 5 trinucleotide repeats in a target transcript. In embodiments, the AC includes a nucleotide sequence that is at least partially complementary to, and may hybridize with, 6 trinucleotide repeats in a target transcript. In embodiments, the AC includes a nucleotide sequence that is at least partially complementary to, and may hybridize with, 7 trinucleotide repeats in a target transcript. In embodiments, the AC includes a nucleotide sequence that is at least partially complementary to, and may hybridize with, 8 trinucleotide repeats in a target transcript. In embodiments, the AC includes a nucleotide sequence that is at least partially complementary to, and may hybridize with, 9 trinucleotide repeats in a target transcript. In embodiments, the AC includes a nucleotide sequence that

In embodiments, the AC may include a nucleotide sequence that is at least partially complementary to, and may hybridize with, at least a portion of the contiguous expanded trinucleotide repeats present at any location in a target transcript. In embodiments, the AC includes a nucleotide sequence that is at least partially complementary to, and may hybridize with, at least a portion of the contiguous expanded trinucleotide repeats present in the 3’ UTR of a target transcript. In embodiments, the AC includes a nucleotide sequence that is at least partially complementary to, and may hybridize with, at least a portion of the contiguous expanded trinucleotide repeats present in the 3’ UTR of a DMPK1, SCA8, and/or HDL2 target transcript. In embodiments, the AC includes a nucleotide sequence that is at least partially complementary to, and may hybridize with, at least a portion of the contiguous expanded trinucleotide repeats present in an intron of a target transcript. In embodiments, the AC includes a nucleotide sequence that is at least partially complementary to, and may hybridize with, at least a portion of the contiguous expanded trinucleotide repeats present in intron 3 of a TCF4 target transcript. In embodiments, the AC includes a nucleotide sequence that is at least partially complementary to, and may hybridize with, at least a portion of the contiguous expanded trinucleotide repeats present in the CTG18.1 locus of the TCF4 transcript. In embodiments, the AC includes a nucleotide sequence that is at least partially complementary to, and may hybridize with, at least a portion of the contiguous expanded trinucleotide repeats present in an exon of a target transcript.

In embodiments, the AC is 5 or more, 10 or more, 15 or more, 20 or more, 25 or more, 30 or more, 35 or more, 40 or more, or 45 or more nucleic acids in length. In embodiments, the AC is 50 or less, 45 or less, 40 or less, 35 or less, 30 or less, 25 or less, 20 or less, 15 or less, or 10 or less nucleic acids in length. In embodiments, the AC is 5 to 50, 5 to 45, 5 to 40, 5 to 35, 5 to 30, 5 to 25, 5 to 20, 5 to 15, or 5 to 10 nucleic acids in length. In embodiments, the AC is 10 to 50, 10 to 45, 10 to 40, 10 to 35, 10 to 30, 10 to 25, 10 to 20, or 10 to 15 nucleic acids in length. In embodiments, the AC is 15 to 50, 15 to 45, 15 to 40, 15 to 35, 15 to 30, 15 to 25, or 15 to 20 nucleic acids in length. In embodiments, the AC is 20 to 50, 20 to 45, 20 to 40, 20 to 35, 20 to 30, or 20 to 25 nucleic acids in length. In embodiments, the AC is 25 to 50, 25 to 45, 25 to 40, 25 to 35, or 25 to 30 nucleic acids in length. In embodiments, the AC is 30 to 50, 30 to 45, 30 to 40, or 30 to 35 nucleic acids in length. In embodiments, the AC is 35 to 50, 35 to 45, or 35 to 40 nucleic acids in length. In embodiments, the AC is 40 to 50 or 40 to 45 nucleic acids in length. In embodiments, the AC is 45 to 50 nucleic acids in length. In embodiments, the AC is 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, or 50 nucleic acids in length.

In embodiments, the AC has 100% complementarity to a target nucleotide sequence. In embodiments, the AC does not have 100% complementarity to a target nucleotide sequence. As used herein, the term percent complementarity refers to the number of nucleobases (e.g., natural nucleobase or modified nucleobase) of an AC that have nucleobase complementarity with a corresponding nucleobase of an oligomeric compound or nucleic acid (e.g., a target nucleotide sequence) divided by the total length (number of nucleobases) of the AC. One skilled in the art recognizes that the inclusion of mismatches is possible without eliminating the activity of the antisense compound.
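
The definition of percent complementarity above reduces to a simple count. The sketch below is one possible reading of it, under stated assumptions: the AC and the target site are given 5' to 3' as equal-length strings, pairing is antiparallel, only Watson-Crick pairs (A-U, A-T, G-C) count as complementary, and modified nucleobases are not modeled. The name `percent_complementarity` is hypothetical.

```python
# Partner expected in an RNA target for each AC nucleobase (Watson-Crick only).
COMPLEMENT = {"A": "U", "U": "A", "T": "A", "G": "C", "C": "G"}


def percent_complementarity(ac: str, target_site: str) -> float:
    """Paired nucleobases divided by the AC length, as a percentage.

    Both sequences are written 5' to 3'; the target is read in reverse
    so that position i of the AC faces its antiparallel partner.
    """
    ac = ac.upper()
    site = target_site.upper().replace("T", "U")
    if len(ac) != len(site):
        raise ValueError("AC and target site must have the same length")
    paired = sum(
        1 for base, partner in zip(ac, reversed(site))
        if COMPLEMENT.get(base) == partner
    )
    return 100.0 * paired / len(ac)


# A 10-mer against a CUG-repeat site, with and without one mismatch.
print(percent_complementarity("CAGCAGCAGC", "GCUGCUGCUG"))
print(percent_complementarity("CAGCAGCAGA", "GCUGCUGCUG"))
```

Under this reading, the percentage of mismatches is 100 minus the percent complementarity, so a single mismatch in a 10-mer corresponds to 10% mismatches.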

In embodiments, the AC includes 20% or less, 15% or less, 10% or less, 5% or less, or zero mismatches to the target nucleotide sequence. In some embodiments, the AC includes 5% or more, 10% or more, or 15% or more mismatches. In embodiments, the AC includes zero to 5%, zero to 10%, zero to 15%, or zero to 20% mismatches to the target nucleotide sequence. In embodiments, the AC includes 5% to 10%, 5% to 15%, or 5% to 20% mismatches to the target nucleotide sequence. In embodiments, the AC includes 10% to 15% or 10% to 20% mismatches to the target nucleotide sequence. In embodiments, the AC includes 10% to 20% mismatches to the target nucleotide sequence.

In embodiments, the AC has 80% or greater, 85% or greater, 90% or greater, 95% or greater, 96% or greater, 97% or greater, 98% or greater, or 99% or greater complementarity to a target nucleotide sequence. In embodiments, the AC has 100% or less, 99% or less, 98% or less, 97% or less, 96% or less, 95% or less, 90% or less, or 85% or less complementarity to a target nucleotide sequence. In embodiments, the AC has 80% to 100%, 80% to 99%, 80% to 98%, 80% to 97%, 80% to 96%, 80% to 95%, 80% to 90%, or 80% to 85% complementarity to a target nucleotide sequence. In embodiments, the AC has 85% to 100%, 85% to 99%, 85% to 98%, 85% to 97%, 85% to 96%, 85% to 95%, or 85% to 90% complementarity to a target nucleotide sequence. In embodiments, the AC has 90% to 100%, 90% to 99%, 90% to 98%, 90% to 97%, 90% to 96%, or 90% to 95% complementarity to a target nucleotide sequence. In embodiments, the AC has 95% to 100%, 95% to 99%, 95% to 98%, 95% to 97%, or 95% to 96% complementarity to a target nucleotide sequence. In embodiments, the AC has 96% to 100%, 96% to 99%, 96% to 98%, or 96% to 97% complementarity to a target nucleotide sequence. In embodiments, the AC has 97% to 100%, 97% to 99%, or 97% to 98% complementarity to a target nucleotide sequence. In embodiments, the AC has 98% to 100% or 98% to 99% complementarity to a target nucleotide sequence. In embodiments, the AC has 99% to 100% complementarity to a target nucleotide sequence.

In embodiments, incorporation of nucleotide affinity modifications allows for a greater number of mismatches compared to an unmodified compound. Similarly, certain oligonucleotide sequences may be more tolerant to mismatches than other oligonucleotide sequences. One of ordinary skill in the art is capable of determining an appropriate number of mismatches between an AC and a target nucleotide sequence, such as by determining the thermal melting temperature (Tm). Tm or ΔTm can be calculated by techniques that are familiar to one of ordinary skill in the art. For example, techniques described in Freier et al. (Nucleic Acids Research, 1997, 25, 22: 4429-4443) allow one of ordinary skill in the art to evaluate nucleotide modifications for their ability to increase the melting temperature of an RNA:DNA duplex.

In embodiments, the AC includes a nucleotide sequence that in itself is a trinucleotide repeat, that is, a CAG trinucleotide repeat. A 5’-CAG-3’ trinucleotide repeat is the reverse complement of a 5’-CUG-3’ trinucleotide repeat, so it has 100% complementarity to, and may hybridize with, the 5’-CUG-3’ repeat. In embodiments, the AC includes one to 50 CAG repeats. In embodiments, the CAG repeats are contiguous. In embodiments, the CAG repeats are not contiguous. In embodiments, the AC includes a nucleotide sequence that includes incomplete CAG repeats on either the 5’ or 3’ end. For example, in embodiments, the AC includes a sequence such as AG(CAG)n, G(CAG)n, (CAG)nAG, or (CAG)nA, where n is an integer from 1 to 50. In embodiments, the AC includes a nucleotide sequence that includes one or more, 2 or more, 3 or more, 4 or more, 5 or more, 6 or more, 7 or more, 8 or more, 9 or more, 10 or more, 20 or more, 30 or more, or 50 or more CAG repeats. In embodiments, the AC includes a nucleotide sequence that includes 50 or less, 40 or less, 30 or less, 20 or less, 10 or less, 9 or less, 8 or less, 7 or less, 6 or less, 5 or less, 4 or less, 3 or less, or 2 or less CAG repeats. In embodiments, the AC includes a nucleotide sequence that includes 2 to 50, 2 to 20, 2 to 10, 4 to 10, 5 to 10, 6 to 10, 6 to 9, 6 to 8, or 6 to 7 CAG repeats.
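
The relationship between a CAG-repeat AC and a CUG-repeat target can be illustrated with a short sketch. It assumes unmodified nucleobases written as plain strings 5' to 3'; the helper names `reverse_complement_rna` and `cag_repeat_ac` are hypothetical, and the optional flanking arguments stand in for the incomplete repeats mentioned above.

```python
RNA_COMPLEMENT = {"A": "U", "C": "G", "G": "C", "U": "A", "T": "A"}


def reverse_complement_rna(seq: str) -> str:
    """Return the RNA strand (5' to 3') that pairs with `seq` (5' to 3')."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(seq.upper()))


def cag_repeat_ac(n: int, five_prime: str = "", three_prime: str = "") -> str:
    """Build a CAG-repeat sequence with optional partial repeats at either end."""
    if not 1 <= n <= 50:
        raise ValueError("n must be an integer from 1 to 50")
    return five_prime + "CAG" * n + three_prime


# A 21-mer of seven CAG repeats pairs with seven contiguous CUG repeats.
ac = cag_repeat_ac(7)
print(len(ac), reverse_complement_rna(ac) == "CUG" * 7)

# AG(CAG)n with n = 2 still pairs with a stretch of a CUG-repeat tract.
partial = cag_repeat_ac(2, five_prime="AG")
print(reverse_complement_rna(partial) in "CUG" * 4)
```

Shifting the start of the AC by one or two nucleotides changes the repeat frame (CAG, AGC, or GCA) but not the fact that the sequence pairs with a CUG-repeat tract.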

In embodiments, the AC includes any one of the nucleotide sequences in Table 2 (SEQ ID NOs: 151-291).

Table 2: CAG repeat AC nucleotide sequences

AC sequence (5' to 3')	SEQ ID NO:
CAG	NA
CAG-CAG	NA
CAG-CAG-CAG	NA
CAG-CAG-CAG-CAG	I5l
CAG-CAG-CAG-CAG-CAG	152
CAG-CAG-CAG-CAG-CAG-CAG	I53
CAG-CAG-CAG-CAG-CAG-CAG-CAG	154
CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG	I55
CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG	156
CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG	157
CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG	I58
CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG	159
CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG	I60
CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG	I6l
CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG	162
CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG	163
CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG	164
CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG	165

CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG	166
CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG-CAG	167
CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG	168
CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG	I69
CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG	170
CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG	I7l
CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG	172
CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG	173
CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG	174
CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG	175
CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG	176
CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG	177
CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG	178
CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-C'AG-CAG	179
CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG-CAG	180
CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG	I8l
CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG	182

CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG	183
CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG	184
C’AG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-UAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG	185
CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG	186
CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG	187
CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG	188
CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG	189
CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG	190
CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG	I9l
CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG	192
CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG-CAG	193
CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG	194

CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG	195
CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG	196
CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAGCAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG-CAG	197
AGC	NA
AGCAGC	NA
AGCAGCAGC	NA
AGCAGCAGCAGC	198
AGCAGCAGCAGCAGC	199
AGCAGCAGCAGCAGCAGC	200
AGCAGCAGCAGCAGCAGCAGC	201
AGCAGCAGCAGCAGCAGCAGCAGC	202
AGCAGCAGCAGCAGCAGCAGCAGCAGC	203
AGCAGCAGCAGCAGCAGCAGCAGCAGCAGC	204
AGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGC	205
AGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGC	206
AGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGC	207
AGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGC	208
AGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGC	209
AGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA GC	210
AGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA GCAGC	2ll
AGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA GCAGCAGC	212
AGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA GCAGCAGCAGC	213
AGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA GCAGCAGCAGCAGC	214
AGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA GCAGCAGCAGCAGCAGC	215
AGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA GCAGCAGCAGCAGCAGCAGC	216

AGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA GCAGCAGCAGCAGCAGCAGCAGC	217
AGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA GCAGCAGCAGCAGCAGCAGCAGCAGC	218
AGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA GCAGCAGCAGCAGCAGCAGCAGCAGCAGC	219
AGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGC	220
AGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGC	221
AGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGC	222
AGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGC	223
AGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGC	224
AGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG C	225
AGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG CAGC	226
AGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG CAGCAGC	227
AGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG CAGCAGCAGC	228
AGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG CAGCAGCAGCAGC	229
AGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG CAGCAGCAGCAGCAGC	230
AGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG CAGCAGCAGCAGCAGCAGC	231
AGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG CAGCAGCAGCAGCAGCAGCAGC	232
AGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG CAGCAGCAGCAGCAGCAGCAGCAGC	233

AGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG CAGCAGCAGCAGCAGCAGCAGCAGCAGC	234
AGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG CAGCAGCAGCAGCAGCAGCAGCAGCAGCAGC	235
AGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG CAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGC	236
AGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG CAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGC	237
AGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG CAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGC	238
AGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG CAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGC	239
AGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG CAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGC	240
AGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG CAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGC AGC	241
AGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG CAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGC AGCAGC	242
AGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG CAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGC AGCAGCAGC	243
AGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG CAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGC AGCAGCAGCAGC	244
GCA	NA
GCAGCA	NA
GCAGCAGCA	NA
GCAGCAGCAGCA	245
GCAGCAGCAGCAGCA	246
GCAGCAGCAGCAGCAGCA	247
GCAGCAGCAGCAGCAGCAGCA	248
GCAGCAGCAGCAGCAGCAGCAGCA	249
GCAGCAGCAGCAGCAGCAGCAGCAGCA	250
GCAGCAGCAGCAGCAGCAGCAGCAGCAGCA	251
GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA	252
GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA	253
GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA	254
GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA	255
GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA	256
GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG CA	257
GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG CAGCA	258
GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG CAGCAGCA	259
GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG CAGCAGCAGCA	260
GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG CAGCAGCAGCAGCA	261
GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG CAGCAGCAGCAGCAGCA	262
GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG CAGCAGCAGCAGCAGCAGCA	263
GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG CAGCAGCAGCAGCAGCAGCAGCA	264
GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG CAGCAGCAGCAGCAGCAGCAGCAGCA	265
GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG CAGCAGCAGCAGCAGCAGCAGCAGCAGCA	266
GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG CAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA	267
GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG CAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA	268
GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG CAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA	269
GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG CAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA	270
GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG CAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA	271

GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG CAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGC A	272
GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG CAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGC AGCA	273
GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG CAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGC AGCAGCA	274
GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG CAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGC AGCAGCAGCA	275
GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG CAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGC AGCAGCAGCAGCA	276
GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG CAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGC AGCAGCAGCAGCAGCA	277
GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG CAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGC AGCAGCAGCAGCAGCAGCA	278
GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG CAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGC AGCAGCAGCAGCAGCAGCAGCA	279
GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG CAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGC AGCAGCAGCAGCAGCAGCAGCAGCA	280
GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG CAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGC AGCAGCAGCAGCAGCAGCAGCAGCAGCA	281
GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG CAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGC AGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA	282
GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG CAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGC AGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA	283
GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG CAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGC AGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA	284
GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG CAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGC AGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA	285

GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG CAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGC AGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA

286

GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG CAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGC AGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA	287
GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG CAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGC AGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA GCA	288
GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG CAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGC AGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA GCAGCA	289
GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG CAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGC AGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA GCAGCAGCA	290
GCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAG CAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGC AGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCAGCA GCAGCAGCAGCA	291

In embodiments, an AC that has a nucleotide sequence that includes CAG repeats may include additional nucleotide sequences on the 5' end, 3' end, or both, of the CAG repeat. In embodiments, the additional nucleotide sequences may have 80% to 100% or 95% to 100% complementarity to portions of the target transcript to which they hybridize. The additional nucleotide sequences may be added to a CAG repeat nucleotide sequence in order to increase the selectivity of the AC for hybridizing to a specific target transcript.
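The relationship between a CAG-repeat AC, its CUG-repeat target, and the percent complementarity of a flanking sequence can be illustrated with a minimal Python sketch. The function names are hypothetical and are not part of the disclosure. The sketch assumes an ungapped, position-by-position comparison of equal-length sequences and counts only Watson-Crick pairs (G-U wobble pairs are not counted).

```python
# Illustrative sketch only; function names are hypothetical.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def target_site_for(ac_seq: str) -> str:
    """Return the RNA site (5'->3') that is fully complementary to an AC (5'->3').

    Thymine in the AC is treated as uracil so DNA-like ACs can be compared to RNA.
    """
    rna = ac_seq.upper().replace("T", "U")
    return "".join(RNA_COMPLEMENT[base] for base in reversed(rna))


def percent_complementarity(ac_seq: str, target_site: str) -> float:
    """Percentage of AC positions that form a Watson-Crick pair with the target site.

    Assumes both sequences are written 5'->3' and have the same length.
    """
    if len(ac_seq) != len(target_site):
        raise ValueError("AC and target site must have the same length")
    ideal_site = target_site_for(ac_seq)
    actual_site = target_site.upper().replace("T", "U")
    matches = sum(a == b for a, b in zip(ideal_site, actual_site))
    return 100.0 * matches / len(ac_seq)


# A CAG-repeat AC is fully complementary to a CUG-repeat site.
cag_ac = "CAG" * 3
cug_site = target_site_for(cag_ac)

# A 20-nucleotide AC with a single mismatch falls at the lower end of the 95% to 100% range.
ac_20 = "CAGCAGCAGCAGCAGCAGCA"
perfect_site = target_site_for(ac_20)
one_mismatch_site = "A" + perfect_site[1:]
```

In this sketch, `percent_complementarity(ac_20, one_mismatch_site)` evaluates a site with 19 of 20 paired positions, which corresponds to 95% complementarity.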

In embodiments, the AC includes a nucleotide sequence that includes 1 to 50 CAG repeats and that is a gapmer. Gapmers are oligonucleotides that are a DNA/RNA hybrid and that induce an RNase decay mechanism. For example, gapmers may have a central DNA or DNA mimic segment that is flanked by an RNA or RNA mimic segment on both the 5' and 3' ends of the DNA or DNA mimetic segment. In embodiments, the AC includes a gapmer that includes a nucleotide sequence that hybridizes with a target nucleic acid sequence of the target transcript that is separate from the expanded CUG repeat of the target transcript.
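The wing-gap-wing arrangement described above can be sketched as a small data structure. This is an illustration under stated assumptions: the class name, the labels, and the 3-6-3 segment lengths are hypothetical and are chosen only to keep the example short.

```python
# Illustrative sketch only; names and segment lengths are hypothetical.
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class GapmerLayout:
    """Lengths of the 5' wing, the central gap, and the 3' wing."""
    wing5: int
    gap: int
    wing3: int

    def annotate(self, seq: str) -> List[Tuple[str, str]]:
        """Label each position as RNA-like (wing) or DNA-like (central gap)."""
        if len(seq) != self.wing5 + self.gap + self.wing3:
            raise ValueError("sequence length does not match the layout")
        labels = (
            ["RNA-like"] * self.wing5
            + ["DNA-like"] * self.gap
            + ["RNA-like"] * self.wing3
        )
        return list(zip(seq, labels))


layout = GapmerLayout(wing5=3, gap=6, wing3=3)
annotated = layout.annotate("CAG" * 4)
```

Each entry of `annotated` pairs a base with the segment type it falls in, so the central DNA-like segment is flanked on both sides by RNA-like positions.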

In embodiments, an AC of the disclosure is a gapmer oligonucleotide as disclosed in U.S. Patent No. 9,550,988, the disclosure of which is incorporated by reference herein.

In embodiments, an AC of the disclosure comprises the sequence and/or structure of any one of the ACs targeting DMPK disclosed in U.S. Patent Publication No. 2017/0260524, the disclosure of which is incorporated by reference herein.

In embodiments, an AC of the disclosure comprises the sequence and/or structure of any one of the ACs or oligonucleotides disclosed in U.S. Patent Publications US20030235845A1, US20060099616A1, US 2013/0072671 A1, US 2014/0275212 A1, US 2009/0312532 A1, US20100125099A1, US 2010/0125099 A1, US 2009/0269755 A1, US 2011/0294753 A1, US 2012/0022134 A1, US 2011/0263682 A1, US 2014/0128592 A1, US 2015/0073037 A1, and US20120059042A1, the contents of each of which are incorporated herein in their entirety for all purposes.

When using an AC to target and/or hybridize to an expanded CTG-CUG repeat, care must be taken to avoid off-target effects where the AC unintentionally binds to off-target transcripts that include CTG-CUG repeats (e.g., a transcript that includes CTG-CUG repeats that are not an expanded CTG-CUG repeat). An in-silico analysis of the human genome reveals that, in total, 63 human genes have CTG-CUG repeats (Uhlen et al., Science 2015, 347(6220):1260419). The 63 genes can be ranked by the expression of mRNA plus the amount of protein expressed in total muscle (cardiac, skeletal, and smooth muscle). The expression level can be quantified using RPM (reads per million), where mRNA expression is FPKM (fragments per kilobase of transcript per million mapped fragments) and protein expression is pTPM (transcripts per million protein coding genes), using greater than 10 RPM as the cutoff for significant expression. FIG. 43 shows the results of such an in-silico analysis. Thirty-six genes show an expression level of > 10 RPM. Of the 36 genes, only three genes (besides DMPK) had > 10 CTG-CUG repeats. Genes with < 10 CTG-CUG repeats represent the lowest risk for off-target binding and toxicity. The number of CTG-CUG repeats (11-24) in these 3 genes (TCF4, CASK, MAP3K4) is nonetheless significantly lower than that seen in classic and congenital DM1 patients. For example, late-onset DM1 patients have 100-600 CTG-CUG repeats on DMPK, classical DM1 patients have 250-750 CTG-CUG repeats on DMPK, and congenital DM1 patients have 750-1,400 CTG-CUG repeats on DMPK. The same in-silico analysis may be performed for the liver and kidney. CASK is the only significant gene with > 10 CTG-CUG repeats in the kidney. No genes with > 10 CTG-CUG repeats were significant in the liver.
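The two-step filter described above (expression above 10 RPM, then more than 10 CTG-CUG repeats) can be sketched as follows. This is a minimal illustration, not the analysis behind FIG. 43: the record type, the function name, and the gene entries are hypothetical placeholders, and the numeric values are invented solely to exercise the filter.

```python
# Illustrative sketch only; gene names and values are placeholders, not data from the disclosure.
from typing import Iterable, List, NamedTuple


class RepeatGene(NamedTuple):
    name: str
    expression_rpm: float  # expression level in the tissue of interest
    repeat_count: int      # number of CTG-CUG repeats in the gene


def off_target_candidates(
    genes: Iterable[RepeatGene],
    min_rpm: float = 10.0,
    min_repeats: int = 10,
) -> List[str]:
    """Return names of genes that exceed both the expression and repeat-count cutoffs."""
    return [
        g.name
        for g in genes
        if g.expression_rpm > min_rpm and g.repeat_count > min_repeats
    ]


placeholder_genes = [
    RepeatGene("GENE_A", expression_rpm=25.0, repeat_count=14),  # passes both cutoffs
    RepeatGene("GENE_B", expression_rpm=40.0, repeat_count=6),   # too few repeats
    RepeatGene("GENE_C", expression_rpm=3.0, repeat_count=20),   # expression too low
]
flagged = off_target_candidates(placeholder_genes)
```

Running the same function on tissue-specific expression values would correspond to repeating the analysis for the liver or kidney, as the text describes.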

The ACs described herein may contain one or more asymmetric centers and thus give rise to enantiomers, diastereomers, and other stereoisomeric configurations that may be defined, in terms of absolute stereochemistry, as (R) or (S); α or β; or as (D) or (L). Included in the antisense compounds provided herein are all such possible isomers, as well as their racemic and optically pure forms.

The efficacy of the ACs may be assessed by evaluating the antisense activity effected by their administration. As used herein, the term "antisense activity" refers to any detectable and/or measurable activity attributable to the hybridization of an antisense compound to its target nucleotide sequence. Such detection and/or measuring may be direct or indirect. In embodiments, antisense activity is assessed by detecting and/or measuring the amount of the protein expressed from the transcript of interest. In embodiments, antisense activity is assessed by detecting and/or measuring the amount of the transcript of interest. In embodiments, antisense activity is assessed by detecting and/or measuring the amount of alternatively spliced RNA and/or the amount of protein isoforms translated from the target transcript.
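One way the amount of alternatively spliced RNA might be summarized is the percent spliced in (PSI) of an exon, i.e., the share of reads that include the exon. This metric is offered as an assumption for illustration and is not specified by the disclosure. The function name and read counts are hypothetical.

```python
# Illustrative sketch only; PSI is an assumed metric and the counts are placeholders.
def percent_spliced_in(inclusion_reads: int, exclusion_reads: int) -> float:
    """Percentage of reads supporting inclusion of an exon."""
    total = inclusion_reads + exclusion_reads
    if total == 0:
        raise ValueError("no reads cover this splicing event")
    return 100.0 * inclusion_reads / total


# Placeholder counts for an untreated and a treated sample.
psi_untreated = percent_spliced_in(inclusion_reads=30, exclusion_reads=70)
psi_treated = percent_spliced_in(inclusion_reads=80, exclusion_reads=20)
psi_change = psi_treated - psi_untreated
```

Comparing the value for treated and untreated samples gives an indirect readout of antisense activity at the level of splicing.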

AC mechanisms of modulation

In embodiments, the AC may modulate the activity and/or level of the target transcript within the cell. FIG. 1 shows exemplary mechanisms of how an AC can modulate the level and/or activity of a target transcript.

In embodiments, the AC may modulate the level of the target transcript within a cell. For example, in embodiments where the AC is a gapmer, binding of the AC to the target transcript induces the degradation of the target transcript via RNase H pathways (FIG. 1, arrows A and B). In embodiments, the gapmer hybridizes to a portion of the target transcript that is distinct from the expanded CUG trinucleotide repeat and thereby induces degradation of the target transcript via RNase H pathways (FIG. 1, arrow A). In embodiments, the gapmer hybridizes to at least a portion of the expanded CUG repeat within the target transcript and thereby induces degradation of the target transcript via RNase H pathways (FIG. 1, arrow B).

In embodiments, the AC may modulate the activity of the target transcript. Modulation of activity may include increasing or decreasing the ability of the target transcript to bind with a binding partner.

In embodiments, the AC may modulate the activity of the target transcript by decreasing the ability of the target transcript to bind with one or more proteins that may associate with the transcript, particularly proteins that associate with at least a portion of the expanded CUG repeat of a target transcript (FIG. 1, arrow C). In embodiments, decreasing the ability of the target transcript to bind with one or more proteins includes decreasing the affinity of the target transcript for the one or more proteins. In embodiments, decreasing the ability of the target transcript to bind with one or more proteins includes partial or full steric blocking of the target transcript from binding the one or more proteins. For example, the AC may occupy at least a portion of the binding site that may be occupied by the one or more proteins if not sterically blocked. In embodiments, the binding site of the one or more proteins to the target transcript includes at least a portion of the expanded trinucleotide repeat of the target transcript. As such, in embodiments, the AC may occupy at least a portion of the expanded trinucleotide repeat (e.g., expanded CUG repeat) that may be occupied by the one or more proteins if not sterically blocked. Partial steric blocking of the target transcript may result in decreased affinity between the target transcript and the one or more proteins. For example, in embodiments, the AC binds to at least a portion of the expanded CUG repeat of the target transcript, thereby sterically blocking and/or decreasing the affinity of the target transcript for a protein that may bind to the expanded CUG repeat (FIG. 1, arrow C). The following review article describes additional applications for steric blocking antisense oligonucleotides and is incorporated by reference herein in its entirety: Roberts et al. Nature Reviews Drug Discovery (2020) 19: 673-694.

The CUG repeats of an expanded CUG repeat may form a double stranded hairpin structure. In the disease state, proteins bind to the double stranded hairpin structure and become sequestered and unable to perform other functions. In embodiments, the AC binds to at least a portion of the double stranded hairpin structure, thereby sterically blocking and/or decreasing the affinity of the double stranded hairpin structure for a protein binding partner. In embodiments, the AC binds to at least a portion of a single stranded expanded CUG repeat, thereby inhibiting the formation of the double stranded hairpin structure and thus inhibiting one or more proteins from binding to the double stranded hairpin structure. In embodiments, hybridization of the AC to the double stranded hairpin structure sterically blocks and/or decreases the affinity of one or more proteins for binding to the double stranded hairpin structure. In embodiments, hybridization of the AC to at least a portion of a single stranded region of the expanded trinucleotide repeat inhibits the formation of a double stranded hairpin structure.

Decreasing the ability of the target transcript to bind with the one or more proteins may allow the one or more proteins to carry out other functions such as, for example, regulating splicing of downstream transcripts (transcripts that do not contain an expanded CUG repeat). As such, in embodiments, decreasing the ability of the target transcript to bind with the one or more proteins may increase the level of the one or more proteins within the cell that are available to provide other functions or function on other transcripts. In embodiments, decreasing the ability of the target transcript to bind with the one or more proteins may increase the cytosolic level of the one or more proteins within the cell that are available to provide other functions or function on other transcripts. Therefore, in embodiments, binding of the AC to the target transcript may result in the modulation of the level and/or activity of the one or more proteins that interact with the target transcript.

In embodiments, hybridization of the AC to at least a portion of the expanded CUG repeat decreases the affinity and/or sterically blocks the binding of MBNL1 to the target transcript. MBNL1 is a splicing factor that regulates the splicing of downstream gene transcripts. In a DM1 disease phenotype, MBNL1 binds to the expanded CUG repeat of a target transcript. While bound to the target transcript, MBNL1 is sequestered in the nucleus and is not able to regulate the splicing of downstream gene transcripts (transcripts that do not contain an expanded CUG repeat). In embodiments, hybridization of the AC to at least a portion of the expanded CUG repeat in the target transcript sterically blocks and/or decreases the affinity of MBNL1 for the target transcript, thereby allowing it to regulate splicing of downstream gene transcripts. In embodiments, hybridization of the AC to at least a portion of the expanded CUG repeat in the target transcript sterically blocks and/or decreases the affinity of MBNL1 for the target transcript, thereby increasing the amount of free (e.g., not bound to a transcript having a CUG repeat) MBNL1. In embodiments, hybridization of the AC to at least a portion of the expanded CUG repeat in the target transcript sterically blocks and/or decreases the affinity of MBNL1 for the target transcript, thereby decreasing the amount of MBNL1 bound to and sequestered by the target transcript.

In embodiments where the target transcript is DMPK, hybridization of the AC to at least a portion of the expanded CUG repeat decreases the affinity and/or sterically blocks the binding of MBNL1 to the target transcript. MBNL1 is a splicing factor that regulates the splicing of downstream gene transcripts. In a DM1 disease phenotype, MBNL1 binds to the expanded CUG repeat of DMPK1. While bound to DMPK1, MBNL1 is sequestered in the nucleus and is not able to regulate the splicing of downstream gene transcripts (transcripts that do not contain an expanded CUG repeat). In embodiments, hybridization of the AC to at least a portion of the expanded CUG repeat in DMPK sterically blocks and/or decreases the affinity of MBNL1 for the DMPK transcript, thereby allowing it to regulate splicing of downstream gene transcripts. In embodiments, hybridization of the AC to at least a portion of the expanded CUG repeat in DMPK sterically blocks and/or decreases the affinity of MBNL1 for the DMPK transcript, thereby increasing the amount of free (e.g., not bound to a transcript having a CUG repeat) MBNL1. In embodiments, hybridization of the AC to at least a portion of the expanded CUG repeat in DMPK sterically blocks and/or decreases the affinity of MBNL1 for the DMPK transcript, thereby decreasing the amount of MBNL1 bound to and sequestered by the DMPK1 transcript.

In embodiments where the target transcript is DMPK, hybridization of the AC to at least a portion of the expanded CUG repeat results in the decrease of CUGBP1 levels. In the DM1 disease state, the level of free (able to function) MBNL1 decreases while the level of free (able to function) CUGBP1 increases. An increase in CUGBP1 levels is associated with the disease state. As such, in embodiments, hybridization of the AC to at least a portion of the expanded CUG repeat results in increased levels of free (able to function) MBNL1 and/or a decrease in free (able to function) CUGBP1 levels.

Decreasing the ability of the target transcript to bind with the one or more proteins may reduce or inhibit the formation of CUG repeat foci. Transcripts that include expanded nucleotide repeats (e.g., expanded CUG repeats) may be transcribed and then sequestered in the nucleus. Within the nucleus, the sequestered transcripts may form aggregates. Proteins that bind to the transcript may then begin to nucleate on the sequestered transcript and/or sequestered transcript aggregate, thereby forming expanded nucleotide repeat (e.g., CUG repeat) foci. The CUG repeat foci may be visible using microscopy. In embodiments, decreasing the ability of the target transcript to bind with the one or more proteins may reduce or inhibit the formation of aggregates that include the target transcript. In embodiments, decreasing the ability of the target transcript to bind with the one or more proteins may reduce or inhibit the nucleation of the one or more proteins on the target transcript, on a double stranded hairpin region of a transcript, or on an aggregate of target transcripts. In embodiments where the target transcript is DMPK, decreasing the ability of the DMPK1 target transcript to bind with MBNL1 may reduce or inhibit the nucleation of MBNL1 on a DMPK1 target transcript or DMPK1 target transcript aggregate. In embodiments, hybridization of the AC to the target transcript may result in inhibition or reduction in formation of CUG repeat nuclear foci. In embodiments, hybridization of the AC to the target transcript may result in inhibition or reduction in formation of CUG repeat nuclear foci formed from a DMPK, TCF4, JPH3, and/or ATXN8OS/ATXN8 target transcript.

In embodiments, hybridization of the AC to the target transcript may result in the modulation of the level, expression, and/or activity of one or more downstream genes. For example, hybridization of the AC to the target transcript may be used to induce target transcript degradation or to sterically block or decrease the affinity of the target transcript for one or more proteins, thereby allowing the one or more proteins that were sequestered by the target transcript to regulate the expression, level, and/or activity of downstream genes. For example, in embodiments, the one or more proteins may include a protein that is involved in regulating the splicing of one or more downstream transcripts (transcripts that do not contain an expanded CUG repeat). In embodiments, the splicing of downstream transcripts is altered when the protein involved in splicing is bound and sequestered on the target transcript. For example, alteration of splicing may include the exclusion of one or more exons or the inclusion of one or more introns in a transcript, thereby leading to the expression of various protein isoforms. In embodiments, the alteration of splicing may result in the inclusion of an exon and/or intron that includes a premature stop codon, thereby resulting in a truncated isoform that may have no or deleterious activity. Alteration of downstream gene transcript splicing may lead to a change in level, folding, and/or activity of the downstream gene product which may be associated with a disease phenotype. When not bound to the target transcript comprising the expanded CUG repeat, the protein involved in splicing is free to regulate splicing, which may result in a correction (or rescue) of the splicing of the downstream gene transcript, thereby at least partially restoring the protein level, folding, and/or activity of the downstream gene product associated with a healthy phenotype.

In embodiments, AC hybridization to the target transcript may result in the modulation of the splicing of downstream gene transcripts that are regulated by proteins that are sequestered by the target transcript in a disease state associated with expanded nucleotide repeats (e.g., expanded trinucleotide repeats). In diseases associated with expanded trinucleotide repeats, downstream gene transcripts are often mis-processed, for example, mis-spliced. The mis-splicing of the downstream gene transcripts may lead to gene products that are destroyed before translation or translated into proteins that have aberrant structure and/or function. For example, sequestration of proteins that regulate the processing of downstream gene transcripts may lead to the inclusion of exons and/or introns with premature stop codons, the inclusion of introns, the exclusion of exons, and/or the inclusion of alternative exons, which may lead to a transcript and/or gene product that is destroyed prior to translation or that is translated into a gene product with aberrant function. The change in levels of the downstream gene transcript and/or gene product and/or the aberrant structure and/or function of the downstream gene products are associated with expanded trinucleotide disease phenotypes. In embodiments, AC hybridization to the target transcript may result in the modulation of exon inclusion, exon exclusion, intron inclusion, and/or intron exclusion in downstream transcripts whose splicing is regulated by proteins that are sequestered to the target transcript during a disease state. As such, hybridization of the AC to the target transcript may result in the upregulation of downstream protein isoforms and/or transcripts associated with a healthy phenotype. Similarly, hybridization of the AC to the target transcript may result in the downregulation (e.g., suppression) of downstream transcripts and/or protein isoforms associated with a disease phenotype.

In embodiments where the target transcript is DMPK, AC hybridization to the target transcript may result in the modulation of the splicing of downstream gene transcripts that are regulated by proteins that are sequestered by the DMPK target transcript during a disease state. In DM1, several downstream gene transcripts are mis-spliced. The mis-spliced genes are associated with the disease phenotype. As such, modulation of gene splicing may include correcting (e.g., rescuing) the splicing of genes to result in gene products of downstream genes that are associated with a healthy phenotype. In embodiments where the target transcript is DMPK, AC hybridization to the target transcript may result in the modulation of the splicing of downstream gene transcripts that are regulated by MBNL1, a splicing regulator that is sequestered by the DMPK target transcript during a disease state. In embodiments where the target transcript is DMPK, AC hybridization to the target transcript may result in the modulation of the splicing of downstream gene transcripts that are regulated by CUGBP1, a protein whose activity is affected by expanded CUG repeats. In embodiments where the target transcript is DMPK, AC hybridization to the target transcript may result in the correct processing (e.g., splicing) of downstream genes that are regulated by MBNL1 and/or CUGBP1. In embodiments where the target transcript is DMPK, AC hybridization to the target transcript may result in the modulation of splicing of downstream genes including, but not limited to, 4833439L19Rik, Abcc9, Atp2a1, Arhgef10, Arhgap28, Armcx6, Angel1, Best3, Bin1, Brd2, Cacna1s, Cacna2d1, Cpd, Cpeb3, Ccpg1, Clasp1, ClC-1, Clcn1, Clk4, Cpeb2, Camk2g, Capzb, Copz2, Coch, cTNT, Ctu2, Cyp2s1, Dctn4, Dnm1l, Eya4, Efna3, Efna2, Fbxo31, Fbxo21, Frem2, Fgd4, Fuca1, Fn1, Gogla4, Gpr37l1, Greb1, Heg1, Insr, Impdh2, IR, Itgav, Jag2, Klc1, Kcan6, Kif13a, Ldb3, Lrrfip2, Mapt, Macf1, Map3k4, Mapkap1, Mbnl1, Mllt3, Mbnl2, Mef2c, Mpdz, Mrpl1, Mxra7, Mybpc1, Myo9a, Ncapd3, Ngfr, Ndrg3, Ndufv3, Neb, Nfix, Numa1, Opa1, Pacsin2, Pcolce, Pdlim3, Pla2g15, Phactr4, Phka1, Phtf2, Ppp1r12b, Ppp3cc, Ppp1cc, Ramp2, Rapgef1, Rurl, Ryr1, Sorcs2, Spsb4, Scube2, Sema6c, Sfc8a3, Slain2, Sorbs1, Spag9, Tmem28, Tacc1, Tacc2, Ttc7, Tnik, Tnfrsf22, Tnfrsf25, Trappc9, Trim55, Ttn, Txnl4a, Txlnb, Ube2d3, Vsp39, or any combination thereof.

Mis-splicing of many of the above-mentioned downstream gene transcripts results in specific DM1 disease phenotypes. For example, MBNL1 is a splicing factor with loss of function in DM1 due to exon 5 inclusion. MBNL1 is sequestered by the DMPK CUG expansion and forms RNA nuclear foci. Additionally, SOS1 promotes Ras activation to positively regulate the RAS/MAPK signaling pathway. In DM1, exon 25 of SOS1 is excluded, leading to inhibition of muscle hypertrophy pathways. In DM1, IR/INSR has exon 11 exclusion, which results in higher levels of the low-signaling non-muscle isoform and a decreased metabolic response to insulin (insulin resistance). Similarly, exon 78 exclusion of DMD is observed in DM. Exon 78 exclusion results in an out-of-frame transcript at the C-terminal domain. This mutated protein is expressed in DM1 patients and associated with a mechanism responsible for muscle wasting in patients. BIN1 is required for proper muscle T-tubule formation (EC coupling). Exon 11 exclusion produces an inactive isoform and is found in DM1 patients. LDB3 interacts with α-actinin at the Z-disc in striated muscle and maintains muscle structure. Exon 11 inclusion of LDB3 detected in DM1 results in reduction of affinity for protein kinase C (PKC). Consequently, PKC becomes hyperactive in DM1. In embodiments, modulation of one or more downstream genes results in the correction, or rescue, of transcript splicing that is associated with a healthy phenotype. As such, in embodiments, hybridization of the AC to the DMPK target transcript results in the rescue of mis-splicing of downstream genes/transcripts, thereby reducing the level of downstream genes/transcripts associated with a disease phenotype. As such, in embodiments, hybridization of the AC to the DMPK target transcript results in the rescue of mis-splicing of downstream genes/transcripts, thereby increasing the level of downstream genes/transcripts associated with a healthy phenotype.

In embodiments, the AC inhibits expression of the target transcript. In embodiments, the AC inhibits expression of the target transcript by blocking the pre-mRNA processing machinery and/or translation machinery from accessing and/or completing translation and/or pre-mRNA processing. In embodiments, the AC inhibits expression of the target transcript by inducing degradation of the target transcript, for example, through RNase H pathways.

AC structure

The AC includes an oligonucleotide and/or an oligonucleoside. Oligonucleotides and/or oligonucleosides are nucleotides or nucleosides linked through internucleoside linkages. Nucleosides include a pentose sugar (e.g., ribose or deoxyribose) and a nitrogenous base covalently attached to the sugar. The naturally occurring bases (or traditional bases) found in DNA and/or RNA are adenine (A), guanine (G), thymine (T), cytosine (C), and uracil (U). The naturally occurring sugars (or traditional sugars) found in DNA and/or RNA are deoxyribose (DNA) and ribose (RNA). The naturally occurring internucleoside linkage (or traditional internucleoside linkage) is a phosphodiester bond. In embodiments, the ACs of the present disclosure may have all natural sugars, bases, and internucleoside linkages.

Chemically modified nucleosides are routinely used for incorporation into antisense compounds to enhance one or more properties, such as nuclease resistance, pharmacokinetics, or affinity for a target RNA. In embodiments, the ACs of the present disclosure may have one or more modified nucleosides. In embodiments, the ACs of the present disclosure may have one or more modified sugars. In embodiments, the ACs of the present disclosure may have one or more modified bases. In embodiments, the ACs of the present disclosure may have one or more modified internucleoside linkages.

In general, a nucleobase is any group that contains one or more atoms or groups of atoms capable of hydrogen bonding to a base of another nucleic acid. In addition to unmodified or natural nucleobases (A, G, T, C, and U), many modified nucleobases or nucleobase mimetics are known to those skilled in the art and are amenable to the compounds described herein. Generally, a modified nucleobase refers to a nucleobase that is fairly similar in structure to the parent nucleobase, such as, for example, a 7-deaza purine, a 5-methyl cytosine, 2-thio-dT (FIG. 2), or a G-clamp. Generally, a nucleobase mimetic is a nucleobase that includes a structure that is more complicated than a modified nucleobase, such as, for example, a tricyclic phenoxazine nucleobase mimetic. Methods for preparation of the above noted modified nucleobases are well known to those skilled in the art.

In embodiments, the AC may include one or more nucleosides having a modified sugar moiety. In embodiments, the furanosyl sugar of a natural nucleoside may have a 2' modification, modifications to make a constrained nucleoside, and others (see FIG. 2). For example, in embodiments, the furanosyl sugar ring of a natural nucleoside can be modified in a number of ways including, but not limited to, addition of a substituent group; bridging of two non-geminal ring atoms to form a bicyclic nucleic acid (BNA) or a locked nucleic acid; exchanging the oxygen of the furanosyl ring with C or N; and/or substitution of an atom or group (see FIG. 2). Modified sugars are well known and can be used to increase or decrease the affinity of the AC for its target nucleotide sequence. Modified sugars may also be used to increase AC resistance to nucleases. Sugars can also be replaced with sugar mimetic groups, among others. In embodiments, one or more sugars of the nucleosides of the AC is replaced with a methylenemorpholine ring as shown as 19 in FIG. 2.

In embodiments, the AC includes one or more nucleosides that include a bicyclic modified sugar (BNA; sometimes called bridged nucleic acids). Examples of BNAs include, but are not limited to, LNA (4'-(CH2)-O-2' bridge), 2'-thio-LNA (4'-(CH2)-S-2' bridge), 2'-amino-LNA (4'-(CH2)-NR-2' bridge), ENA (4'-(CH2)2-O-2' bridge), 4'-(CH2)3-2' bridged BNA, 4'-(CH2CH(CH3))-2' bridged BNA, cEt (4'-(CH(CH3))-O-2' bridge), and cMOE BNAs (4'-(CH(CH2OCH3))-O-2' bridge). BNAs have been prepared and disclosed in the patent literature as well as in scientific literature (Srivastava et al., J. Am. Chem. Soc. (2007), ACS Advanced online publication, 10.1021/ja071106y; Albaek et al., J. Org. Chem. (2006), 71, 7731-7740; Fluiter et al., Chembiochem (2005), 6, 1104-1109; Singh et al., Chem. Commun. (1998), 4, 455-456; Koshkin et al., Tetrahedron (1998), 54, 3607-3630; Wahlestedt et al., Proc. Natl. Acad. Sci. U.S.A. (2000), 97, 5633-5638; Kumar et al., Bioorg. Med. Chem. Lett. (1998), 8, 2219-2222; WO 94/14226; WO 2005/021570; Singh et al., J. Org. Chem. (1998), 63, 10035-10039; WO 2007/090071; U.S. Patent Nos. 7,053,207; 6,268,490; 6,770,748; 6,794,499; 7,034,133; and 6,525,191; and U.S. Pre-Grant Publication Nos. 2004-0171570; 2004-0219565; 2004-0014959; 2003-0207841; 2004-0143114; and 2003-0082807).

In embodiments, the AC includes one or more nucleosides that include a locked nucleic acid (LNA). In LNAs, the 2'-hydroxyl group of the ribosyl sugar ring is linked to the 4' carbon atom of the sugar ring, thereby forming a 2'-C,4'-C-oxymethylene linkage to form the bicyclic sugar moiety (reviewed in Elayadi et al., Curr. Opinion Invens. Drugs (2001), 2, 558-561; Braasch et al., Chem. Biol. (2001), 8, 1-7; and Orum et al., Curr. Opinion Mol. Ther. (2001), 3, 239-243; see also U.S. Patents: 6,268,490 and 6,670,461). The linkage can be a methylene (-CH2-) group bridging the 2' oxygen atom and the 4' carbon atom, for which the term LNA is used for the bicyclic moiety; in the case of an ethylene group in this position, the term ENA™ is used (Singh et al., Chem. Commun. (1998), 4, 455-456; ENA™; Morita et al., Bioorganic Medicinal Chemistry (2003), 11, 2211-2226). LNA and other bicyclic sugar analogs display very high duplex thermal stabilities with complementary DNA and RNA (Tm = +3 to +10 °C), stability towards 3'-exonucleolytic degradation, and good solubility properties. Potent and nontoxic antisense oligonucleotides containing LNAs have been described (Wahlestedt et al., Proc. Natl. Acad. Sci. U.S.A. (2000), 97, 5633-5638).

An isomer of LNA that has also been studied is alpha-L-LNA, which has been shown to have superior stability against a 3'-exonuclease. The alpha-L-LNAs were incorporated into antisense gapmers and chimeras that showed potent antisense activity (Frieden et al., Nucleic Acids Research (2003), 21, 6365-6372).

The synthesis and preparation of the LNA monomers adenine, cytosine, guanine, 5-methyl-cytosine, thymine, and uracil, along with their oligomerization and nucleic acid recognition properties, have been described (Koshkin et al., Tetrahedron, 1998, 54, 3607-3630). LNAs and preparation thereof are also described in WO 98/39352 and WO 99/14226.

Analogs of LNA, phosphorothioate-LNA and 2'-thio-LNAs, have also been prepared (Kumar et al., Bioorg. Med. Chem. Lett., 1998, 8, 2219-2222). Preparation of LNA analogs containing oligodeoxyribonucleotide duplexes as substrates for nucleic acid polymerases has also been described (Wengel et al., WO 99/14226). Furthermore, synthesis of 2'-amino-LNA, a conformationally restricted high-affinity oligonucleotide analog, has been described (Singh et al., J. Org. Chem. (1998), 63, 10035-10039). In addition, 2'-amino- and 2'-methylamino-LNAs have been prepared and the thermal stability of their duplexes with complementary RNA and DNA strands has been previously reported.

Methods for the preparation of modified sugars are well known to those skilled in the art. Some representative patents and publications that teach the preparation of such modified sugars include, but are not limited to, U.S. Patents: 4,981,957; 5,118,800; 5,319,080; 5,359,044; 5,393,878; 5,446,137; 5,466,786; 5,514,785; 5,519,134; 5,567,811; 5,576,427; 5,591,722; 5,597,909; 5,610,300; 5,627,053; 5,639,873; 5,646,265; 5,658,873; 5,670,633; 5,792,747; 5,700,920; and 6,600,032; and WO 2005/121371.

Intemucleoside Linkages

Described herein are intemucleoside linking groups that link the nucleosides or otherwise modified nucleoside monomer units together thereby forming an oligonucleotide and/or an oligonucleotide containing AC. The ACs may include naturally occurring intemucleoside linkages, unnatural intemucleoside linkages, or both.

In naturally occurring DNA and RNA, the internucleoside linking group is a phosphodiester that covalently links adjacent nucleosides to one another to form a linear polymeric compound. In naturally occurring DNA and RNA, phosphodiester is linked to the 2', 3' or 5' hydroxyl moiety of the sugar. Within oligonucleotides, the phosphate groups are commonly referred to as forming the internucleoside backbone of the oligonucleotide. In naturally occurring DNA and RNA, the linkage or backbone of RNA and DNA is a 3' to 5' phosphodiester linkage. In embodiments, the internucleoside linking groups of the ACs are phosphodiesters. In embodiments, the internucleoside linking groups of the ACs are 3' to 5' phosphodiester linkages.

The two main classes of unnatural internucleoside linking groups are defined by the presence or absence of a phosphorus atom. Representative phosphorus-containing internucleoside linkages include, but are not limited to, phosphotriesters, methylphosphonates, phosphoramidate, and phosphorothioates. Representative non-phosphorus-containing internucleoside linking groups include, but are not limited to, methylenemethylimino (-CH2-N(CH3)-O-CH2-); thiodiester (-O-C(O)-S-); thionocarbamate (-O-C(O)(NH)-S-); siloxane (-O-Si(H)2-O-); and N,N'-dimethylhydrazine (-CH2-N(CH3)-N(CH3)-). ACs having phosphorus internucleoside linking groups are referred to as oligonucleotides. Antisense compounds having non-phosphorus internucleoside linking groups are referred to as oligonucleosides. Modified internucleoside linkages, compared to natural phosphodiester linkages, can be used to alter, typically increase, nuclease resistance of the antisense compound. Internucleoside linkages having a chiral atom can be prepared as racemic, chiral, or as a mixture. Representative chiral internucleoside linkages include, but are not limited to, alkylphosphonates and phosphorothioates. Methods of preparation of phosphorus-containing and non-phosphorus-containing linkages are well known to those skilled in the art.

In embodiments, two or more nucleosides having modified sugars and/or modified nucleobases may be joined using a phosphoramidate. In embodiments, two or more nucleosides having a methylenemorpholine ring may be connected through a phosphoramidate internucleoside linkage.

Antisense compounds that include nucleobases with a methylenemorpholine ring that are linked through a phosphoramidate internucleoside linkage may be referred to as phosphoramidate morpholino oligomers (PMOs).

Conjugate Groups

In embodiments, ACs are modified by covalent attachment of one or more conjugate groups. In general, conjugate groups modify one or more properties of the attached AC including, but not limited to, pharmacodynamic, pharmacokinetic, binding, absorption, cellular distribution, cellular uptake, charge and clearance. Conjugate groups are routinely used in the chemical arts and are linked directly or via an optional linking moiety or linking group to a parent compound such as an AC. Conjugate groups include, without limitation, intercalators, reporter molecules, polyamines, polyamides, polyethylene glycols, thioethers, polyethers, cholesterols, thiocholesterols, cholic acid moieties, folate, lipids, phospholipids, biotin, phenazine, phenanthridine, anthraquinone, adamantane, acridine, fluoresceins, rhodamines, coumarins and dyes. In embodiments, the conjugate group is a polyethylene glycol (PEG), and the PEG is conjugated to either the AC or the CPP (CPP discussed elsewhere herein).

In embodiments, conjugate groups include lipid moieties such as a cholesterol moiety (Letsinger et al., Proc. Natl. Acad. Sci. USA (1989), 86, 6553); cholic acid (Manoharan et al., Bioorg. Med. Chem. Lett. (1994), 4, 1053); a thioether, e.g., hexyl-S-tritylthiol (Manoharan et al., Ann. N.Y. Acad. Sci. (1992), 660, 306; Manoharan et al., Bioorg. Med. Chem. Lett. (1993), 3, 2765); a thiocholesterol (Oberhauser et al., Nucl. Acids Res. (1992), 20, 533); an aliphatic chain, e.g., dodecandiol or undecyl residues (Saison-Behmoaras et al., EMBO J. (1991), 10, 111; Kabanov et al., FEBS Lett. (1990), 259, 327; Svinarchuk et al., Biochimie (1993), 75, 49); a phospholipid, e.g., di-hexadecyl-rac-glycerol or triethylammonium-1,2-di-O-hexadecyl-rac-glycero-3-H-phosphonate (Manoharan et al., Tetrahedron Lett. (1995), 36, 3651; Shea et al., Nucl. Acids Res. (1990), 18, 3777); a polyamine or a polyethylene glycol chain (Manoharan et al., Nucleosides & Nucleotides (1995), 14, 969); adamantane acetic acid (Manoharan et al., Tetrahedron Lett. (1995), 36, 3651); a palmityl moiety (Mishra et al., Biochim. Biophys. Acta (1995), 1264, 229); or an octadecylamine or hexylamino-carbonyl-oxycholesterol moiety (Crooke et al., J. Pharmacol. Exp. Ther. (1996), 277, 923).

Types of Antisense Compounds

Various types of AC may be used, including, for example, an antisense oligonucleotide, siRNA, microRNA, antagomir, aptamer, ribozyme, supermir, miRNA mimic, miRNA inhibitor, or combinations thereof.

Antisense Oligonucleotides

In various embodiments, the antisense compound (AC) is an antisense oligonucleotide (ASO) that is complementary to a target nucleotide sequence. The term antisense oligonucleotide (ASO), or simply antisense, is meant to include oligonucleotides that are complementary to a target nucleotide sequence. The term also encompasses ASOs that may not be fully complementary to the desired target nucleotide sequence. ASOs include single strands of DNA and/or RNA that are complementary to a chosen target nucleotide sequence or a target gene. ASOs may include one or more modified DNA and/or RNA bases, modified sugars, and/or unnatural internucleoside linkages. In embodiments, the ASOs may include one or more phosphoramidate internucleoside linkages. In embodiments, the ASO is a phosphoramidate morpholino oligomer (PMO). ASOs may have any characteristic, be any length, bind to any target nucleotide sequence and/or sequence element, and effect any mechanism as described relative to an AC.

Antisense oligonucleotides have been demonstrated to be effective as targeted inhibitors of protein synthesis and, consequently, can be used to specifically inhibit protein synthesis by a targeted gene. The efficacy of ASOs for inhibiting protein synthesis is well established. To date, these compounds have shown promise in several in vitro and in vivo models, including models of inflammatory disease, cancer, and HIV (Agrawal, Trends in Biotech. (1996), 14:376-387). Antisense can also affect cellular activity by hybridizing specifically with chromosomal DNA.

Methods of producing antisense oligonucleotides are known in the art and can be readily adapted to produce an antisense oligonucleotide that targets any polynucleotide sequence. Selection of antisense oligonucleotide sequences specific for a given target sequence is based upon analysis of the chosen target sequence and determination of secondary structure, Tm, binding energy, and relative stability. Antisense oligonucleotides may be selected based upon their relative inability to form dimers, hairpins, or other secondary structures that would reduce or prohibit specific binding to the target mRNA in a host cell. Target regions of the mRNA include those regions at or near the AUG translation initiation codon and those sequences that are substantially complementary to 5' regions of the mRNA. These secondary structure analyses and target site selection considerations can be performed, for example, using v.4 of the OLIGO primer analysis software (Molecular Biology Insights) and/or the BLASTN 2.0.5 algorithm software (Altschul et al., Nucleic Acids Res. 1997, 25(17):3389-402).
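
As a rough illustration of the sequence-level checks mentioned above (complementarity, Tm, and tendency to self-pair), the following Python sketch may be helpful. It is not the OLIGO or BLASTN software and is not a method taken from this disclosure: the function names are hypothetical, the Tm estimate uses the simple Wallace rule (a coarse approximation normally applied to short unmodified oligonucleotides), and the seven-repeat CUG target is an arbitrary example.

```python
# Illustrative sketch only; names and the example target are assumptions.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna.upper()))


def wallace_tm(seq: str) -> int:
    """Coarse Tm estimate (deg C) by the Wallace rule: 2*(A+T/U) + 4*(G+C)."""
    seq = seq.upper()
    weak = sum(seq.count(b) for b in "ATU")
    strong = sum(seq.count(b) for b in "GC")
    return 2 * weak + 4 * strong


def longest_self_complementary_run(seq: str) -> int:
    """Length of the longest stretch whose reverse complement also occurs in seq.

    A long run suggests that the oligonucleotide could form dimers or hairpins.
    """
    seq = seq.upper()
    best = 0
    for i in range(len(seq)):
        for j in range(i + 1, len(seq) + 1):
            if j - i > best and reverse_complement(seq[i:j]) in seq:
                best = j - i
    return best


target = "CUG" * 7                      # hypothetical expanded-repeat fragment
candidate = reverse_complement(target)  # fully complementary candidate
print(candidate, wallace_tm(candidate), longest_self_complementary_run(candidate))
```

In practice, chemically modified backbones such as PMO change hybridization behavior, so a nearest-neighbor model or experimental measurement would be needed for meaningful Tm values.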

RNA Interference

In embodiments, the AC includes a molecule that mediates RNA interference (RNAi). As used herein, the phrase "mediates RNAi" refers to the ability to silence, in a sequence-specific manner, a target transcript. While not wishing to be bound by theory, it is believed that silencing uses the RNAi machinery or process and a guide RNA, e.g., an siRNA compound of from about 21 to about 23 nucleotides. In embodiments, the AC targets the target transcript for degradation. As such, in embodiments, an RNAi molecule may be used to disrupt the expression of a gene or polynucleotide of interest. In embodiments, an RNAi molecule is used to induce degradation of the target transcript, such as a pre-mRNA or a mature mRNA.

In embodiments, the AC includes a small interfering RNA (siRNA) that elicits an RNAi response.

Small interfering RNAs (siRNAs) are nucleic acid duplexes normally from about 16 to about 30 nucleotides long that can associate with a cytoplasmic multi-protein complex known as RNAi-induced silencing complex (RISC). RISC loaded with siRNA mediates the degradation of homologous transcripts; therefore, siRNA can be designed to knock down protein expression with high specificity. Unlike other antisense technologies, siRNAs function through a natural mechanism evolved to control gene expression through non-coding RNA. A variety of RNAi reagents, including siRNAs targeting clinically relevant targets, are currently under pharmaceutical development, as described, e.g., in de Fougerolles, A. et al., Nature Reviews (2007) 6:443-453.

While the first described RNAi molecules were RNA:RNA hybrids that include both an RNA sense and an RNA antisense strand, it has now been demonstrated that DNA sense:RNA antisense hybrids, RNA sense:DNA antisense hybrids, and DNA:DNA hybrids are capable of mediating RNAi (Lamberton, J.S. and Christian, A.T., Molecular Biotechnology (2003), 24:111-119). In embodiments, RNAi molecules are used that include any of these different types of double-stranded molecules. In addition, it is understood that RNAi molecules may be used and introduced to cells in a variety of forms. Accordingly, as used herein, RNAi molecules encompasses any and all molecules capable of mediating RNAi in cells, including, but not limited to, double-stranded oligonucleotides that include two separate strands, i.e., a sense strand and an antisense strand, e.g., small interfering RNA (siRNA); double-stranded oligonucleotides that include two separate strands that are linked together by a non-nucleotidyl linker; oligonucleotides that include a hairpin loop of complementary sequences, which forms a double-stranded region, e.g., shRNAi molecules; and expression vectors that express one or more polynucleotides capable of forming a double-stranded polynucleotide alone or in combination with another polynucleotide.

A single strand siRNA compound, as used herein, is an siRNA compound which is made up of a single molecule. It may include a duplexed region, formed by intra-strand pairing, e.g., it may be, or include, a hairpin or pan-handle structure. Single strand siRNA compounds may be antisense with regard to the target molecule.

A single strand siRNA compound may be sufficiently long that it can enter the RISC and participate in RISC-mediated cleavage of a target mRNA. A single strand siRNA compound is at least about 14, at least about 15, at least about 20, at least about 25, at least about 30, at least about 35, at least about 40, or up to about 50 nucleotides in length. In certain embodiments, the single strand siRNA is less than about 200, about 100, or about 60 nucleotides in length.

Hairpin siRNA compounds may have a duplex region equal to or at least about 17, about 18, about 19, about 20, about 21, about 22, about 23, about 24, or about 25 nucleotide pairs. The duplex region may be equal to or less than about 200, about 100, or about 50 nucleotide pairs in length. In certain embodiments, ranges for the duplex region are from about 15 to about 30, from about 17 to about 23, from about 19 to about 23, and from about 19 to about 21 nucleotide pairs in length. The hairpin may have a single strand overhang or terminal unpaired region. In certain embodiments, the overhangs are from about 2 to about 3 nucleotides in length. In embodiments, the overhang is at the same side of the hairpin and in embodiments on the antisense side of the hairpin.

A double stranded siRNA compound, as used herein, is an siRNA compound which includes more than one, and in some cases two, strands in which interchain hybridization can form a region of duplex structure.

The antisense strand of a double stranded siRNA compound may be equal to or at least about 14, about 15, about 16, about 17, about 18, about 19, about 20, about 25, about 30, about 40, or about 60 nucleotides in length. It may be equal to or less than about 200, about 100, or about 50 nucleotides in length. Ranges may be from about 17 to about 25, from about 19 to about 23, and from about 19 to about 21 nucleotides in length. As used herein, the term antisense strand means the strand of an siRNA compound that is sufficiently complementary to a target molecule, e.g., the target nucleotide sequence of a target transcript.

The sense strand of a double stranded siRNA compound may be equal to or at least about 14, about 15, about 16, about 17, about 18, about 19, about 20, about 25, about 30, about 40, or about 60 nucleotides in length. It may be equal to or less than about 200, about 100, or about 50 nucleotides in length. Ranges may be from about 17 to about 25, from about 19 to about 23, and from about 19 to about 21 nucleotides in length.

The double strand portion of a double stranded siRNA compound may be equal to or at least about 14, about 15, about 16, about 17, about 18, about 19, about 20, about 21, about 22, about 23, about 24, about 25, about 30, about 40, or about 60 nucleotide pairs in length. It may be equal to or less than about 200, about 100, or about 50 nucleotide pairs in length. Ranges may be from about 15 to about 30, from about 17 to about 23, from about 19 to about 23, and from about 19 to about 21 nucleotide pairs in length.

In embodiments, the siRNA compound is sufficiently large that it can be cleaved by an endogenous molecule, e.g., by Dicer, to produce smaller siRNA compounds, e.g., siRNA agents.

The sense and antisense strands may be chosen such that the double-stranded siRNA compound includes a single strand or unpaired region at one or both ends of the molecule. Thus, a double-stranded siRNA compound may contain sense and antisense strands, paired to contain an overhang, e.g., one or two 5' or 3' overhangs, or a 3' overhang of 1 to 3 nucleotides. The overhangs can be the result of one strand being longer than the other, or the result of two strands of the same length being staggered. Some embodiments will have at least one 3' overhang. In embodiments, both ends of an siRNA molecule will have a 3' overhang. In embodiments, the overhang is 2 nucleotides.
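
The staggered arrangement described above, in which two strands of equal length are offset to leave a 3' overhang at each end, can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the 19-base-pair duplex, the 2-nucleotide overhang defaults, and the 23-nucleotide target fragment are all hypothetical choices, and unmodified RNA bases are assumed.

```python
# Illustrative sketch only; names, defaults and the example target are assumptions.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna.upper()))


def design_staggered_duplex(target: str, duplex_len: int = 19, overhang: int = 2):
    """Return (sense, antisense), both 5'->3', from a target fragment.

    The target must span duplex_len + 2 * overhang nucleotides. Both strands
    have the same length and are offset by `overhang`, so each strand carries
    a 3' overhang of `overhang` nucleotides.
    """
    target = target.upper()
    if len(target) != duplex_len + 2 * overhang:
        raise ValueError("target length must equal duplex_len + 2 * overhang")
    sense = target[overhang:]                                     # drops the 5' end
    antisense = reverse_complement(target[:duplex_len + overhang])  # drops the 3' end
    return sense, antisense


sense, antisense = design_staggered_duplex("AUGGCUAGCUAGGCUAACGUACG")
print(len(sense), len(antisense))
```

Here the first 19 nucleotides of each strand pair with the other strand, and the last 2 nucleotides of each strand remain unpaired as the 3' overhang.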

In embodiments, the length for the duplexed region is from about 15 to about 30, or about 18, about 19, about 20, about 21, about 22, or about 23 nucleotides in length, e.g., in the ssiRNA (siRNA with sticky overhangs) compound range discussed above. ssiRNA compounds can resemble in length and structure the natural Dicer-processed products from long dsiRNAs. Embodiments in which the two strands of the ssiRNA compound are linked, e.g., covalently linked, are also included. In embodiments, hairpin or other single strand structures which provide a double stranded region and a 3' overhang are included.

The siRNA compounds described herein, including double-stranded siRNA compounds and single-stranded siRNA compounds, can mediate silencing of a target RNA, e.g., mRNA, e.g., a transcript of a gene that encodes a protein. For convenience, such mRNA is also referred to herein as mRNA to be silenced. Such a gene is also referred to as a target gene. In general, the RNA to be silenced is an endogenous gene.

In embodiments, an siRNA compound is sufficiently complementary to a target transcript, such that the siRNA compound silences production of protein encoded by the target mRNA. In embodiments, the siRNA compound is sufficiently complementary to at least a portion of a target transcript, such that the siRNA compound silences production of the gene product encoded by the target transcript. In another embodiment, the siRNA compound is exactly complementary to a target nucleotide sequence (e.g., a portion of a target transcript) such that the target nucleotide sequence and the siRNA compound anneal, for example to form a hybrid made exclusively of Watson-Crick base pairs in the region of exact complementarity. An siRNA compound that is sufficiently complementary to a target nucleotide sequence can include an internal region (e.g., of at least about 10 nucleotides) that is exactly complementary to a target nucleotide sequence. Moreover, in certain embodiments, the siRNA compound specifically discriminates a single-nucleotide difference. In this case, the siRNA compound only mediates RNAi if exact complementarity is found in the region of (e.g., within 7 nucleotides of) the single-nucleotide difference.
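
The single-nucleotide discrimination criterion above (exact complementarity within about 7 nucleotides of the difference) can be expressed as a simple window check. The sketch below is only an illustration: the function name, the equal-length assumption, and the example sequences are hypothetical, and real discrimination depends on factors that a string comparison does not capture.

```python
# Illustrative sketch only; names and sequences are assumptions.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna.upper()))


def exact_match_near_snp(antisense: str, target_site: str,
                         snp_index: int, window: int = 7) -> bool:
    """True if the antisense strand pairs exactly with the target site at every
    position within `window` nucleotides of `snp_index` (0-based, in the target)."""
    if len(antisense) != len(target_site):
        raise ValueError("strand and target site must have equal length")
    ideal_target = reverse_complement(antisense)  # target that would pair perfectly
    lo = max(0, snp_index - window)
    hi = min(len(target_site), snp_index + window + 1)
    return ideal_target[lo:hi] == target_site.upper()[lo:hi]


wild_type = "GGCUAGCUAGGCUAACGUA"
variant = wild_type[:9] + "A" + wild_type[10:]   # single change at index 9
guide = reverse_complement(wild_type)
print(exact_match_near_snp(guide, wild_type, 9), exact_match_near_snp(guide, variant, 9))
```

With these inputs, the guide designed against the hypothetical wild-type site passes the check for the wild-type sequence and fails it for the variant.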

The therapeutic applications of RNAi are extremely broad, since siRNA and miRNA constructs can be synthesized with any nucleotide sequence directed against a target protein. To date, siRNA constructs have shown the ability to specifically down-regulate target proteins in both in vitro and in vivo models, as well as in clinical studies.

MicroRNAs

In embodiments, the AC includes a microRNA molecule. MicroRNAs (miRNAs) are a highly conserved class of small RNA molecules that are transcribed from DNA in the genomes of plants and animals but are not translated into protein. Processed miRNAs are single stranded 17-25 nucleotide RNA molecules that become incorporated into the RNA-induced silencing complex (RISC) and have been identified as key regulators of development, cell proliferation, apoptosis and differentiation. They are believed to play a role in regulation of gene expression by binding to the 3'-untranslated region of specific mRNAs. RISC mediates down-regulation of gene expression through translational inhibition, transcript cleavage, or both. RISC is also implicated in transcriptional silencing in the nucleus of a wide range of eukaryotes.

Antagomirs

In embodiments, the AC is an antagomir. Antagomirs are RNA-like oligonucleotides that harbor various modifications for RNAse protection and pharmacologic properties, such as enhanced tissue and cellular uptake. They differ from normal RNA by, for example, complete 2'-O-methylation of sugar, phosphorothioate backbone and, for example, a cholesterol moiety at the 3'-end. Antagomirs may be used to efficiently silence endogenous miRNAs by forming duplexes that include the antagomir and endogenous miRNA, thereby preventing miRNA-induced gene silencing. An example of antagomir-mediated miRNA silencing is the silencing of miR-122, described in Krutzfeldt et al., Nature (2005), 438: 685-689, which is expressly incorporated by reference herein in its entirety. Antagomir RNAs may be synthesized using standard solid phase oligonucleotide synthesis protocols (U.S. Patent Application Ser. Nos. 11/502,158 and 11/657,341; the disclosure of each of which is incorporated herein by reference).

An antagomir can include ligand-conjugated monomer subunits and monomers for oligonucleotide synthesis. Monomers are described in U.S. Application No. 10/916,185. An antagomir can have a ZXY structure, such as is described in PCT Application No. PCT/US2004/07070. An antagomir can be complexed with an amphipathic moiety. Amphipathic moieties for use with oligonucleotide agents are described in PCT Application No. PCT/US2004/07070.

Aptamers

In embodiments, the AC includes an aptamer. Aptamers are nucleic acid or peptide molecules that bind to a particular molecule of interest with high affinity and specificity (Tuerk and Gold, Science 249:505 (1990); Ellington and Szostak, Nature 346:818 (1990)). DNA or RNA aptamers have been successfully produced which bind many different entities from large proteins to small organic molecules (Eaton, Curr. Opin. Chem. Biol. (1997), 1: 10-16; Famulok, Curr. Opin. Struct. Biol. (1999), 9:324-9; and Hermann and Patel, Science (2000), 287:820-5). Aptamers may be RNA or DNA based and may include a riboswitch. A riboswitch is a part of an mRNA molecule that can directly bind a small target molecule, and whose binding of the target affects the gene's activity. Thus, an mRNA that contains a riboswitch is directly involved in regulating its own activity, depending on the presence or absence of its target molecule. Generally, aptamers are engineered through repeated rounds of in vitro selection or, equivalently, SELEX (systematic evolution of ligands by exponential enrichment) to bind to various molecular targets such as small molecules, proteins, nucleic acids, and even cells, tissues and organisms. The aptamer may be prepared by any known method, including synthetic, recombinant, and purification methods, and may be used alone or in combination with other aptamers specific for the same target. Further, the term aptamer also includes secondary aptamers containing a consensus sequence derived from comparing two or more known aptamers to a given target. In embodiments, the aptamer is an "intracellular aptamer", or "intramer", which specifically recognizes intracellular targets (Famulok et al., Chem Biol. (2001), 8(10):931-939; Yoon and Rossi, Adv. Drug Deliv. Rev. (2018), 134:22-35; each incorporated by reference herein).

Ribozymes

In embodiments, the AC is a ribozyme. Ribozymes are RNA molecule complexes having specific catalytic domains that possess endonuclease activity (Kim and Cech, Proc. Natl. Acad. Sci. USA (1987), 84(24):8788-92; Forster and Symons, Cell (1987) 24, 49(2):211-20). For example, a large number of ribozymes accelerate phosphoester transfer reactions with a high degree of specificity, often cleaving only one of several phosphoesters in an oligonucleotide substrate (Cech et al., Cell (1981), 27(3 Pt 2):487-96; Michel and Westhof, J. Mol. Biol. (1990), 5, 216(3):585-610; Reinhold-Hurek and Shub, Nature (1992), 14, 357(6374):173-6). This specificity has been attributed to the requirement that the substrate bind via specific base-pairing interactions to the internal guide sequence (IGS) of the ribozyme prior to chemical reaction.

At least six basic varieties of naturally occurring enzymatic RNAs are known presently. Each can catalyze the hydrolysis of RNA phosphodiester bonds in trans (and thus can cleave other RNA molecules) under physiological conditions. In general, enzymatic nucleic acids act by first binding to a target RNA. Such binding occurs through the target binding portion of an enzymatic nucleic acid which is held in close proximity to an enzymatic portion of the molecule that acts to cleave the target RNA. Thus, the enzymatic nucleic acid first recognizes and then binds a target RNA through complementary base-pairing, and once bound to the correct site, acts enzymatically to cut the target RNA. Strategic cleavage of such a target RNA will destroy its ability to direct synthesis of an encoded protein. After an enzymatic nucleic acid has bound and cleaved its RNA target, it is released from that RNA to search for another target and can repeatedly bind and cleave new targets.

The enzymatic nucleic acid molecule may be formed in a hammerhead, hairpin, hepatitis δ virus, group I intron or RNaseP RNA (in association with an RNA guide sequence) or Neurospora VS RNA motif, for example. Specific examples of hammerhead motifs are described by Rossi et al., Nucleic Acids Res. (1992), 20(17):4559-65. Examples of hairpin motifs are described by Hampel et al. (Eur. Pat. Appl. Publ. No. EP 0360257), Hampel and Tritz, Biochemistry (1989), 28(12):4929-33; Hampel et al., Nucleic Acids Res. (1990), 18(2):299-304 and U.S. Patent 5,631,359. An example of the hepatitis δ virus motif is described by Perrotta and Been, Biochemistry (1992), 31(47):11843-52; an example of the RNaseP motif is described by Guerrier-Takada et al., Cell (1983), 35(3 Pt 2):849-57; the Neurospora VS RNA ribozyme motif is described by Collins (Saville and Collins, Cell (1990), 61(4):685-96; Saville and Collins, Proc. Natl. Acad. Sci. USA (1991), 88(19):8826-30; Collins and Olive, Biochemistry (1993), 32(11):2795-9); and an example of the Group I intron is described in U.S. Patent 4,987,071. In embodiments, enzymatic nucleic acid molecules have a specific substrate binding site which is complementary to one or more of the target gene DNA or RNA regions, and they have nucleotide sequences within or surrounding that substrate binding site which impart an RNA cleaving activity to the molecule. Thus, the ribozyme constructs need not be limited to specific motifs mentioned herein.

Ribozymes may be designed as described in Int. Pat. Appl. Publ. No. WO 93/23569 and Int. Pat. Appl. Publ. No. WO 94/02595, each specifically incorporated herein by reference, and synthesized to be tested in vitro and in vivo, as described therein. In embodiments, the ribozyme is targeted to a target nucleotide sequence of a target transcript.

Ribozyme activity can be increased by altering the length of the ribozyme binding arms or chemically synthesizing ribozymes with modifications that prevent their degradation by serum ribonucleases (see, e.g., Int. Pat. Appl. Publ. No. WO 92/07065; Int. Pat. Appl. Publ. No. WO 93/15187; Int. Pat. Appl. Publ. No. WO 91/03162; Eur. Pat. Appl. Publ. No. 92110298.4; U.S. Patent 5,334,711; and Int. Pat. Appl. Publ. No. WO 94/13688, which describe various chemical modifications that can be made to the sugar moieties of enzymatic RNA molecules), modifications which enhance their efficacy in cells, and removal of stem II bases to shorten RNA synthesis times and reduce chemical requirements.

Supermir

In embodiments, the AC is a supermir. A supermir refers to a single stranded, double stranded, or partially double stranded oligomer or polymer of RNA, polymer of DNA, or both, or modifications thereof, which has a nucleotide sequence that is substantially identical to an miRNA and that is antisense with respect to its target. This term includes oligonucleotides composed of naturally occurring nucleobases, sugars and covalent internucleoside (backbone) linkages and which contain at least one non-naturally-occurring portion which functions similarly. Such modified or substituted oligonucleotides have desirable properties such as, for example, enhanced cellular uptake, enhanced affinity for nucleic acid target and increased stability in the presence of nucleases. In embodiments, the supermir does not include a sense strand, and in another embodiment, the supermir does not self-hybridize to a significant extent. A supermir can have secondary structure, but it is substantially single-stranded under physiological conditions. A supermir that is substantially single-stranded is single-stranded to the extent that less than about 50% (e.g., less than about 40%, about 30%, about 20%, about 10%, or about 5%) of the supermir is duplexed with itself. The supermir can include a hairpin segment, e.g., sequence, for example, at the 3' end can self-hybridize and form a duplex region, e.g., a duplex region of at least about 1, about 2, about 3, or about 4 or less than about 8, about 7, about 6, or about 5 nucleotides, or about 5 nucleotides. The duplexed region can be connected by a linker, e.g., a nucleotide linker, e.g., about 3, about 4, about 5, or about 6 dTs, e.g., modified dTs. In another embodiment the supermir is duplexed with a shorter oligo, e.g., of about 5, about 6, about 7, about 8, about 9, or about 10 nucleotides in length, e.g., at one or both of the 3' and 5' end or at one end and in the non-terminal or middle of the supermir.

miRNA mimics

In embodiments, the AC is a miRNA mimic. miRNA mimics represent a class of molecules that can be used to imitate the gene silencing ability of one or more miRNAs. Thus, the term microRNA mimic refers to synthetic non-coding RNAs (i.e., the miRNA is not obtained by purification from a source of the endogenous miRNA) that are capable of entering the RNAi pathway and regulating gene expression. miRNA mimics can be designed as mature molecules (e.g., single stranded) or mimic precursors (e.g., pri- or pre-miRNAs). miRNA mimics can include nucleic acid (modified or modified nucleic acids) including oligonucleotides that include, without limitation, RNA, modified RNA, DNA, modified DNA, locked nucleic acids, or 2'-O,4'-C-ethylene-bridged nucleic acids (ENA), or any combination of the above (including DNA-RNA hybrids). In addition, miRNA mimics can include conjugates that can affect delivery, intracellular compartmentalization, stability, specificity, functionality, strand usage, and/or potency. In one design, miRNA mimics are double stranded molecules (e.g., with a duplex region of between about 16 and about 31 nucleotides in length) and contain one or more sequences that have identity with the mature strand of a given miRNA. Modifications can include 2' modifications (including 2'-O methyl modifications and 2' F modifications) on one or both strands of the molecule and internucleoside modifications (e.g., phosphorothioate modifications) that enhance nucleic acid stability and/or specificity. In addition, miRNA mimics can include overhangs. The overhangs can include from about 1 to about 6 nucleotides on either the 3' or 5' end of either strand and can be modified to enhance stability or functionality.
In embodiments, a miRNA mimic includes a duplex region of from about 16 to about 31 nucleotides and one or more of the following chemical modification patterns: the sense strand contains 2'-O-methyl modifications of nucleotides 1 and 2 (counting from the 5' end of the sense oligonucleotide), and all of the Cs and Us; the antisense strand modifications can include 2' F modification of all of the Cs and Us, phosphorylation of the 5' end of the oligonucleotide, and stabilized internucleoside linkages associated with a 2 nucleotide 3' overhang.
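
The sugar modification pattern just described can be made concrete with a small annotation routine. The sketch below is a hypothetical illustration only: the function names and the short example strands are assumptions, the labels "2'-OMe" and "2'-F" are shorthand chosen here, and the 5' phosphorylation and stabilized overhang linkages are not modeled.

```python
# Illustrative sketch only; names, labels and example strands are assumptions.
from typing import List, Optional, Tuple

Annotated = List[Tuple[str, Optional[str]]]


def annotate_sense(sense: str) -> Annotated:
    """2'-O-methyl on nucleotides 1 and 2 (from the 5' end) and on every C and U."""
    return [
        (base, "2'-OMe" if index < 2 or base in "CU" else None)
        for index, base in enumerate(sense.upper())
    ]


def annotate_antisense(antisense: str) -> Annotated:
    """2'-fluoro on every C and U of the antisense strand."""
    return [(base, "2'-F" if base in "CU" else None) for base in antisense.upper()]


for base, mod in annotate_sense("AGCUA"):
    print(base, mod or "unmodified")
```

For the hypothetical sense strand AGCUA, positions 1 and 2 (A, G) are marked because of their position, C and U are marked because of their identity, and the final A is left unmodified.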

miRNA inhibitor

In embodiments, the AC is a miRNA inhibitor. The terms antimir, microRNA inhibitor, miR inhibitor, and miRNA inhibitor are synonymous and refer to oligonucleotides or modified oligonucleotides that interfere with the ability of specific miRNAs. In general, the inhibitors are nucleic acid or modified nucleic acids in nature including oligonucleotides that include RNA, modified RNA, DNA, modified DNA, locked nucleic acids (LNAs), or any combination of the above. Modifications include 2' modifications (including 2'-O alkyl modifications and 2' F modifications) and internucleoside modifications (e.g., phosphorothioate modifications) that can affect delivery, stability, specificity, intracellular compartmentalization, or potency. In addition, miRNA inhibitors can include conjugates that can affect delivery, intracellular compartmentalization, stability, and/or potency. Inhibitors can adopt a variety of configurations including single stranded, double stranded (RNA/RNA or RNA/DNA duplexes), and hairpin designs. In general, microRNA inhibitors contain one or more sequences or portions of sequences that are complementary or partially complementary with the mature strand (or strands) of the miRNA to be targeted. In addition, the miRNA inhibitor may also include additional sequences located 5' and 3' to the sequence that is the reverse complement of the mature miRNA. The additional sequences may be the reverse complements of the sequences that are adjacent to the mature miRNA in the pri-miRNA from which the mature miRNA is derived, or the additional sequences may be arbitrary sequences (having a mixture of A, G, C, or U). In embodiments, one or both of the additional sequences are arbitrary sequences capable of forming hairpins. Thus, in embodiments, the sequence that is the reverse complement of the miRNA is flanked on the 5' side and on the 3' side by hairpin structures.
Micro-RNA inhibitors, when double stranded, may include mismatches between nucleotides on opposite strands. Furthermore, micro-RNA inhibitors may be linked to conjugate moieties in order to facilitate uptake of the inhibitor into a cell. For example, a micro-RNA inhibitor may be linked to cholesteryl 5-(bis(4-methoxyphenyl)(phenyl)methoxy)-3-hydroxypentylcarbamate, which allows passive uptake of a micro-RNA inhibitor into a cell. Micro-RNA inhibitors, including hairpin miRNA inhibitors, are described in detail in Vermeulen et al., RNA 13: 723-730 (2007) and in WO 2007/095387 and WO 2008/036825, each of which is incorporated herein by reference in its entirety. A person of ordinary skill in the art can select a sequence from the database for a desired miRNA and design an inhibitor useful for the methods disclosed herein.
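
The inhibitor layout described above (a core that is the reverse complement of the mature miRNA, optionally flanked by additional 5' and 3' sequences) can be sketched in a few lines. The function name is hypothetical, and the mature sequence and flanks below are arbitrary placeholders rather than entries from any miRNA database.

```python
# Illustrative sketch only; names and sequences are placeholders.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna.upper()))


def build_inhibitor(mature_mirna: str, flank_5: str = "", flank_3: str = "") -> str:
    """Return 5'-flank + reverse complement of the mature miRNA + 3'-flank."""
    return flank_5.upper() + reverse_complement(mature_mirna) + flank_3.upper()


placeholder_mirna = "UAGCUUAUCAGACUGAUGUUGA"   # arbitrary example, not a database entry
print(build_inhibitor(placeholder_mirna, flank_5="GG", flank_3="CC"))
```

Flanking sequences intended to form hairpins would need their own self-complementary design, which this sketch does not attempt.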

Linking groups or bifunctional linking moieties such as those known in the art are amenable to the compounds provided herein. Linking groups are useful for attachment of chemical functional groups, conjugate groups, reporter groups and other groups to selective sites in a parent compound such as for example an AC. In general, a bifunctional linking moiety includes a hydrocarbyl moiety having two functional groups. One of the functional groups is selected to bind to a parent molecule or compound of interest and the other is selected to bind essentially any selected group such as a chemical functional group or a conjugate group. Any of the linkers described here may be used. In embodiments, the linker includes a chain structure or an oligomer of repeating units such as ethylene glycol or amino acid units. Examples of functional groups that are routinely used in a bifunctional linking moiety include, but are not limited to, electrophiles for reacting with nucleophilic groups and nucleophiles for reacting with electrophilic groups. In embodiments, bifunctional linking moieties include amino, hydroxyl, carboxylic acid, thiol, unsaturations (e.g., double or triple bonds), and the like. Some nonlimiting examples of bifunctional linking moieties include 8-amino-3,6-dioxaoctanoic acid (ADO), succinimidyl 4-(N-maleimidomethyl) cyclohexane-1-carboxylate (SMCC) and 6-aminohexanoic acid (AHEX or AHA). Other linking groups include, but are not limited to, substituted C1-C10 alkyl, substituted or unsubstituted C2-C10 alkenyl or substituted or unsubstituted C2-C10 alkynyl, wherein a nonlimiting list of substituent groups includes hydroxyl, amino, alkoxy, carboxy, benzyl, phenyl, nitro, thiol, thioalkoxy, halogen, alkyl, aryl, alkenyl and alkynyl.

In embodiments, the AC includes nucleotide modifications designed not to support RNase H activity. Nucleotide modifications of antisense compounds that do not support RNase H activity are known and include, but are not limited to, 2'-O-methoxyethyl/phosphorothioate (MOE) modifications. Advantageously, ACs with MOE modifications have increased affinity for target RNA and increased nuclease stability.

Immunostimulatory Oligonucleotides

In embodiments, the therapeutic moiety is an immunostimulatory oligonucleotide. Immunostimulatory oligonucleotides (ISS; single- or double-stranded) are capable of inducing an immune response when administered to a patient, which may be a mammal or other patient. ISS include, e.g., certain palindromes leading to hairpin secondary structures (see Yamamoto S., et al. (1992) J. Immunol. 148: 4072-4076), or CpG motifs, as well as other known ISS features (such as multi-G domains, see WO 96/11266).

The immune response may be an innate or an adaptive immune response. The immune system is divided into an innate immune system and the acquired, adaptive immune system of vertebrates, the latter of which is further divided into humoral and cellular components. In particular embodiments, the immune response may be mucosal.

Immunostimulatory nucleic acids are considered to be non-sequence specific when it is not required that they specifically bind to and reduce the expression of a target polynucleotide in order to provoke an immune response. Thus, certain immunostimulatory nucleic acids may include a sequence corresponding to a region of a naturally occurring gene or mRNA, but they may still be considered non-sequence specific immunostimulatory nucleic acids.

In embodiments, the immunostimulatory nucleic acid or oligonucleotide includes at least one CpG dinucleotide. The oligonucleotide or CpG dinucleotide may be unmethylated or methylated. In another embodiment, the immunostimulatory nucleic acid includes at least one CpG dinucleotide having a methylated cytosine. In embodiments, the nucleic acid includes a single CpG dinucleotide, wherein the cytosine in said CpG dinucleotide is methylated. In a specific embodiment, the nucleic acid includes the sequence 5' TAACGTTGAGGG’CAT 3' (SEQ ID NO: 369). In an alternative embodiment, the nucleic acid includes at least two CpG dinucleotides, wherein at least one cytosine in the CpG dinucleotides is methylated. In a further embodiment, each cytosine in the CpG dinucleotides present in the sequence is methylated. In another embodiment, the nucleic acid includes a plurality of CpG dinucleotides, wherein at least one of said CpG dinucleotides includes a methylated cytosine.
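The CpG criteria above can be checked mechanically. The following Python sketch, offered only as an illustration, locates CpG dinucleotides in a DNA string and tests whether any of them carries a methylated cytosine. Representing methylation as a set of 0-based cytosine indices is an assumption made here for convenience, and the example sequence is hypothetical rather than a sequence from this disclosure.

```python
def cpg_positions(seq: str) -> list[int]:
    """Return 0-based indices of the C in every CpG dinucleotide (5' to 3')."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def has_methylated_cpg(seq: str, methylated_c: set[int]) -> bool:
    """True if at least one CpG cytosine index is listed as methylated.

    `methylated_c` is an assumed representation: the set of 0-based
    positions of methylated cytosines in `seq`.
    """
    return any(i in methylated_c for i in cpg_positions(seq))


if __name__ == "__main__":
    oligo = "TCGATCGGA"  # hypothetical oligonucleotide
    print(cpg_positions(oligo))            # indices of CpG cytosines
    print(has_methylated_cpg(oligo, {5}))  # methylated C at index 5
```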

Additional specific nucleic acid sequences of oligonucleotides (ODNs) suitable for use in the compositions and methods are described in Raney et al., Journal of Pharmacology and Experimental Therapeutics, 298:1185-1192 (2001). In certain embodiments, ODNs used in the compositions and methods have a phosphodiester (PO) backbone or a phosphorothioate (PS) backbone, and/or at least one methylated cytosine residue in a CpG motif.

Decoy Oligonucleotides

In embodiments, the therapeutic moiety is a decoy oligonucleotide. Because transcription factors recognize their relatively short binding sequences, even in the absence of surrounding genomic DNA, short oligonucleotides bearing the consensus binding sequence of a specific transcription factor can be used as tools for manipulating gene expression in living cells. This strategy involves the intracellular delivery of such decoy oligonucleotides, which are then recognized and bound by the target factor. Occupation of the transcription factor's DNA-binding site by the decoy renders the transcription factor incapable of subsequently binding to the promoter regions of target genes. Decoys can be used as therapeutic agents, either to inhibit the expression of genes that are activated by a transcription factor, or to upregulate genes that are suppressed by the binding of a transcription factor. Examples of the utilization of decoy oligonucleotides may be found in Mann et al., J. Clin. Invest., 2000, 106: 1071-1075, which is expressly incorporated by reference herein, in its entirety.

U1 adaptor

In some embodiments, the therapeutic moiety is a U1 adaptor. U1 adaptors inhibit polyA sites and are bifunctional oligonucleotides with a target domain complementary to a site in the target gene's terminal exon and a "U1 domain" that binds to the U1 small nuclear RNA component of the U1 snRNP (Goraczniak, et al., 2008, Nature Biotechnology, 27(3), 257-263, which is expressly incorporated by reference herein, in its entirety). U1 snRNP is a ribonucleoprotein complex that functions primarily to direct early steps in spliceosome formation by binding to the pre-mRNA exon-intron boundary (Brown and Simpson, 1998, Annu Rev Plant Physiol Plant Mol Biol 49:77-95). Nucleotides 2-11 of the 5' end of U1 snRNA base pair with the 5'ss of the pre-mRNA. In one embodiment, oligonucleotides are U1 adaptors. In one embodiment, the U1 adaptor can be administered in combination with at least one other iRNA agent.

(CRISPR) Gene-Editing Machinery

In embodiments, the compounds disclosed herein include one or more CPP (or cCPP) conjugated to CRISPR gene-editing machinery. As used herein, “CRISPR gene-editing machinery” refers to protein, nucleic acids, or combinations thereof, which may be used to edit a genome. Non-limiting examples of gene-editing machinery include gRNAs, nucleases, nuclease inhibitors, and combinations and complexes thereof. The following patent documents describe CRISPR gene-editing machinery: U.S. Pat. No. 8,697,359, U.S. Pat. No. 8,771,945, U.S. Pat. No. 8,795,965, U.S. Pat. No. 8,865,406, U.S. Pat. No. 8,871,445, U.S. Pat. No. 8,889,356, U.S. Pat. No. 8,895,308, U.S. Pat. No. 8,906,616, U.S. Pat. No. 8,932,814, U.S. Pat. No. 8,945,839, U.S. Pat. No. 8,993,233, U.S. Pat. No. 8,999,641, U.S. Pat. App. No. 14/704,551, and U.S. Pat. App. No. 13/842,859. Each of the aforementioned patent documents is incorporated by reference herein in its entirety.

In embodiments, a linker conjugates the cCPP to the CRISPR gene-editing machinery. Any linker described in this disclosure or that is known to a person of skill in the art may be utilized.

gRNA

In embodiments, the compounds include a CPP (or cCPP) conjugated to a gRNA. A gRNA targets a genomic locus in a prokaryotic or eukaryotic cell.

In embodiments, the gRNA is a single-molecule guide RNA (sgRNA). A sgRNA includes a spacer sequence and a scaffold sequence. A spacer sequence is a short nucleic acid sequence used to target a nuclease (e.g., a Cas9 nuclease) to a specific nucleotide region of interest (e.g., a genomic DNA sequence to be cleaved). In embodiments, the spacer may be about 17-24 bases in length, such as about 20 bases in length. In embodiments, the spacer may be about 15, about 16, about 17, about 18, about 19, about 20, about 21, about 22, about 23, about 24, about 25, about 26, about 27, about 28, about 29, or about 30 bases in length. In embodiments, the spacer may be at least 15, at least 16, at least 17, at least 18, at least 19, at least 20, at least 21, at least 22, at least 23, at least 24, at least 25, at least 26, at least 27, at least 28, at least 29, or at least 30 bases in length. In embodiments, the spacer sequence has between about 40% and about 80% GC content.
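The length and GC-content ranges stated above lend themselves to a simple screening check. The Python sketch below is illustrative only: the default thresholds mirror the "about 17-24 bases" and "about 40% to about 80% GC" embodiments, while the function names and the example spacers are hypothetical and not taken from this disclosure.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a nucleotide sequence (0.0 for empty input)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def spacer_in_range(
    spacer: str,
    min_len: int = 17,
    max_len: int = 24,
    gc_min: float = 0.40,
    gc_max: float = 0.80,
) -> bool:
    """Check a candidate spacer against length and GC-content windows."""
    return (
        min_len <= len(spacer) <= max_len
        and gc_min <= gc_fraction(spacer) <= gc_max
    )


if __name__ == "__main__":
    candidate = "GACGTTACCGGATCAAGCTT"  # hypothetical 20-base spacer
    print(len(candidate), gc_fraction(candidate), spacer_in_range(candidate))
```

Because the text qualifies each bound with "about", the thresholds are exposed as parameters so that a wider window (for example 15 to 30 bases) can be applied.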

In embodiments, the spacer targets a site that immediately precedes a 5' protospacer adjacent motif (PAM). The PAM sequence may be selected based on the desired nuclease. For example, the PAM sequence may be any one of the PAM sequences shown in Table 3 below, wherein N refers to any nucleotide, R refers to A or G, Y refers to C or T, W refers to A or T, and V refers to A or C or G.

Table 3: Exemplary Nucleases and PAM sequences

PAM sequence (5’ to 3’)	Nuclease	Isolated from
NGG	SpCas9	Streptococcus pyogenes
NGRRT or NGRRN	SaCas9	Staphylococcus aureus
NNNNGATT	NmeCas9	Neisseria meningitidis
NNNNRYAC	CjCas9	Campylobacter jejuni
NNAGAAW	StCas9	Streptococcus thermophilus
TTTV	LbCpf1	Lachnospiraceae bacterium
TTTV	AsCpf1	Acidaminococcus sp.
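The degenerate symbols used in Table 3 (N, R, Y, W, V) can be expanded into a regular expression to test whether a candidate site matches a given PAM. The following Python sketch is a minimal illustration under that assumption; the dictionary and function names are hypothetical, and only the symbol definitions and PAM strings are taken from the table and the preceding paragraph.

```python
import re

# Degenerate-base symbols as defined in the text preceding Table 3.
DEGENERATE = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "[ACGT]", "R": "[AG]", "Y": "[CT]", "W": "[AT]", "V": "[ACG]",
}

# PAM sequences (5' to 3') per nuclease, copied from Table 3.
PAMS = {
    "SpCas9": ["NGG"],
    "SaCas9": ["NGRRT", "NGRRN"],
    "NmeCas9": ["NNNNGATT"],
    "CjCas9": ["NNNNRYAC"],
    "StCas9": ["NNAGAAW"],
    "LbCpf1": ["TTTV"],
    "AsCpf1": ["TTTV"],
}


def pam_to_regex(pam: str) -> re.Pattern:
    """Expand a degenerate PAM string into a compiled regular expression."""
    return re.compile("".join(DEGENERATE[symbol] for symbol in pam))


def matches_pam(site: str, nuclease: str) -> bool:
    """True if `site` fully matches any PAM listed for `nuclease`."""
    site = site.upper()
    return any(pam_to_regex(p).fullmatch(site) for p in PAMS[nuclease])


if __name__ == "__main__":
    print(matches_pam("AGG", "SpCas9"))
    print(matches_pam("TTTT", "AsCpf1"))
```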

In embodiments, a spacer may target a sequence of a mammalian gene, such as a human gene. In embodiments, the spacer may target a mutant gene. In embodiments, the spacer may target a coding sequence. In embodiments, the spacer may target an exonic sequence. In embodiments, the spacer may target a polyadenylation site (PS). In embodiments, the spacer may target a sequence element of a PS. In embodiments, the spacer may target a polyadenylation signal (PAS), an intervening sequence (IS), a cleavage site (CS), a downstream element (DES), or a portion or combination thereof. In embodiments, a spacer may target a splicing element (SE) or a cis-splicing regulatory element (SRE).

The scaffold sequence is the sequence within the sgRNA that is responsible for nuclease (e.g., Cas9) binding. The scaffold sequence does not include the spacer/targeting sequence. In embodiments, the scaffold may be about 1 to about 10, about 10 to about 20, about 20 to about 30, about 30 to about 40, about 40 to about 50, about 50 to about 60, about 60 to about 70, about 70 to about 80, about 80 to about 90, about 90 to about 100, about 100 to about 110, about 110 to about 120, or about 120 to about 130 nucleotides in length. In embodiments, the scaffold may be about 1, about 2, about 3, about 4, about 5, about 6, about 7, about 8, about 9, about 10, about 11, about 12, about 13, about 14, about 15, about 16, about 17, about 18, about 19, about 20, about 21, about 22, about 23, about 24, about 25, about 26, about 27, about 28, about 29, about 30, about 31, about 32, about 33, about 34, about 35, about 36, about 37, about 38, about 39, about 40, about 41, about 42, about 43, about 44, about 45, about 46, about 47, about 48, about 49, about 50, about 51, about 52, about 53, about 54, about 55, about 56, about 57, about 58, about 59, about 60, about 61, about 62, about 63, about 64, about 65, about 66, about 67, about 68, about 69, about 70, about 71, about 72, about 73, about 74, about 75, about 76, about 77, about 78, about 79, about 80, about 81, about 82, about 83, about 84, about 85, about 86, about 87, about 88, about 89, about 90, about 91, about 92, about 93, about 94, about 95, about 96, about 97, about 98, about 99, about 100, about 101, about 102, about 103, about 104, about 105, about 106, about 107, about 108, about 109, about 110, about 111, about 112, about 113, about 114, about 115, about 116, about 117, about 118, about 119, about 120, about 121, about 122, about 123, about 124, or about 125 nucleotides in length. In embodiments, the scaffold may be at least 10, at least 20, at least 30, at least 40, at least 50, at least 60, at least 70, at least 80, at least 90, at least 100, at least 110, at least 120, or at least 125 nucleotides in length.

In embodiments, the gRNA is a dual-molecule guide RNA, e.g., crRNA and tracrRNA. In embodiments, the gRNA may further include a poly(A) tail.

In embodiments, a compound that includes a CPP is conjugated to a nucleic acid that includes a gRNA. In embodiments, the nucleic acid includes about 1, about 2, about 3, about 4, about 5, about 6, about 7, about 8, about 9, about 10, about 11, about 12, about 13, about 14, about 15, about 16, about 17, about 18, about 19, or about 20 gRNAs. In embodiments, the gRNAs recognize the same target. In embodiments, the gRNAs recognize different targets. In embodiments, the nucleic acid that includes a gRNA includes a sequence encoding a promoter, wherein the promoter drives expression of the gRNA.

Nuclease

In embodiments, the compounds include a cell penetrating peptide conjugated to a nuclease. In embodiments, the nuclease is a Type II, Type V-A, Type V-B, Type V-C, Type V-U, or Type VI-B nuclease. In embodiments, the nuclease is a transcription activator-like effector nuclease (TALEN), a meganuclease, or a zinc-finger nuclease. In embodiments, the nuclease is a Cas9, Cas12a (Cpf1), Cas12b, Cas12c, Tnp-B like, Cas13a (C2c2), Cas13b, or Cas14 nuclease. For example, in some embodiments, the nuclease is a Cas9 nuclease or a Cpf1 nuclease.

In embodiments, the nuclease is a modified form or variant of a Cas9, Cas12a (Cpf1), Cas12b, Cas12c, Tnp-B like, Cas13a (C2c2), Cas13b, or Cas14 nuclease. In embodiments, the nuclease is a modified form or variant of a TAL nuclease, a meganuclease, or a zinc-finger nuclease. A “modified” or “variant” nuclease is one that is, for example, truncated, fused to another protein (such as another nuclease), catalytically inactivated, etc. In embodiments, the nuclease may have at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or about 100% sequence identity to a naturally occurring Cas9, Cas12a (Cpf1), Cas12b, Cas12c, Tnp-B like, Cas13a (C2c2), Cas13b, Cas14 nuclease, or a TALEN, meganuclease, or zinc-finger nuclease. In embodiments, the nuclease is a Cas9 nuclease derived from S. pyogenes (SpCas9). In embodiments, a nuclease has at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% sequence identity to a Cas9 nuclease derived from S. pyogenes (SpCas9). In embodiments, the nuclease is a Cas9 derived from S. aureus (SaCas9). In embodiments, the nuclease has at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% sequence identity to a Cas9 derived from S. aureus (SaCas9). In embodiments, the Cpf1 is a Cpf1 enzyme from Acidaminococcus (species BV3L6, UniProt Accession No. U2UMQ6). In embodiments, the nuclease has at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% sequence identity to a Cpf1 enzyme from Acidaminococcus (species BV3L6, UniProt Accession No. U2UMQ6).

In embodiments, the Cpf1 is a Cpf1 enzyme from Lachnospiraceae (species ND2006, UniProt Accession No. A0A182DWE3). In embodiments, the nuclease has at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% sequence identity to a Cpf1 enzyme from Lachnospiraceae. In embodiments, a sequence encoding the nuclease is codon optimized for expression in mammalian cells. In embodiments, the sequence encoding the nuclease is codon optimized for expression in human cells or mouse cells.

In embodiments, a compound that includes a CPP is conjugated to a nuclease. In embodiments, the nuclease is a soluble protein.

In embodiments, a compound that includes a CPP is conjugated to a nucleic acid encoding a nuclease. In embodiments, the nucleic acid encoding a nuclease includes a sequence encoding a promoter, wherein the promoter drives expression of the nuclease.

gRNA and Nuclease Combinations

In embodiments, the compounds include one or more CPP (or cCPP) conjugated to a gRNA and a nuclease. In embodiments, the one or more CPP (or cCPP) are conjugated to a nucleic acid encoding a gRNA and/or a nuclease. In embodiments, the nucleic acid encoding a nuclease and a gRNA includes a sequence encoding a promoter, wherein the promoter drives expression of the nuclease and the gRNA. In embodiments, the nucleic acid encoding a nuclease and a gRNA includes two promoters, wherein a first promoter controls expression of the nuclease and a second promoter controls expression of the gRNA. In embodiments, the nucleic acid encoding a gRNA and a nuclease encodes from about 1 to about 20 gRNAs, or from about 1, about 2, about 3, about 4, about 5, about 6, about 7, about 8, about 9, about 10, about 11, about 12, about 13, about 14, about 15, about 16, about 17, about 18, or about 19, and up to about 20 gRNAs. In embodiments, the gRNAs recognize different targets. In embodiments, the gRNAs recognize the same target.

In embodiments, the compounds include a cell penetrating peptide (or cCPP) conjugated to a ribonucleoprotein (RNP) that includes a gRNA and a nuclease.

In embodiments, a composition that includes: (a) a CPP conjugated to a gRNA and (b) a nuclease is delivered to a cell. In embodiments, a composition that includes: (a) a CPP conjugated to a nuclease and (b) a gRNA is delivered to a cell.

In embodiments, a composition that includes: (a) a first CPP conjugated to a gRNA and (b) a second CPP conjugated to a nuclease is delivered to a cell. In embodiments, the first CPP and second CPP are the same. In embodiments, the first CPP and second CPP are different.

Genetic Element of Interest

In embodiments, the compounds disclosed herein include a cell penetrating peptide conjugated to a genetic element of interest. In embodiments, a genetic element of interest replaces a genomic DNA sequence cleaved by a nuclease. Non-limiting examples of genetic elements of interest include genes, single nucleotide polymorphisms, promoters, and terminators.

Nuclease Inhibitors

In embodiments, the compounds disclosed herein include a cell penetrating peptide conjugated to an inhibitor of a nuclease (e.g., Cas9). A limitation of gene editing is potential off-target editing. The delivery of a nuclease inhibitor will limit off-target editing. In embodiments, the nuclease inhibitor is a polypeptide, polynucleotide, or small molecule. Exemplary nuclease inhibitors are described in U.S. Publication No. 2020/087354, International Publication No. 2018/085288, U.S. Publication No. 2018/0382741, International Publication No. 2019/089761, International Publication No. 2020/068304, International Publication No. 2020/041384, and International Publication No. 2019/076651, each of which is incorporated by reference herein in its entirety.

Therapeutic polypeptides

In embodiments, the therapeutic moiety includes a polypeptide. In embodiments, the therapeutic moiety includes a protein or a fragment thereof. In embodiments, the therapeutic moiety includes an RNA binding protein or an RNA binding fragment thereof. In embodiments, the therapeutic moiety includes an enzyme. In embodiments, the therapeutic moiety includes an RNA-cleaving enzyme or an active fragment thereof.

Conjugate Groups

In embodiments, ACs are modified by covalent attachment of one or more conjugate groups. In general, conjugate groups modify one or more properties of the attached AC including but not limited to pharmacodynamic, pharmacokinetic, binding, absorption, cellular distribution, cellular uptake, charge and clearance. Conjugate groups are routinely used in the chemical arts and are linked directly or via an optional linking moiety or linking group to a parent compound such as an AC. Conjugate groups include without limitation, intercalators, reporter molecules, polyamines, polyamides, polyethylene glycols, thioethers, polyethers, cholesterols, thiocholesterols, cholic acid moieties, folate, lipids, phospholipids, biotin, phenazine, phenanthridine, anthraquinone, adamantane, acridine, fluoresceins, rhodamines, coumarins and dyes. In embodiments, the conjugate group is a polyethylene glycol (PEG), and the PEG is conjugated to either the AC or the CPP.

Conjugate groups include lipid moieties such as a cholesterol moiety (Letsinger et al., Proc. Natl. Acad. Sci. USA, 1989, 86, 6553); cholic acid (Manoharan et al., Bioorg. Med. Chem. Lett., 1994, 4, 1053); a thioether, e.g., hexyl-S-tritylthiol (Manoharan et al., Ann. N.Y. Acad. Sci., 1992, 660, 306; Manoharan et al., Bioorg. Med. Chem. Lett., 1993, 3, 2765); a thiocholesterol (Oberhauser et al., Nucl. Acids Res., 1992, 20, 533); an aliphatic chain, e.g., dodecandiol or undecyl residues (Saison-Behmoaras et al., EMBO J., 1991, 10, 111; Kabanov et al., FEBS Lett., 1990, 259, 327; Svinarchuk et al., Biochimie, 1993, 75, 49); a phospholipid, e.g., di-hexadecyl-rac-glycerol or triethylammonium 1,2-di-O-hexadecyl-rac-glycero-3-H-phosphonate (Manoharan et al., Tetrahedron Lett., 1995, 36, 3651; Shea et al., Nucl. Acids Res., 1990, 18, 3777); a polyamine or a polyethylene glycol chain (Manoharan et al., Nucleosides & Nucleotides, 1995, 14, 969); adamantane acetic acid (Manoharan et al., Tetrahedron Lett., 1995, 36, 3651); a palmityl moiety (Mishra et al., Biochim. Biophys. Acta, 1995, 1264, 229); or an octadecylamine or hexylamino-carbonyl-oxycholesterol moiety (Crooke et al., J. Pharmacol. Exp. Ther., 1996, 277, 923).

Linking groups or bifunctional linking moieties such as those known in the art are amenable to the compounds provided herein. Linking groups are useful for attachment of chemical functional groups, conjugate groups, reporter groups and other groups to selective sites in a parent compound such as for example an AC. In general, a bifunctional linking moiety comprises a hydrocarbyl moiety having two functional groups. One of the functional groups is selected to bind to a parent molecule or compound of interest and the other is selected to bind essentially any selected group such as a chemical functional group or a conjugate group. Any of the linkers described here may be used. In embodiments, the linker comprises a chain structure or an oligomer of repeating units such as ethylene glycol or amino acid units. Examples of functional groups that are routinely used in a bifunctional linking moiety include, but are not limited to, electrophiles for reacting with nucleophilic groups and nucleophiles for reacting with electrophilic groups. In embodiments, bifunctional linking moieties include amino, hydroxyl, carboxylic acid, thiol, unsaturations (e.g., double or triple bonds), and the like. Some nonlimiting examples of bifunctional linking moieties include 8-amino-3,6-dioxaoctanoic acid (ADO), succinimidyl 4-(N-maleimidomethyl) cyclohexane-1-carboxylate (SMCC) and 6-aminohexanoic acid (AHEX or AHA). Other linking groups include, but are not limited to, substituted C1-C10 alkyl, substituted or unsubstituted C2-C10 alkenyl or substituted or unsubstituted C2-C10 alkynyl, wherein a nonlimiting list of substituent groups includes hydroxyl, amino, alkoxy, carboxy, benzyl, phenyl, nitro, thiol, thioalkoxy, halogen, alkyl, aryl, alkenyl and alkynyl.

In embodiments, the AC may be linked to a 10 arginine-serine dipeptide repeat. ACs linked to 10 arginine-serine dipeptide repeats for the artificial recruitment of splicing enhancer factors have been applied in vitro to induce inclusion of mutated BRCA1 and SMN2 exons that otherwise would be skipped. See Cartegni and Krainer 2003, incorporated by reference herein.

Endosomal Escape Vehicles (EEVs)

An endosomal escape vehicle (EEV) can be used to transport a cargo across a cellular membrane, for example, to deliver the cargo to the cytosol or nucleus of a cell. Cargo can include a therapeutic moiety (TM). The EEV can comprise a cell penetrating peptide (CPP), for example, a cyclic cell penetrating peptide (cCPP). In embodiments, the EEV comprises a cCPP, which is conjugated to an exocyclic peptide (EP). The EP can be referred to interchangeably as a modulatory peptide (MP). The EP can comprise a sequence of a nuclear localization signal (NLS). The EP can be coupled to the cargo. The EP can be coupled to the cCPP. The EP can be coupled to the cargo and the cCPP. Coupling between the EP, cargo, cCPP, or combinations thereof, may be non-covalent or covalent. The EP can be attached through a peptide bond to the N-terminus of the cCPP. The EP can be attached through a peptide bond to the C-terminus of the cCPP. The EP can be attached to the cCPP through a side chain of an amino acid in the cCPP. The EP can be attached to the cCPP through a side chain of a lysine which can be conjugated to the side chain of a glutamine in the cCPP. The EP can be conjugated to the 5' or 3' end of an oligonucleotide cargo. The EP can be coupled to a linker. The exocyclic peptide can be conjugated to an amino group of the linker. The EP can be coupled to a linker via the C-terminus of an EP and a cCPP through a side chain on the cCPP and/or EP. For example, an EP may comprise a terminal lysine which can then be coupled to a cCPP containing a glutamine through an amide bond. When the EP contains a terminal lysine, and the side chain of the lysine can be used to attach the cCPP, the C- or N-terminus may be attached to a linker on the cargo.

Exocyclic Peptides

The exocyclic peptide (EP) can comprise from 2 to 10 amino acid residues, e.g., 2, 3, 4, 5, 6, 7, 8, 9, or 10 amino acid residues, inclusive of all ranges and values therebetween. The EP can comprise 6 to 9 amino acid residues. The EP can comprise from 4 to 8 amino acid residues.

Each amino acid in the exocyclic peptide may be a natural or non-natural amino acid. The term “non-natural amino acid” refers to an organic compound that is a congener of a natural amino acid in that it has a structure similar to a natural amino acid so that it mimics the structure and reactivity of a natural amino acid. The non-natural amino acid can be a modified amino acid, and/or amino acid analog, that is not one of the 20 common naturally occurring amino acids or the rare natural amino acids selenocysteine or pyrrolysine. Non-natural amino acids can also be the D-isomer of the natural amino acids. Examples of suitable amino acids include, but are not limited to, alanine, alloisoleucine, arginine, citrulline, asparagine, aspartic acid, cysteine, glutamine, glutamic acid, glycine, histidine, isoleucine, leucine, lysine, methionine, naphthylalanine, phenylalanine, proline, pyroglutamic acid, serine, threonine, tryptophan, tyrosine, valine, a derivative thereof, or combinations thereof. These, and other amino acids, are listed in Table 4 along with their abbreviations used herein. For example, the amino acids can be A, G, P, K, R, V, F, H, Nal, or citrulline.

The EP can comprise at least one positively charged amino acid residue, e.g., at least one lysine residue and/or at least one amino acid residue comprising a side chain comprising a guanidine group, or a protonated form thereof. The EP can comprise 1 or 2 amino acid residues comprising a side chain comprising a guanidine group, or a protonated form thereof. The amino acid residue comprising a side chain comprising a guanidine group can be an arginine residue. Protonated forms can mean salts thereof throughout the disclosure.

The EP can comprise at least two, at least three or at least four or more lysine residues. The EP can comprise 2, 3, or 4 lysine residues. The amino group on the side chain of each lysine residue can be substituted with a protecting group, including, for example, a trifluoroacetyl (-COCF3), allyloxycarbonyl (Alloc), 1-(4,4-dimethyl-2,6-dioxocyclohexylidene)ethyl (Dde), or (4,4-dimethyl-2,6-dioxocyclohex-1-ylidene-3)-methylbutyl (ivDde) group. The amino group on the side chain of each lysine residue can be substituted with a trifluoroacetyl (-COCF3) group. The protecting group can be included to enable amide conjugation. The protecting group can be removed after the EP is conjugated to a cCPP.

The EP can comprise at least 2 amino acid residues with a hydrophobic side chain. The amino acid residue with a hydrophobic side chain can be selected from valine, proline, alanine, leucine, isoleucine, and methionine. The amino acid residue with a hydrophobic side chain can be valine or proline.

The EP can comprise at least one positively charged amino acid residue, e.g., at least one lysine residue and/or at least one arginine residue. The EP can comprise at least two, at least three or at least four or more lysine residues and/or arginine residues.
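The compositional criteria stated for the EP (2 to 10 residues, lysine and/or arginine content, and hydrophobic residues) can be tallied from a one-letter sequence. The Python sketch below is only an illustration: it assumes one-letter codes with B denoting beta-alanine, treats only K and R as the positively charged residues of interest, and uses hypothetical function and key names that do not come from this disclosure.

```python
# Hydrophobic residues named in the text: valine, proline, alanine,
# leucine, isoleucine, and methionine (one-letter codes).
HYDROPHOBIC = set("VPALIM")


def ep_summary(ep: str) -> dict:
    """Tally simple composition features of an exocyclic peptide sequence."""
    ep = ep.upper()
    lys = ep.count("K")
    arg = ep.count("R")
    return {
        "length": len(ep),
        "length_in_range": 2 <= len(ep) <= 10,  # 2 to 10 residues
        "lysines": lys,
        "arginines": arg,
        "has_positive_residue": (lys + arg) >= 1,
        "hydrophobic": sum(residue in HYDROPHOBIC for residue in ep),
    }


if __name__ == "__main__":
    # PKKKRKV is listed in the text as SEQ ID NO:42.
    for key, value in ep_summary("PKKKRKV").items():
        print(f"{key}: {value}")
```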

The EP can comprise KK, KR, RR, HH, HK, HR, RH, KKK, KGK, KBK, KBR, KRK, KRR, RKK, RRR, KKH, KHK, HKK, HRR, HRH, HHR, HBH, HHH, HHHH (SEQ ID NO:1), KHKK (SEQ ID NO:2), KKHK (SEQ ID NO:3), KKKH (SEQ ID NO:4), KHKH (SEQ ID NO:5), HKHK (SEQ ID NO:6), KKKK (SEQ ID NO:7), KKRK (SEQ ID NO:8), KRKK (SEQ ID NO:9), KRRK (SEQ ID NO:10), RKKR (SEQ ID NO:11), RRRR (SEQ ID NO:12), KGKK (SEQ ID NO:13), KKGK (SEQ ID NO:14), HBHBH (SEQ ID NO:15), HBKBH (SEQ ID NO:16), RRRRR (SEQ ID NO:17), KKKKK (SEQ ID NO:18), KKKRK (SEQ ID NO:19), RKKKK (SEQ ID NO:20), KRKKK (SEQ ID NO:21), KKRKK (SEQ ID NO:22), KKKKR (SEQ ID NO:23), KBKBK (SEQ ID NO:24), RKKKKG (SEQ ID NO:25), KRKKKG (SEQ ID NO:26), KKRKKG (SEQ ID NO:27), KKKKRG (SEQ ID NO:28), RKKKKB (SEQ ID NO:29), KRKKKB (SEQ ID NO:30), KKRKKB (SEQ ID NO:31), KKKKRB (SEQ ID NO:32), KKKRKV (SEQ ID NO:33), RRRRRR (SEQ ID NO:34), HHHHHH (SEQ ID NO:35), RHRHRH (SEQ ID NO:36), HRHRHR (SEQ ID NO:37), KRKRKR (SEQ ID NO:38), RKRKRK (SEQ ID NO:39), RBRBRB (SEQ ID NO:40), KBKBKB (SEQ ID NO:41), PKKKRKV (SEQ ID NO:42), PGKKRKV (SEQ ID NO:43), PKGKRKV (SEQ ID NO:44), PKKGRKV (SEQ ID NO:45), PKKKGKV (SEQ ID NO:46), PKKKRGV (SEQ ID NO:47), or PKKKRKG (SEQ ID NO:48), wherein B is beta-alanine. The amino acids in the EP can have D or L stereochemistry.

The EP can comprise KK, KR, RR, KKK, KGK, KBK, KBR, KRK, KRR, RKK, RRR, KKKK (SEQ ID NO:7), KKRK (SEQ ID NO:8), KRKK (SEQ ID NO:9), KRRK (SEQ ID NO:10), RKKR (SEQ ID NO:11), RRRR (SEQ ID NO:12), KGKK (SEQ ID NO:13), KKGK (SEQ ID NO:14), KKKKK (SEQ ID NO:18), KKKRK (SEQ ID NO:19), KBKBK (SEQ ID NO:24), KKKRKV (SEQ ID NO:33), PKKKRKV (SEQ ID NO:42), PGKKRKV (SEQ ID NO:43), PKGKRKV (SEQ ID NO:44), PKKGRKV (SEQ ID NO:45), PKKKGKV (SEQ ID NO:46), PKKKRGV (SEQ ID NO:47), or PKKKRKG (SEQ ID NO:48). The EP can comprise PKKKRKV (SEQ ID NO:42), RR, RRR, RHR, RBR, RBRBR (SEQ ID NO:49), RBHBR (SEQ ID NO:50), or HBRBH (SEQ ID NO:51), wherein B is beta-alanine. The amino acids in the EP can have D or L stereochemistry.

The EP can consist of KK, KR, RR, KKK, KGK, KBK, KBR, KRK, KRR, RKK, RRR, KKKK (SEQ ID NO:7), KKRK (SEQ ID NO:8), KRKK (SEQ ID NO:9), KRRK (SEQ ID NO:10), RKKR (SEQ ID NO:11), RRRR (SEQ ID NO:12), KGKK (SEQ ID NO:13), KKGK (SEQ ID NO:14), KKKKK (SEQ ID NO:18), KKKRK (SEQ ID NO:19), KBKBK (SEQ ID NO:24), KKKRKV (SEQ ID NO:33), PKKKRKV (SEQ ID NO:42), PGKKRKV (SEQ ID NO:43), PKGKRKV (SEQ ID NO:44), PKKGRKV (SEQ ID NO:45), PKKKGKV (SEQ ID NO:46), PKKKRGV (SEQ ID NO:47), or PKKKRKG (SEQ ID NO:48). The EP can consist of PKKKRKV (SEQ ID NO:42), RR, RRR, RHR, RBR, RBRBR (SEQ ID NO:49), RBHBR (SEQ ID NO:50), or HBRBH (SEQ ID NO:51), wherein B is beta-alanine. The amino acids in the EP can have D or L stereochemistry.

The EP can comprise an amino acid sequence identified in the art as a nuclear localization sequence (NLS). The EP can consist of an amino acid sequence identified in the art as a nuclear localization sequence (NLS). The EP can comprise an NLS comprising the amino acid sequence PKKKRKV (SEQ ID NO:42). The EP can consist of an NLS comprising the amino acid sequence PKKKRKV (SEQ ID NO:42). The EP can comprise an NLS comprising an amino acid sequence selected from NLSKRPAAIKKAGQAKKKK (SEQ ID NO:52), PAAKRVKLD (SEQ ID NO:53), RQRRNELKRSF (SEQ ID NO:54), RMRKFKNKGKDTAELRRRRVEVSVELR (SEQ ID NO:55), KAKKDEQILKRRNV (SEQ ID NO:56), VSRKRPRP (SEQ ID NO:57), PPKKARED (SEQ ID NO:58), PQPKKKPL (SEQ ID NO:59), SALIKKKKKMAP (SEQ ID NO:60), DRLRR (SEQ ID NO:61), PKQKKRK (SEQ ID NO:62), RKLKKKIKKL (SEQ ID NO:63), REKKKFLKRR (SEQ ID NO:64), KRKGDEVDGVDEVAKKKSKK (SEQ ID NO:65), and RKCLQAGMNLEARKTKK (SEQ ID NO:66). The EP can consist of an NLS comprising an amino acid sequence selected from NLSKRPAAIKKAGQAKKKK (SEQ ID NO:52), PAAKRVKLD (SEQ ID NO:53), RQRRNELKRSF (SEQ ID NO:54), RMRKFKNKGKDTAELRRRRVEVSVELR (SEQ ID NO:55), KAKKDEQILKRRNV (SEQ ID NO:56), VSRKRPRP (SEQ ID NO:57), PPKKARED (SEQ ID NO:58), PQPKKKPL (SEQ ID NO:59), SALIKKKKKMAP (SEQ ID NO:60), DRLRR (SEQ ID NO:61), PKQKKRK (SEQ ID NO:62), RKLKKKIKKL (SEQ ID NO:63), REKKKFLKRR (SEQ ID NO:64), KRKGDEVDGVDEVAKKKSKK (SEQ ID NO:65), and RKCLQAGMNLEARKTKK (SEQ ID NO:66).

All exocyclic sequences can also contain an N-terminal acetyl group. Hence, for example, the EP can have the structure: Ac-PKKKRKV (SEQ ID NO:42).

Cell Penetrating Peptides (CPP)

The cell penetrating peptide (CPP) can comprise 6 to 20 amino acid residues. The cell penetrating peptide can be a cyclic cell penetrating peptide (cCPP). The cCPP is capable of penetrating a cell membrane. An exocyclic peptide (EP) can be conjugated to the cCPP, and the resulting construct can be referred to as an endosomal escape vehicle (EEV). The cCPP can direct a cargo (e.g., a therapeutic moiety (TM) such as an oligonucleotide, peptide or small molecule) to penetrate the membrane of a cell. The cCPP can deliver the cargo to the cytosol of the cell. The cCPP can deliver the cargo to a cellular location where a target (e.g., pre-mRNA) is located. To conjugate the cCPP to a cargo (e.g., peptide, oligonucleotide, or small molecule), at least one bond or lone pair of electrons on the cCPP can be replaced.

The total number of amino acid residues in the cCPP is in the range of from 6 to 20 amino acid residues, e.g., 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amino acid residues, inclusive of all ranges and subranges therebetween. The cCPP can comprise 6 to 13 amino acid residues. The cCPP disclosed herein can comprise 6 to 10 amino acids. By way of example, cCPP comprising 6-10 amino acid residues can have a structure according to any of Formula I-A to I-E:

[Structural drawings of Formulae I-A to I-E, cyclic peptides of 6 to 10 amino acid residues, not reproduced]

wherein AA₁, AA₂, AA₃, AA₄, AA₅, AA₆, AA₇, AA₈, AA₉, and AA₁₀ are amino acid residues.

The cCPP can comprise 6 to 8 amino acids. The cCPP can comprise 8 amino acids.

Each amino acid in the cCPP may be a natural or non-natural amino acid. The term “non-natural amino acid” refers to an organic compound that is a congener of a natural amino acid in that it has a structure similar to a natural amino acid so that it mimics the structure and reactivity of a natural amino acid. The non-natural amino acid can be a modified amino acid, and/or amino acid analog, that is not one of the 20 common naturally occurring amino acids or the rare natural amino acids selenocysteine or pyrrolysine. Non-natural amino acids can also be a D-isomer of a natural amino acid. Examples of suitable amino acids include, but are not limited to, alanine, allo-isoleucine, arginine, citrulline, asparagine, aspartic acid, cysteine, glutamine, glutamic acid, glycine, histidine, isoleucine, leucine, lysine, methionine, naphthylalanine, phenylalanine, proline, pyroglutamic acid, serine, threonine, tryptophan, tyrosine, valine, a derivative thereof, or combinations thereof. These, and other amino acids, are listed in Table 4 along with their abbreviations used herein.

Table 4. Amino Acid Abbreviations

Amino Acid	Abbreviations* L-amino acid	Abbreviations* D-amino acid
2-[2-[2-aminoethoxy]ethoxy]acetic acid	AEEA, miniPEG, PEG2	NA
Alanine	Ala (A)	ala (a)
Allo-isoleucine	aIle	aIle
Arginine	Arg (R)	arg (r)
Asparagine	Asn (N)	asn (n)
aspartic acid	Asp (D)	asp (d)
Cysteine	Cys (C)	cys (c)
Citrulline	Cit	Cit
Cyclohexylalanine	Cha	cha
2,3-diaminopropionic acid	Dap	dap
4-fluorophenylalanine	Fpa (Σ)	pfa
glutamic acid	Glu (E)	glu (e)
glutamine	Gln (Q)	gln (q)
glycine	Gly (G)	gly (g)
histidine	His (H)	his (h)
Homoproline (aka pipecolic acid)	Pip (Θ)	pip (θ)
isoleucine	Ile (I)	ile (i)
leucine	Leu (L)	leu (l)
lysine	Lys (K)	lys (k)
methionine	Met (M)	met (m)
3-(2-naphthyl)-alanine	Nal (Φ)	nal (φ)
3-(1-naphthyl)-alanine	1-Nal	1-nal
norleucine	Nle (Ω)	nle
phenylalanine	Phe (F)	phe (f)
phenylglycine	Phg (Ψ)	phg
4-(phosphonodifluoromethyl)phenylalanine	F2Pmp (Λ)	f2pmp
proline	Pro (P)	pro (p)
sarcosine	Sar (Ξ)	sar
selenocysteine	Sec (U)	sec (u)
serine	Ser (S)	ser (s)
threonine	Thr (T)	thr (t)
tyrosine	Tyr (Y)	tyr (y)
tryptophan	Trp (W)	trp (w)
valine	Val (V)	val (v)
Tert-butyl-alanine	Tle	tle
Penicillamine	Pen	Pen
Homoarginine	HomoArg	homoarg
Nicotinyl-lysine	Lys(NIC)	lys(NIC)
Trifluoroacetyl-lysine	Lys(TFA)	lys(TFA)
Methyl-leucine	MeLeu	meLeu
3-(3-benzothienyl)-alanine	Bta	bta
* single letter abbreviations: capital letters indicate the L-amino acid form, lower case letters indicate the D-amino acid form.
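The case convention in the table footnote can be illustrated with a small sketch. The function name `annotate` and the restriction to the 20 common amino acids are assumptions made for illustration; the non-natural residues and Greek-letter codes are omitted.

```python
# Illustrative sketch only: "annotate" is a hypothetical helper, and the
# lookup covers just the 20 common amino acids.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "E": "Glu", "Q": "Gln", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def annotate(sequence):
    """Return (three-letter abbreviation, stereochemistry) per residue."""
    result = []
    for letter in sequence:
        name = THREE_LETTER.get(letter.upper())
        if name is None:
            raise ValueError(f"unknown single-letter code: {letter!r}")
        if name == "Gly":
            # Glycine has no stereocentre, so case carries no information.
            result.append((name, "achiral"))
        elif letter.isupper():
            result.append((name, "L"))
        else:
            # Lower case marks the D-amino acid form, e.g. "arg (r)".
            result.append((name.lower(), "D"))
    return result


print(annotate("Fr"))  # [('Phe', 'L'), ('arg', 'D')]
```

Under this reading, a string such as "FfR" would denote L-Phe, D-Phe, L-Arg.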

As used herein, “polyethylene glycol” and “PEG” are used interchangeably. “PEGm” and “PEGₘ” are, or are derived from, a molecule of the formula HO(CO)-(CH₂)ₙ-(OCH₂CH₂)ₘ-NH₂ where n is any integer from 1 to 5 and m is any integer from 1 to 23. In embodiments, n is 1 or 2. In embodiments, n is 1. In embodiments, n is 2. In embodiments, n is 1 and m is 2. In embodiments, n is 2 and m is 2. In embodiments, n is 1 and m is 4. In embodiments, n is 2 and m is 4. In embodiments, n is 1 and m is 12. In embodiments, n is 2 and m is 12.

As used herein, “miniPEGm” or “miniPEGₘ” are, or are derived from, a molecule of the formula HO(CO)-(CH₂)ₙ-(OCH₂CH₂)ₘ-NH₂ where n is 1 and m is any integer from 1 to 23. For example, “miniPEG2” or “miniPEG₂” is, or is derived from, (2-[2-[2-aminoethoxy]ethoxy]acetic acid), and “miniPEG4” or “miniPEG₄” is, or is derived from, HO(CO)-(CH₂)ₙ-(OCH₂CH₂)ₘ-NH₂ where n is 1 and m is 4.
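As a worked check of the general formula HO(CO)-(CH2)n-(OCH2CH2)m-NH2, the sketch below counts atoms for given n and m and derives a molecular formula and molar mass. The function names are hypothetical, and the atomic masses are standard rounded values rather than figures from the disclosure.

```python
# Illustrative sketch; function names are hypothetical, not from the source.
# Standard average atomic masses in g/mol, rounded to three decimals.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}


def peg_composition(n, m):
    """Atom counts for HO(CO)-(CH2)n-(OCH2CH2)m-NH2."""
    if not 1 <= n <= 5:
        raise ValueError("n must be an integer from 1 to 5")
    if not 1 <= m <= 23:
        raise ValueError("m must be an integer from 1 to 23")
    return {
        "C": 1 + n + 2 * m,      # carboxyl C, n methylenes, 2 C per ethylene oxide unit
        "H": 3 + 2 * n + 4 * m,  # OH and NH2 hydrogens, 2 per CH2, 4 per unit
        "N": 1,
        "O": 2 + m,              # two carboxyl oxygens, one ether oxygen per unit
    }


def formula(counts):
    """Render the counts as a molecular formula string, e.g. C6H13NO4."""
    return "".join(el + (str(k) if k > 1 else "") for el, k in counts.items())


def molar_mass(counts):
    """Sum of atomic masses weighted by atom counts, in g/mol."""
    return sum(ATOMIC_MASS[el] * k for el, k in counts.items())


mini_peg2 = peg_composition(n=1, m=2)
print(formula(mini_peg2), round(molar_mass(mini_peg2), 2))  # C6H13NO4 163.17
```

With n = 1 and m = 2 the counts give C6H13NO4, which matches 2-[2-[2-aminoethoxy]ethoxy]acetic acid, the compound named above for miniPEG2.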

The cCPP can comprise 4 to 20 amino acids, wherein: (i) at least one amino acid has a side chain comprising a guanidine group, or a protonated form thereof; (ii) at least one amino acid has no side chain or a side chain comprising [chemical structures not reproduced], or a protonated form thereof; and (iii) at least two amino acids independently have a side chain comprising an aromatic or heteroaromatic group.

As used herein, when no side chain is present, the amino acid has two hydrogen atoms on the carbon atom(s) (e.g., -CH2-) linking the amine and carboxylic acid.

The amino acid having no side chain can be glycine or β-alanine.

The cCPP can comprise from 6 to 20 amino acid residues which form the cCPP, wherein: (i) at least one amino acid can be a glycine, β-alanine, or 4-aminobutyric acid residue; (ii) at least one amino acid can have a side chain comprising an aryl or heteroaryl group; and (iii) at least one amino acid has a side chain comprising a guanidine group, [chemical structures not reproduced], or a protonated form thereof.

The cCPP can comprise from 6 to 20 amino acid residues which form the cCPP, wherein: (i) at least two amino acids can independently be glycine, β-alanine, or 4-aminobutyric acid residues; (ii) at least one amino acid can have a side chain comprising an aryl or heteroaryl group; and (iii) at least one amino acid has a side chain comprising a guanidine group, [chemical structures not reproduced], or a protonated form thereof.

The cCPP can comprise from 6 to 20 amino acid residues which form the cCPP, wherein: (i) at least three amino acids can independently be glycine, β-alanine, or 4-aminobutyric acid residues; (ii) at least one amino acid can have a side chain comprising an aromatic or heteroaromatic group; and (iii) at least one amino acid can have a side chain comprising a guanidine group, [chemical structures not reproduced], or a protonated form thereof.

Glycine and Related Amino Acid Residues

The cCPP can comprise (i) 1, 2, 3, 4, 5, or 6 glycine, β-alanine, 4-aminobutyric acid residues, or combinations thereof. The cCPP can comprise (i) 2 glycine, β-alanine, 4-aminobutyric acid residues, or combinations thereof. The cCPP can comprise (i) 3 glycine, β-alanine, 4-aminobutyric acid residues, or combinations thereof. The cCPP can comprise (i) 4 glycine, β-alanine, 4-aminobutyric acid residues, or combinations thereof. The cCPP can comprise (i) 5 glycine, β-alanine, 4-aminobutyric acid residues, or combinations thereof. The cCPP can comprise (i) 6 glycine, β-alanine, 4-aminobutyric acid residues, or combinations thereof. The cCPP can comprise (i) 3, 4, or 5 glycine, β-alanine, 4-aminobutyric acid residues, or combinations thereof. The cCPP can comprise (i) 3 or 4 glycine, β-alanine, 4-aminobutyric acid residues, or combinations thereof.

The cCPP can comprise (i) 1, 2, 3, 4, 5, or 6 glycine residues. The cCPP can comprise (i) 2 glycine residues. The cCPP can comprise (i) 3 glycine residues. The cCPP can comprise (i) 4 glycine residues. The cCPP can comprise (i) 5 glycine residues. The cCPP can comprise (i) 6 glycine residues. The cCPP can comprise (i) 3, 4, or 5 glycine residues. The cCPP can comprise (i) 3 or 4 glycine residues. The cCPP can comprise (i) 2 or 3 glycine residues. The cCPP can comprise (i) 1 or 2 glycine residues.

The cCPP can comprise (i) 3, 4, 5, or 6 glycine, β-alanine, 4-aminobutyric acid residues, or combinations thereof. The cCPP can comprise (i) 3 glycine, β-alanine, 4-aminobutyric acid residues, or combinations thereof. The cCPP can comprise (i) 4 glycine, β-alanine, 4-aminobutyric acid residues, or combinations thereof. The cCPP can comprise (i) 5 glycine, β-alanine, 4-aminobutyric acid residues, or combinations thereof. The cCPP can comprise (i) 6 glycine, β-alanine, 4-aminobutyric acid residues, or combinations thereof. The cCPP can comprise (i) 3, 4, or 5 glycine, β-alanine, 4-aminobutyric acid residues, or combinations thereof. The cCPP can comprise (i) 3 or 4 glycine, β-alanine, 4-aminobutyric acid residues, or combinations thereof.

The cCPP can comprise at least three glycine residues. The cCPP can comprise (i) 3, 4, 5, or 6 glycine residues. The cCPP can comprise (i) 3 glycine residues. The cCPP can comprise (i) 4 glycine residues. The cCPP can comprise (i) 5 glycine residues. The cCPP can comprise (i) 6 glycine residues. The cCPP can comprise (i) 3, 4, or 5 glycine residues. The cCPP can comprise (i) 3 or 4 glycine residues.

In embodiments, none of the glycine, β-alanine, or 4-aminobutyric acid residues in the cCPP are contiguous. Two or three glycine, β-alanine, or 4-aminobutyric acid residues can be contiguous. Two glycine, β-alanine, or 4-aminobutyric acid residues can be contiguous.

In embodiments, none of the glycine residues in the cCPP are contiguous. Each glycine residue in the cCPP can be separated by an amino acid residue that cannot be glycine. Two or three glycine residues can be contiguous. Two glycine residues can be contiguous.
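Because the cCPP is cyclic, the last residue in a written sequence neighbours the first, so contiguity has to be assessed around the ring. The following is a minimal sketch of such a check; the helper name and the residue labels "bAla" and "Abu" (for β-alanine and 4-aminobutyric acid) are assumptions chosen for illustration.

```python
# Illustrative sketch; the helper name and the "bAla"/"Abu" labels are
# hypothetical conventions, not taken from the source.
GLYCINE_LIKE = {"Gly", "bAla", "Abu"}


def longest_cyclic_run(residues, members):
    """Longest run of consecutive residues in `members`, treating the
    sequence as a ring so that the last residue neighbours the first."""
    n = len(residues)
    flags = [r in members for r in residues]
    if all(flags):
        return n  # every residue qualifies (also covers the empty ring)
    # Start scanning just after a non-member so wrap-around runs stay intact.
    start = flags.index(False)
    best = run = 0
    for k in range(1, n + 1):
        if flags[(start + k) % n]:
            run += 1
            best = max(best, run)
        else:
            run = 0
    return best


ring = ["Gly", "Phe", "Arg", "Gly"]
print(longest_cyclic_run(ring, GLYCINE_LIKE))  # 2: the two Gly meet at the ring closure
```

A return value of 1 corresponds to the embodiment in which no two such residues are contiguous; 2 or 3 corresponds to the embodiments that allow two or three contiguous residues.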

Amino Acid Side Chains with an Aromatic or Heteroaromatic Group

The cCPP can comprise (ii) 2, 3, 4, 5 or 6 amino acid residues independently having a side chain comprising an aromatic or heteroaromatic group. The cCPP can comprise (ii) 2 amino acid residues independently having a side chain comprising an aromatic or heteroaromatic group. The cCPP can comprise (ii) 3 amino acid residues independently having a side chain comprising an aromatic or heteroaromatic group. The cCPP can comprise (ii) 4 amino acid residues independently having a side chain comprising an aromatic or heteroaromatic group. The cCPP can comprise (ii) 5 amino acid residues independently having a side chain comprising an aromatic or heteroaromatic group. The cCPP can comprise (ii) 6 amino acid residues independently having a side chain comprising an aromatic or heteroaromatic group. The cCPP can comprise (ii) 2, 3, or 4 amino acid residues independently having a side chain comprising an aromatic or heteroaromatic group. The cCPP can comprise (ii) 2 or 3 amino acid residues independently having a side chain comprising an aromatic or heteroaromatic group.

The cCPP can comprise (ii) 2, 3, 4, 5 or 6 amino acid residues independently having a side chain comprising an aromatic group. The cCPP can comprise (ii) 2 amino acid residues independently having a side chain comprising an aromatic group. The cCPP can comprise (ii) 3 amino acid residues independently having a side chain comprising an aromatic group. The cCPP can comprise (ii) 4 amino acid residues independently having a side chain comprising an aromatic group. The cCPP can comprise (ii) 5 amino acid residues independently having a side chain comprising an aromatic group. The cCPP can comprise (ii) 6 amino acid residues independently having a side chain comprising an aromatic group. The cCPP can comprise (ii) 2, 3, or 4 amino acid residues independently having a side chain comprising an aromatic group. The cCPP can comprise (ii) 2 or 3 amino acid residues independently having a side chain comprising an aromatic group.

The aromatic group can be a 6- to 14-membered aryl. Aryl can be phenyl, naphthyl or anthracenyl, each of which is optionally substituted. Aryl can be phenyl or naphthyl, each of which is optionally substituted. The heteroaromatic group can be a 6- to 14-membered heteroaryl having 1, 2, or 3 heteroatoms selected from N, O, and S. Heteroaryl can be pyridyl, quinolyl, or isoquinolyl.

The amino acid residue having a side chain comprising an aromatic or heteroaromatic group can each independently be bis(homonaphthylalanine), homonaphthylalanine, naphthylalanine, phenylglycine, bis(homophenylalanine), homophenylalanine, phenylalanine, tryptophan, 3-(3-benzothienyl)-alanine, 3-(2-quinolyl)-alanine, O-benzylserine, 3-(4-(benzyloxy)phenyl)-alanine, S-(4-methylbenzyl)cysteine, N-(naphthalen-2-yl)glutamine, 3-(1,1'-biphenyl-4-yl)-alanine, 3-(3-benzothienyl)-alanine or tyrosine, each of which is optionally substituted with one or more substituents. The amino acid having a side chain comprising an aromatic or heteroaromatic group can each independently be selected from:

[Chemical structures not reproduced: 3-(2-quinolyl)-alanine, O-benzylserine, 3-(4-(benzyloxy)phenyl)-alanine, S-(4-methylbenzyl)cysteine, N-(naphthalen-2-yl)glutamine, 3-(1,1'-biphenyl-4-yl)-alanine, and 3-(3-benzothienyl)-alanine]

wherein the H on the N-terminus and/or the H on the C-terminus are replaced by a peptide bond.

The amino acid residue having a side chain comprising an aromatic or heteroaromatic group can each be independently a residue of phenylalanine, naphthylalanine, phenylglycine, homophenylalanine, homonaphthylalanine, bis(homophenylalanine), bis(homonaphthylalanine), tryptophan, or tyrosine, each of which is optionally substituted with one or more substituents. The amino acid residue having a side chain comprising an aromatic group can each independently be a residue of tyrosine, phenylalanine, 1-naphthylalanine, 2-naphthylalanine, tryptophan, 3-benzothienylalanine, 4-phenylphenylalanine, 3,4-difluorophenylalanine, 4-trifluoromethylphenylalanine, 2,3,4,5,6-pentafluorophenylalanine, homophenylalanine, β-homophenylalanine, 4-tert-butyl-phenylalanine, 4-pyridinylalanine, 3-pyridinylalanine, 4-methylphenylalanine, 4-fluorophenylalanine, 4-chlorophenylalanine, or 3-(9-anthryl)-alanine. The amino acid residue having a side chain comprising an aromatic group can each independently be a residue of phenylalanine, naphthylalanine, phenylglycine, homophenylalanine, or homonaphthylalanine, each of which is optionally substituted with one or more substituents. The amino acid residue having a side chain comprising an aromatic group can each be independently a residue of phenylalanine, naphthylalanine, homophenylalanine, homonaphthylalanine, bis(homophenylalanine), or bis(homonaphthylalanine), each of which is optionally substituted with one or more substituents. The amino acid residue having a side chain comprising an aromatic group can each be independently a residue of phenylalanine or naphthylalanine, each of which is optionally substituted with one or more substituents. At least one amino acid residue having a side chain comprising an aromatic group can be a residue of phenylalanine. At least two amino acid residues having a side chain comprising an aromatic group can be residues of phenylalanine. Each amino acid residue having a side chain comprising an aromatic group can be a residue of phenylalanine.

In embodiments, none of the amino acids having the side chain comprising the aromatic or heteroaromatic group are contiguous. Two amino acids having the side chain comprising the aromatic or heteroaromatic group can be contiguous. Two contiguous amino acids can have opposite stereochemistry. The two contiguous amino acids can have the same stereochemistry. Three amino acids having the side chain comprising the aromatic or heteroaromatic group can be contiguous. Three contiguous amino acids can have the same stereochemistry. Three contiguous amino acids can have alternating stereochemistry.

The amino acid residues comprising aromatic or heteroaromatic groups can be L-amino acids. The amino acid residues comprising aromatic or heteroaromatic groups can be D-amino acids. The amino acid residues comprising aromatic or heteroaromatic groups can be a mixture of D- and L-amino acids.

The optional substituent can be any atom or group which does not significantly reduce (e.g., by more than 50%) the cytosolic delivery efficiency of the cCPP, e.g., compared to an otherwise identical sequence which does not have the substituent. The optional substituent can be a hydrophobic substituent or a hydrophilic substituent. The optional substituent can be a hydrophobic substituent. The substituent can increase the solvent-accessible surface area (as defined herein) of the hydrophobic amino acid. The substituent can be halogen, alkyl, alkenyl, alkynyl, cycloalkyl, cycloalkenyl, cycloalkynyl, heterocyclyl, aryl, heteroaryl, alkoxy, aryloxy, acyl, alkylcarbamoyl, alkylcarboxamidyl, alkoxycarbonyl, alkylthio, or arylthio. The substituent can be halogen.

While not wishing to be bound by theory, it is believed that amino acids having an aromatic or heteroaromatic group having higher hydrophobicity values (i.e., amino acids having side chains comprising aromatic or heteroaromatic groups) can improve cytosolic delivery efficiency of a cCPP relative to amino acids having a lower hydrophobicity value. Each hydrophobic amino acid can independently have a hydrophobicity value greater than that of glycine. Each hydrophobic amino acid can independently be a hydrophobic amino acid having a hydrophobicity value greater than that of alanine. Each hydrophobic amino acid can independently have a hydrophobicity value greater or equal to phenylalanine. Hydrophobicity may be measured using hydrophobicity scales known in the art. Table 5 lists hydrophobicity values for various amino acids as reported by Eisenberg and Weiss (Proc. Natl. Acad. Sci. U. S. A. 1984;81(1):140-144), Engleman, et al. (Ann. Rev. of Biophys. Biophys. Chem. 1986;1986(15):321-53), Kyte and Doolittle (J. Mol. Biol. 1982;157(1):105-132), Hoop and Woods (Proc. Natl. Acad. Sci. U. S. A. 1981;78(6):3824-3828), and Janin (Nature. 1979;277(5696):491-492), the entirety of each of which is herein incorporated by reference. Hydrophobicity can be measured using the hydrophobicity scale reported in Engleman, et al.

Table 5. Amino Acid Hydrophobicity

Amino Acid	Group	Eisenberg and Weiss	Engleman et al.	Kyte and Doolittle	Hoop and Woods	Janin
Ile	Nonpolar	0.73	3.1	4.5	-1.8	0.7
Phe	Nonpolar	0.61	3.7	2.8	-2.5	0.5
Val	Nonpolar	0.54	2.6	4.2	-1.5	0.6
Leu	Nonpolar	0.53	2.8	3.8	-1.8	0.5
Trp	Nonpolar	0.37	1.9	-0.9	-3.4	0.3
Met	Nonpolar	0.26	3.4	1.9	-1.3	0.4
Ala	Nonpolar	0.25	1.6	1.8	-0.5	0.3
Gly	Nonpolar	0.16	1.0	-0.4	0.0	0.3
Cys	Unch/Polar	0.04	2.0	2.5	-1.0	0.9
Tyr	Unch/Polar	0.02	-0.7	-1.3	-2.3	-0.4
Pro	Nonpolar	-0.07	-0.2	-1.6	0.0	-0.3
Thr	Unch/Polar	-0.18	1.2	-0.7	-0.4	-0.2
Ser	Unch/Polar	-0.26	0.6	-0.8	0.3	-0.1
His	Charged	-0.40	-3.0	-3.2	-0.5	-0.1
Glu	Charged	-0.62	-8.2	-3.5	3.0	-0.7
Asn	Unch/Polar	-0.64	-4.8	-3.5	0.2	-0.5
Gln	Unch/Polar	-0.69	-4.1	-3.5	0.2	-0.7
Asp	Charged	-0.72	-9.2	-3.5	3.0	-0.6
Lys	Charged	-1.10	-8.8	-3.9	3.0	-1.8
Arg	Charged	-1.80	-12.3	-4.5	3.0	-1.4
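The comparisons against glycine, alanine, and phenylalanine described before Table 5 can be sketched as a simple lookup. The values below are the Engleman et al. column for a subset of residues; the function name is hypothetical, and choosing this one scale follows the preference stated in the text rather than any requirement.

```python
# Illustrative sketch; the function name is hypothetical. Values are the
# Engleman et al. column of Table 5 for a subset of residues.
ENGLEMAN = {
    "Ile": 3.1, "Phe": 3.7, "Val": 2.6, "Leu": 2.8, "Trp": 1.9,
    "Met": 3.4, "Ala": 1.6, "Gly": 1.0, "Tyr": -0.7, "Arg": -12.3,
}


def more_hydrophobic_than(residue, reference, scale=ENGLEMAN, or_equal=False):
    """Compare two residues on one hydrophobicity scale."""
    a, b = scale[residue], scale[reference]
    return a >= b if or_equal else a > b


print(more_hydrophobic_than("Phe", "Gly"))                 # True
print(more_hydrophobic_than("Trp", "Ala"))                 # True
print(more_hydrophobic_than("Tyr", "Gly"))                 # False
print(more_hydrophobic_than("Phe", "Phe", or_equal=True))  # True
```

The result depends on the scale chosen: tyrosine, for example, falls below glycine on this scale, so a residue can pass the test on one scale and fail it on another.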

The size of the aromatic or heteroaromatic groups may be selected to improve cytosolic delivery efficiency of the cCPP. While not wishing to be bound by theory, it is believed that a larger aromatic or heteroaromatic group on the side chain of an amino acid may improve cytosolic delivery efficiency compared to an otherwise identical sequence having a smaller hydrophobic amino acid.

The size of the hydrophobic amino acid can be measured in terms of molecular weight of the hydrophobic amino acid, the steric effects of the hydrophobic amino acid, the solvent-accessible surface area (SASA) of the side chain, or combinations thereof. The size of the hydrophobic amino acid can be measured in terms of the molecular weight of the hydrophobic amino acid, and the larger hydrophobic amino acid has a side chain with a molecular weight of at least about 90 g/mol, or at least about 130 g/mol, or at least about 141 g/mol. The size of the amino acid can be measured in terms of the SASA of the hydrophobic side chain. The hydrophobic amino acid can have a side chain with a SASA of greater than or equal to alanine, or greater than or equal to glycine. Larger hydrophobic amino acids can have a side chain with a SASA greater than alanine, or greater than glycine. The hydrophobic amino acid can have an aromatic or heteroaromatic group with a SASA greater than or equal to about piperidine-2-carboxylic acid, greater than or equal to about tryptophan, greater than or equal to about phenylalanine, or greater than or equal to about naphthylalanine. A first hydrophobic amino acid (AA_H1) can have a side chain with a SASA of at least about 200 Å², at least about 210 Å², at least about 220 Å², at least about 240 Å², at least about 250 Å², at least about 260 Å², at least about 270 Å², at least about 280 Å², at least about 290 Å², at least about 300 Å², at least about 310 Å², at least about 320 Å², or at least about 330 Å². A second hydrophobic amino acid (AA_H2) can have a side chain with a SASA of at least about 200 Å², at least about 210 Å², at least about 220 Å², at least about 240 Å², at least about 250 Å², at least about 260 Å², at least about 270 Å², at least about 280 Å², at least about 290 Å², at least about 300 Å², at least about 310 Å², at least about 320 Å², or at least about 330 Å². The side chains of AA_H1 and AA_H2 can have a combined SASA of at least about 350 Å², at least about 360 Å², at least about 370 Å², at least about 380 Å², at least about 390 Å², at least about 400 Å², at least about 410 Å², at least about 420 Å², at least about 430 Å², at least about 440 Å², at least about 450 Å², at least about 460 Å², at least about 470 Å², at least about 480 Å², at least about 490 Å², greater than about 500 Å², at least about 510 Å², at least about 520 Å², at least about 530 Å², at least about 540 Å², at least about 550 Å², at least about 560 Å², at least about 570 Å², at least about 580 Å², at least about 590 Å², at least about 600 Å², at least about 610 Å², at least about 620 Å², at least about 630 Å², at least about 640 Å², greater than about 650 Å², at least about 660 Å², at least about 670 Å², at least about 680 Å², at least about 690 Å², or at least about 700 Å². AA_H2 can be a hydrophobic amino acid residue with a side chain having a SASA that is less than or equal to the SASA of the hydrophobic side chain of AA_H1. By way of example, and not by limitation, a cCPP having a Nal-Arg motif may exhibit improved cytosolic delivery efficiency compared to an otherwise identical cCPP having a Phe-Arg motif; a cCPP having a Phe-Nal-Arg motif may exhibit improved cytosolic delivery efficiency compared to an otherwise identical cCPP having a Nal-Phe-Arg motif; and a cCPP having a phe-Nal-Arg motif may exhibit improved cytosolic delivery efficiency compared to an otherwise identical cCPP having a nal-Phe-Arg motif.

As used herein, “hydrophobic surface area” or “SASA” refers to the surface area (reported as square Ångstroms; Å²) of an amino acid side chain that is accessible to a solvent. SASA can be calculated using the 'rolling ball' algorithm developed by Shrake & Rupley (J Mol Biol. 79 (2): 351-71), which is herein incorporated by reference in its entirety for all purposes. This algorithm uses a “sphere” of solvent of a particular radius to probe the surface of the molecule. A typical value of the sphere is 1.4 Å, which approximates to the radius of a water molecule.
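A minimal sketch of the Shrake and Rupley procedure is given below for orientation. It is a simplified pure-Python illustration, not the implementation used to derive any value in this disclosure: the atom list format, the function names, the number of test points, and the golden-spiral point placement are all assumptions.

```python
import math


def sphere_points(n):
    """Roughly uniform points on a unit sphere (golden-spiral lattice)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - 2.0 * (i + 0.5) / n
        r = math.sqrt(1.0 - z * z)
        phi = i * golden_angle
        points.append((r * math.cos(phi), r * math.sin(phi), z))
    return points


def shrake_rupley(atoms, probe=1.4, n_points=960):
    """Per-atom SASA in square Angstroms.

    atoms: list of (x, y, z, radius) tuples in Angstroms (hypothetical format).
    probe: solvent probe radius; 1.4 Angstroms approximates water.
    """
    unit = sphere_points(n_points)
    areas = []
    for i, (xi, yi, zi, ri) in enumerate(atoms):
        big_r = ri + probe  # radius of the probe-expanded sphere
        accessible = 0
        for ux, uy, uz in unit:
            px, py, pz = xi + big_r * ux, yi + big_r * uy, zi + big_r * uz
            # A test point is buried if it lies inside the expanded sphere
            # of any other atom.
            buried = any(
                (px - xj) ** 2 + (py - yj) ** 2 + (pz - zj) ** 2 < (rj + probe) ** 2
                for j, (xj, yj, zj, rj) in enumerate(atoms)
                if j != i
            )
            if not buried:
                accessible += 1
        areas.append(4.0 * math.pi * big_r ** 2 * accessible / n_points)
    return areas


# An isolated atom of radius 1.7 exposes its whole expanded sphere,
# 4*pi*(1.7 + 1.4)^2, which is about 120.8 square Angstroms.
print(shrake_rupley([(0.0, 0.0, 0.0, 1.7)]))
```

Adding a second atom close to the first buries part of each expanded sphere, so each per-atom value falls below the isolated-atom figure. Production tools use neighbour lists and tabulated atomic radii, which this sketch omits.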

SASA values for certain side chains are shown below in Table 6. The SASA values described herein are based on the theoretical values listed in Table 6 below, as reported by Tien, et al. (PLOS ONE 8(11): e80635, available at doi.org/10.1371/journal.pone.0080635), which is herein incorporated by reference in its entirety for all purposes.

Table 6. Amino Acid SASA Values

Residue	Theoretical	Empirical	Miller et al. (1987)	Rose et al. (1985)
Alanine	129.0	121.0	113.0	118.1
Arginine	274.0	265.0	241.0	256.0
Asparagine	195.0	187.0	158.0	165.5
Aspartate	193.0	187.0	151.0	158.7
Cysteine	167.0	148.0	140.0	146.1
Glutamate	223.0	214.0	183.0	186.2
Glutamine	225.0	214.0	189.0	193.2
Glycine	104.0	97.0	85.0	88.1
Histidine	224.0	216.0	194.0	202.5
Isoleucine	197.0	195.0	182.0	181.0
Leucine	201.0	191.0	180.0	193.1
Lysine	236.0	230.0	211.0	225.8
Methionine	224.0	203.0	204.0	203.4
Phenylalanine	240.0	228.0	218.0	222.8
Proline	159.0	154.0	143.0	146.8
Serine	155.0	143.0	122.0	129.8
Threonine	172.0	163.0	146.0	152.5
Tryptophan	285.0	264.0	259.0	266.3
Tyrosine	263.0	255.0	229.0	236.8
Valine	174.0	165.0	160.0	164.5
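The combined-SASA thresholds described for a pair of hydrophobic residues can be checked against the theoretical column of Table 6 as sketched below. Treating the tabulated values as the SASA figures the text refers to is an assumption, as are the function names and the default threshold of 350 Å² (the lowest combined value recited). Residues absent from Table 6, such as naphthylalanine, are not covered.

```python
# Illustrative sketch; function names and the default threshold are
# assumptions. Values are the "Theoretical" column of Table 6 in square
# Angstroms, for a subset of residues.
THEORETICAL_SASA = {
    "Ala": 129.0, "Gly": 104.0, "Leu": 201.0, "Phe": 240.0,
    "Trp": 285.0, "Tyr": 263.0, "Val": 174.0,
}


def combined_sasa(first, second, table=THEORETICAL_SASA):
    """Sum of the tabulated SASA values for two residues."""
    return table[first] + table[second]


def meets_combined_threshold(first, second, threshold=350.0):
    """True if the pair reaches the given combined SASA threshold."""
    return combined_sasa(first, second) >= threshold


print(combined_sasa("Phe", "Trp"))             # 525.0
print(meets_combined_threshold("Phe", "Phe"))  # True  (480.0 >= 350.0)
print(meets_combined_threshold("Ala", "Gly"))  # False (233.0 < 350.0)
```

The same lookup can be pointed at the empirical or literature columns by passing a different table, which will shift pairs near a threshold.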

Amino Acid Residues Having a Side Chain Comprising a Guanidine Group, Guanidine Replacement Group, or Protonated Form Thereof

As used herein, guanidine refers to the structure:

H₂N-C(=NH)-NH-

As used herein, a protonated form of guanidine refers to the structure:

Guanidine replacement groups refer to functional groups on the side chain of amino acids that will be positively charged at or above physiological pH or those that can recapitulate the hydrogen bond donating and accepting activity of guanidinium groups.

The guanidine replacement groups facilitate cell penetration and delivery of therapeutic agents while reducing toxicity associated with guanidine groups or protonated forms thereof. The cCPP can comprise at least one amino acid having a side chain comprising a guanidine or guanidinium replacement group. The cCPP can comprise at least two amino acids having a side chain comprising a guanidine or guanidinium replacement group. The cCPP can comprise at least three amino acids having a side chain comprising a guanidine or guanidinium replacement group.

The guanidine or guanidinium group can be an isostere of guanidine or guanidinium. The guanidine or guanidinium replacement group can be less basic than guanidine.

As used herein, a guanidine replacement group refers to [chemical structures not reproduced], or a protonated form thereof.

The disclosure relates to a cCPP comprising from 4 to 20 amino acid residues, wherein: (i) at least one amino acid has a side chain comprising a guanidine group, or a protonated form thereof; (ii) at least one amino acid residue has no side chain or a side chain comprising [chemical structures not reproduced], or a protonated form thereof; and (iii) at least two amino acid residues independently have a side chain comprising an aromatic or heteroaromatic group.

At least two amino acid residues can have no side chain or a side chain comprising [chemical structures not reproduced], or a protonated form thereof. As used herein, when no side chain is present, the amino acid residue has two hydrogen atoms on the carbon atom(s) (e.g., -CH2-) linking the amine and carboxylic acid.

The cCPP can comprise at least one amino acid having a side chain comprising one of the following moieties: [chemical structures not reproduced], or a protonated form thereof.

The cCPP can comprise at least two amino acids each independently having one of the following moieties: [chemical structures not reproduced], or a protonated form thereof. At least two amino acids can have a side chain comprising the same moiety selected from: [chemical structures not reproduced], or a protonated form thereof. At least one amino acid can have a side chain comprising [chemical structure not reproduced], or a protonated form thereof. At least two amino acids can have a side chain comprising [chemical structure not reproduced], or a protonated form thereof. One, two, three, or four amino acids can have a side chain comprising [chemical structure not reproduced], or a protonated form thereof. One amino acid can have a side chain comprising [chemical structure not reproduced], or a protonated form thereof. Two amino acids can have a side chain comprising [chemical structure not reproduced], or a protonated form thereof. [Chemical structures not reproduced], or a protonated form thereof, can be attached to the terminus of the amino acid side chain. [Chemical structure not reproduced] can be attached to the terminus of the amino acid side chain.

The cCPP can comprise (iii) 2, 3, 4, 5 or 6 amino acid residues independently having a side chain comprising a guanidine group, guanidine replacement group, or a protonated form thereof. The cCPP can comprise (iii) 2 amino acid residues independently having a side chain comprising a guanidine group, guanidine replacement group, or a protonated form thereof. The cCPP can comprise (iii) 3 amino acid residues independently having a side chain comprising a guanidine group, guanidine replacement group, or a protonated form thereof. The cCPP can comprise (iii) 4 amino acid residues independently having a side chain comprising a guanidine group, guanidine replacement group, or a protonated form thereof. The cCPP can comprise (iii) 5 amino acid residues independently having a side chain comprising a guanidine group, guanidine replacement group, or a protonated form thereof. The cCPP can comprise (iii) 6 amino acid residues independently having a side chain comprising a guanidine group, guanidine replacement group, or a protonated form thereof. The cCPP can comprise (iii) 2, 3, 4, or 5 amino acid residues independently having a side chain comprising a guanidine group, guanidine replacement group, or a protonated form thereof. The cCPP can comprise (iii) 2, 3, or 4 amino acid residues independently having a side chain comprising a guanidine group, guanidine replacement group, or a protonated form thereof. The cCPP can comprise (iii) 2 or 3 amino acid residues independently having a side chain comprising a guanidine group, guanidine replacement group, or a protonated form thereof. The cCPP can comprise (iii) at least one amino acid residue having a side chain comprising a guanidine group or protonated form thereof. The cCPP can comprise (iii) two amino acid residues having a side chain comprising a guanidine group or protonated form thereof. The cCPP can comprise (iii) three amino acid residues having a side chain comprising a guanidine group or protonated form thereof.

The amino acid residues independently having the side chain comprising the guanidine group, guanidine replacement group, or the protonated form thereof can be non-contiguous. Two amino acid residues independently having the side chain comprising the guanidine group, guanidine replacement group, or the protonated form thereof can be contiguous. Three amino acid residues independently having the side chain comprising the guanidine group, guanidine replacement group, or the protonated form thereof can be contiguous. Four amino acid residues independently having the side chain comprising the guanidine group, guanidine replacement group, or the protonated form thereof can be contiguous. The contiguous amino acid residues can have the same stereochemistry. The contiguous amino acids can have alternating stereochemistry.

The amino acid residues independently having the side chain comprising the guanidine group, guanidine replacement group, or the protonated form thereof, can be L-amino acids. The amino acid residues independently having the side chain comprising the guanidine group, guanidine replacement group, or the protonated form thereof, can be D-amino acids. The amino acid residues independently having the side chain comprising the guanidine group, guanidine replacement group, or the protonated form thereof, can be a mixture of L- and D-amino acids.

Each amino acid residue having the side chain comprising the guanidine group, or the protonated form thereof, can independently be a residue of arginine, homoarginine, 2-amino-3-propionic acid, 2-amino-4-guanidinobutyric acid or a protonated form thereof. Each amino acid residue having the side chain comprising the guanidine group, or the protonated form thereof, can independently be a residue of arginine or a protonated form thereof.

Each amino acid having the side chain comprising a guanidine replacement group, or protonated

Without being bound by theory, it is hypothesized that guanidine replacement groups have reduced basicity relative to arginine, and in some cases are uncharged at physiological pH (e.g., a N(H)C(O)), and are capable of maintaining the bidentate hydrogen bonding interactions with phospholipids on the plasma membrane that are believed to facilitate effective membrane association and subsequent internalization. The removal of positive charge is also believed to reduce toxicity of the cCPP.

Those skilled in the art will appreciate that the N- and/or C-termini of the above non-natural aromatic hydrophobic amino acids, upon incorporation into the peptides disclosed herein, form amide bonds.

The cCPP can comprise a first amino acid having a side chain comprising an aromatic or heteroaromatic group and a second amino acid having a side chain comprising an aromatic or heteroaromatic group, wherein an N-terminus of a first glycine forms a peptide bond with the first amino acid having the side chain comprising the aromatic or heteroaromatic group, and a C-terminus of the first glycine forms a peptide bond with the second amino acid having the side chain comprising the aromatic or heteroaromatic group. Although by convention, the term “first amino acid” often refers to the N-terminal amino acid of a peptide sequence, as used herein “first amino acid” is used to distinguish the referent amino acid from another amino acid (e.g., a “second amino acid”) in the cCPP such that the term “first amino acid” may or may not refer to an amino acid located at the N-terminus of the peptide sequence.

The cCPP can comprise a second glycine, wherein an N-terminus of the second glycine forms a peptide bond with an amino acid having a side chain comprising an aromatic or heteroaromatic group, and a C-terminus of the second glycine forms a peptide bond with an amino acid having a side chain comprising a guanidine group, or a protonated form thereof.

The cCPP can comprise a first amino acid having a side chain comprising a guanidine group, or a protonated form thereof, and a second amino acid having a side chain comprising a guanidine group, or a protonated form thereof, wherein an N-terminus of a third glycine forms a peptide bond with the first amino acid having a side chain comprising a guanidine group, or a protonated form thereof, and a C-terminus of the third glycine forms a peptide bond with the second amino acid having a side chain comprising a guanidine group, or a protonated form thereof.

The cCPP can comprise a residue of asparagine, aspartic acid, glutamine, glutamic acid, or homoglutamine. The cCPP can comprise a residue of asparagine. The cCPP can comprise a residue of glutamine.

The cCPP can comprise a residue of tyrosine, phenylalanine, 1-naphthylalanine, 2-naphthylalanine, tryptophan, 3-benzothienylalanine, 4-phenylphenylalanine, 3,4-difluorophenylalanine, 4-trifluoromethylphenylalanine, 2,3,4,5,6-pentafluorophenylalanine, homophenylalanine, β-homophenylalanine, 4-tert-butyl-phenylalanine, 4-pyridinylalanine, 3-pyridinylalanine, 4-methylphenylalanine, 4-fluorophenylalanine, 4-chlorophenylalanine, or 3-(9-anthryl)-alanine.

While not wishing to be bound by theory, it is believed that the chirality of the amino acids in the cCPPs may impact cytosolic uptake efficiency. The cCPP can comprise at least one D amino acid. The cCPP can comprise one to fifteen D amino acids. The cCPP can comprise one to ten D amino acids. The cCPP can comprise 1, 2, 3, or 4 D amino acids. The cCPP can comprise 2, 3, 4, 5, 6, 7, or 8 contiguous amino acids having alternating D and L chirality. The cCPP can comprise three contiguous amino acids having the same chirality. The cCPP can comprise two contiguous amino acids having the same chirality. At least two of the amino acids can have the opposite chirality. The at least two amino acids having the opposite chirality can be adjacent to each other. At least three amino acids can have alternating stereochemistry relative to each other. The at least three amino acids having the alternating chirality relative to each other can be adjacent to each other. At least four amino acids can have alternating stereochemistry relative to each other. The at least four amino acids having the alternating chirality relative to each other can be adjacent to each other. At least two of the amino acids can have the same chirality. At least two amino acids having the same chirality can be adjacent to each other. At least two amino acids can have the same chirality and at least two amino acids can have the opposite chirality. The at least two amino acids having the opposite chirality can be adjacent to the at least two amino acids having the same chirality. Accordingly, adjacent amino acids in the cCPP can have any of the following sequences: D-L; L-D; D-L-L-D; L-D-D-L; L-D-L-L-D; D-L-D-D-L; D-L-L-D-L; or L-D-D-L-D. The amino acid residues that form the cCPP can all be L-amino acids. The amino acid residues that form the cCPP can all be D-amino acids.

At least two of the amino acids can have a different chirality. At least two amino acids having a different chirality can be adjacent to each other. At least three amino acids can have different chirality relative to an adjacent amino acid. At least four amino acids can have different chirality relative to an adjacent amino acid. At least two amino acids can have the same chirality and at least two amino acids can have a different chirality. One or more amino acid residues that form the cCPP can be achiral. The cCPP can comprise a motif of 3, 4, or 5 amino acids, wherein two amino acids having the same chirality can be separated by an achiral amino acid. The cCPPs can comprise the following sequences: D-X-D; D-X-D-X; D-X-D-X-D; L-X-L; L-X-L-X; or L-X-L-X-L, wherein X is an achiral amino acid. The achiral amino acid can be glycine.
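As an illustration only, the chirality motifs listed above can be checked programmatically. The sketch below is not part of the disclosure: it assumes a peptide is represented as a list of per-residue chirality labels ('D', 'L', or 'X' for achiral), and the function name `contains_motif` and the wrap-around search for cyclic peptides are hypothetical choices made for this example.

```python
# Motifs quoted from the paragraph above; X marks an achiral residue (e.g., glycine).
ACHIRAL_MOTIFS = ["D-X-D", "D-X-D-X", "D-X-D-X-D", "L-X-L", "L-X-L-X", "L-X-L-X-L"]


def contains_motif(chirality, motif, cyclic=True):
    """Return True if `motif` occurs as a contiguous run in `chirality`.

    chirality: list of 'D', 'L', or 'X' labels, one per residue (assumed encoding).
    motif:     hyphen-separated pattern such as "D-X-D".
    cyclic:    if True, the search wraps around the end of the list,
               which is an assumption made here for cyclic peptides.
    """
    pattern = motif.split("-")
    n, k = len(chirality), len(pattern)
    if k > n:
        return False
    starts = range(n) if cyclic else range(n - k + 1)
    return any(
        all(chirality[(s + i) % n] == pattern[i] for i in range(k))
        for s in starts
    )


# Hypothetical six-residue ring used only to exercise the function.
example = ["L", "D", "X", "D", "L", "X"]
matches = [m for m in ACHIRAL_MOTIFS if contains_motif(example, m)]
```

With the hypothetical ring above, the D-X-D motif is found in the linear reading, while L-X-L is found only when the search wraps around the ring closure.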

An amino acid having a side chain comprising: [structure], or a protonated form thereof, can be adjacent to an amino acid having a side chain comprising an aromatic or heteroaromatic group. An amino acid having a side chain comprising: [structure], or a protonated form thereof, can be adjacent to at least one amino acid having a side chain comprising a guanidine or protonated form thereof. An amino acid having a side chain comprising a guanidine or protonated form thereof can be adjacent to an amino acid having a side chain comprising an aromatic or heteroaromatic group.

Two amino acids having a side chain comprising: [structure], or protonated forms thereof, can be adjacent to each other. Two amino acids having a side chain comprising a guanidine or protonated form thereof can be adjacent to each other. The cCPPs can comprise at least two contiguous amino acids having a side chain comprising an aromatic or heteroaromatic group and at least two non-adjacent amino acids having a side chain comprising [structure], or a protonated form thereof. The cCPPs can comprise at least two contiguous amino acids having a side chain comprising an aromatic or heteroaromatic group and at least two non-adjacent amino acids having a side chain comprising H₂N-C(O)-N(H)-, or a protonated form thereof. The adjacent amino acids can have the same chirality. The adjacent amino acids can have the opposite chirality. Other combinations of amino acids can have any arrangement of D and L amino acids, e.g., any of the sequences described in the preceding paragraph.

At least two amino acids having a side chain comprising: [structure], or a protonated form thereof, can alternate with at least two amino acids having a side chain comprising a guanidine group or protonated form thereof.

The cCPP can comprise the structure of Formula (A):

or a protonated form thereof, wherein:

R1, R2, and R3 are each independently H or an aromatic or heteroaromatic side chain of an amino acid;

at least one of R1, R2, and R3 is an aromatic or heteroaromatic side chain of an amino acid;

R4, R5, R6, R7 are independently H or an amino acid side chain;

at least one of R4, R5, R6, R7 is the side chain of 3-guanidino-2-aminopropionic acid, 4-guanidino-2-aminobutanoic acid, arginine, homoarginine, N-methylarginine, N,N-dimethylarginine, 2,3-diaminopropionic acid, 2,4-diaminobutanoic acid, lysine, N-methyllysine, N,N-dimethyllysine, N-ethyllysine, N,N,N-trimethyllysine, 4-guanidinophenylalanine, citrulline, N,N-dimethyllysine, β-homoarginine, or 3-(1-piperidinyl)alanine;

AAsc is an amino acid side chain; and

q is 1, 2, 3 or 4.

In embodiments, at least one of R4, R5, R6, R7 is independently an uncharged, non-aromatic side chain of an amino acid. In embodiments, at least one of R4, R5, R6, R7 is independently H or a side chain of citrulline.

In embodiments, compounds are provided that include a cyclic peptide having 6 to 12 amino acids, wherein at least two amino acids of the cyclic peptide are charged amino acids, at least two amino acids of the cyclic peptide are aromatic hydrophobic amino acids and at least two amino acids of the cyclic peptide are uncharged, non-aromatic amino acids. In embodiments, at least two charged amino acids of the cyclic peptide are arginine. In embodiments, at least two aromatic, hydrophobic amino acids of the cyclic peptide are phenylalanine or naphthylalanine. In embodiments, at least two uncharged, non-aromatic amino acids of the cyclic peptide are citrulline or glycine.
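The composition rule in the preceding paragraph (6 to 12 residues, with at least two residues in each of three classes) can be sketched as a simple count. This is a hedged illustration, not part of the disclosure: the three-letter residue codes, the class sets (limited to the examples named in the paragraph), and the function name `meets_composition` are assumptions made for this example.

```python
from collections import Counter

# Class membership limited to the examples named in the paragraph above.
# Other residues may belong to these classes; they are simply not counted here.
CHARGED = {"Arg"}
AROMATIC_HYDROPHOBIC = {"Phe", "Nal"}
UNCHARGED_NONAROMATIC = {"Cit", "Gly"}


def meets_composition(residues):
    """Check the 6-12 residue length and the 'at least two of each class' rule.

    residues: list of three-letter residue codes (assumed encoding).
    """
    if not 6 <= len(residues) <= 12:
        return False
    counts = Counter(residues)
    charged = sum(counts[r] for r in CHARGED)
    aromatic = sum(counts[r] for r in AROMATIC_HYDROPHOBIC)
    neutral = sum(counts[r] for r in UNCHARGED_NONAROMATIC)
    return charged >= 2 and aromatic >= 2 and neutral >= 2


# Hypothetical inputs used only to exercise the function.
passes = meets_composition(["Phe", "Phe", "Nal", "Gly", "Arg", "Gly", "Arg", "Gln"])
fails = meets_composition(["Phe", "Nal", "Arg", "Arg", "Arg", "Arg", "Gln"])
```

Because the class sets list only the named examples, a residue such as glutamine is not counted toward any class in this sketch.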

In embodiments, the cyclic peptide of Formula (A) is not selected from a cyclic peptide having a sequence of SEQ ID NO: 89-117.

In embodiments, the cyclic peptide of Formula (A) is selected from a cyclic peptide having a sequence of SEQ ID NO: 89-117.

CPP sequences and SEQ ID NOs
FΦRRRQ	89	RRFRΦRQ	99	FΦRRRRQK	109
FΦRRRC	90	FRRRRΦQ	100	FΦRRRRQC	110
FΦRRRU	91	rRFRΦRQ	101	fΦRrRrRQ	111
RRRΦFQ	92	RRΦFRRQ	102	FΦRRRRRQ	112
RRRRΦF	93	CRRRRFWQ	103	RRRRΦFDΩC	113
FΦRRRR	94	FfΦRrRrQ	104	FΦRRR	114
FφrRrRq	95	FFΦRRRRQ	105	FWRRR	115
FφrRrRQ	96	RFRFRΦRQ	106	RRRΦF	116
FΦRRRRQ	97	URRRRFWQ	107	RRRWF	117
fΦRrRrQ	98	CRRRRFWQ	108

Φ = L-naphthylalanine; φ = D-naphthylalanine; Ω = L-norleucine

The cCPP can comprise the structure of Formula (I):

(I), or a protonated form thereof, wherein:

R1, R2, and R3 can each independently be H or an amino acid residue having a side chain comprising an aromatic group;

at least one of R1, R2, and R3 is an aromatic or heteroaromatic side chain of an amino acid;

R4 and R7 are independently H or an amino acid side chain;

AAsc is an amino acid side chain;

q is 1, 2, 3 or 4; and

each m is independently an integer of 0, 1, 2, or 3.

R1, R2, and R3 can each independently be H, -alkylene-aryl, or -alkylene-heteroaryl. R1, R2, and R3 can each independently be H, -C1-3alkylene-aryl, or -C1-3alkylene-heteroaryl. R1, R2, and R3 can each independently be H or -alkylene-aryl. R1, R2, and R3 can each independently be H or -C1-3alkylene-aryl. C1-3alkylene can be methylene. Aryl can be a 6- to 14-membered aryl. Heteroaryl can be a 6- to 14-membered heteroaryl having one or more heteroatoms selected from N, O, and S. Aryl can be selected from phenyl, naphthyl, or anthracenyl. Aryl can be phenyl or naphthyl. Aryl can be phenyl. Heteroaryl can be pyridyl, quinolyl, or isoquinolyl. R1, R2, and R3 can each independently be H, -C1-3alkylene-Ph or -C1-3alkylene-Naphthyl. R1, R2, and R3 can each independently be H, -CH₂Ph, or -CH₂Naphthyl. R1, R2, and R3 can each independently be H or -CH₂Ph.

R1, R2, and R3 can each independently be the side chain of tyrosine, phenylalanine, 1-naphthylalanine, 2-naphthylalanine, tryptophan, 3-benzothienylalanine, 4-phenylphenylalanine, 3,4-difluorophenylalanine, 4-trifluoromethylphenylalanine, 2,3,4,5,6-pentafluorophenylalanine, homophenylalanine, β-homophenylalanine, 4-tert-butyl-phenylalanine, 4-pyridinylalanine, 3-pyridinylalanine, 4-methylphenylalanine, 4-fluorophenylalanine, 4-chlorophenylalanine, or 3-(9-anthryl)-alanine.

R1 can be the side chain of tyrosine. R1 can be the side chain of phenylalanine. R1 can be the side chain of 1-naphthylalanine. R1 can be the side chain of 2-naphthylalanine. R1 can be the side chain of tryptophan. R1 can be the side chain of 3-benzothienylalanine. R1 can be the side chain of 4-phenylphenylalanine. R1 can be the side chain of 3,4-difluorophenylalanine. R1 can be the side chain of 4-trifluoromethylphenylalanine. R1 can be the side chain of 2,3,4,5,6-pentafluorophenylalanine. R1 can be the side chain of homophenylalanine. R1 can be the side chain of β-homophenylalanine. R1 can be the side chain of 4-tert-butyl-phenylalanine. R1 can be the side chain of 4-pyridinylalanine. R1 can be the side chain of 3-pyridinylalanine. R1 can be the side chain of 4-methylphenylalanine. R1 can be the side chain of 4-fluorophenylalanine. R1 can be the side chain of 4-chlorophenylalanine. R1 can be the side chain of 3-(9-anthryl)-alanine.

R2 can be the side chain of tyrosine. R2 can be the side chain of phenylalanine. R2 can be the side chain of 1-naphthylalanine. R2 can be the side chain of 2-naphthylalanine. R2 can be the side chain of tryptophan. R2 can be the side chain of 3-benzothienylalanine. R2 can be the side chain of 4-phenylphenylalanine. R2 can be the side chain of 3,4-difluorophenylalanine. R2 can be the side chain of 4-trifluoromethylphenylalanine. R2 can be the side chain of 2,3,4,5,6-pentafluorophenylalanine. R2 can be the side chain of homophenylalanine. R2 can be the side chain of β-homophenylalanine. R2 can be the side chain of 4-tert-butyl-phenylalanine. R2 can be the side chain of 4-pyridinylalanine. R2 can be the side chain of 3-pyridinylalanine. R2 can be the side chain of 4-methylphenylalanine. R2 can be the side chain of 4-fluorophenylalanine. R2 can be the side chain of 4-chlorophenylalanine. R2 can be the side chain of 3-(9-anthryl)-alanine.

R3 can be the side chain of tyrosine. R3 can be the side chain of phenylalanine. R3 can be the side chain of 1-naphthylalanine. R3 can be the side chain of 2-naphthylalanine. R3 can be the side chain of tryptophan. R3 can be the side chain of 3-benzothienylalanine. R3 can be the side chain of 4-phenylphenylalanine. R3 can be the side chain of 3,4-difluorophenylalanine. R3 can be the side chain of 4-trifluoromethylphenylalanine. R3 can be the side chain of 2,3,4,5,6-pentafluorophenylalanine. R3 can be the side chain of homophenylalanine. R3 can be the side chain of β-homophenylalanine. R3 can be the side chain of 4-tert-butyl-phenylalanine. R3 can be the side chain of 4-pyridinylalanine. R3 can be the side chain of 3-pyridinylalanine. R3 can be the side chain of 4-methylphenylalanine. R3 can be the side chain of 4-fluorophenylalanine. R3 can be the side chain of 4-chlorophenylalanine. R3 can be the side chain of 3-(9-anthryl)-alanine.

R4 can be H, -alkylene-aryl, or -alkylene-heteroaryl. R4 can be H, -C1-3alkylene-aryl, or -C1-3alkylene-heteroaryl. R4 can be H or -alkylene-aryl. R4 can be H or -C1-3alkylene-aryl. C1-3alkylene can be a methylene. Aryl can be a 6- to 14-membered aryl. Heteroaryl can be a 6- to 14-membered heteroaryl having one or more heteroatoms selected from N, O, and S. Aryl can be selected from phenyl, naphthyl, or anthracenyl. Aryl can be phenyl or naphthyl. Aryl can be phenyl. Heteroaryl can be pyridyl, quinolyl, or isoquinolyl. R4 can be H, -C1-3alkylene-Ph or -C1-3alkylene-Naphthyl. R4 can be H or the side chain of an amino acid in Table 4 or Table 6. R4 can be H or an amino acid residue having a side chain comprising an aromatic group. R4 can be H, -CH₂Ph, or -CH₂Naphthyl. R4 can be H or -CH₂Ph.

R5 can be H, -alkylene-aryl, or -alkylene-heteroaryl. R5 can be H, -C1-3alkylene-aryl, or -C1-3alkylene-heteroaryl. R5 can be H or -alkylene-aryl. R5 can be H or -C1-3alkylene-aryl. C1-3alkylene can be a methylene. Aryl can be a 6- to 14-membered aryl. Heteroaryl can be a 6- to 14-membered heteroaryl having one or more heteroatoms selected from N, O, and S. Aryl can be selected from phenyl, naphthyl, or anthracenyl. Aryl can be phenyl or naphthyl. Aryl can be phenyl. Heteroaryl can be pyridyl, quinolyl, or isoquinolyl. R5 can be H, -C1-3alkylene-Ph or -C1-3alkylene-Naphthyl. R5 can be H or the side chain of an amino acid in Table 4 or Table 6. R5 can be H or an amino acid residue having a side chain comprising an aromatic group. R5 can be H, -CH₂Ph, or -CH₂Naphthyl. R5 can be H or -CH₂Ph.

R6 can be H, -alkylene-aryl, or -alkylene-heteroaryl. R6 can be H, -C1-3alkylene-aryl, or -C1-3alkylene-heteroaryl. R6 can be H or -alkylene-aryl. R6 can be H or -C1-3alkylene-aryl. C1-3alkylene can be a methylene. Aryl can be a 6- to 14-membered aryl. Heteroaryl can be a 6- to 14-membered heteroaryl having one or more heteroatoms selected from N, O, and S. Aryl can be selected from phenyl, naphthyl, or anthracenyl. Aryl can be phenyl or naphthyl. Aryl can be phenyl. Heteroaryl can be pyridyl, quinolyl, or isoquinolyl. R6 can be H, -C1-3alkylene-Ph or -C1-3alkylene-Naphthyl. R6 can be H or the side chain of an amino acid in Table 4 or Table 6. R6 can be H or an amino acid residue having a side chain comprising an aromatic group. R6 can be H, -CH₂Ph, or -CH₂Naphthyl. R6 can be H or -CH₂Ph.

R7 can be H, -alkylene-aryl, or -alkylene-heteroaryl. R7 can be H, -C1-3alkylene-aryl, or -C1-3alkylene-heteroaryl. R7 can be H or -alkylene-aryl. R7 can be H or -C1-3alkylene-aryl. C1-3alkylene can be a methylene. Aryl can be a 6- to 14-membered aryl. Heteroaryl can be a 6- to 14-membered heteroaryl having one or more heteroatoms selected from N, O, and S. Aryl can be selected from phenyl, naphthyl, or anthracenyl. Aryl can be phenyl or naphthyl. Aryl can be phenyl. Heteroaryl can be pyridyl, quinolyl, or isoquinolyl. R7 can be H, -C1-3alkylene-Ph or -C1-3alkylene-Naphthyl. R7 can be H or the side chain of an amino acid in Table 4 or Table 6. R7 can be H or an amino acid residue having a side chain comprising an aromatic group. R7 can be H, -CH₂Ph, or -CH₂Naphthyl. R7 can be H or -CH₂Ph.

One, two or three of R1, R2, R3, R4, R5, R6, and R7 can be -CH₂Ph. One of R1, R2, R3, R4, R5, R6, and R7 can be -CH₂Ph. Two of R1, R2, R3, R4, R5, R6, and R7 can be -CH₂Ph. Three of R1, R2, R3, R4, R5, R6, and R7 can be -CH₂Ph. At least one of R1, R2, R3, R4, R5, R6, and R7 can be -CH₂Ph. No more than four of R1, R2, R3, R4, R5, R6, and R7 can be -CH₂Ph.

One, two or three of R1, R2, R3, and R4 are -CH₂Ph. One of R1, R2, R3, and R4 is -CH₂Ph. Two of R1, R2, R3, and R4 are -CH₂Ph. Three of R1, R2, R3, and R4 are -CH₂Ph. At least one of R1, R2, R3, and R4 is -CH₂Ph.

One, two or three of R1, R2, R3, R4, R5, R6, and R7 can be H. One of R1, R2, R3, R4, R5, R6, and R7 can be H. Two of R1, R2, R3, R4, R5, R6, and R7 can be H. Three of R1, R2, R3, R4, R5, R6, and R7 can be H. At least one of R1, R2, R3, R4, R5, R6, and R7 can be H. No more than three of R1, R2, R3, R4, R5, R6, and R7 can be -CH₂Ph.

One, two or three of R1, R2, R3, and R4 are H. One of R1, R2, R3, and R4 is H. Two of R1, R2, R3, and R4 are H. Three of R1, R2, R3, and R4 are H. At least one of R1, R2, R3, and R4 is H.

At least one of R4, R5, R6, and R7 can be the side chain of 3-guanidino-2-aminopropionic acid. At least one of R4, R5, R6, and R7 can be the side chain of 4-guanidino-2-aminobutanoic acid. At least one of R4, R5, R6, and R7 can be the side chain of arginine. At least one of R4, R5, R6, and R7 can be the side chain of homoarginine. At least one of R4, R5, R6, and R7 can be the side chain of N-methylarginine. At least one of R4, R5, R6, and R7 can be the side chain of N,N-dimethylarginine. At least one of R4, R5, R6, and R7 can be the side chain of 2,3-diaminopropionic acid. At least one of R4, R5, R6, and R7 can be the side chain of 2,4-diaminobutanoic acid or lysine. At least one of R4, R5, R6, and R7 can be the side chain of N-methyllysine. At least one of R4, R5, R6, and R7 can be the side chain of N,N-dimethyllysine. At least one of R4, R5, R6, and R7 can be the side chain of N-ethyllysine. At least one of R4, R5, R6, and R7 can be the side chain of N,N,N-trimethyllysine or 4-guanidinophenylalanine. At least one of R4, R5, R6, and R7 can be the side chain of citrulline. At least one of R4, R5, R6, and R7 can be the side chain of N,N-dimethyllysine or β-homoarginine. At least one of R4, R5, R6, and R7 can be the side chain of 3-(1-piperidinyl)alanine.

At least two of R4, R5, R6, and R7 can be the side chain of 3-guanidino-2-aminopropionic acid. At least two of R4, R5, R6, and R7 can be the side chain of 4-guanidino-2-aminobutanoic acid. At least two of R4, R5, R6, and R7 can be the side chain of arginine. At least two of R4, R5, R6, and R7 can be the side chain of homoarginine. At least two of R4, R5, R6, and R7 can be the side chain of N-methylarginine. At least two of R4, R5, R6, and R7 can be the side chain of N,N-dimethylarginine. At least two of R4, R5, R6, and R7 can be the side chain of 2,3-diaminopropionic acid. At least two of R4, R5, R6, and R7 can be the side chain of 2,4-diaminobutanoic acid or lysine. At least two of R4, R5, R6, and R7 can be the side chain of N-methyllysine. At least two of R4, R5, R6, and R7 can be the side chain of N,N-dimethyllysine. At least two of R4, R5, R6, and R7 can be the side chain of N-ethyllysine. At least two of R4, R5, R6, and R7 can be the side chain of N,N,N-trimethyllysine or 4-guanidinophenylalanine. At least two of R4, R5, R6, and R7 can be the side chain of citrulline. At least two of R4, R5, R6, and R7 can be the side chain of N,N-dimethyllysine or β-homoarginine. At least two of R4, R5, R6, and R7 can be the side chain of 3-(1-piperidinyl)alanine.

At least three of R4, R5, R6, and R7 can be the side chain of 3-guanidino-2-aminopropionic acid. At least three of R4, R5, R6, and R7 can be the side chain of 4-guanidino-2-aminobutanoic acid. At least three of R4, R5, R6, and R7 can be the side chain of arginine. At least three of R4, R5, R6, and R7 can be the side chain of homoarginine. At least three of R4, R5, R6, and R7 can be the side chain of N-methylarginine. At least three of R4, R5, R6, and R7 can be the side chain of N,N-dimethylarginine. At least three of R4, R5, R6, and R7 can be the side chain of 2,3-diaminopropionic acid. At least three of R4, R5, R6, and R7 can be the side chain of 2,4-diaminobutanoic acid or lysine. At least three of R4, R5, R6, and R7 can be the side chain of N-methyllysine. At least three of R4, R5, R6, and R7 can be the side chain of N,N-dimethyllysine. At least three of R4, R5, R6, and R7 can be the side chain of N-ethyllysine. At least three of R4, R5, R6, and R7 can be the side chain of N,N,N-trimethyllysine or 4-guanidinophenylalanine. At least three of R4, R5, R6, and R7 can be the side chain of citrulline. At least three of R4, R5, R6, and R7 can be the side chain of N,N-dimethyllysine or β-homoarginine. At least three of R4, R5, R6, and R7 can be the side chain of 3-(1-piperidinyl)alanine.

AAsc can be a side chain of a residue of asparagine, glutamine, or homoglutamine. AAsc can be a side chain of a residue of glutamine. The cCPP can further comprise a linker conjugated to the AAsc, e.g., the residue of asparagine, glutamine, or homoglutamine. Hence, the cCPP can further comprise a linker conjugated to the asparagine, glutamine, or homoglutamine residue. The cCPP can further comprise a linker conjugated to the glutamine residue.

q can be 1, 2, or 3. q can be 1 or 2. q can be 1. q can be 2. q can be 3. q can be 4.

m can be 1-3. m can be 1 or 2. m can be 0. m can be 1. m can be 2. m can be 3.

The cCPP of Formula (A) can comprise the structure of Formula (I):

(I), or protonated form thereof, wherein AAsc, R1, R2, R3, R4, R7, m, and q are as defined herein.

The cCPP of Formula (A) can comprise the structure of Formula (I-a) or Formula (I-b):

(I-b), or protonated form thereof, wherein AAsc, R1, R2, R3, R4, and m are as defined herein.

The cCPP of Formula (A) can comprise the structure of Formula (I-1), (I-2), (I-3), or (I-4):

or protonated form thereof, wherein AAsc and m are as defined herein.


The cCPP of Formula (A) can comprise the structure of Formula (I-5) or (I-6):

or a protonated form thereof, wherein AAsc is as defined herein.

The cCPP of Formula (A) can comprise the structure of Formula (I-1):

wherein AAsc and m are as defined herein.

The cCPP of Formula (A) can comprise the structure of Formula (I-2):

(I-2), or a protonated form thereof, wherein AAsc and m are as defined herein.

The cCPP of Formula (A) can comprise the structure of Formula (I-3):

(I-3), or a protonated form thereof, wherein AAsc and m are as defined herein.

The cCPP of Formula (A) can comprise the structure of Formula (I-4):

(I-4), or a protonated form thereof, wherein AAsc and m are as defined herein.

The cCPP of Formula (A) can comprise the structure of Formula (I-5):

(I-5), or a protonated form thereof, wherein AAsc and m are as defined herein.

The cCPP of Formula (A) can comprise the structure of Formula (I-6):

(I-6), or a protonated form thereof, wherein AAsc and m are as defined herein.

The cCPP can comprise one of the following sequences: FGFGRGR (SEQ ID NO:68); GfFGrGr (SEQ ID NO:69); FfΦGRGR (SEQ ID NO:70); FfFGRGR (SEQ ID NO:71); or FfΦGrGr (SEQ ID NO:72). The cCPP can have one of the following sequences: FGFΦ (SEQ ID NO:73); GfFGrGrQ (SEQ ID NO:74); FfΦGRGRQ (SEQ ID NO:75); FfFGRGRQ (SEQ ID NO:76); or FfΦGrGrQ (SEQ ID NO:77).

The disclosure also relates to a cCPP having the structure of Formula (II):

wherein:

AAsc is an amino acid side chain;

R^1a, R^1b, and R^1c are each independently a 6- to 14-membered aryl or a 6- to 14-membered heteroaryl;

R^2a, R^2b, R^2c and R^2d are independently an amino acid side chain;


, or a protonated form thereof;

at least one of R^2a, R^2b, R^2c and R^2d is guanidine or a protonated form thereof;

each n is independently an integer 0, 1, 2, 3, 4, or 5;

each n’ is independently an integer 0, 1, 2, or 3; and

if n’ is 0 then R^2a, R^2b, R^2c or R^2d is absent.

R and R’^d can be [structure], or a protonated form thereof, and the remaining of R^2a, R^2b, R^2c and R^2d can be guanidine or a protonated form thereof. At least two of R^2a, R^2b, R^2c and R^2d can be [structure], or a protonated form thereof, and the remaining of R^2a, R^2b, R^2c and R^2d can be guanidine, or a protonated form thereof.

H₂N-C(O)-N(H)-, or a protonated form thereof, and the remaining of R^2a, R^2b, R^2c and R^2d can be guanidine or a protonated form thereof. At least two R^2a, R^2b, R^2c and R^2d groups can be H₂N-C(O)-N(H)-, or a protonated form thereof, and the remaining of R^2a, R^2b, R^2c and R^2d are guanidine, or a protonated form thereof.

Each of R^2a, R^2b, R^2c and R^2d can independently be 2,3-diaminopropionic acid, 2,4-diaminobutyric acid, the side chains of ornithine, lysine, methyllysine, dimethyllysine, trimethyllysine, homo-lysine, serine, homo-serine, threonine, allo-threonine, histidine, 1-methylhistidine, 2-aminobutanedioic acid, aspartic acid, glutamic acid, or homo-glutamic acid.

AAsc can be [structure] or [structure], wherein t can be an integer from 0 to 5. AAsc can be [structure], wherein t can be an integer from 0 to 5. t can be 1 to 5. t can be 2 or 3. t can be 2. t can be 3.

R^1a, R^1b, and R^1c can each independently be 6- to 14-membered aryl. R^1a, R^1b, and R^1c can each independently be a 6- to 14-membered heteroaryl having one or more heteroatoms selected from N, O, or S. R^1a, R^1b, and R^1c can each be independently selected from phenyl, naphthyl, anthracenyl, pyridyl, quinolyl, or isoquinolyl. R^1a, R^1b, and R^1c can each be independently selected from phenyl, naphthyl, or anthracenyl. R^1a, R^1b, and R^1c can each be independently phenyl or naphthyl. R^1a, R^1b, and R^1c can each be independently selected from pyridyl, quinolyl, or isoquinolyl.

Each n’ can independently be 1 or 2. Each n’ can be 1. Each n’ can be 2. At least one n’ can be 0. At least one n’ can be 1. At least one n’ can be 2. At least one n’ can be 3. At least one n’ can be 4. At least one n’ can be 5.

Each n” can independently be an integer from 1 to 3. Each n” can independently be 2 or 3. Each n” can be 2. Each n” can be 3. At least one n” can be 0. At least one n” can be 1. At least one n” can be 2. At least one n” can be 3.

Each n” can independently be 1 or 2 and each n’ can independently be 2 or 3. Each n” can be 1 and each n’ can independently be 2 or 3. Each n” can be 1 and each n’ can be 2. Each n” is 1 and each n’ is 3.

The cCPP of Formula (II) can have the structure of Formula (II-1):

(II-1), wherein R^1a, R^1b, R^1c, R^2a, R^2b, R^2c, R^2d, AAsc, n’ and n” are as defined herein.

The cCPP of Formula (II) can have the structure of Formula (IIa):

wherein R^1a, R^1b, R^1c, R^2a, R^2b, R^2c, R^2d, AAsc and n’ are as defined herein.

The cCPP of Formula (II) can have the structure of Formula (IIb):

(IIb), wherein R^2a, R^2b, AAsc, and n’ are as defined herein.

The cCPP can have the structure of Formula (IIc):

wherein:

AAsc and n’ are as defined herein.

The cCPP of Formula (IIa) has one of the following structures:

wherein AAsc and n are as defined herein.

The cCPP of Formula (IIa) has one of the following structures:

wherein AAsc and n are as defined herein.

The cCPP of Formula (IIa) has one of the following structures:

wherein AAsc and n are as defined herein.

The cCPP of Formula (II) can have the structure:

The cCPP can have the structure of Formula (III):

(III), wherein:

AAsc is an amino acid side chain;

R^1a, R^1b, and R^1c are each independently a 6- to 14-membered aryl or a 6- to 14-membered heteroaryl;

R^2b and R^2d are each independently guanidine or a protonated form thereof;

each n” is independently an integer from 1 to 3;

each n’ is independently an integer from 1 to 5; and

each p’ is independently an integer from 0 to 5.

The cCPP of Formula (III) can have the structure of Formula (III-1):

wherein:

AAsc, R^1a, R^1b, R^1c, R^2a, R^2c, R^2b, R^2d, n’, n”, and p’ are as defined herein.

The cCPP of Formula (III) can have the structure of Formula (IIIa):

wherein:

AAsc, R^2a, R^2c, R^2b, R^2d, n’, n”, and p’ are as defined herein.

In Formulas (III), (III-1), and (IIIa), R^2a and R^2c can be H. R^2a and R^2c can be H and R^2b and R^2d can each independently be guanidine or protonated form thereof. R^2a can be H. R^2b can be H. p’ can be 0. R^2a and R^2c can be H and each p’ can be 0.

In Formulas (III), (III-1), and (IIIa), R^2a and R^2c can be H, R^2b and R^2d can each independently be guanidine or protonated form thereof, n” can be 2 or 3, and each p’ can be 0.

p’ can be 0. p’ can be 1. p’ can be 2. p’ can be 3. p’ can be 4. p’ can be 5.

The cCPP can have the structure:

The cCPP of Formula (A) can be selected from:

CPP Sequence	SEQ ID NO:
(FfΦRrRrQ)	78
(FfΦCit-r-Cit-rQ)	79
(FfΦGrGrQ)	80
(FfFGRGRQ)	81
(FGFGRGRQ)	82
(GfFGrGrQ)	83
(FGFGRRRQ)	84
(FGFRRRRQ)	85

The cCPP of Formula (A) can be selected from:

CPP Sequence	SEQ ID NO:
FΦRRRRQ	86
fΦRrRrQ	87
FfΦRrRrQ	78
FfΦCit-r-Cit-rQ	79
FfΦGrGrQ	80
FfΦRGRGQ	88
FfFGRGRQ	81
FGFGRGRQ	82
GfFGrGrQ	83
FGFGRRRQ	84
FGFRRRRQ	85

In embodiments, the cCPP is selected from:

CPP sequences and SEQ ID NOs
FΦRRRQ	89	RRFRΦRQ	99	FΦRRRRQK	109
FΦRRRC	90	FRRRRΦQ	100	FΦRRRRQC	110
FΦRRRU	91	rRFRΦRQ	101	fΦRrRrRQ	111
RRRΦFQ	92	RRΦFRRQ	102	FΦRRRRRQ	112
RRRRΦF	93	CRRRRFWQ	103	RRRRΦFDΩC	113
FΦRRRR	94	FfΦRrRrQ	104	FΦRRR	114
FφrRrRq	95	FFΦRRRRQ	105	FWRRR	115
FφrRrRQ	96	RFRFRΦRQ	106	RRRΦF	116
FΦRRRRQ	97	URRRRFWQ	107	RRRWF	117
fΦRrRrQ	98	CRRRRFWQ	108

Where Φ = L-naphthylalanine; φ = D-naphthylalanine; Ω = L-norleucine
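The one-letter notation used in the tables above can be split into residues mechanically. The following is a minimal sketch only: the function names, the `MULTI_LETTER` set, and the rule that a lower-case code denotes a D-amino acid are assumptions made for illustration (the legend above defines only Φ, φ and Ω), and achiral glycine is simply reported by the case of its letter.

```python
# Minimal sketch: split a cCPP sequence string into residue codes.
# Assumptions (not stated in the text): lower-case codes denote D-amino
# acids, and "Cit" is the only multi-letter code that needs handling here.

SPECIAL = {
    "Φ": "L-naphthylalanine",
    "φ": "D-naphthylalanine",
    "Ω": "L-norleucine",
}
MULTI_LETTER = ("Cit",)


def tokenize(seq: str) -> list[str]:
    """Return the residue codes of `seq`, ignoring hyphen separators."""
    residues = []
    i = 0
    while i < len(seq):
        if seq[i] == "-":
            i += 1
            continue
        for code in MULTI_LETTER:
            if seq.startswith(code, i):
                residues.append(code)
                i += len(code)
                break
        else:
            # No multi-letter code starts here: take a single character.
            residues.append(seq[i])
            i += 1
    return residues


def count_d_residues(seq: str) -> int:
    """Count residues whose code starts with a lower-case letter."""
    return sum(1 for res in tokenize(seq) if res[0].islower())


if __name__ == "__main__":
    for s in ("FfΦRrRrQ", "FfΦ-Cit-r-Cit-rQ"):
        print(s, tokenize(s), count_d_residues(s))
```

Scanning left to right, rather than splitting on hyphens, lets the sketch handle both "FfΦ-Cit-r-Cit-rQ" and "FfΦCit-r-Cit-rQ", which appear as alternative spellings of the same sequence.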

In embodiments, the cCPP is not selected from:

CPP sequences and SEQ ID NOs
FΦRRRQ	89	RRFRΦRQ	99	FΦRRRRQK	109
FΦRRRC	90	FRRRRΦQ	100	FΦRRRRQC	110
FΦRRRU	91	rRFRΦRQ	101	fΦRrRrRQ	111
RRRΦFQ	92	RRΦFRRQ	102	FΦRRRRRQ	112
RRRRΦF	93	CRRRRFWQ	103	RRRRΦFDΩC	113
FΦRRRR	94	FfΦRrRrQ	104	FΦRRR	114
FφrRrRq	95	FFΦRRRRQ	105	FWRRR	115
FφrRrRQ	96	RFRFRΦRQ	106	RRRΦF	116
FΦRRRRQ	97	URRRRFWQ	107	RRRWF	117
fΦRrRrQ	98	CRRRRFWQ	108

Where Φ = L-naphthylalanine; φ = D-naphthylalanine; Ω = L-norleucine

The cCPP can comprise the structure of Formula (D)

or a protonated form thereof, wherein:

R₁, R₂, and R₃ can each independently be H or an amino acid residue having a side chain comprising an aromatic group;

at least one of R₁, R₂, and R₃ is an aromatic or heteroaromatic side chain of an amino acid;

R₄ and R₆ are independently H or an amino acid side chain;

AAsc is an amino acid side chain;

q is 1, 2, 3 or 4;

each m is independently an integer 0, 1, 2, or 3, and each n is independently an integer 0, 1, 2, or 3.

The cCPP of Formula (D) can have the structure of Formula (D-I):

or a protonated form thereof, wherein:

R₁, R₂, and R₃ can each independently be H or an amino acid residue having a side chain comprising an aromatic group;

at least one of R₁, R₂, and R₃ is an aromatic or heteroaromatic side chain of an amino acid;

R₄ and R₆ are independently H or an amino acid side chain;

AAsc is an amino acid side chain;

q is 1, 2, 3 or 4;

each m is independently an integer 0, 1, 2, or 3, and

The cCPP of Formula (D) can have the structure of Formula (D-II):

or a protonated form thereof, wherein:

R₁, R₂, and R₃ can each independently be H or an amino acid residue having a side chain comprising an aromatic group;

at least one of R₁, R₂, and R₃ is an aromatic or heteroaromatic side chain of an amino acid;

R₄ and R₆ are independently H or an amino acid side chain;

AAsc is an amino acid side chain;

q is 1, 2, 3 or 4;

each m is independently an integer 0, 1, 2, or 3, each n is independently an integer 0, 1, 2, or 3, and

The cCPP of Formula (D) can have the structure of Formula (D-III):

or a protonated form thereof, wherein:

R₁, R₂, and R₃ can each independently be H or an amino acid residue having a side chain comprising an aromatic group;

at least one of R₁, R₂, and R₃ is an aromatic or heteroaromatic side chain of an amino acid;

R₄ and R₆ are independently H or an amino acid side chain;

AAsc is an amino acid side chain;

q is 1, 2, 3 or 4;

each m is independently an integer 0, 1, 2, or 3, each n is independently an integer 0, 1, 2, or 3, and

Y is .

The cCPP of Formula (D) can have the structure of Formula (D-IV):

or a protonated form thereof, wherein:

R₁, R₂, and R₃ can each independently be H or an amino acid residue having a side chain comprising an aromatic group;

at least one of R₁, R₂, and R₃ is an aromatic or heteroaromatic side chain of an amino acid;

R₄ and R₆ are independently H or an amino acid side chain;

AAsc is an amino acid side chain;

q is 1, 2, 3 or 4;

each m is independently an integer 0, 1, 2, or 3, and

The cCPP of Formula (D) can have the structure of Formula (D-V):

or a protonated form thereof, wherein:

R₁, R₂, and R₃ can each independently be H or an amino acid residue having a side chain comprising an aromatic group;

at least one of R₁, R₂, and R₃ is an aromatic or heteroaromatic side chain of an amino acid;

R₄ and R₆ are independently H or an amino acid side chain;

AAsc is an amino acid side chain;

q is 1, 2, 3 or 4;

each m is independently an integer 0, 1, 2, or 3, and

The AAsc can be conjugated to a linker.

Linker

The cCPP of the disclosure can be conjugated to a linker. The linker can link a cargo to the cCPP. The linker can be attached to the side chain of an amino acid of the cCPP, and the cargo can be attached at a suitable position on linker.

The linker can be any appropriate moiety which can conjugate a cCPP to one or more additional moieties, e.g., an exocyclic peptide (EP) and/or a cargo. Prior to conjugation to the cCPP and one or more additional moieties, the linker has two or more functional groups, each of which is independently capable of forming a covalent bond to the cCPP and one or more additional moieties. If the cargo is an oligonucleotide, the linker can be covalently bound to the 5' end of the cargo or the 3' end of the cargo. The linker can be covalently bound to the 5' end of the cargo. The linker can be covalently bound to the 3' end of the cargo. If the cargo is a peptide, the linker can be covalently bound to the N-terminus or the C-terminus of the cargo. The linker can be covalently bound to the backbone of the oligonucleotide or peptide cargo. The linker can be any appropriate moiety which conjugates a cCPP described herein to a cargo such as an oligonucleotide, peptide or small molecule.

The linker can comprise a hydrocarbon linker.

The linker can comprise a cleavage site. The cleavage site can be a disulfide, or caspase-cleavage site (e.g., Val-Cit-PABC).

The linker can comprise: (i) one or more D or L amino acids, each of which is optionally substituted; (ii) optionally substituted alkylene; (iii) optionally substituted alkenylene; (iv) optionally substituted alkynylene; (v) optionally substituted carbocyclyl; (vi) optionally substituted heterocyclyl; (vii) one or more -(R¹-J-R²)z”- subunits, wherein each of R¹ and R², at each instance, is independently selected from alkylene, alkenylene, alkynylene, carbocyclyl, and heterocyclyl, each J is independently C, NR³, -NR³C(O)-, S, or O, wherein R³ is independently selected from H, alkyl, alkenyl, alkynyl, carbocyclyl, and heterocyclyl, each of which is optionally substituted, and z” is an integer from 1 to 50; (viii) -(R¹-J)z”- or -(J-R¹)z”-, wherein each R¹, at each instance, is independently alkylene, alkenylene, alkynylene, carbocyclyl, or heterocyclyl, each J is independently C, NR³, -NR³C(O)-, S, or O, wherein R³ is H, alkyl, alkenyl, alkynyl, carbocyclyl, or heterocyclyl, each of which is optionally substituted, and z” is an integer from 1 to 50; or (ix) the linker can comprise one or more of (i) through (viii).

The linker can comprise one or more D or L amino acids and/or -(R¹-J-R²)z”-, wherein each of R¹ and R², at each instance, is independently alkylene, each J is independently C, NR³, -NR³C(O)-, S, or O, wherein R³ is independently selected from H and alkyl, and z” is an integer from 1 to 50; or combinations thereof.

The linker can comprise a -(OCH₂CH₂)z’- (e.g., as a spacer), wherein z’ is an integer from 1 to 23, e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, or 23. “-(OCH₂CH₂)z’-” can also be referred to as polyethylene glycol (PEG).
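As a rough illustration of how the size of such a spacer scales with z’, the sketch below computes the elemental composition and average mass of the repeating ethylene-oxide backbone only. The function name `peg_spacer` is hypothetical, end groups and attached moieties are deliberately ignored, and the atomic masses are standard average values rounded to three decimals.

```python
# Minimal sketch: composition and average mass of a -(OCH2CH2)z'- backbone.
# Only the repeating unit (C2H4O) is counted; end groups are ignored.

ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}  # g/mol, average


def peg_spacer(z_prime: int) -> dict:
    """Describe a PEG spacer with `z_prime` ethylene-oxide units (1 to 23)."""
    if not 1 <= z_prime <= 23:
        raise ValueError("z' must be an integer from 1 to 23")
    atoms = {"C": 2 * z_prime, "H": 4 * z_prime, "O": z_prime}
    mass = sum(ATOMIC_MASS[el] * count for el, count in atoms.items())
    formula = "".join(f"{el}{count}" for el, count in atoms.items())
    return {"formula": formula, "average_mass": round(mass, 2)}


if __name__ == "__main__":
    for z in (2, 4, 12):
        print(z, peg_spacer(z))
```

Each additional unit adds about 44 g/mol, so a spacer with z’ = 12 has a backbone of roughly 529 g/mol under these assumptions.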


The linker can comprise one or more amino acids. The linker can comprise a peptide. The linker can comprise a -(OCH₂CH₂)z’-, wherein z’ is an integer from 1 to 23, and a peptide. The peptide can comprise from 2 to 10 amino acids. The linker can further comprise a functional group (FG) capable of reacting through click chemistry. FG can be an azide or alkyne, and a triazole is formed when the cargo is conjugated to the linker.

The linker can comprise (i) a β-alanine residue and a lysine residue; (ii) -(J-R¹)z”-; or (iii) a combination thereof. Each R¹ can independently be alkylene, alkenylene, alkynylene, carbocyclyl, or heterocyclyl, each J is independently C, NR³, -NR³C(O)-, S, or O, wherein R³ is H, alkyl, alkenyl, alkynyl, carbocyclyl, or heterocyclyl, each of which is optionally substituted, and z” can be an integer from 1 to 50. Each R¹ can be alkylene and each J can be O.

The linker can comprise (i) residues of β-alanine, glycine, lysine, 4-aminobutyric acid, 5-aminopentanoic acid, 6-aminohexanoic acid or combinations thereof; and (ii) -(R¹-J)z”- or -(J-R¹)z”-. Each R¹ can independently be alkylene, alkenylene, alkynylene, carbocyclyl, or heterocyclyl, each J is independently C, NR³, -NR³C(O)-, S, or O, wherein R³ is H, alkyl, alkenyl, alkynyl, carbocyclyl, or heterocyclyl, each of which is optionally substituted, and z” can be an integer from 1 to 50. Each R¹ can be alkylene and each J can be O. The linker can comprise glycine, beta-alanine, 4-aminobutyric acid, 5-aminopentanoic acid, 6-aminohexanoic acid, or a combination thereof.

The linker can be a trivalent linker. The linker can have the structure:

hydrocarbon linker (e.g., NRH-(CH₂)n-COOH), a PEG linker (e.g., NRH-(CH₂O)n-COOH, wherein R is H, methyl or ethyl) or one or more amino acid residues, and Z is independently a protecting group. The linker can also incorporate a cleavage site, including a disulfide [NH₂-(CH₂O)n-S-S-(CH₂O)n-COOH], or caspase-cleavage site (Val-Cit-PABC).

The hydrocarbon can be a residue of glycine or beta-alanine.

The linker can be bivalent and link the cCPP to a cargo. The linker can be bivalent and link the cCPP to an exocyclic peptide (EP).

The linker can be trivalent and link the cCPP to a cargo and to an EP.

The linker can be a bivalent or trivalent C1-C50 alkylene, wherein 1-25 methylene groups are optionally and independently replaced by -N(H)-, -N(C1-C4 alkyl)-, -N(cycloalkyl)-, -O-, -C(O)-, -C(O)O-, -S-, -S(O)-, -S(O)₂-, -S(O)₂N(C1-C4 alkyl)-, -S(O)₂N(cycloalkyl)-, -N(H)C(O)-, -N(C1-C4 alkyl)C(O)-, -N(cycloalkyl)C(O)-, -C(O)N(H)-, -C(O)N(C1-C4 alkyl), -C(O)N(cycloalkyl), aryl, heterocyclyl, heteroaryl, cycloalkyl, or cycloalkenyl. The linker can be a bivalent or trivalent C1-C50 alkylene, wherein 1-25 methylene groups are optionally and independently replaced by -N(H)-, -O-, -C(O)N(H)-, or a combination thereof.

The linker can have the structure:

wherein: each AA is independently an amino acid residue; * is the point of attachment to the AAsc, and AAsc is a side chain of an amino acid residue of the cCPP; x is an integer from 1-10; y is an integer from 1-5; and z is an integer from 1-10. x can be an integer from 1-5. x can be an integer from 1-3. x can be 1. y can be an integer from 2-4. y can be 4. z can be an integer from 1-5. z can be an integer from 1-3. z can be 1. Each AA can independently be selected from glycine, β-alanine, 4-aminobutyric acid, 5-aminopentanoic acid, and 6-aminohexanoic acid.

The cCPP can be attached to the cargo through a linker (“L”). The linker can be conjugated to the cargo through a bonding group (“M”).

The linker can have the structure:

wherein: x is an integer from 1-10; y is an integer from 1-5; z is an integer from 1-10; each AA is independently an amino acid residue; * is the point of attachment to the AAsc, and AAsc is a side chain of an amino acid residue of the cCPP; and M is a bonding group defined herein.

The linker can have the structure:

wherein: x’ is an integer from 1-23; y is an integer from 1-5; z’ is an integer from 1-23; * is the point of attachment to the AAsc, and AAsc is a side chain of an amino acid residue of the cCPP; and M is a bonding group defined herein.

The linker can have the structure:

wherein: x’ is an integer from 1-23; y is an integer from 1-5; and z’ is an integer from 1-23; * is the point of attachment to the AAsc, and AAsc is a side chain of an amino acid residue of the cCPP.

x can be an integer from 1-10, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10, inclusive of all ranges and subranges therebetween.

x’ can be an integer from 1-23, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, or 23, inclusive of all ranges and subranges therebetween. x’ can be an integer from 5-15. x’ can be an integer from 9-13. x’ can be an integer from 1-5. x’ can be 1.

y can be an integer from 1-5, e.g., 1, 2, 3, 4, or 5, inclusive of all ranges and subranges therebetween. y can be an integer from 2-5. y can be an integer from 3-5. y can be 3 or 4. y can be 4 or 5. y can be 3. y can be 4. y can be 5.

z can be an integer from 1-10, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10, inclusive of all ranges and subranges therebetween.

z’ can be an integer from 1-23, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, or 23, inclusive of all ranges and subranges therebetween. z’ can be an integer from 5-15. z’ can be an integer from 9-13. z’ can be 11.
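The broadest index ranges recited above lend themselves to a simple range check. The sketch below is illustrative only: the names `RANGES` and `check_indices` are hypothetical, the primed indices are spelled `x_prime` and `z_prime` because an apostrophe is not valid in a Python identifier, and only the broadest ranges (not the narrower preferred sub-ranges) are encoded.

```python
# Minimal sketch: check linker indices against the broadest recited ranges.
# Keys are hypothetical spellings of x, y, z, x' and z'.

RANGES = {
    "x": (1, 10),
    "y": (1, 5),
    "z": (1, 10),
    "x_prime": (1, 23),
    "z_prime": (1, 23),
}


def check_indices(**indices: int) -> list[str]:
    """Return a list of problems; an empty list means all indices fit."""
    problems = []
    for name, value in indices.items():
        if name not in RANGES:
            problems.append(f"unknown index {name}")
            continue
        low, high = RANGES[name]
        if not (isinstance(value, int) and low <= value <= high):
            problems.append(f"{name}={value} outside {low}-{high}")
    return problems


if __name__ == "__main__":
    print(check_indices(x=1, y=4, z=1))          # all within range
    print(check_indices(x_prime=11, y=6))        # y is out of range
```

Returning a list of messages, rather than raising on the first failure, makes it easy to report every out-of-range index of a candidate linker at once.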


As discussed above, the linker or M (wherein M is part of the linker) can be covalently bound to the cargo at any suitable location on the cargo. The linker or M (wherein M is part of the linker) can be covalently bound to the 3' end of an oligonucleotide cargo or the 5' end of an oligonucleotide cargo. The linker or M (wherein M is part of the linker) can be covalently bound to the N-terminus or the C-terminus of a peptide cargo. The linker or M (wherein M is part of the linker) can be covalently bound to the backbone of an oligonucleotide or a peptide cargo.

The linker can be bound to the side chain of aspartic acid, glutamic acid, glutamine, asparagine, or lysine, or a modified side chain of glutamine or asparagine (e.g., a reduced side chain having an amino group), on the cCPP. The linker can be bound to the side chain of lysine on the cCPP.

The linker can be bound to the side chain of aspartic acid, glutamic acid, glutamine, asparagine, or lysine, or a modified side chain of glutamine or asparagine (e.g., a reduced side chain having an amino group), on a peptide cargo. The linker can be bound to the side chain of lysine on the peptide cargo.

The linker can have a structure:

wherein

M is a group that conjugates L to a cargo, for example, an oligonucleotide;

AA_S is a side chain or terminus of an amino acid on the cCPP;

each AA_X is independently an amino acid residue;

o is an integer from 0 to 10; and p is an integer from 0 to 5.

The linker can have a structure:

wherein

M is a group that conjugates L to a cargo, for example, an oligonucleotide;

AA_S is a side chain or terminus of an amino acid on the cCPP;

each AA_X is independently an amino acid residue; o is an integer from 0 to 10; and p is an integer from 0 to 5.

M can comprise an alkylene, alkenylene, alkynylene, carbocyclyl, or heterocyclyl, each of which is optionally substituted. M can be selected from:

wherein R is alkyl, alkenyl, alkynyl, carbocyclyl, or heterocyclyl.

M can be selected from:

wherein: R¹⁰ can be

and a is 0 to 10. M can be

M can be a heterobifunctional crosslinker, e.g.,

which is disclosed in Williams et al. Curr. Protoc. Nucleic Acid Chem. 2010, 42, 4.41.1-4.41.20, incorporated herein by reference in its entirety.

AA_S can be a side chain or terminus of an amino acid on the cCPP. Non-limiting examples of AA_S include aspartic acid, glutamic acid, glutamine, asparagine, or lysine, or a modified side chain of glutamine or asparagine (e.g., a reduced side chain having an amino group). AA_S can be an AAsc as defined herein.

Each AA_X is independently a natural or non-natural amino acid. One or more AA_X can be a natural amino acid. One or more AA_X can be a non-natural amino acid. One or more AA_X can be a β-amino acid. The β-amino acid can be β-alanine.

o can be an integer from 0 to 10, e.g., 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, and 10. o can be 0, 1, 2, or 3. o can be 0. o can be 1. o can be 2. o can be 3.

p can be 0 to 5, e.g., 0, 1, 2, 3, 4, or 5. p can be 0. p can be 1. p can be 2. p can be 3. p can be 4. p can be 5.

The linker can have the structure:

wherein M, AA_S, each -(R¹-J-R²)z”-, o and z” are defined herein; r can be 0 or 1. r can be 0. r can be 1.

The linker can have the structure:

wherein each of M, AA_S, o, p, q, r and z” can be as defined herein.

z” can be an integer from 1 to 50, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, and 50, inclusive of all ranges and values therebetween. z” can be an integer from 5-20. z” can be an integer from 10-15.

The linker can have the structure:

wherein:

M, AA_S and o are as defined herein.

Other non-limiting examples of suitable linkers include:

wherein M and AA_S are as defined herein.

Provided herein is a compound comprising a cCPP and an AC that is complementary to a target in a pre-mRNA sequence, further comprising L, wherein the linker is conjugated to the AC through a bonding group (M), wherein M is:

Provided herein is a compound comprising a cCPP and a cargo that comprises an antisense compound (AC), for example, an antisense oligonucleotide, that is complementary to a target in a pre-mRNA sequence, wherein the compound further comprises L, wherein the linker is conjugated to the AC through a bonding group (M), wherein M is selected from:

wherein t’ is 0 to 10, wherein each R is independently an alkyl, alkenyl, alkynyl, carbocyclyl, or heterocyclyl, wherein R¹ is [structure not reproduced], and t’ is 2.

The linker can have the structure:

wherein AA_S is as defined herein, and m’ is 0-10.

The linker can be of the formula:

wherein “base” is a nucleobase at the 3’ end of a cargo phosphorodiamidate morpholino oligomer.

The linker can be of the formula:

wherein “base” corresponds to a nucleobase at the 3’ end of a cargo phosphorodiamidate morpholino oligomer.

The linker can be of the formula:

wherein “base” is a nucleobase at the 3’ end of a cargo phosphorodiamidate morpholino oligomer.

The linker can be of the formula:

wherein “base” is a nucleobase at the 3’ end of a cargo phosphorodiamidate morpholino oligomer.

The linker can be covalently bound to a cargo at any suitable location on the cargo. The linker is covalently bound to the 3' end of an oligonucleotide cargo or the 5' end of an oligonucleotide cargo. The linker can be covalently bound to the backbone of a cargo.


cCPP-linker conjugates

The cCPP can be conjugated to a linker defined herein. The linker can be conjugated to an AAsc of the cCPP as defined herein.

The linker can comprise a -(OCH₂CH₂)z’- subunit (e.g., as a spacer), wherein z’ is an integer from 1 to 23, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22 or 23.

-(OCH₂CH₂)z’- is also referred to as PEG. The cCPP-linker conjugate can have a structure selected from Table 7:

Table 7: cCPP-linker conjugates and SEQ ID NOs

cyclo(FfΦ-4gp-r-4gp-rQ)-PEG₄-K-NH₂	cyclo(SEQ ID NO:118)-PEG₄-K-NH₂
cyclo(FfΦ-Cit-r-Cit-rQ)-PEG₄-K-NH₂	cyclo(SEQ ID NO:119)-PEG₄-K-NH₂
cyclo(FfΦ-Pia-r-Pia-rQ)-PEG₄-K-NH₂	cyclo(SEQ ID NO:120)-PEG₄-K-NH₂
cyclo(FfΦ-Dml-r-Dml-rQ)-PEG₄-K-NH₂	cyclo(SEQ ID NO:121)-PEG₄-K-NH₂
cyclo(FfΦ-Cit-r-Cit-rQ)-PEG₁₂-OH	cyclo(SEQ ID NO:122)-PEG₁₂-OH
cyclo(fΦR-Cit-R-Cit-Q)-PEG₁₂-OH	cyclo(SEQ ID NO:123)-PEG₁₂-OH

The linker can comprise a -(OCH₂CH₂)z’- subunit, wherein z’ is an integer from 1 to 23, and a peptide subunit. The peptide subunit can comprise from 2 to 10 amino acids. The cCPP-linker conjugate can have a structure selected from Table 8:

Table 8: cCPP-linker conjugates and SEQ ID NOs

Ac-PKKKRKV-Lys(cyclo[FfΦ-R-r-Cit-rQ])-PEG₁₂-K(N₃)-NH₂	Ac-SEQ ID NO:42-Lys(cyclo[SEQ ID NO:127])-PEG₁₂-K(N₃)-NH₂
Ac-PKKKRKV-Lys(cyclo[FfΦ-Cit-r-R-rQ])-PEG₁₂-K(N₃)-NH₂	Ac-SEQ ID NO:42-Lys(cyclo[SEQ ID NO:128])-PEG₁₂-K(N₃)-NH₂
Ac-PKKKRKV-K(cyclo(FfΦR-cit-R-cit-Q))-PEG₁₂-K(N₃)-NH₂	Ac-SEQ ID NO:42-K(cyclo(SEQ ID NO:129))-PEG₁₂-K(N₃)-NH₂
Ac-PKKKRKV-PEG₂-Lys(cyclo[FfΦ-Cit-r-Cit-rQ])-B-k(N₃)-NH₂	Ac-SEQ ID NO:42-PEG₂-Lys(cyclo[SEQ ID NO:130])-B-k(N₃)-NH₂
Ac-PKKKRKV-PEG₂-Lys(cyclo[FfΦ-Cit-r-Cit-rQ])-PEG₂-k(N₃)-NH₂	Ac-SEQ ID NO:42-PEG₂-Lys(cyclo[SEQ ID NO:130])-PEG₂-k(N₃)-NH₂
Ac-PKKKRKV-PEG₂-Lys(cyclo[FfΦ-Cit-r-Cit-rQ])-PEG₄-k(N₃)-NH₂	Ac-SEQ ID NO:42-PEG₂-Lys(cyclo[SEQ ID NO:130])-PEG₄-k(N₃)-NH₂
Ac-PKKKRKV-Lys(cyclo[FfΦ-Cit-r-Cit-rQ])-PEG₁₂-k(N₃)-NH₂	Ac-SEQ ID NO:42-Lys(cyclo[SEQ ID NO:130])-PEG₁₂-k(N₃)-NH₂
Ac-pkkkrkv-PEG₂-Lys(cyclo[FfΦ-Cit-r-Cit-rQ])-PEG₁₂-k(N₃)-NH₂	Ac-SEQ ID NO:131-PEG₂-Lys(cyclo[SEQ ID NO:130])-PEG₁₂-k(N₃)-NH₂
Ac-rrv-PEG₂-Lys(cyclo[FfΦ-Cit-r-Cit-rQ])-PEG₁₂-OH	Ac-rrv-PEG₂-Lys(cyclo[SEQ ID NO:130])-PEG₁₂-OH
Ac-PKKKRKV-PEG₂-Lys(cyclo[FfΦ-Cit-r-Cit-rQ])-PEG₁₂-k(N₃)-NH₂	Ac-SEQ ID NO:42-PEG₂-Lys(cyclo[SEQ ID NO:130])-PEG₁₂-k(N₃)-NH₂
Ac-PKKK-Cit-KV-PEG₂-Lys(cyclo[FfΦ-Cit-r-Cit-r-Q])-PEG₁₂-k(N₃)-NH₂	Ac-SEQ ID NO:126-PEG₂-Lys(cyclo[SEQ ID NO:130])-PEG₁₂-k(N₃)-NH₂
Ac-PKKKRKV-PEG₂-Lys(cyclo[FfΦ-Cit-r-Cit-rQ])-PEG₁₂-K(N₃)-NH₂	Ac-SEQ ID NO:42-PEG₂-Lys(cyclo[SEQ ID NO:130])-PEG₁₂-K(N₃)-NH₂
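The construct names in Tables 7 and 8 follow a regular pattern (optional exocyclic peptide, PEG spacers, a cyclo[...] or cyclo(...) ring, and an optional azide-bearing lysine), so their parts can be pulled out with regular expressions. The sketch below is a hypothetical helper, not part of the disclosure: the function name `summarize`, the returned keys, and the decision to treat "miniPEG" the same as "PEG" are assumptions made for illustration.

```python
import re

# Minimal sketch: extract the ring sequence, PEG lengths and azide flag
# from a construct name such as those in Tables 7 and 8.

# Map Unicode subscript digits to ASCII so "PEG₁₂" and "PEG12" both parse.
SUBSCRIPTS = str.maketrans("₀₁₂₃₄₅₆₇₈₉", "0123456789")
CYCLO = re.compile(r"cyclo[\[(]([^\])]+)[\])]")  # cyclo[...] or cyclo(...)
PEG = re.compile(r"PEG(\d+)")                    # also matches miniPEG2


def summarize(construct: str) -> dict:
    """Return the ring sequence, PEG unit counts and azide flag."""
    text = construct.translate(SUBSCRIPTS)
    ring = CYCLO.search(text)
    return {
        "cyclic_sequence": ring.group(1) if ring else None,
        "peg_units": [int(n) for n in PEG.findall(text)],
        "has_azide": "(N3)" in text,
    }


if __name__ == "__main__":
    name = "Ac-PKKKRKV-PEG₂-Lys(cyclo[FfΦ-Cit-r-Cit-rQ])-PEG₁₂-k(N₃)-NH₂"
    print(summarize(name))
```

A summary of this kind can help spot transcription inconsistencies, for example two rows that differ only in the case of the terminal lysine code.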

EEVs comprising a cyclic cell penetrating peptide (cCPP), linker and exocyclic peptide (EP) are provided. An EEV can comprise the structure of Formula (B):

wherein:

R₁, R₂, and R₃ are each independently H or an aromatic or heteroaromatic side chain of an amino acid;

R₄ and R₇ are independently H or an amino acid side chain;

EP is an exocyclic peptide as defined herein;

each m is independently an integer from 0-3;

n is an integer from 0-2;

x’ is an integer from 1-20;

y is an integer from 1-5;

q is 1-4; and

z’ is an integer from 1-23.

R₁, R₂, R₃, R₄, R₇, EP, m, q, y, x’, z’ are as described herein. n can be 0. n can be 1. n can be 2.

The EEV can comprise the structure of Formula (B-a) or (B-b):

(B-b), or a protonated form thereof, wherein EP, R¹, R², R³, R⁴, m and z’ are as defined above in Formula (B).

The EEV can comprise the structure of Formula (B-c):

or a protonated form thereof, wherein EP, R¹, R², R³, R⁴, and m are as defined above in Formula (B); AA is an amino acid as defined herein; M is as defined herein; n is an integer from 0-2; x is an integer from 1-10; y is an integer from 1-5; and z is an integer from 1-10.

The EEV can have the structure of Formula (B-1), (B-2), (B-3), or (B-4):

EP is as defined above in Formula (B).

The EEV can comprise Formula (B) and can have the structure: Ac-PKKKRKV-AEEA-K(cyclo[FGFGRGRQ])-PEG₁₂-OH (Ac-SEQ ID NO:132-K(cyclo[SEQ ID NO:82])-PEG₁₂-OH) or Ac-PKKKRKV-AEEA-K(cyclo[GfFGrGrQ])-PEG₁₂-OH (Ac-SEQ ID NO:133-K(cyclo[SEQ ID NO:83])-PEG₁₂-OH).

The EEV can comprise a cCPP of formula:

The EEV can comprise formula: Ac-PKKKRKV-miniPEG₂-Lys(cyclo(FfFGRGRQ))-PEG₂-K(N₃) (Ac-SEQ ID NO:42-miniPEG₂-Lys(cyclo(SEQ ID NO:81))-PEG₂-K(N₃)).

The EEV can be any of the following: [chemical structure drawings not reproduced]

The EEV can be Ac-P-K(Tfa)-K(Tfa)-K(Tfa)-R-K(Tfa)-V-miniPEG₂-K(cyclo(Ff-Nal-GrGrQ))-PEG₁₂-OH (Ac-SEQ ID NO:134-miniPEG₂-K(cyclo(SEQ ID NO:135))-PEG₁₂-OH).

The EEV can be Ac-P-K-K-K-R-K-V-miniPEG₂-K(cyclo(Ff-Nal-GrGrQ))-PEG₁₂-OH (Ac-SEQ ID NO:42-miniPEG₂-K(cyclo(SEQ ID NO:135))-PEG₁₂-OH).

The EEV can be any of the following: [chemical structure drawings not reproduced]

The EEV can be selected from

Ac-iT-miniPEG₂-Dap(cyc/o[FfO-Cit-r-Cit-rQ])-PEGi₂-OH	Ac-rr-miniPEG2-Dap(cvc/6i[SEQ IDNO:l36])-PEGi2-OH
Ac-frr-PEG2-Dap(cyc/o[FfO-Cit-r-Cit-rQ])-PEGi2-OH	Ac-frr-PEG₂-Dap(cyc/o[SEQ ID NO:136])-PEGi₂-OH
Ac-rfr-PEG₂-Dap(cyc/o[FfO-Cit-r-Cit-rQ])-PEGi₂-OH	Ac-rfr-PEG₂-Dap(cyc7o[SEQ ID NO:l36])-PEGi2-OH
Ac-rbfbr-PEG2-Dap(cpc/o[Ff®-Cit-r-Cit-rQ])-PEGi2-OH	Ac-SEQ ID NO:l37-PEG₂- Dap(cyc/o[SEQ IDNO:l36])- PEG_l2-OH
Ac-m-PEG2-Dap(cyc/o[FfO-Cit-r-Cit-rQ])-PEG_l2-OH	Ac-rrr-PEG₂-Dap(cyc7o[SEQ ID NO:l36])-PEGi2-OH
Ac-rbr-PEG₂-Dap(cK/o[Ff®-Cit-r-Cit-rQ])-PEGi2-OH	Ac-rbr-PEG₂-Dap(cyc/o[SEQ ID NO: l36])-PEGi2-OH
Ac-rbrbr-PEG₂-Dap(cyc7o[Ff<I>-Cit-r-Cit-rQ])-PEGi₂-OH	Ac-SEQ ID NO:l38-PEG₂- Dap(cyc/o[SEQ IDNO:l36])- PEG12-OH
Ac-hh-PEG₂-Dap(cvc/o[FfO-Cit-r-Cit-rQl)-PEG_l2-OH	Ac-hh-PEG₂-Dap(cyc7o[SEQ ID NO:136])-PEGi2-OH
Ac-hbh-PEG₂-Dap(cycto[FfO-Cit-r-Cit-rQ])-PEG\|₂-OH	Ac-hbh-PEG₂-Dap(qyt7o[SEQ ID NO: 136])-PEG\|2-OH

147

Ac-hbhbh-PEG₂-Dap(cyc7o[FfO-Cit-r-Cit-rQ])-PEGi₂-OH	Ac-SEQINNO: 139-PEG₂- Dap(cycZo[SEQ ID NO:l36])- PEG12-OH
Ac-rbhbh-PEG₂-Dap(cyc/o[Ff0-Cit-r-Cit-rQ])-PEGi2-OH	Ac- SEQID NO: !40-PEG₂- Dap(cyc/o[SEQ ID NO: 136])- PEG.2-OH
Ac-hbrbh-PEG₂-Dap(qycZo[FfO-Cit-i-Cit-rQ])-PEGi₂-OH	Ac-SEQ ID NO:l4l-PEG₂- Dap(cyc/o[SEQ ID NO:l36])- PEG12-OH
Ac-iT-Dap(qyi7o[Ff<t»-Cit-r-Cit-rQ])-b-OH	Ac-rr-Dap(cycZo[SEQ ID NO:l36])-b-OH
Ac-fn'-Dap(qyc/o[FfO-Cit-r-Cit-rQ])-b-OH	Ac-frr-Dap(cyc7o[SEQ ID NO: !36])-b-OH
Ac-rfr-Dap(cycZo[FfO-Cit-r-Cit-rQ])-b-OH	Ac-rfr-Dap(çycZo[SEQ ID NO: !36])-b-OH
Ac-rbfbr-Dap(cyc/o[FfO-Cit-r-Cit-rQ])-b-OH	Ac- SEQ ID NO:l37- Dap(qycZo[SEQ ID NO: !36])-b- OH
Ac-m-Dap(qyc/f>[Ff4>-Cit-i-Cit-rQ])-b-OH	Ac-nr-Dap(qycZo[SEQ ID NO: I36])-b-OH
Ac-rbr-Dap(qyc/o[FfftJ-Cit-r-Cit-rQ])-b-OH	Ac-rbr-Dap(cyc7o[SEQ ID NO: l36])-b-OH
Ac-rbrbr-Dap(cyc7o[FI'O-Cit-r-Cit-rQ])-b-OH	Ac- SEQ IDNO:l38- Dap(cyc/o[SEQ ID NO:l36])-b- OH
Ac-hh-Dap(qycZo[FfO-Cit-r-Cit-rQ])-b-OH	Ac-hh-Dap(cyc/o[SEQ ID NO: !36])-b-OH
Ac-hbh-Dap(cycZo[Ff®-Cit-i‘-Cit-rQ])-b-OH	Ac-hbh-Dap(cyc7o[SEQ ID NO: !36])-b-OH
Ac-hbhbh-Dap(^c/o[FfO-Cit-r-Cit-rQ])-b-OH	Ac- SEQ IN NO: 139- Dap(cycZo[SEQ ID NO:l36])-bOH
Ac-rbhbh-Dap(cycZc[Ff<t>-Cit-r-Cit-rQ])-b-OH	Ac-SEQ IDNO: 140- Dap(cycZo[SEQ ID NO: !36])-bOH
Ac-hbrbh-Dap(cycZo[FfO-Cit-r-Cit-rQ])-b-OH	Ac-SEQ IDNO:l4l- Dap(cycZo[SEQ ID NO: !36])-bOH
Ac-KKKK-miniPEG₂-Lys(cyc/o[FfOGrGrQ])-miniPEG₂- K(N₃)-NH₂	Ac- SEQ ID NO:7-miniPEG₂Lys(cKZo[SEQ ID NO:Z80])mimPEG₂-K(N₃)-NH₂
Ac-KGKK-miniPEG₂-Lys(cycZo[FfOGrGrQ])-miniPEG₂- K(N₃)-NH₂	Ac- SEQ ID NO:l3-miniPEG₂Lys(cycfo[SEQ ID NO:Z80])miniPEG₂-K(N₃)-NH₂

148

Ac-KKGK-mîniPEG₂-Lys(cyc/ofFf0GrGrQ])-miniPEG₂- K(N₃)-NH₂	Ac- SEQ ID NO:l4-miniPEG₂- Lys(w/o[SEQ ID NO:Z80])miniPEG₂-K(N₃)-NH₂
Ac-KKK-miniPEG₂-Lys(qyc7o[FfOGrGrQ])-miniPEG₂- K(N₃)-NH₂	Ac-KKK-miniPEG₂- Lys(cyc/o[SEQ ID NO:80])miniPEG₂-K(N₃)-NH₂
Ac-KK-miniPEG2-Lys(cyc/o[Ff0GrGrQ])-miniPEG₂- K(N₃)-NH₂	Ac-KK.-miniPEG₂-Lys(cTc/o[SEQ ID NO:80])-miniPEG₂-k(N₃)-NH₂
Ac-KGK-miniPEG2-Lys(cyc7o[FfOGrGrQ])-miniPEG₂- K(N₃)-NH₂	Ac-KGK-miniPEG₂- Lys(qvcM[SEQ ID NO:80])miniPEG₂-K(N₃)-NH₂
Ac-KBK-miniPEG₂-Lys(cjc/p[FfÎ>GrGrQ])-miniPEG2- K(N₃)-NH₂	Ac-KBK-miniPEG₂- Lys(cyc/o[SEQ ID NO:80])miniPEG₂-K(N₃)-NH₂
Ac-KBKBK-miniPEG2-Lys(cyc/o[Ff®GrGrQ])-miniPEG2- K(N₃)-NH₂	Ac- SEQ ID NO:24-miniPEG₂Lys(qvcfo[SEQ ID NO:80])miniPEG₂-K(N₃)-NH₂
Ac-KR-miniPEG₂-Lys(cyc/o[FfOGrGrQ])-miniPEG₂- K(N₃)-NH₂	Ac-KR-miniPEG₂-Lys(cpc/o[SEQ ID NO:80])-miniPEG₂-k(N₃)-NH₂
Ac-KBR-miniPEG₂-Lys(t7c/o[FfOGrGrQ])-miniPEG₂- K(N₃)-NH₂	Ac-KBR-miniPEG₂- Lys(cyc/o[SEQ ID NO;80])mimPEG₂-K(N₃)-NH₂
Ac-PKKKRKV-ininiPEG₂-Lys(qvc/o[FfOGrGrQ])miniPEG₂-K(N₃)-NH₂	Ac- SEQ ID NO:42-miniPEG₂- Lys(cv<?/o[SEQ ID NO:80])mimPEG₂-K(N₃)-NH₂
Ac-PKKKRKV-miniPEG2-Lys(cyi/p[Ff<DGrGrQ])miniPEG₂-K(N₃)-NH₂	Ac- SEQ ID NO:42-miniPEG₂Lys(cvc/o[SEQ ID NO:80])miniPEG₂-K(N₃)-NH₂
Ac-PGKKRKV-miniPEG2-Lys(m/o[FfOGrGrQ])miniPEG₂-K(N₃)-NH₂	Ac- SEQ ID NO:43-miniPEG₂- Lys(qyc/o[SEQ ID NO:80])miniPEG₂-K(N₃)-NH₂
Ac-PKGKRKV-miniPEG₂-Lys(cyc/o[FfOGrGrQ])miniPEG₂-K(N₃)-NH₂	Ac- SEQ ID NO:44-miniPEG₂- Lys(cyc/o[SEQ ID NQ:80])- miniPEG₂-K(N₃)-NH₂
Ac-PKKGRKV-miniPEG₂-Lys(cyc7o[FfOGrGrQ])ininiPEG₂-K(N₃)-NH₂	Ac- SEQ ID NO:45-miniPEG₂- Lys(cyc/o[SEQ ID NO;8Û])miniPEG₂-K(N₃)-NH₂
Ac-PKKKGKV-miniPEG₂-Lys(cycM[FfOGrGrQ])miniPEG₂-K(N₃)-NH₂	Ac- SEQ ID NO:46-minîPEG₂Lys(cvcM[SEQ ID NQ:80])miniPEG₂-K(N₃)-NH₂
Ac-PKKKRGV-miniPEG₂-Lys(cyc/o[FfOGrGrQ])miniPEG₂-K(N₃)-NH₂	Ac- SEQ ID NO:47-miniPEG₂- Lys(cyc/o[SEQ ID NO:80])- miniPEG₂-K(N₃)-NH2
Ac-PKKKRKG-miniPEG₂-Lys(cycfo[FfOGrGrQ])miniPEG₂-K(N₃)-NH₂	Ac- SEQ ID NO:48-miniPEG₂- Lys(cyc/o[SEQ ID NO:80])miniPEG₂-K(N₃)-NH₂

149

Ac-KKKRK-miniPEG₂-Lys(cyc7o[Ff<bGrGrQ])miniPEG₂-K(N₃)-NH₂	Ac- SEQ ID NO:l9-miniPEG₂Lys(cvc/o[SEQ ID NO:80])miniPEG2-K(N₃)-NH₂
Ac-KKRK-miniPEG₂-Lys(cyc/i?[Ff®GrGrQ])-miniPEG₂- K(N₃)-NH₂ and	Ac- SEQ ID NO:8-miniPEG₂Lys(cyc7o[SEQ ID NO:80])miniPEG₂-K(N3)-NH₂ and
Ac-K.RK.-miniPEG₂-Lys(cyc/o[Ff<DGrGrQ])-miniPEG₂K(N₃)-NH₂.	Ac-KRK-miniPEG2- Lys(çyc/o[SEQ ID NO:80])miniPEG2-K(N₃)-NH₂.

The EEV can be selected from:

Ac-PKKKRKV-Lys(cyclo[FfΦGrGrQ])-PEG₁₂-K(N₃)-NH₂ (Ac-SEQ ID NO:42-Lys(cyclo[SEQ ID NO:80])-PEG₁₂-K(N₃)-NH₂)

Ac-PKKKRKV-miniPEG₂-Lys(cyclo[FfΦGrGrQ])-miniPEG₂-K(N₃)-NH₂ (Ac-SEQ ID NO:42-miniPEG₂-Lys(cyclo[SEQ ID NO:80])-miniPEG₂-K(N₃)-NH₂)

Ac-PKKKRKV-miniPEG₂-Lys(cyclo[FGFGRGRQ])-miniPEG₂-K(N₃)-NH₂ (Ac-SEQ ID NO:42-miniPEG₂-Lys(cyclo[SEQ ID NO:82])-miniPEG₂-K(N₃)-NH₂)

Ac-KR-PEG₂-K(cyclo[FGFGRGRQ])-PEG₂-K(N₃)-NH₂ (Ac-KR-PEG₂-K(cyclo[SEQ ID NO:82])-PEG₂-K(N₃)-NH₂)

Ac-PKKKGKV-PEG₂-K(cyclo[FGFGRGRQ])-PEG₂-K(N₃)-NH₂ (Ac-SEQ ID NO:46-PEG₂-K(cyclo[SEQ ID NO:82])-PEG₂-K(N₃)-NH₂)

Ac-PKKKRKG-PEG₂-K(cyclo[FGFGRGRQ])-PEG₂-K(N₃)-NH₂ (Ac-SEQ ID NO:48-PEG₂-K(cyclo[SEQ ID NO:82])-PEG₂-K(N₃)-NH₂)

Ac-KKKRK-PEG₂-K(cyclo[FGFGRGRQ])-PEG₂-K(N₃)-NH₂ (Ac-SEQ ID NO:19-PEG₂-K(cyclo[SEQ ID NO:82])-PEG₂-K(N₃)-NH₂)

Ac-PKKKRKV-miniPEG₂-Lys(cyclo[FFΦGRGRQ])-miniPEG₂-K(N₃)-NH₂ (Ac-SEQ ID NO:42-miniPEG₂-Lys(cyclo[SEQ ID NO:80])-miniPEG₂-K(N₃)-NH₂)

Ac-PKKKRKV-miniPEG₂-Lys(cyclo[phFfΦGrGrQ])-miniPEG₂-K(N₃)-NH₂ (Ac-SEQ ID NO:42-miniPEG₂-Lys(cyclo[SEQ ID NO:142])-miniPEG₂-K(N₃)-NH₂)

Ac-PKKKRKV-miniPEG₂-Lys(cyclo[FfΦSrSrQ])-miniPEG₂-K(N₃)-NH₂ (Ac-SEQ ID NO:42-miniPEG₂-Lys(cyclo[SEQ ID NO:143])-miniPEG₂-K(N₃)-NH₂).

The EEV can be selected from:

Ac-PKKKRKV-miniPEG₂-Lys(cyclo[GfFGrGrQ])-PEG₁₂-OH (Ac-SEQ ID NO:42-miniPEG₂-Lys(cyclo[SEQ ID NO:133])-PEG₁₂-OH)

Ac-PKKKRKV-miniPEG₂-Lys(cyclo[FGFKRKRQ])-PEG₁₂-OH (Ac-SEQ ID NO:42-miniPEG₂-Lys(cyclo[SEQ ID NO:144])-PEG₁₂-OH)

Ac-PKKKRKV-miniPEG₂-Lys(cyclo[FGFRGRGQ])-PEG₁₂-OH (Ac-SEQ ID NO:42-miniPEG₂-Lys(cyclo[SEQ ID NO:145])-PEG₁₂-OH)

Ac-PKKKRKV-miniPEG₂-Lys(cyclo[FGFGRGRGRQ])-PEG₁₂-OH (Ac-SEQ ID NO:42-miniPEG₂-Lys(cyclo[SEQ ID NO:146])-PEG₁₂-OH)

Ac-PKKKRKV-miniPEG₂-Lys(cyclo[FGFGRrRQ])-PEG₁₂-OH (Ac-SEQ ID NO:42-miniPEG₂-Lys(cyclo[SEQ ID NO:147])-PEG₁₂-OH)

Ac-PKKKRKV-miniPEG₂-Lys(cyclo[FGFGRRRQ])-PEG₁₂-OH (Ac-SEQ ID NO:42-miniPEG₂-Lys(cyclo[SEQ ID NO:84])-PEG₁₂-OH) and

Ac-PKKKRKV-miniPEG₂-Lys(cyclo[FGFRRRRQ])-PEG₁₂-OH (Ac-SEQ ID NO:42-miniPEG₂-Lys(cyclo[SEQ ID NO:85])-PEG₁₂-OH).

The EEV can be selected from:

Ac-K-K-K-R-K-G-miniPEG₂-K(cyclo[FGFGRGRQ])-PEG₁₂-OH (Ac-SEQ ID NO:148-miniPEG₂-K(cyclo[SEQ ID NO:82])-PEG₁₂-OH)

Ac-K-K-K-R-K-miniPEG₂-K(cyclo[FGFGRGRQ])-PEG₁₂-OH (Ac-SEQ ID NO:19-miniPEG₂-K(cyclo[SEQ ID NO:82])-PEG₁₂-OH)

Ac-K-K-R-K-K-PEG₄-K(cyclo[FGFGRGRQ])-PEG₁₂-OH (Ac-SEQ ID NO:22-PEG₄-K(cyclo[SEQ ID NO:82])-PEG₁₂-OH)

Ac-K-R-K-K-K-PEG₄-K(cyclo[FGFGRGRQ])-PEG₁₂-OH (Ac-SEQ ID NO:21-PEG₄-K(cyclo[SEQ ID NO:82])-PEG₁₂-OH)

Ac-K-K-K-K-R-PEG₄-K(cyclo[FGFGRGRQ])-PEG₁₂-OH (Ac-SEQ ID NO:23-PEG₄-K(cyclo[SEQ ID NO:82])-PEG₁₂-OH)

Ac-R-K-K-K-K-PEG₄-K(cyclo[FGFGRGRQ])-PEG₁₂-OH (Ac-SEQ ID NO:20-PEG₄-K(cyclo[SEQ ID NO:82])-PEG₁₂-OH) and

Ac-K-K-K-R-K-PEG₄-K(cyclo[FGFGRGRQ])-PEG₁₂-OH (Ac-SEQ ID NO:19-PEG₄-K(cyclo[SEQ ID NO:82])-PEG₁₂-OH).

The EEV can be selected from:

Ac-PKKKRKV-PEG₂-K(cyclo[FGFGRGRQ])-PEG₂-K(N₃)-NH₂ (Ac-SEQ ID NO:42-PEG₂-K(cyclo[SEQ ID NO:82])-PEG₂-K(N₃)-NH₂)

Ac-PKKKRKV-PEG₂-K(cyclo[FGFGRGRQ])-PEG₁₂-OH (Ac-SEQ ID NO:42-PEG₂-K(cyclo[SEQ ID NO:82])-PEG₁₂-OH)

Ac-PKKKRKV-PEG₂-K(cyclo[GfFGrGrQ])-PEG₂-K(N₃)-NH₂ (Ac-SEQ ID NO:42-PEG₂-K(cyclo[SEQ ID NO:133])-PEG₂-K(N₃)-NH₂) and

Ac-PKKKRKV-PEG₂-K(cyclo[GfFGrGrQ])-PEG₁₂-OH (Ac-SEQ ID NO:42-PEG₂-K(cyclo[SEQ ID NO:133])-PEG₁₂-OH).

The cargo can be an AC and the EEV can be selected from:

Ac-PKKKRKV-PEG₂-K(cyclo[FfΦGrGrQ])-PEG₁₂-OH (Ac-SEQ ID NO:42-PEG₂-K(cyclo[SEQ ID NO:80])-PEG₁₂-OH)

Ac-PKKKRKV-PEG₂-K(cyclo[FfΦCit-r-Cit-rQ])-PEG₁₂-OH (Ac-SEQ ID NO:42-PEG₂-K(cyclo[SEQ ID NO:79])-PEG₁₂-OH)

Ac-PKKKRKV-PEG₂-K(cyclo[FfFGRGRQ])-PEG₁₂-OH (Ac-SEQ ID NO:42-PEG₂-K(cyclo[SEQ ID NO:81])-PEG₁₂-OH)

Ac-PKKKRKV-PEG₂-K(cyclo[FGFGRGRQ])-PEG₁₂-OH (Ac-SEQ ID NO:42-PEG₂-K(cyclo[SEQ ID NO:82])-PEG₁₂-OH)

Ac-PKKKRKV-PEG₂-K(cyclo[GfFGrGrQ])-PEG₁₂-OH (Ac-SEQ ID NO:42-PEG₂-K(cyclo[SEQ ID NO:133])-PEG₁₂-OH)

Ac-PKKKRKV-PEG₂-K(cyclo[FGFGRRRQ])-PEG₁₂-OH (Ac-SEQ ID NO:42-PEG₂-K(cyclo[SEQ ID NO:84])-PEG₁₂-OH)

Ac-PKKKRKV-PEG₂-K(cyclo[FGFRRRRQ])-PEG₁₂-OH (Ac-SEQ ID NO:42-PEG₂-K(cyclo[SEQ ID NO:85])-PEG₁₂-OH)

Ac-rr-PEG₂-K(cyclo[FfΦGrGrQ])-PEG₁₂-OH (Ac-rr-PEG₂-K(cyclo[SEQ ID NO:80])-PEG₁₂-OH)

Ac-rr-PEG₂-K(cyclo[FfΦCit-r-Cit-rQ])-PEG₁₂-OH (Ac-rr-PEG₂-K(cyclo[SEQ ID NO:79])-PEG₁₂-OH)

Ac-rr-PEG₂-K(cyclo[FfFGRGRQ])-PEG₁₂-OH (Ac-rr-PEG₂-K(cyclo[SEQ ID NO:81])-PEG₁₂-OH)

Ac-rr-PEG₂-K(cyclo[FGFGRGRQ])-PEG₁₂-OH (Ac-rr-PEG₂-K(cyclo[SEQ ID NO:82])-PEG₁₂-OH)

Ac-rr-PEG₂-K(cyclo[GfFGrGrQ])-PEG₁₂-OH (Ac-rr-PEG₂-K(cyclo[SEQ ID NO:133])-PEG₁₂-OH)

Ac-rr-PEG₂-K(cyclo[FGFGRRRQ])-PEG₁₂-OH (Ac-rr-PEG₂-K(cyclo[SEQ ID NO:84])-PEG₁₂-OH)

Ac-rr-PEG₂-K(cyclo[FGFRRRRQ])-PEG₁₂-OH (Ac-rr-PEG₂-K(cyclo[SEQ ID NO:85])-PEG₁₂-OH)

Ac-rrr-PEG₂-K(cyclo[FfΦGrGrQ])-PEG₁₂-OH (Ac-rrr-PEG₂-K(cyclo[SEQ ID NO:80])-PEG₁₂-OH)

Ac-rrr-PEG₂-K(cyclo[FfΦCit-r-Cit-rQ])-PEG₁₂-OH (Ac-rrr-PEG₂-K(cyclo[SEQ ID NO:79])-PEG₁₂-OH)

Ac-rrr-PEG₂-K(cyclo[FfFGRGRQ])-PEG₁₂-OH (Ac-rrr-PEG₂-K(cyclo[SEQ ID NO:81])-PEG₁₂-OH)

Ac-rrr-PEG₂-K(cyclo[FGFGRGRQ])-PEG₁₂-OH (Ac-rrr-PEG₂-K(cyclo[SEQ ID NO:82])-PEG₁₂-OH)

Ac-rrr-PEG₂-K(cyclo[GfFGrGrQ])-PEG₁₂-OH (Ac-rrr-PEG₂-K(cyclo[SEQ ID NO:133])-PEG₁₂-OH)

Ac-rrr-PEG₂-K(cyclo[FGFGRRRQ])-PEG₁₂-OH (Ac-rrr-PEG₂-K(cyclo[SEQ ID NO:84])-PEG₁₂-OH)

Ac-rrr-PEG₂-K(cyclo[FGFRRRRQ])-PEG₁₂-OH (Ac-rrr-PEG₂-K(cyclo[SEQ ID NO:85])-PEG₁₂-OH)

Ac-rhr-PEG₂-K(cyclo[FfΦGrGrQ])-PEG₁₂-OH (Ac-rhr-PEG₂-K(cyclo[SEQ ID NO:80])-PEG₁₂-OH)

Ac-rhr-PEG₂-K(cyclo[FfΦCit-r-Cit-rQ])-PEG₁₂-OH (Ac-rhr-PEG₂-K(cyclo[SEQ ID NO:79])-PEG₁₂-OH)

Ac-rhr-PEG₂-K(cyclo[FfFGRGRQ])-PEG₁₂-OH (Ac-rhr-PEG₂-K(cyclo[SEQ ID NO:81])-PEG₁₂-OH)

Ac-rhr-PEG₂-K(cyclo[FGFGRGRQ])-PEG₁₂-OH (Ac-rhr-PEG₂-K(cyclo[SEQ ID NO:82])-PEG₁₂-OH)

Ac-rhr-PEG₂-K(cyclo[GfFGrGrQ])-PEG₁₂-OH (Ac-rhr-PEG₂-K(cyclo[SEQ ID NO:133])-PEG₁₂-OH)

Ac-rhr-PEG₂-K(cyclo[FGFGRRRQ])-PEG₁₂-OH (Ac-rhr-PEG₂-K(cyclo[SEQ ID NO:84])-PEG₁₂-OH)

Ac-rhr-PEG₂-K(cyclo[FGFRRRRQ])-PEG₁₂-OH (Ac-rhr-PEG₂-K(cyclo[SEQ ID NO:85])-PEG₁₂-OH)

Ac-rbr-PEG₂-K(cyclo[FfΦGrGrQ])-PEG₁₂-OH (Ac-rbr-PEG₂-K(cyclo[SEQ ID NO:80])-PEG₁₂-OH)

Ac-rbr-PEG₂-K(cyclo[FfΦCit-r-Cit-rQ])-PEG₁₂-OH (Ac-rbr-PEG₂-K(cyclo[SEQ ID NO:79])-PEG₁₂-OH)

Ac-rbr-PEG₂-K(cyclo[FfFGRGRQ])-PEG₁₂-OH (Ac-rbr-PEG₂-K(cyclo[SEQ ID NO:81])-PEG₁₂-OH)

Ac-rbr-PEG₂-K(cyclo[FGFGRGRQ])-PEG₁₂-OH (Ac-rbr-PEG₂-K(cyclo[SEQ ID NO:82])-PEG₁₂-OH)

Ac-rbr-PEG₂-K(cyclo[GfFGrGrQ])-PEG₁₂-OH (Ac-rbr-PEG₂-K(cyclo[SEQ ID NO:133])-PEG₁₂-OH)

Ac-rbr-PEG₂-K(cyclo[FGFGRRRQ])-PEG₁₂-OH (Ac-rbr-PEG₂-K(cyclo[SEQ ID NO:84])-PEG₁₂-OH)

Ac-rbr-PEG₂-K(cyclo[FGFRRRRQ])-PEG₁₂-OH (Ac-rbr-PEG₂-K(cyclo[SEQ ID NO:85])-PEG₁₂-OH)

Ac-rbrbr-PEG₂-K(cyclo[FfΦGrGrQ])-PEG₁₂-OH (Ac-SEQ ID NO:138-PEG₂-K(cyclo[SEQ ID NO:80])-PEG₁₂-OH)

Ac-rbrbr-PEG₂-K(cyclo[FfΦCit-r-Cit-rQ])-PEG₁₂-OH (Ac-SEQ ID NO:138-PEG₂-K(cyclo[SEQ ID NO:79])-PEG₁₂-OH)

Ac-rbrbr-PEG₂-K(cyclo[FfFGRGRQ])-PEG₁₂-OH (Ac-SEQ ID NO:138-PEG₂-K(cyclo[SEQ ID NO:81])-PEG₁₂-OH)

Ac-rbrbr-PEG₂-K(cyclo[FGFGRGRQ])-PEG₁₂-OH (Ac-SEQ ID NO:138-PEG₂-K(cyclo[SEQ ID NO:82])-PEG₁₂-OH)

Ac-rbrbr-PEG₂-K(cyclo[GfFGrGrQ])-PEG₁₂-OH (Ac-SEQ ID NO:138-PEG₂-K(cyclo[SEQ ID NO:133])-PEG₁₂-OH)

Ac-rbrbr-PEG₂-K(cyclo[FGFGRRRQ])-PEG₁₂-OH (Ac-SEQ ID NO:138-PEG₂-K(cyclo[SEQ ID NO:84])-PEG₁₂-OH)

Ac-rbrbr-PEG₂-K(cyclo[FGFRRRRQ])-PEG₁₂-OH (Ac-SEQ ID NO:138-PEG₂-K(cyclo[SEQ ID NO:85])-PEG₁₂-OH)

Ac-rbhbr-PEG₂-K(cyclo[FfΦGrGrQ])-PEG₁₂-OH (Ac-SEQ ID NO:149-PEG₂-K(cyclo[SEQ ID NO:80])-PEG₁₂-OH)

Ac-rbhbr-PEG₂-K(cyclo[FfΦCit-r-Cit-rQ])-PEG₁₂-OH (Ac-SEQ ID NO:149-PEG₂-K(cyclo[SEQ ID NO:79])-PEG₁₂-OH)

Ac-rbhbr-PEG₂-K(cyclo[FfFGRGRQ])-PEG₁₂-OH (Ac-SEQ ID NO:149-PEG₂-K(cyclo[SEQ ID NO:81])-PEG₁₂-OH)

Ac-rbhbr-PEG₂-K(cyclo[FGFGRGRQ])-PEG₁₂-OH (Ac-SEQ ID NO:149-PEG₂-K(cyclo[SEQ ID NO:82])-PEG₁₂-OH)

Ac-rbhbr-PEG₂-K(cyclo[GfFGrGrQ])-PEG₁₂-OH (Ac-SEQ ID NO:149-PEG₂-K(cyclo[SEQ ID NO:133])-PEG₁₂-OH)

Ac-rbhbr-PEG₂-K(cyclo[FGFGRRRQ])-PEG₁₂-OH (Ac-SEQ ID NO:149-PEG₂-K(cyclo[SEQ ID NO:84])-PEG₁₂-OH)

Ac-rbhbr-PEG₂-K(cyclo[FGFRRRRQ])-PEG₁₂-OH (Ac-SEQ ID NO:149-PEG₂-K(cyclo[SEQ ID NO:85])-PEG₁₂-OH)

Ac-hbrbh-PEG₂-K(cyclo[FfΦGrGrQ])-PEG₁₂-OH (Ac-SEQ ID NO:141-PEG₂-K(cyclo[SEQ ID NO:80])-PEG₁₂-OH)

Ac-hbrbh-PEG₂-K(cyclo[FfΦCit-r-Cit-rQ])-PEG₁₂-OH (Ac-SEQ ID NO:141-PEG₂-K(cyclo[SEQ ID NO:79])-PEG₁₂-OH)

Ac-hbrbh-PEG₂-K(cyclo[FfFGRGRQ])-PEG₁₂-OH (Ac-SEQ ID NO:141-PEG₂-K(cyclo[SEQ ID NO:81])-PEG₁₂-OH)

Ac-hbrbh-PEG₂-K(cyclo[FGFGRGRQ])-PEG₁₂-OH (Ac-SEQ ID NO:141-PEG₂-K(cyclo[SEQ ID NO:82])-PEG₁₂-OH)

Ac-hbrbh-PEG₂-K(cyclo[GfFGrGrQ])-PEG₁₂-OH (Ac-SEQ ID NO:141-PEG₂-K(cyclo[SEQ ID NO:133])-PEG₁₂-OH)

Ac-hbrbh-PEG₂-K(cyclo[FGFGRRRQ])-PEG₁₂-OH (Ac-SEQ ID NO:141-PEG₂-K(cyclo[SEQ ID NO:84])-PEG₁₂-OH)

Ac-hbrbh-PEG₂-K(cyclo[FGFRRRRQ])-PEG₁₂-OH (Ac-SEQ ID NO:141-PEG₂-K(cyclo[SEQ ID NO:85])-PEG₁₂-OH), wherein b is beta-alanine, and the exocyclic sequence can be D or L stereochemistry.
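The list above pairs each of eight exocyclic segments with the same seven cyclic sequences, so it can be read as a Cartesian product. The following is a minimal sketch of that reading, not a method taken from the disclosure. The names `EXOCYCLIC`, `CYCLIC`, `eev_name`, and `enumerate_eevs` are hypothetical, and the symbol Φ is an assumption standing in for the non-standard residue of SEQ ID NO:80 and SEQ ID NO:79 as rendered here.

```python
from itertools import product
from typing import Dict, List

# Exocyclic segments copied from the list (b is beta-alanine per the text).
EXOCYCLIC: List[str] = [
    "PKKKRKV", "rr", "rrr", "rhr", "rbr", "rbrbr", "rbhbr", "hbrbh",
]

# Cyclic sequences keyed by the SEQ ID NO cited next to them in the list.
# The character "Φ" is an assumed rendering of a residue garbled in the source.
CYCLIC: Dict[int, str] = {
    80: "FfΦGrGrQ",
    79: "FfΦCit-r-Cit-rQ",
    81: "FfFGRGRQ",
    82: "FGFGRGRQ",
    133: "GfFGrGrQ",
    84: "FGFGRRRQ",
    85: "FGFRRRRQ",
}


def eev_name(exocyclic: str, cyclic: str) -> str:
    """Build the shorthand name used in the list for one EEV."""
    return f"Ac-{exocyclic}-PEG₂-K(cyclo[{cyclic}])-PEG₁₂-OH"


def enumerate_eevs() -> List[str]:
    """Return every exocyclic/cyclic pairing in list order."""
    return [eev_name(ep, seq) for ep, seq in product(EXOCYCLIC, CYCLIC.values())]


if __name__ == "__main__":
    for name in enumerate_eevs():
        print(name)
```

Enumerating the pairings this way makes it easy to check a transcribed list for missing or duplicated entries.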

Cargo

The cell penetrating peptide (CPP), such as a cyclic cell penetrating peptide (e.g., cCPP), can be conjugated to a cargo. As used herein, “cargo” is a compound or moiety for which delivery into a cell is desired. The cargo can be conjugated to a terminal carbonyl group of a linker. At least one atom of the cyclic peptide can be replaced by a cargo or at least one lone pair can form a bond to a cargo. The cargo can be conjugated to the cCPP by a linker. The cargo can be conjugated to an AAsc by a linker. At least one atom of the cCPP can be replaced by the cargo or at least one lone pair of the cCPP forms a bond to the cargo. A hydroxyl group on an amino acid side chain of the cCPP can be replaced by a bond to the cargo. A hydroxyl group on a glutamine side chain of the cCPP can be replaced by a bond to the cargo.

In embodiments, the amino acid side chain comprises a chemically reactive group to which the linker or cargo is conjugated. The chemically reactive group can comprise an amine group, a carboxylic acid, an amide, a hydroxyl group, a sulfhydryl group, a guanidinyl group, a phenolic group, a thioether group, an imidazolyl group, or an indolyl group. In embodiments, the amino acid of the cCPP to which the cargo is conjugated comprises lysine, arginine, aspartic acid, glutamic acid, asparagine, glutamine, homoglutamine, serine, threonine, tyrosine, cysteine, arginine, tyrosine, methionine, histidine or tryptophan.

The cargo can comprise one or more detectable moieties, one or more therapeutic moieties (TMs), one or more targeting moieties, or any combination thereof. In embodiments, the cargo comprises a TM. In embodiments, the cargo comprises an AC.

Cyclic cell penetrating peptides (cCPPs) conjugated to a cargo moiety

The cyclic cell penetrating peptide (cCPP) can be conjugated to a cargo moiety.

The cargo moiety can be conjugated to the linker at the terminal carbonyl group to provide the following structure:

[Chemical structure not reproduced]

wherein:

EP is an exocyclic peptide and M, AAsc, Cargo, x’, y, and z’ are as defined above, and * is the point of attachment to the AAsc. x’ can be 1. y can be 4. z’ can be 11. -(OCH₂CH₂)x’- and/or -(OCH₂CH₂)z’- can be independently replaced with one or more amino acids, including, for example, glycine, beta-alanine, 4-aminobutyric acid, 5-aminopentanoic acid, 6-aminohexanoic acid, or combinations thereof.

An endosomal escape vehicle (EEV) can comprise a cyclic cell penetrating peptide (cCPP), an exocyclic peptide (EP) and a linker, and can be conjugated to a cargo to form an EEV-conjugate comprising the structure of Formula (C):

[Structure of Formula (C) not reproduced]

(C)

or a protonated form thereof, wherein:

R₁, R₂, and R₃ can each independently be H or an amino acid residue having a side chain comprising an aromatic group;

R₄ is H or an amino acid side chain;

EP is an exocyclic peptide as defined herein;

Cargo is a moiety as defined herein;

each m is independently an integer from 0-3;

n is an integer from 0-2;

x’ is an integer from 2-20;

y is an integer from 1-5;

q is an integer from 1-4; and

z’ is an integer from 2-20.

R₁, R₂, R₃, R₄, EP, cargo, m, n, x’, y, q, and z’ are as defined herein.
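The integer ranges recited for Formula (C) lend themselves to a simple consistency check. The following is a minimal sketch, assuming the ranges are inclusive. The class `FormulaCIndices`, its field names (`x_prime` and `z_prime` for x’ and z’), and the method `violations` are hypothetical and are not part of the disclosure. Because the text says "each m" without giving a count, m is modelled as a tuple of any length.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Inclusive ranges as recited for Formula (C).
M_RANGE: Tuple[int, int] = (0, 3)
SCALAR_RANGES: Dict[str, Tuple[int, int]] = {
    "n": (0, 2),
    "x_prime": (2, 20),
    "y": (1, 5),
    "q": (1, 4),
    "z_prime": (2, 20),
}


@dataclass(frozen=True)
class FormulaCIndices:
    """Hypothetical container for the integer indices of Formula (C)."""
    m: Tuple[int, ...]  # "each m is independently an integer from 0-3"
    n: int
    x_prime: int
    y: int
    q: int
    z_prime: int

    def violations(self) -> List[str]:
        """List every index that falls outside its recited range."""
        problems: List[str] = []
        for i, value in enumerate(self.m):
            if not M_RANGE[0] <= value <= M_RANGE[1]:
                problems.append(f"m[{i}]={value} not in {M_RANGE[0]}-{M_RANGE[1]}")
        for name, (low, high) in SCALAR_RANGES.items():
            value = getattr(self, name)
            if not low <= value <= high:
                problems.append(f"{name}={value} not in {low}-{high}")
        return problems


if __name__ == "__main__":
    example = FormulaCIndices(m=(3, 3, 3), n=2, x_prime=2, y=4, q=1, z_prime=12)
    print(example.violations())
```

Returning a list of violations rather than raising on the first one lets a caller report every out-of-range index at once.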

The EEV can be conjugated to a cargo and the EEV-conjugate can comprise the structure of Formula (C-a) or (C-b):

[Structures of Formula (C-a) and Formula (C-b) not reproduced]

or a protonated form thereof, wherein EP, m and z are as defined above in Formula (C).

The EEV can be conjugated to a cargo and the EEV-conjugate can comprise the structure of Formula (C-c):

[Structure of Formula (C-c) not reproduced]

(C-c),

or a protonated form thereof, wherein EP, R¹, R², R³, R⁴, and m are as defined above in Formula (III); AA can be an amino acid as defined herein; n can be an integer from 0-2; x can be an integer from 1-10; y can be an integer from 1-5; and z can be an integer from 1-10.

The EEV can be conjugated to an oligonucleotide cargo and the EEV-oligonucleotide conjugate can comprise a structure of Formula (C-1), (C-2), (C-3), or (C-4):

[Structures of Formulas (C-1), (C-2), (C-3), and (C-4) not reproduced]

The EEV can be conjugated to an oligonucleotide cargo and the EEV-conjugate can comprise the structure:


Cytosolic Delivery Efficiency

Modifications to a cyclic cell penetrating peptide (cCPP) may improve cytosolic delivery efficiency.

Improved cytosolic uptake efficiency can be measured by comparing the cytosolic delivery efficiency of a cCPP having a modified sequence to a control sequence. The control sequence does not include a particular replacement amino acid residue in the modified sequence (including, but not limited to arginine, phenylalanine, and/or glycine), but is otherwise identical.

As used herein, cytosolic delivery efficiency refers to the ability of a cCPP to traverse a cell membrane and enter the cytosol of a cell. Cytosolic delivery efficiency of the cCPP is not necessarily dependent on a receptor or a cell type. Cytosolic delivery efficiency can refer to absolute cytosolic delivery efficiency or relative cytosolic delivery efficiency.

Absolute cytosolic delivery efficiency is the ratio of cytosolic concentration of a cCPP (or a cCPP-cargo conjugate) over the concentration of the cCPP (or the cCPP-cargo conjugate) in the growth medium. Relative cytosolic delivery efficiency refers to the concentration of a cCPP in the cytosol compared to the concentration of a control cCPP in the cytosol. Quantification can be achieved by fluorescently labeling the cCPP (e.g., with a FITC dye) and measuring the fluorescence intensity using techniques well-known in the art.

Relative cytosolic delivery efficiency is determined by comparing (i) the amount of a cCPP of the invention internalized by a cell type (e.g., HeLa cells) to (ii) the amount of a control cCPP internalized by the same cell type. To measure relative cytosolic delivery efficiency, the cell type may be incubated in the presence of a cCPP for a specified period of time (e.g., 30 minutes, 1 hour, 2 hours, etc.) after which the amount of the cCPP internalized by the cell is quantified using methods known in the art, e.g., fluorescence microscopy. Separately, the same concentration of the control cCPP is incubated in the presence of the cell type over the same period of time, and the amount of the control cCPP internalized by the cell is quantified.
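The two definitions above reduce to simple ratios. The following is a minimal sketch, assuming that concentrations or internalized amounts have already been obtained (for example, from fluorescence intensity) and are expressed in the same units for both quantities being compared. The function names and the choice to report both values as percentages are assumptions made for illustration only.

```python
def absolute_efficiency_percent(cytosolic_conc: float, medium_conc: float) -> float:
    """Cytosolic concentration over growth-medium concentration, as a percentage."""
    if medium_conc <= 0:
        raise ValueError("medium concentration must be positive")
    if cytosolic_conc < 0:
        raise ValueError("cytosolic concentration cannot be negative")
    return 100.0 * cytosolic_conc / medium_conc


def relative_efficiency_percent(test_amount: float, control_amount: float) -> float:
    """Amount of test cCPP internalized relative to the control cCPP, as a percentage.

    Both amounts are assumed to come from the same cell type, the same
    applied concentration, and the same incubation time.
    """
    if control_amount <= 0:
        raise ValueError("control amount must be positive")
    if test_amount < 0:
        raise ValueError("test amount cannot be negative")
    return 100.0 * test_amount / control_amount


if __name__ == "__main__":
    # Illustrative numbers only; they are not data from the disclosure.
    print(absolute_efficiency_percent(cytosolic_conc=2.0, medium_conc=5.0))
    print(relative_efficiency_percent(test_amount=3.0, control_amount=2.0))
```

With these illustrative inputs, a cytosolic concentration of 2.0 against a medium concentration of 5.0 corresponds to 40%, and a test amount of 3.0 against a control amount of 2.0 corresponds to 150%.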

Relative cytosolic delivery efficiency can be determined by measuring the IC50 of a cCPP having a modified sequence for an intracellular target and comparing the IC50 of the cCPP having the modified sequence to a control sequence (as described herein).

The relative cytosolic delivery efficiency of the cCPPs can be in the range of from about 50% to about 450% compared to cyclo(FfΦRrRrQ, SEQ ID NO:150), e.g., about 60%, about 70%, about 80%, about 90%, about 100%, about 110%, about 120%, about 130%, about 140%, about 150%, about 160%, about 170%, about 180%, about 190%, about 200%, about 210%, about 220%, about 230%, about 240%, about 250%, about 260%, about 270%, about 280%, about 290%, about 300%, about 310%, about 320%, about 330%, about 340%, about 350%, about 360%, about 370%, about 380%, about 390%, about 400%, about 410%, about 420%, about 430%, about 440%, about 450%, about 460%, about 470%, about 480%, about 490%, about 500%, about 510%, about 520%, about 530%, about 540%, about 550%, about 560%, about 570%, about 580%, or about 590%, inclusive of all values and subranges therebetween. The relative cytosolic delivery efficiency of the cCPPs can be improved by greater than about 600% compared to a cyclic peptide comprising cyclo(FfΦRrRrQ, SEQ ID NO:150).

The absolute cytosolic delivery efficiency can be from about 40% to about 100%, e.g., about 45%, about 50%, about 55%, about 60%, about 65%, about 70%, about 75%, about 80%, about 85%, about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98%, about 99%, inclusive of all values and subranges therebetween.

The cCPPs of the present disclosure can improve the cytosolic delivery efficiency by about 1.1 fold to about 30 fold, compared to an otherwise identical sequence, e.g., about 1.2, about 1.3, about 1.4, about 1.5, about 1.6, about 1.7, about 1.8, about 1.9, about 2.0, about 2.5, about 3.0, about 3.5, about 4.0, about 4.5, about 5.0, about 5.5, about 6.0, about 6.5, about 7.0, about 7.5, about 8.0, about 8.5, about 9.0, about 10, about 10.5, about 11.0, about 11.5, about 12.0, about 12.5, about 13.0, about 13.5, about 14.0, about 14.5, about 15.0, about 15.5, about 16.0, about 16.5, about 17.0, about 17.5, about 18.0, about 18.5, about 19.0, about 19.5, about 20, about 20.5, about 21.0, about 21.5, about 22.0, about 22.5, about 23.0, about 23.5, about 24.0, about 24.5, about 25.0, about 25.5, about 26.0, about 26.5, about 27.0, about 27.5, about 28.0, about 28.5, about 29.0, or about 29.5 fold, inclusive of all values and subranges therebetween.

Detectable moiety

In embodiments, the compound disclosed herein includes a detectable moiety. In embodiments, the detectable moiety is attached to the cell penetrating peptide at the amino group, the carboxylate group, or the side chain of any of the amino acids of the cell penetrating peptide moiety (e.g., at the amino group, the carboxylate group, or the side chain of any amino acid in the CPP). In embodiments, the therapeutic moiety includes a detectable moiety. The detectable moiety can include any detectable label. Examples of suitable detectable labels include, but are not limited to, a UV-Vis label, a near-infrared label, a luminescent group, a phosphorescent group, a magnetic spin resonance label, a photosensitizer, a photocleavable moiety, a chelating center, a heavy atom, a radioactive isotope, an isotope detectable spin resonance label, a paramagnetic moiety, a chromophore, or any combination thereof. In embodiments, the label is detectable without the addition of further reagents.

In embodiments, the detectable moiety is a biocompatible detectable moiety, such that the compounds can be suitable for use in a variety of biological applications. “Biocompatible” and “biologically compatible”, as used herein, generally refer to compounds that are, along with any metabolites or degradation products thereof, generally non-toxic to cells and tissues, and which do not cause any significant adverse effects to cells and tissues when cells and tissues are incubated (e.g., cultured) in their presence.

The detectable moiety can contain a luminophore such as a fluorescent label or near-infrared label. Examples of suitable luminophores include, but are not limited to, metal porphyrins; benzoporphyrins; azabenzoporphyrine; naphthoporphyrin; phthalocyanine; polycyclic aromatic hydrocarbons such as perylene diimine, pyrenes; azo dyes; xanthene dyes; boron dipyrromethene, aza-boron dipyrromethene, cyanine dyes, metal-ligand complex such as bipyridine, bipyridyls, phenanthroline, coumarin, and acetylacetonates of ruthenium and iridium; acridine, oxazine derivatives such as benzophenoxazine; aza-annulene, squaraine; 8-hydroxyquinoline, polymethines, luminescent producing nanoparticle, such as quantum dots, nanocrystals; carbostyril; terbium complex; inorganic phosphor; ionophore such as crown ethers affiliated or derivatized dyes; or combinations thereof. Specific examples of suitable luminophores include, but are not limited to, Pd (II) octaethylporphyrin; Pt (II) octaethylporphyrin; Pd (II) tetraphenylporphyrin; Pt (II) tetraphenylporphyrin; Pd (II) meso-tetraphenylporphyrin tetrabenzoporphine; Pt (II) meso-tetraphenyl metrylbenzoporphyrin; Pd (II) octaethylporphyrin ketone; Pt (II) octaethylporphyrin ketone; Pd (II) meso-tetra(pentafluorophenyl)porphyrin; Pt (II) meso-tetra(pentafluorophenyl)porphyrin; Ru (II) tris(4,7-diphenyl-1,10-phenanthroline) (Ru(dpp)₃); Ru (II) tris(1,10-phenanthroline) (Ru(phen)₃); tris(2,2’-bipyridine)ruthenium (II) chloride hexahydrate (Ru(bpy)₃); erythrosine B; fluorescein; fluorescein isothiocyanate (FITC); eosin; iridium (III) ((N-methylbenzimidazol-2-yl)-7-(diethylamino)-coumarin)); iridium (III) ((benzothiazol-2-yl)-7-(diethylamino)-coumarin))-2-(acetylacetonate); Lumogen dyes; Macroflex fluorescent red; Macrolex fluorescent yellow; Texas Red; rhodamine B; rhodamine 6G; sulfur rhodamine; m-cresol; thymol blue; xylenol blue; cresol red; chlorophenol blue; bromocresol green; bromcresol red; bromothymol blue; Cy2; a Cy3; a Cy5; a Cy5.5; Cy7; 4-nitrophenol; alizarin; phenolphthalein; o-cresolphthalein; chlorophenol red; calmagite; bromo-xylenol; phenol red; neutral red; nitrazine; 3,4,5,6-tetrabromphenolphtalein; congo red; fluorescein; eosin; 2',7'-dichlorofluorescein; 5(6)-carboxyfluorescein; carboxynaphthofluorescein; 8-hydroxypyrene-1,3,6-trisulfonic acid; semi-naphthorhodafluor; semi-naphthofluorescein; tris(4,7-diphenyl-1,10-phenanthroline) ruthenium (II) dichloride; (4,7-diphenyl-1,10-phenanthroline) ruthenium (II) tetraphenylboron; platinum (II) octaethylporphyrin; dialkylcarbocyanine; dioctadecylcycloxacarbocyanine; fluorenylmethyloxycarbonyl chloride; 7-amino-4-methylcoumarin (Amc); green fluorescent protein (GFP); and derivatives or combinations thereof.

In some examples, the detectable moiety can include Rhodamine B (Rho), fluorescein isothiocyanate (FITC), 7-amino-4-methylcoumarin (Amc), green fluorescent protein (GFP), or derivatives or combinations thereof.

Methods of Making

The compounds described herein can be prepared in a variety of ways known to one skilled in the art of organic synthesis or variations thereon as appreciated by those skilled in the art. The compounds described herein can be prepared from readily available starting materials. Optimum reaction conditions can vary with the particular reactants or solvents used, but such conditions can be determined by one skilled in the art.

Variations on the compounds described herein include the addition, subtraction, or movement of the various constituents as described for each compound. Similarly, when one or more chiral centers are present in a molecule, the chirality of the molecule can be changed. Additionally, compound synthesis can involve the protection and deprotection of various chemical groups. The use of protection and deprotection, and the selection of appropriate protecting groups can be determined by one skilled in the art. The chemistry of protecting groups can be found, for example, in Wuts and Greene, Protective Groups in Organic Synthesis, 4th Ed., Wiley & Sons, 2006, which is incorporated herein by reference in its entirety.

The starting materials and reagents used in preparing the disclosed compounds and compositions are either available from commercial suppliers such as Aldrich Chemical Co. (Milwaukee, WI), Acros Organics (Morris Plains, NJ), Fisher Scientific (Pittsburgh, PA), Sigma (St. Louis, MO), Pfizer (New York, NY), GlaxoSmithKline (Raleigh, NC), Merck (Whitehouse Station, NJ), Johnson & Johnson (New Brunswick, NJ), Aventis (Bridgewater, NJ), AstraZeneca (Wilmington, DE), Novartis (Basel, Switzerland), Wyeth (Madison, NJ), Bristol-Myers-Squibb (New York, NY), Roche (Basel, Switzerland), Lilly (Indianapolis, IN), Abbott (Abbott Park, IL), Schering Plough (Kenilworth, NJ), or Boehringer Ingelheim (Ingelheim, Germany), or are prepared by methods known to those skilled in the art following procedures set forth in references such as Fieser and Fieser’s Reagents for Organic Synthesis, Volumes 1-17 (John Wiley and Sons, 1991); Rodd’s Chemistry of Carbon Compounds, Volumes 1-5 and Supplementals (Elsevier Science Publishers, 1989); Organic Reactions, Volumes 1-40 (John Wiley and Sons, 1991); March’s Advanced Organic Chemistry, (John Wiley and Sons, 4th Edition); and Larock’s Comprehensive Organic Transformations (VCH Publishers Inc., 1989). Other materials, such as the pharmaceutical carriers disclosed herein can be obtained from commercial sources.

Reactions to produce the compounds described herein can be carried out in solvents, which can be selected by one of skill in the art of organic synthesis. Solvents can be substantially nonreactive with the starting materials (reactants), the intermediates, or products under the conditions at which the reactions are carried out, i.e., temperature and pressure. Reactions can be carried out in one solvent or a mixture of more than one solvent. Product or intermediate formation can be monitored according to any suitable method known in the art. For example, product formation can be monitored by spectroscopic means, such as nuclear magnetic resonance spectroscopy (e.g., ¹H or ¹³C), infrared spectroscopy, spectrophotometry (e.g., UV-visible), or mass spectrometry, or by chromatography such as high-performance liquid chromatography (HPLC) or thin layer chromatography.

The disclosed compounds can be prepared by solid phase peptide synthesis wherein the amino acid α-N-terminus is protected by an acid or base protecting group. Such protecting groups should have the properties of being stable to the conditions of peptide linkage formation while being readily removable without destruction of the growing peptide chain or racemization of any of the chiral centers contained therein. Suitable protecting groups are 9-fluorenylmethyloxycarbonyl (Fmoc), t-butyloxycarbonyl (Boc), benzyloxycarbonyl (Cbz), biphenylisopropyloxycarbonyl, t-amyloxycarbonyl, isobornyloxycarbonyl, α,α-dimethyl-3,5-dimethoxybenzyloxycarbonyl, o-nitrophenylsulfenyl, 2-cyano-t-butyloxycarbonyl, and the like. The 9-fluorenylmethyloxycarbonyl (Fmoc) protecting group is particularly preferred for the synthesis of the disclosed compounds. Other preferred side chain protecting groups are, for side chain amino groups like lysine and arginine, 2,2,5,7,8-pentamethylchroman-6-sulfonyl (pmc), nitro, p-toluenesulfonyl, 4-methoxybenzenesulfonyl, Cbz, Boc, and adamantyloxycarbonyl; for tyrosine, benzyl, o-bromobenzyloxycarbonyl, 2,6-dichlorobenzyl, isopropyl, t-butyl (t-Bu), cyclohexyl, cyclopentyl and acetyl (Ac); for serine, t-butyl, benzyl and tetrahydropyranyl; for histidine, trityl, benzyl, Cbz, p-toluenesulfonyl and 2,4-dinitrophenyl; for tryptophan, formyl; for aspartic acid and glutamic acid, benzyl and t-butyl; and for cysteine, triphenylmethyl (trityl).

In the solid phase peptide synthesis method, the α-C-terminal amino acid is attached to a suitable solid support or resin. Suitable solid supports useful for the above synthesis are those materials which are inert to the reagents and reaction conditions of the stepwise condensation-deprotection reactions, as well as being insoluble in the media used. Solid supports for synthesis of α-C-terminal carboxy peptides include 4-hydroxymethylphenoxymethyl-copoly(styrene-1% divinylbenzene) or 4-(2',4'-dimethoxyphenyl-Fmoc-aminomethyl)phenoxyacetamidoethyl resin available from Applied Biosystems (Foster City, Calif.). The α-C-terminal amino acid is coupled to the resin by means of N,N'-dicyclohexylcarbodiimide (DCC), N,N'-diisopropylcarbodiimide (DIC) or O-benzotriazol-1-yl-N,N,N',N'-tetramethyluroniumhexafluorophosphate (HBTU), with or without 4-dimethylaminopyridine (DMAP), 1-hydroxybenzotriazole (HOBT), benzotriazol-1-yloxy-tris(dimethylamino)phosphoniumhexafluorophosphate (BOP) or bis(2-oxo-3-oxazolidinyl)phosphine chloride (BOPCl), mediated coupling for from about 1 to about 24 hours at a temperature of between 10°C and 50°C in a solvent such as dichloromethane or DMF. When the solid support is 4-(2',4'-dimethoxyphenyl-Fmoc-aminomethyl)phenoxyacetamidoethyl resin, the Fmoc group is cleaved with a secondary amine, preferably piperidine, prior to coupling with the α-C-terminal amino acid as described above. One method for coupling to the deprotected 4-(2',4'-dimethoxyphenyl-Fmoc-aminomethyl)phenoxyacetamidoethyl resin is O-benzotriazol-1-yl-N,N,N',N'-tetramethyluroniumhexafluorophosphate (HBTU, 1 equiv.) and 1-hydroxybenzotriazole (HOBT, 1 equiv.) in DMF. The coupling of successive protected amino acids can be carried out in an automatic polypeptide synthesizer. In one example, the α-N-terminus in the amino acids of the growing peptide chain are protected with Fmoc. The removal of the Fmoc protecting group from the α-N-terminal side of the growing peptide is accomplished by treatment with a secondary amine, preferably piperidine. Each protected amino acid is then introduced in about 3-fold molar excess, and the coupling is preferably carried out in DMF. The coupling agent can be O-benzotriazol-1-yl-N,N,N',N'-tetramethyluroniumhexafluorophosphate (HBTU, 1 equiv.) and 1-hydroxybenzotriazole (HOBT, 1 equiv.). At the end of the solid phase synthesis, the polypeptide is removed from the resin and deprotected, either successively or in a single operation. Removal of the polypeptide and deprotection can be accomplished in a single operation by treating the resin-bound polypeptide with a cleavage reagent comprising thioanisole, water, ethanedithiol and trifluoroacetic acid. In cases wherein the α-C-terminal of the polypeptide is an alkylamide, the resin is cleaved by aminolysis with an alkylamine. Alternatively, the peptide can be removed by transesterification, e.g. with methanol, followed by aminolysis or by direct transamidation. The protected peptide can be purified at this point or taken to the next step directly. The removal of the side chain protecting groups can be accomplished using the cleavage cocktail described above. The fully deprotected peptide can be purified by a sequence of chromatographic steps employing any or all of the following types: ion exchange on a weakly basic resin (acetate form); hydrophobic adsorption chromatography on underivatized polystyrene-divinylbenzene (for example, Amberlite XAD); silica gel adsorption chromatography; ion exchange chromatography on carboxymethylcellulose; partition chromatography, e.g. on Sephadex G-25, LH-20 or countercurrent distribution; high performance liquid chromatography (HPLC), especially reverse-phase HPLC on octyl- or octadecylsilyl-silica bonded phase column packing.
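The "about 3-fold molar excess" recited for each coupling step can be turned into reagent quantities once a resin loading is known. The following is a minimal sketch under stated assumptions: the resin mass, resin loading, and amino acid molecular weight are inputs supplied by the user and are not values from the disclosure, and the HBTU and HOBT equivalents are assumed here to be counted relative to the protected amino acid. The names `CouplingAmounts` and `coupling_amounts` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CouplingAmounts:
    """Quantities for one coupling step (hypothetical helper type)."""
    resin_sites_mmol: float
    amino_acid_mmol: float
    amino_acid_mg: float
    hbtu_mmol: float
    hobt_mmol: float


def coupling_amounts(
    resin_mass_g: float,
    loading_mmol_per_g: float,
    amino_acid_mw: float,
    molar_excess: float = 3.0,   # "about 3-fold molar excess" from the text
    hbtu_equiv: float = 1.0,     # assumed relative to the amino acid
    hobt_equiv: float = 1.0,     # assumed relative to the amino acid
) -> CouplingAmounts:
    """Compute reagent amounts for coupling one protected amino acid."""
    inputs = (resin_mass_g, loading_mmol_per_g, amino_acid_mw,
              molar_excess, hbtu_equiv, hobt_equiv)
    if min(inputs) <= 0:
        raise ValueError("all inputs must be positive")
    resin_sites = resin_mass_g * loading_mmol_per_g   # mmol of reactive sites
    amino_acid = molar_excess * resin_sites           # mmol of amino acid
    return CouplingAmounts(
        resin_sites_mmol=resin_sites,
        amino_acid_mmol=amino_acid,
        amino_acid_mg=amino_acid * amino_acid_mw,     # mmol x g/mol = mg
        hbtu_mmol=hbtu_equiv * amino_acid,
        hobt_mmol=hobt_equiv * amino_acid,
    )


if __name__ == "__main__":
    # Illustrative inputs only: 0.5 g resin at 0.5 mmol/g, amino acid MW 400 g/mol.
    print(coupling_amounts(0.5, 0.5, 400.0))
```

If the equivalents are instead meant relative to resin sites, only the `hbtu_mmol` and `hobt_mmol` lines would change.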

The above polymers, such as PEG groups, can be attached to an oligonucleotide, such as an AC, under any suitable conditions. Any means known in the art can be used, including via acylation, reductive alkylation, Michael addition, thiol alkylation or other chemoselective conjugation/ligation methods through a reactive group on the PEG moiety (e.g., an aldehyde, amino, ester, thiol, α-haloacetyl, maleimido or hydrazino group) to a reactive group on the AC (e.g., an aldehyde, amino, ester, thiol, α-haloacetyl, maleimido or hydrazino group). Activating groups which can be used to link the water soluble polymer to one or more proteins include without limitation sulfone, maleimide, sulfhydryl, thiol, triflate, tresylate, aziridine, oxirane, 5-pyridyl, and alpha-halogenated acyl group (e.g., α-iodoacetic acid, α-bromoacetic acid, α-chloroacetic acid). If attached to the AC by reductive alkylation, the polymer selected should have a single reactive aldehyde so that the degree of polymerization is controlled. See, for example, Kinstler et al., Adv. Drug Delivery Rev. (2002), 54: 477-485; Roberts et al., Adv. Drug Delivery Rev. (2002), 54: 459-476; and Zalipsky et al., Adv. Drug Delivery Rev. (1995), 16: 157-182.

In order to directly covalently link the AC or linker to the CPP, appropriate amino acid residues of the CPP may be reacted with an organic derivatizing agent that is capable of reacting with a selected side chain or the N- or C-termini of an amino acid. Reactive groups on the peptide or conjugate moiety include, e.g., an aldehyde, amino, ester, thiol, α-haloacetyl, maleimido or hydrazino group. Derivatizing agents include, for example, maleimidobenzoyl sulfosuccinimide ester (conjugation through cysteine residues), N-hydroxysuccinimide (through lysine residues), glutaraldehyde, succinic anhydride or other agents known in the art.

Methods of making AC and conjugating AC to linear CPP are generally described in US Pub. No. 2018/0298383, which is herein incorporated by reference for all purposes. The methods may be applied to the cyclic CPPs disclosed herein.

Synthetic schemes are provided in FIG. 3A-3D and FIG. 4.

Non-limiting examples of compounds that include a CPP and a reactive group useful for conjugation to an AC are shown in Table 9. Example linker groups are also shown. Example reactive groups include tetrafluorophenyl ester (TFP), free carboxylic acid (COOH), and azide (N₃). In Table 9, n is an integer from 0 to 20; Pip6a is Ac-RXRRBRRXRYQFLIRXRBRXRB wherein B is β-alanine and X is aminohexanoic acid; Dap is 2,3-diaminopropionic acid; NLS is a nuclear localization sequence; βA is beta-alanine; -ss- is a disulfide; PABC is poly(A) binding protein C-terminal domain; C_x where x is a number is an alkyl chain of length x; and BCN is bicyclo[6.1.0]nonyne.

Table 9: Compounds that include a CPP and a reactive group

TFP-PEG_n-K(CPP)

TFP-PEG_n-K(CPP)-PEG_n-Dap(palmitoyl)

TFP-PEG_n-K(CPP)-PEG_n-Dap(CPP)

TFP-Pip6a

CPP-PEG_n-TFP

CPP-PEG_n-K(CPP)-PEG_n-TFP

CPP-PEG_n-Lys(N₃)

CPP-K(CPP)-PEG_n-K(N₃)

CPP-PEG_n-K(PEG_n-CPP)-PEG_n-K(N₃)

CPP-PEG_n-K(PEG_n-CPP)-PEG_n-K(N₃)

CPP-K(CPP)-K(CPP)-PEG_n-K(N₃)

CPP-PEG_n-K(PEG_n-CPP)-K(PEG_n-CPP)-PEG_n-K(N₃)

CPP-PEG_n-K(PEG_n-CPP)-K(PEG_n-CPP)-PEG_n-K(N₃)

Ac-NLS-Lys(CPP)-PEG_n-K(N₃)

K(N₃)-PEG_n-NLS-ss-PEG_n-CPP

BCN-NLS-ss-CPP

CPP-PEG_n-Val-Cit-PABC-K(N₃)

CPP-PEG_n-Cys-ss-Cys-K(N₃)

CPP-PEG_n-Cys-ss-Cys-K(N₃)

CPP-PEG_n-TFP

CPP-PEG_n-Lys(N₃)

CPP-PEG_n-Cys-prodisulfide-K(N₃)

CPP-PEG_n-K(N₃)

CPP-K(CPP)-PEG_n-K(N₃)

CPP-PEG_n-K(CPP)-PEG_n-TFP

CPP-C₆-TFP

CPP-PEG_n-K(PEG_n-CPP)-PEG_n-K(N₃)

Ac-T9-PEG_n-Lys(CPP-PEG_n)-K(N₃)

Ac-MSP-PEG_n-K(CPP-PEG_n)-K(N₃)

CPP-PEG_n-TFP (ENTRD 802)

CPP-C₆-TFP (ENTRD 696)

CPP-PEG_n-K(CPP)-PEG_n-TFP (ENTRD-344)

CPP-PEG_n-COOH

CPP-C₁₂-TFP (ENTD-695)

palmitoyl-PEG_n-K(CPP)-PEG_n-TFP (ENTD-343)

CPP-PEG_n-K(N₃) (ENTRD-617)

Ac-T9-PEG_n-K(CPP)-K(N₃) (ENTRD 673)

Ac-MSP-PEG_n-K(CPP-PEG_n)-K(N₃) (ENTRD 675)

Ac-NLS-K(CPP)-PEG_n-K(N₃) (ENTRD 684)

K(N₃)-PEG_n-NLS-ss-PEG_n-CPP (ETRD-681)

K(N₃)-PEG_n-NLS-K-βA-βA-CPP (ETRD-682)

In embodiments, the CPPs have free carboxylic acid groups that may be utilized for conjugation to an AC. In embodiments, the EEVs have free carboxylic acid groups that may be utilized for conjugation to an AC.

The structure below is a 3’ cyclooctyne modified PMO used for a click reaction with a compound that

An example scheme of conjugation of a CPP and linker to the 3’ end of an AC via an amide bond is shown below.

(Reaction scheme: CPP-Linker-COOH is coupled to the AC using PyAOP and DIPEA in DMF at r.t.; m = 1-4.)

An example scheme of conjugation of a CPP and linker to a 3’-cyclooctyne modified PMO via strain-promoted azide-alkyne cycloaddition is shown below:

(Reaction scheme: CPP-Linker-N₃ is reacted with the 3’-cyclooctyne modified PMO in nuclease-free water, 1-10 mM, at r.t., giving a mixture of regioisomers.)

An example of the conjugation chemistry used to connect an AC and CPP with an additional linker containing a polyethylene glycol moiety is shown below:

(Reaction scheme: CPP-Linker-N₃, nuclease-free water, 1-10 mM, r.t.; the product is a mixture of regioisomers.)

An example of conjugation of a CPP-linker to a 5’-cyclooctyne modified PMO via strain-promoted azide-alkyne cycloaddition (click chemistry) is shown below:

(Reaction scheme: CPP-Linker-N₃, nuclease-free water, 1-10 mM, r.t.)

Methods of synthesizing oligomeric antisense compounds are known in the art. The present disclosure is not limited by the method of synthesizing the AC. In embodiments, provided herein are compounds having reactive phosphorus groups useful for forming internucleoside linkages including, for example, phosphodiester and phosphorothioate internucleoside linkages. Methods of preparation and/or purification of precursors or antisense compounds are not a limitation of the compositions or methods provided herein. Methods for synthesis and purification of DNA, RNA, and the antisense compounds are well known to those skilled in the art.

Oligomerization of modified and unmodified nucleosides can be routinely performed according to literature procedures for DNA (Protocols for Oligonucleotides and Analogs, Ed. Agrawal (1993), Humana Press) and/or RNA (Scaringe, Methods (2001), 23, 206-217; Gait et al., Applications of Chemically Synthesized RNA in RNA: Protein Interactions, Ed. Smith (1998), 1-36; Gallo et al., Tetrahedron (2001), 57, 5707-5713).

Antisense compounds provided herein can be conveniently and routinely made through the well-known technique of solid phase synthesis. Equipment for such synthesis is sold by several vendors including, for example, Applied Biosystems (Foster City, CA). Any other means for such synthesis known in the art may additionally or alternatively be employed. It is well known to use similar techniques to prepare oligonucleotides such as the phosphorothioates and alkylated derivatives. The invention is not limited by the method of antisense compound synthesis.

Methods of oligonucleotide purification and analysis are known to those skilled in the art. Analysis methods include capillary electrophoresis (CE) and electrospray-mass spectroscopy. Such synthesis and analysis methods can be performed in multi-well plates. The method of the invention is not limited by the method of oligomer purification.

In the compounds disclosed herein, the AC is coupled to the CPP (e.g., cyclic peptide). As used herein, “coupled” can refer to a covalent or non-covalent association between the CPP and the AC, including fusion of the CPP to the AC and chemical conjugation of the CPP (e.g., cyclic peptide) to the AC. A non-limiting example of a means to non-covalently attach the CPP to the AC is through the streptavidin/biotin interaction, e.g., by conjugating biotin to the CPP and fusing the AC to streptavidin.

In the resulting compound, the CPP is coupled to the AC via non-covalent association between biotin and streptavidin.


In embodiments, the CPP (e.g., cyclic peptide) is conjugated, directly or indirectly, to the AC to thereby form a CPP-AC conjugate. Conjugation of the AC to the CPP may occur at any appropriate site on these moieties. For example, in embodiments, the 5' or the 3' end of the AC may be conjugated to the C-terminus, the N-terminus, or a side chain of an amino acid in the CPP.

In embodiments, the AC is covalently linked to the CPP (e.g., cyclic peptide). Covalent linkage, as used herein, refers to constructs where a CPP moiety is covalently linked to the 5' and/or 3' end of the AC moiety. Such conjugates may alternatively be described as having a CPP moiety (e.g., cyclic peptide moiety) and an oligonucleotide moiety. A covalently-linked AC-CPP or CPP-AC conjugate, in accordance with certain embodiments, includes the AC component and the CPP component associated with one another by a linker described herein.

In embodiments, the AC may be conjugated to the CPP (e.g., cyclic peptide) through a side chain of an amino acid on the CPP. Any amino acid side chain on the CPP which is capable of forming a covalent bond, or which may be so modified, can be used to link the AC to the CPP. The amino acid on the CPP can be a natural or non-natural amino acid. In embodiments, the amino acid on the CPP used to conjugate the AC is aspartic acid, glutamic acid, glutamine, asparagine, lysine, ornithine, 2,3-diaminopropionic acid, or analogs thereof, wherein the side chain is substituted with a bond to the AC or linker. In embodiments, the amino acid is lysine, or an analog thereof. In embodiments, the amino acid is glutamic acid, or an analog thereof. In embodiments, the amino acid is aspartic acid, or an analog thereof.

In embodiments, the CPP is cyclic. There are numerous possible configurations for the compounds disclosed herein. In embodiments, the compounds of the disclosure include compounds wherein the AC is conjugated to the side chain of an amino acid in the cyclic peptide. In embodiments, the compounds disclosed herein have a structure (i.e., exocyclic) according to Formula I-A:

CPP-L-AC (I-A)

wherein the linker is covalently bound to the side chain of an amino acid on the CPP and to the 5' end of the AC, the backbone of the AC, or the 3' end of the AC.

Diseases and Target Genes


In embodiments, compounds and methods are provided for treating a disease or disorder associated with one or more genes having an expanded nucleotide repeat (e.g., an expanded trinucleotide repeat). In embodiments, compounds and methods are provided for treating a disease or disorder associated with one or more genes having an expanded CTG-CUG trinucleotide repeat. In embodiments, compounds and methods are provided for treating a disease or disorder associated with one or more genes having an expanded CTG-CUG trinucleotide repeat in the 3'-UTR of the gene. In embodiments, compounds and methods are provided for treating a disease or disorder associated with a gene that has an expanded CTG-CUG repeat in the 3' UTR, such as DMPK, ATXN8OS, ATXN8, and/or JPH3. In embodiments, compounds and methods are provided for treating a disease or disorder associated with one or more genes having an expanded CTG-CUG trinucleotide repeat in an intron of a gene. In embodiments, compounds and methods are provided for treating a disease or disorder associated with an expanded CTG-CUG trinucleotide repeat in an intron of TCF4. In embodiments, compounds and methods are provided for treating myotonic dystrophy type 1 (DM1), Fuchs' Endothelial Corneal Dystrophy (FECD), Spinocerebellar Ataxia-8 (SCA8), and/or Huntington's Disease-Like 2 (HDL2).

Myotonic dystrophy type 1 (DM1)

In embodiments, compounds, compositions, and methods are provided to treat myotonic dystrophy type 1 (DM1 or Steinert's disease). DM1 is a multisystemic disorder often characterized by muscle degeneration and myotonia, or delayed muscle relaxation due to repetitive action potentials in myofibers. DM1 is the most common form of muscular dystrophy, affecting about 1 in 8000 people. DM1 is a paradigm for genetic disorders caused by CTG-CUG expansions. DM1 is a neuromuscular disorder caused by a CTG-CUG repeat expansion in the 3'-untranslated region (UTR) of the dystrophia myotonica protein kinase (DMPK) gene. At the RNA level, the DMPK transcript (e.g., the expanded CUG repeat) sequesters splicing regulator proteins, for example, muscleblind-like (MBNL) protein, which results in incorrect splicing of a number of downstream pre-mRNAs (pre-mRNAs that do not contain an expanded CUG repeat) that are regulated by MBNL1. This gain-of-function is the cause of DM1.

The excessive number of CUG repeats imparts toxic activity, referred to as a toxic gain-of-function. Multiple key proteins are misprocessed, and this contributes to the multisystemic nature of the disease, which includes generalized limb weakness, respiratory muscle impairment, cardiac abnormalities, fatigue, gastrointestinal complications, cataracts, incontinence, and excessive daytime sleepiness.


DM1 patients with CTG-CUG expansions within the 3'-untranslated region of the DMPK gene are at increased risk for FECD and form CUGexp-MBNL1 foci in corneal endothelium (Mootha et al., Investigative Ophthalmology & Visual Science, 2017; 58, 4579-4585). Association of MBNL1 with mutant RNA affects the cellular pool of free MBNL1 and triggers mis-splicing of some MBNL1 target genes (e.g., regulated by MBNL1) in affected brain, muscle, and heart tissues (Jiang et al. Hum Mol Genet. 2004; 13: 3079-3088). Gattey et al. (Cornea. 2014; 33: 96-98) reported FECD in four DM1 subjects including a mother-daughter pair. Thus, the association between DM1 and FECD is likely to be present (FECD is described in more detail elsewhere herein).

Without being bound by theory, there are at least two hypotheses proposed to explain the pathogenesis of DM1. One is that the expanded CTG-CUG repeats inhibit DMPK mRNA or protein production, resulting in DMPK haploinsufficiency. This was supported by studies demonstrating decreased expression of DMPK mRNA and protein in DM1 muscle (Fu, Y.H. et al. (1993) Decreased expression of myotonin-protein kinase messenger RNA and protein in adult form of myotonic dystrophy. Science 260, 235-238). In embodiments, the compounds and methods described herein ameliorate DMPK haploinsufficiency. The other, the RNA gain-of-function hypothesis, proposes that the mutant RNA transcribed from the expanded allele is sufficient to induce symptoms of the disease. This was suggested by the observations that (i) the expanded CTG repeats are transcribed into CUG repeats that accumulate in discrete nuclear foci, and (ii) expression of only the DMPK 3'-UTR with 200 CTG repeats is sufficient to inhibit myogenesis (Davis, B.M., et al. (1997) Expansion of a CUG trinucleotide repeat in the 3' untranslated region of myotonic dystrophy protein kinase transcripts results in nuclear retention of transcripts. Proc. Natl. Acad. Sci. U.S.A. 94, 7388-7393; Amack, J.D. et al. (1999) Cis and trans effects of the myotonic dystrophy (DM) mutation in a cell culture model. Hum. Mol. Genet. 8, 1975-1984). In embodiments, the compounds and methods described herein reduce transcription of mutant RNA associated with the expanded allele.

The expanded CTG-CUG trinucleotide repeats in the 3' untranslated region of DMPK mRNA form imperfect stable hairpin structures that accumulate in the cell nucleus in small ribonuclear complexes or microscopically visible inclusions, and impair the function of proteins implicated in transcription, splicing, or RNA export. Although DMPK genes with CUG repeats are transcribed into mRNA, the mutant transcripts are sequestered in the nucleus as aggregates (foci), which results in a decrease in cytoplasmic DMPK mRNA levels. These aggregations lead to the deregulation of the alternative splicing of many different transcripts due to sequestration of two RNA-binding proteins: MBNL1 (muscleblind-like 1) and CUGBP1 (CUG-binding protein 1), resulting in loss-of-function of MBNL1 and upregulation of CUGBP1 (Lee and Cooper. (2009) “Pathogenic mechanisms of myotonic dystrophy,” Biochem Soc Trans. 37(06): 1281-1286).

In DM1, the RNA-binding protein MBNL1 is sequestered to the double-stranded hairpin structure formed by CUG repeats, depleting it from the nucleoplasm. Then, the CUG repeats to which MBNL1 is bound stimulate Protein Kinase C (PKC) activation through an unknown mechanism, which induces CUGBP1 hyperphosphorylation and stabilization. The downstream effects include disruption of alternative splicing, mRNA translation, and mRNA decay of downstream genes. An important molecular feature of DM1 is the misregulation of alternative splicing due to sequestration of MBNL1 to CUG repeats with double-stranded hairpin structure. Among more than two dozen splicing events mis-regulated in DM1, the abnormal splicing of the skeletal muscle-specific ClC-1 (chloride channel 1) is known to be one of the causes for myotonia. Increased inclusion of exons containing premature stop codons results in down-regulation of ClC-1 mRNA and protein, which is sufficient to cause myotonia (Charlet-B et al. (2002) Loss of the muscle-specific chloride channel in type 1 myotonic dystrophy due to misregulated alternative splicing. Mol. Cell 10, 45-53; Mankodi, A. et al. (2002) Expanded CUG repeats trigger aberrant splicing of ClC-1 chloride channel pre-mRNA and hyperexcitability of skeletal muscle in myotonic dystrophy. Mol. Cell 10, 35-44). In embodiments, the compounds and methods described herein ameliorate the downstream effects, including disruption of alternative splicing, mRNA translation, and mRNA decay of downstream genes. In embodiments, the compounds and methods described herein reduce the number of splicing events mis-regulated in DM1 compared to a subject with DM1 that is not treated with compounds or methods of the disclosure.
For example, in some embodiments, the compounds and methods described herein may reduce the number of mis-regulated splicing events in one or more downstream genes such as 4833439L19Rik, Abcc9, Atp2a1, Arhgef10, Arhgap28, Armcx6, Angel1, Best3, Bin1, Brd2, Cacna1s, Cacna2d1, Cpd, Cpeb3, Ccpg1, Clasp1, Clcn1, Clk4, Cpeb2, Camk2g, Capzb, Copz2, Coch, cTNT, Ctu2, Cyp2s1, Dctn4, Dnm1l, Eya4, Efna3, Efna2, Fbxo31, Fbxo21, Frem2, Fgd4, Fuca1, Fn1, Golga4, Gpr37l1, Greb1, Heg1, Insr, Impdh2, IR, Itgav, Jag2, Klc1, Kcan6, Kif13a, Ldb3, Lrrfip2, Mapt, Macf1, Map3k4, Mapkap1, Mbnl1, Mllt3, Mbnl2, Mef2c, Mpdz, Mrpl1, Mxra7, Mybpc1, Myo9a, Ncapd3, Ngfr, Ndrg3, Ndufv3, Neb, Nfix, Numa1, Opa1, Pacsin2, Pcolce, Pdlim3, Pla2g15, Phactr4, Phka1, Phtf2, Ppp1r12b, Ppp3cc, Ppp1cc, Ramp2, Rapgef1, Rurl, Ryr1, Sorcs2, Spsb4, Scube2, Sema6c, Slc8a3, Slain2, Sorbs1, Spag9, Tmem28, Tacc1, Tacc2, Ttc7, Tnik, Tnfrsf22, Tnfrsf25, Trappc9, Trim55, Ttn, Txnl4a, Txlnb, Ube2d3, or Vsp39. In embodiments, the compounds and methods described herein reduce the number of exons containing premature stop codons which result in down-regulation of ClC-1 mRNA compared to a subject with DM1 that is not treated with compounds or methods of the disclosure.

The levels of MBNL1 and CUGBP1 in the nucleus control a subset of developmentally regulated splicing events that are reversed in DM1. In the embryonic stage, MBNL1 nuclear levels are low and CUGBP1 levels are high. During development, MBNL1 nuclear levels increase while CUGBP1 levels decrease, inducing an embryonic-to-adult transition of downstream splice targets (including IR exon 11, ClC-1 exons containing stop codons, and cTNT exon 5). However, in DM1, MBNL1 is sequestered to CUG repeats, resulting in a decrease of functional MBNL1, while CUGBP1 levels are increased due to phosphorylation and stabilization. This simulates the embryonic condition and enhances expression of embryonic isoforms in adults, resulting in multiple disease symptoms (Lee and Cooper, 2009). In embodiments, the compounds and methods described herein reduce the amount of MBNL1 sequestered, increase the amount of functional MBNL1, and/or decrease CUGBP1 levels compared to a subject with DM1 that is not treated with compounds or methods of the disclosure.

MBNL1 and CELF1 (also referred to as “CUGBP1”) are developmental regulators of splicing events during the fetal to adult transition, and modification of their activities in DM1 leads to expression of a fetal splicing pattern in adult tissues. The downstream impact of low MBNL1 and high CELF1 includes disruption of alternative splicing, mRNA translation, and mRNA decay for transcripts such as cardiac troponin T (cTNT), insulin receptor (INSR), muscle-specific chloride ion channel (CLCN1), and sarcoplasmic/endoplasmic reticulum calcium ATPase 1 (ATP2A1), in addition to MBNL1 (Konieczny et al. (2017) “Myotonic dystrophy: candidate small molecule therapeutics,” Drug Discovery Today. 22(11):1740-1748).

Compounds and methods for treating myotonic dystrophy using antisense oligomers targeting polyCUG repeats in the 3'-UTR of the DMPK gene are described in US10106796B2, US10111962B2, and US20150080311A1, each of which is herein incorporated by reference in its entirety for all purposes. However, such PMOs or PPMOs targeting CUG repeats to treat DM1 might have limitations in oligo delivery to muscles, which are the disease-affected tissues.

In embodiments, the present disclosure teaches use of diverse cell penetrating peptides (CPPs) to deliver the AC (e.g., PMO or ASO) and a degradation sequence described herein, e.g., in Tables 2 and 10, to the cytosol of the cell. In embodiments, the CPP or EEV conjugated with the AC delivers the AC of interest to the cellular location where the target sequence on pre-mRNA is located.

In embodiments, the disease is a form of myotonic dystrophy (e.g., myotonic dystrophy type 1 or myotonic dystrophy type 2). In embodiments, the target gene is the DMPK gene, which encodes myotonic-protein kinase. In embodiments, the compounds provided herein comprise an AC (e.g., ASO) that targets DMPK (e.g., the 3'-untranslated region/polyadenylation of DMPK gene) to degrade DMPK gene. Exemplary oligonucleotides that target DMPK for degradation are provided in Table 10. The degradation sequence may be used in combination with an AC sequence comprising from 1040 CAG repeats, including but not limited to the AC provided in Table 2.

Table 10. Oligonucleotides (AC) targeting DMPK for degradation.

Oligo (AC) ID	Sequence (5’-3’)	SEQ ID NO:	Target
	5’-CAG CAG CAG CAG CAG CAG CAG-3’-click-K-PEG12-Lys(CPP 12)-NLS-AC (all PMO monomers)		DMPK
DMPK-A-17	GGGCCTTTTATTCGCGAGGGTCGGG	151	DMPK
DMPK-A-18	GAGGGCCTTTTATTCGCGAGGGTCG	152	DMPK
DMPK-A-19	TGGAGGGCCTTTTATTCGCGAGGGT	153	DMPK
DMPK-A-20	GATGGAGGGCCTTTTATTCGCGAGG	154	DMPK
DMPK-A-21	CAGATGGAGGGCCTTTTATTCGCGA	155	DMPK
DMPK-A-22	GGCAGATGGAGGGCCTTTTATTCGC	156	DMPK
DMPK-A-23	TGGGCAGATGGAGGGCCTTTTATTC	157	DMPK
DMPK-A-24	TTTGGGCAGATGGAGGGCCTTTTAT	158	DMPK
DMPK-A-25	GCTTTGGGCAGATGGAGGGCCTTTT	159	DMPK
DMPK-A-26	GAGCTTTGGGCAGATGGAGGGCCTT	160	DMPK
DMPK-A-27	CAGAGCTTTGGGCAGATGGAGGGCC	161	DMPK
DMPK-A-28	TCCAGAGCTTTGGGCAGATGGAGGG	162	DMPK
DMPK-A-29	AGTCCAGAGCTTTGGGCAGATGGAG	163	DMPK
DMPK-A-30	GGAGTCCAGAGCTTTGGGCAGATGG	164	DMPK
DMPK-A-31	GTGGAGTCCAGAGCTTTGGGCAGAT	165	DMPK
DMPK-A-32	CTGTGGAGTCCAGAGCTTTGGGCAG	166	DMPK
DMPK-A-33	CACTGTGGAGTCCAGAGCTTTGGGC	177	DMPK
DMPK-A-34	GACACTGTGGAGTCCAGAGCTTTGG	178	DMPK
DMPK-A-35	CGGACACTGTGGAGTCCAGAGCTTT	179	DMPK
DMPK-A-36	CGCGGACACTGTGGAGTCCAGAGCT	180	DMPK
DMPK-A-37	ACCGCGGACACTGTGGAGTCCAGAG	181	DMPK
DMPK-A-38	AAACCGCGGACACTGTGGAGTCCAG	182	DMPK
DMPK-A-39	GCAAACCGCGGACACTGTGGAGTCC	183	DMPK
DMPK-A-40	ACGCAAACCGCGGACACTGTGGAGT	184	DMPK
DMPK-A-41	CAACGCAAACCGCGGACACTGTGGA	185	DMPK
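
The sequences listed in Table 10 appear to tile the target in overlapping 25-nucleotide windows, each entry shifted by two nucleotides relative to the one before it. The following is a minimal Python sketch of that relationship, assuming only the first four sequences as transcribed in Table 10; the helper name `window_shift` is hypothetical and is not part of the disclosure.

```python
# Illustrative sketch only. Sequences are copied from the first rows of
# Table 10; `window_shift` is a hypothetical helper, not a disclosed method.
TABLE_10 = {
    "DMPK-A-17": "GGGCCTTTTATTCGCGAGGGTCGGG",
    "DMPK-A-18": "GAGGGCCTTTTATTCGCGAGGGTCG",
    "DMPK-A-19": "TGGAGGGCCTTTTATTCGCGAGGGT",
    "DMPK-A-20": "GATGGAGGGCCTTTTATTCGCGAGG",
}


def window_shift(prev, nxt):
    """Return the smallest k >= 1 such that nxt without its first k bases
    equals prev without its last k bases, or None if there is no overlap."""
    for k in range(1, min(len(prev), len(nxt))):
        if nxt[k:] == prev[: len(prev) - k]:
            return k
    return None


names = list(TABLE_10)
# Shift between each consecutive pair of oligonucleotides (5'-3' as listed).
shifts = [window_shift(TABLE_10[a], TABLE_10[b]) for a, b in zip(names, names[1:])]
# Length of each oligonucleotide in nucleotides.
lengths = [len(seq) for seq in TABLE_10.values()]

print(shifts, lengths)
```

The same check can be extended to the remaining rows of Table 10 by adding them to the dictionary in listed order.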


Spinocerebellar Ataxia-8 (SCA8)

In embodiments, compounds, compositions, and methods are provided to treat Spinocerebellar Ataxia-8 (SCA8). SCA8 is an inherited neurodegenerative condition that is characterized by slowly progressing ataxia. Symptoms normally emerge during the third to fifth decades of life. Symptoms include eye movement abnormalities, sensory neuropathy, dysphagia, cerebellar ataxia, and cognitive impairment.

SCA8 is associated with a heterozygous, abnormally expanded CTG-CUG repeat in the 3’ UTR of two overlapping genes, ATXN8OS and ATXN8. Healthy individuals generally have between 15 and 50 CTG-CUG repeats in the ATXN8OS and ATXN8 genes. Patients with SCA8 have greater than 50 CTG-CUG repeats, sometimes as many as 240 CTG-CUG repeats, in the ATXN8OS and ATXN8 genes.

Huntington’s disease like-2 (HDL2)

In embodiments, compounds, compositions, and methods are provided to treat Huntington’s disease like-2 (HDL2). HDL2 is an autosomal dominant neurodegenerative disorder that is phenotypically related to Huntington’s disease. HDL2 is characterized by symptoms that include chorea, dystonia, rigidity, bradykinesia, and psychiatric symptoms such as dementia. Symptoms of HDL2 typically occur in mid-life and may lead to premature death by about 10-15 years.

HDL2 is associated with an expanded CTG-CUG repeat in the 3’ UTR of the junctophilin 3 (JPH3) gene (16q24.3). Healthy individuals generally have between 6 and 27 CTG-CUG repeats in the JPH3 gene. Patients with HDL2 have greater than 40 CTG-CUG repeats, sometimes as many as 60 or more CTG-CUG repeats, in the JPH3 gene.
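
The repeat ranges stated above for SCA8 and HDL2 can be expressed as a simple lookup. The following is a minimal Python sketch, assuming a DNA sequence string as input; the function names `longest_ctg_run` and `classify`, the dictionary layout, and the label strings are hypothetical and chosen for illustration. It is not a diagnostic method, and counts that fall between the stated healthy range and the stated patient threshold are deliberately left unclassified because the text does not assign them.

```python
import re

# Ranges as stated in the text; anything in between is left unclassified.
RANGES = {
    "ATXN8OS/ATXN8": {"healthy": (15, 50), "expanded_above": 50},
    "JPH3": {"healthy": (6, 27), "expanded_above": 40},
}


def longest_ctg_run(dna):
    """Number of CTG units in the longest uninterrupted CTG tract."""
    runs = re.findall(r"(?:CTG)+", dna.upper())
    return max((len(run) // 3 for run in runs), default=0)


def classify(gene, repeat_count):
    """Label a repeat count against the ranges given in the text."""
    low, high = RANGES[gene]["healthy"]
    if repeat_count > RANGES[gene]["expanded_above"]:
        return "expanded"
    if low <= repeat_count <= high:
        return "typical"
    return "unclassified"


example = "AA" + "CTG" * 45 + "TT"
count = longest_ctg_run(example)
print(count, classify("JPH3", count))
```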

Fuchs' Endothelial Corneal Dystrophy

In embodiments, compounds, compositions, and methods are provided to treat Fuchs' Endothelial Corneal Dystrophy (FECD). FECD (MIM 136800) is an age-related degenerative disorder of the corneal endothelium. FECD is characterized by progressive loss of corneal endothelial cells, thickening of Descemet’s membrane, and deposition of extracellular matrix in the form of guttae. When the number of endothelial cells becomes critically low, the cornea swells and causes loss of vision (Elhalis et al. Ocul Surf. 2010; 8(4):173-184).


FECD can be inherited as an autosomal dominant trait with genetic heterogeneity. Rare heterozygous mutations in the collagen, type VIII, alpha 2 gene (COL8A2, MIM 120252) can give rise to an early-onset corneal endothelial dystrophy. Other genes such as solute carrier family 4, sodium borate transporter, member 11 (SLC4A11, MIM 610206), transcription factor 8 (TCF8, MIM 189909), lipoxygenase homology domains 1 (LOXHD1, MIM 613267), and ATP/GTP binding protein-like 1 (AGBL1, MIM 615523) are collectively associated with a small fraction of adult-onset FECD cases. Genome-wide association studies of adult-onset FECD have implicated transcription factor 4 (TCF4, MIM 602272) and, more recently, KN motif- and ankyrin repeat domain-containing protein 4 (KANK4, MIM 614612), laminin gamma-1 (LAMC1, MIM 150290), and Na⁺/K⁺ transporting ATPase, beta-1 polypeptide (ATP1B1, MIM 182330), with the TCF4 locus noted to have a predominant effect on FECD (Mootha et al., Investigative Ophthalmology & Visual Science, 2017; 58, 4579-4585).

Expanded trinucleotide repeats at the CTG18.1 locus in intron 2 of TCF4 are associated with FECD (Wieben et al., PLoS One. 2012; 7(11):e49083). Each copy of the expanded CTG18.1 allele of more than 40 CTG-CUG trinucleotide repeats leads to significant risk for development of FECD (Mootha et al., Invest Ophthalmol Vis Sci. 2014; 55: 33-42). RNA nuclear foci, a hallmark of toxic gain-of-function RNA, have been reported in neurodegenerative disorders caused by simple repeat expansions. Expanded CUG repeat RNA accumulates as nuclear foci in the corneal endothelium of FECD subjects with the CTG18.1 triplet repeat expansion while being absent in control samples lacking the triplet expansion (Mootha et al., Invest Ophthalmol Vis Sci. 2015; 56(3):2003-2011). Expanded CUG repeat RNA colocalizes with the mRNA-splicing factor muscleblind-like 1 (MBNL1) in nuclear foci in endothelium as a molecular hallmark. The triplet repeat expansion at the CTG18.1 locus may mediate endothelial dysfunction via aberrant gene splicing as a result of the mutant CUG RNA transcripts sequestering MBNL1 (Du et al., J Biol Chem. 2015; 290: 5979-5990). Thus, two distinct triplet repeats converge on RNA foci and FECD, and it is likely that the foci may play a causal role for FECD.

In embodiments, compounds and methods useful in the treatment of Fuchs' Endothelial Corneal Dystrophy (FECD) that reduce expanded CUG repeat RNA with antisense oligonucleotides are described in WO2018165541A1 and US10760076B2, each of which is herein incorporated by reference in its entirety for all purposes. However, such phosphorodiamidate morpholino oligomers (PMOs) or peptide-conjugated PMOs (PPMOs) targeting CUG repeats, such as those described in the prior art, to treat DM1 might have limitations in oligo delivery to the target tissue (e.g., endothelial layers in the cornea), which is the disease-affected tissue.

In embodiments, the present disclosure teaches use of diverse cell penetrating peptides (CPPs) or endosomal escape vehicles (EEVs) to deliver the AC (e.g., PMO or ASO) described herein, e.g., in Table 6, to the cytosol of the cell. In embodiments, the CPP or EEV conjugated to the AC delivers the AC of interest to the cellular location where the target sequence on pre-mRNA is located.

In embodiments, the disease is Fuchs' Endothelial Corneal Dystrophy (FECD). In embodiments, the target gene is TCF4, which encodes transcription factor 4 (TCF-4), which is also known as immunoglobulin transcription factor 2 (ITF-2). In embodiments, the compounds provided herein comprise an antisense oligonucleotide that targets TCF4. Exemplary oligonucleotides that may be used to target TCF4 are provided in Table 2 and Table 11.

Table 11. Exemplary Oligonucleotides targeting the expanded triplet repeat of TCF4

Oligo chemistry	Design	Target
21-mer PMO	5’-CAG CAG CAG CAG CAG CAG CAG-3’ (all PMO monomers; SEQ ID NO: 146)	TCF4 (CUG)n
25-mer PMO	5’-CAG CAG CAG CAG CAG CAG CAG CAG C-3’ (all PMO monomers; SEQ ID NO: 147)	TCF4 (CUG)n
21-mer EEV-NLS-PMO	EEV-NLS PMO CAG 21 mer (conjugation from SEQ ID NO: 146)	TCF4 (CUG)n
30-mer PMO	5’-CAG CAG CAG CAG CAG CAG CAG CAG CAG CAG-3’ (all PMO monomers; SEQ ID NO: 148)	TCF4 (CUG)n
30-mer PMO	5’-AGC AGC AGC AGC AGC AGC AGC AGC AGC AGC-3’ (all PMO monomers; SEQ ID NO: 149)	TCF4 (CUG)n
30-mer PMO	5’-GCA GCA GCA GCA GCA GCA GCA GCA GCA GCA-3’ (all PMO monomers; SEQ ID NO: 150)	TCF4 (CUG)n

PMO: Phosphorodiamidate Morpholino Oligomer
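
The oligonucleotides in Table 11 are written as CAG-type repeats because they are antisense (reverse complementary) to the (CUG)n target, and the three 30-mers correspond to the three possible registers of the same repeat. The following is a minimal Python sketch of that relationship under standard Watson-Crick pairing; the function names `antisense` and `targets_cug_repeat` are hypothetical, and the sketch ignores backbone chemistry and hybridization strength.

```python
# Watson-Crick complement of an RNA target, written with DNA-style bases
# (A, C, G, T) for the antisense strand. Illustrative sketch only.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A"}


def antisense(rna_target):
    """Reverse complement of an RNA target, read 5'-3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna_target.upper()))


def targets_cug_repeat(oligo):
    """True if the oligo is fully complementary to some window of a long
    (CUG)n tract, checking all three registers of the repeat."""
    tract = "CUG" * (len(oligo) // 3 + 2)
    return any(
        antisense(tract[start:start + len(oligo)]) == oligo
        for start in range(3)
    )


table_11 = ["CAG" * 7, "CAG" * 8 + "C", "CAG" * 10, "AGC" * 10, "GCA" * 10]
print(antisense("CUG" * 7))
print([targets_cug_repeat(oligo) for oligo in table_11])
```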

Mootha et al. (2017) reported that DM1 and FECD originate from noncoding CTG expansions even though the two are not identical diseases. The DMPK expansion in DM1 results in a multiorgan disease that involves various tissues in the eye including lens, retina, and corneal endothelium. In contrast, the TCF4 repeat expansion appears to affect the corneal endothelium without any clinically apparent sequela to other ocular tissues or bodily organs. Mutant expansions in DMPK and TCF4 share important similarities, such as (i) nuclear foci that contain expanded CUG repeats, (ii) association of foci with MBNL1 protein, and (iii) an ability to cause FECD. It is suggested that the triplet expansions in both DMPK and TCF4 may cause the same corneal endothelial tissue phenotype of FECD through shared molecular mechanisms.

See U.S. Patent No. 10760076B2, International Application Publication No. WO2018165541A1, U.S. Pat. Appl. Publ. No. 2016/0355796, and U.S. Pat. Appl. Publ. No. 2018/0344817, each of which is incorporated by reference herein, and which disclose diseases and corresponding genes prone to forming and/or expanding tandem nucleotide repeats.

Compositions and Methods of Administration

The compounds of the present disclosure may be formulated into compositions suitable for in vivo applications. The compounds and/or compositions may be administered to a patient that has, or is suspected of having, a disease associated with an expanded trinucleotide repeat.

In vivo application of the disclosed compounds, and compositions containing them, can be accomplished by any suitable method and technique presently or prospectively known to those skilled in the art. For example, the disclosed compounds can be formulated in a physiologically- or pharmaceutically-acceptable composition and administered by any suitable route known in the art including, for example, oral and parenteral routes of administration. As used herein, the term parenteral includes subcutaneous, intradermal, intravenous, intramuscular, intraperitoneal, intrasternal, and intrathecal administration, such as by injection. Administration of the disclosed compounds or compositions can be a single administration, or at continuous or distinct intervals as can be readily determined by a person skilled in the art.

The compounds disclosed herein, and compositions comprising them, can also be administered utilizing liposome technology, slow-release capsules, implantable pumps, and biodegradable containers. These delivery methods can, advantageously, provide a uniform dosage over an extended period of time. The compounds can also be administered in their salt derivative forms or crystalline forms.

The compounds disclosed herein can be formulated into pharmaceutical compositions according to known methods for preparing pharmaceutically acceptable compositions. Formulations are described in detail in a number of sources which are well known and readily available to those skilled in the art. For example, Remington's Pharmaceutical Science by E.W. Martin (1995) describes formulations that can be used in connection with the disclosed methods. In general, the compounds disclosed herein can be formulated such that an effective amount of the compound is combined with a suitable carrier in order to facilitate effective administration of the compound. The compositions used can also be in a variety of forms. These include, for example, solid, semi-solid, and liquid dosage forms, such as tablets, pills, powders, liquid solutions or suspensions, suppositories, injectable and infusible solutions, and sprays. The form depends on the intended mode of administration and therapeutic application. The compositions also include conventional pharmaceutically acceptable carriers and diluents which are known to those skilled in the art. Examples of carriers or diluents for use with the compounds include ethanol, dimethyl sulfoxide, glycerol, alumina, starch, saline, and equivalent carriers and diluents. To provide for the administration of such dosages for the desired therapeutic treatment, compositions disclosed herein can advantageously comprise between about 0.1% and 100% by weight of the total of one or more of the subject compounds based on the weight of the total composition including carrier or diluent.

Formulations suitable for administration include, for example, aqueous sterile injection solutions, which can contain antioxidants, buffers, bacteriostats, and solutes that render the formulation isotonic with the blood of the intended recipient; and aqueous and nonaqueous sterile suspensions, which can include suspending agents and thickening agents. The formulations can be presented in unit-dose or multi-dose containers, for example sealed ampoules and vials, and can be stored in a freeze dried (lyophilized) condition requiring only the addition of the sterile liquid carrier, for example, water for injections, prior to use. Extemporaneous injection solutions and suspensions can be prepared from sterile powder, granules, tablets, etc. It should be understood that in addition to the ingredients particularly mentioned above, the compositions disclosed herein can include other agents conventional in the art having regard to the type of formulation in question.

Compounds disclosed herein, and compositions comprising them, can be delivered to a cell either through direct contact with the cell or via a carrier means. Carrier means for delivering compounds and compositions to cells are known in the art and include, for example, encapsulating the composition in a liposome moiety. Another means for delivery of compounds and compositions disclosed herein to a cell comprises attaching the compounds to a protein or nucleic acid that is targeted for delivery to the target cell. U.S. Patent No. 6,960,648 and U.S. Application Publication Nos. 20030032594 and 20020120100 disclose amino acid sequences that can be coupled to another composition and that allow the composition to be translocated across biological membranes. U.S. Application Publication No. 20020035243 also describes compositions for transporting biological moieties across cell membranes for intracellular delivery. Compounds can also be incorporated into polymers, examples of which include poly(D-L lactide-co-glycolide) polymer for intracranial tumors; poly[bis(p-carboxyphenoxy) propane:sebacic acid] in a 20:80 molar ratio (as used in GLIADEL); chondroitin; chitin; and chitosan.

Compounds and compositions disclosed herein, including pharmaceutically acceptable salts or prodrugs thereof, can be administered intravenously, intramuscularly, or intraperitoneally by infusion or injection. Solutions of the active agent or its salts can be prepared in water, optionally mixed with a nontoxic surfactant. Dispersions can also be prepared in glycerol, liquid polyethylene glycols, triacetin, and mixtures thereof and in oils. Under ordinary conditions of storage and use, these preparations can contain a preservative to prevent the growth of microorganisms.

The pharmaceutical dosage forms suitable for injection or infusion can include sterile aqueous solutions or dispersions or sterile powders comprising the active ingredient, which are adapted for the extemporaneous preparation of sterile injectable or infusible solutions or dispersions, optionally encapsulated in liposomes. The ultimate dosage form should be sterile, fluid and stable under the conditions of manufacture and storage. The liquid carrier or vehicle can be a solvent or liquid dispersion medium comprising, for example, water, ethanol, a polyol (for example, glycerol, propylene glycol, liquid polyethylene glycols, and the like), vegetable oils, nontoxic glyceryl esters, and suitable mixtures thereof. The proper fluidity can be maintained, for example, by the formation of liposomes, by the maintenance of the required particle size in the case of dispersions or by the use of surfactants. Optionally, the prevention of the action of microorganisms can be brought about by various other antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, sorbic acid, thimerosal, and the like. In many cases, isotonic agents may be included, for example, sugars, buffers or sodium chloride. Prolonged absorption of the injectable compositions can be brought about by the inclusion of agents that delay absorption, for example, aluminum monostearate and gelatin.

Sterile injectable solutions are prepared by incorporating a compound and/or agent disclosed herein in the required amount in the appropriate solvent with various other ingredients enumerated above, as required, followed by filter sterilization. In the case of sterile powders for the preparation of sterile injectable solutions, methods of preparation include vacuum drying and the freeze-drying techniques, which yield a powder of the active ingredient plus any additional desired ingredient present in the previously sterile-filtered solutions.

Useful dosages of the compounds and agents and pharmaceutical compositions disclosed herein can be determined by comparing their in vitro activity, and in vivo activity in animal models. Methods for the extrapolation of effective dosages in mice, and other animals, to humans are known to the art.

The dosage ranges for the administration of the compositions are those large enough to produce the desired effect in which the symptoms or disorder are affected. The dosage should not be so large as to cause adverse side effects, such as unwanted cross-reactions, anaphylactic reactions, and the like. Generally, the dosage will vary with the age, condition, sex and extent of the disease in the patient and can be determined by one of skill in the art. The dosage can be adjusted by the individual physician in the event of any counterindications. Dosage can vary, and can be administered in one or more dose administrations daily, for one or several days.

Also disclosed are pharmaceutical compositions that comprise a compound disclosed herein in combination with a pharmaceutically acceptable carrier. Pharmaceutical compositions adapted for oral, topical or parenteral administration, comprising an amount of a compound, are disclosed herein. The dose administered to a patient, particularly a human, should be sufficient to achieve a therapeutic response in the patient over a reasonable time frame, without lethal toxicity, and causing no more than an acceptable level of side effects or morbidity. One skilled in the art will recognize that dosage will depend upon a variety of factors including the condition (health) of the subject, the body weight of the subject, kind of concurrent treatment, if any, frequency of treatment, therapeutic ratio, as well as the severity and stage of the pathological condition.

Also disclosed are kits that comprise a compound disclosed herein and/or pharmaceutical compositions containing the same, in one or more containers. The disclosed kits can optionally include pharmaceutically acceptable carriers and/or diluents. In one embodiment, a kit includes one or more other components, adjuncts, or adjuvants as described herein. In one embodiment, a kit includes instructions or packaging materials that describe how to administer a compound or composition of the kit. Containers of the kit can be of any suitable material, e.g., glass, plastic, metal, etc., and of any suitable size, shape, or configuration. In one embodiment, a compound and/or agent disclosed herein is provided in the kit as a solid, such as a tablet, pill, or powder form. In another embodiment, a compound and/or agent disclosed herein is provided in the kit as a liquid or solution. In one embodiment, the kit comprises an ampoule or syringe containing a compound and/or agent disclosed herein in liquid or solution form.

In embodiments, the compound and/or composition of the disclosure is administered to a patient diagnosed with a disease associated with a nucleotide repeat expansion at a dose of between about 0.1 mg/kg and about 1000 mg/kg, for example, about 0.1 mg/kg, about 0.2 mg/kg, about 0.3 mg/kg, about 0.4 mg/kg, about 0.5 mg/kg, about 0.6 mg/kg, about 0.7 mg/kg, about 0.8 mg/kg, about 0.9 mg/kg, about 1 mg/kg, about 2 mg/kg, about 3 mg/kg, about 4 mg/kg, about 5 mg/kg, about 6 mg/kg, about 7 mg/kg, about 8 mg/kg, about 9 mg/kg, about 10 mg/kg, about 11 mg/kg, about 12 mg/kg, about 13 mg/kg, about 14 mg/kg, about 15 mg/kg, about 16 mg/kg, about 17 mg/kg, about 18 mg/kg, about 19 mg/kg, about 20 mg/kg, about 21 mg/kg, about 22 mg/kg, about 23 mg/kg, about 24 mg/kg, about 25 mg/kg, about 26 mg/kg, about 27 mg/kg, about 28 mg/kg, about 29 mg/kg, about 30 mg/kg, about 31 mg/kg, about 32 mg/kg, about 33 mg/kg, about 34 mg/kg, about 35 mg/kg, about 36 mg/kg, about 37 mg/kg, about 38 mg/kg, about 39 mg/kg, about 40 mg/kg, about 41 mg/kg, about 42 mg/kg, about 43 mg/kg, about 44 mg/kg, about 45 mg/kg, about 46 mg/kg, about 47 mg/kg, about 48 mg/kg, about 49 mg/kg, about 50 mg/kg, about 51 mg/kg, about 52 mg/kg, about 53 mg/kg, about 54 mg/kg, about 55 mg/kg, about 56 mg/kg, about 57 mg/kg, about 58 mg/kg, about 59 mg/kg, about 60 mg/kg, about 61 mg/kg, about 62 mg/kg, about 63 mg/kg, about 64 mg/kg, about 65 mg/kg, about 66 mg/kg, about 67 mg/kg, about 68 mg/kg, about 69 mg/kg, about 70 mg/kg, about 71 mg/kg, about 72 mg/kg, about 73 mg/kg, about 74 mg/kg, about 75 mg/kg, about 76 mg/kg, about 77 mg/kg, about 78 mg/kg, about 79 mg/kg, about 80 mg/kg, about 81 mg/kg, about 82 mg/kg, about 83 mg/kg, about 84 mg/kg, about 85 mg/kg, about 86 mg/kg, about 87 mg/kg, about 88 mg/kg, about 89 mg/kg, about 90 mg/kg, about 91 mg/kg, about 92 mg/kg, about 93 mg/kg, about 94 mg/kg, about 95 mg/kg, about 96 mg/kg, about 97 mg/kg, about 98 mg/kg, about 99 mg/kg, about 100 mg/kg, about 110 mg/kg, about 120 mg/kg, about 130 mg/kg, about 140 mg/kg, about 150 mg/kg, about 160 mg/kg, about 170 mg/kg, about 180 mg/kg, about 190 mg/kg, about 200 mg/kg, about 210 mg/kg, about 220 mg/kg, about 230 mg/kg, about 240 mg/kg, about 250 mg/kg, about 260 mg/kg, about 270 mg/kg, about 280 mg/kg, about 290 mg/kg, about 300 mg/kg, about 310 mg/kg, about 320 mg/kg, about 330 mg/kg, about 340 mg/kg, about 350 mg/kg, about 360 mg/kg, about 370 mg/kg, about 380 mg/kg, about 390 mg/kg, about 400 mg/kg, about 410 mg/kg, about 420 mg/kg, about 430 mg/kg, about 440 mg/kg, about 450 mg/kg, about 460 mg/kg, about 470 mg/kg, about 480 mg/kg, about 490 mg/kg, about 500 mg/kg, about 510 mg/kg, about 520 mg/kg, about 530 mg/kg, about 540 mg/kg, about 550 mg/kg, about 560 mg/kg, about 570 mg/kg, about 580 mg/kg, about 590 mg/kg, about 600 mg/kg, about 610 mg/kg, about 620 mg/kg, about 630 mg/kg, about 640 mg/kg, about 650 mg/kg, about 660 mg/kg, about 670 mg/kg, about 680 mg/kg, about 690 mg/kg, about 700 mg/kg, about 710 mg/kg, about 720 mg/kg, about 730 mg/kg, about 740 mg/kg, about 750 mg/kg, about 760 mg/kg, about 770 mg/kg, about 780 mg/kg, about 790 mg/kg, about 800 mg/kg, about 810 mg/kg, about 820 mg/kg, about 830 mg/kg, about 840 mg/kg, about 850 mg/kg, about 860 mg/kg, about 870 mg/kg, about 880 mg/kg, about 890 mg/kg, about 900 mg/kg, about 910 mg/kg, about 920 mg/kg, about 930 mg/kg, about 940 mg/kg, about 950 mg/kg, about 960 mg/kg, about 970 mg/kg, about 980 mg/kg, about 990 mg/kg, or about 1000 mg/kg, including all values and ranges therein and in between.
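
The weight-normalized doses above convert to an absolute amount by multiplying by the subject's body weight. The following is a minimal illustrative sketch of that arithmetic only, not a dosing method from this disclosure; the function name `total_dose_mg`, the hard range check, and the example body weight are assumptions made for illustration.

```python
# Hypothetical helper: converts a weight-normalized dose (mg/kg) to an
# absolute dose (mg). The bounds mirror the "about 0.1 mg/kg to about
# 1000 mg/kg" range stated in the text; treating them as hard limits is
# an assumption made for this sketch.
MIN_DOSE_MG_PER_KG = 0.1
MAX_DOSE_MG_PER_KG = 1000.0


def total_dose_mg(dose_mg_per_kg: float, body_weight_kg: float) -> float:
    """Return the absolute dose in mg for a given mg/kg dose and body weight."""
    if not MIN_DOSE_MG_PER_KG <= dose_mg_per_kg <= MAX_DOSE_MG_PER_KG:
        raise ValueError("dose is outside the 0.1 to 1000 mg/kg range")
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    return dose_mg_per_kg * body_weight_kg


if __name__ == "__main__":
    # Example: a 10 mg/kg dose for an assumed 70 kg subject.
    print(total_dose_mg(10, 70))
```

The actual dose for any subject remains a matter for one of skill in the art, as the surrounding text states.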

Methods of Treatment

The present disclosure provides a method of treating disease in a subject in need thereof, comprising administering a compound and/or composition containing the compound disclosed herein. In embodiments, the disease is any of the diseases provided in the present disclosure. In embodiments, the target gene or gene transcript is any of the target genes or gene transcripts provided in the present disclosure.

In embodiments, the patient is identified as having, or at risk of having, any disease as described herein. In embodiments, a method is provided for treating a disease associated with a CTG-CUG repeat in a 3’ untranslated region of a gene/transcript. In embodiments, a method is provided for treating myotonic dystrophy. In embodiments, a method is provided for treating myotonic dystrophy type 1 (DM1). In embodiments, a method is provided for treating SCA8. In embodiments, a method is provided for treating HDL2. In embodiments, a method is provided for treating FECD.

In embodiments, treatment refers to partial or complete alleviation, amelioration, relief, inhibition, delaying onset, reducing severity and/or incidence of one or more symptoms in a subject.

Treatment of the disease and/or symptoms of the disease may occur through a variety of molecular mechanisms such as those described herein.

In embodiments, a method is provided for altering the expression and/or activity of a target gene in a subject in need thereof, comprising administering a compound disclosed herein. In embodiments, the treatment results in the lowered expression of a target protein from a target transcript. In embodiments, treatment results in the lowered levels of a target transcript. In embodiments, treatment results in the modulation of splicing of downstream gene transcripts that are regulated by the target transcript and/or proteins that bind to the target transcript. In embodiments, modulation of splicing of downstream gene transcripts results in an increase in downstream transcripts and/or downstream protein isoforms that are associated with healthy phenotypes. In embodiments, the alternative splicing results in a decrease in downstream transcripts and/or downstream protein isoforms that are associated with disease phenotypes.

In embodiments, a method is provided for treating DM1 by reducing sequestration of at least one RNA-binding protein to a pre-mRNA comprising at least one expanded CUG repeat. In embodiments, a method is provided for treating DM1 by reducing accumulation of a pre-mRNA comprising at least one expanded CUG repeat. In embodiments, a method is provided for treating DM1 by correcting splicing defects of downstream gene transcripts.

In embodiments, treatment according to the present disclosure results in a decreased level of the target transcript and/or expression of the target transcript (e.g., DMPK, TCF4, JPH3, ATXN8OS and/or ATXN8) genes by more than about 5%, about 10%, about 15%, about 20%, about 25%, about 30%, about 35%, about 40%, about 45%, about 50%, about 55%, about 60%, about 65%, about 70%, about 75%, about 80%, about 85%, about 90%, about 95%, and about 100%, as compared to the average level of the protein in the subject before the treatment or of one or more control individuals with similar disease without treatment, or compared to treatment with an AC not conjugated to a cyclic CPP disclosed herein. In embodiments, treatment according to the present disclosure results in a decreased level of the target transcript (e.g., DMPK, TCF4, JPH3, ATXN8OS and/or ATXN8) and/or expression of the target transcript by about 5% to about 100%, about 10% to about 100%, about 20% to about 100%, about 50% to about 100%, about 70% to about 100%, about 80% to about 100%, about 90% to about 100%, about 95% to about 100%, about 40% to about 95%, about 50% to about 95%, about 70% to about 95%, or about 90% to about 95% as compared to the average level of the transcript and/or protein in the subject before the treatment or of one or more control individuals with similar disease without treatment, or compared to treatment with an AC not conjugated to a cyclic CPP disclosed herein.

In embodiments, treatment according to the present disclosure results in a decreased number of CUG repeat RNA nuclear foci of a target gene (e.g., DMPK, TCF4, JPH3, ATXN8OS and/or ATXN8) by more than about 5%, e.g., about 5%, about 10%, about 15%, about 20%, about 25%, about 30%, about 35%, about 40%, about 45%, about 50%, about 55%, about 60%, about 65%, about 70%, about 75%, about 80%, about 85%, about 90%, about 95%, and about 100%, as compared to the average level of foci in the subject before the treatment or of one or more control individuals with similar disease without treatment, or compared to treatment with an AC not conjugated to a cyclic CPP disclosed herein. In embodiments, treatment according to the present disclosure results in a decreased number of CUG repeat RNA nuclear foci of a target gene (e.g., DMPK, TCF4, JPH3, ATXN8OS and/or ATXN8) by about 5% to about 100%, about 10% to about 100%, about 20% to about 100%, about 50% to about 100%, about 70% to about 100%, about 80% to about 100%, about 90% to about 100%, about 95% to about 100%, about 40% to about 95%, about 50% to about 95%, about 70% to about 95%, or about 90% to about 95% as compared to the average level of foci in the subject before the treatment or of one or more control individuals with similar disease without treatment, or compared to treatment with an AC not conjugated to a cyclic CPP disclosed herein.

In embodiments, treatment according to the present disclosure results in a decreased level of downstream transcript and/or expression of a downstream gene product that is associated with a disease phenotype by more than about 5%, e.g., about 5%, about 10%, about 15%, about 20%, about 25%, about 30%, about 35%, about 40%, about 45%, about 50%, about 55%, about 60%, about 65%, about 70%, about 75%, about 80%, about 85%, about 90%, about 95%, and about 100%, as compared to the average level of the protein in the subject before the treatment or of one or more control individuals with similar disease without treatment, or compared to treatment with an AC not conjugated to a cyclic CPP disclosed herein. In embodiments, treatment according to the present disclosure results in a decreased level of downstream transcript and/or expression of a downstream gene product that is associated with a disease phenotype by about 5% to about 100%, about 10% to about 100%, about 20% to about 100%, about 50% to about 100%, about 70% to about 100%, about 80% to about 100%, about 90% to about 100%, about 95% to about 100%, about 40% to about 95%, about 50% to about 95%, about 70% to about 95%, or about 90% to about 95% as compared to the average level of the protein in the subject before the treatment or of one or more control individuals with similar disease without treatment, or compared to treatment with an AC not conjugated to a cyclic CPP disclosed herein.

In embodiments, treatment according to the present disclosure results in an increased level of downstream transcript and/or expression of a downstream gene product that is associated with a healthy phenotype by more than about 5%, e.g., about 5%, about 10%, about 15%, about 20%, about 25%, about 30%, about 35%, about 40%, about 45%, about 50%, about 55%, about 60%, about 65%, about 70%, about 75%, about 80%, about 85%, about 90%, about 95%, and about 100%, as compared to the average level of the protein in the subject before the treatment or of one or more control individuals with similar disease without treatment, or compared to treatment with an AC not conjugated to a cyclic CPP disclosed herein. In embodiments, treatment according to the present disclosure results in an increased level of downstream transcript and/or expression of a downstream gene product that is associated with a healthy phenotype by about 5% to about 100%, about 10% to about 100%, about 20% to about 100%, about 50% to about 100%, about 70% to about 100%, about 80% to about 100%, about 90% to about 100%, about 95% to about 100%, about 40% to about 95%, about 50% to about 95%, about 70% to about 95%, or about 90% to about 95% as compared to the average level of the protein in the subject before the treatment or of one or more control individuals with similar disease without treatment, or compared to treatment with an AC not conjugated to a cyclic CPP disclosed herein.

In embodiments, treatment according to the present disclosure results in decreased expression of a protein isoform associated with a disease phenotype in a subject’s corneal tissue, muscle tissue, diaphragm tissue, quadriceps, triceps, tibialis anterior, gastrocnemius, or heart by more than about 5%, e.g., about 5%, about 10%, about 15%, about 20%, about 25%, about 30%, about 35%, about 40%, about 45%, about 50%, about 55%, about 60%, about 65%, about 70%, about 75%, about 80%, about 85%, about 90%, about 95%, and about 100%, as compared to the average level of the protein in the subject’s corneal tissue, muscle tissue, diaphragm tissue, quadriceps, triceps, tibialis anterior, gastrocnemius, or heart before the treatment, compared to one or more control individuals with similar disease without treatment, or compared to treatment with an AC not conjugated to a cyclic CPP disclosed herein. In embodiments, treatment according to the present disclosure results in decreased expression of a protein isoform associated with a disease phenotype in a subject’s corneal tissue, muscle tissue, diaphragm tissue, quadriceps, triceps, tibialis anterior, gastrocnemius, or heart by about 5% to about 100%, about 10% to about 100%, about 20% to about 100%, about 50% to about 100%, about 70% to about 100%, about 80% to about 100%, about 90% to about 100%, about 95% to about 100%, about 40% to about 95%, about 50% to about 95%, about 70% to about 95%, or about 90% to about 95% as compared to the average level of the protein in the subject’s corneal tissue, muscle tissue, diaphragm tissue, quadriceps, triceps, tibialis anterior, gastrocnemius, or heart before the treatment, compared to one or more control individuals with similar disease without treatment, or compared to treatment with an AC not conjugated to a cyclic CPP disclosed herein.

In embodiments, treatment according to the present disclosure results in increased expression of an alternately spliced downstream protein in a subject’s corneal tissue, muscle tissue, diaphragm tissue, quadriceps, or heart by more than about 5%, e.g., about 5%, about 10%, about 15%, about 20%, about 25%, about 30%, about 35%, about 40%, about 45%, about 50%, about 55%, about 60%, about 65%, about 70%, about 75%, about 80%, about 85%, about 90%, about 95%, about 100%, about 150%, about 200%, about 250%, about 300%, about 350%, about 400%, about 450%, about 500%, about 550%, about 600%, about 650%, about 700%, about 750%, about 800%, about 850%, about 900%, about 950%, or about 1000% or more, as compared to the average level of the downstream protein in the subject’s corneal tissue, muscle tissue, diaphragm tissue, quadriceps, or heart before the treatment, compared to one or more control individuals with similar disease without treatment, or compared to treatment with an AC not conjugated to a cyclic CPP disclosed herein.

In embodiments, treatment according to the present disclosure results in increased or decreased expression of a wild type protein isomer in a subject’s corneal tissue, muscle tissue, diaphragm tissue, quadriceps, or heart by more than about 5%, e.g., about 5%, about 10%, about 15%, about 20%, about 25%, about 30%, about 35%, about 40%, about 45%, about 50%, about 55%, about 60%, about 65%, about 70%, about 75%, about 80%, about 85%, about 90%, about 95%, and about 100%, as compared to the average level of the wild type protein isomer in the subject’s corneal tissue, muscle tissue, diaphragm tissue, quadriceps, or heart before the treatment, compared to one or more control individuals with similar disease without treatment, or compared to treatment with an AC not conjugated to a cyclic CPP disclosed herein.

In embodiments, treatment according to the present disclosure results in decreased expression of a protein in a subject’s tissue of interest by more than about 5%, e.g., about 5%, about 10%, about 15%, about 20%, about 25%, about 30%, about 35%, about 40%, about 45%, about 50%, about 55%, about 60%, about 65%, about 70%, about 75%, about 80%, about 85%, about 90%, about 95%, and about 100%, as compared to the average level of the protein in the subject’s tissue of interest before the treatment, compared to one or more control individuals with similar disease without treatment, or compared to treatment with an AC not conjugated to a cyclic CPP disclosed herein.

In embodiments, treatment according to the present disclosure results in increased expression of an alternately spliced downstream protein in a subject’s tissue of interest by more than about 5%, e.g., about 5%, about 10%, about 15%, about 20%, about 25%, about 30%, about 35%, about 40%, about 45%, about 50%, about 55%, about 60%, about 65%, about 70%, about 75%, about 80%, about 85%, about 90%, about 95%, about 100%, about 150%, about 200%, about 250%, about 300%, about 350%, about 400%, about 450%, about 500%, about 550%, about 600%, about 650%, about 700%, about 750%, about 800%, about 850%, about 900%, about 950%, or about 1000% or more, as compared to the average level of the downstream protein in the subject’s tissue of interest before the treatment, compared to one or more control individuals with similar disease without treatment, or compared to treatment with an AC not conjugated to a cyclic CPP disclosed herein.

In embodiments, treatment according to the present disclosure results in increased or decreased expression of a wild type downstream protein isomer in a subject’s tissue of interest by more than about 5%, e.g., about 5%, about 10%, about 15%, about 20%, about 25%, about 30%, about 35%, about 40%, about 45%, about 50%, about 55%, about 60%, about 65%, about 70%, about 75%, about 80%, about 85%, about 90%, about 95%, and about 100%, as compared to the average level of the downstream protein in the subject’s tissue of interest before the treatment, compared to one or more control individuals with similar disease without treatment, or compared to treatment with an AC not conjugated to a cyclic CPP disclosed herein.

In embodiments, the subject’s tissue of interest is corneal tissue or muscle tissue.

The terms “improve,” “increase,” “reduce,” “decrease,” and the like, as used herein, indicate values that are relative to a control. In embodiments, a suitable control is a baseline measurement, such as a measurement in the same individual prior to initiation of the treatment described herein, or a measurement in a control individual (or multiple control individuals) in the absence of the treatment described herein. A “control individual” is an individual afflicted with the same disease, who is about the same age and/or gender as the individual being treated (to ensure that the stages of the disease in the treated individual and the control individual(s) are comparable).
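
Because every increase or decrease above is stated relative to a control, the percentages can be read as a simple relative change. The following is a minimal sketch of that arithmetic, assuming a single control measurement (or an average of control measurements) and a single post-treatment measurement in the same units; the function name `percent_change` and the example numbers are hypothetical and are not data from this disclosure.

```python
def percent_change(control: float, treated: float) -> float:
    """Return the change of `treated` relative to `control`, in percent.

    Negative values indicate a decrease relative to the control and
    positive values an increase. `control` may be a baseline measurement
    in the same individual or an average over control individuals.
    """
    if control <= 0:
        raise ValueError("control measurement must be positive")
    return 100.0 * (treated - control) / control


if __name__ == "__main__":
    # Hypothetical example: nuclear foci counted before and after treatment.
    change = percent_change(control=200, treated=80)
    print(f"{change:.1f}% relative to control")
```

A result of -60 would correspond to "a decreased number ... by about 60%" in the language used above, while a positive result corresponds to an increase.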

The individual (also referred to as “patient” or “subject”) being treated is an individual (fetus, infant, child, adolescent, or adult human) having a disease or having the potential to develop a disease. The individual may have a disease mediated by aberrant gene expression or aberrant gene splicing. In various embodiments, the individual having the disease may have downstream protein expression or activity levels that are less than about 1-99% of normal wild type protein expression or activity levels in an individual not afflicted with the disease. In embodiments, the range includes, but is not limited to, less than about 80-99%, less than about 65-80%, less than about 50-65%, less than about 30-50%, less than about 25-30%, less than about 20-25%, less than about 15-20%, less than about 10-15%, less than about 5-10%, less than about 1-5% of normal wild type protein expression or activity levels. In embodiments, the individual may have downstream protein expression or activity levels that are 1-500% higher than normal wild type target protein expression or activity levels in an individual not afflicted with the disease. In embodiments, the range includes, but is not limited to, greater than about 1-10%, about 10-50%, about 50-100%, about 100-200%, about 200-300%, about 300-400%, about 400-500%, or about 500-1000% of normal wild type target protein expression or activity levels.

In embodiments, the individual is an individual who has been recently diagnosed with the disease. Typically, early treatment (treatment commencing as soon as possible after diagnosis) is important to minimize the effects of the disease and to maximize the benefits of treatment.

Certain Definitions

As used in the description and the appended claims, the singular forms “a,” “an,” and “the” include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to “a composition” includes mixtures of two or more such compositions, reference to “an agent” includes mixtures of two or more such agents, reference to “the component” includes mixtures of two or more such components, and the like.

The term “about” when immediately preceding a numerical value means a range (e.g., plus or minus 20%, 10%, or 5% of that value). For example, “about 50” can mean 45 to 55, “about 25,000” can mean 22,500 to 27,500, etc., unless the context of the disclosure indicates otherwise, or is inconsistent with such an interpretation. For example, in a list of numerical values such as “about 49, about 50, about 55, ...”, “about 50” means a range extending to less than half the interval(s) between the preceding and subsequent values, e.g., more than 49.5 to less than 52.5. Furthermore, the phrases “less than about” a value or “greater than about” a value should be understood in view of the definition of the term “about” provided herein. Similarly, the term “about” when preceding a series of numerical values or a range of values (e.g., “about 10, 20, 30” or “about 10-30”) refers, respectively, to all values in the series, or the endpoints of the range.
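
The two readings of “about” described above can be illustrated with a short sketch. It is offered only as an aid to reading the definition, under stated assumptions: the function names are hypothetical, the 10% default tolerance is just one of the tolerances the definition mentions (it reproduces the “45 to 55” example), and returning `None` for a list value that lacks a neighbor is a choice made here because the definition does not address that case.

```python
from typing import Optional, Sequence, Tuple


def about_range(value: float, tolerance: float = 0.10) -> Tuple[float, float]:
    """Range for a standalone 'about value' as value plus or minus a fraction.

    The definition mentions 20%, 10%, or 5%; 10% is used as the default
    here only because it matches the 'about 50 can mean 45 to 55' example.
    """
    delta = abs(value) * tolerance
    return (value - delta, value + delta)


def about_in_list(
    values: Sequence[float], index: int
) -> Tuple[Optional[float], Optional[float]]:
    """Exclusive bounds for values[index] within a list of 'about' values.

    Each bound lies halfway to the neighboring value, so 'about 50' in
    (49, 50, 55) extends from more than 49.5 to less than 52.5. A missing
    neighbor yields None (an assumption; the text is silent on that case).
    """
    v = values[index]
    lower = v - (v - values[index - 1]) / 2 if index > 0 else None
    upper = v + (values[index + 1] - v) / 2 if index < len(values) - 1 else None
    return (lower, upper)


if __name__ == "__main__":
    print(about_range(50))
    print(about_in_list([49, 50, 55], 1))
```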

As used herein, “cell penetrating peptide” or “CPP” refers to a peptide that facilitates delivery of a cargo, e.g., a therapeutic moiety (TM) into a cell. In embodiments, the CPP is cyclic, and is represented as “cCPP”. In embodiments, the cCPP is capable of directing a therapeutic moiety to penetrate the membrane of a cell. In embodiments, the cCPP delivers the therapeutic moiety to the cytosol of the cell. In embodiments, the cCPP delivers an antisense compound (AC) to a cellular location where a pre-mRNA is located.

As used herein, the term “endosomal escape vehicle” (EEV) refers to a cCPP that is conjugated by a chemical linkage (i.e., a covalent bond or non-covalent interaction) to a linker and/or an exocyclic peptide (EP). The EEV can be an EEV of Formula (B).

As used herein, the term “EEV-conjugate” refers to an endosomal escape vehicle defined herein conjugated by a chemical linkage (i.e., a covalent bond or non-covalent interaction) to a cargo. The cargo can be a therapeutic moiety (e.g., an oligonucleotide, peptide, or small molecule) that can be delivered into a cell by the EEV. The EEV-conjugate can be an EEV-conjugate of Formula (C).

As used herein, the terms “exocyclic peptide” (EP) and “modulatory peptide” (MP) may be used interchangeably to refer to two or more amino acid residues linked by a peptide bond that can be conjugated to a cyclic cell penetrating peptide (cCPP) disclosed herein. The EP, when conjugated to a cyclic peptide disclosed herein, may alter the tissue distribution and/or retention of the compound. Typically, the EP comprises at least one positively charged amino acid residue, e.g., at least one lysine residue and/or at least one arginine residue. Non-limiting examples of EP are described herein. The EP can be a peptide that has been identified in the art as a “nuclear localization sequence” (NLS). Non-limiting examples of nuclear localization sequences include the nuclear localization sequence of the SV40 virus large T-antigen, the minimal functional unit of which is the seven amino acid sequence PKKKRKV (SEQ ID NO:42), the nucleoplasmin bipartite NLS with the sequence NLSKRPAAIKKAGQAKKKK (SEQ ID NO:52), the c-myc nuclear localization sequence having the amino acid sequence PAAKRVKLD (SEQ ID NO:53) or RQRRNELKRSF (SEQ ID NO:54), the sequence RMRKFKNKGKDTAELRRRRVEVSVELRKAKKDEQILKRRNV (SEQ ID NO:50) of the IBB domain from importin-alpha, the sequences VSRKRPRP (SEQ ID NO:57) and PPKKARED (SEQ ID NO:58) of the myoma T protein, the sequence PQPKKKPL (SEQ ID NO:59) of human p53, the sequence SALIKKKKKMAP (SEQ ID NO:60) of mouse c-abl IV, the sequences DRLRR (SEQ ID NO:61) and PKQKKRK (SEQ ID NO:62) of the influenza virus NS1, the sequence RKLKKKIKKL (SEQ ID NO:63) of the hepatitis virus delta antigen, the sequence REKKKFLKRR (SEQ ID NO:64) of the mouse Mx1 protein, the sequence KRKGDEVDGVDEVAKKKSKK (SEQ ID NO:65) of the human poly(ADP-ribose) polymerase, and the sequence RKCLQAGMNLEARKTKK (SEQ ID NO:66) of the steroid hormone receptors (human) glucocorticoid. International Publication No. 2001/038547 describes additional examples of NLSs and is incorporated by reference herein in its entirety.

As used herein, “linker” or “L” refers to a moiety that covalently bonds one or more moieties (e.g., an exocyclic peptide (EP) and a cargo, e.g., an oligonucleotide, peptide or small molecule) to the cyclic cell penetrating peptide (cCPP). The linker can comprise a natural or non-natural amino acid or polypeptide. The linker can be a synthetic compound containing two or more appropriate functional groups suitable to bind the cCPP to a cargo moiety, to thereby form the compounds disclosed herein.

The linker can comprise a polyethylene glycol (PEG) moiety. The linker can comprise one or more amino acids. The cCPP may be covalently bound to a cargo via a linker.

The terms “peptide,” “protein,” and “polypeptide” are used interchangeably to refer to a natural or synthetic molecule comprising two or more amino acids linked by the carboxyl group of one amino acid to the alpha amino group of another. Two or more amino acid residues can be linked by the carboxyl group of one amino acid to the alpha amino group of another. Two or more amino acids of the polypeptide can be joined by a peptide bond. The polypeptide can include a peptide backbone modification in which two or more amino acids are covalently attached by a bond other than a peptide bond. The polypeptide can include one or more non-natural amino acids, amino acid analogs, or other synthetic molecules that are capable of integrating into a polypeptide. The term polypeptide includes naturally occurring and artificially occurring amino acids. The term polypeptide includes peptides, for example, that include from about 2 to about 100 amino acid residues, as well as proteins that include more than about 100 amino acid residues, or more than about 1000 amino acid residues, including, but not limited to, therapeutic proteins such as antibodies, enzymes, receptors, soluble proteins, and the like.

As used herein, the term “contiguous” refers to two amino acids that are connected by a covalent bond. For example, in the context of a representative cyclic cell penetrating peptide (cCPP) such as [structure: a cyclic peptide in which residues AA1, AA2, AA3, AA4, and AA5 are joined in a ring], AA1/AA2, AA2/AA3, AA3/AA4, and AA5/AA1 exemplify pairs of contiguous amino acids.

A residue of a chemical species, as used herein, refers to a derivative of the chemical species that is present in a particular product. To form the product, at least one atom of the species is replaced by a bond to another moiety, such that the product contains a derivative, or residue, of the chemical species. For example, the cyclic cell penetrating peptides (cCPP) described herein have amino acids (e.g., arginine) incorporated therein through formation of one or more peptide bonds. The amino acids incorporated into the cCPP may be referred to as residues, or simply as amino acids. For example, arginine or an arginine residue refers to [structure not reproduced].

The term “protonated form thereof” refers to a protonated form of an amino acid or side chain. For example, the guanidine group on the side chain of arginine may be protonated to form a guanidinium group. The structure of a protonated form of arginine is [structure not reproduced].

As used herein, the term “chirality” refers to a molecule that has more than one stereoisomer that differs in the three-dimensional spatial arrangement of atoms, in which one stereoisomer is a non-superimposable mirror image of the other. Amino acids, except for glycine, have a chiral carbon atom adjacent to the carboxyl group. The term “enantiomer” refers to stereoisomers that are chiral. The chiral molecule can be an amino acid residue having a “D” and “L” enantiomer. Molecules without a chiral center, such as glycine, can be referred to as “achiral.”

As used herein, the term “hydrophobic” refers to a moiety that is not soluble in water or has minimal solubility in water. Generally, neutral moieties and/or non-polar moieties, or moieties that are predominately neutral and/or non-polar, are hydrophobic. Hydrophobicity can be measured by one of the methods disclosed herein.

As used herein, “aromatic” refers to an unsaturated cyclic molecule having 4n + 2 π electrons, wherein n is any integer. “Heteroaromatic,” defined below, is a subset of aromatic. Examples of aromatic amino acids include phenylalanine and naphthylalanine. The term “non-aromatic” refers to any molecule that does not fall within the definition of aromatic. For example, any linear, branched or cyclic molecule which does not fall within the definition of aromatic is non-aromatic. Examples of non-aromatic amino acids include, but are not limited to, glycine and citrulline.
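The electron-count part of the aromaticity definition above can be illustrated with a short sketch. It checks only whether a π-electron count has the form 4n + 2; the function name is hypothetical, n is assumed here to be a non-negative integer, and the sketch does not test whether a molecule is cyclic or unsaturated.

```python
def satisfies_4n_plus_2(pi_electrons: int) -> bool:
    """Return True if pi_electrons equals 4n + 2 for some integer n >= 0.

    Assumption: n is restricted to non-negative integers, so the smallest
    qualifying count is 2. Ring geometry is not examined.
    """
    return pi_electrons >= 2 and (pi_electrons - 2) % 4 == 0


if __name__ == "__main__":
    # Benzene has 6 pi electrons (n = 1); a 4-electron ring does not qualify.
    for count in (4, 6, 10):
        print(count, satisfies_4n_plus_2(count))
```

A count of 6 (as in the phenyl ring of phenylalanine) or 10 (as in a naphthyl ring) satisfies the rule, while a count of 4 does not.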

“Alkyl”, “alkyl chain” or “alkyl group” refer to a fully saturated, straight or branched hydrocarbon chain radical having from one to forty carbon atoms, and which is attached to the rest of the molecule by a single bond. Alkyls comprising any number of carbon atoms from 1 to 40 are included. An alkyl comprising up to 40 carbon atoms is a C1-C40 alkyl, an alkyl comprising up to 10 carbon atoms is a C1-C10 alkyl, an alkyl comprising up to 6 carbon atoms is a C1-C6 alkyl and an alkyl comprising up to 5 carbon atoms is a C1-C5 alkyl. A C1-C5 alkyl includes C5 alkyls, C4 alkyls, C3 alkyls, C2 alkyls and C1 alkyl (i.e., methyl). A C1-C6 alkyl includes all moieties described above for C1-C5 alkyls but also includes C6 alkyls. A C1-C10 alkyl includes all moieties described above for C1-C5 alkyls and C1-C6 alkyls, but also includes C7, C8, C9 and C10 alkyls. Similarly, a C1-C12 alkyl includes all the foregoing moieties, but also includes C11 and C12 alkyls. Non-limiting examples of C1-C12 alkyl include methyl, ethyl, n-propyl, i-propyl, sec-propyl, n-butyl, i-butyl, sec-butyl, t-butyl, n-pentyl, t-amyl, n-hexyl, n-heptyl, n-octyl, n-nonyl, n-decyl, n-undecyl, and n-dodecyl. Unless stated otherwise specifically in the specification, an alkyl group can be optionally substituted.

“Alkylene”, “alkylene chain” or “alkylene group” refers to a fully saturated, straight or branched divalent hydrocarbon chain radical, having from one to forty carbon atoms. Non-limiting examples of C2-C40 alkylene include ethylene, propylene, n-butylene, ethenylene, propenylene, n-butenylene, propynylene, n-butynylene, and the like. Unless stated otherwise specifically in the specification, an alkylene chain can be optionally substituted.

“Alkenyl”, “alkenyl chain” or “alkenyl group” refers to a straight or branched hydrocarbon chain radical having from two to forty carbon atoms and having one or more carbon-carbon double bonds. Each alkenyl group is attached to the rest of the molecule by a single bond. Alkenyl groups comprising any number of carbon atoms from 2 to 40 are included. An alkenyl group comprising up to 40 carbon atoms is a C2-C40 alkenyl, an alkenyl comprising up to 10 carbon atoms is a C2-C10 alkenyl, an alkenyl group comprising up to 6 carbon atoms is a C2-C6 alkenyl and an alkenyl comprising up to 5 carbon atoms is a C2-C5 alkenyl. A C2-C5 alkenyl includes C5 alkenyls, C4 alkenyls, C3 alkenyls, and C2 alkenyls. A C2-C6 alkenyl includes all moieties described above for C2-C5 alkenyls but also includes C6 alkenyls. A C2-C10 alkenyl includes all moieties described above for C2-C5 alkenyls and C2-C6 alkenyls, but also includes C7, C8, C9 and C10 alkenyls. Similarly, a C2-C12 alkenyl includes all the foregoing moieties, but also includes C11 and C12 alkenyls. Non-limiting examples of C2-C12 alkenyl include ethenyl (vinyl), 1-propenyl, 2-propenyl (allyl), iso-propenyl, 2-methyl-1-propenyl, 1-butenyl, 2-butenyl, 3-butenyl, 1-pentenyl, 2-pentenyl, 3-pentenyl, 4-pentenyl, 1-hexenyl, 2-hexenyl, 3-hexenyl, 4-hexenyl, 5-hexenyl, 1-heptenyl, 2-heptenyl, 3-heptenyl, 4-heptenyl, 5-heptenyl, 6-heptenyl, 1-octenyl, 2-octenyl, 3-octenyl, 4-octenyl, 5-octenyl, 6-octenyl, 7-octenyl, 1-nonenyl, 2-nonenyl, 3-nonenyl, 4-nonenyl, 5-nonenyl, 6-nonenyl, 7-nonenyl, 8-nonenyl, 1-decenyl, 2-decenyl, 3-decenyl, 4-decenyl, 5-decenyl, 6-decenyl, 7-decenyl, 8-decenyl, 9-decenyl, 1-undecenyl, 2-undecenyl, 3-undecenyl, 4-undecenyl, 5-undecenyl, 6-undecenyl, 7-undecenyl, 8-undecenyl, 9-undecenyl, 10-undecenyl, 1-dodecenyl, 2-dodecenyl, 3-dodecenyl, 4-dodecenyl, 5-dodecenyl, 6-dodecenyl, 7-dodecenyl, 8-dodecenyl, 9-dodecenyl, 10-dodecenyl, and 11-dodecenyl. Unless stated otherwise specifically in the specification, an alkenyl group can be optionally substituted.

“Alkenylene”, “alkenylene chain” or “alkenylene group” refers to a straight or branched divalent hydrocarbon chain radical, having from two to forty carbon atoms, and having one or more carbon-carbon double bonds. Non-limiting examples of C2-C40 alkenylene include ethene, propene, butene, and the like. Unless stated otherwise specifically in the specification, an alkenylene chain can be optionally substituted.

“Alkoxy” or “alkoxy group” refers to the group -OR, where R is alkyl, alkenyl, alkynyl, cycloalkyl, or heterocyclyl as defined herein. Unless stated otherwise specifically in the specification, an alkoxy group can be optionally substituted.

“Acyl” or “acyl group” refers to groups -C(O)R, where R is hydrogen, alkyl, alkenyl, alkynyl, carbocyclyl, or heterocyclyl, as defined herein. Unless stated otherwise specifically in the specification, acyl can be optionally substituted.

“Alkylcarbamoyl” or “alkylcarbamoyl group” refers to the group -O-C(O)-NR_aR_b, where R_a and R_b are the same or different and are independently an alkyl, alkenyl, alkynyl, aryl, heteroaryl, as defined herein, or R_aR_b can be taken together to form a cycloalkyl group or heterocyclyl group, as defined herein. Unless stated otherwise specifically in the specification, an alkylcarbamoyl group can be optionally substituted.

“Alkylcarboxamidyl” or “alkylcarboxamidyl group” refers to the group -C(O)-NR_aR_b, where R_a and R_b are the same or different and are independently an alkyl, alkenyl, alkynyl, aryl, heteroaryl, cycloalkyl, cycloalkenyl, cycloalkynyl, or heterocyclyl group, as defined herein, or R_aR_b can be taken together to form a cycloalkyl group, as defined herein. Unless stated otherwise specifically in the specification, an alkylcarboxamidyl group can be optionally substituted.

“Aryl” refers to a hydrocarbon ring system radical comprising hydrogen, 6 to 18 carbon atoms and at least one aromatic ring. For purposes of this invention, the aryl radical can be a monocyclic, bicyclic, tricyclic or tetracyclic ring system, which can include fused or bridged ring systems. Aryl radicals include, but are not limited to, aryl radicals derived from aceanthrylene, acenaphthylene, acephenanthrylene, anthracene, azulene, benzene, chrysene, fluoranthene, fluorene, as-indacene, s-indacene, indane, indene, naphthalene, phenalene, phenanthrene, pleiadene, pyrene, and triphenylene. Unless stated otherwise specifically in the specification, the term “aryl” is meant to include aryl radicals that are optionally substituted.

“Heteroaryl” refers to a 5- to 20-membered ring system radical comprising hydrogen atoms, one to thirteen carbon atoms, one to six heteroatoms selected from nitrogen, oxygen and sulfur, and at least one aromatic ring. For purposes of this invention, the heteroaryl radical can be a monocyclic, bicyclic, tricyclic or tetracyclic ring system, which can include fused or bridged ring systems; and the nitrogen, carbon or sulfur atoms in the heteroaryl radical can be optionally oxidized; the nitrogen atom can be optionally quaternized. Examples include, but are not limited to, azepinyl, acridinyl, benzimidazolyl, benzothiazolyl, benzindolyl, benzodioxolyl, benzofuranyl, benzooxazolyl, benzothiazolyl, benzothiadiazolyl, benzo[b][1,4]dioxepinyl, 1,4-benzodioxanyl, benzonaphthofuranyl, benzoxazolyl, benzodioxolyl, benzodioxinyl, benzopyranyl, benzopyranonyl, benzofuranyl, benzofuranonyl, benzothienyl (benzothiophenyl), benzotriazolyl, benzo[4,6]imidazo[1,2-a]pyridinyl, carbazolyl, cinnolinyl, dibenzofuranyl, dibenzothiophenyl, furanyl, furanonyl, isothiazolyl, imidazolyl, indazolyl, indolyl, indazolyl, isoindolyl, indolinyl, isoindolinyl, isoquinolyl, indolizinyl, isoxazolyl, naphthyridinyl, oxadiazolyl, 2-oxoazepinyl, oxazolyl, oxiranyl, 1-oxidopyridinyl, 1-oxidopyrimidinyl, 1-oxidopyrazinyl, 1-oxidopyridazinyl, 1-phenyl-1H-pyrrolyl, phenazinyl, phenothiazinyl, phenoxazinyl, phthalazinyl, pteridinyl, purinyl, pyrrolyl, pyrazolyl, pyridinyl, pyrazinyl, pyrimidinyl, pyridazinyl, quinazolinyl, quinoxalinyl, quinolinyl, quinuclidinyl, isoquinolinyl, tetrahydroquinolinyl, thiazolyl, thiadiazolyl, triazolyl, tetrazolyl, triazinyl, and thiophenyl (i.e. thienyl). Unless stated otherwise specifically in the specification, a heteroaryl group can be optionally substituted.

The term “substituted” used herein means any of the above groups (i.e., alkyl, alkenyl, alkynyl, cycloalkyl, cycloalkenyl, cycloalkynyl, heterocyclyl, aryl, heteroaryl, alkoxy, aryloxy, acyl, alkylcarbamoyl, alkylcarboxamidyl, alkoxycarbonyl, alkylthio, or arylthio) wherein at least one atom is replaced by a non-hydrogen atom such as, but not limited to: a halogen atom such as F, Cl, Br, and I; an oxygen atom in groups such as hydroxyl groups, alkoxy groups, and ester groups; a sulfur atom in groups such as thiol groups, thioalkyl groups, sulfone groups, sulfonyl groups, and sulfoxide groups; a nitrogen atom in groups such as amines, amides, alkylamines, dialkylamines, arylamines, alkylarylamines, diarylamines, N-oxides, imides, and enamines; a silicon atom in groups such as trialkylsilyl groups, dialkylarylsilyl groups, alkyldiarylsilyl groups, and triarylsilyl groups; and other heteroatoms in various other groups. “Substituted” also means any of the above groups in which one or more atoms are replaced by a higher-order bond (e.g., a double- or triple-bond) to a heteroatom such as oxygen in oxo, carbonyl, carboxyl, and ester groups; and nitrogen in groups such as imines, oximes, hydrazones, and nitriles. For example, “substituted” includes any of the above groups in which one or more atoms are replaced with -NR_gR_h, -NR_gC(=O)R_h, -NR_gC(=O)NR_gR_h, -NR_gC(=O)OR_h, -NR_gSO₂R_h, -OC(=O)NR_gR_h, -OR_g, -SR_g, -SOR_g, -SO₂R_g, -OSO₂R_g, -SO₂OR_g, =NSO₂R_g, and -SO₂NR_gR_h. “Substituted” also means any of the above groups in which one or more hydrogen atoms are replaced with -C(=O)R_g, -C(=O)OR_g, -C(=O)NR_gR_h, -CH₂SO₂R_g, -CH₂SO₂NR_gR_h. In the foregoing, R_g and R_h are the same or different and independently hydrogen, alkyl, alkenyl, alkynyl, alkoxy, alkylamino, thioalkyl, aryl, aralkyl, cycloalkyl, cycloalkenyl, cycloalkynyl, cycloalkylalkyl, haloalkyl, haloalkenyl, haloalkynyl, heterocyclyl, N-heterocyclyl, heterocyclylalkyl, heteroaryl, N-heteroaryl and/or heteroarylalkyl. “Substituted” further means any of the above groups in which one or more atoms are replaced by an amino, cyano, hydroxyl, imino, nitro, oxo, thioxo, halo, alkyl, alkenyl, alkynyl, alkoxy, alkylamino, thioalkyl, aryl, aralkyl, cycloalkyl, cycloalkenyl, cycloalkynyl, cycloalkylalkyl, haloalkyl, haloalkenyl, haloalkynyl, heterocyclyl, N-heterocyclyl, heterocyclylalkyl, heteroaryl, N-heteroaryl and/or heteroarylalkyl group. “Substituted” can also mean an amino acid in which one or more atoms on the side chain are replaced by alkyl, alkenyl, alkynyl, acyl, alkylcarboxamidyl, alkoxycarbonyl, carbocyclyl, heterocyclyl, aryl, or heteroaryl. In addition, each of the foregoing substituents can also be optionally substituted with one or more of the above substituents.

As used herein, the symbol “~” (hereinafter can be referred to as “a point of attachment bond”) denotes a bond that is a point of attachment between two chemical entities, one of which is depicted as being attached to the point of attachment bond and the other of which is not depicted as being attached to the point of attachment bond. For example, “XY~” indicates that the chemical entity “XY” is bonded to another chemical entity via the point of attachment bond. Furthermore, the specific point of attachment to the non-depicted chemical entity can be specified by inference. For example, the compound CH3-R³, wherein R³ is H or “XY~”, infers that when R³ is “XY”, the point of attachment bond is the same bond as the bond by which R³ is depicted as being bonded to CH3.

As used herein, by a “subject” is meant an individual. Thus, the “subject” can include domesticated animals (e.g., cats, dogs, etc.), livestock (e.g., cattle, horses, pigs, sheep, goats, etc.), laboratory animals (e.g., mouse, rabbit, rat, guinea pig, etc.), and birds. “Subject” can also include a mammal, such as a primate or a human. Thus, the subject can be a human or veterinary patient. The term “patient” refers to a subject under the treatment of a clinician, e.g., physician.

The terms “inhibit”, “inhibiting” or “inhibition” refer to a decrease in an activity, expression, function or other biological parameter and can include, but does not require, complete ablation of the activity, expression, function or other biological parameter. Inhibition can include, for example, at least about a 10% reduction in the activity, response, condition, or disease as compared to a control. In embodiments, expression, activity or function of a gene or protein is decreased by a statistically significant amount. In embodiments, activity or function is decreased by at least about 10%, about 20%, about 30%, about 40%, about 50%, and up to about 60%, about 70%, about 80%, about 90% or about 100%.
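As an illustration of the comparison to a control described above, the following minimal sketch computes a percent reduction and checks it against a threshold. The function names, the example values, and the treatment of “about 10%” as an exact 10.0 cutoff are assumptions made for illustration only; the disclosure does not prescribe a particular calculation.

```python
def percent_reduction(treated: float, control: float) -> float:
    """Percent decrease of a measured parameter relative to a control value."""
    if control <= 0:
        raise ValueError("control value must be positive")
    return (control - treated) * 100.0 / control


def meets_inhibition_threshold(treated: float, control: float,
                               threshold_pct: float = 10.0) -> bool:
    """True if the reduction relative to control is at least threshold_pct.

    Assumption: the threshold is applied as an exact cutoff.
    """
    return percent_reduction(treated, control) >= threshold_pct


if __name__ == "__main__":
    # Hypothetical readout: control activity 50 units, treated activity 45 units.
    print(percent_reduction(45.0, 50.0))
    print(meets_inhibition_threshold(45.0, 50.0))
```

The same helper could be applied to any measured parameter (activity, expression, or function), provided that the treated and control values are measured in the same units.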

By “reduce” or other forms of the word, such as “reducing” or “reduction,” is meant lowering of an event or characteristic (e.g., tumor growth). It is understood that this is typically in relation to some standard or expected value, in other words it is relative, but that it is not always necessary for the standard or relative value to be referred to. For example, “reduces tumor growth” means reducing the rate of growth of a tumor relative to a standard or a control (e.g., an untreated tumor).

As used herein, “treat,” “treating,” “treatment” and variants thereof refer to any administration of the disclosed compounds that partially or completely alleviates, ameliorates, relieves, inhibits, delays onset of, reduces severity of, and/or reduces incidence of one or more symptoms or features of a disease as described herein. In reference to a patient, the term “treatment” refers to the medical management of a patient with the intent to cure, ameliorate, stabilize, or prevent a disease, pathological condition, or disorder. This term includes active treatment, that is, treatment directed specifically toward the improvement of a disease, pathological condition, or disorder, and also includes causal treatment, that is, treatment directed toward removal of the cause of the associated disease, pathological condition, or disorder. In addition, this term includes palliative treatment, that is, treatment designed for the relief of symptoms rather than the curing of the disease, pathological condition, or disorder; preventative treatment, that is, treatment directed to minimizing or partially or completely inhibiting the development of the associated disease, pathological condition, or disorder; and supportive treatment, that is, treatment employed to supplement another specific therapy directed toward the improvement of the associated disease, pathological condition, or disorder.

The term “therapeutically effective” means that the amount of the disclosed compound and/or composition used is of sufficient quantity to ameliorate one or more causes or symptoms of a disease or disorder. Such amelioration only requires a reduction or alteration, not necessarily elimination.

The term “pharmaceutically acceptable” refers to those compounds, materials, compositions, and/or dosage forms which are, within the scope of sound medical judgment, suitable for use in contact with the tissues of human beings and/or animals without excessive toxicity, irritation, allergic response, or other problems or complications commensurate with a reasonable benefit/risk ratio.

The term “carrier” means a compound, composition, substance, or structure that, when in combination with a compound or composition of the present disclosure, aids or facilitates preparation, storage, administration, delivery, effectiveness, selectivity, or any other feature of the compound or composition for its intended use or purpose, or combinations thereof. For example, a carrier can be selected to minimize any degradation of the active ingredient and to minimize any adverse side effects in the subject.

As used herein, the term “pharmaceutically acceptable carrier” refers to a carrier suitable for administration to a patient. A pharmaceutical carrier may be a substance that aids or facilitates preparation, storage, administration, delivery, effectiveness, selectivity, or any other feature of the compound or composition of the present disclosure for its intended use or purpose, or combinations thereof. For example, a carrier can be selected to reduce degradation of the compound or to reduce adverse side effects in the patient. In embodiments, a pharmaceutically acceptable carrier can be sterile aqueous or nonaqueous solutions, dispersions, suspensions or emulsions, as well as sterile powders for reconstitution into sterile injectable solutions or dispersions just prior to use. Examples of suitable aqueous and nonaqueous carriers, diluents, solvents or vehicles include water, ethanol, polyols (such as glycerol, propylene glycol, polyethylene glycol and the like), carboxymethylcellulose and suitable mixtures thereof, vegetable oils (such as olive oil) and injectable organic esters such as ethyl oleate. Proper fluidity can be maintained, for example, by the use of coating materials such as lecithin, by the maintenance of the required particle size in the case of dispersions and by the use of surfactants. These compositions can also contain adjuvants such as preservatives, wetting agents, emulsifying agents and dispersing agents. Prevention of the action of microorganisms can be ensured by the inclusion of various antibacterial and antifungal agents such as paraben, chlorobutanol, phenol, sorbic acid, and the like. It can also be desirable to include isotonic agents such as sugars, sodium chloride, and the like. The injectable formulations can be sterilized, for example, by filtration through a bacterial-retaining filter or by incorporating sterilizing agents in the form of sterile solid compositions which can be dissolved or dispersed in sterile water or other sterile injectable media just prior to use. Suitable inert carriers can include sugars such as lactose.

The term “pharmaceutically acceptable salts” includes those obtained by reacting the active compound functioning as a base, with an inorganic or organic acid to form a salt, for example, salts of hydrochloric acid, sulfuric acid, phosphoric acid, methanesulfonic acid, camphorsulfonic acid, oxalic acid, maleic acid, succinic acid, citric acid, formic acid, hydrobromic acid, benzoic acid, tartaric acid, fumaric acid, salicylic acid, mandelic acid, carbonic acid, etc. Those skilled in the art will further recognize that acid addition salts may be prepared by reaction of the compounds with the appropriate inorganic or organic acid via any of a number of known methods. The term “pharmaceutically acceptable salts” also includes those obtained by reacting the active compound functioning as an acid, with an inorganic or organic base to form a salt, for example salts of ethylenediamine, N-methylglucamine, lysine, arginine, ornithine, choline, N,N'-dibenzylethylenediamine, chloroprocaine, diethanolamine, procaine, N-benzylphenethylamine, diethylamine, piperazine, tris-(hydroxymethyl)aminomethane, tetramethylammonium hydroxide, triethylamine, dibenzylamine, ephenamine, dehydroabietylamine, N-ethylpiperidine, benzylamine, tetramethylammonium, tetraethylammonium, methylamine, dimethylamine, trimethylamine, ethylamine, basic amino acids, and the like. Non-limiting examples of inorganic or metal salts include lithium, sodium, calcium, potassium, magnesium salts, and the like.

As used herein, the term “parenteral administration” refers to administration through injection or infusion. Parenteral administration includes, but is not limited to, subcutaneous administration, intravenous administration, or intramuscular administration.

As used herein, the term “subcutaneous administration” refers to administration just below the skin. “Intravenous administration” means administration into a vein.

As used herein, the term “dose” refers to a specified quantity of a pharmaceutical agent provided in a single administration. In embodiments, a dose may be administered in two or more boluses, tablets, or injections. In embodiments, where subcutaneous administration is desired, the desired dose requires a volume not easily accommodated by a single injection. In such embodiments, two or more injections may be used to achieve the desired dose. In embodiments, a dose may be administered in two or more injections to reduce injection site reaction in a patient.

As used herein, the term “dosage unit” refers to a form in which a pharmaceutical agent is provided.

In embodiments, a dosage unit is a vial that includes lyophilized compounds or compositions described herein. In embodiments, a dosage unit is a vial that includes reconstituted compounds or compositions described herein.

The term “therapeutic moiety” (TM) refers to a compound that can be used for treating at least one symptom of a disease or disorder and can include, but is not limited to, therapeutic polypeptides, oligonucleotides, small molecules and other agents that can be used to treat at least one symptom of a disease or disorder. In embodiments, the TM modulates the activity, expression, and/or levels of a target transcript. In embodiments, the TM decreases the levels of the target transcript through decay mechanisms. In embodiments, the activity is the ability of the target transcript to bind to (e.g., sequester) one or more proteins. In embodiments, the TM modulates the activity of the target transcript by reducing the affinity between the target transcript and one or more proteins that bind to the target transcript. As a result of reducing the affinity between the target transcript and the one or more proteins, the activity of the one or more proteins may be modulated. For example, if the one or more proteins are not bound to the target transcript, they are available to carry out their functions, such as, for example, facilitating the splicing, alternative splicing, and/or exon skipping of other transcripts. As a result of the function of a TM, the activity, expression, and/or levels of the downstream genes that are regulated by the one or more proteins whose interaction with the target transcript is disrupted by the TM may be modulated.

The terms “modulate”, “modulating” and “modulation” refer to a perturbation of expression, function or activity when compared to the level of expression, function or activity prior to modulation.

Modulation can include an increase (stimulation or induction) or a decrease (inhibition or reduction) in expression, function, or activity. In embodiments, the activity of a target transcript is modulated. In embodiments, modulating the activity of the target transcript includes decreasing the ability of the target transcript to bind to one or more proteins. In embodiments, decreasing the affinity between the target transcript and the one or more proteins results in the modulation of the activity of the one or more proteins that interact with the target transcript. For example, if the one or more proteins are not bound to the target transcript, they are available to carry out their functions, such as, for example, facilitating the splicing, alternative splicing, and/or exon skipping of other transcripts (e.g., downstream transcripts). As such, modulating the activity of the target transcript may result in the modulation of the activity, expression, and/or levels of the downstream genes that are regulated by the one or more proteins whose interaction with the target transcript may be disrupted.

“Amino acid” refers to an organic compound that includes an amino group and a carboxylic acid group and has the general formula H₂N-CH(R)-COOH, where R can be any organic group. An amino acid may be a naturally occurring amino acid or non-naturally occurring amino acid. An amino acid may be a proteogenic amino acid or a non-proteogenic amino acid. An amino acid can be an L-amino acid or a D-amino acid. The term “amino acid side chain” or “side chain” refers to the characterizing substituent (“R”) bound to the α-carbon of a natural or non-natural α-amino acid. An amino acid may be incorporated into a polypeptide via a peptide bond.

As used herein, an “uncharged” amino acid is an amino acid having a side chain that has a net neutral charge at pH 7.35 to 7.45. Examples of uncharged amino acids include, but are not limited to, glycine and citrulline.

As used herein, a “charged” amino acid is an amino acid having a side chain having a net charge at a pH of 7.35 to 7.45. An example of a charged amino acid is arginine.

As used herein, the term “sequence identity” refers to the percentage of nucleic acids or amino acids between two oligonucleotide or polypeptide sequences, respectively, that are the same and in the same relative position. As such, one sequence has a certain percentage of sequence identity compared to another sequence. For sequence comparison, typically one sequence acts as a reference sequence, to which test sequences are compared. Those of ordinary skill in the art will appreciate that two sequences are generally considered to be “substantially identical” if they contain identical residues in corresponding positions. In embodiments, the sequence identity between sequences may be determined using the Needleman-Wunsch algorithm (Needleman and Wunsch, 1970, J. Mol. Biol. 48: 443-453) as implemented in the Needle program of the EMBOSS package (EMBOSS: The European Molecular Biology Open Software Suite, Rice et al., Trends Genet. (2000), 16: 276-277), in the version that exists as of the date of filing. The parameters used are gap open penalty of 10, gap extension penalty of 0.5, and the EBLOSUM62 (EMBOSS version of BLOSUM62) substitution matrix. The output of Needle labeled “longest identity” (obtained using the -nobrief option) is used as the percent identity and is calculated as follows: (Identical Residues × 100)/(Length of Alignment - Total Number of Gaps in Alignment)
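The percent identity calculation above can be illustrated with a minimal sketch applied to a pair of already-aligned sequences. The sketch does not perform the Needleman-Wunsch alignment and does not call the EMBOSS Needle program; the function name, the use of “-” as the gap character, and the counting of each alignment column that contains a gap as one gap are assumptions made for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """(identical residues * 100) / (alignment length - number of gap columns).

    Both inputs are rows of one pairwise alignment and must have equal length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = list(zip(aligned_a, aligned_b))
    # Identical residues: same symbol in both rows, excluding gap symbols.
    identical = sum(1 for x, y in pairs if x == y and x != gap)
    # Gap columns: any column in which either row holds the gap symbol.
    gaps = sum(1 for x, y in pairs if x == gap or y == gap)
    denominator = len(pairs) - gaps
    if denominator == 0:
        raise ValueError("alignment contains no ungapped columns")
    return identical * 100.0 / denominator


if __name__ == "__main__":
    # Hypothetical alignment rows, chosen only to exercise the formula.
    print(round(percent_identity("PKKKRKV", "PAKKRKV"), 1))
    print(percent_identity("PKKKRKV", "PKK-RKV"))
```

In the second example the single gap column is removed from the denominator, so six identical residues over six ungapped columns give 100 percent.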

In other embodiments, sequence identity may be determined using the Smith-Waterman algorithm, in the version that exists as of the date of filing.

As used herein, “sequence homology” refers to the percentage of amino acids between two polypeptide sequences that are homologous and in the same relative position. As such, one polypeptide sequence has a certain percentage of sequence homology compared to another polypeptide sequence. As will be appreciated by those of ordinary skill in the art, two sequences are generally considered to be “substantially homologous” if they contain homologous residues in corresponding positions.

Homologous residues may be identical residues. Alternatively, homologous residues may be non-identical residues with appropriately similar structural and/or functional characteristics. For example, as is well known by those of ordinary skill in the art, certain amino acids are typically classified as “hydrophobic” or “hydrophilic” amino acids, and/or as having “polar” or “non-polar” side chains, and substitution of one amino acid for another of the same type may often be considered a “homologous” substitution.

As is well known in this art, amino acid sequences may be compared using any of a variety of algorithms, including those available in commercial computer programs such as BLASTP, gapped BLAST, and PSI-BLAST, in existence as of the date of filing. Such programs are described in Altschul, et al., J. Mol. Biol. (1990), 215(3): 403-410; Altschul, et al., Nucleic Acids Res. (1997), 25: 3389-3402; Baxevanis et al., Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins, Wiley, 1998; and Misener, et al., (eds.), Bioinformatics Methods and Protocols (Methods in Molecular Biology, Vol. 132), Humana Press, 1999. In addition to identifying homologous sequences, the programs mentioned above typically provide an indication of the degree of homology.

As used herein, “cell targeting moiety” refers to a molecule or macromolecule that specifically binds to a molecule, such as a receptor, on the surface of a target cell. In embodiments, the cell surface molecule is expressed only on the surface of a target cell. In embodiments, the cell surface molecule is also present on the surface of one or more non-target cells, but the amount of cell surface molecule expression is higher on the surface of the target cells. Examples of a cell targeting moiety include, but are not limited to, an antibody, a peptide, a protein, an aptamer, or a small molecule.

As used herein, the terms “antisense compound” and “AC” are used interchangeably to refer to a polymeric nucleic acid structure which is at least partially complementary to a target nucleic acid molecule to which it (the AC) hybridizes. The AC may be a short (in embodiments, less than 50 bases) polynucleotide or polynucleotide homologue that includes at least a portion of a sequence complementary to a target sequence. In embodiments, the AC is a polynucleotide or polynucleotide homologue that includes a portion that has a sequence complementary to a target sequence in a target pre-mRNA strand. The AC may be formed of natural nucleic acids, synthetic nucleic acids, nucleic acid homologues, or any combination thereof. In embodiments, the AC includes oligonucleosides. In embodiments, the AC includes antisense oligonucleotides. In embodiments, the AC includes conjugate groups. Non-limiting examples of ACs include, but are not limited to, primers, probes, antisense oligonucleotides, external guide sequence (EGS) oligonucleotides, siRNAs, oligonucleotides, oligonucleosides, oligonucleotide analogs, oligonucleotide mimetics, and chimeric combinations of these. As such, these compounds can be introduced in the form of single-stranded, double-stranded, circular, branched or hairpins and can contain structural elements such as internal or terminal bulges or loops. Oligomeric double-stranded compounds can be two strands hybridized to form double-stranded compounds or a single strand with sufficient self-complementarity to allow for hybridization and formation of a fully or partially double-stranded compound. In embodiments, an AC modulates (increases, decreases, or changes) the expression, levels, and/or activity of a target transcript (e.g., target nucleic acid). In embodiments, the AC decreases the level of the target transcript through inducing decay mechanisms. In embodiments, the AC modulates the activity of the target transcript. In embodiments, the AC modulates the activity of the target transcript by decreasing the ability of the target transcript to bind one or more proteins.
In embodiments, decreasing the affinity between the target transcript and the one or more proteins may result in the modulation of the activity of the one or more proteins. For example, if the one or more proteins are not bound to the target transcript, they are available to carry out their functions, such as, for example, facilitating the splicing, alternative splicing, and/or exon skipping of other transcripts (downstream transcripts). As such, AC-mediated modulation of the activity of the target transcript may result in modulation of the activity, expression, and/or levels of the downstream genes that are regulated by the one or more proteins whose interaction with the target transcript may be disrupted.

As used herein, the terms “targeting” or “targeted to” refer to the association of a therapeutic moiety, for example, an antisense compound, with a target nucleic acid molecule or a region of a target nucleic acid molecule. In embodiments, the therapeutic moiety includes an antisense compound that is capable of hybridizing to a target nucleic acid under physiological conditions. In embodiments, the antisense compound targets a specific portion or site within the target nucleic acid, for example, a portion of the target nucleic acid having at least one identifiable structure, function, or characteristic such as a particular exon or intron, or selected nucleobases or motifs within an exon or intron.

As used herein, the terms “target nucleic acid sequence,” “target nucleotide sequence,” and “target sequence” refer to the nucleic acid sequence or the nucleotide sequence to which a therapeutic moiety, such as an antisense compound, binds or hybridizes. Target nucleic acids include, but are not limited to, a portion of a target transcript, target RNA (including, but not limited to, pre-mRNA and mRNA or portions thereof), a portion of target cDNA derived from such RNA, as well as a portion of target non-translated RNA, such as miRNA. For example, in embodiments, a target nucleic acid can be a portion of a target cellular gene (or mRNA transcribed from such gene) whose expression or transcription is associated with a particular disorder or disease state. The term “portion” refers to a defined number of contiguous (i.e., linked) nucleotides of a nucleic acid.

As used herein, the term “transcript” or “gene transcript” refers to an RNA molecule transcribed from DNA and includes, but is not limited to, mRNA, pre-mRNA, and partially processed RNA.

The terms “target transcript” and “target RNA” refer to the pre-mRNA or mRNA transcript that is bound by the therapeutic moiety. The target transcript may include a target nucleotide sequence. In embodiments, the target transcript includes a target nucleotide sequence that includes an expanded CUG trinucleotide repeat.

The terms “target gene” and “gene of interest” refer to the gene of which modulation of the expression and/or activity is desired or intended. The target gene may be transcribed into a target transcript that includes a target nucleotide sequence. The target transcript may be translated into a protein of interest. The term “target protein” refers to the polypeptide or protein encoded by the target transcript (e.g., target mRNA).

As used herein, the term “mRNA” refers to an RNA molecule that encodes a protein and includes pre-mRNA and mature mRNA. “Pre-mRNA” refers to a newly synthesized eukaryotic mRNA molecule directly after DNA transcription. In embodiments, a pre-mRNA is capped with a 5' cap, modified with a 3’ poly-A tail, and/or spliced to produce a mature mRNA sequence. In embodiments, pre-mRNA includes one or more introns. In embodiments, the pre-mRNA undergoes a process known as splicing to remove introns and join exons. In embodiments, pre-mRNA includes one or more splicing elements or splice regulatory elements. In embodiments, pre-mRNA includes a polyadenylation site.

As used herein, the term “expression,” “gene expression,” “expression of a gene,” or the like refers to all the functions and steps by which information encoded in a gene is converted into a functional gene product, such as a polypeptide or a non-coding RNA, in a cell. Examples of non-coding RNA include transfer RNA (tRNA) and ribosomal RNA. Gene expression of a polypeptide includes transcription of the gene to form a pre-mRNA, processing of the pre-mRNA to form a mature mRNA, translocating the mature mRNA from the nucleus to the cytoplasm, translation of the mature mRNA into the polypeptide, and assembly of the encoded polypeptide. Expression includes partial expression. For example, expression of a gene may be referred to as generation of a gene transcript. Translation of a mature mRNA may be referred to as expression of the mature mRNA.

As used herein, “modulation of gene expression” or the like refers to modulation of one or more of the processes associated with gene expression. For example, modification of gene expression may include modification of one or more of gene transcription, RNA processing, RNA translocation from the nucleus to the cytoplasm, and translation of mRNA into a protein.

As used herein, the term “gene” refers to a nucleic acid sequence that encompasses a 5' promoter region associated with the expression of the gene product, and any intron and exon regions and 3' untranslated regions (UTR) associated with the expression of the gene product.

The term “immune cell” refers to a cell of hematopoietic origin that plays a role in the immune response. Immune cells include, but are not limited to, lymphocytes (e.g., B cells and T cells), natural killer (NK) cells, and myeloid cells. The term “myeloid cells” includes monocytes, macrophages and granulocytes (e.g., basophils, neutrophils, eosinophils and mast cells). Monocytes are lymphocytes that circulate through the blood for 1-3 days, after which time they either migrate into tissues and differentiate into macrophages or inflammatory dendritic cells, or die. The term “macrophage” as used herein includes fetal-derived macrophages (which also can be referred to as resident tissue macrophages) and macrophages derived from monocytes that have migrated from the bloodstream into a tissue in the body (which can be referred to as monocyte-derived macrophages). Depending on the tissue in which the macrophage is located, it may be referred to as a Kupffer cell (liver), an intraglomerular mesangial cell (kidney), an alveolar macrophage (lungs), a sinus histiocyte (lymph nodes), a Hofbauer cell (placenta), microglia (brain and spinal cord), or a Langerhans cell (skin), among others.

As used herein, the term “oligonucleotide” refers to an oligomeric compound comprising a plurality of linked nucleotides or nucleosides. One or more nucleotides of an oligonucleotide can be modified. An oligonucleotide can comprise ribonucleic acid (RNA) or deoxyribonucleic acid (DNA). Oligonucleotides can be composed of natural and/or modified nucleobases, sugars and covalent internucleoside linkages, and can further include non-nucleic acid conjugates.

As used herein, the term “nucleoside” refers to a glycosylamine that includes a nucleobase and a sugar. Nucleosides include, but are not limited to, natural nucleosides, abasic nucleosides, modified nucleosides, and nucleosides having mimetic bases and/or sugar groups. A “natural nucleoside” or “unmodified nucleoside” is a nucleoside that includes a natural nucleobase and a natural sugar. Natural nucleosides include RNA and DNA nucleosides.

As used herein, the term “natural sugar” refers to a sugar of a nucleoside that is unmodified from its naturally occurring form in RNA (2'-OH) or DNA (2'-H).

As used herein, the term “nucleotide” refers to a nucleoside having a phosphate group covalently linked to the sugar. Nucleotides may be modified with any of a variety of substituents.

As used herein, the term “nucleobase” refers to the base portion of a nucleoside or nucleotide. A nucleobase may include any atom or group of atoms capable of hydrogen bonding to a base of another nucleic acid. A “natural nucleobase” is a nucleobase that is unmodified from its naturally occurring form in RNA or DNA.

As used herein, the term “heterocyclic base moiety” refers to a nucleobase that includes a heterocycle. As used herein, “internucleoside linkage” refers to a covalent linkage between adjacent nucleosides. As used herein, “natural internucleoside linkage” refers to a 3' to 5' phosphodiester linkage.

As used herein, the term “modified internucleoside linkage” refers to any linkage between nucleosides or nucleotides other than a naturally occurring internucleoside linkage.

As used herein, “oligonucleoside” refers to an oligonucleotide in which the internucleoside linkages do not contain a phosphorus atom.

As used herein, the term “chimeric antisense compound” refers to an antisense compound having at least one sugar, nucleobase, and/or internucleoside linkage that is differentially modified as compared to the other sugars, nucleobases, and internucleoside linkages within the same oligomeric compound. The remainder of the sugars, nucleobases, and internucleoside linkages can be independently modified or unmodified. In general, a chimeric oligomeric compound will have modified nucleosides that can be in isolated positions or grouped together in regions that will define a particular motif. Any combination of modifications and/or mimetic groups can include a chimeric oligomeric compound as described herein.

As used herein, the term “mixed-backbone antisense oligonucleotide” refers to an antisense oligonucleotide wherein at least one internucleoside linkage of the antisense oligonucleotide is different from at least one other internucleoside linkage of the antisense oligonucleotide.

As used herein, the term “nucleobase complementarity” refers to a nucleobase that is capable of base pairing with another nucleobase. For example, in DNA, adenine (A) is complementary to thymine (T). For example, in RNA, adenine (A) is complementary to uracil (U). In embodiments, “complementary nucleobase” refers to a nucleobase of an antisense compound that is capable of base pairing with a nucleobase of its target nucleic acid. For example, if a nucleobase at a certain position of an antisense compound is capable of hydrogen bonding with a nucleobase at a certain position of a target nucleic acid, then the position of hydrogen bonding between the oligonucleotide and the target nucleic acid is considered to be complementary at that nucleobase pair.

As used herein, the term non-complementary nucleobase refers to a pair of nucleobases that do not form hydrogen bonds with one another or otherwise support hybridization.

As used herein, the term “complementary” refers to the capacity of an oligomeric compound to hybridize to another oligomeric compound or nucleic acid through nucleobase complementarity. In embodiments, an antisense compound and its target are complementary to each other when a sufficient number of corresponding positions in each molecule are occupied by nucleobases that can bond with each other to allow stable association between the antisense compound and the target. One skilled in the art recognizes that the inclusion of mismatches is possible without eliminating the ability of the oligomeric compounds to remain in association. Therefore, described herein are antisense compounds that may include up to about 20% nucleotides that are mismatched (i.e., are not nucleobase complementary to the corresponding nucleotides of the target). In embodiments, the antisense compounds contain no more than about 15%, for example, not more than about 10%, for example, not more than 5%, or no mismatches. The remaining nucleotides are nucleobase complementary or otherwise do not disrupt hybridization (e.g., universal bases). One of ordinary skill in the art would recognize the compounds provided herein are at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% nucleobase complementary to a target nucleic acid.
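As an illustration of the percent nucleobase complementarity described above, the following minimal sketch scores an antisense sequence against a target site of equal length. It assumes both strands are written 5' to 3', that pairing is antiparallel, and that only Watson-Crick pairs (A-U, A-T, G-C) count as matches; universal bases and modified nucleobases are not modeled. The function name and the toy sequences are hypothetical.

```python
# Watson-Crick pairs only (an assumption of this sketch).
WATSON_CRICK = {("A", "U"), ("U", "A"), ("A", "T"), ("T", "A"),
                ("G", "C"), ("C", "G")}


def percent_complementarity(antisense: str, target_site: str) -> float:
    """Percent of antisense positions that pair with the target site.

    Both sequences are given 5'->3'. The target is read in reverse so that
    position i of the antisense strand faces its antiparallel partner.
    """
    if not antisense or len(antisense) != len(target_site):
        raise ValueError("sequences must be non-empty and of equal length")
    facing = reversed(target_site.upper())
    matches = sum(1 for a, t in zip(antisense.upper(), facing)
                  if (a, t) in WATSON_CRICK)
    return 100 * matches / len(antisense)


# A 10-mer that is fully complementary to its toy target site.
print(percent_complementarity("CAGCAGCAGC", "GCUGCUGCUG"))  # 100.0
# One mismatch in 10 positions.
print(percent_complementarity("AAGCAGCAGC", "GCUGCUGCUG"))  # 90.0
```

Under these assumptions, a result of at least 80 would correspond to the "up to about 20% mismatched nucleotides" bound stated in the text.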

As used herein, “hybridization” means the pairing of complementary oligomeric compounds (e.g., an antisense compound and its target nucleic acid). While not limited to a particular mechanism, the most common mechanism of pairing involves hydrogen bonding, which may be Watson-Crick, Hoogsteen or reversed Hoogsteen hydrogen bonding, between complementary nucleoside or nucleotide bases (nucleobases). For example, the natural base adenine is nucleobase complementary to the natural nucleobases thymine and uracil, which pair through the formation of hydrogen bonds. The natural base guanine is nucleobase complementary to the natural bases cytosine and 5-methyl cytosine. Hybridization can occur under varying circumstances.

As used herein, the term specifically hybridizes refers to the ability of an oligomeric compound to hybridize to one nucleic acid site with greater affinity than it hybridizes to another nucleic acid site. In embodiments, an antisense oligonucleotide specifically hybridizes to more than one target site. In embodiments, an oligomeric compound specifically hybridizes with its target under stringent hybridization conditions.

Stringent hybridization conditions and stringent hybridization wash conditions in the context of nucleic acid hybridization are sequence dependent and are different under different environmental parameters. An extensive guide to the hybridization of nucleic acids is found in Tijssen, Laboratory Techniques in Biochemistry and Molecular Biology: Hybridization with Nucleic Acid Probes, part I, chapter 2, “Overview of principles of hybridization and the strategy of nucleic acid probe assays,” Elsevier, New York (1993). Generally, highly stringent hybridization and wash conditions are selected to be about 5°C lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm is the temperature (under defined ionic strength and pH) at which 50% of the target sequence hybridizes to a perfectly matched probe. Very stringent conditions are selected to be equal to the Tm for a particular probe. An example of stringent hybridization conditions for hybridization of complementary nucleotide sequences which have more than 100 complementary residues on a filter in a Southern or Northern blot is 50% formamide with 1 mg of heparin at 42°C, with the hybridization being carried out overnight. An example of highly stringent wash conditions is 0.15 M NaCl at 72°C for about 15 minutes. An example of stringent wash conditions is a 0.2x SSC wash at 65°C for 15 minutes (see Sambrook and Russell, Molecular Cloning: A Laboratory Manual, 3rd ed., Cold Spring Harbor Laboratory Press, 2001, for a description of SSC buffer). Often, a high stringency wash is preceded by a low stringency wash to remove background probe signal. An example of a medium stringency wash for a duplex of, e.g., more than 100 nucleotides, is 1x SSC at 45°C for 15 minutes. An example of a low stringency wash for a duplex of, e.g., more than 100 nucleotides, is 4-6x SSC at 40°C for 15 minutes. For short probes (e.g., about 10 to 50 nucleotides), stringent conditions typically involve salt concentrations of less than about 1.0 M Na ion, typically about 0.01 to 1.0 M Na ion concentration (or other salts) at pH 7.0 to 8.3, and the temperature is typically at least about 30°C. Stringent conditions can also be achieved with the addition of destabilizing agents such as formamide.

As used herein, the term “2'-modified” or “2'-substituted” means a sugar that includes a substituent at the 2' position other than H or OH. 2'-modified monomers include, but are not limited to, BNAs and monomers (e.g., nucleosides and nucleotides) with 2'-substituents, such as allyl, amino, azido, thio, O-allyl, O-C1-C10 alkyl, -OCF3, O-(CH2)2-O-CH3, 2'-O(CH2)2SCH3, O-(CH2)2-O-N(Rm)(Rn), or O-CH2-C(=O)-N(Rm)(Rn), where each Rm and Rn is, independently, H or substituted or unsubstituted C1-C10 alkyl.

As used herein, the term MOE refers to a 2'-O-methoxyethyl substituent.

As used herein, the term “high-affinity modified nucleotide” refers to a nucleotide having at least one modified nucleobase, internucleoside linkage or sugar moiety, such that the modification increases the affinity of an antisense compound that includes the modified nucleotide to a target nucleic acid. High-affinity modifications include, but are not limited to, BNAs, LNAs and 2'-MOE.

As used herein, the term “mimetic” refers to groups that are substituted for a sugar, a nucleobase, and/or internucleoside linkage in an AC. Generally, a mimetic is used in place of the sugar or sugar-internucleoside linkage combination, and the nucleobase is maintained for hybridization to a selected target. Representative examples of a sugar mimetic include, but are not limited to, cyclohexenyl or morpholino. Representative examples of a mimetic for a sugar-internucleoside linkage combination include, but are not limited to, peptide nucleic acids (PNA) and morpholino groups linked by uncharged achiral linkages. In some instances, a mimetic is used in place of the nucleobase. Representative nucleobase mimetics are well known in the art and include, but are not limited to, tricyclic phenoxazine analogs and universal bases (Berger et al., Nuc. Acid Res. 2000, 28:2911-14, incorporated herein by reference). Methods of synthesis of sugar, nucleoside, and nucleobase mimetics are well known to those skilled in the art.

As used herein, the term “bicyclic nucleoside” or “BNA” refers to a nucleoside wherein the furanose portion of the nucleoside includes a bridge connecting two atoms on the furanose ring, thereby forming a bicyclic ring system. BNAs include, but are not limited to, α-L-LNA, β-D-LNA, ENA, Oxyamino BNA (2'-O-N(CH3)-CH2-4') and Aminooxy BNA (2'-N(CH3)-O-CH2-4').

As used herein, the term “4' to 2' bicyclic nucleoside” refers to a BNA wherein the bridge connecting two atoms of the furanose ring bridges the 4' carbon atom and the 2' carbon atom of the furanose ring, thereby forming a bicyclic ring system.

As used herein, a “locked nucleic acid” or “LNA” refers to a nucleotide modified such that the 2'-hydroxyl group of the ribosyl sugar ring is linked to the 4' carbon atom of the sugar ring via a methylene group, thereby forming a 2'-C,4'-C-oxymethylene linkage. LNAs include, but are not limited to, α-L-LNA and β-D-LNA.

As used herein, the term “cap structure” or “terminal cap moiety” refers to chemical modifications which have been incorporated at either end of an AC.

The term “therapeutic polypeptide” refers to a naturally occurring or recombinantly produced macromolecule that includes two or more amino acids and has therapeutic, prophylactic or other biological activity.

The term “small molecule” refers to an organic compound with pharmacological activity and a molecular weight of less than about 2000 Daltons, or less than about 1000 Daltons, or less than about 500 Daltons. Small molecule therapeutics are typically manufactured by chemical synthesis.

Wild type target protein refers to a native, functional protein isomer produced by a wild type, normal, or unmutated version of the target gene. The wild type target protein also refers to a protein resulting from a target pre-mRNA that has been re-spliced.

A “re-spliced target protein,” as used herein, refers to the protein encoded by the mRNA resulting from the splicing of the target pre-mRNA to which the AC hybridizes. Re-spliced target protein may be identical to a wild type target protein, may be homologous to a wild type target protein, may be a functional variant of a wild type target protein, may be an isoform of a wild type target protein, or may be an active fragment of a wild type target protein.

As used herein, an “expanded trinucleotide repeat,” such as an “expanded CUG” or “expanded CTG” repeat, means that a gene containing or encoding the trinucleotide repeat contains a number of repeated consecutive trinucleotides that is greater than that present in a wild type gene. Expanded nucleotide repeats may be written as XXX-NNN or (XXX-NNN), where XXX refers to the DNA repeat and NNN refers to the RNA repeat that is transcribed from the DNA repeat. For example, the CTG-CUG repeat refers to a gene having a CTG DNA repeat from which an RNA having a CUG repeat is transcribed. In embodiments, the number of repeats in an expanded trinucleotide repeat is 5 or more, 10 or more, 15 or more, or 20 or more than the wild type gene. In embodiments, the expanded trinucleotide repeat includes 2x, 3x, 4x, 5x, 10x, 20x, 50x or more trinucleotide repeats than the wild type gene. The expanded trinucleotide repeat may result in a disease in a subject having a gene that contains the expanded trinucleotide repeat. For example, a subject having an expanded CTG repeat in a gene may suffer from DM1 or FECD. In DM1, the DMPK gene contains an expanded CTG repeat. Subjects that suffer from DM1 may have 50 or more CTG repeats in the 3’ untranslated region (UTR) of the DMPK gene, while non-disease subjects typically have 5 to 34 CTG repeats in the 3’ UTR of the DMPK gene. In FECD, the TCF4 gene contains an expanded CTG repeat. Subjects that suffer from FECD may have 40 or more CTG repeats in a CTG18.1 locus of the TCF4 gene, while non-disease subjects typically have 30 or fewer CTG repeats in the CTG18.1 locus of the TCF4 gene. mRNA transcribed from a gene having an expanded CTG repeat will have an expanded CUG repeat.
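The notion of a number of "repeated consecutive trinucleotides" can be made concrete with a short sketch that counts the longest uninterrupted run of CTG units in a DNA string and compares it with the DM1 ranges quoted above (50 or more versus 5 to 34). The function names and the toy sequence are hypothetical, and the labels are purely illustrative, not a diagnostic rule.

```python
import re


def longest_repeat_run(sequence: str, unit: str = "CTG") -> int:
    """Largest number of consecutive copies of `unit` found in `sequence`."""
    pattern = f"(?:{re.escape(unit.upper())})+"
    runs = re.findall(pattern, sequence.upper())
    return max((len(run) // len(unit) for run in runs), default=0)


def describe_dm1_ctg_count(repeat_count: int) -> str:
    """Label a CTG count using only the DM1 ranges stated in the text."""
    if repeat_count >= 50:
        return "expanded (50 or more)"
    if 5 <= repeat_count <= 34:
        return "typical non-disease range (5 to 34)"
    return "outside the ranges stated in the text"


# Toy sequence: 60 uninterrupted CTG units with arbitrary flanks.
toy_sequence = "GGA" + "CTG" * 60 + "GGGGTC"
count = longest_repeat_run(toy_sequence)
print(count, describe_dm1_ctg_count(count))  # 60 expanded (50 or more)
```

The same counting applies to the transcribed RNA by passing `unit="CUG"`.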

The term “downstream” in the present disclosure, as it relates to a gene, mRNA, or protein, refers to a gene, mRNA, or protein that is affected by binding of the AC to the target nucleotide (e.g., target transcript) but is not the gene, mRNA, or protein corresponding to the target nucleotide. Binding of the AC to the target nucleotide may reduce aggregation or sequestration of RNA binding proteins such as MBNL1 or CUGBP1 on accumulated mRNA having CUG repeats, which may make such RNA binding proteins available for proper transcription, RNA processing, and/or expression of downstream gene products.

As used herein, “functional fragment” or “active fragment” refers to a portion of a eukaryotic wild type target protein that exhibits an activity, such as one or more activities of a full-length wild type target protein, or that possesses another activity. In embodiments, a re-spliced target protein that shares at least one biological activity of wild type target protein is considered to be an active fragment of the wild type target protein. Activity can be any percentage of activity (i.e., more or less) of the full-length wild type target protein, including but not limited to, about 1% of the activity, about 2%, about 3%, about 4%, about 5%, about 10%, about 20%, about 30%, about 40%, about 50%, about 60%, about 70%, about 80%, about 90%, about 95%, about 96%, about 97%, about 98%, about 99%, about 100%, about 200%, about 300%, about 400%, about 500%, or more (including all values and ranges in between these values) activity compared to the wild type target protein. Thus, in embodiments, the active fragment may retain at least a portion of one or more biological activities of wild type target protein. In embodiments, the active fragment may enhance one or more biological activities of wild type target protein.

Wild type target protein refers to a native, functional protein isomer produced by a wild type, normal, or unmutated version of the target gene. The wild type target protein also refers to the protein resulting from a target pre-mRNA that has been properly spliced.

As used herein, the terms “splicing” and “processing” refer to the modification of a pre-mRNA following transcription, in which introns are removed and exons are joined. Splicing occurs in a series of reactions that are catalyzed by a large RNA-protein complex composed of five small nuclear ribonucleoproteins (snRNPs) referred to as a spliceosome. Within an intron, a 3' splice site, a 5' splice site, and a branch site are required for splicing. The RNA components of snRNPs interact with the intron and may be involved in catalysis.

As used herein, “alternative splicing” refers to the splicing of different combinations of exons present in a gene, which results in the generation of different mRNA transcripts from a single gene.

A “re-spliced target protein,” as used herein, refers to the protein encoded by the mRNA resulting from the splicing of the target pre-mRNA to which the AC hybridizes. Re-spliced target protein may be identical to a wild type target protein, may be homologous to a wild type target protein, may be a functional variant of a wild type target protein, or may be an active fragment of a wild type target protein.

All publications, patents and patent applications mentioned in the specification are indicative of the level of skill of those skilled in the art to which this invention pertains. All publications, patents and patent applications are herein incorporated by reference to the same extent as if each individual publication or patent application was specifically and individually indicated to be incorporated by reference.

EXAMPLES

Example 1. Assessment of PMOs and PMO-EEV compounds for impacts on RNA foci formation and splicing rescue in DM1-related cell lines

The effect of a (CUG)7 repeat PMO and a (CUG)7 repeat PMO-EEV compound (A and D in Table 12) on RNA foci formation and splicing rescue was evaluated in DM1-related cell lines. A PMO-EEV compound having a mismatched PMO sequence and a PMO having a scrambled sequence were also included (B and C in Table 12).

Experimental

Cells. The PMOs and PMO-EEVs were evaluated in a DM1 HeLa cell model (HeLa-480), which is a stable cell line with high CUG repeat load and downstream splicing defects, and in DM1 myoblasts derived from a DM1 patient with about 2600 CTG repeats and downstream splicing defects. HeLa-480 recapitulates pathogenic hallmarks of DM1, including CUG ribonuclear foci and mis-splicing of pre-mRNA targets of the muscleblind (MBNL) alternative splicing factors. It was noted that DM1 myoblasts grow quite slowly (doubling time of about 7 days) and do not transfect well. Control HeLa and HeLa-480 cells were treated with ENDOPORTER without additional compounds. HeLa-480 cells represent the disease state and HeLa cells represent the un-diseased state.

RNA foci analysis. HeLa-480 cells or DM1 myoblasts were treated with 1 μM, 3 μM, or 10 μM of compounds A-D (Table 12). All compounds were transfected either without a transfection reagent or using the ENDOPORTER transfection agent (available from GENETOOLS LLC in Philomath, Oregon), which is designed to deliver naturally charged PMOs into cells. Cells were incubated for 24 hours and then fixed for qualitative RNA foci analysis via microscopy. Upon qualitative visual inspection of the treated HeLa-480 cells, compound A (PMO-EEV) showed the least amount of RNA foci. No conclusion was drawn from the DM1 myoblasts.

Splicing analysis and RT-PCR. HeLa-480 cells treated as described above were harvested after 24 or 48 hours of incubation and the total RNA was extracted. RT-PCR was performed to measure and/or quantify splicing patterns of DM-affected exons in target RNAs such as MBNL1 and CLASP1. The percentage of inclusion of the exon of interest was evaluated.
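The text does not state how the percentage of exon inclusion was computed. A commonly used convention, shown here only as an assumed sketch, divides the signal of the exon-included RT-PCR product by the summed signal of the included and excluded products. The function name and the example intensities are hypothetical.

```python
def percent_exon_inclusion(inclusion_signal: float,
                           exclusion_signal: float) -> float:
    """Percent inclusion = 100 * included / (included + excluded).

    Assumption: signals are background-corrected band or peak intensities
    for the exon-included and exon-excluded RT-PCR products of one sample.
    """
    if inclusion_signal < 0 or exclusion_signal < 0:
        raise ValueError("signals must be non-negative")
    total = inclusion_signal + exclusion_signal
    if total == 0:
        raise ValueError("no RT-PCR signal for either product")
    return 100.0 * inclusion_signal / total


# Hypothetical intensities for one lane.
print(percent_exon_inclusion(30.0, 70.0))  # 30.0
```

With this convention, a rescue that decreases inclusion (as described for MBNL1 exon 5) lowers the value, and a rescue that increases inclusion (as described for CLASP1 exon 19) raises it.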

Table 12: PMOs and PMO-EEVs tested in Example 1.

ID	Sequence (all PMO)	Purity (%)	Solvent
A	CAGCAGCAGCAGCAGCAGCAG-click-K-PEG12-K(FfΦRrRrQ)-PKKKRKV-Ac (SEQ ID NO: 154)-click-K-PEG12-(SEQ ID NO: 78)-(SEQ ID NO: 42)-Ac	91	Saline (0.9% NaCl)
B	GTAACTGTATTTGGTACTTCC-C3-NH2-PEG4-COT-PEG12-PKKKRKV-(FfΦRrRrQ) (SEQ ID NO: 317)-C3-NH2-PEG4-COT-PEG12-(SEQ ID NO: 78)-(SEQ ID NO: 42)-Ac	99	PBS (1x)
C	AGCCAGAGCACCGCAACCGGACGAG (SEQ ID NO: 318)	81	Saline (0.9% NaCl)
D	CAGCAGCAGCAGCAGCAGCAG (SEQ ID NO: 154)	75	Saline (0.9% NaCl)

Results. FIG. 6A-6D show the RT-PCR results of the alternative splicing events of MBNL1 (6A and 6C, exon 5 inclusion) and CLASP1 (6B and 6D, exon 19 inclusion) 24 hours (6A and 6B) and 48 hours (6C and 6D) after HeLa-480 cells were treated with compounds A-D and the ENDOPORTER transfection agent. A reduction of exon 5 inclusion in MBNL1 was observed for cells treated with each of compounds A-D at both the 24- and 48-hour time points (FIG. 6A and 6C). Cells treated with compound A showed the largest rescue (decrease in exon 5 inclusion). An increase in exon 19 inclusion in CLASP1 was observed for cells treated with each of compounds A-D at both the 24- and 48-hour time points (FIG. 6B and 6D). Cells treated with compound A showed the largest rescue (increase in exon 19 inclusion). In contrast, when HeLa-480 cells were treated with A-D without the ENDOPORTER transfection reagent, no change in the splicing events of MBNL1 (FIG. 6E) or CLASP1 (FIG. 6F) was observed, except for compound A.

FIG. 7A-7B show the RT-PCR results of alternative splicing events of MBNL1 and CLASP1 after DM1 myoblasts were treated with compounds A-D, the negative control DM-04, or the positive control DM-05. Treatment with all the compounds resulted in the rescue of the splicing events of MBNL1 (FIG. 7A) and CLASP1 (FIG. 7B).

Example 2. Assessment of EEV-PMO 221-1106 for impacts on splicing rescue in DM1-mouse model

The effect of a PMO only and a PMO-EEV (Table 13) on splicing rescue was evaluated in vivo using an HSA-LR DM1-mouse model.

Experimental.

Mouse Model. HSA-LR is a transgenic mouse model having expanded long CUG repeats (LR) in the 3'-UTR of a human skeletal actin (HSA) transgene and expresses CUGexp RNA (e.g., expanded CUG RNA) at high levels in skeletal muscle (Mankodi et al., Science 2000, 289(5485):1769-1773). The HSA-LR mouse shows a myotonic phenotype along with splicing defects. A Friend Virus B NIH Jackson (FVB/NJ) mouse model was used as a control and to make the HSA-LR transgenic mice.

Experimental design. Compounds A-D were administered to the mice via retro-orbital injection or intravenous (IV) injection at a single dose of 100 µL compound solution per 20 g body weight. The volume was scaled in proportion to the body weight of each mouse (e.g., 150 µL per 30 g body weight).
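The volume scaling described above (reading the volumes as microliters), together with the stock concentrations listed in Table 13, fixes the administered dose per unit body weight. The following is a minimal Python sketch of that arithmetic; the function names are hypothetical and are not part of the original protocol.

```python
def injection_volume_ul(body_weight_g: float, ul_per_20_g: float = 100.0) -> float:
    """Injection volume in microliters, scaled linearly with body weight."""
    if body_weight_g <= 0:
        raise ValueError("body weight must be positive")
    return ul_per_20_g * body_weight_g / 20.0


def dose_mg_per_kg(conc_mg_per_ml: float, body_weight_g: float) -> float:
    """Dose in mg/kg (mpk) delivered by the scaled volume of a stock solution."""
    volume_ml = injection_volume_ul(body_weight_g) / 1000.0
    body_weight_kg = body_weight_g / 1000.0
    return conc_mg_per_ml * volume_ml / body_weight_kg


if __name__ == "__main__":
    print(injection_volume_ul(30.0))   # volume for a 30 g mouse
    print(dose_mg_per_kg(4.0, 20.0))   # 4 mg/mL stock, 20 g mouse
    print(dose_mg_per_kg(8.0, 25.0))   # 8 mg/mL stock, 25 g mouse
```

Under this scaling the dose does not depend on body weight: a 4 mg/mL solution corresponds to about 20 mg/kg and an 8 mg/mL solution to about 40 mg/kg, which agrees with the 20 mpk and 40 mpk labels given for compounds (C) and (D) in Table 13.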

Table 13: PMOs and PMO-EEVs tested in Example 2.

ID	Compound	Sequence (all PMOs)	Conc. (mg/mL)	Solvent
(A)	Saline	Vehicle control	NA	Saline (0.9% NaCl)
(B)	221	5’-CAG CAG CAG CAG CAG CAG CAG-3’ (SEQ ID NO: 154)	6	Saline (0.9% NaCl)
(C)	221-1106 at 20 mpk	5’-CAG CAG CAG CAG CAG CAG CAG-3’-PEG4COT-click-K-miniPEG2-Lys(FfFGRGRE)-miniPEG2-VKRKKKP-Ac (SEQ ID NO: 154)-PEG4COT-click-K-miniPEG2-K(SEQ ID NO: 76)-miniPEG2-(SEQ ID NO: 42)-Ac	4	Saline (0.9% NaCl)
(D)	221-1106 at 40 mpk	5’-CAG CAG CAG CAG CAG CAG CAG-3’-PEG4COT-click-K-miniPEG2-Lys(FfFGRGRE)-miniPEG2-VKRKKKP-Ac (SEQ ID NO: 154)-PEG4COT-click-K-miniPEG2-K(SEQ ID NO: 76)-miniPEG2-(SEQ ID NO: 42)-Ac	8	Saline (0.9% NaCl)


Animals were age matched and assigned into six treatment groups. Two control groups were used: Group 1, FVB/NJ mice (un-diseased control), and Group 2, HSA-LR mice (diseased control), each of which was injected with saline. Four treatment groups (Groups A-D) were used: HSA-LR mice injected with compounds A, B, C, or D. The FVB/NJ mice were 5 weeks and 4 days old when injected, while the HSA-LR mice were 6 weeks and 1 or 2 days old when injected. Four mice (two males and two females) per group were utilized for this experiment. Mice were sacrificed 1 week post treatment. Tissues (gastrocnemius, quadriceps, and tibialis anterior (TA)) were harvested, flash frozen in liquid nitrogen, and stored at -80°C for further evaluation of splicing rescue.

Total RNA was extracted from tissue samples and analyzed by RT-PCR to assess AC-induced alternative RNA splicing rescue events on (i) Atp2a1 exon 22, (ii) Nfix exon 7, (iii) Clcn1 exon 7a, and (iv) Mbnl1 exon 5. The percentage inclusion of each exon of interest was evaluated.

Results. Prior to treatment all HSA-LR mice had myotonia. After injection of compounds, all mice were disoriented. Mice injected with compounds A and B (Groups A and B) recovered within 15 minutes, while mice injected with compounds C and D (Groups C and D) took several hours to recover. All treated mice completely recovered by the next day. At time of sacrifice, Groups A and B had myotonia; however, Groups C and D clearly did not have myotonia.

FIGs. 8A-10D show the RNA splicing measurements for Atp2a1 (for exon 22 inclusion; FIGs. 8A, 9A, and 10A), Nfix (for exon 7 inclusion; FIGs. 8B, 9B, and 10B), Clcn1 (for exon 7a inclusion; FIGs. 8C, 9C, and 10C), and Mbnl1 (for exon 5 inclusion; FIGs. 8D, 9D, and 10D) in gastrocnemius (FIGs. 8A-8D), quadriceps (FIGs. 9A-9D), and tibialis anterior (FIGs. 10A-10D) muscle tissue of treated mice. Mice treated with compounds C and D (PMO-EEV) showed a rescue of Atp2a1 and Nfix splicing events in gastrocnemius, quadriceps, and tibialis anterior tissue, while the PMO and saline groups did not show rescue of splicing events (FIGs. 8A, 8B, 9A, 9B, 10A, and 10B). Similarly, in the gastrocnemius and quadriceps muscle tissue, mice treated with compounds C and D showed a rescue of Clcn1 (FIGs. 8C and 9C) and Mbnl1 (FIGs. 8D and 9D) splicing events, while the PMO and saline groups did not show rescue of splicing events. Regarding the splicing of Clcn1 and Mbnl1 in the tibialis anterior tissue (FIGs. 10C-10D), no alternative splicing defects were detected in the control mouse line (FVB/NJ) and the DM1 mouse model (HSA-LR). As such, treatment with compounds A-D did not result in splicing rescue in the Clcn1 and Mbnl1 genes in the tibialis anterior tissue.


These results demonstrate the positive impact of PMO-EEV treatment on splicing rescue in vivo in a DM1 mouse model, as well as the potential use of PMO-EEV compounds to treat myotonic dystrophy (DM).

Example 3. Assessment of various PMO-EEV compounds for correcting mis-splicing events in immortalized myoblasts from DM1 patients

The effect of two DMPK CUG-targeting PMO-EEVs (197-777 and 221-1106; Table 14) on splicing rescue of DM1-related genes was evaluated in vitro using DM1 patient derived muscle myoblasts and myotubes.

Experimental.

Cell culture. Immortalized myoblasts from DM1 patients (ASA308DM1) and unaffected individuals (KM1421; AB1190) were obtained. DM1 patient myoblasts harbor 2600 CTG repeats in the 3'-UTR of DMPK. Myoblasts were cultured in a growth medium of Skeletal Muscle Cell Growth Medium (available from PromoCell in Heidelberg, Germany), 2% horse serum (available from Gibco in Bristol, RI), 1% chick embryo extract (available from USB Corp in Cleveland, OH), and 0.5 mg/mL penicillin/streptomycin (Gibco). For myogenic differentiation, confluent cultures were switched to a differentiation medium of DMEM supplemented with 2% horse serum and cultured for four days.

Table 14: Compounds tested in Example 3

Compound ID	EEV Sequence (N to C)	PMO sequence (5’-3’)	PMO modifications	Conjugation Chemistry
PMO only	NA	CAGCAGCAGCAGCAGCAGCAG (SEQ ID NO: 154)	5’-OH	NA
197-777 (DM1-1)	Ac-PKKKRKV-Lys(cyclo[Ff-Nal-RrRrQ])-PEG12-K(N3)-NH2 Ac-(SEQ ID NO: 42)-Lys(SEQ ID NO: 78)-PEG12-K(N3)-NH2	CAGCAGCAGCAGCAGCAGCAG (SEQ ID NO: 154)	5’-sarcosine amide; 3’-C4cyclooctyne	Click
221-1106 (DM1-2)	Ac-PKKKRKV-miniPEG2-Lys(cyclo[FfFGRGRQ])-miniPEG2-K(N3)-NH2 Ac-(SEQ ID NO: 42)-miniPEG2-Lys(SEQ ID NO: 76)-miniPEG2-K(N3)-NH2	CAGCAGCAGCAGCAGCAGCAG (SEQ ID NO: 154)	5’-OH; 3’-secondary amine morpholino	Click with a PEG4COT linker

Treatment. DM1 patient muscle cells were treated with 10 µM, 3 µM, 1 µM, or 0.3 µM of the compounds using two different treatment conditions. In the first condition, myoblasts were plated at 75-80% confluence, the compounds were serially diluted in growth medium, and cells were bathed for 24 hours to allow for free uptake of compound. The compound-containing media was removed, and the myoblasts were washed with 1X DPBS (Gibco) and differentiated for four days prior to harvest. For the second condition, run in parallel, myoblasts were differentiated three days prior to treatment, compounds were serially diluted in differentiation medium, and myotubes were harvested for analysis 24 hours later.

RNA isolation and PCR. Total RNA was isolated with the RNEASY Mini Kit (available from Qiagen in Germantown, MD) according to the manufacturer's instructions. For exon inclusion, 100 ng RNA was reverse transcribed and used for PCR (OneStep RT-PCR Kit, Qiagen). Samples were analyzed by LabChip (available from PerkinElmer in Waltham, MA) with the HT DNA High Sensitivity Assay Kit.

Results. DMPK CUG-targeting PMO (not conjugated with EEV) improved mis-splicing of MBNL1 exon 5 (data not shown). FIG. 11A-11F show mixed rescue of the splicing defect of MBNL1 (FIG. 11A) and its targets (SOS1, IR, DMD, BIN1, LDB3; FIG. 11B-11F) in DM1 patient derived muscle cells treated with various concentrations of DMPK CUG-targeting EEV-PMOs (CUGexp 197-777 and CUGexp 221-1106). EEV-PMO 197-777 elicited moderate correction of mis-splicing events in DM1 patient muscle cells. MBNL1 and SOS1 showed the best response of mis-splicing correction. EEV-PMO 197-777 was chosen as the tool compound for the follow-up experiments described below.

DM1 patient derived myoblasts and myotubes were treated with 10 µM, 3 µM, and 1 µM of DMPK CUG-targeting EEV-PMO 197-777 using methods similar to those described above. Rescue of alternative RNA splicing events of MBNL1 and MBNL1 targets was evaluated. Various extents of splicing correction were observed for MBNL1 (exon 5 exclusion; FIG. 12A), SOS1 (exon 25 inclusion; FIG. 12B), INSR (exon 11 inclusion; FIG. 12C), DMD (exon 78 inclusion; FIG. 12D), BIN1 (exon 11 inclusion; FIG. 12E), and LDB3 (exon 11 exclusion; FIG. 12F) after myoblasts and myotubes were treated with EEV-PMO.


FIGS. 44A-44D show the reversal of myotonia phenotypes in HSA-LR mice treated with 20 mpk 221-1106 quantified by muscle relaxation assay. FIG. 44A and 44C show plots of relaxation time to 80% of peak isometric force and FIG. 44B shows the force trace raw data. FIG. 44D shows the reversal of myotonia phenotypes in HSA-LR mice treated with 20 mpk 221-1106 quantified by representative electromyography (EMG) traces.

Example 4. Evaluation of PMO-EEV 221-1120 in a DM1 mouse model

A DM1 mouse model study was conducted to evaluate the effect of EEV-PMO 221-1120 (also referred to as EEV-PMO-DM1-3 or DM1-3; PMO 221 = 5’-CAG CAG CAG CAG CAG CAG CAG-3’ (SEQ ID NO: 154; all PMO monomers); EEV 1120 = Ac-PKKKRKV-AEEA-Lys(cyclo[FGFGRGRQ])-PEG12-OH (Ac-(SEQ ID NO: 42)-AEEA-Lys(SEQ ID NO: 82)-PEG12-OH)) on the splicing and mRNA levels of downstream genes in the same HSA-LR transgenic mouse model as described in Example 2. The PMO and EEV were conjugated using amide chemistry.

Experimental. There were two general treatment groups: 1) wild-type mice; and 2) HSA-LR mice (DM1 disease model). Within the HSA-LR treatment group there were two sub-treatment groups: 1) HSA-LR treated with saline (control); and 2) HSA-LR + EEV-PMO 221-1120. Mice were treated with 15 mpk, 30 mpk, 60 mpk, or 90 mpk (based on the PMO) of the PMO-EEV or saline via intravenous tail vein injection. Seven days after treatment, mice were sacrificed, and tissues were collected for analysis.

RT-PCR assays (correction of splicing). Tissues were homogenized with an OMNI BEAD MILL HOMOGENIZER and the RNA was extracted with a QIACUBE. RT-PCR assays were performed using a one-step RT-PCR kit (Qiagen) following the manufacturer’s protocols with 35 PCR cycles of 94° C for 30 seconds, 60° C for 30 seconds, and 72° C for 30 seconds. The sequences of the gene-specific primers are as follows: Clcn1 exon 7a inclusion forward primer = 5’-TTCACATCGCCAGCATCTGTGC-3’ (SEQ ID NO: 319), reverse primer = 5’-CACGGAACACAAAGGCACTGAATGT-3’ (SEQ ID NO: 320); Mbnl1 exon 5 inclusion forward primer = 5’-GCTGCCCAATACCAGGTCAAC-3’ (SEQ ID NO: 321), reverse primer = 5’-TGGTGGGAGAAATGCTGTATGC-3’ (SEQ ID NO: 322); Atp2a1 exon 22 inclusion forward primer = 5’-GCTCATGGTCCTCAAGATCTCAC-3’ (SEQ ID NO: 323), reverse primer = 5’-GGGTCAGTGCCTCAGCTTTG-3’ (SEQ ID NO: 324); and Nfix exon 7 inclusion forward primer = 5’-TCGACGACAGTGAGATGGAG-3’ (SEQ ID NO: 325), reverse primer = 5’-CAAACTCCTTCAGCGAGTCC-3’ (SEQ ID NO: 326). Primers for Clcn1, Mbnl1, and Atp2a1 were from Klein et al., The Journal of Clinical Investigation. 2019, 129(11), pg. 4739; and primers for Nfix were from Chen et al., Scientific Reports. 2016, 6(1), pg. 1. The cDNA products were separated on a 2% agarose E-gel with SYBR SAFE dye. The percentage exon inclusion was calculated as the ratio of the un-skipped band/(un-skipped band + skipped band).
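The band-ratio calculation for percentage exon inclusion can be sketched as follows. This is a minimal illustration only: the function name and the example intensities are hypothetical, and how band intensities were quantified from the gel is not described in this section.

```python
def percent_exon_inclusion(unskipped_band: float, skipped_band: float) -> float:
    """Percentage exon inclusion = un-skipped / (un-skipped + skipped) * 100."""
    if unskipped_band < 0 or skipped_band < 0:
        raise ValueError("band intensities must be non-negative")
    total = unskipped_band + skipped_band
    if total == 0:
        raise ValueError("at least one band must have a non-zero intensity")
    return 100.0 * unskipped_band / total


if __name__ == "__main__":
    # Hypothetical band intensities in arbitrary densitometry units.
    print(percent_exon_inclusion(30.0, 10.0))
```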

Mouse DM1 splicing index (mDSI) calculation. The mDSI was calculated following the literature protocol from Tanner et al. (Nucleic Acids Research. 2021, 49(4), pg. 2240-54). For each sample i, normalized splicing values were calculated for each splice event j as (PSI_{i,j} - PSI_{wildtype,j})/(PSI_{HSALR,j} - PSI_{wildtype,j}), where PSI_{wildtype,j} is the average PSI for event j across the wildtype mice, and PSI_{HSALR,j} is the average PSI for event j across the HSALR mice. The mDSI is then calculated as the mean of all normalized splicing values, which in these studies are those for Atp2a1, Nfix, Mbnl1, and Clcn1.
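The normalization and averaging described for the mDSI can be sketched in Python as below. The PSI values are made up purely for illustration, and the function and variable names are hypothetical; only the formula itself comes from the text. On this scale a wildtype-like sample scores 0 and an untreated HSA-LR-like sample scores 1.

```python
from statistics import mean

EVENTS = ("Atp2a1 exon 22", "Nfix exon 7", "Mbnl1 exon 5", "Clcn1 exon 7a")


def normalized_splicing(psi: float, psi_wildtype: float, psi_hsalr: float) -> float:
    """(PSI_ij - PSI_wildtype_j) / (PSI_HSALR_j - PSI_wildtype_j) for one event."""
    span = psi_hsalr - psi_wildtype
    if span == 0:
        raise ValueError("wildtype and HSA-LR means are identical for this event")
    return (psi - psi_wildtype) / span


def mdsi(sample_psi: dict, wildtype_mean: dict, hsalr_mean: dict) -> float:
    """Mean of the normalized splicing values over all events."""
    return mean(
        normalized_splicing(sample_psi[e], wildtype_mean[e], hsalr_mean[e])
        for e in EVENTS
    )


# Hypothetical group-mean PSI values (not taken from the study).
WT = {"Atp2a1 exon 22": 0.95, "Nfix exon 7": 0.10,
      "Mbnl1 exon 5": 0.05, "Clcn1 exon 7a": 0.05}
HSALR = {"Atp2a1 exon 22": 0.30, "Nfix exon 7": 0.70,
         "Mbnl1 exon 5": 0.45, "Clcn1 exon 7a": 0.40}

# A hypothetical treated sample lying halfway between the two group means.
treated = {e: (WT[e] + HSALR[e]) / 2 for e in EVENTS}

if __name__ == "__main__":
    print(mdsi(treated, WT, HSALR))
```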

qRT-PCR assays (HSA mRNA knockdown). Reverse transcription was performed using the High-Capacity cDNA Reverse Transcription Kit from Life Technologies Corporation following the manufacturer’s protocol. Quantitative real-time PCR was performed using Bio-Rad SYBR Green Supermix and a QuantStudio3 qPCR machine with gene-specific primers: HSA mRNA forward primer = 5’-TTCCATCGTCCACCGCAAAT-3’ (SEQ ID NO: 327), reverse primer = 5’-AGTTTACGATGGCAGCAACG-3’ (SEQ ID NO: 328), both primers from Klein et al., The Journal of Clinical Investigation. 2019, 129(11), pg. 4739; and mouse GAPDH forward primer = 5’-AGGTCGGTGTGAACGGATTTG-3’ (SEQ ID NO: 329), reverse primer = 5’-TGTAGACCATGTAGTTGAGGTCA-3’ (SEQ ID NO: 330).

RNAseq. PolyA RNAseq using Next Generation Sequencing was done for transcriptome profiling. The Z-score for each gene was calculated as (sample value - mean)/(standard deviation). Differential splicing analysis was done on the RNAseq data to calculate the percent spliced in (PSI) of individual exons for each gene. The PSI is the ratio of normalized read counts indicating the inclusion of a transcript element over the total normalized reads for that event (inclusion and exclusion reads). For example, if an exon is included in the reads 100% of the time, the PSI is 1. Conversely, if an exon is excluded from the reads 100% of the time, the PSI is 0.
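The two quantities defined above can be written out as a short Python sketch. The function names and the example numbers are hypothetical. The text does not say whether the sample or the population standard deviation was used for the Z-score; the sample standard deviation is assumed here.

```python
from statistics import mean, stdev


def psi(inclusion_reads: float, exclusion_reads: float) -> float:
    """PSI = inclusion reads / (inclusion reads + exclusion reads)."""
    total = inclusion_reads + exclusion_reads
    if total <= 0:
        raise ValueError("no reads cover this splicing event")
    return inclusion_reads / total


def z_scores(values: list) -> list:
    """(sample value - mean) / (standard deviation) for one gene across samples.

    Assumption: sample standard deviation (n - 1 denominator).
    """
    m = mean(values)
    s = stdev(values)
    if s == 0:
        raise ValueError("all values are identical; Z-score is undefined")
    return [(v - m) / s for v in values]


if __name__ == "__main__":
    print(psi(100, 0))                 # exon always included
    print(psi(0, 100))                 # exon always excluded
    print(z_scores([1.0, 2.0, 3.0]))   # hypothetical expression values
```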

Twenty-two genes of interest known to be predictive of DM1 were analyzed. Additionally, the genes studied in Wagner et al. (PLOS Gen 2016 (47)) and Tanner et al. (NAR 2021 (48), 4, 2240-2254) were analyzed. The mouse exons were mapped to the human location. In some cases, the boundary of the exons in the mouse and/or the human genome was not completely known. As such, the data was analyzed using different boundaries. The correct boundaries were verified using the RNAseq data.

RNA CUG Foci analysis. Tibialis anterior muscle sections were stained for CUG foci (FISH, red) and nuclei (Hoechst, blue). TA muscle sections were imaged and the number of nuclei having CUG RNA foci was quantified.

Results

HSA mRNA knockdown. The diaphragm only expressed 5-10% of the HSA mRNA levels compared to quadriceps, tibialis anterior, and triceps (FIG. 13A). EEV-PMO treatment did not seem to change the HSA mRNA levels in the diaphragm (FIG. 13B). The expression level of the HSA 220 CUG repeats may not be sufficient for the mis-splicing phenotype in DM1 in the diaphragm.

The EEV-PMO knocked down HSA mRNA in a dose-dependent manner, confirming target engagement in quadriceps (FIG. 14A), gastrocnemius (FIG. 14B), triceps (FIG. 14C), and tibialis anterior (FIG. 14D) tissue. Additionally, the Ct (cycle threshold) value of HSA mRNA is similar to the level of mouse GAPDH (~15), suggesting high expression of the HSA transgene in HSA-LR mice in the quadriceps.

mDSI (correction of splicing). The mouse DM1 splicing index (mDSI) for the quadriceps, the gastrocnemius, the triceps, and the tibialis anterior is shown in FIG. 15A-15D. Treatment with EEV-PMO corrected DM1-relevant splicing defects (Atp2a1 exon 22, Nfix exon 7, Clcn1 exon 7a, Mbnl1 exon 5) at 1 week post injection in the quadriceps (FIG. 15A), gastrocnemius (FIG. 15B), triceps (FIG. 15C), and tibialis anterior (FIG. 15D) in a dose dependent manner, with higher doses approaching or equivalent to wild type (full correction). Approximately 50%-60% human skeletal actin RNA knockdown in HSA-LR mice was achieved at drug concentrations that achieve near complete splicing correction.

FIGS. 16A-B show images of tibialis anterior tissue of HSA-LR mice (FIG. 16A) and HSA-LR mice treated with EEV-PMO (FIG. 16B) stained for CUG foci (red) and nuclei (blue). Qualitative and quantitative assessment (FIG. 16C) showed that EEV-PMO treatment reduced the number of nuclei that had CUG foci.


Drug Exposure. Drug exposure was studied using LC-MS. FIG. 17A-17D show a dose dependent response for PMO-EEV exposure in the quadriceps (FIG. 17A), triceps (FIG. 17B), heart (FIG. 17C), gastrocnemius (FIG. 17D), tibialis anterior (TA; FIG. 17F), liver (FIG. 17I), and kidney (FIG. 17J). No dose-dependent response was observed in the diaphragm (FIG. 17G). The EEV-PMO was not detected in the brain except at the 60 mpk and 90 mpk dosage levels. FIG. 17K shows drug exposure of various tissues at the 60 mpk dosage level.

Myotonia Response: A dose dependent myotonia reduction in HSA-LR mice 7 days after treatment with EEV-PMO-DM1-3 at 15, 30, 60 and 90 mpk was observed (FIG. 18A). Myotonia is likely ameliorated one week after treatment with EEV-PMO-DM1-3. HSA-LR mice treated with a single dose of 90 mpk EEV-PMO-DM1-3 did not exhibit obvious signs of hind limb myotonia after induction.

RNAseq Data Analysis. FIG. 19A-19D show the results of a principal component analysis. Principal component analysis can be used to reveal the similarity between samples based on the distance matrix. This type of plot is useful for visualizing the overall effect of experimental covariates and batch effects. The x-axis is the direction that explains the most variance and the y-axis is the second most. The percentage of the total variance per direction is shown as PCA. The wild type and HSA-LR mice are in distinct groups. Gene expression in the gastrocnemius muscle of HSA-LR mice treated with PMO-EEV was shifted toward that of wild type mice.
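To make the idea of "the direction that explains the most variance" concrete, the following pure-Python sketch finds the first principal component of a small data matrix by power iteration on the sample covariance matrix. It is an illustration under stated assumptions, not the analysis pipeline used for FIG. 19A-19D: the toy data and the function name are hypothetical, and a real RNAseq analysis would normally rely on an established numerical library rather than this hand-rolled loop.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (direction, explained_fraction, scores) for the first PC.

    rows: list of samples, each a list of per-gene values.
    Uses power iteration on the sample covariance matrix.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]

    # Power iteration: repeatedly apply the covariance matrix and renormalize.
    # Assumes the start vector is not orthogonal to the leading eigenvector.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]

    cov_v = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
    eigenvalue = sum(v[a] * cov_v[a] for a in range(d))
    total_variance = sum(cov[a][a] for a in range(d))
    explained = eigenvalue / total_variance if total_variance else 0.0
    # Projection of each centered sample onto the first PC (the x-axis value).
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, explained, scores


# Hypothetical toy data: four samples, two "genes" that vary together.
toy = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
direction, explained, scores = first_principal_component(toy)
```

In this toy case all variance lies along a single line, so the first component accounts for essentially all of it; in real expression data the explained fraction per axis is what is usually printed on the axis labels of such a plot.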

FIG. 20A is a heatmap showing differentially expressed genes (by Z-score) from three treatment groups: 1) WT mice; 2) HSA-LR mice; and 3) HSA-LR + EEV-PMO 221-1120 (60 mpk). A total of 956 (p < 0.5) genes were differentially expressed between the treatment groups, indicating a difference between the wild type mice (WT) and the disease model mice (HSA-LR). Treatment with EEV-PMO (HSA-LR (+,+)) resulted in global gene expression correction, shifting away from a disease profile (in red, HSA-LR (-,-)) and toward that of wild type mice (WT above).

FIG. 20B is a heatmap used to visualize the expression profile of 40 of the 43 genes found to have more than 7 CTG repeats from a BLAST analysis. Three of the CTG repeat genes found in the BLAST analysis (Crb2, Hsd3b6, and Inhbe) were not included due to low read numbers. This analysis is useful to identify co-regulated genes across the treatment conditions.

FIG. 21 shows a volcano plot of the global transcriptional change across the EEV-PMO treated group and the HSA-LR group. Each data point in the scatter plot represents a gene. The fold change of each gene is represented on the x-axis and the log10 of its adjusted p-value is on the y-axis. Genes with an adjusted p-value less than 0.05 and a fold change greater than 2 are indicated by red dots. These represent up-regulated genes. Genes with an adjusted p-value less than 0.05 and a fold change less than -2 are indicated by blue dots. These represent down-regulated genes. Three genes were found to be significantly downregulated (Txlnb, Scube2 and Greb1) and one gene was significantly upregulated (Txlnb). Most transcripts containing at least (CUG)7 were not significantly influenced. PCA analysis of these genes showed that Scube2 (FIG. 22A), Greb1 (FIG. 22B), Ttc7 (FIG. 22C), Txlnb(CUG)9, and Ndrg3 (FIG. 22E) showed correction when treated with EEV-PMO. Txlnb is overcorrected by treatment (FIG. 22D).
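The colour-coding thresholds described for the volcano plot amount to a simple classification rule, sketched below in Python. The gene names and values are hypothetical placeholders, and the function names are illustrative only. The y-coordinate follows the text (log10 of the adjusted p-value); many volcano plots use the negative of this quantity, and which sign FIG. 21 uses is not confirmed here.

```python
import math


def classify_gene(fold_change, adj_p, fc_cutoff=2.0, p_cutoff=0.05):
    """Label a gene using the thresholds described for the volcano plot."""
    if adj_p < p_cutoff and fold_change > fc_cutoff:
        return "up"      # drawn as a red dot in the text
    if adj_p < p_cutoff and fold_change < -fc_cutoff:
        return "down"    # drawn as a blue dot in the text
    return "not significant"


def y_coordinate(adj_p):
    """log10 of the adjusted p-value, as the text states (sign is an assumption)."""
    return math.log10(adj_p)


# Hypothetical (fold change, adjusted p-value) pairs, for illustration only.
examples = {"gene_a": (3.1, 0.001), "gene_b": (-2.5, 0.01), "gene_c": (5.0, 0.2)}
labels = {name: classify_gene(fc, p) for name, (fc, p) in examples.items()}
```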

FIGS. 23A-23D show the transcriptome data for various genes and various treatment groups. The HSA-LR + EEV-PMO 221-1120 treatment group showed correction of the inclusion of exon 22 of Atp2a1 (FIG. 23A), exclusion of exon 7 of Clcn1 (FIG. 23B), exclusion of exon 7 of Nfix (FIG. 23C), and the exclusion of exon 7 of Mbnl1 (FIG. 23D).

FIG. 24 shows the percent spliced in (PSI) of individual exons for various genes. The genes are MBNL1-responsive splicing biomarkers (e.g., downstream genes). The MBNL1-dependent biomarkers were selected based on the dynamic range between the wildtype and disease groups as described in the literature. The HSA-LR + EEV-PMO 221-1120 treatment group showed correction of exon inclusion/exclusion for all of the 20 genes of interest, including Mbnl1, Nfix, Atp2a1, Ldb3, Camk2g, Trim55, Fbox31, Slc8a3, Map3k4, Dctn4, Cacna1s, Ryr1, Slain2, Phka1, Ppp3cc, Ttn, Neb, lnfip2, Rapgef1, and Vsp39.

Example 5. Evaluation of PMO-EEV 221-1120 in a second DM1 mouse model

PMO-EEV DM1-3 (221-1120; see Example 4 for the sequence) was evaluated in a second DM1 mouse model using methods similar to those described in Example 4.

Experimental. Seven-week-old HSA-LR mice were administered intravenously either a single 80 mpk dose of EEV-PMO-DM1-3 or a 20 mpk dose of EEV-PMO-DM1-3 every other week for six weeks (total of 80 mpk over 4 doses), and tissues were harvested 1 week to 12 weeks after the single dose or two weeks after the final dose. RT-PCR was used to determine alternative splicing for specific genes (Atp2a1, Clcn1, Nfix, MBNL1). Q-PCR was used to determine the reduction of the mRNA level of actin-HSA after treatment. LC-mass was used to determine the drug level in quadriceps, gastrocnemius, tibialis anterior, triceps, diaphragm, heart, kidney, liver, brain, and plasma. Myotonia reduction was recorded 7 days after treatment with the EEV-PMO-DM1-3 compound.

Results. Similar trends to those observed in Example 4 were observed for the rescue of splicing for Atp2a1, Clcn1, Nfix, and MBNL1 in the tibialis anterior, gastrocnemius, triceps, and quadriceps tissues (data not shown). Additionally, similar trends to those observed in Example 4 were observed for the HSA mRNA knockdown in tibialis anterior, gastrocnemius, triceps, and quadriceps tissues both 1 week and 4 weeks post treatment (data not shown).

FIG. 25A-25D are plots showing a decrease in drug level with 80 mpk EEV-PMO-DM1-3 from 1 week to 8 weeks in the tibialis anterior (FIG. 25A), gastrocnemius (FIG. 25B), triceps (FIG. 25C), and quadriceps (FIG. 25D) tissues. EEV-PMO-DM1-3 (60 mpk oligo, 80 mpk whole drug) fully corrected mis-splicing in gastrocnemius, triceps, tibialis anterior, and quadriceps 1 week post treatment. FIG. 26A-26B are plots showing that a decrease in drug levels was observed with the single 80 mpk dose of EEV-PMO-DM1-3 from 1 week to 4 weeks, to 8 weeks, and to 12 weeks in the liver. A relatively higher amount of EEV-PMO-DM1-3 in the liver was observed 2 weeks post the last dose of the 6-week dosing regime when compared to 4 weeks post the single dose regime.

FIG. 26C-26D show that a decrease in drug levels was observed with the single 80 mpk dose of EEV-PMO-DM1-3 from 1 week to 4 weeks, to 8 weeks, and to 12 weeks in the kidney. A relatively low amount of EEV-PMO-DM1-3 in the kidney was also observed 2 weeks post the last dose of the 6-week dosing regime when compared to 4 weeks post the single dose regime. At 12 weeks post the single dose of EEV-PMO-DM1-3, the drug was still present in the kidney but not in the liver.

Subjective myotonia observations were made and are shown in Table 15. The multi-dosing regime (Q2W) showed no signs of myotonia rescue two weeks post treatment. There was a mixed effect on myotonia in mice treated with the single 80 mpk dose that disappeared by 12 weeks post treatment.

Table 15. Myotonia Observations

Group	Predosing	1 week post dose	4-week post dose	8-week post dose	12-week post dose
Gender	F	M	F	M	F	M	F	M	F	M
FVB	0	0	0	0	0	0	0	0	0	0
HSA-LR	++	++	++	++	++	++	++	++	++	++
221-1120 80 mpk	++	++	+	+	+	+	0	+	++	++
221-1120 20 mpk Q2W		++	NA	NA	NA	NA	++	++	NA	NA


F = female; M = male; 0 = no mice displayed myotonia; + = mixed myotonia, some mice displayed reduced myotonia; ++ = all mice displayed myotonia

A similar experiment was performed to evaluate intravenous administration of EEV-PMO-DM1-3 for a longer duration and at a higher dose. Eight-week-old HSA-LR mice were treated with 40 mpk, 60 mpk, 80 mpk, or 120 mpk of EEV-PMO-DM1-3 intravenously and tissues were harvested after 4 weeks to 12 weeks. RT-PCR was used to determine alternative splicing for specific genes (Atp2a1, Clcn1, Nfix, MBNL1). Myotonia reduction was recorded 7 days after treatment with EEV-PMO-DM1-3. Similar trends to those observed in Example 4 were observed for the rescue of splicing for Atp2a1, Clcn1, Nfix, and MBNL1 in the tibialis anterior and gastrocnemius tissues (data not shown).

Subjective myotonia observations were made and are shown in Table 16. Females displayed more myotonia than males. There were no signs of myotonia in either male or female mice dosed with 120 mpk after 8 weeks post treatment.

Table 16. Myotonia Observations

Group	Predosing	1 week post dose	4-week post dose	8-week post dose	12-week post dose
Gender	F	M	F	M	F	M	F	M	F	M
FVB	0	0	0	0	0	0	0	0	0	0
HSA-LR	++	++	++	++	++	++	++	++	++	++
221-1120 40 mpk	++	++	+	++	++	++	NA	NA	NA	NA
221-1120 60 mpk	++	++	+	0	+	+	NA	NA	NA	NA
221-1120 80 mpk	++	++	0	0	+	+	+	0	+	0
221-1120 120 mpk	++	++	0	0	0	0	0	0	0	0

F = female; M = male; 0 = no mice displayed myotonia; + = mixed myotonia, some mice displayed reduced myotonia; ++ = all mice displayed myotonia

Example 6. Treatment of patient-derived DM1 cells with EEV-PMO-DM1-3

PMO-EEV DM1-3 (221-1120; see Example 4 for the sequence) was evaluated in DM1 patient derived myoblasts.

Experimental. Patient myoblasts were treated with 30 micromolar of DM1-3 throughout four days of differentiation. Splicing correction was assessed by one-step RT-PCR and LabChip (plotted mean ± SD; n=4). HCR-FISH and sequestered MBNL1 protein detection assays were used to detect RNA CUG foci. Results: EEV-PMO-DM1-3 promotes significant biomarker splicing correction and a reduction in nuclear foci in DM1 patient-derived muscle cells.


Results. FIG. 27A-27C are plots showing that EEV-PMO-DM1-3 promotes significant biomarker splicing correction (MBNL1, SOS1, and NFIX) in DM1 patient-derived muscle cells. Additionally, treatment with DM1-3 resulted in the reduction of nuclear foci in DM1 patient-derived muscle cells (FIG. 28A-28C).

Example 7. Cytotoxicity Screening of EEV-PMO-DM1-3 in Renal Cells

PMO-EEV DM1-3 (221-1120; see Example 4 for the sequence) was evaluated in human renal cells.

Experimental. Human Primary Renal Proximal Tubular Epithelial Cells (RPTECs) were exposed to varying concentrations (1:2 serial dilution in saline with a final dilution factor of 4x from about 6 µM to about 800 µM) of PMO-DM1 and EEV-PMO-DM1-3 for 24 hours and screened for viability using a CELLTITER-GLO luminescent viability assay. Melittin was used as a positive control at 16.6 µM.
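Reading both endpoints of the stated range as micromolar, the range is consistent with a two-fold series: seven successive halvings take about 800 µM down to about 6 µM. A minimal Python sketch follows; the number of points (eight) is an assumption inferred from the endpoints rather than a value stated in the protocol, and the function name is hypothetical.

```python
def two_fold_series(top_concentration: float, n_points: int) -> list:
    """Concentrations of a 1:2 serial dilution, highest first."""
    if n_points < 1:
        raise ValueError("n_points must be at least 1")
    return [top_concentration / (2 ** k) for k in range(n_points)]


# Assumed eight-point series starting at about 800 (micromolar).
series = two_fold_series(800.0, 8)

if __name__ == "__main__":
    print(series)
```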

Results. FIG. 29A-29B show that neither PMO-DM1 nor its conjugate EEV-PMO-DM1-3 showed any toxicity, even at the highest concentration of 817 µM or 797 µM, respectively.

Example 8. Assessment of PMO-EEV 221-1113 for ability to correct mis-splicing events and downstream splicing in immortalized cells from DM1 patients and HeLa-480 cells

Immortalized DM1 patient-derived (2,600 CUG repeats) muscle cells and HeLa-480 cells (DM1 model cell line, see Example 1) were treated with the EEV-PMO construct 221-1113 and analyzed for correction of aberrant splicing and foci quantification. EEV 1113 is Ac-PKKKRKV-miniPEG-K(cyclo(Ff-Nal-GrGrQ))-PEG12-OH (Ac-(SEQ ID NO: 42)-miniPEG-K(cyclo(SEQ ID NO: 80))-PEG12-OH). EEV-PMO 221-1113 is EEV 1113 conjugated to PMO sequence 221 (5’-CAG CAG CAG CAG CAG CAG CAG-3’ (SEQ ID NO: 154; all PMO monomers)) via amide bond chemistry.

Experimental. Methods similar to those described in Example 1 and Example 3 were used.

Results.

RNA CUG Foci analysis. Cells were stained for nuclei (Hoechst, blue) and for RNA CUG repeat foci (green) and imaged. A reduction in RNA CUG foci was observed between untreated DM1 patient cells and EEV-PMO treated DM1 cells (FIG. 30A-30C). Similarly, a reduction in RNA CUG foci was observed between untreated HeLa-480 cells and EEV-PMO treated HeLa-480 cells (FIG. 31A-31B).

Correction of downstream splicing. PMO-EEV treated DM1 patient derived cells and HeLa-480 cells were analyzed for percent exon 5 inclusion for MBNL1, percent exon 25 inclusion for SOS1, and percent inclusion of exon 7 for NFIX. Treatment with EEV-PMO resulted in a rescue of splicing events for Mbnl1 (FIG. 32A), Sos1 (FIG. 32B), and NFIX (FIG. 32C).

Additionally, HeLa-480 cells treated with EEV-PMO showed a dose dependent correction of MBNL1 splicing (FIG. 33A) and of the downstream mis-splicing of SOS1 (FIG. 33B), CLASP1 (FIG. 33C), NFIX (FIG. 33D), and INSR (FIG. 33E).

Example 9. Evaluation of EEV-PMO 221-1106 in a second DM1 mouse model

A DM1 mouse model study was conducted to evaluate the effect of EEV-PMO 221-1106 on the splicing and mRNA levels of downstream genes.

Experimental. Human skeletal actin long repeat (HSA-LR) transgenic mice were used as the DM1 disease model. Similar methods to those described in Example 5 were used.

Results. FIGS. 34A-34D show a dose dependent correction of the inclusion of exon 22 in Atp2a1 (FIG. 34A), exon 7 in Nfix (FIG. 34B), exon 7a in Clcn1 (FIG. 34C), and Mbnl1 (FIG. 34D) in the gastrocnemius of mice treated with various concentrations of EEV-PMO 221-1106. Treatment with PMO 221 alone did not result in correction of splicing.

Example 10. DM1 mouse model to study effect of different lengths of CUG repeats in PMOs

A DM1 mouse model study was conducted to evaluate the effect of PMO-EEV 221-1121 (PMO has 7 CAG repeats, 21-mer) and PMO-EEV 0325-1121 (PMO has 8 CAG repeats, 24-mer) on the splicing and mRNA levels of downstream genes. PMO-EEV 221-1121 is PMO 221 (5’-CAG CAG CAG CAG CAG CAG CAG-3’; SEQ ID NO: 154; all PMO monomers) conjugated to EEV 1121 (Ac-PKKKRKV-miniPEG2-Lys(cyclo[GfFGrGrQ])-PEG12-OH; Ac-(SEQ ID NO: 42)-miniPEG2-Lys(SEQ ID NO: 74)-PEG12-OH) via amide chemistry. PMO-EEV 0325-1121 is PMO 0325 (5’-CAG CAG CAG CAG CAG CAG CAG CAG-3’; SEQ ID NO: 155; all PMO monomers) conjugated to EEV 1121 via amide chemistry.

Experimental. Human skeletal actin long repeat (HSA-LR) transgenic mice were used as the DM1 disease model. Briefly, HSA-LR mice were dosed with 20 mpk, 40 mpk, or 60 mpk of either 0221-1121 or 0325-1121 via intravenous injection into the tail vein. One week post injection, mice were sacrificed, and tissue was collected. Other experimental methods were similar to those described in Example 5.


Results. 0221-1121 (21-mer) was more effective in correcting exon splicing in Mbnl1 (FIG. 35A), Nfix (FIG. 35B), and Atp2a1 (FIG. 35C) than 0325-1121 (24-mer) in the tibialis anterior tissue. This result was unexpected: the 24-mer was expected to be more effective because it would have a higher hybridization efficiency and a higher thermal melting temperature. In the gastrocnemius tissue, the differences were less pronounced, as shown in FIGS. 36A-36C. Subjective myotonia observations were made using the male mice (Table 17). Mixed myotonia was observed at 1 week post treatment for the 21-mer at 40 mpk, similar to the results in Table 11.

Table 17: Myotonia Observations

Group	Pre-dosing	1 week post dose	4 week post dose
Gender	M	M	M
FVB	0	0	NA
HSA-LR	++	++	NA
221-1120 20 mpk	++	++	NA
221-1120 40 mpk	++	+
325-1120 20 mpk	++	++	NA
325-1120 40 mpk	++	++

M = male; 0 = no mice displayed myotonia; + = mixed myotonia, some mice displayed reduced myotonia; ++ = all mice displayed myotonia

Example 11. Pharmacokinetic studies of the EEV-PMO 221-1120 in CD1 mice

A CD1 mouse model was used to study the plasma, kidney, and tibialis anterior drug exposure (AUC) to the EEV-PMO construct 221-1120 (see Example 4 for the sequence). PMO-0221a, the major metabolite of 221-1120 (see FIG. 37), was also measured.

Experimental. Five- to seven-week-old CD1 mice were treated with 80 mpk of the EEV-PMO construct 221-1120 via intravenous injection. Mice were bled and/or sacrificed at various time points.

Results. Table 18, Table 19, and Table 20 show the pharmacokinetic properties observed in the plasma, kidney, and tibialis anterior, respectively. For the tables: AUClast = area under the curve from zero to the last quantifiable concentration; D = dose; Cmax = maximum serum or plasma concentration; Tmax = time to reach Cmax; CL = total plasma, serum, or blood clearance; t1/2 = elimination half-life; Vss = apparent volume of distribution at equilibrium; Qh = hepatic blood flow (mL/min/kg).

The AUC value for the metabolite is approximately 1000-fold lower in the tibialis anterior than in the kidney. The metabolite mean residence time (MRT) values in plasma may be directly related to tissue MRT values as a result of the metabolite moving from tissues to plasma before urinary excretion.

Table 18: Plasma pharmacokinetic properties

	221-1120	PMO-0221a
AUC_last (nM*hr)	9290	953
AUC_last/D	1121	115
C_max (nM)	16217	12
C_max/D	1956	1.4
T_max (hr)	0.1	24
CL (mL/min/kg)	15	-
Q_h (%)	16
t_1/2 (hr)	19	68
MRT_last (hr)	1.2	60.0
V_ss (mL/kg)	1325	-
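
As a rough consistency check on Table 18, the following sketch back-calculates the dose implied by the AUC_last and AUC_last/D entries for 221-1120 and then estimates clearance as dose divided by AUC. It assumes that D is expressed in µmol/kg (the units of D are not stated in the table) and uses AUC_last as a stand-in for AUC extrapolated to infinity, so the result is only an approximation. The function names are illustrative.

```python
# Plasma values for the parent compound 221-1120, copied from Table 18.
AUC_LAST_NM_HR = 9290.0        # AUC_last, nM*hr
AUC_LAST_PER_DOSE = 1121.0     # AUC_last/D
CL_REPORTED_ML_MIN_KG = 15.0   # CL, mL/min/kg


def implied_dose(auc_last: float, auc_last_per_dose: float) -> float:
    """Back-calculate D from AUC_last and AUC_last/D.

    ASSUMPTION: D is in umol/kg; the table does not give its units.
    """
    return auc_last / auc_last_per_dose


def clearance_ml_min_kg(dose_umol_kg: float, auc_nm_hr: float) -> float:
    """Estimate CL = dose / AUC and convert the result to mL/min/kg.

    (nmol/kg) divided by (nmol*hr/L) gives L/hr/kg.
    ASSUMPTION: AUC_last is used in place of AUC extrapolated to infinity.
    """
    cl_l_hr_kg = (dose_umol_kg * 1000.0) / auc_nm_hr
    return cl_l_hr_kg * 1000.0 / 60.0


dose = implied_dose(AUC_LAST_NM_HR, AUC_LAST_PER_DOSE)
cl_estimate = clearance_ml_min_kg(dose, AUC_LAST_NM_HR)
print(f"implied dose: {dose:.2f} (assumed umol/kg)")
print(f"estimated CL: {cl_estimate:.1f} mL/min/kg "
      f"(Table 18 reports {CL_REPORTED_ML_MIN_KG:g})")
```

Under these assumptions the estimate falls close to the reported CL of 15 mL/min/kg, which suggests the tabulated values are internally consistent.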

Table 19: Kidney pharmacokinetic properties

	221-1120	PMO-0221a
AUC_last (pmol/g*hr)	51609	10113709
AUC_last/D	6225	1219865
C_max (pmol/g)	8754	105053
C_max/D	1056	12671
T_max (hr)	4	24
t_1/2 (hr)	5	87
MRT_last (hr)	8	68

Table 20: Tibialis anterior pharmacokinetic properties

	221-1120	PMO-0221a
AUC_last (pmol/g*hr)	-	9140
AUC_last/D	-	1102
C_max (pmol/g)	34	162
C_max/D	4	20
T_max (hr)	4	24
t_1/2 (hr)	-	48
MRT_last (hr)	-	50
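
The difference in metabolite exposure between kidney and tibialis anterior can be checked directly from the AUC_last entries in Tables 19 and 20. The sketch below is only an arithmetic illustration; the dictionary keys and the function name are hypothetical.

```python
# Metabolite (PMO-0221a) AUC_last values in pmol/g*hr, from Tables 19 and 20.
METABOLITE_AUC_LAST = {
    "kidney": 10113709.0,
    "tibialis_anterior": 9140.0,
}


def fold_difference(high: float, low: float) -> float:
    """Return how many times larger `high` is than `low`."""
    if low <= 0:
        raise ValueError("low must be positive")
    return high / low


kidney_vs_ta = fold_difference(
    METABOLITE_AUC_LAST["kidney"],
    METABOLITE_AUC_LAST["tibialis_anterior"],
)
print(f"kidney / tibialis anterior AUC_last: {kidney_vs_ta:.0f}-fold")
```

The ratio comes out at roughly 1100-fold, in line with the approximately 1000-fold difference stated in the Results.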

Example 12. Evaluation of PMO-EEV 221-1120 in a third DM1 mouse model

A DM1 mouse model study similar to Examples 5 and 9 was conducted to evaluate the effect of various doses of PMO-EEV 221-1120 (see Example 4 for the sequence) in HSA-LR mice.

Experimental. Eight-week-old HSA-LR mice were administered 40, 60, 80, or 120 mpk of PMO-EEV 221-1120 intravenously, and tissues were harvested after 4 to 12 weeks. RT-PCR was used to determine alternative splicing for specific genes (Atp2a1, Clcn1, Nfix, MBNL1). LC-MS was used to determine drug levels in quadriceps, gastrocnemius, tibialis anterior, triceps, diaphragm, heart, kidney, liver, brain, and plasma. RNA-seq was used to determine the change in transcription levels among a treated disease model, an untreated disease model, and wild-type. Q-PCR was used to determine the reduction in the mRNA level of actin-HSA after treatment.

Results. Fluorescence imaging was used to determine RNA foci reduction after treatment with the EEV-oligo compound (data not shown). Myotonia reduction was recorded 7 days after treatment with the EEV-oligo compound (data not shown). The results of these experiments show trends similar to those in the mice treated with PMO-EEV 221-1120 in Example 5 and Example 14.

MBNL1, NFIX, and ATP2A1 splicing correction was observed in the tibialis anterior (FIGS. 38A, 39A, 40A) and gastrocnemius (FIGS. 38B, 39B, 40B) at various doses of the EEV-PMO. MBNL1, NFIX, and ATP2A1 splicing correction was observed in both the tibialis anterior and gastrocnemius 12 weeks post treatment with 120 mpk EEV-PMO.

Example 13. Evaluation of EEV-PMO 221-1120 in HeLa480 cells

HeLa480 cells were treated with various concentrations of EEV-PMO 221-1120 (see Example 4 for the sequence) and analyzed for CUG repeat foci, selective r(CUG) reduction, and downstream splicing correction of MBNL1 and SOS1.

Experimental. HeLa480 cells were constructed as described in earlier Examples. RT-PCR and foci staining were performed as in other Examples described herein.

Results. FIG. 41A shows example images of control (untreated) cells and cells treated with 5 μM, 10 μM, 20 μM, 50 μM, and 100 μM of EEV-PMO 221-1120. There is a reduction in CUG foci (green) in the treated HeLa480 group compared with the untreated HeLa480 cells. FIG. 41B is a plot quantifying the foci per nuclear area. FIGS. 41A-41B show that EEV-PMO 221-1120 can reduce nuclear CUG RNA foci. An almost complete reduction is observed at the 5 μM dose.

FIGS. 42A-42B indicate that EEV-PMO 221-1120 treatment can selectively knock down the repeat expansion-containing DMPK transcript in the HeLa480 cell line.

FIGS. 42C-42D show that treatment with EEV-PMO 221-1120 corrected MBNL1 (FIG. 42C) and SOS1 (FIG. 42D) splicing in a dose-dependent manner.

A number of embodiments of the invention have been described. Nevertheless, it will be understood that various modifications may be made without departing from the spirit and scope of the invention. Accordingly, other embodiments are within the scope of the following claims.

Claims

1. A compound comprising:

(a) an antisense compound that is complementary to at least a portion of an expanded CUG repeat in a target mRNA sequence;

(b) a cyclic peptide of the formula:

or a protonated form thereof, wherein:

R₁, R₂, and R₃ are each independently H or an aromatic or heteroaromatic side chain of an amino acid;

at least one of R₁, R₂, and R₃ is an aromatic or heteroaromatic side chain of an amino acid;

R₄, R₅, R₆, and R₇ are independently H or an amino acid side chain;

AAsc is an amino acid side chain; and

q is 1, 2, 3 or 4,

wherein at least two amino acids of the cyclic peptide are charged amino acids, at least two amino acids of the cyclic peptide are aromatic hydrophobic amino acids, and at least two amino acids of the cyclic peptide are uncharged, non-aromatic amino acids.
2. The compound of claim 1, wherein at least two of R₁, R₂, and R₃ are side chains of aromatic hydrophobic amino acids.
3. The compound of claim 1, wherein at least two of R₁, R₂, and R₃ are independently phenylalanine or naphthylalanine.
4. The compound of claim 1 or claim 2, wherein at least two of R₅, R₆, and R₇ are side chains of charged amino acids.
5. The compound of claim 1 or 2, wherein at least two of R₅, R₆, and R₇ are independently arginine, homoarginine, N-methylarginine, and N,N-dimethylarginine.
6. The compound of any one of claims 1 to 5, wherein q is 1.
7. The compound of any one of claims 1 to 6, wherein R₄ is H.
8. A compound comprising:

(a) an antisense compound that is complementary to at least a portion of an expanded CUG repeat in a target mRNA sequence;

(b) a cyclic peptide of the formula:

R₁, R₂, and R₃ are each independently H or an amino acid residue having a side chain comprising an aromatic group;

at least two of R₁, R₂, and R₃ are the side chain of phenylalanine;

R₄ is H or an amino acid side chain;

AAsc is an amino acid side chain; and

each m is independently an integer of 0, 1, 2, or 3.
9. The compound of claim 8, wherein one of R₁, R₂, and R₃ is naphthylalanine.
10. The compound of claim 8, wherein one of R₁, R₂, and R₃ is H.
11. The compound of any one of claims 8 to 10, wherein R₄ is H.
12. The compound of any one of claims 1 to 11, wherein the cyclic cell penetrating peptide is selected from

[Chemical structures not reproduced in this text version]

or a protonated form thereof.
13. The compound of any one of claims 1 to 12, wherein the compound further comprises an exocyclic peptide.
14. The compound of claim 13, wherein the exocyclic peptide comprises from 2 to 10 amino acids.
15. The compound of claim 13 or 14, wherein the exocyclic peptide comprises 2, 3, or 4 lysine residues.
16. The compound of any one of claims 13 to 15, wherein the exocyclic peptide comprises at least 2 amino acid residues with a hydrophobic side chain.
17. The compound of any one of claims 13 to 15, wherein the exocyclic peptide comprises PGKKRKV.
18. The compound of any one of claims 13 to 15, wherein the exocyclic peptide comprises PKKKRKV.
19. The compound of any one of claims 13 to 15, wherein the exocyclic peptide comprises PKKKGKV.
20. The compound of any one of claims 13 to 15, wherein the exocyclic peptide comprises PKKKRKG.
21. The compound of any one of claims 13 to 20, further comprising a linker, wherein the linker conjugates the antisense compound and the exocyclic peptide to the cyclic peptide.
22. The compound of claim 21, wherein the linker has the following structure:

[Chemical structure not reproduced in this text version]

wherein A₁, B₁, and C₁ each independently comprise a hydrocarbon linker, a polyethylene glycol (PEG) linker, or one or more amino acid residues.
23. The compound of claim 21, wherein the linker has the following formula:

wherein:

x’ is an integer from 1-23;

y is an integer from 1-5;

z’ is an integer from 1-23;

* is the point of attachment to the AAsc of the cyclic peptide; and

M is a bonding group.
24. The compound of claim 23, wherein z’ is 11.
25. The compound of claim 23 or 24, wherein x’ is 1.
26. The compound of any one of claims 23 to 25, wherein the exocyclic peptide is conjugated to the linker at the amino end of the linker and the antisense compound is conjugated to M.
27. The compound of any one of claims 23 to 26, wherein M is -C(O)-.
28. The compound of claim 23, wherein the compound is selected from

[Chemical structures not reproduced in this text version]

or a protonated form or salt thereof, wherein EP is the exocyclic peptide, and oligonucleotide is the antisense compound.
29. The compound of any one of claims 1 to 28, wherein the antisense compound comprises AG(CAG)n, G(CAG)n, (CAG)nAG, or (CAG)nA, wherein n is an integer from 1 to 50.
30. The compound of any one of claims 1 to 28, wherein the antisense compound comprises 5 to 10 CAG repeats.
31. The compound of any one of claims 1 to 28, wherein the antisense compound comprises 5’-CAG CAG CAG CAG CAG CAG CAG CAG-3’.
32. The compound of any one of claims 1 to 28, wherein the antisense compound comprises 5’-CAG CAG CAG CAG CAG CAG CAG CAG CAG-3’.
33. The compound of any one of claims 1 to 28, wherein the antisense compound comprises 5’-CAG CAG CAG CAG CAG CAG CAG-3’.
34. The compound of any one of claims 29 or claim 33, wherein the antisense compound further comprises any one of

GGGCCTTTTATTCGCGAGGGTCGGG;

GAGGGCCTTTTATTCGCGAGGGTCG;

TGGAGGGCCTTTTATTCGCGAGGGT;

GATGGAGGGCCTTTTATTCGCGAGG;

CAGATGGAGGGCCTTTTATTCGCGA;

GGCAGATGGAGGGCCTTTTATTCGC;

TGGGCAGATGGAGGGCCTTTTATTC;

TTTGGGCAGATGGAGGGCCTTTTAT;

GCTTTGGGCAGATGGAGGGCCTTTT;

GAGCTTTGGGCAGATGGAGGGCCTT;

CAGAGCTTTGGGCAGATGGAGGGCC;

TCCAGAGCTTTGGGCAGATGGAGGG;

AGTCCAGAGCTTTGGGCAGATGGAG;

GGAGTCCAGAGCTTTGGGCAGATGG;

GTGGAGTCCAGAGCTTTGGGCAGAT;

CTGTGGAGTCCAGAGCTTTGGGCAG;

CACTGTGGAGTCCAGAGCTTTGGGC;

GACACTGTGGAGTCCAGAGCTTTGG;

CGGACACTGTGGAGTCCAGAGCTTT;

CGCGGACACTGTGGAGTCCAGAGCT;

ACCGCGGACACTGTGGAGTCCAGAG;

AAACCGCGGACACTGTGGAGTCCAG;

GCAAACCGCGGACACTGTGGAGTCC;

ACGCAAACCGCGGACACTGTGGAGT; or

CAACGCAAACCGCGGACACTGTGGA.
35. A pharmaceutical composition comprising the compound of any one of claims 1 to 34.
36. The compound of any one of claims 1 to 34 or the composition of claim 35 for use in a method of treating myotonic dystrophy (DM) in a subject in need thereof, the method comprising administering the compound or the composition to the subject.
37. The compound for use of claim 36, wherein the administering results in an increase in the expression of a wild-type protein in muscle tissue, wherein the wild-type protein is expressed from a gene that does not have an expanded CUG repeat.
38. The compound for use of claim 37, wherein the muscle tissue is diaphragm tissue, quadriceps tissue, heart tissue, or any combination thereof.
39. The compound for use of any one of claims 36 to 38, wherein the administration prevents or reduces foci formation.
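
The relationship between the CAG-repeat antisense sequences recited in the claims and the CUG-repeat target can be illustrated with a short sketch. It builds (CAG)n oligomers of 7, 8, and 9 repeats, which correspond to the 21-, 24-, and 27-nucleotide sequences written out in the claims, and checks by simple Watson-Crick pairing that each is the reverse complement of a CUG repeat of the same length. The function names are illustrative, and the check ignores backbone chemistry (for example, PMO versus natural nucleotides).

```python
# Watson-Crick RNA partner of each base in a DNA-alphabet antisense oligomer.
PAIRING = {"A": "U", "C": "G", "G": "C", "T": "A"}


def cag_oligomer(n_repeats: int) -> str:
    """Return a (CAG)n antisense sequence written 5'->3'."""
    if n_repeats < 1:
        raise ValueError("n_repeats must be at least 1")
    return "CAG" * n_repeats


def target_rna(antisense: str) -> str:
    """Return the RNA (5'->3') fully complementary to a 5'->3' antisense sequence."""
    return "".join(PAIRING[base] for base in reversed(antisense))


for n in (7, 8, 9):
    oligo = cag_oligomer(n)
    target = target_rna(oligo)
    # Each (CAG)n oligomer should pair with a (CUG)n stretch of the target.
    print(len(oligo), oligo, "pairs with", target, target == "CUG" * n)
```

Each of the three oligomers pairs with a CUG repeat of matching length, which is the sense in which a CAG-repeat antisense compound is complementary to an expanded CUG repeat.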