WO2022213253A1 - Modified Prp43 helicase and use thereof - Google Patents

Modified Prp43 helicase and use thereof

Info

Publication number
WO2022213253A1
Authority
WO
WIPO (PCT)
Prior art keywords
prp43
helicase
seq
modified
protein
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2021/085609
Other languages
English (en)
French (fr)
Inventor
张周刚
李文
王艳双
王慕旸
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Qitan Technology Ltd
Original Assignee
Chengdu Qitan Technology Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Qitan Technology Ltd
Priority to US18/553,834 (US20240368568A1)
Priority to EP21935490.9A (EP4299746A4)
Priority to CN202180006254.2A (CN115777019A)
Priority to PCT/CN2021/085609 (WO2022213253A1)
Publication of WO2022213253A1
Anticipated expiration
Ceased

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/70 Vectors or expression systems specially adapted for E. coli
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/90 Isomerases (5.)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869 Methods for sequencing
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/483 Physical analysis of biological material
    • G01N33/487 Physical analysis of biological material of liquid biological material
    • G01N33/48707 Physical analysis of biological material of liquid biological material by electrical means
    • G01N33/48721 Investigating individual macromolecules, e.g. by translocation through nanopores
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y306/00 Hydrolases acting on acid anhydrides (3.6)
    • C12Y306/04 Hydrolases acting on acid anhydrides (3.6) acting on acid anhydrides; involved in cellular and subcellular movement (3.6.4)
    • C12Y306/04013 RNA helicase (3.6.4.13)

Definitions

  • This application relates to nucleic acid sequencing technology.
  • Nanopore sequencing technology is the third-generation nucleic acid sequencing technology, which obtains DNA/RNA sequence information by recording different electrical signals generated by different bases when DNA/RNA strands pass through nanopores.
  • One of the challenges of nanopore sequencing technology is that DNA/RNA molecules often travel through nanopores too fast, exceeding the resolution of the instrument, making it difficult to obtain accurate electrical signals reflecting sequence information. Therefore, how to control or slow down the speed of DNA/RNA molecules passing through the nanopore is crucial to improve the accuracy of nanopore sequencing technology.
  • An emerging method for characterizing polynucleotides involves contacting the polynucleotide with a transmembrane pore and a helicase, whereby the helicase controls the movement of the target polynucleotide through the nanopore and increases the residence time of the polynucleotide at the nanopore.
  • Patents WO2013057495A3 and US20150191709A1 disclose a method for characterizing target polynucleotides using a pore and a Hel308 helicase, or a molecular motor capable of binding to nucleotides within the target polynucleotide.
  • The helicase or molecular motor of that invention can effectively control the movement of the target polynucleotide through the pore.
  • Patents US20150065354A1 and US9617591B2 disclose a method for characterizing target polynucleotides using a pore and an XPD helicase.
  • The XPD helicase of that invention can control the movement of the target polynucleotide through the pore.
  • Patents US20160257942A1 and US20180179500A1 disclose that the T4 phage-derived Dda helicase and some of its homologous proteins can, after modification, be applied to nanopore sequencing of polynucleotides.
  • Helicases can be divided into six superfamilies (SF). SF1 and SF2 helicases perform their translocation and unwinding functions as monomers, whereas SF3-SF6 helicases act as multimers.
  • Helicases that act in monomeric form are easier to use and more uniform in performance.
  • Helicases of the SF1 and SF2 superfamilies are classified into families according to protein sequence homology, domain arrangement, substrate-binding mode and specificity, unwinding polarity (5'-3' or 3'-5'), and rotation or translocation mechanism. The SF1 superfamily includes the UvrD/Rep, Upf1 and Pif1 family helicases. The SF2 superfamily includes the Rad3/XPD, Ski2-like, DEAH/RHA, NS3/NPH-II, DEAD-box, RIG-I, RecQ-like, RecG-like, Swi/Snf and T1R family helicases.
  • RecD helicase and the T4 phage-derived Dda helicase both belong to the Pif1-like family of the SF1 superfamily. Their preferred substrate is single-stranded DNA, and they translocate and unwind in the 5'-3' direction. Other common helicases of this family include Pif1 helicase, TrwC helicase, etc.
  • Hel308 helicase derived from Methanococcoides burtonii (as disclosed in US20150191709A1) belongs to the Ski2-like family of the SF2 superfamily. It can use either single-stranded DNA or RNA as a substrate, and it translocates along or unwinds double-stranded nucleic acids in the 3'-5' direction. Helicases belonging to this family also include Ski2 helicase, Brr2 helicase, Mtr4 helicase, etc.
  • XPD helicase belongs to the Rad3/XPD family of the SF2 superfamily. It specifically binds single-stranded DNA and translocates along or unwinds double-stranded nucleic acids in the 5'-3' direction. Helicases of this family also include Rad3 helicase and the like.
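The family assignments described above can be kept in a small lookup table when comparing candidate motor proteins. The following is only an illustrative sketch: the record layout, the `HelicaseInfo` class and the `helicases_with_polarity` helper are assumptions introduced here, and the entries simply transcribe the superfamily, family, polarity and substrate stated in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HelicaseInfo:
    superfamily: str  # SF1 or SF2, as stated in the description
    family: str       # family within the superfamily
    polarity: str     # direction of translocation/unwinding
    substrate: str    # preferred single-stranded substrate


# Entries transcribed from the background discussion; layout is hypothetical.
HELICASES = {
    "Dda": HelicaseInfo("SF1", "Pif1-like", "5'-3'", "ssDNA"),
    "RecD": HelicaseInfo("SF1", "Pif1-like", "5'-3'", "ssDNA"),
    "Hel308": HelicaseInfo("SF2", "Ski2-like", "3'-5'", "ssDNA or RNA"),
    "XPD": HelicaseInfo("SF2", "Rad3/XPD", "5'-3'", "ssDNA"),
}


def helicases_with_polarity(polarity):
    """Return the names of all tabulated helicases with the given polarity."""
    return sorted(
        name for name, info in HELICASES.items() if info.polarity == polarity
    )


if __name__ == "__main__":
    print(helicases_with_polarity("5'-3'"))
    print(HELICASES["Hel308"])
```

Further entries can be added in the same layout as other helicases are characterized.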
  • Each helicase has its own advantages, disadvantages and applicable environment, and the existing helicases still fall short of the requirements of scientific research and medical technology.
  • There is therefore a need for novel helicases that can be used in nucleic acid nanopore sequencing, in order to improve the applicability, accuracy and sensitivity of nanopore sequencing technology.
  • Prp43 helicases, especially modified Prp43 helicases, can control the movement of polynucleotide molecules through nanopores and can therefore be used in nanopore sequencing technology.
  • A first aspect of the present application relates to a modified Prp43 helicase comprising a RecA1 domain, a RecA2 domain and a Ratchet domain. Relative to the corresponding wild-type Prp43 helicase or fragment thereof, the modified Prp43 helicase includes, in at least one domain selected from the group consisting of the RecA1 domain, the RecA2 domain and the Ratchet domain, an insertion or substitution of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 or more cysteines, and/or an insertion or substitution of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 or more unnatural amino acids.
  • A second aspect of the present application relates to a protein construct comprising the modified Prp43 helicase described in the first aspect of the present application and, fused to the C-terminus or N-terminus of the Prp43 helicase, a G-Path domain of the co-activator protein Paf1 or a Paf1 fragment containing the G-Path domain.
  • the third aspect of the present application relates to a nucleic acid encoding the modified Prp43 helicase described in the first aspect of the present application or the protein construct described in the second aspect of the present application.
  • a fourth aspect of the present application relates to an expression vector comprising the nucleic acid of the third aspect of the present application.
  • A fifth aspect of the present application relates to a host cell comprising the nucleic acid of the third aspect of the present application or the expression vector of the fourth aspect of the present application.
  • A sixth aspect of the present application relates to a method for preparing the protein construct described in the second aspect of the present application, comprising: providing a polypeptide of SEQ ID NO: 1 or a variant thereof and a polypeptide of SEQ ID NO: 26 or a variant thereof; introducing at least one cysteine residue and/or at least one unnatural amino acid into the polypeptide of SEQ ID NO: 1 or the variant thereof; and then fusing the C-terminus or N-terminus of the resulting polypeptide to the polypeptide of SEQ ID NO: 26 or the variant thereof, thereby forming the protein construct.
  • A seventh aspect of the present application relates to a method for preparing the modified Prp43 helicase described in the first aspect of the present application or the protein construct described in the second aspect of the present application, comprising: culturing the host cell described in the fifth aspect of the present application, inducing expression, and then purifying the resulting expression product.
  • An eighth aspect of the present application relates to a method for controlling the movement of a polynucleotide molecule, comprising contacting the polynucleotide molecule with the modified Prp43 helicase described in the first aspect of the present application or the protein construct described in the second aspect of the present application.
  • a ninth aspect of the present application relates to a method for characterizing a target polynucleotide, the method comprising:
  • A tenth aspect of the present application relates to the use of the modified Prp43 helicase of the first aspect of the present application or the protein construct of the second aspect of the present application in characterizing a target polynucleotide or in controlling the movement of a target polynucleotide through a pore.
  • An eleventh aspect of the present application relates to an analytical device for characterizing a target polynucleotide, the analytical device comprising one or more nanopores, one or more modified Prp43 helicases according to the first aspect of the present application or protein constructs according to the second aspect of the present application, and one or more containers.
  • A twelfth aspect of the present application relates to a method of forming a sensor for characterizing a target polynucleotide, comprising providing a nanopore and forming a complex between the nanopore and the modified Prp43 helicase described in the first aspect of the present application or the protein construct described in the second aspect of the present application.
  • The present application provides a novel Prp43 helicase mutant, or a construct thereof, that can be used for nucleic acid nanopore sequencing. Owing to the introduced mutations and/or accessory proteins, it has enhanced ATP hydrolysis activity or unwinding activity, and/or it can remain bound to the target polynucleotide for a long time, thereby allowing continuous and stable control of the movement speed of the polynucleotide. The Prp43 helicase mutant of the present application or its construct can therefore continuously control the movement of the target polynucleotide through the pore at the rate required for sequencing, improving the throughput and accuracy of nanopore sequencing.
  • The term "polypeptide" refers to a molecule comprising amino acid residues linked by peptide bonds and containing more than five amino acid residues. Polypeptides may generally contain 20 or more amino acids, preferably 50 or more amino acids, or 100 or more amino acids. As used herein, the terms "protein" and "polypeptide" are considered to have the same meaning and are used interchangeably. Polypeptides can optionally be modified (e.g., glycosylated, phosphorylated, acylated, farnesylated, prenylated, sulfonated, etc.) to increase their functionality or activity.
  • Polypeptides that exhibit activity under certain conditions in the presence of specific substrates can be referred to as "enzymes". It will be appreciated that, due to the degeneracy of the genetic code, a variety of nucleotide sequences can encode a given polypeptide.
  • The term "nucleic acid" as used herein is a general term for deoxyribonucleic acid (DNA) and ribonucleic acid (RNA), which are biological macromolecules composed of many nucleotide monomers.
  • As used herein, the terms "nucleic acid" and "polynucleotide" are considered to have the same meaning and are used interchangeably.
  • Nucleotide monomers are composed of five-carbon sugars, phosphate groups, and nitrogenous bases. If the five-carbon sugar is ribose, the polymer formed is RNA; if the five-carbon sugar is deoxyribose, the polymer formed is DNA. Nitrogenous bases in nucleotides may include, but are not limited to, adenine (A), guanine (G), thymine (T), uracil (U), and cytosine (C). The nucleotides may be naturally occurring or artificially synthesized.
  • Nucleotides include, but are not limited to: adenosine monophosphate (AMP), guanosine monophosphate (GMP), thymidine monophosphate (TMP), uridine monophosphate (UMP), cytidine monophosphate (CMP), cyclic adenosine monophosphate (cAMP), cyclic guanosine monophosphate (cGMP), deoxyadenosine monophosphate (dAMP), deoxyguanosine monophosphate (dGMP), deoxythymidine monophosphate (dTMP), deoxyuridine monophosphate (dUMP), and deoxycytidine monophosphate (dCMP).
  • the nucleotides are selected from AMP, TMP, GMP, CMP, UMP, dAMP, dTMP, dGMP or dCMP.
  • A "fragment" of a polypeptide or polypeptide domain means a polypeptide or polypeptide domain in which one or more (e.g., several, tens or 100, etc.) amino acid residues are deleted at the amino and/or carboxyl terminus, but which retains the desired activity.
  • For example, a fragment of a Prp43 helicase is a polypeptide sequence in which one or more (e.g., 1-5, 1-10, 1-20, 1-50, 1-100, 1-150, 1-200, or, for example, 20, 30, 40, 50, 60, 70, 80 or 90) amino acid residues are deleted but which still retains helicase activity.
  • A fragment of a polypeptide or domain comprises at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 92%, 94%, 96%, 98% or 99% of the length of its original sequence.
  • fragments of polypeptides or domains comprise at least 50 amino acids, such as at least 60 amino acids, at least 70 amino acids, at least 80 amino acids, at least 90 amino acids, at least 100 amino acids, depending on the length of the original polypeptide or domain.
  • fragments of polypeptides or domains may also comprise less than 700 amino acids, such as less than 600 amino acids, less than 500 amino acids, less than 400 amino acids, less than 300 amino acids, less than 200 amino acids , or less than 100 amino acids.
  • expression includes any step involved in the production of a polypeptide, including, but not limited to, transcription, post-transcriptional modification, translation, post-translational modification, and secretion.
  • An "expression vector” comprises a polynucleotide encoding a polypeptide operably linked to appropriate control sequences (eg, a promoter, and transcriptional and translational stop signals) for expression and/or translation in vitro.
  • the expression vector can be any vector (eg, a plasmid or virus) that can conveniently undergo recombinant DNA procedures and can cause expression of a polynucleotide.
  • the choice of vector will generally depend on the compatibility of the vector with the cells into which the vector is to be introduced.
  • Vectors can be linear or closed circular plasmids.
  • the vector may be an autonomously replicating vector, ie a vector that exists as an extrachromosomal entity that replicates independently of chromosomal replication, eg a plasmid, an extrachromosomal element, a minichromosome or an artificial chromosome.
  • the vector may be one that, when introduced into a host cell, integrates into the genome and replicates with the chromosome into which it is integrated.
  • the integrating cloning vector can integrate at random or predetermined target loci in the chromosome of the host cell.
  • the vector system can be a single vector or plasmid or two or more vectors or plasmids that together contain the total DNA to be introduced into the genome of the host cell, or a transposon.
  • control sequence refers to a component involved in the regulation of the expression of a coding sequence in a particular organism or in vitro.
  • Examples of control sequences are transcription initiation, termination, promoter, leader, signal peptide, propeptide, prepropeptide or enhancer sequences; Shine-Dalgarno sequences; repressor or activator sequences; efficient RNA processing signals, such as splicing and polyadenylation signals; sequences that stabilize cytoplasmic mRNA; sequences that enhance translation efficiency (e.g., ribosome binding sites); sequences that enhance protein stability; and sequences that enhance protein secretion.
  • a "host cell” as defined herein is an organism suitable for genetic manipulation and which can be used to produce a product of interest, such as a Prp43 helicase as described herein.
  • the host cell may be a host cell found in nature, or a host cell derived from genetic manipulation or classical mutagenesis of a parent host cell.
  • the host cell is a recombinant host cell.
  • the host cell can be a prokaryotic, archaeal or eukaryotic host cell.
  • Prokaryotic host cells can be, but are not limited to, bacterial host cells.
  • Eukaryotic host cells can be, but are not limited to, yeast, fungal, amoeba, algal, plant, animal, or insect host cells.
  • The term "recombinant", when used in reference to a nucleic acid or protein (or enzyme), means that the nucleic acid or protein (or enzyme) has been modified in sequence by human intervention compared to its native form.
  • The term "recombinant", when referring to a cell (e.g., a host cell), means that the genome of the cell has been sequence-modified by human intervention compared to its native form.
  • The terms "recombinant" and "modified" are considered synonymous.
  • The term "substitution" means that a natural amino acid residue present in the corresponding wild-type polypeptide or enzyme is replaced by another amino acid residue.
  • The terms "amino acid substitution" and "amino acid replacement" are considered synonymous.
  • The terms "variant" and "mutant" as used herein have the same meaning and are used interchangeably. They can refer to polypeptides or nucleic acids. A variant has substitutions, insertions, deletions, truncations, transversions, etc. at one or more positions relative to a reference sequence (usually the wild-type form of a nucleic acid or polypeptide). Variants can be generated by, for example, site saturation mutagenesis, scanning mutagenesis, insertional mutagenesis, random mutagenesis, site-directed mutagenesis, and directed evolution, as well as various other recombinant methods known to those skilled in the art. Nucleic acid variants can be artificially synthesized by techniques known in the art.
  • a "mature polypeptide” is defined herein as a polypeptide in its final form and obtained after translation of mRNA into a polypeptide and post-translational modification of the polypeptide.
  • Post-translational modifications include N-terminal processing, C-terminal truncation, glycosylation, phosphorylation, and removal of leader sequences (such as signal peptides and/or propeptides) by cleavage.
  • the similarity between two polypeptide sequences or nucleic acid sequences can be expressed in terms of their homology.
  • The terms "identity" and "homology" between two sequences are considered to have the same meaning and are used interchangeably herein.
  • To determine percent identity, the sequences are aligned for the best match; the percent identity is the percentage of identical matches between the two sequences over the aligned region.
  • The percent sequence homology between two amino acid sequences or between two polynucleotide sequences can be determined using well-known algorithms, such as the Needleman and Wunsch algorithm for aligning two sequences (Needleman, S.B. and Wunsch, C.D. (1970) J. Mol. Biol.
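As a rough illustration of how such a percentage can be computed, the sketch below implements a basic Needleman-Wunsch global alignment and reports identity over the full aligned length. The scoring values (match +1, mismatch -1, linear gap -1), the choice of denominator and the function names are assumptions made for this example; the application does not specify them, and production tools typically use substitution matrices and affine gap penalties instead.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align a and b; return both aligned strings ('-' marks a gap)."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Identical aligned positions as a percentage of the alignment length."""
    aligned_a, aligned_b = needleman_wunsch(a, b)
    if not aligned_a:
        return 0.0
    matches = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    print(needleman_wunsch("GSGKT", "GSGT"))
    print(percent_identity("GSGKT", "GSGT"))
```

Dividing by the alignment length including gaps is one common convention; dividing by the shorter sequence length would give a different figure for the same alignment.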
  • the "opening” mentioned in this application refers to the opening of the polynucleotide binding domain of the wild-type Prp43 helicase itself, and may also refer to the opening of the polynucleotide binding part that binds to the Prp43 helicase.
  • An opening is an opening that allows dissociation of the polynucleotide from the Prp43 helicase, and the opening may not always be present, but contains at least one opening in at least one conformational state.
  • a "modified Prp43 helicase” or a construct comprising a modified Prp43 helicase as described herein contains one or more openings. The Prp43 helicase is modified so that two or more moieties are attached on the same monomer of the helicase to reduce the size of the opening.
  • "One or more" or "at least one" as used in this application includes: one, two, three, four, five, six, seven, eight, nine, ten, eleven, twelve, or more.
  • "Two or more" includes: two, three, four, five, six, seven, eight, nine, ten, eleven, twelve, or more.
  • "Plurality" as used in this application includes, but is not limited to, three, four, five, six, seven, eight, nine, ten, eleven, twelve, or more.
  • "Comprising" and "containing" are open-ended descriptions, meaning that the specified components or steps are included, along with other components or steps that do not substantially affect them.
  • When a protein or nucleic acid is described by reference to a sequence, the protein or nucleic acid may either consist of the sequence or have additional amino acid residues or nucleotides at one or both ends, provided that it still has the activity described herein (e.g., its ability to control the movement of polynucleotides).
  • Prp43 helicase is a known helicase whose structure and function have been studied and reported in the prior art; see, for example, Marcel J. Tauchert et al., "Structural and functional analysis of the RNA helicase Prp43 from the thermophilic eukaryote Chaetomium thermophilum", Acta Cryst., 2016, F72, 112–120.
  • Prp43 helicase for nanopore sequencing or to control the movement of polynucleotide molecules through nanopores.
  • Prp43 helicase belongs to the DEAH/RHA family of the SF2 superfamily. It can bind single-stranded DNA or RNA and translocates along or unwinds double-stranded DNA or RNA in the 3'-5' direction. Helicases belonging to the same family also include Prp22 helicase, Prp2 helicase, MLE helicase, DHX9 helicase, and the like.
  • FIG. 1 is a schematic 3D structure of a Prp43 helicase (SEQ ID NO: 1) derived from Chaetomium thermophilum.
  • The Prp43 helicase also contains several other domains: the N-terminal domain (M1-L96) and the C-terminal WH domain (Y459-P526), Ratchet domain (L527-V640) and OB domain (S641-A764), etc.
  • RecA1 and RecA2 contain seven conserved motifs, of which Ia (TQPRRVAA), Ib (TDGQLLR) and IV (LLFLTG) interact with the substrate nucleic acid; motifs I (GSGKT), II (DEAH), V (TNIAETSLT) and VI (QRAGRAGR) are involved in nucleotide binding; and motif III (SAT) couples nucleotide hydrolysis to nucleic acid translocation or unwinding.
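The motif sequences and roles listed above can be used to annotate a candidate sequence by simple exact-match scanning. The sketch below is only illustrative: the `MOTIFS` table transcribes the motifs and roles from the text, while the `find_motifs` helper and the short test sequence are hypothetical and are not taken from SEQ ID NO: 1. Real homologs may carry variant motifs that exact matching would miss.

```python
# Motif name -> (sequence, role), transcribed from the description above.
MOTIFS = {
    "I": ("GSGKT", "nucleotide binding"),
    "Ia": ("TQPRRVAA", "interaction with substrate nucleic acid"),
    "Ib": ("TDGQLLR", "interaction with substrate nucleic acid"),
    "II": ("DEAH", "nucleotide binding"),
    "III": ("SAT", "coupling of hydrolysis to translocation or unwinding"),
    "IV": ("LLFLTG", "interaction with substrate nucleic acid"),
    "V": ("TNIAETSLT", "nucleotide binding"),
    "VI": ("QRAGRAGR", "nucleotide binding"),
}


def find_motifs(sequence):
    """Return {motif name: [1-based start positions]} for exact matches."""
    hits = {}
    for name, (pattern, _role) in MOTIFS.items():
        positions = []
        start = sequence.find(pattern)
        while start != -1:
            positions.append(start + 1)  # convert to 1-based residue numbering
            start = sequence.find(pattern, start + 1)
        if positions:
            hits[name] = positions
    return hits


if __name__ == "__main__":
    # Hypothetical toy sequence for demonstration only.
    toy = "MAGSGKTAAADEAHKKSATLL"
    for name, positions in find_motifs(toy).items():
        print(name, MOTIFS[name][1], positions)
```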
  • Prp43 helicase is enriched in positively charged amino acids at the top of the RecA1 and RecA2 domains, and together with the C-terminal WH, Ratchet, and OB domains form a channel around single-stranded DNA or RNA.
  • Although Prp43 helicase has a strong affinity for single-stranded DNA or RNA, binding is still a dynamic thermodynamic equilibrium, so the enzyme cannot fully control the passage of the target nucleic acid through the pore, especially when the target nucleic acid is long, such as nucleic acids 1000, 5000, 10000 or 100000 bases in length, or longer.
  • The inventors found that the Prp43 helicase can be modified to ensure binding of the enzyme to the nucleic acid and to continuously control the passage of the nucleic acid through the nanopore.
  • The inventors found that introducing one or more cysteines or unnatural amino acids into the RecA1 domain, RecA2 domain and/or Ratchet domain of Prp43 helicase can reduce the size of the opening of the polynucleotide binding domain of Prp43 helicase, thereby improving its ability to bind the target nucleic acid.
  • A first aspect of the present application relates to a modified Prp43 helicase comprising a RecA1 domain, a RecA2 domain and a Ratchet domain. Relative to the corresponding wild-type Prp43 helicase or fragment thereof, the modified Prp43 helicase includes, in at least one domain selected from the group consisting of the RecA1 domain, the RecA2 domain and the Ratchet domain, an insertion or substitution of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 or more cysteines, and/or an insertion or substitution of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 or more unnatural amino acids.
  • At least one cysteine residue and/or at least one unnatural amino acid may be introduced in any of the following groups:
  • The unnatural amino acids described in this application include, but are not limited to: 4-azido-L-phenylalanine (Faz), 4-acetyl-L-phenylalanine, 3-acetyl-L-phenylalanine, 4-acetoacetyl-L-phenylalanine, O-allyl-L-tyrosine, 3-(phenylselenyl)-L-alanine, O-2-propyn-1-yl-L-tyrosine, 4-(dihydroxyboronyl)-L-phenylalanine, 4-[(ethylsulfanyl)carbonyl]-L-phenylalanine, (2S)-2-amino-3-{4-[(propan-2-ylsulfanyl)carbonyl]phenyl}propionic acid, (2S)-2-amino-3-{4-[(2-amino-3-sulfanylpropionyl)amino
  • Prp43 helicase in this application should be understood in its broadest sense and is considered to encompass homologous proteins of Prp43 helicase (eg, SEQ ID NO: 1).
  • If a protein contains a RecA1 domain, a RecA2 domain and/or a Ratchet domain and has at least 30% homology to SEQ ID NO: 1, such as at least 35%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or at least 99.9% homology, it can be considered to be a Prp43 helicase. Accordingly, some helicases known as HrpA helicases or HrpB helicases, such as those listed in Table 1, are also considered to fall within the scope of "
  • the Prp43 helicase described in this application can be Prp43 helicase from various conventional sources, for example, the Prp43 helicase can be derived from Chaetomium thermophilum, Bathycoccus prasinos, Uncultured bacterium, Archaeon, Parcubacteria, Sorangium cellulosum, Candidatus Sungbacteria, Mycolicibacterium chitae, Parcubacteria, Thermodesulforhabdus norvegica, Deltaproteobacteria, Puniceicoccales, Desulfobacterium vacuolatum or Desulfobacter sp. or derived from viral metagenome, etc.
  • Table 1 gives some examples of homologous Prp43 helicases that can be used in the present application, but the Prp43 helicases of the present application are not limited to these examples.
  • Prp43 helicase described in the application is derived from Chaetomium thermophilum.
  • The application provides a modified Prp43 helicase comprising a variant of SEQ ID NO: 1 or a fragment thereof, the variant including an insertion or substitution of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 or more cysteines, and/or an insertion or substitution of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 or more unnatural amino acids, introduced into the RecA1 domain, RecA2 domain and/or Ratchet domain.
  • Preferably, the variant includes at least one cysteine residue and/or at least one unnatural amino acid residue introduced at any one, two or more positions corresponding to M157, Q161, D165, F181, E182, N183, R324, L328, E332, R335, L351, P352, P353, H354, D321, E320, R358, P563, A564, N565, D603, K605, K606, H609, Y615, R616, S619, N623, A626 or K630 of SEQ ID NO: 1. More preferably, the introduced cysteine residue or unnatural amino acid residue is located at a position corresponding to any one, two or more of F181, P352, S619 or N623 of SEQ ID NO: 1.
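To see where the listed positions fall, a site label such as S619 can be parsed and checked against the domain boundaries given earlier in this description. The following is a minimal sketch under stated assumptions: only the boundaries quoted in this text (N-terminal M1-L96, WH Y459-P526, Ratchet L527-V640, OB S641-A764) are tabulated, the RecA1 and RecA2 boundaries are not given in this excerpt and are therefore left unannotated, and the helper names are hypothetical.

```python
import re

# (domain name, first residue, last residue) as quoted in the description.
# RecA1/RecA2 boundaries are not stated in this excerpt and are omitted.
DOMAINS = [
    ("N-terminal", 1, 96),
    ("WH", 459, 526),
    ("Ratchet", 527, 640),
    ("OB", 641, 764),
]


def parse_site(site):
    """Split a label such as 'S619' into (residue letter, position)."""
    m = re.fullmatch(r"([A-Z])(\d+)", site)
    if m is None:
        raise ValueError(f"unrecognized site label: {site!r}")
    return m.group(1), int(m.group(2))


def domain_of(site):
    """Return the tabulated domain containing the site, or None if unknown."""
    _residue, position = parse_site(site)
    for name, first, last in DOMAINS:
        if first <= position <= last:
            return name
    return None


if __name__ == "__main__":
    for site in ["F181", "P352", "S619", "N623"]:
        print(site, domain_of(site) or "not annotated in this excerpt")
```

With these boundaries, S619 and N623 fall in the Ratchet domain, while F181 and P352 lie in the region whose domain boundaries are not quoted here.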
  • the modified Prp43 helicase comprises removal of the N-terminal domain, preferably at least 96, at least 90, at least 80, at least 70 from position 1 of the N-terminus , at least 60, at least 50, at least 40, or at least 30 residues.
  • M1-N60 is preferably removed, i.e. the T61-A764 fragment of SEQ ID NO: 1 is preferably used, and on this basis an insertion or substitution of one or more cysteines is introduced, and/or insertion or substitution of one or more unnatural amino acids.
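  • The T61-A764 numbering above is 1-based and inclusive, so the retained fragment spans 764 - 61 + 1 = 704 residues. A minimal Python sketch of that bookkeeping follows. The function names and the toy sequence are illustrative assumptions and are not part of the application; the real SEQ ID NO: 1 is not reproduced here.

```python
def fragment(seq: str, start: int, end: int) -> str:
    """Return residues start..end of seq using 1-based, inclusive numbering."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("fragment bounds fall outside the sequence")
    return seq[start - 1:end]


def fragment_length(start: int, end: int) -> int:
    """Number of residues in a 1-based, inclusive range."""
    return end - start + 1


# Toy 10-residue sequence (hypothetical, NOT SEQ ID NO: 1).
toy = "MACDEFGHIK"
print(fragment(toy, 3, 6))       # residues 3 to 6 of the toy sequence
print(fragment_length(61, 764))  # size of a T61-A764 style fragment
```

  • Keeping the original numbering (rather than renumbering the fragment from 1) lets positions such as F181 or N623 keep the same labels before and after truncation.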
  • When two or more cysteine residues and/or unnatural amino acid residues are introduced, linkages can be formed between an introduced cysteine and another introduced cysteine, between an introduced unnatural amino acid and another introduced unnatural amino acid, between an introduced cysteine and an introduced unnatural amino acid, between an introduced cysteine and a natural amino acid, or between an introduced unnatural amino acid and a natural amino acid.
  • any number and combination of two or more introduced cysteines and unnatural amino acids can be interconnected.
  • 2, 3, 4, 5, 6, 7, 8 or more cysteines and/or unnatural amino acids can be linked to each other.
  • One or more cysteines can be linked to one or more cysteines.
  • One or more cysteines can be linked to one or more unnatural amino acids such as Faz.
  • One or more unnatural amino acids such as Faz can be linked to one or more unnatural amino acids such as Faz.
  • One or more cysteines can be attached to one or more natural amino acids on the helicase.
  • One or more unnatural amino acids such as Faz can be linked to one or more natural amino acids on the helicase.
  • The connection can be of any type, temporary or permanent, such as a covalent connection, hydrogen bonding, electrostatic interaction, π-π interaction or hydrophobic interaction.
  • linkage may be permanent, such as a covalent linkage.
  • Covalent attachment can be carried out using chemical crosslinkers, which can vary in length from one carbon (phosgene type linker) to multiple Angstroms.
  • PEGs polyethylene glycol
  • PNA polypeptide nucleic acid
  • TAA threose nucleic acid
  • GNA glycerol nucleic acid
  • TMAD TMAD
  • TMAD catalytic reagents
  • A TMAD catalyst is used to covalently link the cysteine residues introduced at positions F181 and N623, or at positions P352 and S619, to each other.
  • The modified Prp43 helicase further comprises substitution of one or more cysteine residues, more preferably one or more of the cysteine residues corresponding to C148, C214, C303, C323, C377, C441, C508, C543 and C608 of SEQ ID NO: 1; more preferably, the cysteine residues are replaced by alanine, glycine, valine, isoleucine, leucine, phenylalanine, tyrosine, serine, threonine, aspartic acid, glutamic acid, lysine, arginine, histidine, methionine, tryptophan, glutamine, asparagine or proline residues.
  • The modified Prp43 helicase further comprises one or more amino acid modifications selected from the following group:
  • The amino acids that interact with nucleotides and are substituted include, but are not limited to: R152, R153, R180, T195, Q198, R201, E316, E317, G349, T381, N382, K403, K405, L416, P526, P557, R562, Q558, H688, P689, T708, K710, Y712, R714 corresponding to SEQ ID NO: 1.
  • at least one amino acid that interacts with the phosphate group of one or more nucleotides in single-stranded DNA, RNA or double-stranded DNA, RNA is substituted.
  • The one or more amino acids related to the binding of NTP and/or divalent metal ions include, but are not limited to: T126, D218, S387, E219, R432, R435, T121, K125, T127, T389, R162, D391, F360 corresponding to SEQ ID NO: 1.
  • the one or more amino acids that interact with the transmembrane pore include, but are not limited to: C303, E336, D288, R287, E286, E284, E291 corresponding to SEQ ID NO: 1.
  • At least one amino acid that interacts with the sugar and/or base of one or more nucleotides in single-stranded DNA, RNA or double-stranded DNA, RNA is replaced with an amino acid comprising a larger side chain.
  • the larger side chains include an increased number of carbon atoms, have an increased length, have an increased molecular volume, and/or have an increased van der Waals volume.
  • The larger side chain increases (i) electrostatic interactions between the at least one amino acid and one or more nucleotides in the single- or double-stranded DNA; (ii) hydrogen bonding; and/or (iii) cation-pi interactions.
  • The amino acid with the larger side chain is not alanine (A), cysteine (C), glycine (G), selenocysteine (U), methionine (M), aspartic acid (D) or glutamic acid (E).
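  • One of the "larger side chain" criteria named above is an increased number of carbon atoms. The following Python sketch lists, for a given residue, the standard amino acids whose side chains contain more carbon atoms, after removing the residues excluded above (A, C, G, U, M, D, E). It is only an illustration of that single criterion: the function name is an assumption, the carbon counts are those of the standard side chains (the alpha carbon is not counted), and the length, molecular volume and van der Waals volume criteria are not modelled.

```python
# Side-chain carbon atoms per standard amino acid (alpha carbon not counted).
SIDE_CHAIN_CARBONS = {
    "G": 0, "A": 1, "S": 1, "C": 1, "T": 2, "D": 2, "N": 2,
    "V": 3, "P": 3, "E": 3, "Q": 3, "M": 3,
    "L": 4, "I": 4, "K": 4, "H": 4, "R": 4,
    "F": 7, "Y": 7, "W": 9,
}

# Residues the description excludes as replacements (U is selenocysteine).
EXCLUDED = set("ACGUMDE")


def larger_side_chain_candidates(original: str) -> list[str]:
    """Residues with more side-chain carbons than `original`, minus exclusions."""
    baseline = SIDE_CHAIN_CARBONS[original]
    return sorted(
        aa for aa, carbons in SIDE_CHAIN_CARBONS.items()
        if carbons > baseline and aa not in EXCLUDED
    )


print(larger_side_chain_candidates("T"))  # candidates to replace a threonine
```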
  • the Prp43 helicase is further modified to reduce the negative charge on its surface.
  • the Prp43 helicase also contains substitutions that increase the net positive charge.
  • the Prp43 helicase further comprises substitution or modification of surface negatively charged amino acids, polar or non-polar amino acids.
  • the substitution includes the substitution of positively charged amino acids, uncharged amino acids for negatively charged amino acids, uncharged amino acids, aromatic amino acids, polar or non-polar amino acids.
  • The positively charged, uncharged, polar, non-polar or aromatic amino acid can be a natural or non-natural amino acid, and it can be a synthetic or modified natural amino acid.
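  • The effect of a substitution on net charge can be estimated with a simple residue count. The sketch below is a deliberately crude model and an assumption of this example, not a method from the application: lysine and arginine count as +1, aspartic acid and glutamic acid as -1, and histidine, the termini and unnatural amino acids are ignored. The function names and the toy sequence are hypothetical.

```python
POSITIVE = set("KR")  # counted as +1 each
NEGATIVE = set("DE")  # counted as -1 each


def net_charge(seq: str) -> int:
    """Crude net charge: (K + R) minus (D + E)."""
    return sum(r in POSITIVE for r in seq) - sum(r in NEGATIVE for r in seq)


def substitute(seq: str, position: int, new_residue: str) -> str:
    """Replace the residue at a 1-based position."""
    if not 1 <= position <= len(seq):
        raise ValueError("position outside the sequence")
    return seq[:position - 1] + new_residue + seq[position:]


toy = "MDEKRA"                     # hypothetical sequence
variant = substitute(toy, 2, "K")  # negatively charged D2 replaced by K
print(net_charge(toy), net_charge(variant))
```

  • Replacing a negatively charged residue with a positively charged one changes this count by two, whereas replacing it with an uncharged residue changes it by one.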
  • the amino acid sequence of the corresponding wild-type Prp43 helicase may have at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or at least 99.9% homology.
  • The Prp43 helicase is a variant of SEQ ID NO: 1 (i.e., derived from Chaetomium thermophilum), and the variant of SEQ ID NO: 1 comprises at least one cysteine residue and/or at least one unnatural amino acid introduced at position F181 and/or N623 of SEQ ID NO: 1.
  • The Prp43 helicase is a variant of SEQ ID NO: 1 (i.e., derived from Chaetomium thermophilum), and the variant of SEQ ID NO: 1 further comprises substitution of one or more cysteines of SEQ ID NO: 1.
  • The substituting amino acids can be alanine, glycine, valine, isoleucine, leucine, phenylalanine, tyrosine, serine, threonine, aspartic acid, glutamic acid, lysine, arginine, histidine, methionine, tryptophan, glutamine, asparagine or proline.
  • the one or more substituted cysteines are C148, C214, C303, C323, C377, C441, C508, C543, C608.
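  • Native cysteines such as those listed above can be located in any sequence with a short scan. The sketch below is illustrative only: the function name, the `offset` parameter and the toy sequence are assumptions of this example. The offset lets a truncated fragment report positions in the numbering of the full-length protein.

```python
def cysteine_positions(seq: str, offset: int = 0) -> list[int]:
    """1-based positions of cysteine residues, shifted by `offset`.

    Use offset=60 for a fragment that starts at residue 61 of the parent
    sequence, so that reported positions match the parent numbering.
    """
    return [i + 1 + offset for i, residue in enumerate(seq) if residue == "C"]


toy = "MCAACK"  # hypothetical sequence, NOT SEQ ID NO: 1
print(cysteine_positions(toy))
print(cysteine_positions(toy, offset=60))
```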
  • The Prp43 helicase is a variant of SEQ ID NO: 1 (i.e., derived from Chaetomium thermophilum), and the variant of SEQ ID NO: 1 has the M1 to N60 sequence of the N-terminal domain removed; further preferably, the N-terminal M1 to L96 sequence is removed.
  • the present invention is a helicase in which the N-terminal domain M1 to N60 sequence has been removed.
  • The Prp43 helicase is a variant of SEQ ID NO: 1 (i.e., derived from Chaetomium thermophilum), and the amino acid sequence of the variant has at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or at least 99.9% homology to the amino acid sequence of SEQ ID NO: 1.
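  • The percentage thresholds above presuppose some way of scoring two sequences against each other. The application does not state which alignment tool or scoring definition is used, so the following Python sketch is only one common convention, chosen here as an assumption: percent identity over an alignment that has already been computed, with gaps written as "-" and columns where both sequences have a gap ignored. The function name and the toy alignment are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns holding the same residue in both rows."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Drop columns where both rows are gaps; they carry no information.
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Toy pre-aligned pair (hypothetical): 4 identical columns out of 6.
identity = percent_identity("MKT-AC", "MRTGAC")
print(round(identity, 1), identity >= 30.0)
```

  • Other definitions (for example, dividing by the length of the shorter sequence, or counting conservative substitutions as similar) give different numbers for the same pair, so the convention should be fixed before comparing against a threshold.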
  • the Prp43 helicase is a modified T61-A764 fragment of SEQ ID NO:1.
  • The Prp43 helicase is a modified T61-A764 fragment of SEQ ID NO: 1 (derived from Chaetomium thermophilum), and the modification is F181C/N623C/C508S or P352C/S619C/C508S.
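  • Substitution strings such as F181C/N623C/C508S follow the usual convention of wild-type residue, position, new residue. The Python sketch below parses that notation and applies it to a sequence, checking that the stated wild-type residue is actually present at each position. The function names, the `first_residue` parameter and the toy fragment are assumptions made for illustration; the sequence of SEQ ID NO: 1 is not reproduced here.

```python
import re

# One substitution token: wild-type letter, 1-based position, new letter.
MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutations(spec: str) -> list[tuple[str, int, str]]:
    """Parse 'F181C/N623C/C508S' into [('F', 181, 'C'), ...]."""
    parsed = []
    for token in spec.split("/"):
        match = MUTATION.match(token.strip())
        if match is None:
            raise ValueError(f"unrecognized substitution: {token!r}")
        parsed.append((match.group(1), int(match.group(2)), match.group(3)))
    return parsed


def apply_mutations(seq: str, spec: str, first_residue: int = 1) -> str:
    """Apply substitutions to seq, whose first residue has number first_residue."""
    residues = list(seq)
    for wild_type, position, new in parse_mutations(spec):
        index = position - first_residue
        if not 0 <= index < len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[index] != wild_type:
            raise ValueError(
                f"expected {wild_type} at {position}, found {residues[index]}"
            )
        residues[index] = new
    return "".join(residues)


print(parse_mutations("F181C/N623C/C508S"))
# Toy fragment numbered from 61 (hypothetical): T61 F62 N63 C64.
print(apply_mutations("TFNC", "F62C/N63C/C64S", first_residue=61))
```

  • The wild-type check is a cheap guard against off-by-one numbering errors, which are easy to make when working with a truncated fragment that keeps the parent numbering.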
  • Prp43 helicases described herein can be modified to aid in identification or purification, for example by adding histidine residues (His tag), aspartic acid residues (asp tag), streptavidin tag, Flag tag, SUMO tag, GST tag or MBP tag, or by adding a signal sequence to facilitate their secretion from cells in which the polypeptide does not naturally contain the signal sequence.
  • Prp43 helicase described herein may be in the form of a Prp43 helicase oligomer comprising one or more of the Prp43 helicases described herein.
  • the Prp43 helicase oligomer may further comprise wild-type Prp43 helicase or other types of helicases.
  • the other types of helicases can be Hel308 helicase, XPD helicase, Dda helicase, RecD2 helicase, TraI helicase or TrwC helicase and the like.
  • Preferably, the helicases in the oligomer (for example, a Prp43 helicase and a wild-type Prp43 helicase, two Prp43 helicases, or two wild-type Prp43 helicases) can be connected or arranged in a head-to-head, tail-to-tail or head-to-tail manner.
  • the Prp43 helicase oligomer comprises two or more Prp43 helicases described in this application, wherein the Prp43 helicases may be different or the same.
  • In its physiological role, the Prp43 helicase is involved in the dissociation of the intron-containing spliceosome composed of U2.U5.U6 snRNPs during pre-mRNA processing.
  • In this process, the enzyme must interact with the two accessory proteins Ntr1 and Ntr2, which carry the glycine-rich G-Path motif, to activate its ATP hydrolysis and unwinding activities. Prp43 helicase is also involved in ribosome synthesis, where it assists the maturation of the 18S and 25S rRNA precursors; this process likewise requires activation by the G-Path motif-containing proteins Pfa1 and Gno1.
  • Under physiological conditions, Prp43 helicase therefore requires a G-Path domain-containing accessory protein to activate its ATP hydrolysis and unwinding activities.
  • Although the enzyme has weak activity in the absence of the co-activator protein, it is considerably more active in ATP hydrolysis and unwinding in the presence of the co-activator protein.
  • The inventors found that isolated fragments of the accessory proteins that contain the G-Path domain still have an activating function.
  • A protein construct comprising the modified Prp43 helicase described in the first aspect of the present application and, fused at the C-terminus or N-terminus of the Prp43 helicase, the G-Path domain of the co-activator protein Paf1 or a fragment of Paf1 containing the G-Path domain.
  • the protein construct can also be regarded as a fusion protein.
  • Because the G-Path domain, or a fragment containing the G-Path domain, of the co-activator protein Paf1 or its homologous protein is fused at the C-terminus or N-terminus of the Prp43 helicase, the ATP hydrolysis and/or unwinding activity of this modified Prp43 helicase construct is significantly enhanced, which is more conducive to controlling the movement of nucleic acid through the pore in nanopore nucleic acid sequencing.
  • the number of modified Prp43 helicases may be one or more.
  • The co-activator protein Paf1 can be a Paf1 protein from any source routinely used in the art, such as Paf1 from Chaetomium thermophilum var. thermophilum, Thermothielavioides terrestris, Thermothelomyces thermophilus, Podospora anserina, Neurospora tetrasperma, Coniochaeta sp., Monosporascus sp., Hypoxylon sp., Madurella mycetomatis or Coniochaeta pulveracea.
  • Table 2 gives some examples of homologous Paf1 proteins that can be used in the Prp43 helicase constructs of the present application, but the Paf1 proteins of the present application are not limited to these examples.
  • The G-Path domain sequence is the sequence, in the above-mentioned Pfa1 accessory protein or its homologous protein, corresponding to the K662-G742 fragment of SEQ ID NO: 16 (that is, the sequence of SEQ ID NO: 26), or a variant thereof.
  • The amino acid sequence of the co-activator protein Paf1 is SEQ ID NO: 16, or the amino acid sequence of a variant having at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or at least 99.9% homology to SEQ ID NO: 16, and the co-activator protein Paf1 has the function of activating the Prp43 helicase.
  • the G-Path domain of Paf1 is the K662-G742 fragment of SEQ ID NO: 16.
  • the Prp43 helicase comprises the sequence of SEQ ID NO: 1 or a variant thereof
  • the Pfa1 co-activator protein comprises the sequence of SEQ ID NO: 16 or a variant thereof, or the G-Path domain sequence SEQ ID NO: 26 of SEQ ID NO: 16 (corresponding to the K662-G742 fragment of the sequence of SEQ ID NO: 16) or a variant thereof.
  • the protein constructs described herein can be modified to aid in identification or purification, for example by adding histidine residues (His tag), aspartic acid residues (asp tag), streptavidin tag, Flag tags, SUMO tags, GST tags or MBP tags or Strep TagII tags, or by adding a signal sequence to facilitate their secretion from cells in which the polypeptide does not naturally contain the signal sequence.
  • An alternative to introducing a genetic tag is to chemically attach the tag to a natural or artificial site on the protein construct.
  • The third aspect of the present application provides a nucleic acid encoding the Prp43 helicase described in the first aspect of the present application and/or the protein construct described in the second aspect of the present application.
  • the fourth aspect of the present application provides an expression vector, the expression vector comprising the nucleic acid described in the third aspect of the present application.
  • the nucleic acid is operably linked to a regulatory element in an expression vector, wherein the regulatory element is preferably a promoter.
  • the promoter is selected from T7, trc, lac, ara or λL.
  • the expression vector includes but is not limited to plasmid, virus or phage.
  • Various methods for inserting nucleic acids into nucleic acid constructs or expression vectors are known to those of skill in the art, see e.g. Sambrook and Russell, Molecular Cloning: A Laboratory Manual, 3rd Edition, CSHL Press, Cold Spring Harbor, NY, 2001.
  • the fifth aspect of the present application provides a host cell, the host cell comprising the nucleic acid described in the third aspect of the present application or the expression vector described in the fourth aspect of the present application.
  • the host cells include but are not limited to Escherichia coli.
  • the host cell is selected from BL21(DE3), JM109(DE3), B834(DE3), TUNER, C41(DE3), Rosetta2(DE3), Origami, Origami B, etc.
  • The sixth aspect of the present application relates to a method for preparing the protein construct described in the second aspect of the present application, comprising: providing a polypeptide of SEQ ID NO: 1 or a variant thereof and a polypeptide of SEQ ID NO: 26 or a variant thereof; introducing at least one cysteine residue and/or at least one unnatural amino acid into the polypeptide of SEQ ID NO: 1 or the variant thereof; and then fusing the C-terminus or N-terminus of the resulting polypeptide to the polypeptide of SEQ ID NO: 26 or the variant thereof, forming the protein construct.
  • The seventh aspect of the present application relates to a method for preparing the modified Prp43 helicase described in the first aspect of the present application or the protein construct described in the second aspect of the present application, comprising: culturing the host cell described in the fifth aspect of the present application, inducing expression, and then purifying the resulting expression product.
  • Genetic engineering techniques such as overexpression of enzymes in host cells, genetic modification of host cells, or hybridization techniques are methods known in the art, such as those described in Sambrook and Russell (2001), "Molecular Cloning: A Laboratory Manual" (3rd edition), Cold Spring Harbor Laboratory Press, or F. Ausubel et al., eds., "Current Protocols in Molecular Biology", Greene Publishing and Wiley Interscience, New York (1987).
  • The preparation method of the modified Prp43 helicase comprises: obtaining, according to the amino acid sequence of the Prp43 helicase and/or the co-activator protein or the activation domain described in the present application, a nucleic acid sequence encoding the Prp43 helicase; digesting the nucleic acid sequence and ligating it into the expression vector; transforming the vector into E. coli; and inducing expression and purifying to obtain the Prp43 helicase.
  • Use of Prp43 helicase or protein constructs
  • The Prp43 helicase or protein constructs of the present application can be used to control the movement of polynucleotide molecules or to characterize a polynucleotide of interest.
  • The Prp43 helicase of the present invention is a useful tool for controlling the movement of target polynucleotides during strand sequencing. When provided with the usual components necessary to facilitate movement, the Prp43 helicase moves along DNA or RNA in the 3'-5' direction, but the orientation of the DNA or RNA in the pore (depending on which end of the DNA or RNA is captured) means that the Prp43 helicase can be used to move the DNA or RNA into the pore either against the direction of the applied field or along the direction of the field.
  • the opening of the polynucleotide binding domain or polynucleotide binding moiety on the Prp43 helicase or construct can be effectively reduced
  • the ATP hydrolysis activity or unwinding activity of the modified Prp43 helicase can be effectively increased, thereby improving the ability to control the passage of target polynucleotides through the pore.
  • An eighth aspect of the present application relates to a method for controlling the movement of a polynucleotide molecule, comprising contacting the polynucleotide molecule with the modified Prp43 helicase described in the first aspect of the present application or the protein construct described in the second aspect of the present application.
  • Controlling the movement of the polynucleotide means controlling the movement of the polynucleotide through a pore.
  • the pores are nanopores, and the nanopores are transmembrane pores.
  • the pores may be natural or artificial, including but not limited to biological pores, solid state pores, or hybrid biological and solid state pores.
  • the method may comprise one or more Prp43 helicases jointly controlling the movement of the polynucleotide.
  • a ninth aspect of the present application relates to a method for characterizing a target polynucleotide, the method comprising:
  • steps (a) and (b) are repeated one or more times.
  • any number of the Prp43 helicases described herein can be used in the method.
  • the two or more Prp43 helicases described in this application may be the same or different. Wild-type Prp43 helicases or other types of helicases may also be included.
  • two or more helicases can be linked or just arranged to play the function of controlling the movement of the polynucleotide by binding to the polynucleotide respectively.
  • the method further comprises the step of applying a potential difference across the pore in contact with the helicase or construct, and the polynucleotide of interest.
  • The pores are structures that allow hydrated ions to flow from one side of the membrane to the other side of the membrane, driven by an applied electrical potential.
  • the pores are nanopores, and the nanopores are transmembrane pores.
  • the transmembrane pore provides a channel for the movement of the target polynucleotide.
  • the pores are selected from biological pores, solid-state pores or hybrid pores of biological and solid-state.
  • The pores include, but are not limited to, pores derived from M. smegmatis porin A, M. smegmatis porin B, M. smegmatis porin C, M. smegmatis porin D, hemolysin, lysin, interleukin, outer membrane porin F, outer membrane porin G, outer membrane phospholipase A, WZA or Neisseria autotransport lipoprotein and the like.
  • The membrane may be any membrane known in the prior art, preferably an amphiphilic layer, i.e. a layer formed of amphiphilic molecules, such as phospholipids, having at least one hydrophilic part and at least one lipophilic or hydrophobic part; the amphiphilic molecules can be synthetic or naturally occurring.
  • the membrane is a lipid bilayer membrane.
  • the target polynucleotide can be attached to the membrane using any known method. If the membrane is an amphiphilic layer, such as a lipid bilayer, the polynucleotide is preferably attached to the membrane via a polypeptide present in the membrane or via a hydrophobic anchor present in the membrane.
  • the hydrophobic anchor is preferably lipid, fatty acid, sterol, carbon nanotube or amino acid.
  • The rate of passage of the target polynucleotide through the pore is controlled by the Prp43 helicase or construct, resulting in identifiable and stable current levels for determining the characteristics of the target polynucleotide.
  • the target polynucleotide is single-stranded, double-stranded or at least partially double-stranded.
  • the target polynucleotide can be modified by means of tags, spacers, methylation, oxidation or damage.
  • The target polynucleotide is at least partially double-stranded, wherein the double-stranded portion constitutes a Y-adapter structure comprising a leader sequence that preferentially threads into the pore.
  • the length of the target polynucleotide may be 10-100,000 bases or more.
  • the length of the target polynucleotide can be at least 10, at least 50, at least 100, at least 200, at least 300, at least 400, at least 500, at least 1000, at least 2000, at least 5000, at least 10000, at least 50000 or at least 100000 bases, etc.
  • The helicase binds to an internal nucleotide of the single-stranded polynucleotide.
  • The target polynucleotide is DNA or RNA.
  • When the target polynucleotide is RNA, the RNA is modified to include a non-RNA polynucleotide in order to improve the ability and efficiency with which the RNA to be sequenced passes through the pore.
  • The step of RNA modification comprises linking a DNA leader region to the 3' end of the RNA to be tested. It also includes the step of reverse transcription of the RNA to be tested.
  • The one or more features are selected from the source, length, identity, sequence and secondary structure of the target polynucleotide, and whether the target polynucleotide is modified. Further preferably, the one or more features are measured by electrical and/or optical measurements.
  • electrical and/or optical signals are generated by electrical and/or optical measurements, and each nucleotide corresponds to a signal level, and the electrical and/or optical signals are then converted into nucleotide characteristics.
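  • The idea that each nucleotide corresponds to a signal level can be illustrated with a very simple level detector. The Python sketch below splits a current trace into runs of similar samples and reports the mean and duration of each run. It is a toy illustration under stated assumptions, not the signal processing used in the application: the function name, the fixed threshold and the sample values are hypothetical, and practical base calling relies on far more robust statistics.

```python
def segment_levels(samples: list[float], threshold: float) -> list[tuple[float, int]]:
    """Split a trace into levels; returns (mean current, sample count) per level.

    A new level starts when a sample differs from the running mean of the
    current level by more than `threshold`.
    """
    levels = []
    current: list[float] = []
    for x in samples:
        if current and abs(x - sum(current) / len(current)) > threshold:
            levels.append((sum(current) / len(current), len(current)))
            current = []
        current.append(x)
    if current:
        levels.append((sum(current) / len(current), len(current)))
    return levels


# Hypothetical current samples in pA, with three visibly distinct levels.
trace = [50.0, 51.0, 49.0, 70.0, 71.0, 69.0, 40.0, 41.0]
for mean, count in segment_levels(trace, threshold=5.0):
    print(f"level at {mean:.1f} pA lasting {count} samples")
```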
  • the electrical measurement includes, but is not limited to, current measurement, impedance measurement, tunnel measurement, wind tunnel measurement, or field effect transistor (FET) measurement, and the like.
  • the electrical signals described herein are selected from measurements of current, voltage, tunneling, resistance, potential, conductivity or lateral electrical measurements.
  • the electrical signal is an electrical current through the pore.
  • the characterization further includes applying an improved Viterbi algorithm.
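  • The application does not describe here how its improved Viterbi algorithm differs from the standard one, so the following Python sketch shows only the textbook Viterbi decoder as background. It finds the most probable sequence of hidden states for a series of current measurements, assuming a hidden Markov model with Gaussian emissions. The state names, mean currents, standard deviation and transition probabilities are all hypothetical values chosen for the example.

```python
import math


def gaussian_logpdf(x: float, mean: float, sd: float) -> float:
    """Log density of a normal distribution."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def viterbi(observations, states, log_start, log_trans, means, sd):
    """Most probable state path for the observations (standard Viterbi)."""
    if not observations:
        return []
    # best[s]: log-probability of the best path that ends in state s.
    best = {
        s: log_start[s] + gaussian_logpdf(observations[0], means[s], sd)
        for s in states
    }
    back = []  # one dict of back-pointers per observation after the first
    for x in observations[1:]:
        pointers, new_best = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[p] + log_trans[p][s])
            pointers[s] = prev
            new_best[s] = (
                best[prev] + log_trans[prev][s] + gaussian_logpdf(x, means[s], sd)
            )
        back.append(pointers)
        best = new_best
    # Trace the best path backwards from the best final state.
    path = [max(states, key=lambda s: best[s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical two-state model: mean currents in pA.
states = ["low", "high"]
means = {"low": 40.0, "high": 70.0}
log_start = {s: math.log(0.5) for s in states}
log_trans = {
    "low": {"low": math.log(0.9), "high": math.log(0.1)},
    "high": {"low": math.log(0.1), "high": math.log(0.9)},
}
print(viterbi([41.0, 39.0, 68.0, 72.0, 42.0], states, log_start, log_trans, means, sd=5.0))
```

  • Working in log-probabilities avoids numerical underflow on long traces, which matters because a single read can contain many thousands of measurements.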
  • A tenth aspect of the present application relates to the use of the modified Prp43 helicase of the first aspect of the present application or the protein construct of the second aspect of the present application in characterizing a polynucleotide of interest or in controlling the movement of a polynucleotide of interest through a pore.
  • An eleventh aspect of the present application relates to an analytical device for characterizing a target polynucleotide, said analytical device comprising one or more nanopores, one or more modified Prp43 helicases according to the first aspect of the present application or protein constructs according to the second aspect of the present application, and one or more containers.
  • the analysis device is selected from kits, devices or sensors.
  • the analysis device is a kit, and the kit further includes a chip comprising a lipid bilayer.
  • the pores span the lipid bilayer.
  • the kits described herein comprise one or more lipid bilayers, each lipid bilayer comprising one or more of the described pores.
  • the kits described herein also include reagents or devices for carrying out the characterization of the polynucleotide of interest.
  • the reagents include buffers and tools required for PCR amplification.
  • A twelfth aspect of the present application relates to a method of forming a sensor for characterizing a polynucleotide of interest, comprising providing a nanopore, and forming a complex between the nanopore and the modified Prp43 helicase described in the first aspect of the present application or the protein construct described in the second aspect of the present application.
  • Figure 1 shows a schematic 3D structure of an N-terminal (M1-N60) truncated wild-type Prp43 helicase (SEQ ID NO: 1) from Chaetomium thermophilum.
  • Figure 2 shows N-terminal (M1-N60) truncated wild-type Prp43 helicase, modified Prp43 helicase Prp43-2 (F181C/N623C/C508S), modified Prp43 helicase Prp43-3 (P352C/S619C/C508S), N-terminal (M1-N60) truncated protein construct Prp43-GP, N-terminal (M1-N60) truncated protein construct Prp43-GP-2 (F181C/N623C/C508S) and N-terminal (M1-N60) truncated protein constructs Prp43-GP-3 (P352C/S619C/C508S) single-stranded DNA-dependent ATP hydrolysis activity assay.
  • Figure 3 shows N-terminal (M1-N60) truncated wild-type Prp43 helicase, modified Prp43 helicase Prp43-2 (F181C/N623C/C508S), modified Prp43 helicase Prp43-3 (P352C/S619C/C508S), N-terminal (M1-N60) truncated protein construct Prp43-GP, N-terminal (M1-N60) truncated protein construct Prp43-GP-2 (F181C/N623C/C508S) and N-terminal (M1-N60) truncated protein constructs Prp43-GP-3 (P352C/S619C/C508S) single-stranded RNA-dependent ATP hydrolysis activity assay.
  • Figure 4 shows the affinity curves of N-terminal (M1-N60) truncated wild-type Prp43 helicase, N-terminal (M1-N60) truncated protein construct Prp43-GP and N-terminal (M1-N60) truncated protein construct Prp43-GP-2 (F181C/N623C/C508S) with single-stranded DNA under low salt conditions.
  • Figure 5 shows the results of gel migration experiments for N-terminal (M1-N60) truncated wild-type Prp43 helicase, N-terminal (M1-N60) truncated protein construct Prp43-GP and N-terminal (M1-N60) truncated protein construct Prp43-GP-2 (F181C/N623C/C508S).
  • lane 1 is the T44-37-FAM substrate
  • lane 2 is the complex bound by the wild-type Prp43 helicase and T44-37-FAM substrate
  • lane 3 is the wild-type Prp43 helicase and T44-37-FAM
  • lane 4 is the complex of Prp43-GP helicase and T44-37-FAM substrate binding
  • lane 5 is Prp43-GP helicase and T44-37-FAM substrate
  • lane 6 is the Prp43-GP-2 helicase mutant and T44-37-FAM substrate bound complex
  • lane 7 is the product of TMAD-catalyzed treatment after binding of the Prp43-GP-2 helicase mutant to the T44-37-FAM substrate.
  • Figure 6 shows a schematic diagram of DNA construct X, wherein the 5' end of SEQ ID NO: 32 (region A) is connected to 4 iSpC3 spacers (region B), and the spacers are connected to the 3' end of SEQ ID NO: 33 (region C); the 5' end of the region C sequence is connected to SEQ ID NO: 34 (region D); and SEQ ID NO: 35 (region E) of the construct hybridizes with SEQ ID NO: 36 (region F, which has a 3' cholesterol tether).
  • Figure 7 shows an example of current traces as N-terminal (M1-N60) truncated protein construct Prp43-GP-2 (F181C/N623C/C508S) controls DNA construct X movement through the MspA nanopore (y-axis coordinates are Current (pA, 0 to 100), x-coordinate is time (h:m:s)).
  • Figure 8 shows a schematic diagram of RNA construct Y, wherein SEQ ID NO: 37 (labeled D) has its 3' end linked to 20 iSpC3 spacers (labeled A) and its 5' end linked to 4 iSpC3 spacers (labeled B), which are linked to the 3' end of SEQ ID NO: 38 (labeled C); the SEQ ID NO: 39 (labeled E) region of the construct hybridizes with SEQ ID NO: 40 (labeled F, which has a 3' cholesterol tether).
  • Figure 9 shows an example of the current trajectory of N-terminal (M1-N60) truncated protein construct Prp43-GP-2 (F181C/N623C/C508S) controlling RNA construct Y through the MspA nanopore (y-axis coordinates are current (pA, 0 to 100), the x-axis coordinate is time (h:m:s)).
  • Wild-type Prp43 helicase, modified Prp43 helicases and protein constructs were prepared using standard molecular biology methods, the principles and procedures of which are well known to those skilled in the art (see references cited herein).
  • N-terminal truncated wild-type Prp43 helicase (i.e., the T61-A764 fragment): the nucleic acid sequence (SEQ ID NO: 28, provided by GenScript Biotechnology Co., Ltd.) encoding the N-terminal truncated Prp43 helicase T61-A764 fragment, which corresponds to the Prp43 helicase amino acid sequence of SEQ ID NO: 1 with the M1 to N60 fragment of the N-terminal domain removed, was ligated into the vector pGS-21a (GenScript Biotechnology Co., Ltd., Cat. No. SD0121) by restriction digestion and ligation. After being verified by sequencing, it was transformed into the expression-competent host cell BL21(DE3) (Beijing Quanshijin Biotechnology Co., Ltd., Cat. No. CD601-02).
  • IPTG isopropyl-β-D-thiogalactoside
  • the supernatant is subjected to subsequent protein chromatography purification, including nickel ion affinity chromatography, ion exchange chromatography and molecular sieve separation.
  • The target protein after GST tag removal was detected by SDS-PAGE gel electrophoresis.
  • The truncated Prp43 protein (with M1 to N60 of the N-terminal domain removed) after excision of the tag was detected by SDS-PAGE, which showed that the size of the target protein was correct and that it could be used for subsequent testing and analysis.
  • N-terminal truncated Prp43 helicase T61-A764 fragment fused to the GP domain, protein mutant Prp43-GP-2 (F181C/N623C/C508S) (i.e. SEQ ID NO: 27): this was prepared by the same method as the N-terminal truncated Prp43 helicase T61-A764 fragment, except that the starting nucleic acid sequence corresponding to the N-terminal truncated Prp43 helicase T61-A764 fragment (SEQ ID NO: 28) was replaced by SEQ ID NO: 30. The protein construct Prp43-GP-2 after excision of the tag was detected by SDS-PAGE, which showed that the size of the target protein was correct and that it could be used for subsequent testing and analysis.
  • The following modified N-terminal (M1-N60) truncated Prp43 helicases and protein constructs were prepared: modified Prp43 helicase Prp43-2 (F181C/N623C/C508S), modified Prp43 helicase Prp43-3 (P352C/S619C/C508S), N-terminal (M1-N60) truncated protein construct Prp43-GP and N-terminal (M1-N60) truncated protein construct Prp43-GP-3 (P352C/S619C/C508S).
  • the ATP hydrolysis activity of Prp43-GP-3 (P352C/S619C/C508S) was tested when bound to or incubated with single-stranded DNA or single-stranded RNA substrates.
  • the ATPase activity of Prp43 helicase was measured by absorbance spectrophotometry.
  • the specific steps are to prepare a premixed solution containing 50 µM phosphate by transferring 50 µL of phosphate standard solution into 950 µL of ultrapure water, and to number the tubes.
  • 160 µL of working reagent was added to each background blank well to stop the reaction. The initial 30 min incubation is not required, after which the background blank reading can be subtracted from the sample reading. Set up the reactions according to the scheme of Table 4 and Table 5. Each sample, background blank, or negative control reaction requires 70 µL of reaction mix.
  • Figures 2 and 3 show the ATP hydrolysis activities of the N-terminal (M1-N60) truncated wild-type Prp43 helicase and the modified Prp43 helicases or protein constructs after binding to DNA or RNA, respectively. Figures 2 and 3 show that after the G-Path activation domain was fused to the C-terminus of the Prp43 helicase or mutant, the ATP hydrolysis activity of the enzyme was significantly improved; after cysteines were introduced, the ATP hydrolysis activity was also improved.
  • This example uses fluorescence polarization to test the affinity for single-stranded DNA of the N-terminal (M1-N60) truncated wild-type Prp43 helicase and the modified protein constructs Prp43-GP and Prp43-GP-2 (F181C/N623C/C508S).
  • N-terminal (M1-N60) truncated wild-type helicase or modified helicase was diluted to the following concentration series: 800 nM, 400 nM, 200 nM, 100 nM, 50 nM, 25 nM, 12.5 nM, 6.25 nM, 3.125 nM, 1.56 nM and a blank. The enzyme and 10 nM single-stranded DNA substrate were incubated in Binding Buffer (10 mM HEPES, 50 mM KCl, 5% glycerol, pH 7.0) for 20 min, the polarization value was read with 530 nm excitation and 560 nm emission, and the data were fitted to draw the affinity curve; three replicates were set up for each enzyme concentration.
  • the fitting results are shown in Fig. 4.
  • the N-terminal (M1-N60) truncated Prp43 helicase is fused with a G-Path domain at the C-terminal, that is, Prp43-GP helicase, or the Prp43-GP helicase is based on Prp43-GP.
  • the binding of the N-terminal (M1-N60) truncated wild-type Prp43 helicase or the modified protein constructs Prp43-GP and Prp43-GP-2 was detected by gel shift assay, together with the enhancement of nucleic acid binding after the TMAD catalyst catalyzed formation of a disulfide bond between the mutant sites F181C and N623C in the mutants.
  • the experimental conditions are as follows: 30 nM of the FAM fluorophore-labeled single-stranded polythymidine substrate T44-37-FAM was added to buffer (10 mM HEPES, 50 mM KCl, pH 7.0); the wild-type Prp43 helicase and the modified Prp43-2 and Prp43-GP-2 helicases were then added to a final concentration of 120 nM and incubated at 30 °C for 1.5 h; TMAD cross-linking agent was added at 1000 times the final enzyme concentration to catalyze cross-linking of the cysteines at the mutation sites, followed by incubation at 30 °C for 1.5 h.
  • DNA construct X as shown in Figure 6 was prepared: the 5' end of the A region sequence (SEQ ID NO: 32) is connected to 4 iSpC3 spacers (region B), which are connected to the 3' end of the C region sequence (SEQ ID NO: 33); the 5' end of the C region sequence is connected to the D region sequence (SEQ ID NO: 34); and the E region sequence of the construct (SEQ ID NO: 35) is hybridized to the F region sequence (SEQ ID NO: 36, which has a 3' cholesterol tether).
  • the ligated fragment synthesized from the A, B, C and D segments, at a concentration of 10 µM, was added with the E and F fragments in a 1:1:1 ratio to annealing buffer (10 mM Tris, pH 7.0, 50 mM NaCl).
  • annealing was carried out with the program 98 °C for 10 min, −0.1 °C per 0.6 s for 300 cycles, 65 °C for 5 min, −0.1 °C per 0.6 s for 400 cycles (the A, B, C, D, E and F fragments were provided by Sangon Bioengineering (Shanghai) Co., Ltd.).
  • the prepared DNA construct X and the modified mutant helicase Prp43-GP-2 (F181C/N623C/C508S) or the N-terminally truncated wild-type Prp43-GP were pre-incubated at 25 °C for 30 minutes in buffer (10 mM HEPES, pH 8.0, 50 mM NaCl, 5% glycerol); TMAD catalyst was then added at 1000 times the helicase concentration and incubated at room temperature for 30 minutes.
  • electrical measurement signals were obtained from an MspA nanopore (MspA protein sequence SEQ ID NO: 31; prepared as described in Michael Faller et al., "The Structure of a Mycobacterial Outer-Membrane Channel", Science 303, 1189 (2004), DOI: 10.1126/science.1094114).
  • a bilayer was formed by the Montal-Mueller technique across ~25 µm diameter holes in the PTFE membrane, separating two ~100 µL buffer solutions. All experiments were performed in the described buffer. Single-channel current was measured using an amplifier equipped with a digitizer. The Ag/AgCl electrodes were connected to the buffer such that the cis compartment was connected to the ground of the amplifier and the trans compartment was connected to the active electrode.
  • the RNA construct Y shown in Figure 8 was prepared: the 3' end of the D region sequence (SEQ ID NO: 37) is connected to 20 iSpC3 spacers (region A) and its 5' end is connected to 4 iSpC3 spacers (region B); these spacers are connected to the 3' end of the C region sequence (SEQ ID NO: 38); and the E region sequence of the construct (SEQ ID NO: 39) is hybridized to the F region sequence (SEQ ID NO: 40, which has a 3' cholesterol tether).
  • the ligated fragment synthesized from the A, B, C and D segments, at a concentration of 10 µM, was added with the E and F fragments in a 1:1:1 ratio to annealing buffer (10 mM Tris, pH 7.0, 50 mM NaCl).
  • annealing was carried out with the program 98 °C for 10 min, −0.1 °C per 0.6 s for 300 cycles, 65 °C for 5 min, −0.1 °C per 0.6 s for 400 cycles (the A, B, C, D, E and F fragments were provided by Sangon Bioengineering (Shanghai) Co., Ltd.).
  • RNA constructs were pre-incubated with Prp43-GP-2 or N-terminally truncated wild-type Prp43-GP in buffer (10 mM HEPES, pH 7.0, 50 mM NaCl) at 30°C for 30 minutes.
  • the MspA nanopore (MspA protein sequence SEQ ID NO: 31) was prepared as described in Michael Faller et al., "The Structure of a Mycobacterial Outer-Membrane Channel", Science 303, 1189 (2004), DOI: 10.1126/science.1094114.
  • a bilayer was formed by the Montal-Mueller technique across ~25 µm diameter holes in the PTFE membrane, separating two ~100 µL buffer solutions.
  • the RNA polynucleotide construct and the Prp43-GP-2 helicase or the N-terminally truncated wild-type Prp43-GP were added to the 70 µL of buffer in the cis compartment of the electrophysiology chamber to initiate capture of the helicase-RNA complex in the nanopore.
  • Helicase ATPase activity was activated by adding divalent metal (5 mM MgCl2) and NTP (5 mM ATP) to the cis compartment as needed. Experiments were carried out at a constant potential of +180 mV.
  • the results show that movement of the RNA construct is controlled by the Prp43-GP-2 helicase; the helicase-controlled RNA movement is shown in FIG. 9.
  • the RNA movement controlled by the Prp43-GP-2 helicase lasted 3 seconds and corresponded to the translocation of approximately 30 bp of the RNA construct through the nanopore.
  • with the N-terminal truncated wild-type Prp43 (T61-A764 fragment) or the N-terminal truncated construct Prp43-GP, it was difficult to obtain a continuous current signal generated by the A/B/C/D fragment of construct Y passing through the nanopore.
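
The annealing program listed above (98 °C for 10 min, then −0.1 °C per 0.6 s for 300 cycles; 65 °C for 5 min, then −0.1 °C per 0.6 s for 400 cycles) can be sanity-checked with a little arithmetic. The following is a minimal sketch, not part of the application; the function name `ramp` and its parameters are hypothetical. It works in tenths of a degree and in milliseconds so that hundreds of small steps add up exactly.

```python
def ramp(start_c, cycles, step_tenths=-1, ms_per_step=600):
    """Return (end temperature in °C, ramp duration in seconds).

    The step is given in tenths of a degree and the step time in
    milliseconds so that repeated small steps do not drift.
    """
    end_tenths = start_c * 10 + step_tenths * cycles
    return end_tenths / 10, ms_per_step * cycles / 1000


first_end, first_seconds = ramp(98, 300)    # ramp that follows the 98 °C hold
second_end, second_seconds = ramp(65, 400)  # ramp that follows the 65 °C hold

hold_seconds = (10 + 5) * 60                # the two holds: 10 min and 5 min
total_minutes = (hold_seconds + first_seconds + second_seconds) / 60

print(f"first ramp:  98 -> {first_end} °C in {first_seconds:.0f} s")
print(f"second ramp: 65 -> {second_end} °C in {second_seconds:.0f} s")
print(f"whole program: {total_minutes:.0f} min")
```

Under these numbers the first ramp ends at 68 °C, the second at 25 °C, and the whole program takes about 22 minutes, not counting instrument transition times.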

Abstract

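A modified Prp43 helicase and uses thereof. Owing to the introduction of mutations and/or an accessory protein, the ATP hydrolysis activity or unwinding activity of the Prp43 helicase is enhanced and the helicase can remain bound to the target polynucleotide for a long time, which allows the enzyme to control the movement speed of the polynucleotide continuously and stably at a rate suitable for sequencing; the helicase can be used for nanopore sequencing.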

Description

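Modified Prp43 helicase and uses thereof
Technical Field
The present application relates to nucleic acid sequencing technology.
Background
Nanopore sequencing is a third-generation nucleic acid sequencing technology that obtains DNA/RNA sequence information by recording the distinct electrical signals produced by different bases as a DNA/RNA strand passes through a nanopore. One of the challenges facing nanopore sequencing is that DNA/RNA molecules often pass through the nanopore too quickly, exceeding the resolution of the instrument, so it is difficult to obtain accurate electrical signals that reflect the sequence information. How to control or slow the passage of DNA/RNA molecules through the nanopore is therefore critical to improving the accuracy of nanopore sequencing.
At present, an emerging method of characterizing polynucleotides involves contact and interaction among a transmembrane pore, a helicase and a polynucleotide, such that the helicase controls the movement of the target polynucleotide through the nanopore and increases the dwell time of the polynucleotide at the nanopore.
For example, patents WO2013057495A3 and US20150191709A1 disclose a new method of characterizing a target polynucleotide that uses a pore and a Hel308 helicase or a molecular motor capable of binding to internal nucleotides of the target polynucleotide. The helicase or molecular motor described in that invention can effectively control the movement of the target polynucleotide through the pore. In addition, patents US20150065354A1 and US9617591B2 disclose a method of characterizing a target polynucleotide using an XPD helicase, which employs a pore and the XPD helicase. The XPD helicase described in that invention can control the movement of the target polynucleotide through the pore. Furthermore, patents US20160257942A1 and US20180179500A1 disclose that the Dda helicase derived from bacteriophage T4, and certain of its homologous proteins, can after modification be applied to sequencing of polynucleotides passing through a pore.
In nature, helicases are divided into six superfamilies (SF). Helicases of the SF1 and SF2 superfamilies perform translocation and unwinding as monomers, whereas the SF3-SF6 families act as multimers. In nanopore sequencing applications, helicases that act as monomers are simpler to use and behave more uniformly. Helicases of the SF1 and SF2 superfamilies are classified into families according to protein sequence homology, domain arrangement, substrate binding mode and specificity, unwinding polarity (5'-3' or 3'-5'), and the mechanism of unwinding or translocation. The SF1 superfamily comprises the UvrD/Rep family, the Upf1-like family and the Pif1-like family; the SF2 superfamily comprises the Rad3/XPD family, the Ski2-like family, the DEAH/RHA family, the NS3/NPH-II family, the DEAD-box family, the RIG-I-like family, the RecQ-like family, the RecG-like family, the Swi/Snf family and the T1R family. RecD and the bacteriophage T4 Dda helicase both belong to the Pif1-like family of the SF1 superfamily; they prefer single-stranded DNA substrates and translocate and unwind in the 5'-3' direction, and other common helicases of this family include Pif1 helicase and TrwC helicase. The Hel308 helicase from Methanococcoides burtonii (as disclosed in US20150191709A1) belongs to the Ski2-like family of the SF2 superfamily; it can use either single-stranded DNA or RNA as substrate and translocates along or unwinds double-stranded nucleic acid with 3'-5' polarity, and other common helicases of this family include Ski2 helicase, Brr2 helicase and Mtr4 helicase. XPD helicase belongs to the Rad3/XPD family of the SF2 superfamily; it binds single-stranded DNA specifically and translocates along or unwinds double-stranded nucleic acid with 5'-3' polarity, and other common enzymes of this family include Rad3 helicase.
Although the prior art discloses a variety of helicases usable in nanopore sequencing, each helicase has its own advantages, disadvantages and suitable conditions, and these helicases still struggle to meet the more demanding requirements that scientific research and medical technology place on nucleic acid sequencing in many respects. There therefore remains a need for new helicases that can be used for nanopore sequencing of nucleic acids, so as to improve the applicability, accuracy and sensitivity of nanopore sequencing.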
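Summary of the Invention
The inventors found that Prp43 helicase, in particular modified Prp43 helicase, can control the movement of polynucleotide molecules through a nanopore and can therefore be used in nanopore sequencing.
Accordingly, a first aspect of the present application relates to a modified Prp43 helicase comprising a RecA1 domain, a RecA2 domain and a Ratchet domain, wherein, relative to the corresponding wild-type Prp43 helicase or a fragment thereof, the modified Prp43 helicase comprises insertions or substitutions of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 or more cysteines, and/or insertions or substitutions of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 or more unnatural amino acids, introduced into at least one domain selected from the RecA1 domain, the RecA2 domain and the Ratchet domain.
A second aspect of the present application relates to a protein construct comprising the modified Prp43 helicase of the first aspect and, fused to the C-terminus or N-terminus of the Prp43 helicase, the G-Path domain of the accessory activating protein Paf1 or a Paf1 fragment containing the G-Path domain.
A third aspect of the present application relates to a nucleic acid encoding the modified Prp43 helicase of the first aspect or the protein construct of the second aspect.
A fourth aspect of the present application relates to an expression vector comprising the nucleic acid of the third aspect.
A fifth aspect of the present application relates to a host cell comprising the nucleic acid of the third aspect or the expression vector of the fourth aspect.
A sixth aspect of the present application relates to a method of preparing the protein construct of the second aspect, comprising: providing the polypeptide of SEQ ID NO: 1 or a variant thereof and the polypeptide of SEQ ID NO: 26 or a variant thereof; introducing at least one cysteine residue and/or at least one unnatural amino acid into the polypeptide of SEQ ID NO: 1 or the variant thereof; and then fusing the polypeptide of SEQ ID NO: 26 or the variant thereof to the C-terminus or N-terminus of the resulting polypeptide to form the protein construct.
A seventh aspect of the present application relates to a method of preparing the modified Prp43 helicase of the first aspect or the protein construct of the second aspect, comprising culturing the host cell of the fifth aspect, inducing expression, and then purifying the resulting expression product.
An eighth aspect of the present application relates to a method of controlling the movement of a polynucleotide molecule, comprising contacting the polynucleotide molecule with the modified Prp43 helicase of the first aspect or the protein construct of the second aspect.
A ninth aspect of the present application relates to a method of characterizing a target polynucleotide, the method comprising:
(a) contacting the target polynucleotide with the modified Prp43 helicase of the first aspect or the protein construct of the second aspect, such that the Prp43 helicase or protein construct controls the movement of the target polynucleotide through a nanopore; and (b) obtaining one or more characteristics of the interaction between nucleotides in the target polynucleotide and the nanopore, thereby characterizing the target polynucleotide.
A tenth aspect of the present application relates to use of the modified Prp43 helicase of the first aspect or the protein construct of the second aspect in characterizing a target polynucleotide or in controlling the movement of a target polynucleotide through a pore.
An eleventh aspect of the present application relates to an analytical device for characterizing a target polynucleotide, comprising one or more nanopores, one or more modified Prp43 helicases of the first aspect or protein constructs of the second aspect, and one or more containers.
A twelfth aspect of the present application relates to a method of forming a sensor for characterizing a target polynucleotide, comprising providing a nanopore and forming a complex between the nanopore and the modified Prp43 helicase of the first aspect or the protein construct of the second aspect.
The present application provides a novel Prp43 helicase mutant, or construct thereof, usable for nanopore sequencing of nucleic acids. Owing to the introduction of mutations and/or an accessory protein, its ATP hydrolysis activity or unwinding activity is enhanced, and/or it can remain bound to the target polynucleotide for a long time, allowing continuous and stable control of the polynucleotide's movement speed. The Prp43 helicase mutant or construct of the present application can therefore continuously control the movement of the target polynucleotide through the pore at a rate suitable for sequencing, thereby improving the throughput and accuracy of nanopore sequencing.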
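Detailed Description
Definitions
To explain the embodiments of the invention more clearly, certain scientific terms and proper nouns are used herein. Unless expressly defined herein, all such terms should be understood to have the meanings commonly understood by those skilled in the art. For greater clarity, certain terms used herein are defined below.
The term "polypeptide" refers to a molecule comprising amino acid residues linked by peptide bonds and containing more than five amino acid residues. A polypeptide typically comprises 20 or more amino acids, preferably 50 or more amino acids, or 100 or more amino acids. Herein, the terms "protein" and "polypeptide" are considered to have the same meaning and are used interchangeably. A polypeptide may optionally be modified (for example glycosylated, phosphorylated, acylated, farnesylated, prenylated or sulfonated) to increase its functionality or activity. A polypeptide that exhibits activity under certain conditions in the presence of a specific substrate may be called an "enzyme". It should be understood that, owing to the degeneracy of the genetic code, multiple nucleotide sequences encoding a given polypeptide can be produced.
"Nucleic acid" as used herein is the collective term for deoxyribonucleic acid (DNA) and ribonucleic acid (RNA), biological macromolecules formed by the polymerization of many nucleotide monomers. Herein, the terms "nucleic acid" and "polynucleotide" are considered to have the same meaning and are used interchangeably.
A nucleotide monomer consists of a five-carbon sugar, a phosphate group and a nitrogenous base. If the sugar is ribose, the polymer formed is RNA; if the sugar is deoxyribose, the polymer formed is DNA. The nitrogenous bases in nucleotides include, but are not limited to, adenine (A), guanine (G), thymine (T), uracil (U) and cytosine (C). The nucleotides may be naturally occurring or synthetic. "Nucleotides" as used herein therefore include, but are not limited to: adenosine monophosphate (AMP), guanosine monophosphate (GMP), thymidine monophosphate (TMP), uridine monophosphate (UMP), cytidine monophosphate (CMP), cyclic adenosine monophosphate (cAMP), cyclic guanosine monophosphate (cGMP), deoxyadenosine monophosphate (dAMP), deoxyguanosine monophosphate (dGMP), deoxythymidine monophosphate (dTMP), deoxyuridine monophosphate (dUMP) and deoxycytidine monophosphate (dCMP). Preferably, the nucleotide is selected from AMP, TMP, GMP, CMP, UMP, dAMP, dTMP, dGMP or dCMP.
In the present application, a "fragment" of a polypeptide or polypeptide domain refers to a polypeptide or polypeptide domain in which one or more (for example several, tens or 100) amino acid residues are deleted from the amino and/or carboxyl terminus, but which retains the desired activity. For example, a fragment of Prp43 helicase denotes a polypeptide sequence in which one or more (for example 1-5, 1-10, 1-20, 1-50, 1-100, 1-150 or 1-200, or for example 20, 30, 40, 50, 60, 70, 80 or 90) amino acid residues are deleted from the amino and/or carboxyl terminus of wild-type Prp43 but which retains helicase activity.
Typically, a fragment of a polypeptide or domain comprises at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 92%, 94%, 96%, 98% or 99% of the length of its original sequence. In the present application, depending on the length of the original polypeptide or domain, a fragment of a polypeptide or domain comprises at least 50 amino acids, for example at least 60, at least 70, at least 80, at least 90, at least 100, at least 150, at least 200, at least 250, at least 300, at least 350, at least 400, at least 500, at least 650 or at least 700 amino acids. In the present application, a fragment of a polypeptide or domain may also comprise fewer than 700 amino acids, for example fewer than 600, fewer than 500, fewer than 400, fewer than 300, fewer than 200 or fewer than 100 amino acids.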
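
As a worked example of the length criterion, the T61-A764 fragment preferred later in this application keeps residues 61 through 764 of SEQ ID NO: 1. The sketch below is illustrative only: the function name is hypothetical, and it assumes that A764 is the C-terminal residue of SEQ ID NO: 1 (so that the full length is 764), as the domain boundaries given in the description suggest.

```python
def fragment_stats(first, last, full_length):
    """Length of an inclusive residue range and its share of the full sequence."""
    if not (1 <= first <= last <= full_length):
        raise ValueError("fragment must lie within the full-length sequence")
    length = last - first + 1
    return length, length / full_length


# T61-A764 of an assumed 764-residue SEQ ID NO: 1
length, fraction = fragment_stats(61, 764, 764)
print(f"{length} residues, {fraction:.1%} of full length")
```

With these assumptions the fragment is 704 residues long, about 92% of the original sequence, which falls within the ranges listed above.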
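The term "expression" includes any step involved in the production of a polypeptide, including but not limited to transcription, post-transcriptional modification, translation, post-translational modification and secretion.
An "expression vector" comprises a polynucleotide encoding a polypeptide, operably linked to appropriate control sequences (for example a promoter and transcriptional and translational stop signals) for expression and/or translation in vitro. The expression vector may be any vector (for example a plasmid or virus) that can conveniently be subjected to recombinant DNA procedures and can bring about expression of the polynucleotide. The choice of vector will generally depend on the compatibility of the vector with the cell into which it is to be introduced. The vector may be a linear or closed circular plasmid. The vector may be an autonomously replicating vector, that is, a vector that exists as an extrachromosomal entity whose replication is independent of chromosomal replication, for example a plasmid, an extrachromosomal element, a minichromosome or an artificial chromosome. Alternatively, the vector may be one that, when introduced into the host cell, integrates into the genome and replicates together with the chromosome into which it has integrated. An integrative cloning vector may integrate at a random or predetermined target locus in the chromosome of the host cell. The vector system may be a single vector or plasmid, or two or more vectors or plasmids that together contain the total DNA to be introduced into the host cell genome, or a transposon.
The term "control sequence" as used herein refers to components involved in regulating the expression of a coding sequence in a particular organism or in vitro. Examples of control sequences are transcription initiation sequences, termination sequences, promoters, leader sequences, signal peptides, propeptides, prepropeptides or enhancer sequences; Shine-Dalgarno sequences, repressor or activator sequences; efficient RNA processing signals such as splicing and polyadenylation signals; sequences that stabilize cytoplasmic mRNA; sequences that enhance translation efficiency (for example ribosome binding sites); sequences that enhance protein stability; and, when desired, sequences that enhance protein secretion.
A "host cell" as defined herein is an organism that is suitable for genetic manipulation and can be used to produce a target product (such as the Prp43 helicase described in this application). The host cell may be a host cell found in nature, or a host cell derived from a parent host cell by genetic manipulation or classical mutagenesis. Advantageously, the host cell is a recombinant host cell. The host cell may be a prokaryotic, archaeal or eukaryotic host cell. A prokaryotic host cell may be, but is not limited to, a bacterial host cell. A eukaryotic host cell may be, but is not limited to, a yeast, fungal, amoeba, algal, plant, animal or insect host cell.
When used with reference to a nucleic acid or protein (or enzyme), the term "recombinant" means that the sequence of the nucleic acid or protein (or enzyme) has been modified by human intervention compared with its native form. When referring to a cell (for example a host cell), the term "recombinant" means that the genome of the cell has been sequence-modified by human intervention compared with its native form. Herein, the terms "recombinant" and "modified" are considered synonymous.
When used with reference to a modified polypeptide or enzyme, the term "substitution" means that a natural amino acid residue present in the corresponding wild-type polypeptide or enzyme is replaced by another amino acid residue. Herein, the terms "amino acid substitution" and "amino acid replacement" are considered synonymous.
The terms "variant" and "mutant" as used herein have the same meaning and are used interchangeably. They may refer to polypeptides or nucleic acids. A variant has substitutions, insertions, deletions, truncations, transversions or the like at one or more positions relative to a reference sequence (usually the wild-type form of the nucleic acid or polypeptide). Variants can be produced by, for example, site-saturation mutagenesis, scanning mutagenesis, insertional mutagenesis, random mutagenesis, site-directed mutagenesis and directed evolution, as well as various other recombination methods known to those skilled in the art. Variant genes of nucleic acids can be synthesized artificially by techniques known in the art.
A "mature polypeptide" is defined herein as a polypeptide in its final form, obtained after translation of mRNA into the polypeptide and post-translational modification of the polypeptide. Post-translational modifications include N-terminal processing, C-terminal truncation, glycosylation, phosphorylation, and removal of leader sequences (such as signal peptides and/or propeptides) by cleavage.
The similarity between two polypeptide sequences or nucleic acid sequences can be expressed as their homology. Herein, "identity" and "homology" between two sequences are considered to have the same meaning and are used interchangeably. To determine the percentage sequence homology or sequence identity of two amino acid sequences or two nucleic acid sequences, the sequences are aligned for optimal matching; sequence identity is the percentage of identical matches between the two sequences over the aligned region. The percentage sequence homology between two amino acid sequences or between two polynucleotide sequences can be determined using well-known algorithms, for example the Needleman and Wunsch algorithm for aligning two sequences (Needleman, S. B. and Wunsch, C. D. (1970) J. Mol. Biol. 48, 443-453), for instance using the NEEDLE program from the EMBOSS package. Those skilled in the art will understand that different algorithms, or different parameters of a given algorithm, may give slightly different results, but the percentage identity between two sequences will not change significantly.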
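
To make the identity calculation concrete, the following is a minimal sketch of a Needleman-Wunsch global alignment followed by a percent-identity count. It is a simplification for illustration only: it scores with a flat match/mismatch/gap scheme chosen here as an assumption, whereas programs such as EMBOSS NEEDLE use substitution matrices and affine gap penalties, so real results will differ. The function names are hypothetical.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align two sequences; return the two gapped strings."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    def pair(i, j):
        # Score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + pair(i, j),  # align the two residues
                score[i - 1][j] + gap,             # gap in b
                score[i][j - 1] + gap,             # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a, aligned_b):
    """Identical columns as a percentage of the alignment length."""
    matches = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


aligned = needleman_wunsch("DEAH", "DEAD")
print(aligned, percent_identity(*aligned))
```

For the two four-residue strings above, the alignment is ungapped and three of four columns match, giving 75% identity.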
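"Opening" as used in this application refers to the opening of the polynucleotide binding domain carried by the wild-type Prp43 helicase itself, and may also refer to the opening of a polynucleotide binding moiety attached to the Prp43 helicase. The opening is one through which the polynucleotide can dissociate from the Prp43 helicase; it need not be present at all times, but at least one opening is present in at least one conformational state. The "modified Prp43 helicase" described in this application, or a construct comprising the modified Prp43 helicase, contains one or more openings. The Prp43 helicase is modified such that two or more parts of the same helicase monomer are connected, reducing the size of the opening.
"More than one", "at least one" and "one or more" as used in this application include: one, two, three, four, five, six, seven, eight, nine, ten, eleven, twelve or more.
"More than two" and "two or more" as used in this application include: two, three, four, five, six, seven, eight, nine, ten, eleven, twelve or more.
"A plurality" as used in this application includes, but is not limited to: three, four, five, six, seven, eight, nine, ten, eleven, twelve or more.
"And/or" as used in this application covers any one of the listed items as well as any combination of any number of the items.
"Comprising", "containing" or "including" as used in this application are open-ended descriptions, meaning that the specified components or steps described are present, together with other specified components or steps that have no material effect. In particular, when these terms are used to describe the sequence of a protein or nucleic acid, they mean that the protein or nucleic acid may consist of that sequence or may have additional amino acid residues or nucleotides at one or both ends, while still having the activity described in this application (for example the ability to control polynucleotide movement).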
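Prp43 helicase
Prp43 helicase is a known helicase whose structure and function have been studied and reported in the prior art; see for example Marcel J. Tauchert et al., "Structural and functional analysis of the RNA helicase Prp43 from the thermophilic eukaryote Chaetomium thermophilum", Acta Cryst., 2016, F72, 112–120. However, there has been no report of using Prp43 helicase for nanopore sequencing or for controlling the movement of polynucleotide molecules through a nanopore.
Prp43 helicase is a DEAH/RHA helicase of the SF2 superfamily. It can bind single-stranded DNA or RNA and translocates along, or unwinds, double-stranded DNA or RNA in the 3'-5' direction. Other helicases of this family include Prp22 helicase, Prp2 helicase, MLE helicase and DHX9 helicase.
The 3D structure of Prp43 helicase and its constituent domains have been elucidated in the prior art. For example, Figure 1 is a schematic of the 3D structure of the Prp43 helicase from Chaetomium thermophilum (SEQ ID NO: 1). In addition to the two core domains common to helicases, RecA1 (P97-R273) and RecA2 (T274-T458), Prp43 helicase contains several further domains: an N-terminal domain (M1-L96) and, at the C-terminus, a WH domain (Y459-P526), a Ratchet domain (L527-V640) and an OB domain (S641-A764). RecA1 and RecA2 contain 7 conserved motifs, of which Ia (TQPRRVAA), Ib (TDGQLLR) and IV (LLFLTG) interact with the substrate nucleic acid; motifs I (GSGKT), II (DEAH), V (TNIAETSLT) and VI (QRAGRAGR) are involved in nucleotide binding; and motif III (SAT) couples nucleotide hydrolysis to translocation along or unwinding of the nucleic acid.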
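
The domain boundaries above can be captured in a small lookup table, which makes it easy to see which domain a given residue of SEQ ID NO: 1 belongs to. This is only an illustrative sketch; the names `DOMAINS` and `domain_of` are hypothetical, and the boundaries are copied from the paragraph above.

```python
# Domain boundaries of C. thermophilum Prp43 (SEQ ID NO: 1), as listed in the text.
DOMAINS = [
    ("N-terminal", 1, 96),
    ("RecA1", 97, 273),
    ("RecA2", 274, 458),
    ("WH", 459, 526),
    ("Ratchet", 527, 640),
    ("OB", 641, 764),
]


def domain_of(position):
    """Return the name of the domain containing a 1-based residue position."""
    for name, first, last in DOMAINS:
        if first <= position <= last:
            return name
    raise ValueError(f"position {position} is outside residues 1-764")


# Substitution sites used in the specific embodiments of this application.
for site in ("F181", "P352", "S619", "N623", "C508"):
    print(site, domain_of(int(site[1:])))
```

Applied to the substitution sites named in the embodiments, the lookup places F181 in RecA1, P352 in RecA2, S619 and N623 in the Ratchet domain, and C508 in the WH domain, so each introduced cysteine pair joins a RecA domain to the Ratchet domain.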
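The tips of the RecA1 and RecA2 domains of Prp43 helicase are rich in positively charged amino acids and, together with the C-terminal WH domain, Ratchet domain and OB domain, form a channel that encircles single-stranded DNA or RNA.
The inventors found that although Prp43 helicase has a fairly strong affinity for single-stranded DNA or RNA, binding is still a dynamic thermodynamic equilibrium, and the helicase cannot fully maintain control of the target nucleic acid's movement through the pore, especially when the target nucleic acid is long, for example 1,000 bases, 5,000 bases, 10,000 bases, 100,000 bases or longer. The inventors found that modifying Prp43 helicase can secure binding of the enzyme to the nucleic acid and sustain control of the nucleic acid through the nanopore. Specifically, the inventors found that introducing one or more cysteines or unnatural amino acids into the RecA1 domain, RecA2 domain and/or Ratchet domain of Prp43 helicase reduces the size of the opening of its polynucleotide binding domain and thereby improves its ability to bind the target nucleic acid.
Accordingly, a first aspect of the present application relates to a modified Prp43 helicase comprising a RecA1 domain, a RecA2 domain and a Ratchet domain, wherein, relative to the corresponding wild-type Prp43 helicase or a fragment thereof, the modified Prp43 helicase comprises insertions or substitutions of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 or more cysteines, and/or insertions or substitutions of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 or more unnatural amino acids, introduced into at least one domain selected from the RecA1 domain, the RecA2 domain and the Ratchet domain.
Preferably, at least one cysteine residue and/or at least one unnatural amino acid may be introduced into any one of the following:
(a) the RecA1 domain;
(b) the RecA2 domain;
(c) the Ratchet domain;
(d) the RecA1 domain and the Ratchet domain;
(e) the RecA2 domain and the Ratchet domain.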
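The unnatural amino acids described in this application include, but are not limited to: 4-azido-L-phenylalanine (Faz), 4-acetyl-L-phenylalanine, 3-acetyl-L-phenylalanine, 4-acetoacetyl-L-phenylalanine, O-allyl-L-tyrosine, 3-(phenylselanyl)-L-alanine, O-2-propyn-1-yl-L-tyrosine, 4-(dihydroxyboryl)-L-phenylalanine, 4-[(ethylsulfanyl)carbonyl]-L-phenylalanine, (2S)-2-amino-3-{4-[(propan-2-ylsulfanyl)carbonyl]phenyl}propanoic acid, (2S)-2-amino-3-{4-[(2-amino-3-sulfanylpropanoyl)amino]phenyl}propanoic acid, O-methyl-L-tyrosine, 4-amino-L-phenylalanine, 4-cyano-L-phenylalanine, 3-cyano-L-phenylalanine, 4-fluoro-L-phenylalanine, 4-iodo-L-phenylalanine, 4-bromo-L-phenylalanine, O-(trifluoromethyl)tyrosine, 4-nitro-L-phenylalanine, 3-hydroxy-L-tyrosine, 3-amino-L-tyrosine, 3-iodo-L-tyrosine, 4-isopropyl-L-phenylalanine, 3-(2-naphthyl)-L-alanine, 4-phenyl-L-phenylalanine, (2S)-2-amino-3-(naphthalen-2-ylamino)propanoic acid, 6-(methylsulfanyl)norleucine, 6-oxo-L-lysine, D-tyrosine, (2R)-2-hydroxy-3-(4-hydroxyphenyl)propanoic acid, (2R)-2-aminooctanoate, 3-(2,2'-bipyridin-5-yl)-D-alanine, 2-amino-3-(8-hydroxy-3-quinolyl)propanoic acid, 4-benzoyl-L-phenylalanine, S-(2-nitrobenzyl)cysteine, (2R)-2-amino-3-[(2-nitrobenzyl)sulfanyl]propanoic acid, (2S)-2-amino-3-[(2-nitrobenzyl)oxy]propanoic acid, O-(4,5-dimethoxy-2-nitrobenzyl)-L-serine, (2S)-2-amino-6-({[(2-nitrobenzyl)oxy]carbonyl}amino)hexanoic acid, O-(2-nitrobenzyl)-L-tyrosine, 2-nitrophenylalanine, 4-[(E)-phenyldiazenyl]-L-phenylalanine, 4-[3-(trifluoromethyl)-3H-diaziren-3-yl]-D-phenylalanine, 2-amino-3-[[5-(dimethylamino)-1-naphthyl]sulfonylamino]propanoic acid, (2S)-2-amino-4-(7-hydroxy-2-oxo-2H-chromen-4-yl)butanoic acid, (2S)-3-[(6-acetylnaphthalen-2-yl)amino]-2-aminopropanoic acid, 4-(carboxymethyl)phenylalanine, 3-nitro-L-tyrosine, O-sulfo-L-tyrosine, (2R)-6-acetamido-2-aminohexanoate, 1-methylhistidine, 2-aminononanoic acid, 2-aminodecanoic acid, L-homocysteine, 5-sulfanylnorvaline, 6-sulfanyl-L-norleucine, 5-(methylsulfanyl)-L-norvaline, N6-{[(2R,3R)-3-methyl-3,4-dihydro-2H-pyrrol-2-yl]carbonyl}-L-lysine, N6-[(benzyloxy)carbonyl]lysine, (2S)-2-amino-6-[(cyclopentylcarbonyl)amino]hexanoic acid, N6-[(cyclopentyloxy)carbonyl]-L-lysine, (2S)-2-amino-6-{[(2R)-tetrahydrofuran-2-ylcarbonyl]amino}hexanoic acid, (2S)-2-amino-8-[(2R,3S)-3-ethynyltetrahydrofuran-2-yl]-8-oxooctanoic acid, N6-(tert-butoxycarbonyl)-L-lysine, (2S)-2-hydroxy-6-({[(2-methyl-2-propanyl)oxy]carbonyl}amino)hexanoic acid, N6-[(allyloxy)carbonyl]lysine, (2S)-2-amino-6-({[(2-azidobenzyl)oxy]carbonyl}amino)hexanoic acid, N6-L-prolyl-L-lysine, (2S)-2-amino-6-{[(prop-2-yn-1-yloxy)carbonyl]amino}hexanoic acid or N6-[(2-azidoethoxy)carbonyl]-L-lysine.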
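"Prp43 helicase" in this application should be understood in its broad sense and is considered to cover homologous proteins of Prp43 helicase (for example SEQ ID NO: 1). In general, an enzyme is considered to be a Prp43 helicase provided that it has DNA/RNA unwinding activity, contains a RecA1 domain, a RecA2 domain and/or a Ratchet domain, and has at least 30% homology to SEQ ID NO: 1, for example at least 35%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or at least 99.9% homology. Accordingly, some helicases known as HrpA helicases or HrpB helicases (for example those listed in Table 1) are also considered to fall within the scope of "Prp43 helicase" in this application.
The Prp43 helicase described in this application may be a Prp43 helicase from any conventional source; for example, it may be derived from Chaetomium thermophilum, Bathycoccus prasinos, Uncultured bacterium, Archaeon, Parcubacteria, Sorangium cellulosum, Candidatus Sungbacteria, Mycolicibacterium chitae, Parcubacteria, Thermodesulforhabdus norvegica, Deltaproteobacteria, Puniceicoccales, Desulfobacterium vacuolatum or Desulfobacter sp., or from a viral metagenome. Table 1 gives some examples of homologous Prp43 helicases usable in this application, but the Prp43 helicase of this application is not limited to these examples.
Table 1: Examples of ctPrp43 homologous proteins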
Figure PCTCN2021085609-appb-000001
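Most preferably, the Prp43 helicase described in this application is derived from Chaetomium thermophilum.
Accordingly, in some preferred embodiments, this application provides a modified Prp43 helicase comprising a variant of SEQ ID NO: 1 or of a fragment thereof, the variant comprising insertions or substitutions of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 or more cysteines, and/or insertions or substitutions of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 or more unnatural amino acids, introduced into the RecA1 domain, RecA2 domain and/or Ratchet domain.
In some preferred embodiments, the variant comprises at least one cysteine residue and/or at least one unnatural amino acid residue introduced at any one, or two or more, of the positions corresponding to M157, Q161, D165, F181, E182, N183, R324, L328, E332, R335, L351, P352, P353, H354, D321, E320, R358, P563, A564, N565, D603, K605, K606, H609, Y615, R616, S619, N623, A626 or K630 of SEQ ID NO: 1. More preferably, the introduced cysteine residue or unnatural amino acid residue is located at any one, or two or more, of the positions corresponding to F181, P352, S619 or N623 of SEQ ID NO: 1.
It has been found that removing the N-terminal domain of wild-type Prp43 helicase is more favorable for controlling the movement of nucleotides. Accordingly, in some preferred embodiments, the modified Prp43 helicase comprises removal of the N-terminal domain, preferably removal of at least 96, at least 90, at least 80, at least 70, at least 60, at least 50, at least 40 or at least 30 residues starting from position 1 at the N-terminus. For SEQ ID NO: 1, M1-N60 is preferably removed, that is, the T61-A764 fragment of SEQ ID NO: 1 is preferably used, and on this basis one or more cysteine insertions or substitutions, and/or one or more unnatural amino acid insertions or substitutions, are introduced.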
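
Because substitution sites in this application are named by their position in SEQ ID NO: 1 even when the M1-N60 segment has been removed, it can help to convert between the two numberings. The sketch below is a hypothetical helper, not part of the application, and it assumes that the fragment begins exactly at T61 with no additional leading residues (such as tag remnants).

```python
def to_fragment_index(position, fragment_start=61, fragment_end=764):
    """Map a SEQ ID NO: 1 position to a 1-based index within the fragment."""
    if not fragment_start <= position <= fragment_end:
        raise ValueError(f"position {position} is not present in the fragment")
    return position - fragment_start + 1


for site in (181, 623, 508):
    print(site, "->", to_fragment_index(site))
```

Under that assumption, F181, N623 and C508 of SEQ ID NO: 1 are residues 121, 563 and 448 of the T61-A764 fragment.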
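To improve the stability with which the Prp43 helicase described in this application binds the target polynucleotide and to reduce its ability to disengage from the target polynucleotide, two or more cysteine residues or unnatural amino acid residues may also be introduced, with connections formed between introduced cysteines, between introduced unnatural amino acids, between an introduced cysteine and an introduced unnatural amino acid, between an introduced cysteine and a natural amino acid, or between an introduced unnatural amino acid and a natural amino acid.
Preferably, any number and combination of two or more introduced cysteines and unnatural amino acids may be connected to one another. For example, 2, 3, 4, 5, 6, 7, 8 or more cysteines and/or unnatural amino acids may be connected to one another. One or more cysteines may be connected to one or more cysteines. One or more cysteines may be connected to one or more unnatural amino acids such as Faz. One or more unnatural amino acids such as Faz may be connected to one or more unnatural amino acids such as Faz. One or more cysteines may be connected to one or more natural amino acids on the helicase. One or more unnatural amino acids such as Faz may be connected to one or more natural amino acids on the helicase.
Preferably, the connection may be of any kind, including transient or permanent connection, for example covalent bonding, hydrogen bonding, electrostatic interaction, π-π interaction or hydrophobic interaction. In another specific embodiment of the invention, the connection may be permanent, for example covalent. Covalent connection may be made using chemical crosslinkers, whose length can vary from one carbon (phosgene-type linkers) to many Ångströms, for example maleimides, active esters, succinimides, azides, alkanes, alkenes and alkynes (such as dibenzocyclooctynol (DIBO or DBCO), difluorocycloalkynes and linear alkynes). Further examples are linear molecules such as polyethylene glycols (PEGs), polypeptides, polysaccharides, deoxyribonucleic acid (DNA), peptide nucleic acid (PNA), threose nucleic acid (TNA), glycerol nucleic acid (GNA), saturated and unsaturated hydrocarbons or polyamides; and catalytic reagents such as TMAD, by which connection can be made through an -S-S- bond.
In certain specific embodiments of the invention, a TMAD catalyst is used to covalently link the cysteine residues introduced at positions F181 and N623, or at positions P352 and S619.
In some preferred embodiments, the modified Prp43 helicase further comprises substitution of one or more cysteine residues, more preferably substitution of one or more of the cysteine residues corresponding to C148, C214, C303, C323, C377, C441, C508, C543 and C608 of SEQ ID NO: 1; more preferably the cysteine residue is replaced by an alanine, glycine, valine, isoleucine, leucine, phenylalanine, tyrosine, serine, threonine, aspartic acid, glutamic acid, lysine, arginine, histidine, methionine, tryptophan, glutamine, asparagine or proline residue.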
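In some preferred embodiments, to further give the Prp43 helicase the ability to control polynucleotide movement continuously and stably at a given rate, the modified Prp43 helicase further comprises one or more amino acid modifications selected from the following group:
(a) substitution of one or more amino acids that interact with nucleotides;
(b) substitution of one or more amino acids involved in binding NTP and/or divalent metal ions (such as Mg²⁺);
(c) substitution of one or more amino acids that interact with the transmembrane pore;
(d) further modification that reduces the negative charge on the surface of the Prp43 helicase.
Preferably, the substituted amino acids that interact with nucleotides include, but are not limited to, those corresponding to R152, R153, R180, T195, Q198, R201, E316, E317, G349, T381, N382, K403, K405, L416, P526, P557, R562, Q558, H688, P689, T708, K710, Y712 and R714 of SEQ ID NO: 1. More preferably, at least one amino acid that interacts with the phosphate group of one or more nucleotides in single-stranded DNA or RNA, or double-stranded DNA or RNA, is substituted.
Preferably, the one or more amino acids involved in binding NTP and/or divalent metal ions (such as Mg²⁺) include, but are not limited to, those corresponding to T126, D218, S387, E219, R432, R435, T121, K125, T127, T389, R162, D391 and F360 of SEQ ID NO: 1.
Preferably, the one or more amino acids that interact with the transmembrane pore include, but are not limited to, those corresponding to C303, E336, D288, R287, E286, E284 and E291 of SEQ ID NO: 1.
More preferably, at least one amino acid that interacts with the sugar and/or base of one or more nucleotides in single-stranded DNA or RNA, or double-stranded DNA or RNA, is replaced with an amino acid comprising a larger side chain. The larger side chain comprises an increased number of carbon atoms, has an increased length, an increased molecular volume and/or an increased van der Waals volume. The larger side chain increases (i) electrostatic interactions, (ii) hydrogen bonding and/or (iii) cation-π interactions between the at least one amino acid and one or more nucleotides in the single-stranded or double-stranded DNA. The amino acid with the larger side chain is not alanine (A), cysteine (C), glycine (G), selenocysteine (U), methionine (M), aspartic acid (D) or glutamic acid (E).
Preferably, the Prp43 helicase is further modified to reduce its surface negative charge. The Prp43 helicase further comprises substitutions that increase the net positive charge. Preferably, the Prp43 helicase further comprises substitution or modification of negatively charged, polar or nonpolar amino acids on its surface. More preferably, the substitutions include replacing negatively charged, uncharged, aromatic, polar or nonpolar amino acids with positively charged or uncharged amino acids. The positively charged, uncharged, polar, nonpolar or aromatic amino acids may be natural or unnatural, and may be synthetic or modified natural amino acids.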
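
The effect of such charge-altering substitutions can be illustrated with a crude residue count. The sketch below is an assumption-laden simplification, not a method from the application: it counts +1 for each lysine or arginine and −1 for each aspartate or glutamate, ignores histidine, termini and pKa shifts, and does not distinguish surface residues from buried ones. The function name and the toy sequences are hypothetical.

```python
def net_charge(sequence):
    """Crude net charge at neutral pH: +1 per K/R, -1 per D/E."""
    positive = sum(sequence.count(residue) for residue in "KR")
    negative = sum(sequence.count(residue) for residue in "DE")
    return positive - negative


before = "DEAH"   # toy sequence, not taken from SEQ ID NO: 1
after = "DKAH"    # same toy sequence with E2 replaced by K
print(net_charge(before), net_charge(after))
```

Replacing one negatively charged residue with a positively charged one raises the count by two, which is the kind of shift toward net positive charge the paragraph above describes.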
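After modification, the Prp43 helicase described in this application may have at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or at least 99.9% homology to the amino acid sequence of the corresponding wild-type Prp43 helicase.
In some more preferred embodiments, the Prp43 helicase is a variant of SEQ ID NO: 1 (that is, derived from Chaetomium thermophilum), and the variant of SEQ ID NO: 1 comprises at least one cysteine residue and/or at least one unnatural amino acid introduced at position F181 and/or N623 of SEQ ID NO: 1; or the variant of SEQ ID NO: 1 comprises at least one cysteine residue and/or at least one unnatural amino acid introduced at position P352 and/or S619 of SEQ ID NO: 1.
In some more preferred embodiments, the Prp43 helicase is a variant of SEQ ID NO: 1 (that is, derived from Chaetomium thermophilum), and the variant of SEQ ID NO: 1 further comprises substitution of one or more cysteines of SEQ ID NO: 1. The substituting amino acid may be alanine, glycine, valine, isoleucine, leucine, phenylalanine, tyrosine, serine, threonine, aspartic acid, glutamic acid, lysine, arginine, histidine, methionine, tryptophan, glutamine, asparagine or proline. Preferably, the one or more substituted cysteines are C148, C214, C303, C323, C377, C441, C508, C543 and C608.
In some more preferred embodiments, the Prp43 helicase is a variant of SEQ ID NO: 1 (that is, derived from Chaetomium thermophilum), and the variant of SEQ ID NO: 1 has the M1 to N60 sequence of the N-terminal domain removed, more preferably the N-terminal M1 to L96 sequence removed. One specific example of the invention is a helicase from which the M1 to N60 sequence of the N-terminal domain has been removed.
In some more preferred embodiments, the Prp43 helicase is a variant of SEQ ID NO: 1 (that is, derived from Chaetomium thermophilum), and the variant of SEQ ID NO: 1 has at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or at least 99.9% homology to the amino acid sequence of SEQ ID NO: 1. More preferably, the Prp43 helicase is a modified T61-A764 fragment of SEQ ID NO: 1.
In some more preferred embodiments, the Prp43 helicase is a modified T61-A764 fragment of SEQ ID NO: 1 (derived from Chaetomium thermophilum), and the modification is F181C/N623C/C508S or P352C/S619C/C508S.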
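
The shorthand used for these variants, such as F181C/N623C/C508S, lists each substitution as wild-type residue, position in SEQ ID NO: 1, and replacement residue. The following sketch shows one way such a string could be parsed and applied to a sequence; the function names, the `offset` parameter and the toy five-residue sequence are hypothetical, and SEQ ID NO: 1 itself is not reproduced here.

```python
import re


def parse_mutations(spec):
    """Split 'F181C/N623C/C508S' into (wild_type, position, replacement) tuples."""
    mutations = []
    for token in spec.split("/"):
        m = re.fullmatch(r"([A-Z])(\d+)([A-Z])", token)
        if m is None:
            raise ValueError(f"unrecognised mutation: {token!r}")
        mutations.append((m.group(1), int(m.group(2)), m.group(3)))
    return mutations


def apply_mutations(sequence, spec, offset=0):
    """Apply substitutions to a sequence whose first residue is numbered offset + 1."""
    residues = list(sequence)
    for wild_type, position, replacement in parse_mutations(spec):
        index = position - offset - 1
        if not 0 <= index < len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[index] != wild_type:
            raise ValueError(
                f"expected {wild_type} at {position}, found {residues[index]}"
            )
        residues[index] = replacement
    return "".join(residues)


print(parse_mutations("F181C/N623C/C508S"))
print(apply_mutations("MFANC", "F2C/N4C/C5S"))  # toy sequence
```

For a T61-A764 fragment numbered according to SEQ ID NO: 1, `offset=60` would be passed so that position 181 maps to the 121st residue of the fragment; the wild-type check guards against applying a substitution at the wrong position.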
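In addition, the Prp43 helicase described in this application may be modified to aid identification or purification, for example by adding histidine residues (His tag), aspartic acid residues (asp tag), a streptavidin tag, a Flag tag, a SUMO tag, a GST tag or an MBP tag, or by adding a signal sequence to promote secretion from a cell in which the polypeptide does not naturally contain that signal sequence. An alternative to introducing a genetic tag is to attach a tag by chemical reaction to a natural or engineered site on the Prp43 helicase.
The Prp43 helicase described in this application may be in the form of a Prp43 helicase oligomer comprising one or more of the Prp43 helicases described in this application.
In some embodiments, the Prp43 helicase oligomer may further comprise wild-type Prp43 helicase or other types of helicase. The other types of helicase may be Hel308 helicase, XPD helicase, Dda helicase, RecD2 helicase, TraI helicase or TrwC helicase, among others.
Preferably, the Prp43 helicase and wild-type Prp43 helicase, two Prp43 helicases, two wild-type Prp43 helicases, the Prp43 helicase and another type of helicase, or wild-type Prp43 helicase and another type of helicase, may be connected or arranged head-to-head, tail-to-tail or head-to-tail.
Preferably, the Prp43 helicase oligomer comprises two or more of the Prp43 helicases described in this application, which may be different or the same.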
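Protein construct
Physiologically, Prp43 helicase takes part in the disassembly of the intron spliceosome composed of the U2, U5 and U6 snRNPs during pre-mRNA processing; in this process the enzyme must interact with two accessory proteins, Ntr1 and Ntr2, which contain a glycine-rich motif (G-Path motif), to activate its ATP hydrolysis activity and unwinding activity. Prp43 helicase also takes part in ribosome synthesis, assisting maturation of the precursors of the 18S and 25S rRNAs, a process that likewise requires activation by the G-Path-motif-rich proteins Pfa1 and Gno1.
Under physiological conditions, Prp43 helicase requires an accessory protein containing a G-Path domain to activate its ATP hydrolysis activity and unwinding activity. Although the enzyme has weak activity in the absence of accessory protein, its ATP hydrolysis and unwinding activities are stronger in the presence of the accessory activating protein. In particular, the inventors found that an isolated partial fragment of the accessory protein containing the G-Path domain retains the activating function.
Accordingly, a second aspect of this application provides a protein construct comprising the modified Prp43 helicase of the first aspect and, fused to the C-terminus or N-terminus of the Prp43 helicase, the G-Path domain of the accessory activating protein Paf1 or a fragment of Paf1 containing the G-Path domain. The protein construct may also be regarded as a fusion protein.
Because the G-Path domain, or a fragment containing the G-Path domain, of the accessory activating protein Paf1 or a homolog thereof is fused to the C-terminus or N-terminus of the Prp43 helicase, this modified Prp43 helicase construct has markedly enhanced ATP hydrolysis and/or unwinding activity, which is more favorable for controlling the movement of nucleic acid through the pore in nanopore nucleic acid sequencing.
The protein construct may contain one or more modified Prp43 helicases.
In the protein construct, the accessory activating protein Paf1 may be a Paf1 protein from any source conventionally used in the art, for example Paf1 derived from Chaetomium thermophilum var. thermophilum, Thermothielavioides terrestris, Thermothelomyces thermophilus, Podospora anserina, Neurospora tetrasperma, Coniochaeta sp., Monosporascus sp., Hypoxylon sp., Madurella mycetomatis or Coniochaeta pulveracea.
Table 2 gives some examples of homologous Paf1 proteins usable in the Prp43 helicase construct of this application, but the Paf1 protein of this application is not limited to these examples.
Table 2: ctPfa1 homologous proteins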
Figure PCTCN2021085609-appb-000002
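Preferably, the G-path domain sequence is the sequence, in the above Pfa1 accessory protein or a homolog thereof, corresponding to the K662-G742 fragment of SEQ ID NO: 16 (that is, the sequence of SEQ ID NO: 26), or a variant of that sequence.
In some preferred embodiments, in the protein construct, the amino acid sequence of the accessory activating protein Paf1 is SEQ ID NO: 16 or the amino acid sequence of a variant having at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or at least 99.9% homology to the amino acid sequence of SEQ ID NO: 16, and the accessory activating protein Paf1 has the function of activating Prp43 helicase.
In some more preferred embodiments, the G-Path domain of Paf1 is the K662-G742 fragment of SEQ ID NO: 16.
In the most preferred embodiment, in the protein construct, the Prp43 helicase comprises the sequence of SEQ ID NO: 1 or a variant thereof, and the Pfa1 accessory activating protein comprises the sequence of SEQ ID NO: 16 or a variant thereof, or the G-Path domain sequence of SEQ ID NO: 16, namely SEQ ID NO: 26 (corresponding to the K662-G742 fragment of SEQ ID NO: 16), or a variant thereof.
The protein construct described in this application may be modified to aid identification or purification, for example by adding histidine residues (His tag), aspartic acid residues (asp tag), a streptavidin tag, a Flag tag, a SUMO tag, a GST tag, an MBP tag or a Strep-Tag II tag, or by adding a signal sequence to promote secretion from a cell in which the polypeptide does not naturally contain that signal sequence. An alternative to introducing a genetic tag is to attach a tag by chemical reaction to a natural or engineered site on the protein construct.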
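Nucleic acid
A third aspect of this application provides a nucleic acid encoding the Prp43 helicase of the first aspect and/or the protein construct of the second aspect.
Expression vector
A fourth aspect of this application provides an expression vector comprising the nucleic acid of the third aspect. Preferably, the nucleic acid is operably linked to a regulatory element in the expression vector, the regulatory element preferably being a promoter. In some specific embodiments of this application, the promoter is selected from T7, trc, lac, ara or λL. Preferably, the expression vector includes but is not limited to a plasmid, virus or bacteriophage.
Those skilled in the art know many methods for inserting nucleic acids into nucleic acid constructs or expression vectors; see for example Sambrook and Russell, Molecular Cloning: A Laboratory Manual, 3rd edition, CSHL Press, Cold Spring Harbor, NY, 2001.
Host cell
A fifth aspect of this application provides a host cell comprising the nucleic acid of the third aspect or the expression vector of the fourth aspect. Preferably, the host cell includes but is not limited to Escherichia coli. In one specific embodiment of this application, the host cell is selected from BL21(DE3), JM109(DE3), B834(DE3), TUNER, C41(DE3), Rosetta2(DE3), Origami, Origami B and the like.
Methods of preparing the Prp43 helicase or protein construct
A sixth aspect of this application relates to a method of preparing the protein construct of the second aspect, comprising: providing the polypeptide of SEQ ID NO: 1 or a variant thereof and the polypeptide of SEQ ID NO: 26 or a variant thereof; introducing at least one cysteine residue and/or at least one unnatural amino acid into the polypeptide of SEQ ID NO: 1 or the variant thereof; and then fusing the polypeptide of SEQ ID NO: 26 or the variant thereof to the C-terminus or N-terminus of the resulting polypeptide to form the protein construct.
A seventh aspect of this application relates to a method of preparing the modified Prp43 helicase of the first aspect or the protein construct of the second aspect, comprising culturing the host cell of the fifth aspect, inducing expression, and then purifying the resulting expression product.
Genetic engineering techniques, such as overexpression of enzymes in host cells, genetic modification of host cells, or hybridization techniques, are methods known in the art, such as those described in Sambrook and Russell (2001), "Molecular Cloning: A Laboratory Manual" (3rd edition), Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, or F. Ausubel et al. (eds.), "Current Protocols in Molecular Biology", Greene Publishing and Wiley Interscience, New York (1987).
For example, in one specific embodiment of this application, the method of preparing the modified Prp43 helicase comprises: obtaining a nucleic acid sequence encoding the Prp43 helicase based on the amino acid sequence of the Prp43 helicase and/or the accessory activating protein or activating domain described in this application; ligating it into an expression vector by restriction digestion and ligation; transforming the vector into Escherichia coli; and inducing expression and purifying to obtain the Prp43 helicase.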
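Applications of the Prp43 helicase or protein construct
The Prp43 helicase or protein construct of this application can be used to control the movement of polynucleotide molecules or to characterize a target polynucleotide.
The Prp43 helicase of the invention is a useful tool for controlling the movement of a target polynucleotide during strand sequencing. When provided with the usual components necessary to promote movement, Prp43 helicase moves along DNA or RNA in the 3'-5' direction, but the orientation of the DNA or RNA in the pore (which depends on which end of the DNA or RNA is captured) means that Prp43 helicase can be used to move the DNA or RNA into the pore either against or with the direction of the applied field. Introducing cysteine residues and/or at least one unnatural amino acid into wild-type Prp43 helicase effectively reduces the size, or the opening and closing, of the opening of the polynucleotide binding domain or polynucleotide binding moiety on the Prp43 helicase or construct, as well as of the opening through which the target polynucleotide is released, thereby markedly reducing the ability of Prp43 helicase to disengage from the target polynucleotide and improving its ability to control passage of the target polynucleotide through the pore. Fusing a G-path domain, or a polypeptide containing the G-Path domain, to the C-terminus or N-terminus of wild-type or modified Prp43 helicase effectively increases the ATP hydrolysis activity or unwinding activity of the modified Prp43 helicase, thereby improving its ability to control passage of the target polynucleotide through the pore.
An eighth aspect of this application relates to a method of controlling the movement of a polynucleotide molecule, comprising contacting the polynucleotide molecule with the modified Prp43 helicase of the first aspect or the protein construct of the second aspect.
Preferably, controlling the movement of the polynucleotide means controlling its movement through a pore. The pore is a nanopore, and the nanopore is a transmembrane pore. The pore may be natural or artificial, including but not limited to biological pores, solid-state pores and hybrid biological/solid-state pores. Preferably, the method may involve one or more Prp43 helicases jointly controlling the movement of the polynucleotide.
A ninth aspect of this application relates to a method of characterizing a target polynucleotide, the method comprising:
(a) contacting the target polynucleotide with the modified Prp43 helicase of the first aspect or the protein construct of the second aspect, such that the Prp43 helicase or protein construct controls the movement of the target polynucleotide through a nanopore; and (b) obtaining one or more characteristics of the interaction between nucleotides in the target polynucleotide and the nanopore, thereby characterizing the target polynucleotide.
Preferably, steps (a) and (b) are repeated one or more times.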
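Preferably, any number of the Prp43 helicases described in this application may be used in the method, preferably one or more, more preferably 1, 2, 3, 4, 5, 6, 7, 8, 9 or more. Where two or more Prp43 helicases described in this application are used, they may be the same or different. Wild-type Prp43 helicase or other types of helicase may also be included. Further, two or more helicases may be connected to each other, or may simply be arrayed by each binding separately to the polynucleotide, to perform the function of controlling polynucleotide movement.
Preferably, the method further comprises the step of applying a potential difference across the pore in contact with the helicase or construct and the target polynucleotide.
Preferably, the pore is a structure that allows hydrated ions, driven by an applied potential, to flow from one side of the membrane to the other. More preferably, the pore is a nanopore, and the nanopore is a transmembrane pore. The transmembrane pore provides a channel for movement of the target polynucleotide. More preferably, the pore is selected from biological pores, solid-state pores or hybrid biological/solid-state pores.
In some specific embodiments, the pore includes but is not limited to pores derived from Mycobacterium smegmatis porin A, Mycobacterium smegmatis porin B, Mycobacterium smegmatis porin C, Mycobacterium smegmatis porin D, hemolysin, lysenin, interleukin, outer membrane porin F, outer membrane porin G, outer membrane phospholipase A, WZA or Neisseria autotransporter lipoprotein, and the like.
The membrane may be any membrane known in the prior art, preferably an amphiphilic layer, that is, a layer formed from amphiphilic molecules, such as phospholipids, that have at least one hydrophilic portion and at least one lipophilic or hydrophobic portion; the amphiphilic molecules may be synthetic or naturally occurring. More preferably, the membrane is a lipid bilayer. The target polynucleotide may be coupled to the membrane by any known method. If the membrane is an amphiphilic layer such as a lipid bilayer, the polynucleotide is preferably coupled to the membrane via a polypeptide present in the membrane or via a hydrophobic anchor present in the membrane. The hydrophobic anchor is preferably a lipid, fatty acid, sterol, carbon nanotube or amino acid.
Preferably, when a force (such as a voltage) is applied to the pore, the rate at which the target polynucleotide passes through the pore is controlled by the Prp43 helicase or construct, giving recognizable, stable current levels that are used to determine characteristics of the target polynucleotide.
Preferably, the target polynucleotide is single-stranded, double-stranded or at least partly double-stranded.
More preferably, the target polynucleotide may be modified by a label, a spacer, methylation, oxidation or damage.
In one specific embodiment of this application, the target polynucleotide is at least partly double-stranded. The double-stranded portion forms a Y adaptor structure, and the Y adaptor structure comprises a leader sequence that preferentially threads into the pore.
More preferably, the target polynucleotide may be 10 to 100,000 bases or more in length.
In one specific embodiment of this application, the target polynucleotide may be at least 10, at least 50, at least 100, at least 200, at least 300, at least 400, at least 500, at least 1,000, at least 2,000, at least 5,000, at least 10,000, at least 50,000 or at least 100,000 bases in length.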
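Preferably, the helicase binds to an internal nucleotide of the single-stranded polynucleotide.
Preferably, the target polynucleotide is DNA or RNA.
Preferably, when the target polynucleotide is RNA, the RNA is modified to include a non-RNA polynucleotide in order to improve the ability and efficiency with which the RNA to be sequenced passes through the pore.
Preferably, the RNA modification step comprises ligating a DNA leader to the 3' end of the RNA to be analyzed. It also includes a step of reverse-transcribing the RNA to be analyzed.
Preferably, the one or more characteristics are selected from the source, length, identity, sequence and secondary structure of the target polynucleotide and whether the target polynucleotide is modified. More preferably, the one or more characteristics are measured by electrical measurement and/or optical measurement.
More preferably, the electrical and/or optical measurement produces an electrical and/or optical signal in which each nucleotide corresponds to a signal level, and the electrical and/or optical signal is then converted into characteristics of the nucleotides.
In one specific embodiment of this application, the electrical measurement includes but is not limited to current measurement, impedance measurement, tunneling measurement, wind-tunnel measurement or field-effect transistor (FET) measurement.
The electrical signal described in this application is selected from measurements of current, voltage, tunneling, resistance, potential, conductivity or transverse electrical measurement.
In some specific embodiments, the electrical signal is the current passing through the pore.
Preferably, the characterization further comprises applying a modified Viterbi algorithm.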
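
The application names a modified Viterbi algorithm without describing the modification. For orientation only, the sketch below shows the textbook Viterbi decoder on a toy hidden Markov model; it is not the variant the application refers to, and the two states, two discretized current levels and all probabilities are invented for illustration (real nanopore decoders typically use k-mer states and continuous current values).

```python
import math


def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return the most probable state path for a sequence of observations."""
    if not observations:
        return []
    # best[t][s]: log-probability of the best path that ends in state s at time t
    best = [{s: log_start[s] + log_emit[s][observations[0]] for s in states}]
    back = [{}]
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] + log_trans[p][s])
            scores[s] = best[-1][prev] + log_trans[prev][s] + log_emit[s][obs]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final state.
    state = max(states, key=lambda s: best[-1][s])
    path = [state]
    for pointers in reversed(back[1:]):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Toy model: two hypothetical states and two discretized current levels.
states = ["A", "T"]
log_start = {s: math.log(0.5) for s in states}
log_trans = {p: {s: math.log(0.5) for s in states} for p in states}
log_emit = {
    "A": {"low": math.log(0.9), "high": math.log(0.1)},
    "T": {"low": math.log(0.1), "high": math.log(0.9)},
}

decoded = viterbi(["low", "high", "high", "low"], states, log_start, log_trans, log_emit)
print(decoded)
```

Working in log-probabilities avoids numerical underflow on long signals; with the uniform transitions chosen here, the decoder simply follows the more likely emission at each step.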
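A tenth aspect of this application relates to use of the modified Prp43 helicase of the first aspect or the protein construct of the second aspect in characterizing a target polynucleotide or in controlling the movement of a target polynucleotide through a pore.
Sensor and analytical device
An eleventh aspect of this application relates to an analytical device for characterizing a target polynucleotide, comprising one or more nanopores, one or more modified Prp43 helicases of the first aspect or protein constructs of the second aspect, and one or more containers.
Preferably, the analytical device is selected from a kit, an apparatus or a sensor.
More preferably, the analytical device is a kit, which further includes a chip comprising a lipid bilayer. The pore spans the lipid bilayer. The kit described in this application comprises one or more lipid bilayers, each comprising one or more of the pores. The kit described in this application further includes reagents or apparatus for carrying out characterization of the target polynucleotide. Preferably, the reagents include buffers and the tools required for PCR amplification.
A twelfth aspect of this application relates to a method of forming a sensor for characterizing a target polynucleotide, comprising providing a nanopore and forming a complex between the nanopore and the modified Prp43 helicase of the first aspect or the protein construct of the second aspect.
The embodiments of this application are further explained and illustrated below with reference to the drawings and specific examples. These examples serve only to explain and illustrate various aspects of this application and are not to be construed as limiting its scope.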
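Brief Description of the Drawings
Figure 1 shows a schematic of the 3D structure of the N-terminally (M1-N60) truncated wild-type Prp43 helicase (SEQ ID NO: 1) from Chaetomium thermophilum.
Figure 2 shows the single-stranded-DNA-dependent ATP hydrolysis activity assay of the N-terminally (M1-N60) truncated wild-type Prp43 helicase, the modified Prp43 helicase Prp43-2 (F181C/N623C/C508S), the modified Prp43 helicase Prp43-3 (P352C/S619C/C508S), the N-terminally (M1-N60) truncated protein construct Prp43-GP, the N-terminally (M1-N60) truncated protein construct Prp43-GP-2 (F181C/N623C/C508S) and the N-terminally (M1-N60) truncated protein construct Prp43-GP-3 (P352C/S619C/C508S).
Figure 3 shows the single-stranded-RNA-dependent ATP hydrolysis activity assay of the N-terminally (M1-N60) truncated wild-type Prp43 helicase, the modified Prp43 helicase Prp43-2 (F181C/N623C/C508S), the modified Prp43 helicase Prp43-3 (P352C/S619C/C508S), the N-terminally (M1-N60) truncated protein construct Prp43-GP, the N-terminally (M1-N60) truncated protein construct Prp43-GP-2 (F181C/N623C/C508S) and the N-terminally (M1-N60) truncated protein construct Prp43-GP-3 (P352C/S619C/C508S).
Figure 4 shows the affinity curves for single-stranded DNA, under low-salt conditions, of the N-terminally (M1-N60) truncated wild-type Prp43 helicase, the N-terminally (M1-N60) truncated protein construct Prp43-GP and the N-terminally (M1-N60) truncated protein construct Prp43-GP-2 (F181C/N623C/C508S).
Figure 5 shows the gel shift assay results for the N-terminally (M1-N60) truncated wild-type Prp43 helicase, the N-terminally (M1-N60) truncated protein construct Prp43-GP and the N-terminally (M1-N60) truncated protein construct Prp43-GP-2 (F181C/N623C/C508S). Lane 1 is the T44-37-FAM substrate; lane 2 is the complex of wild-type Prp43 helicase bound to the T44-37-FAM substrate; lane 3 is the product of TMAD catalytic treatment after wild-type Prp43 helicase was bound to the T44-37-FAM substrate; lane 4 is the complex of Prp43-GP helicase bound to the T44-37-FAM substrate; lane 5 is the product of TMAD catalytic treatment after Prp43-GP helicase was bound to the T44-37-FAM substrate; lane 6 is the complex of the Prp43-GP-2 helicase mutant bound to the T44-37-FAM substrate; lane 7 is the product of TMAD catalytic treatment after the Prp43-GP-2 helicase mutant was bound to the T44-37-FAM substrate.
Figure 6 shows a schematic of DNA construct X, in which the 5' end of the A region sequence SEQ ID NO: 32 is connected to 4 iSpC3 spacers (region B), the spacers are connected to the 3' end of the C region sequence SEQ ID NO: 33, the 5' end of the C region sequence is connected to the D region sequence SEQ ID NO: 34, and the E region sequence SEQ ID NO: 35 of the construct is hybridized to the F region sequence SEQ ID NO: 36 (which has a 3' cholesterol tether).
Figure 7 shows an example current trace as the N-terminally (M1-N60) truncated protein construct Prp43-GP-2 (F181C/N623C/C508S) controls movement of DNA construct X through the MspA nanopore (y-axis: current (pA, 0 to 100); x-axis: time (h:m:s)).
Figure 8 shows a schematic of RNA construct Y, in which the 3' end of SEQ ID NO: 37 (labeled D) is connected to 20 iSpC3 spacers (labeled A) and its 5' end is connected to 4 iSpC3 spacers (labeled B), the spacers are connected to the 3' end of SEQ ID NO: 38 (labeled C), and the SEQ ID NO: 39 region (labeled E) of the construct is hybridized to SEQ ID NO: 40 (labeled F, which has a 3' cholesterol tether).
Figure 9 shows an example current trace as the N-terminally (M1-N60) truncated protein construct Prp43-GP-2 (F181C/N623C/C508S) controls passage of RNA construct Y through the MspA nanopore (y-axis: current (pA, 0 to 100); x-axis: time (h:m:s)).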
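Examples
For experimental details not specifically noted in the following examples, reference may be made to the references cited herein. The reagents and instruments used are all conventional, commercially available reagents or instruments.
Example 1
The wild-type Prp43 helicase, the modified Prp43 helicases, and the protein constructs were all prepared by standard molecular biology methods, whose principles and procedures are well known to those skilled in the art (see the references cited herein).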
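N-terminally truncated wild-type Prp43 helicase (i.e., the T61-A764 fragment): the nucleic acid sequence (SEQ ID NO:28, supplied by GenScript Biotech Corporation) encoding the N-terminally truncated Prp43 helicase T61-A764 fragment (corresponding to the Prp43 helicase amino acid sequence of SEQ ID NO:1 with the M1 to N60 fragment of the N-terminal domain removed) was inserted into the vector pGS-21a (GenScript Biotech Corporation, cat. no. SD0121) by restriction digestion and ligation and, after verification by sequencing, transformed into competent expression host cells BL21(DE3) (Beijing TransGen Biotech Co., Ltd., cat. no. CD601-02). A single colony was picked from the plate and inoculated into 100 ml of liquid LB medium containing ampicillin. After overnight culture at 37 °C, the culture was transferred the next day into large flasks of medium for scale-up. When the OD600 reached about 0.4-0.8, isopropyl-β-D-thiogalactoside (IPTG) was added to a final concentration of 0.5 mM and expression was induced overnight at 16 °C for about 12 hours. The cells were collected by low-temperature centrifugation, resuspended in lysis buffer, and disrupted by high-pressure homogenization. The supernatant was collected by high-speed centrifugation and purified by protein chromatography, specifically nickel-ion affinity chromatography, ion-exchange chromatography, and size-exclusion chromatography. After the GST tag was removed from the target protein by enzymatic cleavage, the sample was passed through a nickel-ion affinity column and the target protein was collected in the flow-through. The tag-free truncated Prp43 protein (lacking M1 to N60 of the N-terminal domain) was analysed by SDS-PAGE, which showed the target protein to be of the correct size and suitable for subsequent testing and analysis.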
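Prp43-GP-2 (F181C/N623C/C508S), a mutant of the N-terminally truncated Prp43 helicase T61-A764 fragment fused to the GP domain (i.e., SEQ ID NO:27): prepared by the same method as the N-terminally truncated Prp43 helicase T61-A764 fragment, except that the starting sequence was changed from the nucleic acid sequence corresponding to the N-terminally truncated Prp43 helicase T61-A764 fragment (SEQ ID NO:28) to SEQ ID NO:30. The tag-free protein construct Prp43-GP-2 was analysed by SDS-PAGE, which showed the target protein to be of the correct size and suitable for subsequent testing and analysis.
Using the same method with different starting nucleic acid sequences, the following modified N-terminally (M1-N60) truncated Prp43 helicases and protein constructs were prepared: Prp43-2 (F181C/N623C/C508S), the modified Prp43 helicase Prp43-3 (P352C/S619C/C508S), the N-terminally (M1-N60) truncated protein construct Prp43-GP, and the N-terminally (M1-N60) truncated protein construct Prp43-GP-3 (P352C/S619C/C508S). The starting nucleic acid sequences used are listed in Table 3 below.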
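The variant names above combine an N-terminal truncation (T61-A764) with substitutions written as wild-type residue, position in the full-length SEQ ID NO:1 numbering, and replacement residue. As a hedged illustration of that bookkeeping, the sketch below parses such a string and applies it to a truncated fragment. The helper names are hypothetical, and the ten-residue toy sequence in the usage note stands in for SEQ ID NO:1, which is not reproduced in this section.

```python
import re


def parse_mutations(spec):
    """Split e.g. 'F181C/N623C/C508S' into (wild_type, position, replacement) tuples."""
    mutations = []
    for token in spec.split("/"):
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", token)
        if match is None:
            raise ValueError(f"unrecognized substitution: {token!r}")
        mutations.append((match.group(1), int(match.group(2)), match.group(3)))
    return mutations


def truncate(sequence, first, last):
    """Return residues first..last (1-based, inclusive) of the full-length sequence."""
    return sequence[first - 1:last]


def apply_mutations(fragment, first, mutations):
    """Apply substitutions to a fragment whose first residue is number `first`
    in the full-length numbering; the wild-type residue is checked before replacing."""
    residues = list(fragment)
    for wild_type, position, replacement in mutations:
        index = position - first
        if not 0 <= index < len(residues):
            raise ValueError(f"position {position} lies outside the fragment")
        if residues[index] != wild_type:
            raise ValueError(f"expected {wild_type} at {position}, found {residues[index]}")
        residues[index] = replacement
    return "".join(residues)
```

With the toy sequence `"MKTFNCAGHL"`, `apply_mutations(truncate("MKTFNCAGHL", 3, 10), 3, parse_mutations("F4C/C6S"))` keeps the full-length numbering while operating on the truncated fragment. The wild-type check catches off-by-one errors between full-length and fragment numbering.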
Table 3: Proteins or protein constructs used in the examples and their preparation
Figure PCTCN2021085609-appb-000003
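Example 2
This example tested the ATP hydrolysis activity of the N-terminally (M1-N60) truncated wild-type Prp43 helicase, the modified Prp43 helicase Prp43-2 (F181C/N623C/C508S), the modified Prp43 helicase Prp43-3 (P352C/S619C/C508S), the N-terminally truncated protein construct Prp43-GP, the N-terminally truncated protein construct Prp43-GP-2 (F181C/N623C/C508S), and the N-terminally truncated protein construct Prp43-GP-3 (P352C/S619C/C508S) when bound to or incubated with single-stranded DNA or single-stranded RNA substrates.
(1) Materials and methods
In this example, the ATPase activity of the Prp43 helicases was measured by absorbance spectrophotometry. Specifically, a premix containing 50 µM phosphate was prepared by pipetting 50 µL of phosphate standard solution into 950 µL of ultrapure water, and the tubes were numbered.
Table 4: Preparation of standards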
Figure PCTCN2021085609-appb-000004
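25 nM Prp43 helicase sample was added to replicate wells of a 96-well plate, followed by 0.5 nM M13 ssDNA. The sample was brought to a final volume of 10 µL with test buffer (10 mM HEPES, 600 mM KCl, 5 mM Mg2+) and reacted at 30 °C for 30 min, after which TMAD was added to a final concentration of 1 mM and the reaction was continued at 30 °C for 30 min. 10 µL of buffer (10 mM HEPES, 50 mM KCl, 5 mM Mg2+) was added to replicate wells as a negative control. High levels of phosphate cause sample background, which must be corrected for. To do so, 160 µL of working reagent was added to each background blank well immediately after the reaction mixture, stopping the reaction; these wells do not need the initial 30 min incubation, and the background blank reading can then be subtracted from the sample reading. Reactions were set up according to the schemes in Tables 4 and 5. Each sample, background blank, or negative control reaction requires 70 µL of reaction mixture.
Table 5: Sample preparation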
Figure PCTCN2021085609-appb-000005
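70 µL of reaction mixture was added to each sample, background blank, and negative control well, but not to the standards. The reactions were incubated at room temperature for 30 min. 160 µL of working reagent was then added to each well, followed by a further 15 min incubation at room temperature to stop the enzymatic reaction and develop the colorimetric product. The absorbance of all samples, standards, and controls was read on a microplate reader at 600-660 nm [maximum absorbance at 620 nm (A620)].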
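The readout described above is a standard-curve calculation: the A620 values of the phosphate standards define a line, and each background-corrected sample reading is converted to released phosphate with that line. A minimal sketch follows, assuming a linear response over the standard range. The function names and the numbers in the usage note are invented for illustration, since the standard concentrations in Table 4 are not reproduced in the text.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def released_phosphate_uM(a620_sample, a620_background, slope):
    """Phosphate attributable to the reaction, in the units of the standards.

    Subtracting the background blank cancels the curve intercept, so only
    the slope of the standard curve is needed.
    """
    return (a620_sample - a620_background) / slope
```

For instance, with hypothetical standards at 0, 10, 20, 30, 40, and 50 µM reading A620 = 0.05 + 0.02 per µM, a sample at 0.65 with a background blank at 0.15 corresponds to about 25 µM phosphate.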
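(2) Results
The ATP hydrolysis activities of the N-terminally (M1-N60) truncated wild-type Prp43 helicase and of the modified Prp43 helicases or protein constructs after binding DNA or RNA are shown in Figures 2 and 3, respectively. Figures 2 and 3 show that fusing the G-Path activation domain to the C-terminus of the Prp43 helicase or its mutants markedly increased the ATP hydrolysis activity of the enzyme, and that introducing two cysteines into the Prp43 helicase likewise improved the ATP hydrolysis activity.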
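Example 3
This example used fluorescence polarization to test the affinity for single-stranded DNA of the N-terminally (M1-N60) truncated wild-type Prp43 helicase and of the modified protein constructs Prp43-GP and Prp43-GP-2 (F181C/N623C/C508S).
(1) Materials and methods
The N-terminally (M1-N60) truncated wild-type helicase or the modified helicases were diluted in the following concentration series: 800 nM, 400 nM, 200 nM, 100 nM, 50 nM, 25 nM, 12.5 nM, 6.25 nM, 3.125 nM, 1.56 nM, and a blank. The enzyme was incubated with 10 nM single-stranded DNA substrate in binding buffer (10 mM HEPES, 50 mM KCl, 5% glycerol, pH 7.0) for 20 min, after which the polarization was read with excitation at 530 nm and emission at 560 nm, and an affinity curve was fitted and plotted. Each enzyme concentration was measured in triplicate.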
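The concentration series above is a two-fold serial dilution from 800 nM, and the affinity curve is obtained by fitting a binding model to the polarization readings. The application does not name the model, so the sketch below assumes a simple one-site hyperbola, P = baseline + amplitude × [E] / (Kd + [E]), and finds Kd by grid search with a linear least-squares solve for the other two parameters. This form ignores depletion of free enzyme by the 10 nM substrate, which matters when Kd is comparable to or below the substrate concentration. All function names are hypothetical.

```python
def dilution_series(top_nM=800.0, steps=10):
    """Two-fold serial dilution: 800, 400, ..., 1.5625 nM for the defaults."""
    return [top_nM / 2 ** i for i in range(steps)]


def fit_one_site(conc, signal, kd_grid):
    """Grid-search Kd; baseline and amplitude follow from linear least squares.

    Returns (kd, baseline, amplitude) for the grid value with the lowest
    sum of squared residuals.
    """
    best = None
    n = len(conc)
    mean_s = sum(signal) / n
    for kd in kd_grid:
        bound = [c / (kd + c) for c in conc]          # fraction bound at each point
        mean_b = sum(bound) / n
        sbb = sum((b - mean_b) ** 2 for b in bound)
        amplitude = sum((b - mean_b) * (s - mean_s)
                        for b, s in zip(bound, signal)) / sbb
        baseline = mean_s - amplitude * mean_b
        sse = sum((baseline + amplitude * b - s) ** 2
                  for b, s in zip(bound, signal))
        if best is None or sse < best[0]:
            best = (sse, kd, baseline, amplitude)
    return best[1], best[2], best[3]
```

In practice the triplicate readings would be averaged (or fitted jointly) before calling `fit_one_site`, and the grid can be refined around the best value.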
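(2) Results
The fitting results are shown in Figure 4. Under low-salt conditions, the affinity for single-stranded DNA of the N-terminally (M1-N60) truncated Prp43 helicase fused at its C-terminus to the G-Path domain (i.e., the Prp43-GP helicase), and of the modified enzyme Prp43-GP-2 (F181C/N623C/C508S) carrying site-directed mutations on the Prp43-GP background, did not differ appreciably from that of the wild type.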
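Example 4
This example used a gel mobility shift assay to examine DNA binding by the N-terminally (M1-N60) truncated wild-type Prp43 helicase and by the modified protein constructs Prp43-GP and Prp43-GP-2 (F181C/N623C/C508S), including the enhancement of nucleic acid binding after the TMAD catalyst catalyses disulfide bond formation between the mutated sites F181C and N623C in the mutant.
(1) Materials and methods
The experimental conditions were as follows. The FAM-labelled single-stranded polythymine substrate T44-37-FAM was added at 30 nM to buffer (10 mM HEPES, 50 mM KCl, pH 7.0). Wild-type Prp43 helicase and the modified Prp43-2 and Prp43-GP-2 helicases were then each added to a final concentration of 120 nM and incubated at 30 °C for 1.5 h. TMAD cross-linker was then added at a final concentration 1000-fold that of the enzyme to catalyse cross-linking of the cysteines at the mutated sites, followed by incubation at 30 °C for 1.5 h.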
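(2) Results
The gel mobility shift results are shown in Figure 5. After binding DNA, the wild-type Prp43 helicase dissociated from the nucleic acid to a considerable extent under electrophoresis conditions. The modified Prp43-GP helicase dissociated slightly less than the wild-type Prp43 helicase. The modified mutant Prp43-GP-2 bound DNA well, with no obvious dissociation of the enzyme from the nucleic acid whether or not it had been treated with TMAD.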
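Example 5
This example shows the N-terminally (M1-N60) truncated modified mutant helicase Prp43-GP-2 (F181C/N623C/C508S) controlling the movement of DNA construct X through an MspA nanopore.
(1) Materials and methods
DNA construct X was prepared as shown in Figure 6: the 5' end of region A (SEQ ID NO:32) is attached to four iSpC3 spacers (region B), which are attached to the 3' end of region C (SEQ ID NO:33); the 5' end of region C is attached to region D (SEQ ID NO:34); and region E of the construct (SEQ ID NO:35) is hybridized to region F (SEQ ID NO:36, which carries a 3' cholesterol tether). The fragment in which regions A, B, C, and D were synthesized joined together, at a concentration of 10 µM, was added together with fragments E and F in a 1:1:1 ratio to annealing buffer (10 mM Tris, pH 7.0, 50 mM NaCl) and annealed with the following program: 98 °C for 10 min; -0.1 °C/0.6 s for 300 cycles; 65 °C for 5 min; -0.1 °C/0.6 s for 400 cycles (fragments A, B, C, D, E, and F were supplied by Sangon Biotech (Shanghai) Co., Ltd.).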
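The annealing program can be checked with a little arithmetic. The sketch below assumes that each ramp cycle lowers the block temperature by 0.1 °C and lasts 0.6 s, which is one reading of "-0.1 °C/0.6 s"; the step encoding and the function name are hypothetical.

```python
# Each step is ("hold", temperature_C, seconds) or
# ("ramp", delta_C_per_cycle, seconds_per_cycle, cycles).
ANNEAL_PROGRAM = [
    ("hold", 98.0, 600),       # 98 °C for 10 min
    ("ramp", -0.1, 0.6, 300),  # 300 cycles of -0.1 °C, 0.6 s each
    ("hold", 65.0, 300),       # 65 °C for 5 min
    ("ramp", -0.1, 0.6, 400),  # 400 cycles of -0.1 °C, 0.6 s each
]


def summarize(program):
    """Return (kind, end_temperature_C, elapsed_seconds) after each step."""
    temperature, elapsed, rows = None, 0.0, []
    for step in program:
        if step[0] == "hold":
            _, temperature, seconds = step
            elapsed += seconds
        else:
            _, delta, seconds, cycles = step
            temperature = round(temperature + delta * cycles, 1)
            elapsed += seconds * cycles
        rows.append((step[0], temperature, round(elapsed, 1)))
    return rows
```

Under that reading, the first ramp ends at 68 °C (so the following hold steps down to 65 °C), the second ramp ends at 25 °C, and the whole program takes 1320 s, or 22 min.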
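The prepared DNA construct X and the modified mutant helicase Prp43-GP-2 (F181C/N623C/C508S) or the N-terminally truncated wild-type Prp43-GP were pre-incubated together for 30 min in buffer (10 mM HEPES, pH 8.0, 50 mM NaCl, 5% glycerol) at 25 °C, after which TMAD catalyst was added at 1000-fold the helicase concentration and incubated at room temperature for 30 min. Electrical measurements were acquired from MspA nanopores (MspA protein sequence SEQ ID NO:31, prepared as described in Michael Faller et al., "The Structure of a Mycobacterial Outer-Membrane Channel", Science 303, 1189 (2004); DOI: 10.1126/science.1094114) embedded in 1,2-diphytanoyl-glycero-3-phosphocholine lipid bilayers. Bilayers were formed by the Montal-Mueller technique across apertures of ~25 μm diameter in PTFE film, separating two buffered solutions of about 100 μL each. All experiments were carried out in the stated buffer. Single-channel currents were measured with an amplifier equipped with a digitizer. Ag/AgCl electrodes were connected to the buffered solutions such that the cis compartment was connected to the ground of the amplifier and the trans compartment to the active electrode.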
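After a single pore had been obtained in the bilayer, the complex of the DNA polynucleotide with the modified mutant helicase Prp43-GP-2 (F181C/N623C/C508S) or the N-terminally truncated wild-type Prp43-GP was added to 70 μL of buffer in the cis compartment of the electrophysiology chamber to initiate capture of the helicase-DNA complex in the nanopore. Helicase ATPase activity was activated as needed by adding divalent metal (5 mM MgCl2) and NTP (2.86 μM ATP) to the cis compartment. Experiments were carried out at a constant potential of +180 mV.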
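(2) Results
The results show that the movement of DNA construct X was controlled by the helicase Prp43-GP-2 (F181C/N623C/C508S); see Figure 7. The Prp43-GP-2 helicase controlled the translocation of nearly 200 bp of the DNA construct through the nanopore. By contrast, with the N-terminally truncated wild-type Prp43 (T61-A764 fragment) or the construct Prp43-GP, it was difficult to obtain a sustained current signal produced by the A/B/C/D fragment of construct X passing through the nanopore.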
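Example 6
This example shows the N-terminally (M1-N60) truncated modified mutant helicase Prp43-GP-2 (F181C/N623C/C508S) controlling the movement of RNA construct Y through an MspA nanopore.
(1) Materials and methods
The RNA construct was prepared as shown in Figure 8: the 3' end of region D (SEQ ID NO:37) is attached to 20 iSpC3 spacers (region A) and its 5' end is attached to four iSpC3 spacers (region B), which are attached to the 3' end of region C (SEQ ID NO:38); region E of the construct (SEQ ID NO:39) is hybridized to region F (SEQ ID NO:40). The fragment in which regions A, B, C, and D were synthesized joined together, at a concentration of 10 µM, was added together with fragments E and F in a 1:1:1 ratio to annealing buffer (10 mM Tris, pH 7.0, 50 mM NaCl) and annealed with the following program: 98 °C for 10 min; -0.1 °C/0.6 s for 300 cycles; 65 °C for 5 min; -0.1 °C/0.6 s for 400 cycles (fragments A, B, C, D, E, and F were supplied by Sangon Biotech (Shanghai) Co., Ltd.).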
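The prepared RNA construct and Prp43-GP-2 or the N-terminally truncated wild-type Prp43-GP were pre-incubated together for 30 min in buffer (10 mM HEPES, pH 7.0, 50 mM NaCl) at 30 °C. Electrical measurements were acquired from MspA nanopores (MspA protein sequence SEQ ID NO:31, prepared as described in Michael Faller et al., "The Structure of a Mycobacterial Outer-Membrane Channel", Science 303, 1189 (2004); DOI: 10.1126/science.1094114) embedded in 1,2-diphytanoyl-glycero-3-phosphocholine lipid bilayers. Bilayers were formed by the Montal-Mueller technique across apertures of ~25 μm diameter in PTFE film, separating two buffered solutions of about 100 μL each. All experiments were carried out in the stated buffer. Single-channel currents were measured with an amplifier equipped with a digitizer. Ag/AgCl electrodes were connected to the buffered solutions such that the cis compartment was connected to the ground of the amplifier and the trans compartment to the active electrode.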
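After a single pore had been obtained in the bilayer, the RNA polynucleotide construct and the Prp43-GP-2 helicase or the N-terminally truncated wild-type Prp43-GP were added to 70 μL of buffer in the cis compartment of the electrophysiology chamber to initiate capture of the helicase-RNA complex in the nanopore. Helicase ATPase activity was activated as needed by adding divalent metal (5 mM MgCl2) and NTP (5 mM ATP) to the cis compartment. Experiments were carried out at a constant potential of +180 mV.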
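(2) Results
The results show that the movement of the RNA construct was controlled by the Prp43-GP-2 helicase; see Figure 9. The Prp43-GP-2 helicase-controlled RNA movement lasted 3 seconds and corresponded to the translocation of nearly 30 bp of the RNA construct through the nanopore. By contrast, with the N-terminally truncated wild-type Prp43 (T61-A764 fragment) or the N-terminally truncated construct Prp43-GP, it was difficult to obtain a sustained current signal produced by the A/B/C/D fragment of construct Y passing through the nanopore.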
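Preferred embodiments and specific examples of the invention have been described herein, but they are provided by way of example only and are not intended to limit the invention. Numerous variations, changes, and substitutions will now occur to those skilled in the art without departing from the invention. The invention is therefore intended to cover any such alternatives, modifications, variations, or equivalents.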

Claims (46)

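  1. A modified Prp43 helicase comprising a RecA1 domain, a RecA2 domain, and a Ratchet domain, wherein, relative to the corresponding wild-type Prp43 helicase or a fragment thereof, the modified Prp43 helicase comprises the insertion or substitution of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 or more cysteines, and/or the insertion or substitution of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 or more unnatural amino acids, introduced into at least one domain selected from the RecA1 domain, the RecA2 domain, and the Ratchet domain.
  2. The modified Prp43 helicase of claim 1, wherein the introduced cysteine residues or unnatural amino acid residues are located at any one, or two or more, of the positions corresponding to M157, Q161, D165, F181, E182, N183, R324, L328, E332, R335, P353, L351, P352, H354, D321, E320, R358, P563, A564, N565, D603, K605, K606, H609, Y615, R616, S619, N623, A626, or K630 of SEQ ID NO:1, preferably at any one, or two or more, of the positions corresponding to F181, P352, S619, or N623 of SEQ ID NO:1.
  3. The modified Prp43 helicase of claim 1 or 2, wherein the fragment of the wild-type Prp43 helicase is the fragment obtained by removing the N-terminal domain of the Prp43 helicase, preferably by removing at least 96, at least 90, at least 80, at least 70, at least 60, at least 50, at least 40, or at least 30 residues starting from position 1 of the N-terminus.
  4. The modified Prp43 helicase of any one of claims 1 to 3, wherein the modified Prp43 helicase further comprises the substitution of one or more cysteine residues, preferably the substitution of one or more of the cysteine residues corresponding to C148, C214, C303, C323, C377, C441, C508, C543, and C608 of SEQ ID NO:1, more preferably with the cysteine residue being replaced by an alanine, glycine, valine, isoleucine, leucine, phenylalanine, tyrosine, serine, threonine, aspartic acid, glutamic acid, lysine, arginine, histidine, methionine, tryptophan, glutamine, asparagine, or proline residue.
  5. The modified Prp43 helicase of any one of claims 1 to 4, wherein the total number of introduced cysteine residues and unnatural amino acid residues is 2 or more, and a linkage is formed between at least one introduced cysteine or unnatural amino acid residue and another introduced cysteine or unnatural amino acid residue.
  6. The modified Prp43 helicase of claim 5, wherein the linkage is selected from a covalent linkage, a hydrogen bond, an electrostatic interaction, a π-π interaction, a hydrophobic interaction, and the like, and is preferably a covalent linkage.
  7. The modified Prp43 helicase of claim 6, wherein the covalent linkage is an -S-S- bond or a covalent linkage achieved by a cross-linker or catalyst selected from carbonyl chloride, maleimide, active ester, succinimide, azide, alkane, alkene, alkyne, polyethylene glycols (PEGs), polypeptide, polysaccharide, deoxyribonucleic acid (DNA), peptide nucleic acid (PNA), threose nucleic acid (TNA), glycerol nucleic acid (GNA), polyamide, or TMAD.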
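  8. The modified Prp43 helicase of any one of claims 1 to 7, wherein the modified Prp43 helicase further comprises one or more amino acid modifications selected from the following group:
    (a) substitution of one or more amino acids that interact with nucleotides;
    (b) substitution of one or more amino acids involved in binding NTP and/or divalent metal ions;
    (c) substitution of one or more amino acids that interact with the transmembrane pore;
    (d) a further modification that reduces the negative charge on the surface of the Prp43 helicase.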
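  9. The modified Prp43 helicase of any one of claims 1 to 8, which is derived from Chaetomium thermophilum, Bathycoccus prasinos, Uncultured bacterium, Archaeon, Parcubacteria, Sorangium cellulosum, Candidatus Sungbacteria, Mycolicibacterium chitae, Parcubacteria, Thermodesulforhabdus norvegica, Deltaproteobacteria, Puniceicoccales, Desulfobacterium vacuolatum, or Desulfobacter sp., or from a viral metagenome.
  10. The modified Prp43 helicase of any one of claims 1 to 8, wherein the wild-type Prp43 helicase is a Prp43 helicase having one of the following sequences: SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15.
  11. The modified Prp43 helicase of any one of claims 1 to 8, which has at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or at least 99.9% homology to the amino acid sequence of the corresponding wild-type Prp43 helicase.
  12. The modified Prp43 helicase of any one of claims 1 to 8, which is derived from Chaetomium thermophilum and preferably has at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or at least 99.9% homology to the amino acid sequence of SEQ ID NO:1.
  13. The modified Prp43 helicase of any one of claims 1 to 3, which is derived from Chaetomium thermophilum, wherein the introduced cysteine residues or unnatural amino acid residues are located at any one or more of the positions corresponding to F181, P352, S619, or N623 of SEQ ID NO:1.
  14. The modified Prp43 helicase of claim 13, which is a modified T61-A764 fragment of SEQ ID NO:1, wherein the modification is selected from F181C/N623C/C508S and P352C/S619C/C508S.
  15. The modified Prp43 helicase of any one of claims 1 to 14, which is in oligomeric form and comprises one or more modified Prp43 helicases of any one of claims 1 to 12.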
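  16. A protein construct comprising the modified Prp43 helicase of any one of claims 1 to 15 and, fused to the C-terminus or N-terminus of the Prp43 helicase, the G-Path domain of the auxiliary activating protein Paf1 or a Paf1 fragment containing the G-Path domain.
  17. The protein construct of claim 16, which comprises one or more of the modified Prp43 helicases.
  18. The protein construct of claim 16 or 17, wherein the auxiliary activating protein Paf1 is a Paf1 derived from Chaetomium thermophilum var. thermophilum, Thermothielavioides terrestris, Thermothelomyces thermophilus, Podospora anserina, Neurospora tetrasperma, Coniochaeta sp., Monosporascus sp., Hypoxylon sp., Madurella mycetomatis, or Coniochaeta pulveracea.
  19. The protein construct of claim 16 or 17, wherein the amino acid sequence of the auxiliary activating protein Paf1 is selected from SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, SEQ ID NO:20, SEQ ID NO:21, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24, or SEQ ID NO:25, or is the amino acid sequence of a variant having at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or at least 99.9% homology to the amino acid sequence of one of SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, SEQ ID NO:20, SEQ ID NO:21, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24, or SEQ ID NO:25, and the auxiliary activating protein Paf1 has the function of activating the Prp43 helicase.
  20. The protein construct of claim 16 or 17, wherein the G-Path domain of Paf1 is the K662-G742 fragment of SEQ ID NO:16 (SEQ ID NO:26) or the amino acid sequence of a variant having at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or at least 99.9% homology to the amino acid sequence of SEQ ID NO:26, and the variant has the function of activating the Prp43 helicase.
  21. The protein construct of claim 16, wherein the Prp43 helicase is the T61-A764 fragment of SEQ ID NO:1 and has, at any one, or two or more, of the positions corresponding to F181, P352, S619, or N623 of SEQ ID NO:1, the introduced insertion or substitution of one or more cysteines and/or the insertion or substitution of unnatural amino acids, and the amino acid sequence of the auxiliary activating protein Paf1 is SEQ ID NO:16.
  22. The protein construct of claim 16, wherein the Prp43 helicase is T61-A764 of SEQ ID NO:1 and further has a modification selected from F181C/N623C/C508S and P352C/S619C/C508S, and the C-terminus of the Prp43 helicase is fused to a polypeptide having the amino acid sequence of SEQ ID NO:26.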
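  23. A nucleic acid encoding the modified Prp43 helicase of any one of claims 1 to 15 or the protein construct of any one of claims 16 to 22.
  24. The nucleic acid of claim 23, wherein the nucleic acid is contained in a vector selected from a plasmid, a virus, or a bacteriophage.
  25. An expression vector comprising the nucleic acid of claim 23.
  26. The expression vector of claim 25, wherein the expression vector is selected from a plasmid, a virus, or a bacteriophage.
  27. The expression vector of claim 25 or 26, wherein the expression vector further comprises a regulatory element for controlling expression of the nucleic acid.
  28. The expression vector of claim 27, wherein the regulatory element is a promoter operably linked to the nucleic acid.
  29. The expression vector of claim 28, wherein the promoter is selected from T7, trc, lac, ara, or λL.
  30. A host cell comprising the nucleic acid of claim 23 or 24 or the expression vector of any one of claims 25 to 28.
  31. The host cell of claim 30, which is Escherichia coli.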
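  32. A method of preparing the protein construct of any one of claims 16 to 22, comprising: providing the polypeptide of SEQ ID NO:1 or a variant thereof and the polypeptide of SEQ ID NO:26 or a variant thereof; introducing at least one cysteine residue and/or at least one unnatural amino acid into the polypeptide of SEQ ID NO:1 or the variant thereof; and then fusing the polypeptide of SEQ ID NO:26 or the variant thereof to the C-terminus or N-terminus of the resulting polypeptide to form the protein construct.
  33. A method of preparing the modified Prp43 helicase of any one of claims 1 to 15 or the protein construct of any one of claims 16 to 22, comprising: culturing the host cell of claim 30 or 31, inducing expression, and then purifying the resulting expression product.
  34. A method of controlling the movement of a polynucleotide molecule, comprising contacting the polynucleotide molecule with the modified Prp43 helicase of any one of claims 1 to 15 or the protein construct of any one of claims 16 to 22.
  35. The method of controlling the movement of a polynucleotide molecule of claim 34, wherein the polynucleotide molecule is controlled to pass through a nanopore, and the nanopore is a transmembrane pore.
  36. The method of controlling the movement of a polynucleotide molecule of claim 35, wherein the transmembrane pore is selected from a protein pore, a solid-state pore, or a hybrid biological and solid-state pore, and preferably the protein pore is selected from Mycobacterium smegmatis porin A, Mycobacterium smegmatis porin B, Mycobacterium smegmatis porin C, Mycobacterium smegmatis porin D, hemolysin, lysenin, leukocidin, outer membrane porin F, outer membrane porin G, outer membrane phospholipase A, WZA, or Neisseria autotransporter lipoprotein.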
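  37. A method of characterizing a target polynucleotide, the method comprising:
    (a) contacting the target polynucleotide with the modified Prp43 helicase of any one of claims 1 to 15 or the protein construct of any one of claims 16 to 22, such that the Prp43 helicase or protein construct controls the movement of the target polynucleotide through a nanopore;
    (b) obtaining one or more characteristics of the nucleotides in the target polynucleotide as they interact with the nanopore, thereby characterizing the target polynucleotide.
  38. The method of characterizing a target polynucleotide of claim 37, wherein the method further comprises the step of applying a potential difference across the nanopore.
  39. The method of characterizing a target polynucleotide of claim 37 or 38, wherein the method uses one or more of the Prp43 helicases or protein constructs.
  40. The method of characterizing a target polynucleotide of claim 37 or 38, wherein the nanopore is a transmembrane pore, the transmembrane pore is selected from a protein pore, a solid-state pore, or a hybrid biological and solid-state pore, and preferably the protein pore is selected from Mycobacterium smegmatis porin A, Mycobacterium smegmatis porin B, Mycobacterium smegmatis porin C, Mycobacterium smegmatis porin D, hemolysin, lysenin, leukocidin, outer membrane porin F, outer membrane porin G, outer membrane phospholipase A, WZA, or Neisseria autotransporter lipoprotein.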
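  41. Use of the modified Prp43 helicase of any one of claims 1 to 15 or the protein construct of any one of claims 16 to 22 in characterizing a target polynucleotide or in controlling the movement of a target polynucleotide through a pore.
  42. An analysis device for characterizing a target polynucleotide, the analysis device comprising one or more nanopores, one or more modified Prp43 helicases of any one of claims 1 to 15 or protein constructs of any one of claims 16 to 22, and one or more containers.
  43. The analysis device for characterizing a target polynucleotide of claim 42, wherein the analysis device further comprises a chip comprising a lipid bilayer, and the nanopore spans the lipid bilayer.
  44. The analysis device for characterizing a target polynucleotide of claim 42 or 43, wherein the analysis device further comprises a buffer and PCR amplification reagents.
  45. The analysis device for characterizing a target polynucleotide of claim 42, 43, or 44, wherein the analysis device is a kit or a sensor.
  46. A method of forming a sensor for characterizing a target polynucleotide, comprising providing a nanopore and forming a complex between the nanopore and the modified Prp43 helicase of any one of claims 1 to 15 or the protein construct of any one of claims 16 to 22.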
Non-Patent Citations (9)

* Cited by examiner, † Cited by third party
Title
F. AUSUBEL ET AL.: "Current protocols in Molecular biology", 1987, GREEN PUBLISHING AND WILEY INTERSCIENCE
LONG JIAFU, ZHOU HAO, CHEN FEILONG: "Research progress in structure and function of polymerase-associated factor 1 complex", SCIENCE CHINA: CHINESE BULLETIN OF LIFE SCIENCE = SCIENTIA SINICA VITAE, vol. 49, no. 9, 1 September 2019 (2019-09-01), pages 1143 - 1154, XP055976785, ISSN: 1674-7232, DOI: 10.1360/SSV-2019-0154 *
MARCEL J. TAUCHERT ET AL.: "Structural and functional analysis of the RNA helicase Prp43 from the thermophilic eukaryote Chaetomium thermophilum", ACTA CRYST., vol. F72, 2016, pages 112 - 120, XP072457883, DOI: 10.1107/S2053230X15024498
MICHAEL FALLER ET AL.: "The Structure of a Mycobacterial Outer-Membrane Channel", SCIENCE, vol. 303, 2004, pages 1189, XP003029246, DOI: 10.1126/science.1094114
NEEDLEMAN, S. B.WUNSCH, C. D., J. MOL. BIOL., vol. 48, 1970, pages 443 - 453
SAMBROOKRUSSEL: "Molecular Cloning: A Laboratory Manual", 2001, COLD SPRING HARBOR LABORATORY PRESS
See also references of EP4299746A4
TANAKA NAOKO, SCHWER BEATE: "Mutations in PRP43 That Uncouple RNA-Dependent NTPase Activity and Pre-mRNA Splicing Function", BIOCHEMISTRY, vol. 45, no. 20, 1 May 2006 (2006-05-01), pages 6510 - 6521, XP055976580, ISSN: 0006-2960, DOI: 10.1021/bi052656g *
TAUCHERT MARCEL J, FOURMANN JEAN-BAPTISTE, LÜHRMANN REINHARD, FICNER RALF: "Structural insights into the mechanism of the DEAH-box RNA helicase Prp43", ELIFE, vol. 6, 16 January 2017 (2017-01-16), XP055976581, DOI: 10.7554/eLife.21510 *
