WO2021244557A1 - Method for analyzing target nucleic acids from cells - Google Patents

Method for analyzing target nucleic acids from cells

Info

Publication number
WO2021244557A1
Authority
WO
WIPO (PCT)
Prior art keywords
nucleic acid
target nucleic
sequence
strand
attached
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2021/097800
Other languages
English (en)
French (fr)
Inventor
施威扬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ocean University of China
Original Assignee
Ocean University of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ocean University of China
Priority to EP21817287.2A (EP4163390A4)
Priority to CN202180039759.9A (CN116234926A)
Priority to CA3181004A (CA3181004A1)
Priority to JP2022574773A (JP7853705B2)
Priority to US18/000,665 (US20230212648A1)
Publication of WO2021244557A1

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6806: Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
    • C12Q1/6869: Methods for sequencing
    • C12Q1/6813: Hybridisation assays
    • C12Q1/6816: Hybridisation assays characterised by the detection means
    • C12Q1/6844: Nucleic acid amplification reactions

Definitions

  • This application relates to the field of biomedicine, in particular to a method for analyzing target nucleic acids from cells and related preparations.
  • Nucleic acid sequencing technology has progressed rapidly. Sequencing generates large amounts of sequence data that can be used to study and interpret genomes and genomic regions, and it provides information widely used in routine biological research and diagnosis. Genome sequencing can be used to obtain information in a variety of biomedical contexts, including diagnostics, prognosis, biotechnology, and forensic biology. Sequencing methods include Maxam-Gilbert sequencing and chain-termination or de novo sequencing (including shotgun sequencing and bridge PCR), or next-generation methods, including polony sequencing, 454 pyrosequencing, Illumina sequencing, SOLiD sequencing, Ion Torrent semiconductor sequencing, HeliScope single-molecule sequencing, [image] sequencing, etc. For most sequencing applications, samples such as nucleic acid samples are processed before being introduced into the sequencer.
  • The final signal value is the average over multiple cells, so information on cell heterogeneity is lost.
  • Current analysis of cellular mRNA content by direct sequencing relies on bulk mRNA obtained from tissue samples containing millions of cells. When gene expression is analyzed in bulk mRNA, much of the functional information carried by expression in single cells is lost or blurred; in addition, dynamic processes such as the cell cycle cannot be observed from a population average.
  • Certain cell types in complex tissues (for example, the brain) can only be studied by analyzing cells individually.
  • This application provides a method for analyzing target nucleic acid from a cell, the method comprising:
  • a target nucleic acid derived from a single cell, wherein at least part of the target nucleic acid is added with an oligonucleotide adaptor sequence to become an attached target nucleic acid;
  • a solid support attached with at least one oligonucleotide tag, wherein each of the oligonucleotide tags includes a first strand and a second strand, the first strand includes a barcode sequence and a hybridization sequence located at the 3' end of the barcode sequence, the second strand includes a first portion complementary to the hybridization sequence of the first strand and a second portion complementary to the oligonucleotide adaptor sequence attached to the target nucleic acid, and the first strand and the second strand form a partially double-stranded structure or the second strand and the attached target nucleic acid form a partially double-stranded structure;
  • the oligonucleotide tag is connected to the attached target nucleic acid, thereby generating a barcoded target nucleic acid.
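The strand geometry described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: all sequences and function names are hypothetical, and the 5'-to-3' order of the two portions on the second strand is inferred from antiparallel base pairing rather than stated in the text.

```python
# Illustrative sketch only: every sequence and function name here is hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def design_second_strand(hybridization_seq: str, adaptor_seq: str) -> str:
    """Build a second strand (5'->3') that bridges the ligation junction.

    The first portion pairs with the hybridization sequence of the first
    strand; the second portion pairs with the adaptor on the target.
    """
    first_portion = reverse_complement(hybridization_seq)
    second_portion = reverse_complement(adaptor_seq)
    # Assumption: antiparallel pairing places the adaptor-complementary
    # portion 5' of the hybridization-complementary portion.
    return second_portion + first_portion


def ligate(first_strand: str, attached_target: str) -> str:
    """Model ligation of the first strand's 3' end to the target's 5' end."""
    return first_strand + attached_target


barcode = "ACGTACGT"      # hypothetical cell barcode
hybridization = "GGATCC"  # hypothetical hybridization sequence
adaptor = "TTGACA"        # hypothetical oligonucleotide adaptor
target = "ATGCCGTA"       # hypothetical target fragment

first_strand = barcode + hybridization
attached_target = adaptor + target
second_strand = design_second_strand(hybridization, adaptor)
barcoded = ligate(first_strand, attached_target)
```

In this toy model the second strand is the reverse complement of the junction region, so it can hold the first strand and the attached target side by side while a ligase joins them.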
  • the oligonucleotide tag is releasably attached to the solid support.
  • it includes releasing the at least one oligonucleotide tag from the solid support, and in b) linking the released oligonucleotide tag to the attached target nucleic acid to produce a barcoded target nucleic acid.
  • the oligonucleotide tag is directly or indirectly attached to the solid support through the 5' end of its first strand.
  • a ligase is further included in the discrete partition, and the ligase connects the oligonucleotide tag to the attached target nucleic acid.
  • the ligase includes T4 ligase.
  • the target nucleic acid sequence is located at the 3' end of the barcode sequence.
  • the solid support is a bead.
  • the beads are magnetic beads.
  • the discrete partitions are wells or droplets.
  • the barcode sequence includes a cell barcode sequence, and each oligonucleotide tag attached to the same solid support contains the same cell barcode sequence.
  • the cell barcode sequence comprises at least 2 cell barcode segments separated by a linker sequence.
  • a) includes co-distributing the target nucleic acid derived from a single cell and the solid support attached with at least one oligonucleotide tag into the discrete partitions.
  • b) includes connecting the hybridization sequence of the first strand of the oligonucleotide tag with the oligonucleotide adaptor attached to the target nucleic acid, thereby generating the barcoded target nucleic acid.
  • b) includes hybridizing the second portion of the second strand of the oligonucleotide tag with the oligonucleotide adaptor attached to the target nucleic acid, and allowing the hybridization sequence of the first strand of the oligonucleotide tag to be connected to the oligonucleotide adaptor attached to the target nucleic acid, thereby generating the barcoded target nucleic acid.
  • the attached target nucleic acid includes a unique molecular identification region.
  • the unique molecular identification region is located between the oligonucleotide adaptor sequence and the target nucleic acid sequence.
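As a rough illustration of how such a barcoded molecule could be read back, the sketch below splits a read into cell barcode, unique molecular identifier (UMI) and target. The layout (two barcode segments around one linker, then hybridization sequence, adaptor, UMI and target), the segment lengths, and all constant sequences are assumptions made for the example; the text does not fix any of them.

```python
from typing import NamedTuple, Optional


class ParsedRead(NamedTuple):
    cell_barcode: str
    umi: str
    target: str


# Hypothetical layout constants, chosen only for this example.
SEGMENT_LEN = 4           # length of each cell barcode segment
LINKER = "GTCA"           # linker separating the two barcode segments
HYBRIDIZATION = "GGATCC"  # hybridization sequence at the 3' end of the barcode
ADAPTOR = "TTGACA"        # adaptor added to the target nucleic acid
UMI_LEN = 6               # length of the unique molecular identification region


def parse_barcoded_read(read: str) -> Optional[ParsedRead]:
    """Split a read laid out as seg1-linker-seg2-hyb-adaptor-UMI-target.

    Returns None when the constant regions are not found where expected.
    """
    pos = 0
    seg1 = read[pos:pos + SEGMENT_LEN]
    pos += SEGMENT_LEN
    if read[pos:pos + len(LINKER)] != LINKER:
        return None
    pos += len(LINKER)
    seg2 = read[pos:pos + SEGMENT_LEN]
    pos += SEGMENT_LEN
    constant = HYBRIDIZATION + ADAPTOR
    if read[pos:pos + len(constant)] != constant:
        return None
    pos += len(constant)
    umi = read[pos:pos + UMI_LEN]
    if len(umi) < UMI_LEN:
        return None
    pos += UMI_LEN
    return ParsedRead(cell_barcode=seg1 + seg2, umi=umi, target=read[pos:])


example_read = "ACGT" + "GTCA" + "TTAG" + "GGATCC" + "TTGACA" + "CCATGG" + "ATGCCGTA"
parsed = parse_barcoded_read(example_read)
```

A real pipeline would also tolerate sequencing errors in the constant regions; exact matching is used here only to keep the sketch short.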
  • the oligonucleotide tag further includes an amplification primer recognition region.
  • the amplification primer recognition region is a universal amplification primer recognition region.
  • the method further includes:
  • the method further includes, after b) and before c), releasing the barcoded target nucleic acid from the discrete partition.
  • c) includes sequencing the barcoded target nucleic acid to obtain the characterization result.
  • the method further comprises assembling contiguous nucleic acid sequences of at least a portion of the genome of the single cell from the sequence of the barcoded target nucleic acid.
  • the single cell is characterized based on the nucleic acid sequence of at least a portion of the genome of the single cell.
  • each of the discrete partitions includes at most the target nucleic acid derived from a single cell.
  • the method further includes identifying a single nucleic acid sequence in the barcoded target nucleic acid as derived from a given nucleic acid in the target nucleic acid based at least in part on the existence of the unique molecular identification region.
  • the target nucleic acid includes an exogenous nucleic acid
  • the exogenous nucleic acid includes an exogenous nucleic acid linked to a protein, lipid, and/or small molecule compound, and the protein, lipid, and/or small molecule compound can bind to the target molecule in the cell.
  • the method further comprises determining the amount of a given nucleic acid in the target nucleic acid based on the presence of the unique molecular identification region.
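One common way to turn unique molecular identification regions into a quantity is to count distinct UMIs per cell barcode and target, so that amplification duplicates are counted once. The sketch below assumes reads have already been parsed into (cell barcode, UMI, target identifier) tuples and uses exact UMI matching with no error correction; the record format and names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Record = Tuple[str, str, str]  # (cell_barcode, umi, target_id), hypothetical format


def count_molecules(records: Iterable[Record]) -> Dict[Tuple[str, str], int]:
    """Count distinct UMIs for each (cell barcode, target) pair.

    Reads sharing cell barcode, target and UMI are treated as copies of one
    original molecule and are counted once.
    """
    umis = defaultdict(set)
    for cell_barcode, umi, target_id in records:
        umis[(cell_barcode, target_id)].add(umi)
    return {key: len(value) for key, value in umis.items()}


records = [
    ("ACGTTTAG", "CCATGG", "geneA"),
    ("ACGTTTAG", "CCATGG", "geneA"),  # amplification duplicate of the read above
    ("ACGTTTAG", "GGTACC", "geneA"),  # second molecule in the same cell
    ("TTGGCCAA", "CCATGG", "geneA"),  # same UMI, different cell
]
counts = count_molecules(records)
```

Here the first cell is credited with two molecules of the target and the second cell with one, even though four reads were observed.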
  • it includes pre-treating the cells before a).
  • the pretreatment includes fixing the cells.
  • the cells are fixed using a fixative, and the fixative is selected from one or more of the following group: formaldehyde, paraformaldehyde, methanol, ethanol, acetone, glutaraldehyde, osmic acid and potassium dichromate.
  • the fixative is selected from one or more of the following group: formaldehyde, paraformaldehyde, methanol, ethanol, acetone, glutaraldehyde, osmic acid and potassium dichromate.
  • the pretreatment includes exposing the nucleus of the cell.
  • the pretreatment includes treating the cells with a detergent, the detergent including Triton, Tween, SDS, NP-40, and/or digitonin.
  • the target nucleic acid includes one or more selected from the group consisting of DNA, RNA and cDNA.
  • it further includes, after b) and before c), amplifying the barcoded target nucleic acid.
  • the barcoded target nucleic acid is released from the discrete partition, and the amplification is performed after the barcoded target nucleic acid is released from the discrete partition.
  • amplification primers are used in the amplification, and the amplification primers include a random priming sequence.
  • the random priming sequence is a random hexamer.
  • the amplifying includes at least partially hybridizing the random priming sequence with the barcoded target nucleic acid and extending the random priming sequence in a template-directed manner.
  • it includes releasing at least a portion of the target nucleic acid from the single cell in the discrete partition to the outside of the cell, and in b) linking the released target nucleic acid to the oligonucleotide tag to produce a barcoded target nucleic acid.
  • it includes allowing at least a portion of the oligonucleotide tag released from the solid support to enter the single cell and, in b), to link with the target nucleic acid, thereby generating a barcoded target nucleic acid.
  • it includes using a microfluidic device to co-distribute the target nucleic acid derived from a single cell and the solid support attached with at least one oligonucleotide tag into the discrete partitions.
  • the discrete partitions are droplets
  • the microfluidic device is a droplet generator.
  • the microfluidic device includes a first input channel and a second input channel that meet at a junction fluidly connected to the output channel.
  • the method further includes introducing a sample containing the target nucleic acid into the first input channel, and introducing the solid support to which at least one oligonucleotide tag is attached into the second input channel, thereby generating a mixture of the sample and the solid support in the output channel.
  • the output channel and the third input channel are fluidly connected at the junction.
  • it further includes introducing oil into the third input channel, so that aqueous droplets in the water-in-oil emulsion are formed as the discrete partitions.
  • each of the discrete partitions contains at most the target nucleic acid from a single cell.
  • the first input channel and the second input channel form a substantially perpendicular angle to each other.
  • the target nucleic acid includes cDNA derived from RNA in the single cell.
  • the RNA includes mRNA.
  • it includes, before a), reverse transcribing the RNA and producing the attached target nucleic acid.
  • a reverse transcription primer is used in the reverse transcription, and the reverse transcription primer includes the oligonucleotide adaptor sequence and the polyT sequence in a 5' to 3' direction.
  • the reverse transcription includes hybridizing the polyT sequence with the RNA and extending the polyT sequence in a template-directed manner.
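The primer layout and extension step can be modeled with simple string operations. This is a toy sketch, not a description of the actual protocol: the adaptor sequence, the polyT length, the example mRNA, and the assumption that the primer anneals to the 3'-most stretch of the polyA tail are all illustrative choices.

```python
# Toy model of polyT-primed reverse transcription; all sequences are hypothetical.
RNA_TO_CDNA = str.maketrans("ACGTU", "TGCAA")


def build_rt_primer(adaptor: str, poly_t_len: int) -> str:
    """Return a primer with the adaptor at the 5' end and polyT at the 3' end."""
    return adaptor + "T" * poly_t_len


def reverse_transcribe(mrna: str, adaptor: str, poly_t_len: int) -> str:
    """Return first-strand cDNA (5'->3') carrying the adaptor at its 5' end.

    Assumes the polyT anneals to the last `poly_t_len` bases of the polyA
    tail and is extended along the rest of the mRNA as template.
    """
    tail = "A" * poly_t_len
    if not mrna.endswith(tail):
        raise ValueError("mRNA lacks a polyA tail long enough for the primer")
    template = mrna[:-poly_t_len]
    # The extension product is the reverse complement of the template, as DNA.
    extension = template.translate(RNA_TO_CDNA)[::-1]
    return build_rt_primer(adaptor, poly_t_len) + extension


adaptor = "TTGACA"              # hypothetical oligonucleotide adaptor
mrna = "AUGGCCUAA" + "A" * 10   # hypothetical transcript with a 10-base polyA tail
cdna = reverse_transcribe(mrna, adaptor, poly_t_len=10)
```

Because the adaptor sits at the 5' end of the primer, every cDNA made this way starts with the adaptor, which is what allows it to act as the attached target nucleic acid in the later linking step.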
  • the target nucleic acid includes DNA derived from the single cell.
  • the DNA includes genomic DNA, open chromatin DNA, protein-bound DNA regions, and/or exogenous nucleic acids linked to proteins, lipids, and/or small molecule compounds, wherein the proteins, lipids, and/or small molecule compounds can bind to target molecules in cells.
  • it includes fragmenting the DNA derived from a single cell before a).
  • the attached target nucleic acid is produced after or during the fragmentation.
  • the fragmentation includes the use of ultrasonic fragmentation, and then adding a sequence containing the oligonucleotide adaptor to the fragmented DNA, thereby obtaining the attached target nucleic acid.
  • the fragmentation includes using a DNA endonuclease or DNA exonuclease to break the DNA, and then adding a sequence containing the oligonucleotide adaptor to the fragmented DNA to obtain the attached target nucleic acid.
  • the fragmentation includes using a transposase-nucleic acid complex to integrate the sequence comprising the oligonucleotide adaptor into the DNA, and releasing the transposase to obtain the attached target nucleic acid.
  • the transposase-nucleic acid complex includes a transposase and a transposon end nucleic acid molecule, wherein the transposon end nucleic acid molecule includes the oligonucleotide adaptor sequence.
  • the transposase includes Tn5.
  • the DNA includes a DNA region that binds to a protein
  • the transposase-nucleic acid complex also includes a portion that directly or indirectly recognizes the protein.
  • the part that directly or indirectly recognizes the protein includes one or more of the following group: an antibody that specifically binds to the protein and protein A or protein G.
  • the present application also provides a composition
  • a composition comprising: a plurality of solid supports, each of which is attached with at least one oligonucleotide tag, wherein each of the oligonucleotide tags includes a first strand and a second strand
  • the first strand includes a barcode sequence and a hybridization sequence located at the 3' end of the barcode sequence
  • the second strand includes a sequence complementary to the hybridization sequence of the first strand.
  • the barcode sequence of the oligonucleotide tag includes a common barcode domain and a variable domain; the common barcode domain is the same in the oligonucleotide tags attached to the same solid support, and the common barcode domain is different between two or more solid supports in the plurality of solid supports.
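The relationship between common and variable barcode domains can be expressed as a small consistency check. The data layout below (a mapping from a bead identifier to a list of domain pairs) is hypothetical, and the check is stricter than the text: it requires every solid support to carry a distinct common domain, whereas the text only requires that the domain differ between two or more supports.

```python
from typing import Dict, List, Tuple

Tag = Tuple[str, str]  # (common_domain, variable_domain), hypothetical representation


def common_domains_are_valid(beads: Dict[str, List[Tag]]) -> bool:
    """Check that each bead has one common domain and no two beads share it."""
    seen = set()
    for tags in beads.values():
        commons = {common for common, _variable in tags}
        if len(commons) != 1:
            return False  # tags on one bead disagree, or the bead has no tags
        common = commons.pop()
        if common in seen:
            return False  # assumption: common domains must be unique per bead
        seen.add(common)
    return True


good_beads = {
    "bead1": [("ACGTTTAG", "AAC"), ("ACGTTTAG", "GGT")],
    "bead2": [("TTGGCCAA", "AAC"), ("TTGGCCAA", "CTA")],
}
mixed_bead = {"bead1": [("ACGTTTAG", "AAC"), ("TTGGCCAA", "GGT")]}
```

With these inputs, `common_domains_are_valid(good_beads)` is true and `common_domains_are_valid(mixed_bead)` is false, because the second set mixes two common domains on one bead.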
  • the present application also provides a kit for analyzing target nucleic acids from cells, which includes the composition described in the present application.
  • the kit includes a transposase.
  • the kit further includes at least one of a nucleic acid amplification agent, a reverse transcription agent, a fixative, a permeabilizing agent, a linking agent, and a lysis agent.
  • a method for amplifying a target nucleic acid from a cell comprising:
  • a) providing discrete partitions comprising: i. a target nucleic acid derived from a single cell, wherein at least part of the target nucleic acid is added with an oligonucleotide adaptor sequence to become an attached target nucleic acid; and ii. a solid support attached with at least one oligonucleotide tag, wherein each of the oligonucleotide tags includes a first strand and a second strand, the first strand includes a barcode sequence and a hybridization sequence located at the 3' end of the barcode sequence, the second strand includes a first portion complementary to the hybridization sequence of the first strand and a second portion complementary to the oligonucleotide adaptor sequence attached to the target nucleic acid, and the first strand and the second strand form a partially double-stranded structure or the second strand and the attached target nucleic acid form a partially double-stranded structure;
  • the oligonucleotide tag is releasably attached to the solid support.
  • it includes releasing the at least one oligonucleotide tag from the solid support, and in b) linking the released oligonucleotide tag to the attached target nucleic acid to produce a barcoded target nucleic acid.
  • the oligonucleotide tag is directly or indirectly attached to the solid support through the 5' end of its first strand.
  • a ligase is further included in the discrete partition, and the ligase connects the oligonucleotide tag to the attached target nucleic acid.
  • the ligase includes T4 ligase.
  • the target nucleic acid sequence is located at the 3' end of the barcode sequence.
  • the solid support is a bead.
  • the discrete partitions are wells or droplets.
  • the barcode sequence includes a cell barcode sequence, and each oligonucleotide tag attached to the same solid support contains the same cell barcode sequence.
  • the cell barcode sequence comprises at least 2 cell barcode segments separated by a linker sequence.
  • a) includes co-distributing the target nucleic acid derived from a single cell and the solid support attached with at least one oligonucleotide tag into the discrete partitions.
  • b) includes connecting the hybridization sequence of the first strand of the oligonucleotide tag with the oligonucleotide adaptor attached to the target nucleic acid, thereby generating the barcoded target nucleic acid.
  • b) includes hybridizing the second portion of the second strand of the oligonucleotide tag with the oligonucleotide adaptor attached to the target nucleic acid, and allowing the hybridization sequence of the first strand of the oligonucleotide tag to be connected to the oligonucleotide adaptor attached to the target nucleic acid, thereby generating the barcoded target nucleic acid.
  • the attached target nucleic acid includes a unique molecular identification region.
  • the unique molecular identification region is located between the oligonucleotide adaptor sequence and the target nucleic acid sequence.
  • the oligonucleotide tag further includes an amplification primer recognition region.
  • the amplification primer recognition region is a universal amplification primer recognition region.
  • the barcoded target nucleic acid is released from the discrete partition, and the amplification is performed after the barcoded target nucleic acid is released from the discrete partition.
  • amplification primers are used in the amplification, and the amplification primers include a random priming sequence.
  • the random priming sequence is a random hexamer.
  • the amplifying includes at least partially hybridizing the random priming sequence with the barcoded target nucleic acid and extending the random priming sequence in a template-directed manner.
  • this application also provides a method for sequencing a target nucleic acid from a cell, the method comprising:
  • a) providing discrete partitions comprising: i. a target nucleic acid derived from a single cell, wherein at least part of the target nucleic acid is added with an oligonucleotide adaptor sequence to become an attached target nucleic acid; and ii. a solid support attached with at least one oligonucleotide tag, wherein each of the oligonucleotide tags includes a first strand and a second strand, the first strand includes a barcode sequence and a hybridization sequence located at the 3' end of the barcode sequence, the second strand includes a first portion complementary to the hybridization sequence of the first strand and a second portion complementary to the oligonucleotide adaptor sequence attached to the target nucleic acid, and the first strand and the second strand form a partially double-stranded structure or the second strand and the attached target nucleic acid form a partially double-stranded structure;
  • the oligonucleotide tag is releasably attached to the solid support.
  • it includes releasing the at least one oligonucleotide tag from the solid support, and in b) linking the released oligonucleotide tag to the attached target nucleic acid to produce a barcoded target nucleic acid.
  • the oligonucleotide tag is directly or indirectly attached to the solid support through the 5' end of its first strand.
  • a ligase is further included in the discrete partition, and the ligase connects the oligonucleotide tag to the attached target nucleic acid.
  • the ligase includes T4 ligase or T7 ligase.
  • the target nucleic acid sequence is located at the 3' end of the barcode sequence.
  • the solid support is a bead.
  • the discrete partitions are wells or droplets.
  • the barcode sequence includes a cell barcode sequence, and each oligonucleotide tag attached to the same solid support contains the same cell barcode sequence.
  • the cell barcode sequence comprises at least 2 cell barcode segments separated by a linker sequence.
  • a) includes co-distributing the target nucleic acid derived from a single cell and the solid support attached with at least one oligonucleotide tag into the discrete partitions.
  • b) includes connecting the hybridization sequence of the first strand of the oligonucleotide tag with the oligonucleotide adaptor attached to the target nucleic acid, thereby generating the barcoded target nucleic acid.
  • b) includes hybridizing the second portion of the second strand of the oligonucleotide tag with the oligonucleotide adaptor attached to the target nucleic acid, and allowing the hybridization sequence of the first strand of the oligonucleotide tag to be connected to the oligonucleotide adaptor attached to the target nucleic acid, thereby generating the barcoded target nucleic acid.
  • the attached target nucleic acid includes a unique molecular identification region.
  • the unique molecular identification region is located between the oligonucleotide adaptor sequence and the target nucleic acid sequence.
  • the oligonucleotide tag further includes an amplification primer recognition region.
  • the amplification primer recognition region is a universal amplification primer recognition region.
  • it further comprises assembling a contiguous nucleic acid sequence of at least a part of the genome of the single cell from the sequence of the barcoded target nucleic acid.
  • the single cell is characterized based on the nucleic acid sequence of at least a portion of the genome of the single cell.
  • each of the discrete partitions includes at most the target nucleic acid derived from a single cell.
  • it further includes identifying a single nucleic acid sequence in the barcoded target nucleic acid as derived from a given nucleic acid in the target nucleic acid based at least in part on the existence of the unique molecular identification region.
  • it further includes determining the amount of a given nucleic acid in the target nucleic acid based on the existence of the unique molecular identification region.
  • Figure 1 shows a schematic diagram of the PCR method in this application for generating nucleotide tags suitable for non-transcriptome analysis.
  • Figure 2 shows a schematic diagram of the T4 ligase method in this application for generating nucleotide tags suitable for non-transcriptome analysis.
  • Figure 3 shows a schematic diagram of the PCR method in this application for generating nucleotide tags suitable for transcriptome analysis.
  • Figure 4 shows a schematic diagram of the T4 ligase method in this application for generating nucleotide tags suitable for non-transcriptome analysis.
  • Figure 5 shows the fragment length distribution diagram of the ATAC sequencing results of human 293T cells mediated by the Tn5 transposition reaction in the present application.
  • Figures 6A and 6B show the signal enrichment at transcription start sites (TSS) in the ATAC sequencing results of human 293T cells mediated by the Tn5 transposition reaction in the present application.
  • Figure 7 shows the proportions of different sequence types in the ATAC sequencing results of human 293T cells mediated by the Tn5 transposition reaction in the present application.
  • Figure 8 shows a schematic diagram of the microfluidic chip in this application.
  • Figure 9 shows the stacking curve of the ATAC sequencing results in this application based on the number of reads in each barcode.
  • Figure 10 shows the distribution of the number of unique mapped reads in a single cell as a result of ATAC sequencing in this application.
  • Figure 11 shows the distribution map of the ATAC data of the cells in this application in the gene region.
  • Figure 12 shows the result of the ATAC signal correlation analysis of single cells in the present application.
  • Figure 13 shows the fragment distribution of the CUT&Tag library in this application.
  • Figure 14 shows the positions of the CUT&Tag fragments relative to transcription start sites in this application.
  • Figure 15 shows the proportions of CUT&Tag fragments distributed across the genome in this application.
  • Figure 16 shows the single-cell CUT&Tag distribution results in this application.
  • Figure 17 shows the result of clearly distinguishing single cells of mixed cells according to the single-cell transcriptome in this application.
  • Figure 18 shows the distribution results of the number of transcripts and genes detected in each cell in this application.
  • Figure 19 shows the result of clearly distinguishing single cells of mixed cells according to the single-cell genome in the present application.
  • Figure 20 shows the results of single-cell sequencing in this application with different degrees of coverage for each cell and each genomic site.
  • Figure 21 shows the result of clearly distinguishing single cells of mixed cells based on single-cell DNA modification in the present application.
  • Figure 22 shows the results of the methylation modification distribution detected in each cell in this application.
  • Figure 23 shows the 5hmC modification distribution results detected in each cell in this application.
  • Figure 24 shows that single cells in a mixed-cell sample can be well distinguished according to the transcriptome and ATAC data in this application.
  • Figure 25 shows that single cells in a mixed-cell sample can be well distinguished according to the transcriptome and CUT&Tag data in this application.
  • Figure 26 shows that the transcriptome and methylome of the same cell in this application match well with the gene model and the known methylation sites.
  • Figure 27 shows a schematic diagram of a spatial lattice chip in this application.
  • Figure 28 shows the number of genes detected when the HE-stained section is superimposed on the spatial lattice chip in this application.
  • sequencing generally refers to a technology for obtaining sequence information of nucleic acid molecules.
  • analysis of the base sequence of a specific DNA fragment, for example the arrangement of adenine (A), thymine (T), cytosine (C) and guanine (G), etc.
  • sequencing methods can include the Sanger dideoxy chain-termination method, pyrosequencing, and the "parallel sequencing by synthesis" or "sequencing by ligation" platforms used by Illumina, Life Technologies, and Roche for next-generation sequencing, as well as sequencers from MGI/Complete Genomics; they may also include nanopore sequencing methods, such as the method developed by Oxford Nanopore Technologies, PacBio's third-generation sequencers, or electronic detection-based methods, such as the Ion Torrent technology launched by Life Technologies.
  • characterization result generally refers to the information description of nucleic acids and other related molecules obtained by sequencing or other biological analysis methods such as genomics and/or proteomics.
  • it can include sequence information from whole genome sequencing, accessible chromatin sequence and distribution information, nucleic acid sequences and the binding information of their binding factors, pathogenic gene mutation information, single nucleotide polymorphisms (SNPs), nucleotide methylation, transcriptome information (such as temporal or spatial changes in gene expression levels), etc.
  • protein A generally refers to a cell-derived protein that can bind to the conserved region of the antibody heavy chain derived from different species (ie, the recognition protein of the antibody). For example, it can bind to the Fc fragment in human and various mammalian serum IgG molecules.
  • the mammals can include pigs, dogs, rabbits, humans, monkeys, mice, rats, and cattle, etc.
  • the IgG subclasses bound by protein A can mainly include IgG1, IgG2 and IgG4; besides binding to IgG, protein A can also bind to IgM and IgA in the serum.
  • protein A may include protein A (SPA) from Staphylococcus aureus.
  • SPA is the main component of cell wall antigens. Almost 90% of Staphylococcus aureus strains contain this component, but the content of different strains varies greatly.
  • the ability of protein A to bind to antibodies can be used to locate and/or analyze the target protein by forming a target protein-antibody-protein A complex.
  • solid support generally refers to any material that is suitable or can be modified to be suitable for attaching the oligonucleotide tags, barcode sequences, primers, etc. described herein.
  • a solid support includes an array of holes or recesses located in the surface. These can be manufactured using a variety of technologies, such as photolithography, stamping, molding, and microetching; the composition and geometry of the solid support can vary with its use.
  • the solid support can be a planar structure (such as a slide, chip, microchip, and/or array, etc.); for example, the solid support or its surface can also be non-planar, such as the inner or outer surface of a tube or container; for example, the solid support may also include microspheres or beads.
  • "beads" or "microspheres" or "particles" generally refer to small discrete particles.
  • suitable bead compositions include, but are not limited to: plastics, ceramics, glass, polystyrene, methyl styrene, acrylic polymers, paramagnetic materials, thorium oxide sol, carbon graphite, titanium dioxide, latex or cross-linked dextran (such as agarose), cellulose, nylon, cross-linked micelles and Teflon; any other materials for solid supports outlined herein can also be used. See also the Microsphere Detection Guide (Fishers, Ind.). In some embodiments, the microspheres may be magnetic microspheres or beads.
  • "unique molecular identification region" can also be referred to as "molecular barcode", "molecular marker", "unique identifier (UID)", "unique molecular identifier (UMI)", etc., and usually refers to a unique sequence code attached to each original nucleotide fragment of the same sample.
  • the subsequent amplification bias can be corrected by directly counting the unique molecular identifiers (UMI) sequenced after amplification.
  • UMIs can be designed, incorporated, and applied according to methods known in the art, as exemplified by WO2012/142213, Islam et al., Nat. Methods (2014) 11:163-166, and Kivioja, T. et al., Nat. Methods (2012) 9:72-74, which are incorporated herein by reference in their entirety.
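  • The UMI-based correction described above can be illustrated with a minimal sketch. It assumes each read has already been parsed into a hypothetical (cell barcode, UMI, gene) tuple, and it treats reads sharing all three values as amplification copies of one original molecule. The tuple layout and the example values are assumptions for illustration, not part of this application.

```python
from collections import defaultdict


def count_molecules(reads):
    """Count distinct UMIs per (cell barcode, gene).

    `reads` is an iterable of (cell_barcode, umi, gene) tuples. Reads that
    share all three values are collapsed into a single original molecule.
    """
    umis = defaultdict(set)
    for cell_barcode, umi, gene in reads:
        umis[(cell_barcode, gene)].add(umi)
    return {key: len(values) for key, values in umis.items()}


# Hypothetical reads: the first two are amplification copies of one molecule.
reads = [
    ("CELL1", "AACGT", "geneA"),
    ("CELL1", "AACGT", "geneA"),
    ("CELL1", "GGTCA", "geneA"),
    ("CELL2", "AACGT", "geneA"),
]
counts = count_molecules(reads)
```

  • Counting distinct UMIs instead of reads means that a fragment amplified many times still contributes one count. This sketch does not handle sequencing errors within the UMI itself.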
  • the term "amplification primer recognition region” generally refers to a nucleotide sequence capable of complementary hybridization with the primer sequence for amplifying the target nucleic acid.
  • the binding of the primer to this region can trigger nucleotide extension, ligation and/or synthesis, for example, to increase the copy number of the target nucleic acid (i.e., amplification) under the action of the polymerase chain reaction; in some embodiments, this also includes amplification of sequences such as the oligonucleotide tag and the unique molecular identifier.
  • the term "discrete partition” generally refers to independent spatial units that contain the target substance to be analyzed.
  • the discrete partitions may also contain other substances, such as dyes, emulsifiers, surfactants, stabilizers, polymers, aptamers, reducing agents, initiators, biotin markers, fluorophores, buffers, acidic solutions, alkaline solutions, light-sensitive enzymes, pH-sensitive enzymes, aqueous buffers, detergents, ionic detergents, non-ionic detergents, etc.
  • the term "releasably attached" generally means that the connection between the oligonucleotide tag and the solid support is releasable, cleavable, reversible or destructible.
  • for example, the connection between the oligonucleotide tag and the solid support contains unstable bonds, such as chemically, thermally or light-sensitive bonds (for example, disulfide bonds or UV-sensitive bonds), which can be broken by the corresponding treatment to achieve releasable attachment.
  • for example, the connection between the oligonucleotide tag and the solid support includes a specific base that can be recognized by a nuclease, such as dU, which can be cleaved by the action of the UNG enzyme.
  • for example, the connection between the oligonucleotide tag and the solid support contains an endonuclease recognition sequence, which can be cleaved by the action of the nuclease.
  • for example, the solid support is degradable, and degradation of the solid support releases the oligonucleotide tag, thereby achieving releasable attachment.
  • linker generally refers to a nucleotide sequence that connects various functional sequences together, and can also include a molecular structure (nucleic acid, polypeptide or other chemical connection structure, etc.) that does so, wherein the functional sequences may include a cell barcode segment, barcode sequence, amplification primer recognition region, sequencing primer recognition region, unique molecular identifier, etc.
  • the nucleotide sequence can be a fixed nucleotide sequence.
  • the linker can also include chemical modifications.
  • random leader sequence generally refers to a random primer that can present a fourfold degeneracy at each position.
  • the random leader sequence recognizes and binds to the corresponding region of the target nucleic acid (including the target nucleic acid sequence and other nucleotide sequences attached thereto) to realize the synthesis and/or amplification of the nucleotide sequence.
  • barcode sequence generally refers to a nucleotide sequence capable of identifying a target nucleic acid or its derivative or modified form.
  • cell barcode sequence generally refers to a nucleotide sequence that can be used to identify the source of a target nucleic acid sample.
  • the source can be, for example, from the same cell or different cells.
  • different cell barcode sequences can be used to label the nucleic acid in each source, so that the source of the sample can be identified.
  • barcodes are also commonly referred to as indexes, tags, etc.
  • any suitable barcode or barcode set can be used, such as the cell barcode sequences described in the publication US2013/0274117.
  • cell barcode segment generally refers to the barcode nucleotide units that constitute the cell barcode sequence; N of the cell barcode segments can be joined into a cell barcode sequence through PCR or the action of DNA ligase.
  • N can be greater than or equal to 1, so that the cell barcode sequence formed is sufficient to identify the cell source of each nucleic acid sample derived from multiple sources.
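  • As a hedged illustration of how N cell barcode segments can yield enough distinct cell barcode sequences, the sketch below joins one segment from each of N pools. The pools and segment sequences are hypothetical, and simple string concatenation stands in for the PCR or ligation step.

```python
from itertools import product


def build_cell_barcodes(segment_pools):
    """Join one segment from each pool, in order, into full barcode sequences."""
    return ["".join(parts) for parts in product(*segment_pools)]


# Three hypothetical pools of two segments each (N = 3).
pools = [["AAC", "GGT"], ["CTA", "TCG"], ["ACG", "TGA"]]
barcodes = build_cell_barcodes(pools)
```

  • With k segments per pool and N pools, the number of possible cell barcode sequences is k to the power N, so a small set of segments can distinguish a large number of cells.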
  • oligonucleotide adaptor generally refers to a nucleotide sequence that is attached to a target nucleic acid and includes a sequence that is capable of complementary hybridization to the oligonucleotide tag.
  • the nucleotide sequence may be a partially double-stranded structure, for example, it may have an overhang sequence that hybridizes with the oligonucleotide tag; in some embodiments, the oligonucleotide adaptor may also include a transposase (such as Tn5 transposase) binding sequence; in some embodiments, the oligonucleotide adaptor may also include an amplification primer recognition sequence; in some embodiments, the oligonucleotide adaptor may also include a reverse transcription primer sequence.
  • barcoded target nucleic acid generally refers to a target nucleic acid to which at least a cell barcode sequence is attached.
  • the term "common barcode domain” generally refers to a barcode sequence used to identify the source of the target nucleic acid.
  • the common barcode domains contained in oligonucleotide tags attached to the same solid support are the same, and the common barcode domains contained in oligonucleotide tags attached to different solid supports differ from one another.
  • the oligonucleotide tag released from the same solid support is connected to the target nucleic acid derived from one cell, and its cellular origin can be identified through the common barcode domain.
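  • The identification of cellular origin through the common barcode domain can be sketched as a grouping step. The example assumes, purely for illustration, that the common barcode domain occupies the first 8 bases of each read; the read layout, the length, and the sequences are hypothetical.

```python
from collections import defaultdict

BARCODE_LENGTH = 8  # assumed length of the common barcode domain


def group_by_cell(reads, barcode_length=BARCODE_LENGTH):
    """Group read inserts by the barcode found at the start of each read."""
    groups = defaultdict(list)
    for read in reads:
        barcode, insert = read[:barcode_length], read[barcode_length:]
        groups[barcode].append(insert)
    return dict(groups)


# Hypothetical reads: two carry the same barcode and so share a cell of origin.
reads = [
    "AAAACCCC" + "GATTACA",
    "AAAACCCC" + "TTGACC",
    "GGGGTTTT" + "CATCAT",
]
by_cell = group_by_cell(reads)
```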
  • variable domain generally refers to a nucleotide sequence outside the common barcode domain that is set according to different needs, for example, a linker sequence, an amplification primer recognition sequence, a sequencing primer recognition sequence, etc.
  • transposase-nucleic acid complex generally refers to a complex formed by a transposase and a sequence containing the oligonucleotide adaptor.
  • Transposase usually refers to an enzyme that can bind to the end of a transposon and catalyze its movement to other parts of the genome through a cut-and-paste mechanism or a replicative transposition mechanism.
  • a transposon usually refers to a segment of nucleotides that can move freely within the genome. Transposons were proposed by Barbara McClintock in the late 1940s when she was studying the genetic mechanisms of maize. Other research groups later described the molecular basis of transposition.
  • chromosome fragments can change position, jumping from one chromosome to another.
  • the relocation of these transposons can change the expression of other genes.
  • for example, transposition in maize can cause color changes, and in other organisms such as bacteria it can cause antibiotic resistance.
  • the transposase-nucleic acid complex can include a dimer formed by two transposases, each combined with an oligonucleotide adaptor; the two transposases can be the same or different, and the oligonucleotide adaptors that they respectively bind can be the same or different.
  • Tn5 generally refers to the Tn5 transposase, which is a member of the ribonuclease (RNase) superfamily.
  • Tn5 can be found in Shewanella and Escherichia coli.
  • Tn5 can include the naturally occurring Tn5 transposase and various active mutant forms;
  • Tn5 like most other transposases, contains the DDE motif, which is the active site that catalyzes the transfer of the transposon.
  • DDE motifs can coordinate with divalent metal ions (such as magnesium and manganese) and play an important role in catalyzing reactions.
  • for example, the transposase Tn5 may have increased transposition activity through mutations in the DDE region, thereby catalyzing the movement of the transposon; for example, the glutamic acid at position 326 is converted to aspartic acid, or the two aspartic acids at positions 97 and 188 are converted to glutamic acid (amino acid numbering based on the amino acid sequence of GenBank Accession No. YP_001446289), and so on.
  • microfluidic device generally refers to a device or system capable of implementing microfluidic control.
  • microfluidics usually refers to a technology for precise control and manipulation of micro-scale fluids, especially those with sub-micron structures.
  • "Micro" usually refers to tiny volumes (such as nanoliters, picoliters, etc.).
  • Microfluidic technology has been widely used in many fields, such as the field of biomedicine, for example, enzyme analysis in molecular biology methods (such as glucose and lactate analysis), DNA analysis (such as polymerase chain reaction and high-throughput sequencing), Proteomics analysis, etc.
  • the main structure of the microfluidic device may include simple reservoirs connected to it, fluid conduits that deliver fluid from external sources, manifolds, fluid flow units (for example, actuators, pumps, compressors), etc., as well as fluid conduits that deliver the partitioned microfluids to subsequent processing operations, instruments or components.
  • hybridization generally refers to the ability of a nucleic acid (such as RNA or DNA), under in vitro and/or in vivo conditions of suitable temperature and solution ionic strength, to bind specifically and non-covalently (i.e., to form Watson-Crick base pairs and/or G/U base pairs) to another nucleic acid sequence through the nucleotide sequence it contains.
  • Watson-Crick base pairing includes: adenine/adenosine (A) paired with thymidine/thymine (T), A paired with uracil/uridine (U), and guanine/guanosine (G) paired with cytosine/cytidine (C).
  • in some cases, for hybridization between two RNA molecules (for example, dsRNA), G can also base pair with U.
  • Hybridization requires that the two nucleic acids contain complementary sequences, but possible mismatches between bases cannot be ruled out.
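  • The pairing rules above, including the tolerance for mismatches, can be expressed as a small check. This is a minimal sketch that compares two equal-length strands in antiparallel orientation without gaps or bulges; the function names are hypothetical and no thermodynamics is modeled.

```python
WATSON_CRICK = {("A", "T"), ("T", "A"), ("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def can_pair(base_a, base_b, allow_wobble=False):
    """Return True for a Watson-Crick pair, or a G/U pair when allowed."""
    pair = (base_a.upper(), base_b.upper())
    return pair in WATSON_CRICK or (allow_wobble and pair in WOBBLE)


def hybridized_fraction(strand, other, allow_wobble=False):
    """Fraction of positions that pair when `other` is read antiparallel to `strand`."""
    if len(strand) != len(other):
        raise ValueError("strands must have equal length")
    partner = other[::-1]  # both inputs are written 5'->3'
    paired = sum(can_pair(a, b, allow_wobble) for a, b in zip(strand, partner))
    return paired / len(strand)


perfect = hybridized_fraction("ACGT", "ACGT")  # ACGT is its own reverse complement
strict = hybridized_fraction("GGCC", "GGUU")
relaxed = hybridized_fraction("GGCC", "GGUU", allow_wobble=True)
```

  • The returned fraction corresponds to the degree of complementarity; a threshold below 1.0 would accept duplexes that contain mismatches.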
  • the conditions suitable for hybridization between two nucleic acids depend on the length and degree of complementarity of the nucleic acids, which are well known in the art. The greater the degree of complementarity between two nucleotide sequences, the greater the value of melting temperature (Tm) of hybrids of nucleic acids having these complementary sequences.
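  • As a rough illustration of how duplex length and composition affect the melting temperature, the sketch below uses the Wallace rule, a common rule of thumb for short, perfectly matched DNA oligonucleotides. It is not taken from this application, it does not model mismatches or salt concentration, and it is only a first approximation.

```python
def wallace_tm(sequence):
    """Approximate Tm in degrees Celsius as 2*(A+T) + 4*(G+C)."""
    seq = sequence.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("sequence may contain only A, C, G and T")
    return 2 * at + 4 * gc


tm_at_rich = wallace_tm("ATATATAT")
tm_gc_rich = wallace_tm("GCGCGCGC")
```

  • For equal lengths, the GC-rich sequence gets the higher estimate, and lengthening either sequence raises its estimate.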
  • "reads" usually refers to the sequences obtained by a reaction in nucleotide sequencing. A read can be a short sequencing fragment, which is the base sequence data obtained from a single sequencing run by a sequencer. The length of reads can differ between sequencing instruments.
  • the present application provides a method for analyzing target nucleic acid from a cell, the method comprising:
  • a target nucleic acid derived from a single cell, wherein an oligonucleotide adaptor sequence is added to at least part of the target nucleic acid to make it an attached target nucleic acid;
  • a solid support to which at least one oligonucleotide tag is attached, wherein each of the oligonucleotide tags includes a first strand and a second strand, the first strand includes a barcode sequence and a hybridization sequence located at the 3' end of the barcode sequence, the second strand includes a first portion complementary to the hybridization sequence of the first strand and a second portion complementary to the oligonucleotide adaptor sequence attached to the target nucleic acid, and the first strand and the second strand form a partially double-stranded structure;
  • or, alternatively, the second strand and the attached target nucleic acid form a partially double-stranded structure;
  • the oligonucleotide tag is connected to the attached target nucleic acid, thereby generating a barcoded target nucleic acid.
  • it further includes:
  • this application also provides a method for amplifying a target nucleic acid from a cell, the method comprising:
  • a) providing discrete partitions comprising: i. a target nucleic acid derived from a single cell, wherein an oligonucleotide adaptor sequence is added to at least part of the target nucleic acid to make it an attached target nucleic acid; and ii. a solid support to which at least one oligonucleotide tag is attached, wherein each of the oligonucleotide tags includes a first strand and a second strand, the first strand includes a barcode sequence and a hybridization sequence located at the 3' end of the barcode sequence, the second strand includes a first portion complementary to the hybridization sequence of the first strand and a second portion complementary to the oligonucleotide adaptor sequence attached to the target nucleic acid, and the first strand and the second strand form a partially double-stranded structure; or, in step ii., each of the oligonucleotide tags includes a first strand and a second strand, the first strand includes a barcode sequence and a hybridization sequence located at the 3' end of the barcode sequence, the second strand includes a first portion complementary to the hybridization sequence of the first strand and a second portion complementary to the oligonucleotide adaptor sequence attached to the target nucleic acid, and the second strand and the attached target nucleic acid form a partially double-stranded structure;
  • this application also provides a method for sequencing a target nucleic acid from a cell, the method comprising:
  • a) providing discrete partitions comprising: i. a target nucleic acid derived from a single cell, wherein an oligonucleotide adaptor sequence is added to at least part of the target nucleic acid to make it an attached target nucleic acid; and ii. a solid support to which at least one oligonucleotide tag is attached, wherein each of the oligonucleotide tags includes a first strand and a second strand, the first strand includes a barcode sequence and a hybridization sequence located at the 3' end of the barcode sequence, the second strand includes a first portion complementary to the hybridization sequence of the first strand and a second portion complementary to the oligonucleotide adaptor sequence attached to the target nucleic acid, and the first strand and the second strand form a partially double-stranded structure; or, in step ii., each of the oligonucleotide tags includes a first strand and a second strand, the first strand includes a barcode sequence and a hybridization sequence located at the 3' end of the barcode sequence, the second strand includes a first portion complementary to the hybridization sequence of the first strand and a second portion complementary to the oligonucleotide adaptor sequence attached to the target nucleic acid, and the second strand and the attached target nucleic acid form a partially double-stranded structure;
  • the oligonucleotide tag in the present application may include a first strand and a second strand, and the first strand and the second strand may be provided at the same time or separately.
  • when the first strand and the second strand are provided at the same time, the first strand and the second strand may form a partially double-stranded structure; when the first strand and the second strand are provided separately, the second strand may form a partially double-stranded structure with the attached target nucleic acid.
  • the barcoded target nucleic acid is generated by linking the oligonucleotide tag with the attached target nucleic acid.
  • the hybridization sequence of the first strand of the oligonucleotide tag is connected to the oligonucleotide adaptor attached to the target nucleic acid, thereby generating the barcoded target nucleic acid.
  • the second portion of the second strand of the oligonucleotide tag is hybridized with the oligonucleotide adaptor attached to the target nucleic acid, and the hybridization sequence of the first strand of the oligonucleotide tag is connected to the oligonucleotide adaptor attached to the target nucleic acid, thereby generating the barcoded target nucleic acid.
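  • The role of the second strand as a bridge between the first strand and the attached target nucleic acid can be sketched in silico. The example assumes perfect complementarity (which, as noted elsewhere in this description, is not strictly required), uses hypothetical sequences, and models ligation as string concatenation. All function and variable names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def splint_ligate(first_strand, hyb_len, second_strand, attached_target, adaptor_len):
    """Return the barcoded target if the second strand bridges both molecules.

    The first strand ends with the hybridization sequence; the attached target
    begins with the adaptor. A perfectly matched second strand is the reverse
    complement of (hybridization sequence + adaptor).
    """
    hyb = first_strand[-hyb_len:]
    adaptor = attached_target[:adaptor_len]
    if second_strand != reverse_complement(hyb + adaptor):
        return None  # no bridge, so the two ends are not brought together
    return first_strand + attached_target


# Hypothetical sequences, all written 5'->3'.
barcode, hyb, adaptor, insert = "ACGTACGT", "GGATCC", "TTGACCA", "GATTACA"
first = barcode + hyb
target = adaptor + insert
second = reverse_complement(hyb + adaptor)
barcoded = splint_ligate(first, len(hyb), second, target, len(adaptor))
```

  • In the resulting molecule the target sequence lies on the 3' side of the barcode sequence.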
  • the conditions suitable for hybridization between two nucleic acids depend on the length and degree of complementarity of the nucleic acids, which are well known in the art. The greater the degree of complementarity between two nucleotide sequences, the greater the value of melting temperature (Tm) of hybrids of nucleic acids having these complementary sequences.
  • the length of the second part of the second strand of the oligonucleotide tag is sufficient to form a double-stranded structure with its complementary sequence (the oligonucleotide adaptor sequence attached to the target nucleic acid or a partial sequence thereof).
  • the length of the second part of the second strand may be 1 nucleotide or more, 2 nucleotides or more, 3 nucleotides or more, 5 nucleotides or more, 8 nucleotides or more, 10 nucleotides or more, 12 nucleotides or more, 15 nucleotides or more, 20 nucleotides or more, 22 nucleotides or more, 25 nucleotides or more, or 30 nucleotides or more.
  • the hybridization does not exclude possible mismatches between bases.
  • the sequence of the first part of the second strand or the second part of the second strand need not be 100% complementary to the sequence of the hybridizing sequence.
  • it can be 60% or more, 65% or more, 70% or more, 75% or more, 80% or more, 85% or more, 90% or more, 95% or more, 98% or more, 99% or more, or 99.5% or more complementary.
  • the remaining non-complementary nucleotides can be clustered or interspersed with complementary nucleotides and need not be adjacent to each other or complementary nucleotides.
  • polynucleotides can hybridize on one or more segments so that no intermediate or adjacent segments are involved in the hybridization event (e.g., forming a hairpin structure, "bumps", etc.).
  • a ligation reaction is used to ligate the oligonucleotide tag with the attached target nucleic acid.
  • the linking may include joining two nucleic acid segments together by catalyzing the formation of a phosphodiester bond, such as the hybridization sequence of the first strand of the oligonucleotide tag and the oligonucleotide adaptor attached to the target nucleic acid.
  • the ligation reaction can include a DNA ligase, such as E. coli DNA ligase, T4 DNA ligase, T7 DNA ligase, mammalian ligase (for example, DNA ligase I, DNA ligase III, DNA ligase IV), thermostable ligase, etc.
  • T4 DNA ligase can join segments containing DNA, oligonucleotides, RNA and RNA-DNA hybrids.
  • the ligation reaction may not include DNA ligase, but instead use alternatives such as topoisomerase.
  • using a high concentration of DNA ligase and including PEG can achieve rapid ligation.
  • when choosing the temperature for the ligation reaction, the optimum temperature of the DNA ligase (for example, 37°C) and the melting temperature of the DNA to be ligated can be considered.
  • the target nucleic acid and the barcoded solid support can be suspended in a suitable buffer to minimize the effects of ions that may affect the connection.
  • in some embodiments, the method includes releasing at least a portion of the target nucleic acid from the single cell in the discrete partition to the outside of the cell, and linking the released target nucleic acid to the oligonucleotide tag in b), so as to produce barcoded target nucleic acid.
  • the releasing at least a portion of the target nucleic acid from the single cell in the discrete partition to the outside of the cell may include contacting the cell with a lysis reagent to release the contents of the cell in the discrete partition.
  • the lytic agent may include a biologically active agent, for example, a lytic enzyme used to lyse different cell types (such as gram-positive or gram-negative bacteria, plants, yeast, mammals, etc.), such as lysozyme, achromopeptidase, lysostaphin, thioglucosidase, kitalase, lyticase, and other commercially available lytic enzymes.
  • a surfactant-based lysis solution may also be used to lyse the cells.
  • the lysis solution may include nonionic surfactants such as Triton X-100 and Tween 20.
  • the lysis solution may include ionic surfactants such as sodium lauryl sarcosinate and sodium dodecyl sulfate (SDS).
  • other methods, such as electroporation, thermal, acoustic, or mechanical cell disruption, can also be used for lysis.
  • the releasing at least a portion of the target nucleic acid from the single cell in the discrete partition to the outside of the cell may include at least 5%, at least 10%, at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45% of the target nucleic acid is released outside the cell from the single cell in the discrete partition.
  • it includes allowing at least a portion of the oligonucleotide tag released from the solid support to enter the single cell, and to connect with the target nucleic acid in b), thereby generating a barcoded target nucleic acid.
  • the entry of at least a portion of the oligonucleotide tag released from the solid support into the single cell may include at least 25%, at least 30%, at least 35%, at least 40%, at least 50%, at least 55%, at least 60%, at least 70%, or at least 75% of the oligonucleotide tag entering the single cell.
  • the oligonucleotide tag is releasably attached to the solid support.
  • an oligonucleotide tag that is releasably, cleavably or reversibly attached to the solid support includes an oligonucleotide tag that is released or releasable by cleavage/disruption of the linkage between the oligonucleotide tag molecule and the solid support, an oligonucleotide tag that is released by degradation of the solid support itself so that it can be accessed or is accessible by other reagents, or both.
  • the acrydite moiety connected to the solid support precursor, another substance connected to the solid support precursor, or the precursor itself contains an unstable bond, for example, a chemically, thermally, or light-sensitive bond, such as a disulfide bond, a UV-sensitive bond, etc.
  • the unstable bond can be used to reversibly link (covalently link) a substance (such as an oligonucleotide tag) to a solid support.
  • a thermally labile bond may include attachment based on nucleic acid hybridization (e.g., when an oligonucleotide hybridizes to a complementary sequence attached to a solid support), such that thermal melting of the hybrid releases the oligonucleotide, for example, a sequence containing the oligonucleotide tag, from the solid support (or bead).
  • adding multiple types of unstable bonds to a gel solid support can lead to the production of a solid support that can respond to different stimuli.
  • each type of unstable bond can be sensitive to a related stimulus (e.g., chemical stimulation, light, temperature, etc.), so that the release of substances attached to the solid support through each type of unstable bond can be controlled by applying the appropriate stimulus.
  • agents having activatable groups can be provided that are releasably attached to a solid support or otherwise arranged in discrete partitions, such that once delivered to a desired set of reagents (for example, by co-partitioning), the activatable group can react with the desired reagent.
  • activatable groups include caged groups, removable blocking or protecting groups, for example, photolabile groups, thermally labile groups, or chemically removable groups.
  • ester linkage (e.g., cleavable by acid, base, or hydroxylamine)
  • vicinal diol linkage (e.g., cleavable by sodium periodate)
  • Diels-Alder linkage (e.g., thermally cleavable)
  • sulfone linkage (e.g., cleavable by base)
  • silyl ether linkage (e.g., cleavable by acid)
  • glycosidic linkage (e.g., cleavable by amylase)
  • peptide linkage (e.g., cleavable by protease)
  • phosphodiester linkage (e.g., cleavable by a nuclease such as DNase)
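  • When a solid support carries several types of unstable bonds, the linkage types listed above can be kept in a lookup table to see which ones a given stimulus would cleave. This is a minimal sketch; the table only restates the examples given in this description, and the names are hypothetical.

```python
# Linkage type -> stimuli named in the description as able to cleave it.
CLEAVAGE_STIMULI = {
    "ester": ("acid", "base", "hydroxylamine"),
    "vicinal diol": ("sodium periodate",),
    "Diels-Alder": ("heat",),
    "sulfone": ("base",),
    "silyl ether": ("acid",),
    "glycosidic": ("amylase",),
    "peptide": ("protease",),
    "phosphodiester": ("nuclease",),
}


def linkages_cleaved_by(stimulus):
    """Return the linkage types, in table order, that the stimulus can cleave."""
    wanted = stimulus.lower()
    return [name for name, stimuli in CLEAVAGE_STIMULI.items() if wanted in stimuli]


acid_sensitive = linkages_cleaved_by("acid")
```

  • Choosing linkages that respond to different stimuli allows substances attached through each linkage to be released independently.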
  • the oligonucleotide tag is directly or indirectly attached to the solid support through the 5' end of its first strand. For example, the method includes releasing the at least one oligonucleotide tag from the solid support, and linking the released oligonucleotide tag with the attached target nucleic acid in b), thereby producing barcoded target nucleic acid.
  • the target nucleic acid sequence is located at the 3'end of the barcode sequence.
  • the target nucleic acid can be directly connected to the 3' end of the barcode sequence; alternatively, the target nucleic acid is not directly connected to the 3' end of the barcode sequence and can instead be indirectly connected to the barcode sequence.
  • the target nucleic acid can be directly connected to the barcode sequence.
  • the barcoded target nucleic acid is amplified.
  • the barcoded target nucleic acid is released from the discrete partition, and the amplification is performed after the barcoded target nucleic acid is released from the discrete partition.
  • before amplification, further chemical or enzymatic modification may be performed; for example, the modification may include bisulfite conversion, 5hmC conversion, etc.
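  • The effect of bisulfite conversion on the sequence that is eventually read can be sketched as follows. Bisulfite treatment deaminates unmethylated cytosine to uracil, which is read as thymine after amplification, while methylated cytosine is protected. The function name and the 0-based position convention are assumptions for illustration, and the sketch ignores incomplete conversion.

```python
def bisulfite_convert(sequence, methylated_positions=()):
    """Return the sequence as read after conversion and amplification.

    Unmethylated C becomes T; a C at any 0-based index listed in
    `methylated_positions` is left unchanged.
    """
    protected = set(methylated_positions)
    return "".join(
        "T" if base == "C" and index not in protected else base
        for index, base in enumerate(sequence.upper())
    )


converted = bisulfite_convert("ACGTCCG", methylated_positions=[1])
```

  • Comparing the converted read with the unconverted reference identifies methylated positions as the cytosines that remain.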
  • amplification primers are used in the amplification.
  • the amplification may also include further modification of the barcoded target nucleic acid so that it also has a fixed sequence on the other side that can be used for PCR amplification.
  • for example, the modification may use reverse transcription, strand switching, second-strand synthesis, a terminal transferase reaction, or ligation of a second adaptor.
  • the amplification primers may also include universal primers.
  • an amplification primer is used in the amplification, and the amplification primer may include a random guide sequence.
  • the random leader sequence includes random primers that can exhibit four-fold degeneracy at each position.
  • random primers include any nucleic acid primers having various random sequence lengths known in the art.
  • random primers can include random sequences of 3, 4, 5, 6, 7, 8, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 or more nucleotides in length.
  • the plurality of random primers may include random primers having different lengths.
  • the plurality of random primers may include random primers having an equal length.
  • the plurality of random primers may include random sequences of about 5 to about 18 nucleotides in length.
  • the plurality of random primers includes random hexamers.
  • random hexamers are commercially available and widely used in amplification reactions such as multiple displacement amplification (MDA), for example, in the REPLI-g Whole Genome Amplification Kit (QIAGEN, Valencia, CA).
  • Random primers of any suitable length can be used in the methods and compositions described in this application.
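  • Four-fold degeneracy at each position means that a random primer of length n is drawn from 4 to the power n possible sequences. The sketch below enumerates or samples such primers; the function names are hypothetical, and uniform base frequencies are an assumption.

```python
import random
from itertools import product

BASES = "ACGT"


def all_random_primers(length):
    """Enumerate every sequence of the given length (4**length in total)."""
    return ["".join(bases) for bases in product(BASES, repeat=length)]


def sample_random_primer(length, rng):
    """Draw one primer with each position chosen uniformly from A, C, G and T."""
    return "".join(rng.choice(BASES) for _ in range(length))


hexamers = all_random_primers(6)
one_primer = sample_random_primer(6, random.Random(0))
```

  • Enumeration is practical only for short primers, since the pool size grows by a factor of four with every added position.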
  • the amplifying includes at least partially hybridizing the random leader sequence with the barcoded target nucleic acid and extending the random leader sequence in a template-directed manner.
  • the oligonucleotide tag includes a first strand and a second strand, the first strand includes a barcode sequence and a hybridization sequence located at the 3' end of the barcode sequence, the second strand includes a first part complementary to the hybridization sequence of the first strand and a second part complementary to the oligonucleotide adaptor sequence attached to the target nucleic acid, and the first strand and the second strand form a partially double-stranded structure.
  • the oligonucleotide tag includes a first strand and a second strand, the first strand includes a barcode sequence and a hybridization sequence located at the 3' end of the barcode sequence, the second strand includes a first part complementary to the hybridization sequence of the first strand and a second part complementary to the oligonucleotide adaptor sequence attached to the target nucleic acid, and the second strand and the attached target nucleic acid form a partially double-stranded structure.
  • the conditions suitable for hybridization between two nucleic acids depend on the length and degree of complementarity of the nucleic acids, which are well known in the art. The greater the degree of complementarity between two nucleotide sequences, the greater the value of melting temperature (Tm) of hybrids of nucleic acids having these complementary sequences.
  • the length of the first part of the second strand or the second part of the second strand is sufficient to form a double-stranded structure with its complementary sequence (for example, the hybridization sequence of the first strand located at the 3' end of the barcode sequence, or the oligonucleotide adaptor sequence attached to the target nucleic acid or a partial sequence thereof).
  • the length of the first part of the second strand or the second part of the second strand may be 1 nucleotide or more, 2 nucleotides or more, 3 nucleotides or more, 5 nucleotides or more, 8 nucleotides or more, 10 nucleotides or more, 12 nucleotides or more, 15 nucleotides or more, 20 nucleotides or more, 22 nucleotides or more, 25 nucleotides or more, or 30 nucleotides or more.
  • the length of the sequence of the first part of the second strand and the second part of the second strand may be the same or different.
  • the double-stranded structure does not exclude possible mismatches between bases.
  • the sequence of the first part of the second strand or the second part of the second strand need not be 100% complementary to the sequence of the hybridizing sequence.
  • it can be 60% or more, 65% or more, 70% or more, 75% or more, 80% or more, 85% or more, 90% or more, 95% or more, 98% or more, 99% or more, or 99.5% or more complementary.
  • the remaining non-complementary nucleotides can be clustered or interspersed with complementary nucleotides and need not be adjacent to each other or complementary nucleotides.
  • polynucleotides can hybridize on one or more segments so that no intermediate or adjacent segments are involved in the hybridization event (e.g., forming a hairpin structure, "bumps", etc.).
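The degree of complementarity discussed above can be illustrated with a minimal sketch. It assumes an ungapped, antiparallel alignment of two strands of equal length, both written 5' to 3'; the function names and the example sequences are hypothetical and are not taken from the application.

```python
# Hypothetical helper: percent complementarity of two equal-length DNA strands
# aligned antiparallel without gaps (both strands written 5'->3').
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def percent_complementarity(strand_a: str, strand_b: str) -> float:
    """Percentage of positions at which strand_b base-pairs with strand_a."""
    if len(strand_a) != len(strand_b):
        raise ValueError("this sketch only handles strands of equal length")
    perfect_partner = reverse_complement(strand_a)
    matches = sum(x == y for x, y in zip(perfect_partner, strand_b.upper()))
    return 100.0 * matches / len(strand_a)


# Illustrative sequences only (not from the application).
hybridization_seq = "ACGTACGGTA"
perfect = "TACCGTACGT"        # exact reverse complement
one_mismatch = "TACCGTACGA"   # last base mismatched
full_score = percent_complementarity(hybridization_seq, perfect)
partial_score = percent_complementarity(hybridization_seq, one_mismatch)
```

A 10-nucleotide part with one mismatch scores 90% under this measure, which falls within the "90% or more" range listed above. Bulges and hairpins, which the text also allows, would require a gapped alignment and are not modeled here.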
  • the second part of the oligonucleotide tag attached to the same solid support may be the same.
  • the second part of the oligonucleotide tag attached to the same solid support may be different.
  • the second part of each oligonucleotide tag attached to the same solid support may include one or more nucleotide sequences, for example, the sequence of the second part may be 2.
  • the number of the oligonucleotide tags attached to the same solid support and containing the same second part may be 1 or more, for example, 50 or more, 100 or more, 500 or more, 1,000 or more, 1,500 or more, 2,000 or more, 3,000 or more, 5,000 or more, 8,000 or more, 10,000 or more, 12,000 or more, 15,000 or more, 18,000 or more, 20,000 or more, 22,000 or more, 25,000 or more, 28,000 or more, 30,000 or more, 35,000 or more, 40,000 or more, 45,000 or more, or 50,000 or more.
  • the number of the oligonucleotide tags containing different second parts attached to the same solid support can be set to different ratios as needed, so as to be connected to the corresponding attached target nucleic acid.
  • the barcode sequence includes a cell barcode sequence, and each oligonucleotide tag attached to the same solid support contains the same cell barcode sequence.
  • the oligonucleotide tags attached to the same solid support may include 1 or more oligonucleotide tags, for example, 50 or more, 100 or more, 500 or more, 1,000 or more, 1,500 or more, 2,000 or more, 3,000 or more, 5,000 or more, 8,000 or more, 10,000 or more, 12,000 or more, 15,000 or more, 18,000 or more, 20,000 or more, 22,000 or more, 25,000 or more, 28,000 or more, 30,000 or more, 35,000 or more, 40,000 or more, 45,000 or more, 50,000 or more, 55,000 or more, 60,000 or more, 65,000 or more, 70,000 or more, 75,000 or more, 80,000 or more, 85,000 or more, 90,000 or more, 95,000 or more, 100,000 or more, 110,000 or more, or 120,000 or more, and the cell barcode sequences of these oligonucleotide tags are the same.
  • there may be one or more sequences of the second part of the second strand, for example, 2 or more sequences of the second part.
  • the cell barcode sequences contained in the oligonucleotide tag sets attached to different solid supports are different from each other, and the oligonucleotide tag sets may be all the barcodes attached to the same solid support.
  • the cell barcode sequence includes at least 2 cell barcode segments.
  • the cell barcode segment is 4 or more nucleotides (nt), for example, 5 or more, for example, 10 or more, 12 or more, 15 or more, 18 or more, 20 or more, 21 or more, 22 or more, 23 or more, 24 or more, 25 or more, 26 or more, 27 or more, 28 or more, 29 or more, 30 or more, 31 or more, 32 or more, 33 or more, 34 or more, or 35 or more.
  • the cell barcode sequence includes at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, or at least 8 cell barcode segments, and the cell barcode segments are numbered, in order from the 5' end to the 3' end of the oligonucleotide tag, as cell barcode segment 1, cell barcode segment 2, cell barcode segment 3, cell barcode segment 4, cell barcode segment 5 ... cell barcode segment n.
  • the at least two cell barcode segments can form the cell barcode sequence by PCR or DNA ligase.
  • the cell barcode sequence can be generated by the following method:
  • dividing at least one solid support into at least 2 primary aliquots, for example, at least 8 aliquots, at least 16 aliquots, at least 24 aliquots, at least 32 aliquots, at least 40 aliquots, at least 48 aliquots, at least 56 aliquots, at least 64 aliquots, at least 72 aliquots, at least 80 aliquots, at least 88 aliquots, or at least 96 aliquots;
  • providing each of the primary aliquots with at least 1 cell barcode segment 1, for example, at least 1,000, at least 10,000, at least 100,000, at least 1,000,000, or at least 10,000,000 cell barcode segments 1, wherein the cell barcode segment 1 in each aliquot differs in sequence and/or length from the cell barcode segment 1 in any other aliquot;
  • providing each of the secondary aliquots with at least 1 cell barcode segment 2 or its complementary sequence, for example, at least 1,000, at least 10,000, at least 100,000, at least 1,000,000, or at least 10,000,000 cell barcode segments 2 or their complementary sequences, wherein the cell barcode segment 2 or its complementary sequence in each aliquot differs in sequence and/or length from the cell barcode segment 2 or its complementary sequence in any other aliquot;
  • steps 4)-6) can be repeated n times, where n can be 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more, to connect cell barcode segment 3, cell barcode segment 4, cell barcode segment 5 ... cell barcode segment n and thereby generate a cell barcode with a unique sequence for each cell, so that the target nucleic acid in the first cell can have a first cell barcode with a unique sequence, the target nucleic acid in the second cell can have a second cell barcode with a unique sequence, and so on.
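The split-and-pool procedure described above can be sketched as a small simulation. This is an illustrative model only: it assumes each aliquot offers exactly one distinct segment per round and that every solid support is assigned to an aliquot uniformly at random; the function names and the 4-nt segments are hypothetical.

```python
import random


def barcode_space(segments_per_round):
    """Number of distinct cell barcodes obtainable from the given rounds."""
    size = 1
    for segments in segments_per_round:
        size *= len(segments)
    return size


def split_pool_barcodes(n_supports, segments_per_round, seed=0):
    """Simulate split-and-pool assembly of cell barcodes on solid supports.

    segments_per_round: one list per round; each inner list holds the
    distinct barcode segments, one per aliquot, offered in that round.
    """
    rng = random.Random(seed)
    supports = [[] for _ in range(n_supports)]
    for segments in segments_per_round:
        for support in supports:
            # Split: the support lands in one aliquot and receives its segment.
            support.append(rng.choice(segments))
        # Pool: all supports are mixed again before the next round (implicit).
    return ["-".join(parts) for parts in supports]


# Hypothetical segments: three rounds with two aliquots each.
rounds = [["AAAA", "CCCC"], ["GGGG", "TTTT"], ["ACGT", "TGCA"]]
barcodes = split_pool_barcodes(100, rounds)
n_possible = barcode_space(rounds)
```

With 96 aliquots per round, three rounds give 96 × 96 × 96 = 884,736 possible cell barcodes. Because the assignment is random, two supports can still receive the same barcode by chance, so the barcode space is typically chosen to be much larger than the number of supports.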
  • the barcoded target nucleic acid is released from the discrete partition.
  • c) is further performed: sequencing the barcoded target nucleic acid to obtain the characterization result.
  • the characterization result may include the nucleotide sequence information of the barcoded target nucleic acid, for example, including the cell barcode nucleotide sequence information, the nucleotide sequence information of the target nucleic acid, and UMI sequence information.
  • a continuous nucleic acid sequence of at least a part of the genome of the single cell is assembled from the sequence of the barcoded target nucleic acid.
  • the single cell is characterized based on the nucleic acid sequence of at least a portion of the genome of the single cell.
  • the oligonucleotide tag further includes a linker sequence 1, and the 5'end of the cell barcode segment 1 can be connected to a solid support through the linker sequence 1.
  • the linker sequence 1 may include acrydite modification, photocleavage modification, S-S modification, dU base modification and other sequences, which can be disconnected by various methods to release the oligonucleotide tag.
  • the oligonucleotide tag also includes other functional sequences, which may be located between the cell barcode segment 1 and the linker sequence 1, for example, a complete or partial functional sequence for subsequent processing, such as a primer sequence (for example, a universal primer sequence, a targeting primer sequence, or a random primer sequence), a primer annealing sequence, an attachment sequence, a sequencing primer recognition region, or an amplification primer recognition region (for example, a universal amplification primer recognition region).
  • the subsequent processing includes amplification.
  • the amplification may include PCR amplification (for example, Taq DNA polymerase amplification, Super Taq DNA polymerase amplification, LA Taq DNA polymerase amplification, Pfu DNA polymerase amplification, Phusion DNA polymerase amplification, KOD DNA polymerase amplification, etc.), isothermal amplification (for example, loop-mediated isothermal amplification (LAMP), helicase-dependent amplification (HDA), recombinase polymerase amplification (RPA), strand displacement amplification (SDA), nucleic acid sequence-based amplification (NASBA), transcription-mediated amplification (TMA), etc.), T7 promoter linear amplification, degenerate oligonucleotide primer PCR amplification (DOP-PCR), multiple displacement amplification (MDA), multiple annealing and looping-based amplification cycles (MALBAC), etc.
  • the cell barcode may not contain a linker, and the cell barcode may be a separate nucleic acid sequence synthesized by other methods.
  • the universal primer sequence may include P5 or other suitable primers.
  • Universal primers (for example, P5) are also compatible with the sequencing device, for example, can be attached to the flow cell in the sequencing device.
  • such universal primer sequences can provide complementary sequences of oligonucleotides constrained on the surface of the flow cell in the sequencing device, so that the barcoded target nucleic acid sequence can be immobilized on the surface for sequencing.
  • an amplification primer sequence is a primer sequence used for an amplification or replication process (for example, extending the primer along the target nucleic acid sequence), so as to generate an amplified barcoded target nucleic acid sequence.
  • the resulting amplified target sequence will contain such primers and be easily transferred to the sequencing system.
  • the sequencing primer sequence may include the R1 primer sequence and the R2 primer sequence.
  • the oligonucleotide tag may comprise a T7 promoter sequence.
  • the T7 promoter sequence includes the nucleotide sequence shown in SEQ ID NO:1 (TAATACGACTCACTATAG).
  • the oligonucleotide tag may contain a region having at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identity to any one of SEQ ID NO: 6-9.
  • the nucleotide adaptor sequence may include a P5 sequence.
  • the nucleotide adaptor sequence includes the P7 sequence.
  • the cell barcode segment 1 and the linker sequence 1 may include any sequence or a combination of the above-mentioned multiple functional sequences.
  • these oligonucleotides may include any one or more of the following: P5, R1 and R2 sequences, non-cleavable 5' acrydite-P5, cleavable 5' acrydite-SS-P5, R1c, sequencing primers, read primers, universal primers, P5_U, universal read primers and/or binding sites of any of these primers.
  • the cell barcode sequence includes at least two cell barcode segments separated by a linker sequence.
  • the 3' end of cell barcode segment 1 has a linker sequence 2
  • the 5' end and the 3' end of cell barcode segment 2 have linker sequences 3 and 4, respectively
  • the 5' end and the 3' end of cell barcode segment 3 have linker sequences 5 and 6, respectively
  • the 5' end and the 3' end of cell barcode segment 4 have linker sequences 7 and 8, respectively, and so on, such that the 5' end and the 3' end of cell barcode segment n have linker sequences 2n-1 and 2n, respectively
  • linker sequence 2 and linker sequence 3 can be at least partially complementary paired to form a double-stranded structure, linker sequence 4 and linker sequence 5 can be at least partially complementary paired to form a double-stranded structure, linker sequence 6 and linker sequence 7 can be at least partially complementary paired to form a double-stranded structure, and so on, to enable the connection of cell barcode segment 1, cell barcode segment 2, cell barcode segment 3, cell barcode segment 4 ... cell barcode segment n.
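The linker numbering above follows a simple rule: cell barcode segment k carries linker sequence 2k-1 at its 5' end and linker sequence 2k at its 3' end, and adjacent segments are joined where linker 2k pairs with linker 2k+1. A minimal sketch of that bookkeeping follows; the function names are hypothetical.

```python
def linker_indices(k: int) -> tuple:
    """Return the (5' linker, 3' linker) numbers flanking cell barcode segment k."""
    if k < 1:
        raise ValueError("segments are numbered from 1")
    return (2 * k - 1, 2 * k)


def pairing_junctions(n_segments: int) -> list:
    """Return the linker pairs that hybridize to join n consecutive segments.

    The 3' linker of segment k (2k) pairs with the 5' linker of
    segment k+1 (2k+1).
    """
    return [(2 * k, 2 * k + 1) for k in range(1, n_segments)]


junctions_for_four = pairing_junctions(4)
```

For four segments this yields the pairs (2, 3), (4, 5) and (6, 7), matching the pairings listed above. Linker sequence 1, at the 5' end of segment 1, takes no part in these junctions because it connects the tag to the solid support.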
  • a ligation reaction is used to ligate barcode segments of each cell to form an oligonucleotide tag.
  • the linking may include joining two nucleic acid segments together by catalyzing the formation of a phosphodiester bond, such as cell barcode segment 1 and the aforementioned functional sequence, for example, linker sequence 2 and cell barcode segment 2, linker sequence 3 and cell barcode segment 3, linker sequence 4 and cell barcode segment 4, linker sequence 5 and cell barcode segment 5, linker sequence 6 and cell barcode segment 6, and so on.
  • the ligation reaction may include DNA ligase, such as E.
  • T4 DNA ligase can join segments containing DNA, oligonucleotides, RNA and RNA-DNA hybrids.
  • the ligation reaction may not include DNA ligase, but instead use alternatives such as topoisomerase.
  • Using high concentration of DNA ligase and including PEG can achieve rapid ligation.
  • the optimum temperature of the DNA ligase (for example, 37 °C) and the melting temperature of the DNA to be ligated can be considered.
  • the sample and barcoded solid support can be suspended in a buffer to minimize the effects of ions that may affect the connection.
  • the cell barcode segment provided in each round may contain the following structure: the cell barcode segment and the linker sequence located at the 3' end of the cell barcode segment form a double-stranded structure, and the linker sequence located at the 5' end of the cell barcode segment is a protruding single-stranded structure that forms a double-stranded structure by at least partially complementary pairing with the linker sequence at the 5' end of the previous cell barcode segment.
  • an example of using a ligation reaction to ligate barcode segments of each cell to form an oligonucleotide tag can be as shown in FIG. 2 or FIG. 4.
  • each cell barcode segment is connected to form an oligonucleotide tag.
  • the polymerase chain reaction can be performed by any one or more of the following polymerases: Taq DNA polymerase, Super Taq DNA polymerase, LA Taq DNA polymerase, UltraPF DNA polymerase, Tth DNA polymerase, Pfu DNA Polymerase, VentR DNA polymerase, Phusion DNA polymerase, KOD DNA polymerase, Iproof DNA polymerase.
  • the polymerase chain reaction may further include a buffer solution and metal ions that enable the polymerase to maintain activity; for example, the polymerase chain reaction may also include dNTP and its modified derivatives.
  • each round provides the complementary sequence of the cell barcode segment, the complementary sequence is a single-stranded structure, and the 5'end and the 3'end each have A linker sequence of a single-stranded structure, wherein the linker sequence at the 5'end can be at least partially complementary to the linker sequence at the 3'end of the cell barcode segment connected in the previous round to form a double-stranded structure, and the linker sequence at the 3'end can be paired with The linker sequence at the 5'end of the cell barcode segment connected in the latter round is at least partially complementary paired to form a double-stranded structure.
  • the target nucleic acid includes one or more selected from the group consisting of DNA, RNA and cDNA.
  • the target nucleic acid includes cDNA derived from RNA in the single cell.
  • the RNA includes mRNA.
  • the target nucleic acid is added with an oligonucleotide adaptor sequence to become an attached target nucleic acid.
  • the oligonucleotide adaptor sequence is located at the 5'end of the target nucleic acid.
  • the oligonucleotide adaptor sequence may include a nucleotide sequence L that is complementary to the second part of the second strand in the oligonucleotide tag, and the length of the nucleotide sequence L may be the same as or different from the length of the second part of the second strand in the oligonucleotide tag; for example, the length of the nucleotide sequence L may be 1 nucleotide or more, 2 nucleotides or more, 3 nucleotides or more, 5 nucleotides or more, 8 nucleotides or more, 10 nucleotides or more, 12 nucleotides or more, 15 nucleotides or more, 20 nucleotides or more, 22 nucleotides or more, 25 nucleotides or more, or 30 nucleotides or more.
  • the nucleotide sequence L may be complementary to the second part of the second strand in the oligonucleotide tag to form a double-stranded structure.
  • the double-stranded structure does not exclude possible mismatches between bases.
  • the sequence of the nucleotide sequence L need not be 100% complementary to the sequence of the second part of the second strand in the oligonucleotide tag.
  • it can be 60% or more, 65% or more, 70% or more, 75% or more, 80% or more, 85% or more, 90% or more, 95% or more, 98% or more, 99% or more, or 99.5% or more complementary.
  • the remaining non-complementary nucleotides can be clustered or interspersed with complementary nucleotides and need not be adjacent to each other or complementary nucleotides.
  • a polynucleotide can hybridize on one or more segments so that no intermediate or adjacent segments are involved in the hybridization event (e.g., forming a hairpin structure, "bulge", etc.).
  • the nucleotide adaptor sequence includes a transposon end sequence.
  • the transposon end sequence is Tn5 or a modified Tn5 transposon end sequence.
  • the transposon end sequence is the Mu transposon end sequence.
  • the Tn5 or modified Tn5 transposon end sequence or Mu transposon end sequence may comprise 15 to 25 nucleotides, for example, 16 nucleotides, 17 nucleotides, 18 nucleotides, 19 nucleotides, 20 nucleotides, 21 nucleotides, 22 nucleotides, 23 nucleotides, or 24 nucleotides.
  • Tn5 chimeric end sequence A14 (Tn5MEA)
  • Tn5 chimeric end sequence B15 (Tn5MEB)
  • complementary non-transferred sequence (NTS)
  • Tn5MEA 5’-TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG-3’; (SEQ ID NO: 2)
  • Tn5MEB 5’-GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG-3’; (SEQ ID NO: 3)
  • Tn5NTS 5’-CTGTCTCTTATACACATCT-3’.
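The relationship between the three sequences listed above can be checked directly: the non-transferred sequence is the reverse complement of the 19-nucleotide end shared by Tn5MEA and Tn5MEB, so it can anneal to either oligonucleotide to form the double-stranded transposon end. The sketch below uses only the sequences given above; the helper and variable names are hypothetical.

```python
# Sequences as listed above, written 5'->3'.
TN5MEA = "TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG"
TN5MEB = "GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG"
TN5NTS = "CTGTCTCTTATACACATCT"

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


# The stretch of each transferred strand that base-pairs with Tn5NTS.
shared_end = reverse_complement(TN5NTS)

# Length of the single-stranded 5' portion left unpaired on each oligonucleotide.
overhang_a = len(TN5MEA) - len(shared_end)
overhang_b = len(TN5MEB) - len(shared_end)

anneals_to_a = TN5MEA.endswith(shared_end)
anneals_to_b = TN5MEB.endswith(shared_end)
```

Both checks succeed, and the unpaired 5' portions are 14 and 15 nucleotides long for Tn5MEA and Tn5MEB, respectively.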
  • the RNA is reverse transcribed before step a) in the method of the present application, and the attached target nucleic acid is produced.
  • the first strand synthesis primer is used to synthesize the first strand of cDNA from mRNA in each mRNA sample.
  • the first-strand synthetic primer includes an oligo dT primer.
  • the first-strand synthetic primer used in the reverse transcription may be a reverse transcription primer that contains the oligonucleotide adaptor sequence and the polyT sequence in the 5' to 3' direction.
  • the reverse transcription includes hybridizing the polyT sequence with the RNA and extending the polyT sequence in a template-directed manner.
  • the first-strand synthetic primer is a random primer.
  • the first-strand synthetic primer is a mixture of an oligo dT primer and a random primer.
  • the method further includes incorporating template-switching oligonucleotide primers (TSO primers) together with a mixture of oligo dT primers and random primers.
  • the second strand of cDNA is synthesized using TSO primers.
  • the second strand of cDNA is synthesized using a second amplification primer that is complementary to the first strand of cDNA, and the first strand extends beyond the mRNA template to include the complementary TSO strand.
  • the target nucleic acid includes DNA derived from the single cell.
  • the DNA includes genomic DNA
  • the DNA includes genomic DNA, open chromatin DNA, protein-bound DNA regions, and/or exogenous nucleic acids linked to proteins, lipids and/or small molecule compounds, wherein the proteins, lipids and/or small molecule compounds can bind to the target molecule in the cell.
  • the protein may include antibodies and antigens.
  • the target molecule may include a target nucleic acid sequence to be analyzed in a cell.
  • the DNA derived from a single cell is fragmented before step a) in the method described in this application.
  • DNA fragmentation can include separating or breaking DNA strands into small pieces or segments.
  • a variety of methods can be used to fragment DNA, including restriction digestion or various methods of generating shear force; for example, the oligonucleotide adaptor sequence is attached after the DNA is fragmented (the oligonucleotide adaptor sequence attached under this condition does not include the transposon end sequence).
  • restriction digestion can use restriction enzymes to cut the DNA sequence, either by cutting both strands at the same position to create blunt ends or by staggered cutting to create sticky ends.
  • shear force-mediated DNA strand destruction can include sonication, acoustic shearing, needle shearing, pipetting, or atomization.
  • Sonication is a type of hydrodynamic shear that exposes DNA sequences to short-term shear forces, which can produce fragment sizes of about 700 bp.
  • Acoustic shearing applies high-frequency acoustic energy to the DNA sample in the bowl-shaped transducer.
  • Needle shearing generates shearing force by passing DNA through a small diameter needle to physically tear the DNA into smaller segments.
  • the atomizing force can be generated by passing DNA through the small holes of the nebulizer unit, where the resulting DNA fragments are collected from the fine mist leaving the unit.
  • these fragments can be any length between about 200 to about 100,000 bases.
  • the fragment will be about 200 bp to about 500 bp, about 500 bp to about 1 kb, about 1 kb to about 10 kb, or about 5 kb to about 50 kb, or about 10 kb to about 30 kb, for example, about 15 kb to about 25 kb.
  • the fragmentation of larger genetic components can be performed by any convenient method, for example, including commercially available shear-based fragmentation systems (for example, the Covaris fragmentation system), size-targeted fragmentation systems (for example, Blue Pippin (Sage Science)), enzyme fragmentation methods (for example, DNA endonuclease, DNA exonuclease) and so on.
  • the fragmentation includes using ultrasonic fragmentation, and then adding a sequence containing the oligonucleotide adaptor to the fragmented DNA, thereby obtaining the attached target nucleic acid.
  • the attached target nucleic acid is produced after the fragmentation or during the fragmentation.
  • the fragmentation includes using a transposase-nucleic acid complex to integrate the sequence containing the oligonucleotide adaptor into the DNA, and releasing the transposase to obtain the attached target nucleic acid.
  • the transposase includes Staphylococcus aureus Tn5 (Colegio et al., “Journal of Bacteriology” (J. Bacteriol.), 183: 2384-8, 2001; Kirby C et al., "Molecular Microbiology" (Mol.
  • the transposase-nucleic acid complex includes a transposase and a transposon end nucleic acid molecule, wherein the transposon end nucleic acid molecule includes the oligonucleotide adaptor sequence.
  • the transposase is Mu transposase.
  • the transposase is Tn5 transposase or Tn10 transposase.
  • the Tn5 transposase is selected from the group consisting of full-length Tn5 transposase, partial functional domains of Tn5 transposase, and Tn5 transposase mutants.
  • the Tn10 transposase is selected from the group consisting of full-length Tn10 transposase, partial functional domains of Tn10 transposase, and Tn10 transposase mutants.
  • the Tn5 transposase mutant may be selected from: R30Q, K40Q, Y41H, T47P, E54K/V, M56A, R62Q, D97A, E110K, D188A, Y319A, R322A/K/Q, E326A, K330A/R, K333A, R342A, E344A, E345K, N348A, L372P, S438A, K439A, S445A, G462D, A466D.
  • the two transposase molecules can bind to the same or different double-stranded DNA transposons, so that the insertion site is marked by one or two types of DNA.
  • the two transposase molecules (such as Tn5 and hyperactive Tn5 or other types of transposase containing point mutations) can be assembled with the oligonucleotide adaptor sequence and another standard transposon DNA sequence into a hybrid transposition complex, or only the above-mentioned double-stranded structure 2 is used to form a single Tn5 transposition complex.
  • the standard transposon DNA sequence may include an amplification primer sequence and/or a sequencing primer sequence.
  • the DNA may include a DNA region that binds to a protein
  • the transposase-nucleic acid complex may also include a portion that directly or indirectly recognizes the protein.
  • the part that directly or indirectly recognizes the protein may include Staphylococcus aureus protein A (ProteinA), streptococcal protein G (ProteinG), streptococcal protein L (ProteinL) or other protein analogs that have the function of binding antibodies.
  • the portion that directly or indirectly recognizes the protein may also include an antibody that specifically binds to the protein.
  • Staphylococcus aureus protein A (ProteinA), streptococcal protein G (ProteinG), streptococcal protein L (ProteinL) or other protein analogs with the function of binding antibodies can each bind to the antibody that specifically binds to the protein.
  • the transposase forms a fusion protein with the Staphylococcus aureus protein A (ProteinA), streptococcal protein G (ProteinG), streptococcal protein L (ProteinL) or other protein analogs that have the function of binding antibodies.
  • the fusion protein binds to the antibody that specifically binds to the protein to form a complex, and then targets the protein.
  • the antibody that specifically binds to the protein binds to the protein, and then the fusion protein binds to the antibody to target the protein.
  • the oligonucleotide adaptor sequence may also include an antibody recognition sequence, which is used to recognize/track different antibodies.
  • the antibody recognition sequence can be generated in a manner similar to random primers.
  • the attached target nucleic acid includes a unique molecular identification region.
  • the unique molecular identification region refers to a unique nucleic acid sequence attached to each of a plurality of nucleic acid molecules.
  • UMI can be used to correct subsequent amplification bias by directly counting the unique molecular identification regions (UMI) sequenced after amplification.
  • UMI includes identifying a single nucleic acid sequence in the barcoded target nucleic acid as derived from a given nucleic acid in the target nucleic acid based at least in part on the existence of the unique molecular identification region.
  • UMI can be designed, incorporated and applied in a manner known in the art, for example, as shown in the publications WO 2012/142213; Islam et al., "Nat. Methods" (2014) 11:163-166; and Kivioja, T. et al., "Nat. Methods" (2012) 9:72-74, each of which is incorporated herein by reference in its entirety.
  • the unique molecular identification region is located between the oligonucleotide adaptor sequence and the target nucleic acid sequence.
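The way UMI counting corrects for amplification bias can be sketched as follows. This is a simplified illustration, assuming each sequenced read has already been parsed into a cell barcode, a UMI and a target identifier and that UMIs are free of sequencing errors; the function name, the record layout and the example reads are all hypothetical.

```python
from collections import defaultdict


def count_molecules(reads):
    """Count distinct UMIs per (cell barcode, target) pair.

    reads: iterable of (cell_barcode, umi, target_id) tuples. Reads that share
    all three fields are treated as amplification copies of one original
    molecule and are counted once.
    """
    umis_seen = defaultdict(set)
    for cell_barcode, umi, target_id in reads:
        umis_seen[(cell_barcode, target_id)].add(umi)
    return {key: len(umis) for key, umis in umis_seen.items()}


# Hypothetical reads: geneA in cell1 was amplified unevenly (three copies of one
# molecule, one copy of another), yet only two original molecules are counted.
example_reads = [
    ("cell1", "AACGT", "geneA"),
    ("cell1", "AACGT", "geneA"),
    ("cell1", "AACGT", "geneA"),
    ("cell1", "GGTCA", "geneA"),
    ("cell2", "AACGT", "geneA"),
]
molecule_counts = count_molecules(example_reads)
```

Counting distinct UMIs rather than reads makes the result independent of how many copies each molecule produced during amplification.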
  • the target nucleic acid may also include an exogenous nucleic acid, which includes an exogenous nucleic acid linked to a protein, lipid, and/or small molecule compound, and the protein, lipid, and/or small molecule compound can bind to the target molecule within the cell.
  • the protein may include antibodies and antigens.
  • the target molecule may include a target nucleic acid sequence to be analyzed in a cell.
  • the transposition reactions and methods described herein are performed in batches, and then the biological particles (e.g., nuclei/cells/chromatin from a single cell) are distributed so that a plurality of discrete partitions are each occupied by a single biological particle (e.g., a cell, nucleus, chromatin or cell bead).
  • a plurality of biological particles may be allocated to a plurality of discrete partitions such that the discrete partitions of the plurality of discrete partitions include a single biological particle.
  • the solid support may include beads.
  • the beads may be porous, non-porous, and/or a combination thereof.
  • the beads may be solid, semi-solid, semi-fluid, fluid, and/or combinations thereof.
  • the beads may be soluble, destructible, and/or degradable.
  • the beads may be non-degradable.
  • the beads may be gel beads.
  • the gel beads may be hydrogel beads. Gel beads can be formed from molecular precursors, such as polymers or monomer substances.
  • the semi-solid beads may be liposomal beads.
  • the solid beads may contain metals, including iron oxide, gold, and silver.
  • the beads may be silica beads.
  • the beads are magnetic beads.
  • the beads can be rigid.
  • the beads may be flexible and/or compressible.
  • the beads can have any suitable shape.
  • the shape of the beads may include, but is not limited to, spherical, non-spherical, elliptical, oblong, amorphous, circular, cylindrical, and deformed forms thereof.
  • the beads may have a uniform size or a non-uniform size.
  • the diameter of the beads may be at least about 10 nm, 100 nm, 500 nm, 1 μm, 5 μm, 10 μm, 20 μm, 30 μm, 40 μm, 50 μm, 60 μm, 70 μm, 80 μm, 90 μm, 100 μm, 250 μm, 500 μm, 1 mm or more.
  • the diameter of the beads may be less than about 10 nm, 100 nm, 500 nm, 1 μm, 5 μm, 10 μm, 20 μm, 30 μm, 40 μm, 50 μm, 60 μm, 70 μm, 80 μm, 90 μm, 100 μm, 250 μm, 500 μm, 1 mm or less.
  • the diameter of the beads can be in the range of about 40-75 μm, 30-75 μm, 20-75 μm, 40-85 μm, 40-95 μm, 20-100 μm, 10-100 μm, 1-100 μm, 20-250 μm, or 20-500 μm.
  • the beads may be provided in a bead population or multiple beads having a relatively monodisperse size distribution.
  • maintaining relatively consistent bead characteristics can contribute to overall consistency.
  • the beads described herein may have a size distribution in which the coefficient of variation of their cross-sectional dimensions is less than 50%, less than 40%, less than 30%, less than 20%, and, for example, less than 15%, less than 10%, or less than 5%.
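  • As an illustration of the size-distribution criterion above, the following Python sketch computes a coefficient of variation (standard deviation divided by mean) from a list of measured bead diameters. The function name `coefficient_of_variation`, the use of the population standard deviation, and the example diameters are assumptions made for this sketch and are not taken from the application.

```python
from statistics import mean, pstdev


def coefficient_of_variation(diameters_um):
    """Return the coefficient of variation (population std / mean) in percent."""
    if not diameters_um:
        raise ValueError("at least one diameter is required")
    return 100.0 * pstdev(diameters_um) / mean(diameters_um)


# Hypothetical diameters (in micrometers) of five beads from one batch.
example_diameters = [48.0, 50.0, 52.0, 49.0, 51.0]
cv_percent = coefficient_of_variation(example_diameters)
print(f"CV = {cv_percent:.2f}%")
```

  • A batch would meet, for example, the "less than 5%" criterion when the returned value is below 5.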
  • the beads may comprise natural and/or synthetic materials.
  • the beads may comprise natural polymers, synthetic polymers, or natural and synthetic polymers.
  • Natural polymers may include proteins and sugars, such as deoxyribonucleic acid, rubber, cellulose, starch (e.g., amylose, pullulan), protein, enzyme, polysaccharide, silk, polyhydroxyalkanoate, chitosan, dextran, collagen, carrageenan, plantago ovata, gum arabic, agar, gelatin, shellac, karaya, xanthan gum, corn syrup, guar gum, agarose, alginic acid, alginate, or natural polymers thereof.
  • Synthetic polymers can include acrylic, nylon, siloxane, spandex, viscose rayon, polycarboxylic acid, polyvinyl acetate, polyacrylamide, polyacrylate, polyethylene glycol, polyurethane, polylactic acid, silicon dioxide, polystyrene, polyacrylonitrile, polybutadiene, polycarbonate, polyethylene, polyethylene terephthalate, polychlorotrifluoroethylene, polyethylene oxide, poly(ethylene terephthalate), polyisobutylene, polymethyl methacrylate, polyoxymethylene, polypropylene, polystyrene, polytetrafluoroethylene, polyvinyl alcohol, polyvinyl chloride, polyvinylidene chloride, polyvinylidene fluoride, polyvinyl fluoride, and/or combinations thereof (e.g., copolymers).
  • the beads can also be formed of materials other than polymers, such as lipids, micelles, ceramics, glass ceramics, material composites,
  • the beads may contain molecular precursors (e.g., monomers or polymers), which can form a polymer network through the polymerization of the molecular precursors.
  • the precursor may be an already polymerized substance, which can be further polymerized by, for example, chemical crosslinking.
  • the precursor may include one or more of acrylamide or methacrylamide monomers, oligomers, or polymers.
  • the beads may contain a prepolymer, which is an oligomer that can be further polymerized.
  • prepolymers can be used to prepare polyurethane beads.
  • the beads may contain separate polymers that can be further polymerized together.
  • beads can be produced by the polymerization of different precursors so that they comprise mixed polymers, copolymers and/or block copolymers.
  • the beads can include covalent or ionic bonds between polymer precursors (e.g., monomers, oligomers, linear polymers), nucleic acid molecules (e.g., oligonucleotides), primers, and other entities.
  • the covalent bond may be a carbon-carbon bond, a thioether bond, or a carbon-heteroatom bond.
  • crosslinking can be permanent or reversible, depending on the specific crosslinking agent used.
  • Reversible crosslinking can allow the polymer to be linearized or dissociated under appropriate conditions.
  • reversible crosslinking can also allow the binding substance to be reversibly attached to the surface of the bead.
  • crosslinking agents can form disulfide bonds.
  • the chemical crosslinking agent that forms disulfide bonds may be cystamine or modified cystamine.
  • disulfide bonds can be formed between molecular precursor units (e.g., monomers, oligomers, or linear polymers) or precursors incorporated into the beads and nucleic acid molecules (e.g., oligonucleotides).
  • cystamine including modified cystamine
  • cystamine is an organic reagent containing disulfide bonds, which can be used as a crosslinking agent between individual monomers of beads or polymer precursors.
  • Polyacrylamide can be polymerized in the presence of cystamine or a substance containing cystamine (e.g., modified cystamine) to produce polyacrylamide gel beads containing disulfide bonds (e.g., chemically degradable beads containing chemically reducible crosslinks). Disulfide bonds can allow the beads to degrade or dissolve when they are exposed to a reducing agent.
  • chitosan a linear polysaccharide polymer
  • glutaraldehyde a hydrophilic chain
  • the cross-linking of chitosan polymers can be achieved by chemical reactions triggered by heat, pressure, pH changes and/or radiation.
  • the beads can be macromolecules of single or mixed monomers polymerized by various monomers such as agarose, polyenamide, PEG, or macromolecular gels such as chitin, hyaluronic acid, and dextran.
  • the microfluidic droplet platform in which the droplets aggregate into gel beads of uniform size.
  • the beads may comprise an acrydite moiety, which in certain aspects can be used to attach one or more nucleic acid molecules (e.g., barcode sequence, barcoded nucleic acid molecule, barcoded oligonucleotide, primer or other oligonucleotide) to the beads.
  • the acrydite moiety can refer to acrydite analogs produced by the reaction of acrydite with one or more substances, such as the reaction of acrydite with other monomers and crosslinkers during the polymerization reaction.
  • the acrydite moiety can be modified to form a chemical bond with the substance to be attached, such as a nucleic acid molecule (e.g., barcode sequence, barcoded nucleic acid molecule, barcoded oligonucleotide, primer or other oligonucleotide).
  • the acrydite moiety can be modified with a thiol group capable of forming a disulfide bond, or it can be modified with a group that already contains a disulfide bond. Thiols or disulfides (via disulfide exchange) can be used as anchor points for the substance to be attached, or another part of the acrydite moiety can be used for attachment.
  • the attachment may be reversible, such that when the disulfide bond is broken (e.g., in the presence of a reducing agent), the attached substance is released from the beads.
  • the acrydite moiety can contain reactive hydroxyl groups that can be used for attachment.
  • it can also include other release methods, such as UV photo-induced release, or it can be released by enzymes.
  • the present application provides a device for co-dispensing a solid support (such as beads) with a sample, for example, for co-dispensing sample components and beads to the same discrete partition.
  • the target nucleic acid derived from a single cell and the solid support attached with at least one oligonucleotide tag are co-distributed into the discrete partitions.
  • the device can be formed of any suitable material.
  • the device may be formed of a material selected from the group consisting of fused silica, soda lime glass, borosilicate glass, poly(methyl methacrylate) PMMA, PDMS, sapphire, silicon, germanium, cycloolefin copolymer, Polyethylene, polypropylene, polyacrylate, polycarbonate, plastics, thermosetting plastics, hydrogels, thermoplastics, paper, elastomers, and combinations thereof.
  • the discrete partitions may include wells or droplets.
  • the target nucleic acid derived from a single cell and the solid support to which at least one oligonucleotide tag is attached are co-dispensed into the wells or droplets.
  • the wells may include the sample-loading wells of a cell culture plate or the wells of any other container that can cooperate with the device and is suitable for co-dispensing.
  • the discrete partitions are droplets.
  • each of the discrete partitions includes at most the target nucleic acid derived from a single cell.
  • the target nucleic acid is located in a single cell or cell nucleus.
  • a microfluidic device is used to co-distribute the target nucleic acid derived from a single cell and the solid support attached with at least one oligonucleotide tag into the discrete partitions.
  • discrete partitions (e.g., droplets or wells) contain single cells and are processed according to the methods described in this application.
  • discrete partitions contain single cells and/or single cell nuclei.
  • Single cells and/or single cell nuclei can be allocated and processed according to the methods described in this application.
  • a single cell nucleus can be an integral part of a cell.
  • discrete partitions contain chromatin from a single cell or single cell nucleus (e.g., a single chromosome or other part of the genome), and are distributed and processed according to the methods described in this application.
  • a ligase is also included in the discrete partition, and the ligase connects the oligonucleotide tag to the attached target nucleic acid.
  • the discrete partition includes but is not limited to ligase, and may also include other required enzymes, for example, DNA polymerases, DNA endonucleases, DNA exonucleases, terminal transferases, and light-sensitive or pH-sensitive enzymes capable of releasing the oligonucleotide tag from the solid support.
  • the ligase includes T4 ligase, but is not limited to T4 ligase. For example, it may also include E.
  • DNA ligase (for example, DNA ligase I, DNA ligase III, DNA ligase IV), thermostable ligase, etc.
  • the device is formed in a manner that includes fluid flow channels. Any suitable channel can be used.
  • the device includes one or more fluid input channels (e.g., inlet channels) and one or more fluid outlet channels.
  • the inner diameter of the fluid channel may be about 10 μm, 20 μm, 30 μm, 40 μm, 50 μm, 60 μm, 65 μm, 70 μm, 75 μm, 80 μm, 85 μm, 90 μm, 100 μm, 125 μm, or 150 μm.
  • the inner diameter of the fluid channel may be greater than 10 μm, 20 μm, 30 μm, 40 μm, 50 μm, 60 μm, 65 μm, 70 μm, 75 μm, 80 μm, 85 μm, 90 μm, 100 μm, 125 μm, 150 μm or more.
  • the inner diameter of the fluid channel may be less than about 10 μm, 20 μm, 30 μm, 40 μm, 50 μm, 60 μm, 65 μm, 70 μm, 75 μm, 80 μm, 85 μm, 90 μm, 100 μm, 125 μm, or 150 μm.
  • the volumetric flow rate in the fluid channel can be any flow rate known in the art.
  • the microfluidic device is a droplet generator.
  • a microfluidic device can be used to combine the solid support to which at least one oligonucleotide tag is attached with a sample (e.g., a sample containing the target nucleic acid) by forming small aqueous droplets that contain both the solid support and the sample.
  • the small aqueous droplets serve as discrete partitions.
  • the aqueous droplet may be an aqueous core surrounded by an oil phase, for example, an aqueous droplet in a water-in-oil emulsion.
  • the aqueous droplet may contain one or more solid supports to which at least one oligonucleotide tag is attached, a sample, an amplification reagent, and a reducing agent.
  • the aqueous droplet may contain one or more of the following: water, nuclease-free water, solid support attached with at least one oligonucleotide tag, acetonitrile, solid support, gel solid support compounds, polymer precursors, polymer monomers, polyacrylamide monomers, acrylamide monomers, degradable crosslinkers, non-degradable crosslinkers, disulfide bonds, acrydite moieties, PCR reagents, cells, nuclei, chloroplasts, mitochondria, ribosomes, primers, polymerases, barcodes, polynucleotides, oligonucleotides, DNA, RNA, peptide polynucleotides, complementary DNA (cDNA), double-stranded DNA (dsDNA),
  • the aqueous droplets can have a uniform size or an uneven size.
  • the diameter of the aqueous droplet may be about 1 μm, 5 μm, 10 μm, 20 μm, 30 μm, 40 μm, 45 μm, 50 μm, 60 μm, 65 μm, 70 μm, 75 μm, 80 μm, 90 μm, 100 μm, 250 μm, 500 μm, or 1 mm.
  • the fluid droplet may have a diameter of at least about 1 μm, 5 μm, 10 μm, 20 μm, 30 μm, 40 μm, 45 μm, 50 μm, 60 μm, 65 μm, 70 μm, 75 μm, 80 μm, 90 μm, 100 μm, 250 μm, 500 μm, 1 mm or more.
  • the fluid droplet may have a diameter of less than about 1 μm, 5 μm, 10 μm, 20 μm, 30 μm, 40 μm, 45 μm, 50 μm, 60 μm, 65 μm, 70 μm, 75 μm, 80 μm, 90 μm, 100 μm, 250 μm, 500 μm, or 1 mm.
  • the fluid droplet may have a diameter in the range of about 40-75 μm, 30-75 μm, 20-75 μm, 40-85 μm, 40-95 μm, 20-100 μm, 10-100 μm, 1-100 μm, 20-250 μm, or 20-500 μm.
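  • To give a sense of the reaction volumes that the droplet diameters listed above correspond to, the following Python sketch converts a diameter into a volume. It assumes the droplet is an ideal sphere, which is a simplification introduced here and not a statement of the application; the function name `droplet_volume_pl` is hypothetical.

```python
import math


def droplet_volume_pl(diameter_um: float) -> float:
    """Volume, in picoliters, of a spherical droplet of the given diameter (micrometers)."""
    if diameter_um <= 0:
        raise ValueError("diameter must be positive")
    volume_um3 = math.pi / 6.0 * diameter_um ** 3  # sphere: V = (pi / 6) * d^3
    return volume_um3 / 1000.0  # 1 pL equals 1000 cubic micrometers


# Diameters taken from the 40-75 micrometer range mentioned in the text.
for d in (40, 50, 75):
    print(f"{d} um -> {droplet_volume_pl(d):.1f} pL")
```

  • Because volume scales with the cube of the diameter, doubling the droplet diameter increases the volume eightfold.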
  • the microfluidic device (e.g., a small droplet generator) may provide a sample (e.g., a nucleic acid sample) through a first fluid input channel that is fluidly connected to a first fluid intersection (e.g., a first fluid junction), and a pre-formed solid support (for example, a solid support to which at least one oligonucleotide tag is attached, such as a degradable solid support) through a second fluid input channel that is also fluidly connected to the first fluid intersection, wherein the first fluid input channel and the second fluid input channel meet at the first fluid intersection.
  • the sample and the solid support to which at least one oligonucleotide tag is attached can be mixed at the first fluid intersection to form a mixture (e.g., an aqueous mixture).
  • the fourth fluid input channel can be supplied with a reducing agent (or other required reagents, such as surfactants, stabilizers, polymers, aptamers, initiators, biotin markers, fluorophores, buffers, acidic solutions, alkaline solutions, light-sensitive enzymes, pH-sensitive enzymes, aqueous buffers, etc.); the fourth fluid input channel is also fluidly connected to the first fluid intersection and meets the first and second fluid input channels at the first fluid intersection.
  • the reducing agent can then be mixed with the solid support to which at least one oligonucleotide tag is attached and the sample at the first fluid intersection.
  • the reducing agent (or other required reagents, such as surfactants, stabilizers, polymers, aptamers, initiators, biotin markers, fluorophores, buffers, acidic solutions, alkaline solutions, light-sensitive enzymes, pH-sensitive enzymes, aqueous buffers, etc.) is premixed with the sample and/or the solid support to which at least one oligonucleotide tag is attached, so that it is provided to the microfluidic device together with the sample through the first fluid input channel and/or together with the solid support through the second fluid input channel.
  • the mixture of the sample containing the target nucleic acid and the solid support to which at least one oligonucleotide tag is attached can leave the first fluid intersection through a first outlet channel that is fluidly connected to the first fluid intersection (and to any fluid channel that constitutes the first fluid intersection).
  • the mixture may be provided to a second fluid intersection (e.g., a second fluid junction) fluidly connected to the first outlet channel.
  • an oil (or other suitable immiscible fluid) can enter the second fluid intersection from one or more separate fluid input channels that are fluidly connected to the second fluid intersection (and to any fluid channel that constitutes that intersection) and that meet the first outlet channel at the second fluid intersection.
  • oil (or other suitable immiscible fluid) can be provided in one or two separate fluid input channels that are fluidly connected to the second fluid intersection (and to the first outlet channel) and that intersect the first outlet channel and each other at the second fluid intersection.
  • the oil and the mixture of the sample and the solid support to which at least one oligonucleotide tag is attached can be mixed at the second fluid intersection.
  • the formed aqueous droplets can be transported within the oil through the second fluid outlet channel exiting from the second fluid intersection.
  • the formed aqueous droplets may also exit the second outlet channel from the first fluid intersection point, and the fluid droplets may be dispensed into wells for further processing.
  • the sample containing the target nucleic acid is formed into droplets such that at least 50%, 60%, 70%, 80%, 90% or more of the droplets contain no more than one solid support to which at least one oligonucleotide tag is attached.
  • the sample containing the target nucleic acid is formed into droplets such that at least 50%, 60%, 70%, 80%, 90%, or more of the droplets contain exactly one solid support to which at least one oligonucleotide tag is attached.
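  • The occupancy fractions above can be explored numerically. The following Python sketch assumes, purely for illustration, that solid supports are loaded into droplets at random so that the number per droplet follows a Poisson distribution with mean `lam`; the application does not state a loading model, and the function names are hypothetical.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability that a droplet holds exactly k solid supports (Poisson model)."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


def fraction_at_most_one(lam: float) -> float:
    """Fraction of droplets holding zero or one solid support."""
    return poisson_pmf(0, lam) + poisson_pmf(1, lam)


def fraction_exactly_one(lam: float) -> float:
    """Fraction of droplets holding exactly one solid support."""
    return poisson_pmf(1, lam)


# Hypothetical mean loadings (solid supports per droplet).
for lam in (0.1, 0.5, 1.0):
    print(f"lam={lam}: <=1 -> {fraction_at_most_one(lam):.3f}, "
          f"==1 -> {fraction_exactly_one(lam):.3f}")
```

  • Under this random-loading assumption the exactly-one fraction peaks at a mean of one support per droplet, where it equals 1/e (about 37%), so the higher exactly-one fractions listed above would presumably rely on non-random loading of the solid supports.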
  • before entering the microfluidic device, the sample can be combined with the solid support to which at least one oligonucleotide tag is attached (e.g., a degradable solid support) and with any other reagents (for example, an amplifying agent required for sample amplification, a reducing agent, etc.) to form a mixture.
  • the mixture can flow from the first fluid input channel and enter the fluid intersection.
  • the oil phase may enter the fluid intersection from a second fluid input channel (for example, a fluid channel perpendicular or substantially perpendicular to the first fluid input channel) that is also fluidly connected to the fluid intersection.
  • the aqueous mixture and oil can be mixed at the point of fluid intersection, so that a water-in-oil emulsion (e.g., a solid support-water-oil emulsion) is formed.
  • the emulsion may contain a plurality of small aqueous droplets (e.g., small droplets containing an aqueous reaction mixture) in a continuous oil phase.
  • each aqueous droplet may contain a single solid support (e.g., a gel solid support attached to the same set of barcodes), an aliquot of a sample (e.g., target nucleic acid from one cell), and any other reagents (e.g., reducing agent, reagents required for sample amplification, etc.).
  • the fluid droplet may comprise a plurality of solid supports to which at least one oligonucleotide tag is attached.
  • the droplets can be transported by the continuous oil phase through the fluid outlet channel away from the fluid intersection.
  • the fluid droplets leaving the outlet channel can be dispensed into wells for further processing.
  • the fluid droplets formed at the second fluid intersection may contain the reducing agent.
  • the reducing agent can degrade or dissolve the solid support contained in the fluid droplet when the droplet travels through the exit channel leaving the intersection of the second fluid.
  • a microfluidic device may contain three discrete fluid intersection points in parallel. Liquid droplets can be formed at any of the three fluid intersection points.
  • the sample and the solid support to which at least one oligonucleotide tag is attached can be mixed in any of the three fluid intersections.
  • a reducing agent or other arbitrary and required reagents, such as a permeabilizing agent, an amplifying agent, a cutting agent that releases the oligonucleotide tag from the solid support
  • Oil can be added at any of these three fluid intersections.
  • the microfluidic device includes a first input channel and a second input channel, which merge at a junction fluidly connected to the output channel.
  • the outlet channel may be fluidly connected with the third input channel at the junction.
  • the method further includes introducing a sample containing the target nucleic acid into the first input channel, and introducing the solid support attached with at least one oligonucleotide tag into the second input channel, thereby generating a mixture of the sample and the solid support to which at least one oligonucleotide tag is attached in the output channel.
  • a fourth input channel may also be included and it may intersect the third input channel and the outlet channel at the junction.
  • the microfluidic device may include first, second, and third input channels, where the third input channel intersects the first input channel, the second input channel, or the junction of the first input channel and the second input channel.
  • the output channel and the third input channel are fluidly connected at the junction.
  • the first input channel and the second input channel form a substantially perpendicular angle to each other.
  • each of the discrete partitions contains at most the target nucleic acid from a single cell.
  • oil can be used to produce droplets.
  • the oil may include fluorinated oil, silicone oil, mineral oil, vegetable oil, and combinations thereof.
  • the aqueous fluid in the microfluidic device may also contain alcohol.
  • the alcohol can be glycerol, ethanol, methanol, isopropanol, pentanol, ethane, propane, butane, pentane, hexane, and combinations thereof.
  • the alcohol can be present in the aqueous fluid at about 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19% or 20% (v/v).
  • the alcohol may be present in the aqueous fluid at a concentration of at least about 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20% (v/v) or higher.
  • the alcohol may be present in the aqueous fluid at less than about 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19% or 20% (v/v).
  • the oil may also contain surfactants to stabilize the emulsion.
  • the surfactant may be a fluorosurfactant, Krytox lubricant, Krytox FSH, engineered fluid, HFE-7500, silicone compound, PEG-containing silicon compound, such as bis krytoxpeg (BKP).
  • the surfactant can be present at about 0.1%, 0.5%, 1%, 1.1%, 1.2%, 1.3%, 1.4%, 1.5%, 1.6%, 1.7%, 1.8%, 1.9%, 2%, 5% or 10% (w/w).
  • the surfactant can be present at a concentration of at least about 0.1%, 0.5%, 1%, 1.1%, 1.2%, 1.3%, 1.4%, 1.5%, 1.6%, 1.7%, 1.8%, 1.9%, 2%, 5%, 10% (w/w) or higher.
  • the surfactant may be present at less than about 0.1%, 0.5%, 1%, 1.1%, 1.2%, 1.3%, 1.4%, 1.5%, 1.6%, 1.7%, 1.8%, 1.9%, 2%, 5% or 10% (w/w).
  • accelerators and/or initiators can be added to the oil.
  • the accelerator may be tetramethylethylenediamine (TMEDA or TEMED).
  • the initiator may be ammonium persulfate or calcium ion.
  • the accelerator can be present at about 0.1%, 0.2%, 0.3%, 0.4%, 0.5%, 0.6%, 0.7%, 0.8%, 0.9%, 1%, 1.1%, 1.2%, 1.3%, 1.4%, 1.5%, 1.6%, 1.7%, 1.8%, 1.9% or 2% (v/v).
  • the accelerator may be present at a concentration of at least about 0.1%, 0.2%, 0.3%, 0.4%, 0.5%, 0.6%, 0.7%, 0.8%, 0.9%, 1%, 1.1%, 1.2%, 1.3%, 1.4%, 1.5%, 1.6%, 1.7%, 1.8%, 1.9%, 2% (v/v) or higher.
  • the accelerator may be present at less than about 0.1%, 0.2%, 0.3%, 0.4%, 0.5%, 0.6%, 0.7%, 0.8%, 0.9%, 1%, 1.1%, 1.2%, 1.3%, 1.4%, 1.5%, 1.6%, 1.7%, 1.8%, 1.9% or 2% (v/v).
  • the cell is a cell of any organism.
  • the cells of the organism may be in vitro cells (for example, an established cultured cell line), or may be isolated cells (cultured cells from an individual, primary cells).
  • the cell may be a cell in the body (a cell in a biological individual), for example, a cell from various tissues.
  • the biological cells may include animal cells, plant cells, and microbial cells.
  • the plant cell may include Arabidopsis thaliana cells, and may also include cells of agricultural crops, for example, somatic cells of plants such as wheat, corn, rice, sorghum, millet, and soybean; the plant cells may also include cells of fruit and nut plants, for example, plants that produce apricots, oranges, lemons, apples, plums, pears, almonds, walnuts, and the like.
  • the plant cell may be a cell derived from any part of the plant body, for example, root cells, leaf cells, xylem cells, phloem cells, cambium cells, apical meristem cells, parenchyma cells.
  • the microbial cells may include bacteria (e.g. Escherichia coli, archaea), fungi (e.g. yeast), actinomycetes, rickettsiae, mycoplasma, chlamydia, spirochetes, and the like.
  • the animal cell may include invertebrate (e.g., Drosophila, nematode, planarian, etc.) cells, vertebrate (e.g., zebrafish, chicken, mammalian) cells.
  • the mammalian cells may include cells of mice, rats, rabbits, pigs, dogs, cats, monkeys, humans, and the like.
  • the animal cells may include cells from any tissue of the organism, such as stem cells, induced pluripotent stem (iPS) cells, germ cells (e.g., oocytes, egg cells, sperm cells, etc.), adult stem cells, somatic cells (for example, fibroblasts, hematopoietic cells, cardiomyocytes, neurons, muscle cells, bone cells, liver cells, pancreatic cells, epithelial cells, immune cells, and any cells derived from lung, spleen, kidney, stomach, large intestine, small intestine, and other organs or tissues), and embryos at any stage in vitro or in vivo.
  • the cell may be a cell derived from a biological fluid.
  • the body fluid of the organism may include cerebrospinal fluid, aqueous humor, lymph, digestive juice (e.g., saliva, gastric juice, small intestinal fluid, bile, etc.), breast milk, blood, urine, sweat, tears, feces, respiratory secretions, reproductive organ secretions (such as semen, cervical mucus), etc.
  • the sample includes the cell and/or the nucleus obtained therefrom.
  • the sample may include nucleic acid molecules of the organism.
  • the nucleic acid molecule can be isolated and extracted from any organism by the technical means known to those skilled in the art to separate nucleic acid molecules, including DNA and RNA.
  • the nucleic acid molecule is extracted from the aforementioned biological cells or body fluids of the biological body.
  • the target nucleic acid may include nucleic acid derived from any of the aforementioned cells.
  • nucleic acid in a single cell may be included in the target nucleic acid.
  • the target nucleic acid may be a polynucleotide derived from a single cell, for example, double-stranded DNA.
  • the double-stranded DNA may include genomic DNA, for example, coding DNA and non-coding DNA, or open chromatin region DNA, protein binding site DNA, mitochondrial DNA, and chloroplast DNA; the polynucleotide may also include RNA, for example, ribosomal RNA or mRNA.
  • the target nucleic acid can also be derived from a formalin-fixed, paraffin-embedded (FFPE) sample containing cells.
  • the target nucleic acid may also include a sequence containing a SNP site in the genome of an organism, and a nucleotide sequence modified by methylation or hydroxymethylation.
  • the cells can also be pretreated.
  • the pretreatment also includes exposing the nucleus of the cell.
  • cell nuclei can be exposed by treatment with lysis buffer and concentrated sucrose solution.
  • the cell and/or the cell nucleus exposed (obtained) therefrom may be encapsulated in a suitable matrix to form microspheres, and the microspheres are used as a sample for reaction.
  • the pretreatment includes fixing the cell and/or the cell nucleus exposed (obtained) therefrom.
  • a fixative is used to fix the cells, and the fixative is selected from one or more of the following group: formaldehyde, paraformaldehyde, methanol, ethanol, acetone, glutaraldehyde, osmic acid, and potassium dichromate.
  • the pretreatment includes treating the cells or cell nucleus with a detergent, and the detergent includes Triton, NP-40 and/or digitonin.
  • the pretreatment may also include the removal of organelles such as mitochondria, chloroplasts, and ribosomes.
  • the cells can be dispensed together with the lysis reagent to release the contents of the cells within the discrete partitions.
  • the lytic agent is brought into contact with the cell suspension at the same time that the cells are introduced into the droplet generation area through the additional channel, or when the cells are about to be introduced into the droplet generation area.
  • the lysing agent may include biologically active agents, such as lytic enzymes for lysing different cell types (e.g., gram-positive or gram-negative bacteria, plants, yeast, mammals, etc.), such as lysozyme, leuco peptidase, lysostaphin, thioglucosidase, kitalase, lyticase, and other commercially available lytic enzymes.
  • lytic agents can also be co-partitioned with the cells so that the contents of the cells are released into discrete partitions.
  • a surfactant-based lysis solution may be used to lyse the cells; for example, the lysis solution may include a nonionic surfactant such as Triton X-100 or Tween 20.
  • the lysis solution may include ionic surfactants such as sodium lauryl sarcosinate and sodium dodecyl sulfate (SDS).
  • other methods, such as electroporation, heat, sound, or mechanical cell disruption, can also be used for lysis.
  • the present application also provides a composition comprising: a plurality of solid supports, each of said solid supports having at least one oligonucleotide tag attached, wherein each of said oligonucleotide tags comprises a first strand and a second strand; the first strand includes a barcode sequence and a hybridization sequence at the 3' end of the barcode sequence; the second strand includes a first portion complementary to the hybridization sequence of the first strand and a second portion complementary to a sequence in the nucleic acid to be tested; and the first strand and the second strand form a partially double-stranded structure, or the second strand and the attached target nucleic acid form a partially double-stranded structure; the barcode sequence of the oligonucleotide tag includes a common barcode domain and a variable domain, and the common barcode domain is the same in the oligonucleotide tags attached to the same solid support, and the common barcode domain is different between two or more solid supports in
  • the application also provides a kit for analyzing target nucleic acid from cells, which includes the composition described in the application.
  • the kit may also include a transposase.
  • the kit further includes at least one of a nucleic acid amplification agent, a reverse transcription agent, a fixative, a permeabilizing agent, a linking agent, and a lysing agent.
  • the nucleotide tag has two strands, forming a partial double-stranded structure 1, as shown below:
  • Chain I: solid support-attachment sequence-barcode sequence (barcode)-hybridization sequence (fixed sequence, hybridizing with the complementary part of chain II), wherein the barcode sequence (barcode) is (barcode-linker)n, with n greater than or equal to 1.
  • Bead-acrydite-S-S-ACACTCTTTCCCTACACGACGCTCTTCCGATCT read1, SEQ ID NO: 6
  • barcode-ATCCACGTGCTTGAG SEQ ID NO: 12
  • Strand II: hybridization sequence (fixed sequence, hybridized with the fixed DNA sequence in strand I)-a sequence complementary to the 5' end of strand I of the transposon complex
  • the solid support is polyacrylamide microspheres, which are prepared by a microfluidic device.
  • An acrylamide/bis-acrylamide mixture, an acrydite-DNA primer, and an APS initiator are mixed in a microfluidic device to form droplets that contain a TEMED catalyst; the droplets spontaneously polymerize into gel microspheres, and the microspheres are then labeled according to the barcode synthesis method.
  • the solution contains 10mM DTT, and the S-S bond can be reduced to release the primer.
  • Chain A: phosphate group - a sequence at least partially complementary to the fixed DNA sequence in strand I or strand II of the nucleotide tag - (UMI) - the sequence bound by Tn5 transposase
  • AGGCCAGAGCATTCGNNNNNNNAGATGTGTATAAGAGACAG (SEQ ID NO: 5)
  • Chain B: Tn5 transposase binding sequence (complementary to the Tn5-binding sequence in chain A) - phosphate group
  • the UMI in chain A is optional; the sequences in (1) and (2) can contain modified bases, such as 5mC.
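The layout of chain A (constant 5' part, UMI, Tn5-binding sequence) suggests how a UMI could be recovered from a read. The following is a minimal sketch, not the applicant's pipeline: the function names and the example read are hypothetical, and it assumes the read contains chain A without sequencing errors in the constant flanks.

```python
import re

# SEQ ID NO: 5 as given in the text; the run of N marks the UMI.
CHAIN_A = "AGGCCAGAGCATTCGNNNNNNNAGATGTGTATAAGAGACAG"


def build_umi_pattern(template: str) -> re.Pattern:
    """Turn a constant-N...N-constant template into a regex with a named UMI group."""
    match = re.fullmatch(r"([ACGT]+)(N+)([ACGT]+)", template)
    if match is None:
        raise ValueError("template must look like <constant><N run><constant>")
    left, umi_run, right = match.groups()
    return re.compile(f"{left}(?P<umi>[ACGT]{{{len(umi_run)}}}){right}")


def extract_umi(read: str, pattern: re.Pattern):
    """Return the UMI found in the read, or None if the flanks are absent."""
    hit = pattern.search(read)
    return hit.group("umi") if hit else None


pattern = build_umi_pattern(CHAIN_A)
umi_length = CHAIN_A.count("N")
# Hypothetical read: chain A with a made-up UMI, followed by made-up genomic bases.
example_umi = ("ACGT" * 3)[:umi_length]
example_read = CHAIN_A.replace("N" * umi_length, example_umi) + "TTGACC"
found = extract_umi(example_read, pattern)
```

Exact matching of the flanks keeps the sketch short; a production tool would normally tolerate mismatches.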
  • the Tn5 transposition complex is a dimer.
  • Two Tn5 proteins can bind to the same or different partial double-stranded DNA transposons, so that the insertion site is marked by one or two types of DNA;
  • Tn5 protein (which may be a hyperactive point mutant or another type of transposase) can be combined with the above double-stranded structure 2 and another standard transposon DNA to assemble a hybrid transposition complex, or the above double-stranded structure 2 alone can be used to form a single-type Tn5 transposition complex.
  • samples can be non-fixed cells or nuclei, formaldehyde (or other fixatives) fixed cells or nuclei, non-fixed or fixed tissue sections, etc.
  • the fixed or non-fixed sample is treated with a buffer containing detergents (Triton, NP-40, Digitonin, etc.), optionally with an intermediate step of lysing the cells (non-fixed samples) to obtain cell nuclei.
  • typical permeabilization solutions can include Tris, sucrose, sodium chloride and detergents.
  • Tn5 enzyme buffer containing a divalent metal ion (for example, magnesium ion)
  • the reaction system includes: cells, cell nuclei or tissue; Tn5 transposition complex; buffer. After the reaction, the sample is washed with buffer to remove unreacted Tn5 enzyme.
  • the reaction system includes: cells, nuclei or tissue (after the transposition reaction); T4 DNA ligase; nucleotide tag. After the reaction, an excess of free sequence complementary to the nucleotide tag is added to the ligation reaction system to block excess unreacted nucleotide tag.
  • 10uM Tn5 enzyme purchased from Epicenter
  • the transposome formed by the Top1/Bottom double strand and Tn5 is p-Tn5
  • the transposome formed by the Top2/Bottom double strand and Tn5 is Tn5-B.
  • the PCR adaptor sequence is ACACTCTTTCCCTACACGACGCTCTTCCGATCT (SEQ ID NO: 6)
  • linker sequence 1 (Linker1) is CGACTCACTACAGGG (SEQ ID NO: 7)
  • linker sequence 2 (Linker2) is TCGGTGACACGATCG (SEQ ID NO: 8)
  • the synthesized microspheres were evenly distributed into a 96-well plate, a different PCR handle-barcode1-linker1 oligo (96 in total) was added to each well, and the first round of barcoding reaction was performed.
  • the reaction system and process are as follows: 10ul microspheres + 2ul BstI buffer + 1ul 10uM dNTP + 1ul 100uM PCR handle-96xbarcode1-linker1; hold at 95°C for 5min, then at 60°C for 20min; then add 1ul BstI + 5ul H2O and hold at 60°C for 60min.
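The barcode is described as (barcode-linker)n built up in rounds of 96 barcodes, so the number of distinct bead barcodes grows as a power of 96. The sketch below illustrates that arithmetic. The number of rounds is an assumption (the text only details the first round), and the collision estimate assumes barcodes are drawn uniformly at random.

```python
def barcode_space(codes_per_round: int, rounds: int) -> int:
    """Number of distinct barcode combinations after split-and-pool rounds."""
    return codes_per_round ** rounds


def collision_probability(n_beads: int, space: int) -> float:
    """Probability that a given bead shares its barcode with at least one
    other bead, assuming uniform random barcode assignment."""
    return 1.0 - (1.0 - 1.0 / space) ** (n_beads - 1)


# 96 barcodes per round comes from the text; 1 to 3 rounds is illustrative.
for rounds in (1, 2, 3):
    space = barcode_space(96, rounds)
    print(rounds, space, collision_probability(500, space))
```

With a single round only 96 barcodes exist, which is why additional rounds are needed before hundreds of cells can be told apart.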
  • after the microspheres are washed, they are annealed with the complementary sequence CGAATGCTCTGGCCTCAAGCACGTGGAT (SEQ ID NO: 9) to form a partial double-stranded structure, finally yielding microspheres with the partial double-stranded structure attached.
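The annealing step can be checked in silico from the sequences listed above. The sketch below takes the reverse complement of SEQ ID NO: 9 and tests whether it begins with the hybridization sequence of strand I (SEQ ID NO: 12) and ends with the constant 5' part of chain A (SEQ ID NO: 5). The helper name is hypothetical; the sequences are copied from the text.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


HYBRIDIZATION = "ATCCACGTGCTTGAG"           # SEQ ID NO: 12, 3' end of strand I
STRAND_II = "CGAATGCTCTGGCCTCAAGCACGTGGAT"  # SEQ ID NO: 9
CHAIN_A_5P = "AGGCCAGAGCATTCG"              # constant 5' part of SEQ ID NO: 5

footprint = reverse_complement(STRAND_II)
pairs_with_strand_i = footprint.startswith(HYBRIDIZATION)
pairs_with_chain_a = footprint.endswith(CHAIN_A_5P)
print(footprint, pairs_with_strand_i, pairs_with_chain_a)
```

Both checks hold for the listed sequences: strand II acts as a splint that brings the 3' end of strand I next to the phosphorylated 5' end of chain A.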
  • the human 293T cell line was resuspended in lysis buffer (10mM Tris-Cl, pH 7.4; 10mM NaCl; 3mM MgCl2; 0.01% NP-40) to lyse the cells to obtain cell nuclei.
  • take 100,000 cell nuclei and react them with the p-Tn5 and Tn5-B obtained in step (1).
  • the reaction system is as follows:
  • microfluidic chip as shown in Figure 8 is used for cell labeling, the bead channel: 100um, and the nuclei channel: 50um.
  • Cell nucleus solution 1ml (100 cell nucleus/ul concentration), including: 200ul 10xT4 DNA ligase Buffer, 10ul T4 DNA ligase, 10ul 1M DTT, 780ul nucleus/water.
  • Bead solution (100 bead/ul concentration): Bead in PBS.
  • the cell nucleus solution, bead solution and oil form droplets of 120um diameter on the microfluidic chip; the droplets are collected and ligated at 37°C for 1 hour.
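The stated droplet diameter and nucleus concentration allow a rough estimate of droplet occupancy. The sketch below assumes spherical droplets, Poisson loading, and equal volume contributions from the nucleus and bead streams. The equal split is an assumption: the text gives no flow rates, although ligase buffer at 2x in the nucleus solution (200ul of 10x in 1ml) would be consistent with a 1:1 dilution.

```python
import math


def droplet_volume_nl(diameter_um: float) -> float:
    """Volume of a spherical droplet in nanolitres (1 nL = 1e6 cubic micrometres)."""
    radius_um = diameter_um / 2.0
    return (4.0 / 3.0) * math.pi * radius_um ** 3 / 1e6


def mean_per_droplet(conc_per_ul: float, diameter_um: float, stream_fraction: float) -> float:
    """Expected particles per droplet contributed by one aqueous stream."""
    stream_volume_ul = droplet_volume_nl(diameter_um) * stream_fraction / 1000.0
    return conc_per_ul * stream_volume_ul


def poisson_pmf(k: int, lam: float) -> float:
    """Poisson probability of exactly k particles for mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


# 120 um and 100 nuclei/ul come from the text; the 0.5 split is an assumption.
lam_nuclei = mean_per_droplet(100, 120, stream_fraction=0.5)
p_one = poisson_pmf(1, lam_nuclei)
p_multi = 1.0 - poisson_pmf(0, lam_nuclei) - p_one
print(f"volume={droplet_volume_nl(120):.3f} nL, mean={lam_nuclei:.3f}, "
      f"P(1)={p_one:.4f}, P(>=2)={p_multi:.5f}")
```

Under these assumptions most droplets hold no nucleus and droplets with two or more are rare. Gel beads are often loaded more regularly than Poisson, so the same formula may understate bead occupancy.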
  • add an equal volume of perfluorooctanol to the droplets from step D to break the emulsion, centrifuge, extract the aqueous phase, purify the DNA in the aqueous phase with a Qiagen DNA purification kit, and amplify the DNA with the following reaction system to obtain the final sequencing library: 36ul DNA template, 10ul 5xPCR Buffer, 1ul 10mM dNTP, 1ul 10uM primer TrueseqD501, 1ul 10uM primer N701, 1ul Taq; 94°C 2min, 94°C 30sec, 55°C 30sec, 72°C 30sec, 18 cycles.
  • Primer TrueseqD501 sequence AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT (SEQ ID NO: 10)
  • Primer N701 sequence CAAGCAGAAGACGGCATACGAGATATCGGCTAGTCTCGTGGGCTCGG (SEQ ID NO: 11)
  • sequencing is performed on an Illumina Novaseq at 100,000 PE150 reads per cell.
  • about 500 cells were loaded for the above library, each cell was sequenced to 100,000 PE150 reads, and the total data volume was 15G.
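The stated total of 15G can be reproduced from the other numbers if "PE150 reads" is read as read pairs of 2 x 150 bases. That reading is an assumption made for this sketch; it is the one that matches the stated total.

```python
def total_bases(n_cells: int, read_pairs_per_cell: int, read_length: int, paired: bool = True) -> int:
    """Total sequenced bases for a library; paired-end reads count both mates."""
    mates = 2 if paired else 1
    return n_cells * read_pairs_per_cell * read_length * mates


bases = total_bases(n_cells=500, read_pairs_per_cell=100_000, read_length=150)
print(f"{bases / 1e9:.1f} Gb")
```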
  • the sequenced fragment size distribution shows the typical ATAC nucleosome ladder (see Figure 5), and the signal is enriched at transcription start sites (TSS), as is typical for ATAC (see Figures 6A and 6B).
  • the peaks overlap with known open chromatin regions (see Figure 7): the total number of peaks is 11898, and the ratio of peaks overlapping the union DHS set (Peaks overlapped with union DHS ratio) is 74.0%.
  • Peaks overlapped with blacklist ratio: the ratio of peaks that overlap blacklisted regions
  • FRiP (fraction of reads in peaks): the fraction of reads that fall within peak regions
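FRiP can be computed directly from peak intervals and read positions. The following is a minimal sketch under stated assumptions: peaks are non-overlapping half-open intervals (chrom, start, end), each read is reduced to a single position, and the function names and example data are hypothetical.

```python
from bisect import bisect_right


def build_index(peaks):
    """Group (chrom, start, end) peaks by chromosome and sort them by start."""
    index = {}
    for chrom, start, end in peaks:
        index.setdefault(chrom, []).append((start, end))
    for intervals in index.values():
        intervals.sort()
    return index


def in_peak(index, chrom, pos):
    """True if pos lies in a peak on chrom (start inclusive, end exclusive)."""
    intervals = index.get(chrom, [])
    # Last interval whose start is <= pos.
    i = bisect_right(intervals, (pos, float("inf"))) - 1
    return i >= 0 and intervals[i][0] <= pos < intervals[i][1]


def frip(index, reads):
    """Fraction of (chrom, pos) reads that fall inside any peak."""
    if not reads:
        return 0.0
    return sum(in_peak(index, chrom, pos) for chrom, pos in reads) / len(reads)


peaks = [("chr1", 100, 200), ("chr1", 300, 400)]
reads = [("chr1", 150), ("chr1", 250), ("chr1", 300), ("chr2", 150)]
score = frip(build_index(peaks), reads)
```

The overlap ratios with DHS or blacklist regions follow the same pattern, with peaks tested against the reference intervals instead of reads against peaks.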
  • CUT&Tag is a recent method for studying DNA-protein interactions that replaces the traditional ChIP-seq method. It uses a fusion of protein A (a cell-derived protein that binds the conserved region of antibody heavy chains from different species) with Tn5: through the binding of protein A to the antibody, the Tn5 enzyme is targeted to the protein bound by the antibody, and the transposition activity of Tn5 inserts the DNA fragment directly into the DNA region bound by the target protein. The product is amplified and sequenced to obtain the binding positions of the protein directly.
  • the molecular product of CUT&Tag is the same as that of ATAC; the difference is that in ATAC the Tn5 insertion sites lie in open chromatin regions, whereas in CUT&Tag they lie around the target protein, so the product can be labeled with a method similar to the ATAC method of Example 1.
  • the DNA transposon used is similar to that of ATAC, and a single or a hybrid Tn5 transposition complex can likewise be assembled. The differing steps are: a protein A-Tn5 or protein G-Tn5 fusion protein is used to assemble the Tn5 transposition complex; and, in addition to the ATAC Tn5 sequence, the DNA transposon can contain antibody identification codes at different positions to distinguish multiple antibodies.
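The antibody identification codes mentioned above imply a demultiplexing step after sequencing. The sketch below shows one way it could work. The code sequences, their length, the antibody names and the one-mismatch tolerance are all hypothetical; the text does not specify them.

```python
# Hypothetical identification codes; the text does not list actual sequences.
ANTIBODY_CODES = {
    "ACGTAC": "antibody_A",
    "TGCATG": "antibody_B",
}


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def assign_antibody(code: str, max_mismatches: int = 1):
    """Return the antibody whose code is uniquely within max_mismatches, else None."""
    hits = [name for known, name in ANTIBODY_CODES.items()
            if len(known) == len(code) and hamming(known, code) <= max_mismatches]
    return hits[0] if len(hits) == 1 else None


print(assign_antibody("ACGTAC"), assign_antibody("ACGTAA"), assign_antibody("AAAAAA"))
```

Error tolerance only works when the codes differ from one another at several positions, so a real code set would be designed with a minimum pairwise distance.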
  • Sample preparation: the sample can be non-fixed cells or nuclei, cells or nuclei fixed with formaldehyde (or another fixative), non-fixed or fixed tissue sections, etc.
  • the fixed or non-fixed sample is treated with a buffer containing detergents (Triton, NP-40, Digitonin, etc.), optionally with an intermediate step of lysing the cells (non-fixed samples) to obtain cell nuclei; this permeabilizes the cells and nuclei so that the Tn5 enzyme can enter the nucleus and act.
  • Transposition reaction: the sample is bound with the protein A-Tn5 fusion protein (primary antibody-protein A-Tn5 fusion protein complex), excess enzyme is washed away, and Tn5 reaction solution containing divalent ions is then added to the sample to carry out the transposition reaction (37°C, 30 minutes to 2 hours).
  • the nucleotide tag has two strands that form partial double-stranded structure 1, as shown below:
  • Strand I: solid support - attachment sequence - barcode sequence - hybridization sequence (a fixed sequence that hybridizes with the complementary part of strand II), wherein the barcode sequence is (barcode-linker)n, with n greater than or equal to 1.
  • Bead-acrydite-S-S-ACACTCTTTCCCTACACGACGCTCTTCCGATCT (read1, SEQ ID NO: 6)
  • barcode-ATCCACGTGCTTGAG (SEQ ID NO: 12)
  • Strand II: hybridization sequence (a fixed sequence that hybridizes with the fixed DNA sequence in strand I) - a sequence complementary to the 5' end of the first strand (chain A) of the transposon complex
  • the solid support is polyacrylamide microspheres, which are prepared by a microfluidic device.
  • an acrylamide/Bis mixture, acrydite-DNA primer and APS initiator are mixed in a microfluidic device to form droplets, which contain TEMED catalyst; the droplets spontaneously polymerize into gel microspheres, which are then labeled according to the barcode synthesis method.
  • the solution contains 10mM DTT, and the S-S bond can be reduced to release the primer.
  • Chain A: phosphate group - a sequence at least partially complementary to the fixed DNA sequence in strand I or strand II of the nucleotide tag - (UMI) - the sequence bound by Tn5 transposase
  • AGGCCAGAGCATTCGNNNNNNNAGATGTGTATAAGAGACAG (SEQ ID NO: 5)
  • Chain B: Tn5 transposase binding sequence (complementary to the Tn5-binding sequence in chain A) - phosphate group
  • the UMI in chain A is optional; the sequences in (1) and (2) can contain modified bases, such as 5mC.
  • the Tn5 transposition complex is a dimer.
  • Two pA-Tn5 proteins can bind to the same or different partial double-stranded DNA transposons, so that the insertion site is marked by one or two types of DNA;
  • the pA-Tn5 protein can contain a hyperactive point mutant or another type of transposase.
  • pA-Tn5 protein and the annealed double-stranded primer are mixed at equimolar concentrations and kept at room temperature for more than 1 hour to form a functional transposon complex.
  • samples can be non-fixed cells or nuclei, formaldehyde (or other fixatives) fixed cells or nuclei, non-fixed or fixed tissue sections, etc.
  • the fixed or non-fixed sample is treated with a buffer containing detergents (Triton, NP-40, Digitonin, etc.), optionally with an intermediate step of lysing the cells (non-fixed samples) to obtain cell nuclei; this permeabilizes the cells and nuclei so that the antibody and the pA-Tn5 enzyme can enter the nucleus and act.
  • typical permeabilization solutions can include Tris, sucrose, sodium chloride and detergents.
  • the antibody against the target protein is incubated with the sample so that the antibody specifically binds the target protein, and unbound antibody is removed by washing. The pA-Tn5 transposome is then incubated with the sample so that the pA-Tn5 protein binds the antibody and is thereby positioned near the target protein.
  • Tn5 enzyme buffer containing a divalent metal ion (for example, magnesium ion)
  • the transposition reaction: 37°C, 30 minutes to 2 hours. The reaction system includes: cells, cell nuclei or tissue; buffer. After the reaction, the sample is washed with buffer to remove unreacted reagents.
  • the reaction system includes: cells, nuclei or tissue (after the transposition reaction); T4 DNA ligase; nucleotide tag. After the reaction, an excess of free sequence complementary to the nucleotide tag is added to the ligation reaction system to block excess unreacted nucleotide tag.
  • A. pA-Tn5 transposome
  • the pA-Tn5 transposome was assembled at a concentration of 10uM.
  • the transposome formed by the Top1/Bottom double strand and Tn5 is p-pA-Tn5
  • the transposome formed by the Top2/Bottom double strand and Tn5 is pA-Tn5-B.
  • the PCR adaptor sequence is ACACTCTTTCCCTACACGACGCTCTTCCGATCT (SEQ ID NO: 6)
  • linker sequence 1 (Linker1) is CGACTCACTACAGGG (SEQ ID NO: 7)
  • linker sequence 2 (Linker2) is TCGGTGACACGATCG (SEQ ID NO: 8)
  • the synthesized microspheres were evenly distributed into a 96-well plate, a different PCR handle-barcode1-linker1 oligo (96 in total) was added to each well, and the first round of barcoding reaction was performed.
  • the reaction system and process are as follows: 10ul microspheres + 2ul BstI buffer + 1ul 10uM dNTP + 1ul 100uM PCR handle-96xbarcode1-linker1; hold at 95°C for 5min, then at 60°C for 20min; then add 1ul BstI + 5ul H2O and hold at 60°C for 60min.
  • after the microspheres are washed, they are annealed with the complementary sequence CGAATGCTCTGGCCTCAAGCACGTGGAT (SEQ ID NO: 9) to form a partial double-stranded structure, finally yielding microspheres with the partial double-stranded structure attached.
  • the human cell line 293T was resuspended in lysis buffer (10mM Tris-Cl, pH 7.4; 10mM NaCl; 3mM MgCl2; 0.01% NP-40) to lyse the cells and obtain cell nuclei.
  • the binding conditions are as follows: in a buffer of 0.05% Digitonin, 20mM HEPES pH 7.5, 300mM NaCl, 0.5mM Spermidine and 1X Protease inhibitor (Roche), with the antibody at 1ug/100ul, bind at room temperature for 1hr or at 4°C overnight.
  • the target protein antibody is, for example, an anti-histone H3K4me3 antibody (Abcam).
  • microfluidic chip as shown in Figure 8 is used for cell labeling, the bead channel: 100um, and the nuclei channel: 50um.
  • Cell nucleus solution (100 cell nucleus/ul concentration), including: 200ul 10xT4 DNA ligase Buffer, 10ul T4 DNA ligase, 10ul 1M DTT, 780ul nucleus/water.
  • Bead solution (100 bead/ul concentration): Bead in PBS.
  • the cell nucleus solution, bead solution and oil form droplets of 120um diameter on the microfluidic chip; the droplets are collected and ligated at 37°C for 1 hour.
  • add an equal volume of perfluorooctanol to the droplets from step D to break the emulsion, centrifuge, extract the aqueous phase, purify the DNA in the aqueous phase with a Qiagen DNA purification kit, and amplify the DNA with the following reaction system to obtain the final sequencing library: 36ul DNA template, 10ul 5xPCR Buffer, 1ul 10mM dNTP, 1ul 10uM primer TrueseqD501, 1ul 10uM primer N701, 1ul Taq; 94°C 2min, 94°C 30sec, 55°C 30sec, 72°C 30sec, 18 cycles.
  • Primer TrueseqD501 sequence AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT (SEQ ID NO: 10)
  • Primer N701 sequence CAAGCAGAAGACGGCATACGAGATATCGGCTAGTCTCGTGGGCTCGG (SEQ ID NO: 11)
  • sequencing is performed on an Illumina Novaseq at 100,000 PE150 reads per cell.
  • about 500 cells were loaded for the above library, each cell was sequenced to 100,000 PE150 reads, and the total data volume was 15G.
  • Figure 13 shows the fragment size distribution of the CUT&Tag library.
  • Figure 14 shows the distribution of CUT&Tag fragments at transcription start sites.
  • Figure 15 shows the proportion of CUT&Tag fragments distributed across the genome.
  • Figure 16 shows the distribution of the single-cell CUT&Tag results.
  • the superimposed single-cell data show the typical distribution of the H3K4me3 histone modification and are highly similar to the results obtained from multi-cell samples, indicating that the single-cell data obtained by this method are genuine and accurate.
  • the sample is treated with a buffer containing detergents (Triton, NP-40, Digitonin, etc.), optionally with an intermediate step of lysing the cells (non-fixed samples) to obtain nuclei; detergent lysis or permeabilization of the cells and nuclei allows molecular biology reagents such as enzymes to enter the cell or nucleus.
  • (3) using the reverse transcription primer of step (1), provide a reverse transcriptase reaction system, add a template-switching oligo (TSO), and perform an in-cell reverse transcription reaction on the sample. After the reaction, the cells/nuclei remain independent and intact.
  • the reaction system and conditions are as follows: cells/tissue, reverse transcriptase buffer, RNase inhibitor, dNTP, TSO template-switching primer, reverse transcription primer; 50-55°C for 5 minutes, 4°C, add reverse transcriptase, 42°C. Wash to remove the primers and enzyme system, and carry out the nucleotide tag ligation reaction on the cells or tissue. After ligation, add primers to neutralize the excess primers.
  • Purify mRNA/cDNA: purify mRNA/cDNA directly from non-fixed tissue, or after reversing the crosslinks for fixed tissue; PCR-amplify the cDNA to obtain a cDNA library, and construct a sequencing library from the cDNA library using Tn5 or another DNA fragmentation method.
  • Cell nuclei: homogenize the tissue in 10mM Tris-Cl, pH 7.4; 10mM NaCl; 3mM MgCl2; 0.01% NP-40 buffer to lyse the cells, centrifuge at 500g for 5min, resuspend once in buffer, centrifuge at 500g for 5min, and resuspend in the above buffer.
  • each component is as follows: 1000/ul cell nuclei, 1x RT Buffer, 1uM dNTP, 1uM of the above reverse transcription primer, 1u/ul RNase inhibitor, 1uM TSO primer (sequence 5′-AAGCAGTGGTATCAACGCAGAGTACATrGrGrG-3′, SEQ ID NO: 14, where the G at the 3′ end can be rG; rG denotes riboguanosine), 1 unit/ul RT enzyme (Superscript II reverse transcriptase); reaction conditions: 50°C 5min, 4°C 5min, 42°C 60min; wash the nuclei with PBS, centrifuging at 500g for 5min, twice, to remove unreacted enzymes and primers.
  • microfluidic chip as shown in Figure 8 is used for cell labeling, the bead channel: 100um, and the nuclei channel: 50um.
  • Cell nucleus solution 1ml (100 cell nucleus/ul concentration), including: 200ul 10xT4 DNA ligase Buffer, 10ul T4 DNA ligase, 10ul 1M DTT, 780ul nucleus/water.
  • Bead solution (100 bead/ul concentration): Bead in PBS.
  • the cell nucleus solution, bead solution and oil form droplets of 120um diameter on the microfluidic chip; the droplets are collected and ligated at 37°C for 1 hour.
  • reaction system: 25ul of the above reaction system, 1ul 10uM primer TrueseqD501, 1ul 10uM Nextera N701 primer, 1ul Taq enzyme; 72°C 5min, 94°C 2min, 94°C 30sec, 60°C 30sec, 72°C 3sec, 18 cycles. Purify the library with AMPure XP magnetic beads at a 1:1 volume ratio.
  • sequencing is performed on an Illumina Novaseq at 100,000 PE150 reads per cell.
  • about 500 cells were loaded for the above library, each cell was sequenced to 100,000 PE150 reads, and the total data volume was 15G.
  • Figure 17 shows that the single-cell results clearly separate individual cells of the two cell types.
  • Figure 18 shows the distribution of the number of transcripts and genes detected in each cell. The method of this application can be used for single-cell transcriptome detection.
  • a single-cell genome experiment can be performed on a mixture of 293T (human) and 3T3 (mouse) cells using the method of this application. Based on the fraction of each cell's sequences that align to the human or the mouse genome, Figure 19 shows that the single-cell results clearly separate individual cells of the two cell types.
  • Figure 20 shows the genome coverage of single human cells, arranged by chromosome; coverage differs between cells and between genomic sites.
  • the method of this application can be used for single-cell genome detection.
  • the method of the present application can also be used to distinguish single cells of different cell types in mixed cells through genome and transcriptome detection.
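The human/mouse mixing experiment classifies each cell barcode by the share of its reads that align to each genome. The following is a minimal sketch of that decision rule. The 90% purity threshold and the label names are assumptions, not values from the text.

```python
def classify_cell(human_reads: int, mouse_reads: int, purity: float = 0.9) -> str:
    """Label a cell barcode by the dominant genome among its aligned reads."""
    total = human_reads + mouse_reads
    if total == 0:
        return "empty"
    if human_reads / total >= purity:
        return "human"
    if mouse_reads / total >= purity:
        return "mouse"
    return "mixed"  # likely two cells sharing one barcode, or ambient contamination


# Hypothetical per-barcode read counts: (human, mouse).
counts = {"bc1": (950, 50), "bc2": (20, 980), "bc3": (500, 500)}
labels = {barcode: classify_cell(h, m) for barcode, (h, m) in counts.items()}
print(labels)
```

The share of barcodes labelled "mixed" gives a direct estimate of how often two cells end up under one barcode.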
  • the nucleotide tag is the same as in Example 1, except that in the 5'amine-S-S-ACACTCTTTCCCTACACGACGCTCTTCCGATCT (SEQ ID NO: 6) sequence coupled to the microsphere, all C bases are replaced with 5mC modified bases.
  • Sample: fixed single cells or cell nuclei.
  • Sample processing method: cells or cell nuclei are treated with a certain concentration of SDS and/or other detergents and heated for a certain period of time to remove the proteins bound to the DNA without reversing the cross-links, so that the DNA remains fixed in the cell structure.
  • the reaction system includes: cells, cell nuclei or tissue (after the transposition reaction); T4 DNA ligase; nucleotide tag. After the reaction, an excess of free sequence complementary to the nucleotide tag is added to the ligation reaction system to block the excess unreacted nucleotide tag.
  • CNV: copy number information
  • SNV: point mutation information
  • the ligation primer is designed with conversion-resistant or modified bases so that it can still be amplified after conversion.
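The reason for using modified bases in the bead-coupled primer can be shown with an in silico conversion. Bisulfite treatment converts unmethylated C so that it reads as T after amplification, while 5mC is protected. The sketch below applies that rule to SEQ ID NO: 6. The function name is hypothetical, and conversion is idealized as complete and error-free.

```python
def bisulfite_convert(seq: str, methylated_positions=frozenset()) -> str:
    """Idealized conversion: every C reads as T unless its 0-based index is methylated."""
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq)
    )


ADAPTER = "ACACTCTTTCCCTACACGACGCTCTTCCGATCT"  # SEQ ID NO: 6
all_c_positions = frozenset(i for i, base in enumerate(ADAPTER) if base == "C")

unprotected = bisulfite_convert(ADAPTER)               # ordinary C in the adapter
protected = bisulfite_convert(ADAPTER, all_c_positions)  # every C replaced by 5mC
print(unprotected)
print(protected)
```

Without protection the adapter would no longer match the TrueseqD501 primer, whose 3' end has the same sequence as SEQ ID NO: 6; with 5mC at every C position the priming site survives conversion.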
  • microfluidic chip as shown in Figure 8 is used for cell labeling, the bead channel: 100um, and the nuclei channel: 50um.
  • Cell nucleus solution 1ml (100 cell nucleus/ul concentration), including: 200ul 10xT4 DNA ligase Buffer, 10ul T4 DNA ligase, 10ul 1M DTT, 780ul nucleus/water.
  • Bead solution (100 bead/ul concentration): Bead in PBS.
  • the cell nucleus solution, bead solution and oil form droplets of 120um diameter on the microfluidic chip; the droplets are collected and ligated at 37°C for 1 hour.
  • Primer TrueseqD501 sequence AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT (SEQ ID NO: 10)
  • Primer N701 sequence CAAGCAGAAGACGGCATACGAGATATCGGCTAGTCTCGTGGGCTCGG (SEQ ID NO: 11)
  • Illumina Novaseq measures 100,000 PE150 reads per cell.
  • use a conversion kit such as the EpiTect Fast Bisulfite Conversion Kit or the NEB Enzymatic Methylation conversion kit to convert the genomic DNA obtained above; taking the Qiagen kit as an example, prepare the bisulfite conversion reagent according to the instructions.
  • Primer TrueseqD501 sequence AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT (SEQ ID NO: 10)
  • Primer N701 sequence CAAGCAGAAGACGGCATACGAGATATCGGCTAGTCTCGTGGGCTCGG (SEQ ID NO: 11)
  • Illumina Novaseq measures 100,000 PE150 reads per cell.
  • Recover DNA: 12.5 µL 4X Enzyme Reaction Buffer, 10ul 5-hmC Modifying Enzyme, add water to 50ul, and react at 30°C for 1hr. Purify the DNA with a 1:1 volume of magnetic beads.
  • Primer TrueseqD501 sequence AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT (SEQ ID NO: 10)
  • Primer N701 sequence CAAGCAGAAGACGGCATACGAGATATCGGCTAGTCTCGTGGGCTCGG (SEQ ID NO: 11)
  • Illumina Novaseq measures 100,000 PE150 reads per cell.
  • Figure 23 shows the results of the distribution of 5hmC modification sites in a single cell.
  • the single-cell 5hmC modification data obtained in this application are genuine and accurate.
  • a dT primer and a Tn5 enzyme carrying the same 5' end linking sequence are used. Prepare the cell nuclei, perform the RT (reverse transcription) reaction on them, wash to remove the RT reaction system, and then perform the Tn5 ATAC reaction, so that the mRNA and the ATAC fragments in the cell are labeled at the same time. Then ligate to the primer released from the microsphere, and recover the mixture of ATAC DNA and RT mRNA/cDNA.
  • the human cell line 293T was resuspended in lysis buffer (10mM Tris-Cl, pH 7.4; 10mM NaCl; 3mM MgCl2; 0.01% NP-40) to lyse the cells and obtain cell nuclei.
  • take 100,000 cell nuclei and react them with the p-Tn5 and Tn5-B obtained in the examples of this application.
  • the reaction system is as follows:
  • microfluidic chip as shown in Figure 8 is used for cell labeling, the bead channel: 100um, and the nuclei channel: 50um.
  • Cell nucleus solution 1ml (100 cell nucleus/ul concentration), including: 200ul 10xT4 DNA ligase Buffer, 10ul T4 DNA ligase, 10ul 1M DTT, 780ul nucleus/water.
  • Bead solution (100 bead/ul concentration): Bead in PBS.
  • the cell nucleus solution, bead solution and oil (FC40 fluorocarbon oil containing 1% surfactant FluoroSurfactant, Ran Biotech) form droplets of 120um diameter on the microfluidic chip; the droplets are collected and ligated at 37°C for 1 hour.
  • use the following reaction system to amplify the DNA and obtain the final sequencing library: 36ul DNA template, 10ul 5xPCR Buffer, 1ul 10mM dNTP, 1ul 10uM primer TrueseqD501, 1ul 10uM primer N701, 1ul 10mM ISPCR primer, 1ul Taq; 72°C 5min, 94°C 2min, 94°C 30sec, 55°C 30sec, 72°C 3min, 12 cycles.
  • Primer TrueseqD501 sequence AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT (SEQ ID NO: 10)
  • Primer N701 sequence CAAGCAGAAGACGGCATACGAGATATCGGCTAGTCTCGTGGGCTCGG (SEQ ID NO: 11)
  • reaction system: 25ul of the above reaction system, 1ul 10uM primer TrueseqD501, 1ul 10uM Nextera N701 primer, 1ul Taq enzyme; 72°C 5min, 94°C 2min, 94°C 30sec, 60°C 30sec, 72°C 3sec, 18 cycles.
  • Illumina Novaseq measures 100,000 PE150 reads per cell.
  • the method of this application was used to detect the transcriptome and the ATAC signal of the same cell at the same time.
  • Figure 24 shows that individual cells of the two cell types can be well distinguished based on both the transcriptome and the ATAC genome data.
  • the method of this application can accurately detect the transcriptome and the ATAC signal of the same cell simultaneously.
  • the human cell line 293T was resuspended in lysis buffer (10mM Tris-Cl, pH 7.4; 10mM NaCl; 3mM MgCl2; 0.01% NP-40) to lyse the cells and obtain cell nuclei.
  • the binding conditions are as follows: in a buffer of 0.05% Digitonin, 20mM HEPES pH 7.5, 300mM NaCl, 0.5mM Spermidine and 1X Protease inhibitor (Roche), with the antibody at 1ug/100ul, bind at room temperature for 1hr or at 4°C overnight.
  • the target protein antibody is, for example, an anti-histone H3K4me3 antibody (Abcam).
  • the cell nuclei obtained above are subjected to the following RT reaction; the final concentration of each component is as follows: 1000/ul cell nuclei, 1x RT Buffer, 1uM dNTP, 1uM of the above reverse transcription primer, 1u/ul RNase inhibitor, 1uM TSO primer (sequence 5′-AAGCAGTGGTATCAACGCAGAGTACATrGrGrG-3′, SEQ ID NO: 14, where the G at the 3′ end can be rG; rG denotes riboguanosine), 1 unit/ul RT enzyme (Superscript II reverse transcriptase); reaction conditions: 50°C for 5 minutes, 4°C for 5 minutes, 42°C for 60 minutes; wash the nuclei with PBS, centrifuging at 500g for 5 minutes, twice, to remove unreacted enzymes and primers.
  • microfluidic chip as shown in Figure 8 is used for cell labeling, the bead channel: 100um, and the nuclei channel: 50um.
  • Cell nucleus solution 1ml (100 cell nucleus/ul concentration), including: 200ul 10xT4 DNA ligase Buffer, 10ul T4 DNA ligase, 10ul 1M DTT, 780ul nucleus/water.
  • Bead solution (100 bead/ul concentration): Bead in PBS.
  • the cell nucleus solution, bead solution and oil form droplets of 120um diameter on the microfluidic chip; the droplets are collected and ligated at 37°C for 1 hour.
  • use the following reaction system to amplify the DNA and obtain the final sequencing library: 36ul DNA template, 10ul 5xPCR Buffer, 1ul 10mM dNTP, 1ul 10uM primer TrueseqD501, 1ul 10uM primer N701, 1ul 10mM ISPCR primer, 1ul Taq; 72°C 5min, 94°C 2min, 94°C 30sec, 55°C 30sec, 72°C 3min, 12 cycles.
  • Primer TrueseqD501 sequence AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT (SEQ ID NO: 10)
  • Primer N701 sequence CAAGCAGAAGACGGCATACGAGATATCGGCTAGTCTCGTGGGCTCGG (SEQ ID NO: 11)
  • DNA template, 10ul 5xPCR Buffer, 1ul 10mM dNTP, 1ul 10uM primer TrueseqD501, 1ul 10uM primer N701, 1ul Taq; 94°C 2min, 94°C 30sec, 55°C 30sec, 72°C 30sec, 18 cycles.
  • reaction system: 25ul of the above reaction system, 1ul 10uM primer TrueseqD501, 1ul 10uM Nextera N701 primer, 1ul Taq enzyme; 72°C 5min, 94°C 2min, 94°C 30sec, 60°C 30sec, 72°C 3sec, 18 cycles.
  • Illumina Novaseq measures 100,000 PE150 reads per cell.
  • Figure 25 shows that individual cells of the two cell types can be well distinguished based on both the transcriptome and the CUT&Tag data.
  • the method of this application can accurately detect the transcriptome and the CUT&Tag signal of the same cell simultaneously.
  • the sample is processed in the same way as when detecting genomic DNA alone.
  • the nuclei are first stripped, then the Tn5 transposition reaction is performed, followed by the RT (reverse transcription) reaction, and processing then proceeds in the same manner as in Example 5.
  • microfluidic chip as shown in Figure 8 is used for cell labeling, the bead channel: 100um, and the nuclei channel: 50um.
  • Cell nucleus solution 1ml (100 cell nucleus/ul concentration), including: 200ul 10xT4 DNA ligase Buffer, 10ul T4 DNA ligase, 10ul 1M DTT, 780ul nucleus/water.
  • Bead solution (100 bead/ul concentration): Bead in PBS.
  • the cell nucleus solution, bead solution and oil form droplets of 120um diameter on the microfluidic chip; the droplets are collected and ligated at 37°C for 1 hour.
  • use the following reaction system to amplify the DNA and obtain the final sequencing library: 36ul DNA template, 10ul 5xPCR Buffer, 1ul 10mM dNTP, 1ul 10uM primer TrueseqD501, 1ul 10uM primer N701, 1ul 10mM ISPCR primer, 1ul Taq; 72°C 5min, 94°C 2min, 94°C 30sec, 55°C 30sec, 72°C 3min, 12 cycles.
  • Primer TrueseqD501 sequence AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT (SEQ ID NO: 10)
  • Primer N701 sequence CAAGCAGAAGACGGCATACGAGATATCGGCTAGTCTCGTGGGCTCGG (SEQ ID NO: 11)
  • reaction system: 25ul of the above reaction system, 1ul 10uM primer TrueseqD501, 1ul 10uM Nextera N701 primer, 1ul Taq enzyme; 72°C 5min, 94°C 2min, 94°C 30sec, 60°C 30sec, 72°C 3sec, 18 cycles.
  • Illumina Novaseq measures 100,000 PE150 reads per cell.
  • the method of this application was used to detect the transcriptome and the genome of the same cell at the same time, and individual cells of the two cell types can be well distinguished. It is accurate in detecting the transcriptome and cut tag of the same cell at the same time.
  • the cell nucleus obtained above undergoes the following RT reaction
  • microfluidic chip as shown in Figure 8 is used for cell labeling, the bead channel: 100um, and the nuclei channel: 50um.
  • Cell nucleus solution 1ml (100 cell nucleus/ul concentration), including: 200ul 10xT4 DNA ligase Buffer, 10ul T4 DNA ligase, 10ul 1M DTT, 780ul nucleus/water.
  • Bead solution (100 bead/ul concentration): Bead in PBS.
  • the cell nucleus solution, bead solution and oil form droplets of 120um diameter on the microfluidic chip; the droplets are collected and ligated at 37°C for 1 hour.
  • use a conversion kit such as the EpiTect Fast Bisulfite Conversion Kit or the NEB Enzymatic Methylation conversion kit to convert the genomic DNA obtained above; taking the Qiagen kit as an example, prepare the bisulfite conversion reagent according to the instructions.
  • Primer TrueseqD501 sequence AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT (SEQ ID NO: 10)
  • Primer N701 sequence CAAGCAGAAGACGGCATACGAGATATCGGCTAGTCTCGTGGGCTCGG (SEQ ID NO: 11)
  • Illumina Novaseq measures 100,000 PE150 reads per cell.
  • Recover DNA: 12.5 µL 4X Enzyme Reaction Buffer, 10ul 5-hmC Modifying Enzyme, add water to 50ul, and react at 30°C for 1hr. Purify the DNA with a 1:1 volume of magnetic beads.
  • Primer TrueseqD501 sequence AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT (SEQ ID NO: 10)
  • Primer N701 sequence CAAGCAGAAGACGGCATACGAGATATCGGCTAGTCTCGTGGGCTCGG (SEQ ID NO: 11)
  • Illumina Novaseq measures 100,000 PE150 reads per cell.
  • Reaction system: 25 µL or larger; 1 µL of 10 µM primer TrueseqD501, 1 µL of 10 µM primer Nextera N701, 1 µL Taq enzyme; program: 72°C 5 min, 94°C 2 min, 94°C 30 sec, 60°C 30 sec, 72°C 3 sec, 18 cycles.
  • Illumina Novaseq measures 100,000 PE150 reads per cell.
  • This application is used to detect the transcriptome and methylation of the same cell at the same time.
  • Figure 26 shows that the transcriptome and methylome of the same cell both match well with gene models and known methylation sites.
  • The method of this application can therefore detect the transcriptome and methylation of the same cell simultaneously and accurately.
  • Spatial lattice chip: the chip carries DNA oligo clusters at fixed spatial positions, with the following structure:
  • The spatial lattice is synthesized by microarray in-situ synthesis (Affymetrix, NimbleGene) or by other methods, including transfer from an existing array by PCR and extension by sequential labeling.
  • Tissue sections: mount frozen sections of unfixed tissue on a cover glass, add 1% formaldehyde to fix the tissue, and wash.
  • NNNNNNNN in the lattice DNA sequence is an 8bp specific primer sequence, and each point on the lattice corresponds to a specific 8bp sequence.
  • the OCT-embedded tissue was sliced with a cryostat and attached to a glass slide treated with polylysine.
  • the tissue on the slide was treated with lysis solution (10mM Tris-Cl, pH 7.4; 10mM NaCl; 3mM MgCl 2 ; 0.01% NP-40) at room temperature for 5 minutes.
  • the p-Tn5 and Tn5-B obtained in the examples of this application are used to react on the glass slide, and the reaction system is as follows:
  • The G can be rG, where rG denotes riboguanosine; 1 unit/µL RT enzyme (Superscript II reverse transcriptase); reaction conditions: 50°C for 5 min, 4°C for 5 min, 42°C for 60 min; then wash the section with PBS.
  • The tissue is brought into contact with the glass slide carrying the synthesized primer array, and 1x T4 ligase buffer and 1 unit/µL T4 DNA ligase are added, so that the partially double-stranded adaptors on the slide are ligated to the RT product and the ATAC product on the tissue section.
  • Recover cDNA and ATAC DNA: add proteinase K reaction buffer and proteinase K on top of the section, de-crosslink at 55-65°C, and purify the DNA; then use a Qiagen kit to obtain the genomic DNA and the reverse-transcribed mRNA/cDNA.
  • Primer TrueseqD501 sequence AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT (SEQ ID NO: 10)
  • Primer N701 sequence CAAGCAGAAGACGGCATACGAGATATCGGCTAGTCTCGTGGGCTCGG (SEQ ID NO: 11)
  • Reaction system: 25 µL or larger; 1 µL of 10 µM primer TrueseqD501, 1 µL of 10 µM primer Nextera N701, 1 µL Taq enzyme; program: 72°C 5 min, 94°C 2 min, 94°C 30 sec, 60°C 30 sec, 72°C 3 sec, 18 cycles.
  • Illumina Novaseq measures 100,000 PE150 reads per cell.
  • Figure 28 shows the HE staining of the section superimposed on the spatial lattice chip; the color intensity of each dot represents the number of genes measured at that position.
  • The method of this application can be used as a technology platform for spatial multi-omics research.
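
The spatial lattice described above gives every spot its own 8 bp sequence (the NNNNNNNN in the lattice oligo), so a lattice can address at most 4^8 = 65,536 spots. The following is a minimal sketch of that bookkeeping, not the applicant's software; the function names and the row-major assignment order are assumptions made for illustration.

```python
from itertools import product

BARCODE_LEN = 8  # length of the spot-specific NNNNNNNN sequence


def max_spots(length: int = BARCODE_LEN) -> int:
    """Upper bound on the number of spots a barcode of this length can address."""
    return 4 ** length


def assign_spot_barcodes(rows: int, cols: int) -> dict:
    """Map distinct 8-mers to (row, col) lattice positions (hypothetical ordering)."""
    if rows * cols > max_spots():
        raise ValueError("lattice has more spots than available 8-mers")
    kmers = ("".join(p) for p in product("ACGT", repeat=BARCODE_LEN))
    positions = [(r, c) for r in range(rows) for c in range(cols)]
    # zip stops once every position has received a barcode
    return dict(zip(kmers, positions))


def locate_read(spot_barcode: str, barcode_map: dict):
    """Return the lattice position for a spot barcode, or None if unknown."""
    return barcode_map.get(spot_barcode)
```

A read whose 8 bp spot sequence is found in the map can then be placed back onto the tissue image; reads with unknown sequences return `None`.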

Abstract

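The present application relates to a method for analyzing target nucleic acids from a cell, comprising: 1) providing a discrete partition containing target nucleic acids that are derived from a single cell and to which an oligonucleotide adapter sequence has been added, and a solid support to which at least one oligonucleotide tag is attached, wherein each oligonucleotide tag comprises a first strand and a second strand, the first strand comprises a barcode sequence and a hybridization sequence located at the 3' end of the barcode sequence, the second strand comprises a first portion complementary to the hybridization sequence of the first strand and a second portion complementary to the oligonucleotide adapter sequence attached to the target nucleic acid, and either the first strand and the second strand form a partially double-stranded structure or the second strand and the attached target nucleic acid form a partially double-stranded structure; and 2) in the discrete partition, ligating the oligonucleotide tag to the attached target nucleic acid, thereby producing a barcoded target nucleic acid.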

Description

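Method for analyzing target nucleic acids from cells
Technical Field
The present application relates to the field of biomedicine, and specifically to a method for analyzing target nucleic acids from cells and to related preparations.
Background
Nucleic acid sequencing technology has advanced rapidly and substantially. Sequencing generates large amounts of sequence data that can be used to study and interpret genomes and genomic regions, and it provides information that is widely used in routine biological research and diagnostics. Genome sequencing can be used to obtain information in a wide variety of biomedical contexts, including diagnostics, prognosis, biotechnology, and forensic biology. Sequencing includes Maxam-Gilbert sequencing and chain-termination methods, de novo sequencing methods (including shotgun sequencing and bridge PCR), and next-generation methods, including polony sequencing, 454 pyrosequencing, Illumina sequencing, SOLiD sequencing, Ion Torrent semiconductor sequencing, HeliScope single-molecule sequencing, [image] sequencing, and so on. For most sequencing applications, a sample such as a nucleic acid sample is processed before it is introduced into the sequencer.
Conventional methods for studying genome or transcriptome expression are usually carried out at the multicellular level. The resulting signal is therefore an average over many cells, and information on cellular heterogeneity is lost. For example, current analysis of cellular mRNA content by direct sequencing relies on bulk mRNA obtained from tissue samples containing millions of cells, which means that when gene expression is analyzed in bulk mRNA, much of the functional information present in single cells is lost or obscured; in addition, dynamic processes such as the cell cycle cannot be observed from a population average. Similarly, certain cell types in complex tissues (for example, the brain) can only be studied by analyzing cells individually.
At present there are no suitable cell surface markers for isolating single cells for study, and even where suitable cell surface markers exist, a small number of single cells is still insufficient to capture the range of natural variation in gene expression. An analysis method that can be used to analyze the genetic information in a large number of single cells is therefore needed.
Summary of the Invention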
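The present application provides a method for analyzing target nucleic acids from a cell, the method comprising:
a) providing a discrete partition comprising:
i. target nucleic acids derived from a single cell, wherein at least some of the target nucleic acids have had an oligonucleotide adapter sequence added and have thereby become attached target nucleic acids; and
ii. a solid support to which at least one oligonucleotide tag is attached, wherein each oligonucleotide tag comprises a first strand and a second strand, the first strand comprises a barcode sequence and a hybridization sequence located at the 3' end of the barcode sequence, the second strand comprises a first portion complementary to the hybridization sequence of the first strand and a second portion complementary to the oligonucleotide adapter sequence attached to the target nucleic acid, and either the first strand and the second strand form a partially double-stranded structure or the second strand and the attached target nucleic acid form a partially double-stranded structure;
b) in the discrete partition, ligating the oligonucleotide tag to the attached target nucleic acid, thereby producing a barcoded target nucleic acid.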
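
The strand arrangement in steps a) and b) can be sketched in code. This is an illustrative model under stated assumptions, not the applicant's design: the first strand is read 5' to 3' as barcode followed by hybridization sequence, the adapter sits at the 5' end of the target, the second strand bridges the junction, and all sequences and the `overlap` parameter are hypothetical toy values.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def design_second_strand(hyb: str, adapter: str, overlap: int) -> str:
    """Second strand, written 5'->3'.

    Its second portion (complementary to the first `overlap` bases of the
    adapter) comes first, followed by its first portion (complementary to
    the hybridization sequence of the first strand).
    """
    return revcomp(hyb + adapter[:overlap])


def ligate(first_strand: str, attached_target: str) -> str:
    """Join the 3' end of the first strand to the 5' end of the attached target."""
    return first_strand + attached_target


# Toy sequences, for illustration only
barcode, hyb = "ACGTACGT", "GATTAC"
adapter, target = "TCGTCGGC", "AACCGGTT"
second_strand = design_second_strand(hyb, adapter, overlap=4)
barcoded = ligate(barcode + hyb, adapter + target)
```

In the resulting `barcoded` string the target sequence lies on the 3' side of the barcode, which is the orientation the embodiments below describe.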
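In certain embodiments, the oligonucleotide tag is releasably attached to the solid support.
In certain embodiments, the method comprises releasing the at least one oligonucleotide tag from the solid support and, in b), ligating the released oligonucleotide tag to the attached target nucleic acid, thereby producing a barcoded target nucleic acid.
In certain embodiments, the oligonucleotide tag is attached to the solid support, directly or indirectly, through the 5' end of its first strand.
In certain embodiments, the discrete partition further comprises a ligase, and the ligase ligates the oligonucleotide tag to the attached target nucleic acid.
In certain embodiments, the ligase comprises T4 ligase.
In certain embodiments, in the barcoded target nucleic acid, the target nucleic acid sequence is located at the 3' end of the barcode sequence.
In certain embodiments, the solid support is a bead.
In certain embodiments, the bead is a magnetic bead.
In certain embodiments, the discrete partition is a well or a droplet.
In certain embodiments, the barcode sequence comprises a cell barcode sequence, and the oligonucleotide tags attached to the same solid support comprise the same cell barcode sequence.
In certain embodiments, the cell barcode sequence comprises at least 2 cell barcode segments separated by linker sequences.
In certain embodiments, a) comprises co-partitioning the target nucleic acids derived from a single cell and the solid support with at least one oligonucleotide tag attached into the discrete partition.
In certain embodiments, b) comprises ligating the hybridization sequence of the first strand of the oligonucleotide tag to the oligonucleotide adapter attached to the target nucleic acid, thereby producing the barcoded target nucleic acid.
In certain embodiments, b) comprises hybridizing the second portion of the second strand of the oligonucleotide tag to the oligonucleotide adapter attached to the target nucleic acid, and ligating the hybridization sequence of the first strand of the oligonucleotide tag to the oligonucleotide adapter attached to the target nucleic acid, thereby producing the barcoded target nucleic acid.
In certain embodiments, the attached target nucleic acid comprises a unique molecular identification region.
In certain embodiments, the unique molecular identification region is located between the oligonucleotide adapter sequence and the target nucleic acid sequence.
In certain embodiments, the oligonucleotide tag further comprises an amplification primer recognition region.
In certain embodiments, the amplification primer recognition region is a universal amplification primer recognition region.
In certain embodiments, the method further comprises:
c) obtaining a characterization result of the barcoded target nucleic acid; and
d) identifying the sequence of the target nucleic acid as derived from the single cell based at least in part on the presence of the same cell barcode sequence in the characterization result obtained in c).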
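
Step d) amounts to grouping sequenced reads by their cell barcode. The sketch below assumes, purely for illustration, that the cell barcode occupies the first `barcode_len` bases of each read; real reads would also contain linker, hybridization and adapter sequences that this simplified parser ignores.

```python
from collections import defaultdict


def group_by_cell_barcode(reads, barcode_len):
    """Group read inserts by the cell barcode found at the start of each read.

    reads: iterable of read strings (5'->3').
    Returns {cell_barcode: [insert, ...]}; reads sharing a barcode are
    attributed to the same single cell.
    """
    cells = defaultdict(list)
    for read in reads:
        if len(read) <= barcode_len:
            continue  # too short to carry both a barcode and an insert
        barcode, insert = read[:barcode_len], read[barcode_len:]
        cells[barcode].append(insert)
    return dict(cells)
```

For example, `group_by_cell_barcode(["AAAACCGT", "AAAATTGA", "CCCCGGGA"], 4)` places the first two reads in the same cell and the third in another.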
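In certain embodiments, the method further comprises, after b) and before c), releasing the barcoded target nucleic acid from the discrete partition.
In certain embodiments, c) comprises sequencing the barcoded target nucleic acid, thereby obtaining the characterization result.
In certain embodiments, the method further comprises assembling, from the sequences of the barcoded target nucleic acids, a contiguous nucleic acid sequence of at least a portion of the genome of the single cell.
In certain embodiments, the single cell is characterized based on the nucleic acid sequence of at least a portion of the genome of the single cell.
In certain embodiments, each discrete partition contains target nucleic acids derived from at most a single cell.
In certain embodiments, the method further comprises identifying an individual nucleic acid sequence among the barcoded target nucleic acids as derived from a given nucleic acid among the target nucleic acids, based at least in part on the presence of the unique molecular identification region.
In certain embodiments, the target nucleic acid comprises an exogenous nucleic acid, including an exogenous nucleic acid linked to a protein, lipid and/or small-molecule compound, where the protein, lipid and/or small-molecule compound is capable of binding a target molecule within the cell.
In certain embodiments, the method further comprises determining the amount of a given nucleic acid among the target nucleic acids based on the presence of the unique molecular identification region.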
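
Determining the amount of a given nucleic acid from the unique molecular identification region is usually done by counting distinct identifiers rather than reads, so that PCR duplicates collapse to one molecule. A minimal sketch follows; the record layout `(cell_barcode, umi, gene)` is an assumption, and no sequencing-error correction of UMIs is attempted.

```python
from collections import defaultdict


def count_molecules(records):
    """Count distinct UMIs per (cell barcode, gene).

    records: iterable of (cell_barcode, umi, gene) tuples, one per read.
    Reads with the same cell barcode, gene and UMI are treated as
    amplification copies of a single original molecule.
    """
    seen = defaultdict(set)
    for cell, umi, gene in records:
        seen[(cell, gene)].add(umi)
    return {key: len(umis) for key, umis in seen.items()}
```

Three reads of the same gene in the same cell that carry only two different UMIs would therefore be reported as two molecules.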
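In certain embodiments, the method comprises pretreating the cell before a).
In certain embodiments, the pretreatment comprises fixing the cell.
In certain embodiments, the cell is fixed with a fixative selected from one or more of the following: formaldehyde, paraformaldehyde, methanol, ethanol, acetone, glutaraldehyde, osmic acid and potassium dichromate.
In certain embodiments, the pretreatment comprises exposing the nucleus of the cell.
In certain embodiments, the pretreatment comprises treating the cell with a detergent, the detergent including Triton, Tween, SDS, NP-40 and/or digitonin.
In certain embodiments, the target nucleic acid comprises one or more selected from the following: DNA, RNA and cDNA.
In certain embodiments, the method further comprises, after b) and before c), amplifying the barcoded target nucleic acid.
In certain embodiments, the method comprises, after b) and before c), releasing the barcoded target nucleic acid from the discrete partition, and the amplification is carried out after the barcoded target nucleic acid has been released from the discrete partition.
In certain embodiments, amplification primers are used in the amplification, and the amplification primers comprise a random priming sequence.
In certain embodiments, the random priming sequence is a random hexamer.
In certain embodiments, the amplification comprises at least partially hybridizing the random priming sequence to the barcoded target nucleic acid and extending the random priming sequence in a template-directed manner.
In certain embodiments, the method comprises releasing at least a portion of the target nucleic acids from the single cell in the discrete partition to the outside of the cell and, in b), ligating the released target nucleic acids to the oligonucleotide tag, thereby producing a barcoded target nucleic acid.
In certain embodiments, the method comprises allowing at least a portion of the oligonucleotide tags released from the solid support to enter the single cell and, in b), to be ligated to the target nucleic acid, thereby producing a barcoded target nucleic acid.
In certain embodiments, the method comprises using a microfluidic device to co-partition the target nucleic acids derived from a single cell and the solid support with at least one oligonucleotide tag attached into the discrete partition.
In certain embodiments, the discrete partition is a droplet, and the microfluidic device is a droplet generator.
In certain embodiments, the microfluidic device comprises a first input channel and a second input channel that meet at a junction in fluid connection with an output channel.
In certain embodiments, the method further comprises introducing a sample containing the target nucleic acids into the first input channel and introducing the solid support with at least one oligonucleotide tag attached into the second input channel, thereby generating a mixture of the sample and the solid support in the output channel.
In certain embodiments, the output channel is in fluid connection with a third input channel at a junction.
In certain embodiments, the method further comprises introducing oil into the third input channel so that aqueous droplets within a water-in-oil emulsion are formed as the discrete partitions.
In certain embodiments, each discrete partition contains target nucleic acids from at most a single cell.
In certain embodiments, the first input channel and the second input channel form a substantially perpendicular angle with each other.
In certain embodiments, the target nucleic acid comprises cDNA derived from RNA in the single cell.
In certain embodiments, the RNA comprises mRNA.
In certain embodiments, the method comprises reverse-transcribing the RNA before a), producing the attached target nucleic acid.
In certain embodiments, a reverse transcription primer is used in the reverse transcription, and the reverse transcription primer comprises, in the 5' to 3' direction, the oligonucleotide adapter sequence and a polyT sequence.
In certain embodiments, the reverse transcription comprises hybridizing the polyT sequence to the RNA and extending the polyT sequence in a template-directed manner.
In certain embodiments, the target nucleic acid comprises DNA derived from the single cell.
In certain embodiments, the DNA comprises genomic DNA, open chromatin DNA, protein-bound DNA regions and/or exogenous nucleic acid linked to a protein, lipid and/or small-molecule compound, where the protein, lipid and/or small-molecule compound is capable of binding a target molecule within the cell.
In certain embodiments, the method comprises fragmenting the DNA derived from a single cell before a).
In certain embodiments, the attached target nucleic acid is produced after the fragmentation or during the fragmentation.
In certain embodiments, the fragmentation comprises breaking the DNA by sonication and then adding a sequence comprising the oligonucleotide adapter to the fragmented DNA, thereby obtaining the attached target nucleic acid.
In certain embodiments, the fragmentation comprises cutting the DNA with a DNA endonuclease or DNA exonuclease and then adding a sequence comprising the oligonucleotide adapter to the fragmented DNA, thereby obtaining the attached target nucleic acid.
In certain embodiments, the fragmentation comprises using a transposase-nucleic acid complex to integrate a sequence comprising the oligonucleotide adapter into the DNA, and releasing the transposase to obtain the attached target nucleic acid.
In certain embodiments, the transposase-nucleic acid complex comprises a transposase and a transposon end nucleic acid molecule, wherein the transposon end nucleic acid molecule comprises the oligonucleotide adapter sequence.
In certain embodiments, the transposase comprises Tn5.
In certain embodiments, the DNA comprises a protein-bound DNA region, and the transposase-nucleic acid complex further comprises a moiety that directly or indirectly recognizes the protein.
In certain embodiments, the moiety that directly or indirectly recognizes the protein comprises one or more of the following: an antibody that specifically binds the protein, and protein A or protein G.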
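In another aspect, this application also provides a composition comprising a plurality of solid supports, each solid support having at least one oligonucleotide tag attached, wherein each oligonucleotide tag comprises a first strand and a second strand, the first strand comprises a barcode sequence and a hybridization sequence located at the 3' end of the barcode sequence, the second strand comprises a first portion complementary to the hybridization sequence of the first strand and a second portion complementary to a sequence in the nucleic acid to be tested, and either the first strand and the second strand form a partially double-stranded structure or the second strand and the attached target nucleic acid form a partially double-stranded structure; the barcode sequence of the oligonucleotide tag comprises a common barcode domain and a variable domain, the common barcode domain is identical among the oligonucleotide tags attached to the same solid support, and the common barcode domain differs between two or more of the plurality of solid supports.
In another aspect, this application also provides a kit for analyzing target nucleic acids from cells, which comprises the composition described in this application.
In certain embodiments, the kit comprises a transposase.
In certain embodiments, the kit further comprises at least one of a nucleic acid amplification agent, a reverse transcription agent, a fixative, a permeabilizing agent, a ligation agent and a lysis agent.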
一种扩增来自细胞的目标核酸的方法,所述方法包括:
a)提供包含下述的离散分区:i.源于单个细胞的目标核酸,其中至少部分所述目标核酸被添加了寡核苷酸衔接子序列而成为经附接的目标核酸;以及ii.附接有至少一个寡核苷酸标签的固体支持物,其中每个所述寡核苷酸标签包含第一链以及第二链,所述第一链包含条码序列以及位于所述条码序列3’端的杂交序列,所述第二链包含与所述第一链的所述杂交序列互补的第一部分以及与附接至所述目标核酸的所述寡核苷酸衔接子序列互补的第二部分,且所述第一链与所述第二链形成部分双链的结构或者所述第二链与所述经附接的目标核酸形成部分双链的结构;
b)在所述离散分区中,使所述寡核苷酸标签与所述经附接的目标核酸连接,从而产生条码化的目标核酸;以及
c)对所述条码化的目标核酸进行扩增。
在某些实施方式中,所述寡核苷酸标签可释放地附接至所述固体支持物。
在某些实施方式中,包括从所述固体支持物上释放所述至少一个寡核苷酸标签,并在b)中使经释放的所述寡核苷酸标签与所述经附接的目标核酸连接,从而产生条码化的目标核酸。
在某些实施方式中,所述寡核苷酸标签通过其第一链的5’端直接或间接附接至所述固体支持物。
在某些实施方式中,所述离散分区中还包含连接酶,且所述连接酶使所述寡核苷酸标签与所述经附接的目标核酸连接。
在某些实施方式中,所述连接酶包括T4连接酶。
在某些实施方式中,在所述条码化的目标核酸中,所述目标核酸序列位于所述条码序列的3’端。
在某些实施方式中,所述固体支持物为珠粒。
在某些实施方式中,所述离散分区为孔或微滴。
在某些实施方式中,所述条码序列包含细胞条码序列,且附接至同一个固体支持物上的各寡核苷酸标签所包含的细胞条码序列相同。
在某些实施方式中,所述细胞条码序列包含由连接子序列间隔开的至少2个细胞条码区段。
在某些实施方式中,a)包括将所述源于单个细胞的目标核酸与所述附接有至少一个寡核苷酸标签的固体支持物共分配至所述离散分区中。
在某些实施方式中,b)包括使所述寡核苷酸标签的第一链的所述杂交序列与附接至所述目标核酸的所述寡核苷酸衔接子连接,从而产生所述条码化的目标核酸。
在某些实施方式中,b)包括使所述寡核苷酸标签的第二链的所述第二部分与附接至所述目标核酸的所述寡核苷酸衔接子杂交,以及使所述寡核苷酸标签的第一链的所述杂交序列与附接至所述目标核酸的所述寡核苷酸衔接子连接,从而产生所述条码化的目标核酸。
在某些实施方式中,所述经附接的目标核酸中包含独特分子鉴别区。
在某些实施方式中,所述独特分子鉴别区位于所述寡核苷酸衔接子序列与所述目标核酸序列之间。
在某些实施方式中,所述寡核苷酸标签还包含扩增引物识别区。
在某些实施方式中,所述扩增引物识别区为通用扩增引物识别区。
在某些实施方式中,包括在b)之后并且在c)之前,从所述离散分区中释放所述条码化的目标核酸,且所述扩增在所述条码化的目标核酸从所述离散分区中释放后进行。
在某些实施方式中,所述扩增中使用扩增引物,且所述扩增引物中包含随机引导序列。
在某些实施方式中,所述随机引导序列为随机六聚体。
在某些实施方式中,所述扩增包括使所述随机引导序列与所述条码化的目标核酸至少部分杂交并且以模板定向的方式延伸所述随机引导序列。
另一方面,本申请还提供了一种对来自细胞的目标核酸进行测序的方法,所述方法包括:
a)提供包含下述的离散分区:i.源于单个细胞的目标核酸,其中至少部分所述目标核酸被添加了寡核苷酸衔接子序列而成为经附接的目标核酸;以及ii.附接有至少一个寡核苷酸标签的固体支持物,其中每个所述寡核苷酸标签包含第一链以及第二链,所述第一链包含条码序列以及位于所述条码序列3’端的杂交序列,所述第二链包含与所述第一链的所述杂交序列互补的第一部分以及与附接至所述目标核酸的所述寡核苷酸衔接子序列互补的第二部分, 且所述第一链与所述第二链形成部分双链的结构或者所述第二链与所述经附接的目标核酸形成部分双链的结构;
b)在所述离散分区中,使所述寡核苷酸标签与所述经附接的目标核酸连接,从而产生条码化的目标核酸;以及
c)对所述条码化的目标核酸进行测序。
在某些实施方式中,所述寡核苷酸标签可释放地附接至所述固体支持物。
在某些实施方式中,包括从所述固体支持物上释放所述至少一个寡核苷酸标签,并在b)中使经释放的所述寡核苷酸标签与所述经附接的目标核酸连接,从而产生条码化的目标核酸。
在某些实施方式中,所述寡核苷酸标签通过其第一链的5’端直接或间接附接至所述固体支持物。
在某些实施方式中,所述离散分区中还包含连接酶,且所述连接酶使所述寡核苷酸标签与所述经附接的目标核酸连接。
在某些实施方式中,所述连接酶包括T4连接酶或T7连接酶。
在某些实施方式中,在所述条码化的目标核酸中,所述目标核酸序列位于所述条码序列的3’端。
在某些实施方式中,所述固体支持物为珠粒。
在某些实施方式中,所述离散分区为孔或微滴。
在某些实施方式中,所述条码序列包含细胞条码序列,且附接至同一个固体支持物上的各寡核苷酸标签所包含的细胞条码序列相同。
在某些实施方式中,所述细胞条码序列包含由连接子序列间隔开的至少2个细胞条码区段。
在某些实施方式中,a)包括将所述源于单个细胞的目标核酸与所述附接有至少一个寡核苷酸标签的固体支持物共分配至所述离散分区中。
在某些实施方式中,b)包括使所述寡核苷酸标签的第一链的所述杂交序列与附接至所述目标核酸的所述寡核苷酸衔接子连接,从而产生所述条码化的目标核酸。
在某些实施方式中,b)包括使所述寡核苷酸标签的第二链的所述第二部分与附接至所述目标核酸的所述寡核苷酸衔接子杂交,以及使所述寡核苷酸标签的第一链的所述杂交序列与附接至所述目标核酸的所述寡核苷酸衔接子连接,从而产生所述条码化的目标核酸。
在某些实施方式中,所述经附接的目标核酸中包含独特分子鉴别区。
在某些实施方式中,所述独特分子鉴别区位于所述寡核苷酸衔接子序列与所述目标核酸 序列之间。
在某些实施方式中,所述寡核苷酸标签还包含扩增引物识别区。
在某些实施方式中,所述扩增引物识别区为通用扩增引物识别区。
在某些实施方式中,进一步包括由所述条码化的目标核酸的序列组装所述单个细胞的基因组的至少一部分的连续核酸序列。
在某些实施方式中,基于所述单个细胞的所述基因组的至少一部分的所述核酸序列来表征所述单个细胞。
在某些实施方式中,每个所述离散分区至多包括源自单个细胞的所述目标核酸。
在某些实施方式中,进一步包括至少部分基于所述独特分子鉴别区的存在将所述条码化的目标核酸中的单个核酸序列鉴别为源于所述目标核酸中的给定核酸。
在某些实施方式中,进一步包括基于所述独特分子鉴别区的存在确定所述目标核酸中给定核酸的量。本领域技术人员能够从下文的详细描述中容易地洞察到本申请的其它方面和优势。下文的详细描述中仅显示和描述了本申请的示例性实施方式。如本领域技术人员将认识到的,本申请的内容使得本领域技术人员能够对所公开的具体实施方式进行改动而不脱离本申请所涉及发明的精神和范围。相应地,本申请的附图和说明书中的描述仅仅是示例性的,而非为限制性的。
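Brief Description of the Drawings
The specific features of the invention to which this application relates are set out in the appended claims. The features and advantages of the invention can be better understood by reference to the exemplary embodiments and drawings described in detail below. The drawings are briefly described as follows:
Figure 1 is a schematic diagram of generating, by the PCR method, nucleotide tags suitable for non-transcriptome analysis in this application.
Figure 2 is a schematic diagram of generating, by the T4 ligase method, nucleotide tags suitable for non-transcriptome analysis in this application.
Figure 3 is a schematic diagram of generating, by the PCR method, nucleotide tags suitable for transcriptome analysis in this application.
Figure 4 is a schematic diagram of generating, by the T4 ligase method, nucleotide tags suitable for non-transcriptome analysis in this application.
Figure 5 shows the fragment length distribution of the ATAC sequencing results for human 293T cells mediated by the Tn5 transposition reaction in this application.
Figures 6A and 6B show the signal enrichment at transcription start sites (TSS) in the ATAC sequencing results for human 293T cells mediated by the Tn5 transposition reaction in this application.
Figure 7 shows the proportions of different sequence types in the ATAC sequencing results for human 293T cells mediated by the Tn5 transposition reaction in this application.
Figure 8 is a schematic diagram of the microfluidic chip in this application.
Figure 9 shows the cumulative curve of the ATAC sequencing results in this application, obtained from the number of reads for each barcode.
Figure 10 shows the distribution of the number of uniquely mapped reads per single cell in the ATAC sequencing results in this application.
Figure 11 shows the distribution of the cells' ATAC data across gene regions in this application.
Figure 12 shows the results of the correlation analysis of single-cell ATAC signals in this application.
Figure 13 shows the fragment distribution of the Cut tag library in this application.
Figure 14 shows the distribution of Cut tag fragments relative to transcription start sites in this application.
Figure 15 shows the proportions of Cut tag fragments distributed across the genome in this application.
Figure 16 shows the distribution of single-cell Cut tag results in this application.
Figure 17 shows that individual cells in a cell mixture are clearly distinguished according to the single-cell transcriptome in this application.
Figure 18 shows the distribution of the numbers of transcripts and genes detected in each cell in this application.
Figure 19 shows that individual cells in a cell mixture are clearly distinguished according to the single-cell genome in this application.
Figure 20 shows that single-cell sequencing in this application yields different degrees of coverage for each cell and each genomic locus.
Figure 21 shows that individual cells in a cell mixture are clearly distinguished according to single-cell DNA modification in this application.
Figure 22 shows the distribution of methylation modifications detected in each cell in this application.
Figure 23 shows the distribution of 5hmC modifications detected in each cell in this application.
Figure 24 shows that both the transcriptome and ATAC distinguish individual cells in a cell mixture well in this application.
Figure 25 shows that both the transcriptome and Cut tag distinguish individual cells in a cell mixture well in this application.
Figure 26 shows that the transcriptome and methylome of the same cell both match well with gene models and known methylation sites in this application.
Figure 27 is a schematic diagram of a spatial lattice chip in this application.
Figure 28 shows the gene count results after the HE staining of a section is superimposed on the spatial lattice chip in this application.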
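Detailed Description of Embodiments
The embodiments of the invention of this application are illustrated below by specific examples; those familiar with the art can readily appreciate other advantages and effects of the invention from the content disclosed in this specification.
Definitions of Terms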
在本申请中,术语“测序”通常是指获取核酸分子序列信息的技术。例如分析特定DNA片段的碱基序列(例如,腺嘌呤(A)、胸腺嘧啶(T)、胞嘧啶(C)与鸟嘌呤(G)的排列方式等);测序方法可以包括Sanger双脱氧链终止法(Chain Termination Method),焦磷酸测序法,以及新一代测序的Illumina,Life Technologies和Roche等使用的“合成并行测序”或“连接测序”平台等,华大智造/Complete Genomics的测序仪;通常还可以包括纳米孔测序方法,例如牛津纳米孔技术公司开发的方法,PacBio的三代测序仪,或基于电子检测的方法,例如Life Technologies推出的离子激流技术(Ion Torrent technology)等。
在本申请中,术语“表征结果”通常是指通过测序或其他基因组和/或蛋白质组学等生物学分析方法获得的核酸及其他相关分子的信息描述。例如可以包括全基因组测序的序列信息、可接近染色质序列及分布信息、核酸序列与其结合因子的结合信息、致病基因突变信息、单核苷酸多态性(SNP)、核苷酸甲基化、转录组组信息(例如基因表达水平的时间或空间变化)等。
在本申请中,术语“蛋白质A”通常是指一种细胞来源的可以结合不同物种来源的抗体重链保守区的蛋白(即抗体的识别蛋白)。例如,能与人及多种哺乳动物血清IgG分子中的Fc片段结合,其中的哺乳动物可以包括猪、狗、兔、人、猴、鼠、小鼠及牛等;蛋白质A与IgG结合的亚类主要可以包括IgG1、IgG2和IgG4;蛋白质A除了与IgG结合外,还能与血清中的IgM和IgA结合。例如,蛋白质A可以包括来自金黄色葡萄球菌的蛋白质A(SPA),SPA是细胞壁抗原的主要成分,几乎90%以上的金黄色葡萄球菌菌株含有这种成分,但不同的菌株含量差别悬殊。利用蛋白质A能够与抗体结合的功能可以通过形成目标蛋白—抗体—蛋白质A复合体从而对目标蛋白进行定位和/或分析。
在本申请中,术语“固体支持物”通常是指适用于或可被修改以适用于附接本文描述的寡核苷酸标签、条码序列、引物等的任何材料。例如,固体支持物包括位于表面中的孔或凹陷的阵列,这些可使用多种技术进行制造,例如光刻法、冲压技术、成型技术和微蚀技术;固体支持物的组成和几何形状可以依据其用途而改变,例如,固体支持物可以是平面结构(例如载玻片、芯片、微芯片和/或阵列等);例如,固体支持物或其表面还可以是非平面的,例如管或容器的内表面或外表面;例如,固体支持物还可以包括微球或珠粒。
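In this application, "beads", "microspheres" or "particles" generally refer to small discrete particles. Suitable bead compositions include, but are not limited to: plastic, ceramic, glass, polystyrene, methylstyrene, acrylic polymers, paramagnetic materials, thoria sol, carbon graphite, titanium dioxide, latex or cross-linked dextran (such as agarose), cellulose, nylon, cross-linked micelles and Teflon, and any other material outlined herein for solid supports may be used; see the Microsphere Detection Guide from Bangs Laboratories (Fishers, Ind.). In certain embodiments, the microspheres may be magnetic microspheres or beads.
In this application, the term "unique molecular identification region" may also be called a "molecular barcode", "molecular tag", "unique identifier (UID)", "unique molecular identifier (UMI)", etc., and generally refers to a unique sequence code attached to each original nucleotide fragment of the same sample. It can typically be designed as a fully random nucleotide string (e.g., NNNNNNN), a partially degenerate nucleotide string (e.g., NNNRNYN), or a defined nucleotide string (e.g., when template molecules are limited). When it is introduced into nucleic acid molecules, for example during first-strand cDNA synthesis, subsequent amplification bias can be corrected by directly counting the unique molecular identifiers (UMIs) sequenced after amplification. UMIs can be designed, incorporated and applied in ways known in the art, for example as exemplified in WO2012/142213, Islam et al., Nat. Methods (2014) 11:163-166, and Kivioja, T. et al., Nat. Methods (2012) 9:72-74, which are incorporated herein by reference in their entirety.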
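
The difference between a fully random identifier (NNNNNNN) and a partially degenerate one (NNNRNYN) can be made concrete by counting how many distinct sequences each pattern admits. The sketch below uses the standard IUPAC codes R (A or G) and Y (C or T); the function names are illustrative.

```python
# Subset of the IUPAC nucleotide codes needed for the patterns above
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # purine
    "Y": "CT",    # pyrimidine
    "N": "ACGT",  # any base
}


def umi_complexity(pattern: str) -> int:
    """Number of distinct sequences that match a degenerate UMI pattern."""
    total = 1
    for symbol in pattern:
        total *= len(IUPAC[symbol])
    return total


def matches(pattern: str, seq: str) -> bool:
    """True if seq is one of the sequences described by the pattern."""
    return len(pattern) == len(seq) and all(
        base in IUPAC[symbol] for symbol, base in zip(pattern, seq)
    )
```

Under these codes NNNNNNN admits 4^7 = 16,384 sequences, while NNNRNYN admits 4^5 × 2 × 2 = 4,096; the fixed purine and pyrimidine positions trade diversity for a pattern that can be checked in each read.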
在本申请中,术语“扩增引物识别区”通常是指一段能够与扩增所述目标核酸的引物序列互补杂交的核苷酸序列。所述引物与其结合能够引发核苷酸延伸、连接和/或合成,例如在聚合酶链反应作用下实现目标核酸拷贝数增多(即扩增),在一些实施方式中也包括寡核苷酸标签、分子唯一标识符等序列的扩增。
在本申请中,术语“离散分区”通常是指包含待分析目的物质的相互之间独立的空间单元。例如微滴或孔;例如,将目标核酸的样品与附接有寡核苷酸标签的固体支持物共分配形成的微滴;在一些实施方式中,所述离散分区中还可以包含其他根据不同的需求而分配其他物质,例如染料、乳化剂、表面活性剂、稳定剂、聚合物、适体、还原剂、引发剂、生物素标记物、荧光团、缓冲液、酸性溶液、碱性溶液、光敏感的酶、pH敏感的酶、水性缓冲液、去污剂、离子型去污剂、非离子型去污剂等等。
在本申请中,术语“可释放地附接”通常是指寡核苷酸标签与固体支持物之间的连接方式是可释放的、可切割的或可逆的或者可破坏、可消除的。例如,寡核苷酸标签与固体支持物的连接包含不稳定的键,例如,化学、热或光敏感的键,例如,二硫键、UV敏感的键等,通过相应的处理破坏这些不稳定的键从而实现可释放的附接;例如,寡核苷酸标签与固体支持物的连接包含可以被核酸酶识别的特定碱基,例如dU,可以通过UNG酶的作用切割所述连接;例如,寡核苷酸标签与固体支持物的连接包含核酸内切酶识别序列,可以通过核酸酶的作用切割所述连接;例如,所述固体支持物是可降解的,在施以降解条件时通过固体支持物的降解释放所述寡核苷酸标签,实现可释放的附接等。
在本申请中,术语“连接子”通常是指一段将各个功能性序列连接在一起的核苷酸序列也可以包括将寡核苷酸标签连接至固体支持物的分子序列(核酸、多肽或其他化学连接结构等)其中所述的功能性序列可以包括细胞条码区段、条码序列、扩增引物识别区、测序引物识别区、唯一分子识别符等,在某些实施方式中,该核苷酸可以是一段固定的核苷酸序列,在某些实施方式中,所述连接子还可以包含化学修饰。
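In this application, the term "random priming sequence" generally refers to a random primer that can exhibit four-fold degeneracy at each position. A random priming sequence recognizes and binds to the corresponding region of the target nucleic acid (including the target nucleic acid's own sequence and other nucleotide sequences attached to it), thereby enabling synthesis and/or amplification of the nucleotide sequence.
In this application, the term "barcode sequence" generally refers to a nucleotide sequence capable of identifying a target nucleic acid, or a derivative or modified form thereof.
In this application, the term "cell barcode sequence" generally refers to a nucleotide sequence that can be used to identify the source of a target nucleic acid sample. The source may be, for example, the same cell or different cells. Where nucleic acid samples are derived from multiple sources, the nucleic acids from each source may be labeled with a different cell barcode sequence so that the source of the sample can be identified. Barcodes (also commonly called indexes, tags, etc.) are well known to those skilled in the art, and any suitable barcode or set of barcodes may be used, such as the cell barcode sequences described in the disclosure of US2013/0274117.
In this application, the term "cell barcode segment" generally refers to a barcode nucleotide unit that makes up a cell barcode sequence; N such cell barcode segments can be joined by PCR or by DNA ligase to form a cell barcode sequence. N may be greater than or equal to 1, so that the resulting cell barcode sequence is sufficient to identify the cellular source of each nucleic acid sample derived from multiple sources.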
在本申请中,术语“寡核苷酸衔接子”通常是指附接与目标核酸并且包含能够与所述寡核苷酸标签互补杂交序列的一段核苷酸序列。该核苷酸序列可以是部分双链结构,例如可以具有与寡核苷酸标签杂交的突出序列;在某些实施方式中,寡核苷酸衔接子还可以包含转座酶(例如Tn5转座酶)结合序列;在某些实施方式中,寡核苷酸衔接子还可以包含扩增引物识别序列;在某些实施方式中,寡核苷酸衔接子还可以包含反转录引物序列。
在本申请中,术语“条码化的目标核酸”通常是指至少附接了细胞条码序列的目标核酸。
在本申请中,术语“共同条码结构域”通常是指用于识别目标核酸来源的条码序列。附接于同一个固体支持物的寡核苷酸标签中包含的共同条码结构域是相同的,附接于不同的固体支持物的寡核苷酸标签中包含的共同条码结构域相互之间是不同的,在某些实施方式中,释放自同一个固体支持物的寡核苷酸标签与来源于一个细胞的目标核酸连接,可以通过所述共同条码结构域识别其细胞来源。
在本申请中,术语“可变结构域”通常是指共同条码结构域之外的根据不同的需要设置的核苷酸序列。例如,连接子序列,扩增引物识别序列、测序引物识别序列等。
在本申请中,术语“转座酶-核酸复合物”通常是指转座酶与包含所述寡核苷酸衔接子的序列形成的复合物。转座酶通常是指一种能够与转座子末端结合并通过剪切、粘贴机制或复制性转座机制催化其向基因组其他部分移动的酶。转座子通常是指一段能够在基因组中自由跳跃的核苷酸片段,是由Barbara McClintock在二十世纪四十年代后期研究玉米遗传机制时提出的,之后的其他研究小组描述了转座的分子基础,例如,McClintock发现染色体片段能够改变位置,从一条染色体跳到另一条染色体。这些转座子的重新定位能够改变其他基因的表达,例如在玉米中转座能够引起颜色变化,在细菌等其他生物中,能够引起抗生素耐药性在人类进化的过程中。转座酶-核酸复合物中可以包含2个分别结合了寡核苷酸衔接子的转座酶形成的二聚体,其中的2个转座酶可以是相同的转座酶也可以是不同的,其分别结合的寡核苷酸衔接子可以是相同的也可以是不同的。
在本申请中,术语“Tn5”通常是指Tn5转座酶,它是核糖核酸酶(RNase)超家族的成员。Tn5能够在希瓦氏菌和大肠埃希氏菌中发现。Tn5可以包括天然存在的Tn5转座酶及其各种活性突变形式;Tn5与大多数其他转座酶一样含有DDE基序,DDE基序是催化转座子转移的活性位点。据报道称DDE基序能够与二价金属离子(例如镁和锰)协调作用,在催化反应起到重要作用。转座酶Tn5可能通过DDE区域发生突变而使得转座活性升高,并催化转座子的移动。例如,其中326位的谷氨酸转化为天冬氨酸,而97和188位的两个天冬氨酸转化为谷氨酸(基于GenBank登录号YP_001446289的氨基酸序列的氨基酸编号)等。
在本申请中,术语“微流控装置”通常是指能够实现微流控的设备或系统。其中,微流控通常是指一种精确控制和操控微尺度流体的技术,尤其特指亚微米结构的技术,“微”通常是指微小的容量或体积(例如纳升,皮升,飞升级别)。微流控技术已广泛应用于多领域,例如生物医药领域,例如,分子生物学方法中的酶分析(如葡萄糖和乳酸分析)、DNA分析(如聚合酶链式反应和高通量测序)、蛋白质组学分析等。微流控装置主体结构可以包括与其连接的简易贮存器,从装置外来源、歧管、流体流动单元(例如,致动器、泵、压缩机)等递送流体的流体管道,以及将微流体分配递送到随后的处理操作、仪器或部件的流体导管等。
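In this application, the terms "hybridize", "hybridizable" or "complementary" generally mean that, under in vitro and/or in vivo conditions of suitable temperature and solution ionic strength, a nucleic acid (e.g., RNA, DNA) comprises a nucleotide sequence that enables it to bind specifically and non-covalently (i.e., to form Watson-Crick base pairs and/or G/U base pairs) to another nucleic acid sequence. Watson-Crick base pairing includes: adenine/adenosine (A) pairing with thymidine/thymine (T), A pairing with uracil/uridine (U), and guanine/guanosine (G) pairing with cytosine/cytidine (C). In certain embodiments, for hybridization between two RNA molecules (e.g., dsRNA), or of a DNA molecule with an RNA molecule (e.g., when DNA target nucleic acid bases pair with a guide RNA), G may also base-pair with U. Hybridization requires that the two nucleic acids contain complementary sequences, although mismatches between bases are possible. The conditions suitable for hybridization between two nucleic acids depend on the length of the nucleic acids and their degree of complementarity, as is well known in the art. The greater the degree of complementarity between two nucleotide sequences, the higher the melting temperature (Tm) of hybrids of nucleic acids having those complementary sequences.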
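
The pairing rules in this definition can be sketched as a percent-complementarity calculation. The Tm estimate that follows uses the Wallace rule of thumb, 2 °C per A/T and 4 °C per G/C, which is a common approximation for short oligonucleotides and is not taken from this application; both function names are illustrative.

```python
WATSON_CRICK = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G"),
                ("A", "U"), ("U", "A")}
WOBBLE = {("G", "U"), ("U", "G")}  # G/U pairing, relevant when RNA is involved


def percent_complementarity(strand_a, strand_b, allow_wobble=False):
    """Percent of positions that pair when two equal-length strands,
    both written 5'->3', are aligned antiparallel."""
    if not strand_a or len(strand_a) != len(strand_b):
        raise ValueError("strands must be non-empty and of equal length")
    allowed = WATSON_CRICK | WOBBLE if allow_wobble else WATSON_CRICK
    paired = sum((a, b) in allowed
                 for a, b in zip(strand_a, reversed(strand_b)))
    return 100.0 * paired / len(strand_a)


def wallace_tm(seq):
    """Rule-of-thumb melting temperature (deg C) for a short DNA oligo."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc
```

A single mismatch in a four-base duplex lowers the complementarity from 100% to 75%, and GC-rich sequences score a higher estimated Tm than AT-rich ones of the same length.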
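In this application, the term "read length", i.e. reads, generally refers to the sequence obtained from one reaction in nucleotide sequencing. A read may be a short sequenced fragment, that is, the base sequence data obtained from a single sequencing pass of the sequencer; read lengths can differ between sequencing instruments.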
发明详述
一方面,本申请提供一种分析来自细胞的目标核酸的方法,所述方法包括:
a)提供包含下述的离散分区:
ⅰ.源于单个细胞的目标核酸,其中至少部分所述目标核酸被添加了寡核苷酸衔接子序列而成为经附接的目标核酸;以及
ⅱ.附接有至少一个寡核苷酸标签的固体支持物,其中每个所述寡核苷酸标签包含第一链以及第二链,所述第一链包含条码序列以及位于所述条码序列3’端的杂交序列,所述第二链包含与所述第一链的所述杂交序列互补的第一部分以及与附接至所述目标核酸的所述寡核苷酸衔接子序列互补的第二部分,且所述第一链与所述第二链形成部分双链的结构;
或者ii.附接有至少一个寡核苷酸标签的固体支持物,其中每个所述寡核苷酸标签包含第一链以及第二链,所述第一链包含条码序列以及位于所述条码序列3’端的杂交序列,所述第二链包含与所述第一链的所述杂交序列互补的第一部分以及与附接至所述目标核酸的所述寡核苷酸衔接子序列互补的第二部分,且所述第二链与所述经附接的目标核酸形成部分双链的结构;
b)在所述离散分区中,使所述寡核苷酸标签与所述经附接的目标核酸连接,从而产生条码化的目标核酸。
例如,其进一步包括:
c)获得所述条码化的目标核酸的表征结果;以及
d)至少部分基于c)中获得的所述表征结果中存在相同的所述细胞条码序列而将所述目标核酸的序列识别为源于所述单个细胞。
另一方面,本申请还提供了一种扩增来自细胞的目标核酸的方法,所述方法包括:
a)提供包含下述的离散分区:i.源于单个细胞的目标核酸,其中至少部分所述目标核酸被添加了寡核苷酸衔接子序列而成为经附接的目标核酸;以及ii.附接有至少一个寡核苷酸标签的固体支持物,其中每个所述寡核苷酸标签包含第一链以及第二链,所述第一链包含条码序列以及位于所述条码序列3’端的杂交序列,所述第二链包含与所述第一链的所述杂交序列互补的第一部分以及与附接至所述目标核酸的所述寡核苷酸衔接子序列互补的第二部分,且所述第一链与所述第二链形成部分双链的结构;或者ii.的步骤可以为附接有至少一个寡核苷酸标签的固体支持物,其中每个所述寡核苷酸标签包含第一链以及第二链,所述第一链包含条码序列以及位于所述条码序列3’端的杂交序列,所述第二链包含与所述第一链的所述杂交序列互补的第一部分以及与附接至所述目标核酸的所述寡核苷酸衔接子序列互补的第二部分,且所述第二链与所述经附接的目标核酸形成部分双链的结构;
b)在所述离散分区中,使所述寡核苷酸标签与所述经附接的目标核酸连接,从而产生条码化的目标核酸;以及
c)对所述条码化的目标核酸进行扩增。
另一方面,本申请还提供了一种对来自细胞的目标核酸进行测序的方法,所述方法包括:
a)提供包含下述的离散分区:i.源于单个细胞的目标核酸,其中至少部分所述目标核酸被添加了寡核苷酸衔接子序列而成为经附接的目标核酸;以及ii.附接有至少一个寡核苷酸标签的固体支持物,其中每个所述寡核苷酸标签包含第一链以及第二链,所述第一链包含条码序列以及位于所述条码序列3’端的杂交序列,所述第二链包含与所述第一链的所述杂交序列互补的第一部分以及与附接至所述目标核酸的所述寡核苷酸衔接子序列互补的第二部分,且所述第一链与所述第二链形成部分双链的结构;或者ii.的步骤可以为附接有至少一个寡核苷酸标签的固体支持物,其中每个所述寡核苷酸标签包含第一链以及第二链,所述第一链包含条码序列以及位于所述条码序列3’端的杂交序列,所述第二链包含与所述第一链的所述杂交序列互补的第一部分以及与附接至所述目标核酸的所述寡核苷酸衔接子序列互补的第二部分,且所述第二链与所述经附接的目标核酸形成部分双链的结构;
b)在所述离散分区中,使所述寡核苷酸标签与所述经附接的目标核酸连接,从而产生条码化的目标核酸;以及
c)对所述条码化的目标核酸进行测序。
例如,本申请中的述寡核苷酸标签可以包含第一链以及第二链,所述第一链和所述第二链可以同时提供或者分别提供。在本申请中,当所述第一链和所述第二链同时提供时,所述 第一链可以与所述第二链形成部分双链的结构;当所述第一链和所述第二链分别提供时,所述第二链可以与所述经附接的目标核酸形成部分双链的结构。
条码化的目标核酸
在本申请中,条码化的目标核酸通过所述寡核苷酸标签与所述经附接的目标核酸连接生成。例如,使所述寡核苷酸标签的第一链的所述杂交序列与附接至所述目标核酸的所述寡核苷酸衔接子连接,从而产生所述条码化的目标核酸。例如,使所述寡核苷酸标签的第二链的所述第二部分与附接至所述目标核酸的所述寡核苷酸衔接子杂交,以及使所述寡核苷酸标签的第一链的所述杂交序列与附接至所述目标核酸的所述寡核苷酸衔接子连接,从而产生所述条码化的目标核酸。关于所述杂交,适用于两种核酸之间杂交的条件取决于核酸的长度和互补程度,这是本领域众所周知的。两个核苷酸序列之间的互补程度越大,具有这些互补序列的核酸的杂交体的解链温度(Tm)的值越大。
例如,所述寡核苷酸标签的第二链的所述第二部分的长度足以使其与其互补序列(附接至所述目标核酸的所述寡核苷酸衔接子序列或其部分序列)形成双链结构。
例如,所述第二链的所述第二部分的长度可以是1个核苷酸或更多,2个核苷酸或更多,3个核苷酸或更多,5个核苷酸或更多,8个核苷酸或更多,10个核苷酸或更多,12个核苷酸或更多,15个核苷酸或更多,20个核苷酸或更多,22个核苷酸或更多,25个核苷酸或更多或30个核苷酸或者更多。
例如,所述杂交不排除碱基之间可能错配。例如,所述第二链的所述第一部分或所述第二链的所述第二部分的序列不必与其杂交序列的序列有100%互补性。例如,可以是60%或更多,65%或更多,70%或更多,75%或更多,80%或更多,85%或更多,90%或更多,95%或更多,98%或更多,99%或更多,99.5%或更多地互补性。其余的非互补核苷酸可以与互补核苷酸成簇或散布,并且不需要彼此或与互补核苷酸相邻。例如,多核苷酸可在一个或多个区段上杂交,使得在杂交事件中不涉及中间或相邻区段(例如,形成发夹结构,“凸起”等)。
例如,使用连接反应将寡核苷酸标签与所述经附接的目标核酸连接。该连接可包括通过催化磷酸二酯键的形成将两个核酸区段接合在一起,例如所述寡核苷酸标签的第一链的所述杂交序列和所述附接至所述目标核酸的所述寡核苷酸衔接子。连接反应可包括DNA连接酶,诸如大肠杆菌DNA连接酶、T4 DNA连接酶、T7 DNA连接酶、哺乳动物连接酶(例如,DNA连接酶I、DNA连接酶III、DNA连接酶IV)、热稳定连接酶等。T4 DNA连接酶可以连接含有DNA、寡核苷酸、RNA和RNA-DNA杂合体的区段。连接反应可以不包括DNA连接酶,而是采用替代物如拓扑异构酶。采用高浓度的DNA连接酶且包含PEG可实现快速连接。为 了选择连接反应的有利温度,可以考虑DNA连接酶的最适温度(例如可以是37℃)以及待连接的DNA的解链温度。可将目标核酸和条形码化的固体支持物悬浮在合适的缓冲液中以使可能影响连接的离子作用最小化。
例如,其包括使至少一部分所述目标核酸从所述离散分区中的所述单个细胞中释放到细胞外,并在b)中使经释放的所述目标核酸与所述寡核苷酸标签连接,从而产生条码化的目标核酸。例如,所述使至少一部分所述目标核酸从所述离散分区中的所述单个细胞中释放到细胞外可以包括将细胞与溶解试剂接触,以释放离散分区内的细胞的内容物。所述溶解剂可以包括生物活性试剂,例如用于溶解不同细胞类型(例如革兰氏阳性(gram positive)或阴性细菌、植物、酵母、哺乳动物等)的溶解酶,例如溶菌酶、无色肽酶、溶葡球菌酶、硫葡糖苷酶白芥子(kitalase)、溶壁酶(lyticase)以及其他可商购的溶解酶。例如,还可使用基于表面活性剂的溶解溶液来溶解细胞,例如,溶解溶液可包括非离子表面活性剂,诸如TritonX-100和吐温(Tween)20。例如,溶解溶液可包括离子表面活性剂,诸如十二烷基肌氨酸钠和十二烷基硫酸钠(SDS)。例如,还可采用可使用的其他方法(诸如电穿孔、热、声或机械细胞破坏)的溶解方法。
例如,所述使至少一部分所述目标核酸从所述离散分区中的所述单个细胞中释放到细胞外可以包括至少5%、至少10%、至少15%、至少20%、至少25%、至少30%、至少35%、至少40%、至少45%的所述目标核酸从所述离散分区中的所述单个细胞中释放到细胞外。
例如,其包括使至少一部分从所述固体支持物释放的所述寡核苷酸标签进入所述单个细胞中,并在b)中与所述目标核酸连接,从而产生条码化的目标核酸。
例如,所述使至少一部分从所述固体支持物释放的所述寡核苷酸标签进入所述单个细胞中可以包括至少25%、至少30%、至少35%、至少40%、至少50%、至少55%、至少60%、至少70%、至少75%、至少75%的所述寡核苷酸标签进入所述单个细胞中。
例如,所述寡核苷酸标签可释放地附接至所述固体支持物。例如,可释放地、可切割地或可逆地附接至所述的固体支持物的寡核苷酸标签包括通过寡核苷酸标签分子与固体支持物之间的联接的切割/破坏而被释放或可释放的寡核苷酸标签,或通过固体支持物自身的降解而被释放的寡核苷酸标签,从而使寡核苷酸标签能够被其他试剂接近或可接近,或包括这两者。
例如,与固体支持物前体连接的acrydite部分、与固体支持物前体连接的另一物质或前体本身包含不稳定的键,例如,化学、热或光敏感的键,例如,二硫键、UV敏感的键等。所述不稳定的键可以在将物质(例如寡核苷酸标签)可逆地连接(共价连接)至固体支持物。例如,热不稳定的键可包括基于核酸杂交的附接(例如,当寡核苷酸与附接至固体支持物的 互补序列杂交时),使得杂合体的热解链从固体支持物(或珠粒)释放寡核苷酸,例如,含有寡核苷酸标签的序列。此外,向凝胶固体支持物添加多种类型的不稳定键可导致能够响应于不同刺激的固体支持物的产生。每种类型的不稳定键可以对相关的刺激(例如,化学刺激、光、温度等)敏感,使得可通过施加合适的刺激来控制通过每种不稳定键附接至固体支持物的物质的释放。例如,通过凝胶珠子的活化官能团,可在凝胶固体支持物形成后将包含不稳定键的另一物质连接至凝胶固体支持物。可提供可释放地附接至固体支持物或以其他方式布置在离散分区中的试剂(带有关联的可激活的基团),使得一旦递送至期望的一组试剂(例如,通过共同分配),可激活的基团可以与期望的试剂反应。这类可激活的基团包括笼蔽基团、可去除的阻断或保护基团,例如,光不稳定基团、热不稳定基团,或可化学去除的基团。除热可切割的键、二硫键和UV敏感的键之外,可与前体或固体支持物偶联的不稳定键的其他非限制性实例还包括酯联接(例如,可用酸、碱或羟胺切割的)、邻二醇联接(例如,可通过高碘酸钠切割的)、Diels-Alder联接(例如,可通过热切割的)、砜联接(例如,可通过碱切割的)、甲硅烷基醚联接(例如,可通过酸切割的)、糖苷联接(例如,可通过淀粉酶切割的)、肽联接(例如,可通过蛋白酶切割的)或磷酸二酯联接(例如,可通过核酸酶(DNA酶)切割的)。
例如,所述寡核苷酸标签通过其第一链的5’端直接或间接附接至所述固体支持物。例如,包括从所述固体支持物上释放所述至少一个寡核苷酸标签,并在b)中使经释放的所述寡核苷酸标签与所述经附接的目标核酸连接,从而产生条码化的目标核酸。
例如,在所述条码化的目标核酸中,所述目标核酸序列位于所述条码序列的3’端。例如,所述目标核酸可以直接与所述条码序列的3’端连接;例如,所述目标核酸不直接与所述条码序列的3’端连接,所述目标核酸与所述条码序列之间可以存在任意其他核苷酸序列。
例如,在b)之后并且在c)之前,对所述条码化的目标核酸进行扩增。例如,在b)之后并且在c)之前,从所述离散分区中释放所述条码化的目标核酸,且所述扩增在所述条码化的目标核酸从所述离散分区中释放后进行。例如,在所述条码化的目标核酸从所述离散分区中释放后,可以进行进一步的化学或酶反应修饰,例如所述修饰可以包括bisulfite conversion、5hmc conversion等,之后再进行扩增。
例如,所述扩增中使用扩增引物。例如,所述扩增还可以包括对所述条码化的目标核酸进行进一步的修饰,使得其在另一侧也有固定的序列可以用于进行PCR扩增,例如,所述修饰可以包括反转录链转换、第二链合成、可以是末端转移酶(terminal transferase)反应,以及连接上第二种接头(adaptor)。
例如,所述扩增引物还可以包含通用引物。
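For example, amplification primers are used in the amplification, and the amplification primers may comprise a random priming sequence. The random priming sequence includes random primers that can exhibit four-fold degeneracy at each position. For example, random primers include any nucleic acid primers known in the art that have random sequences of various lengths. For example, a random primer may comprise a random sequence of 3, 4, 5, 6, 7, 8, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 or more nucleotides in length. For example, the plurality of random primers may include random primers of different lengths. For example, the plurality of random primers may include random primers of equal length. For example, the plurality of random primers may comprise random sequences of about 5 to about 18 nucleotides in length. For example, the plurality of random primers comprises random hexamers. Random hexamers are commercially available and widely used in amplification reactions such as multiple displacement amplification (MDA), for example in the REPLI-g whole genome amplification kit (QIAGEN, Valencia, CA). Random primers of any suitable length may be used in the methods and compositions described in this application.
For example, the amplification comprises at least partially hybridizing the random priming sequence to the barcoded target nucleic acid and extending the random priming sequence in a template-directed manner.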
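
"Four-fold degeneracy at each position" means that a pool of random primers of length k contains up to 4^k distinct sequences, 4,096 for hexamers. A small sketch, with illustrative function names, that enumerates the pool or draws one member from it:

```python
import random
from itertools import product


def all_random_primers(length: int = 6) -> list:
    """Every sequence in a fully degenerate primer pool of the given length."""
    return ["".join(p) for p in product("ACGT", repeat=length)]


def sample_primer(length: int, rng: random.Random) -> str:
    """Draw one primer, choosing each base uniformly from A, C, G, T."""
    return "".join(rng.choice("ACGT") for _ in range(length))
```

Passing a seeded `random.Random` instance to `sample_primer` keeps the draw reproducible.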
寡核苷酸标签
在本申请中,所述寡核苷酸标签包含第一链以及第二链,所述第一链包含条码序列以及位于所述条码序列3’端的杂交序列,所述第二链包含与所述第一链的所述杂交序列互补的第一部分以及与附接至所述目标核酸的所述寡核苷酸衔接子序列互补的第二部分,且所述第一链与所述第二链形成部分双链的结构。或者,在本申请中,所述寡核苷酸标签包含第一链以及第二链,所述第一链包含条码序列以及位于所述条码序列3’端的杂交序列,所述第二链包含与所述第一链的所述杂交序列互补的第一部分以及与附接至所述目标核酸的所述寡核苷酸衔接子序列互补的第二部分,且所述第二链与所述经附接的目标核酸形成部分双链的结构。
关于所述杂交序列,适用于两种核酸之间杂交的条件取决于核酸的长度和互补程度,这是本领域众所周知的。两个核苷酸序列之间的互补程度越大,具有这些互补序列的核酸的杂交体的解链温度(Tm)的值越大。
例如,所述第二链的所述第一部分或所述第二链的所述第二部分的长度足以使其与其互补序列(例如,所述第一链中位于所述条码序列3’端的杂交序列,例如,附接至所述目标核酸的所述寡核苷酸衔接子序列或其部分序列)形成双链结构。
例如,所述第二链的所述第一部分或所述第二链的所述第二部分的长度可以是1个核苷酸或更多,2个核苷酸或更多,3个核苷酸或更多,5个核苷酸或更多,8个核苷酸或更多,10个核苷酸或更多,12个核苷酸或更多,15个核苷酸或更多,20个核苷酸或更多,22个核苷酸或更多,25个核苷酸或更多或30个核苷酸或者更多。
例如,所述第二链的所述第一部分与所述第二链的所述第二部分的序列的长度可以相同,也可以不相同。
例如,所述双链结构不排除碱基之间可能错配。例如,所述第二链的所述第一部分或所述第二链的所述第二部分的序列不必与其杂交序列的序列有100%互补性。例如,可以是60%或更多,65%或更多,70%或更多,75%或更多,80%或更多,85%或更多,90%或更多,95%或更多,98%或更多,99%或更多,99.5%或更多地互补性。其余的非互补核苷酸可以与互补核苷酸成簇或散布,并且不需要彼此或与互补核苷酸相邻。例如,多核苷酸可在一个或多个区段上杂交,使得在杂交事件中不涉及中间或相邻区段(例如,形成发夹结构,“凸起”等)。
例如,附接至同一个固体支持物上的寡核苷酸标签的所述第二部分可以是相同的。
例如,附接至同一个固体支持物上的寡核苷酸标签的所述第二部分可以是不同的。例如,所述附接至同一个固体支持物上的各寡核苷酸标签的所述第二部分可以包括1种或以上的核苷酸序列,例如,所述第二部分的序列可以为2种或以上,例如,3种或以上,例如,4种或以上,例如,5种或以上,例如,6种或以上,例如,7种或以上,例如,8种或以上,例如,9种或以上,例如,10种或以上,例如,11种或以上,例如,12种或以上,例如,13种或以上,例如,14种或以上,例如,15种或以上,从而使得所述附接至同一个固体支持物上的寡核苷酸标签能够与相应的1种或以上的所述经附接的目标核酸连接。
例如,附接至同一个固体支持物上并且含有同一种所述第二部分的所述寡核苷酸标签的数量可以是1个或以上,例如,50个或以上,100个或以上,500个或以上,1000个或以上,1500个或以上,2000个或以上,3000个或以上,5000个或以上,8000个或以上,10000个或以上,12000个或以上,15000个或以上,18000个或以上,20000个或以上,22000个或以上,25000个或以上,28000个或以上,30000个或以上,35000个或以上,40000个或以上,45000个或以上,50000个或以上。
例如,附接至同一个固体支持物上的含有不同第二部分的所述寡核苷酸标签的数量可以根据需要设置为不同比例,从而与相应的所述经附接的目标核酸连接。
例如,所述条码序列包含细胞条码序列,且附接至同一个固体支持物上的各寡核苷酸标签所包含的细胞条码序列相同。
例如,附接至同一个固体支持物上的寡核苷酸标签可以包括1个或更多个寡核苷酸标签,例如,50个或更多,100个或更多,500个或更多,1000个或更多,1500个或更多,2000个或更多,3000个或更多,5000个或更多,8000个或更多,10000个或更多,12000个或更多, 15000个或更多,18000个或更多,20000个或更多,22000个或更多,25000个或更多,28000个或更多,30000个或更多,35000个或更多,40000个或更多,45000个或更多,50000个或更多,55000个或更多,60000个或更多,65000个或更多,70000个或更多,75000个或更多,80000个或更多,85000个或更多,90000个或更多,95000个或更多,100000个或更多,110000个或更多,120000个或更多,这些寡核苷酸标签的细胞条码序列是相同的,并且其所述第二链的所述第二部分的序列可以是1种或以上,例如,所述第二部分的序列为2种或以上,3种或以上,4种或以上,5种或以上,6种或以上,7种或以上,8种或以上,9种或以上,10种或以上,11种或以上,12种或以上,13种或以上,14种或以上,15种或以上,16种或以上,17种或以上,18种或以上,19种或以上,20种或以上。
例如,附接至不同的固体支持物上的寡核苷酸标签组所包含的细胞条码序列相互之间不同,所述寡核苷酸标签组可以是附接至同一个固体支持物上的所有寡核苷酸标签的组合。
例如,所述细胞条码序列包含至少2个细胞条码区段。例如,所述细胞条码区段为4或更多个核苷酸(nt),例如,5或更多,例如,10或更多,12或更多,15或更多,18或更多,20或更多,21或更多,22或更多,23或更多,24或更多,25或更多,26或更多,27或更多,28或更多,29或更多,30或更多,31或更多,32或更多,33或更多,34或更多,或35或更多。
例如,所述例如,所述细胞条码序列包含至少2个细胞条码区段,至少3个细胞条码区段,至少4个细胞条码区段,至少5个细胞条码区段,至少6个细胞条码区段,至少7个细胞条码区段,至少8个细胞条码区段,所述细胞条码区段按照在所述寡核苷酸标签中自5’端至3’端的顺序编码为细胞条码区段1,细胞条码区段2,细胞条码区段3,细胞条码区段4,细胞条码区段5……细胞条码区段n。例如,所述至少2个细胞条码区段可以通过PCR或DNA连接酶形成所述细胞条码序列。
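For example, the cell barcode sequence may be generated by the following method:
1) dividing the at least 1 solid support into at least 2 primary aliquots, for example at least 8, at least 16, at least 24, at least 32, at least 40, at least 48, at least 56, at least 64, at least 72, at least 80, at least 88 or at least 96 aliquots;
2) providing each primary aliquot with at least 1 cell barcode segment 1, for example at least 1,000, at least 10,000, at least 100,000, at least 1,000,000 or at least 10,000,000 cell barcode segments 1, wherein the cell barcode segment 1 in each aliquot differs in sequence and/or length from the cell barcode segment 1 in any other aliquot;
3) joining, directly or indirectly, at least 1 solid support in each primary aliquot to cell barcode segment 1, so that each solid support is joined to at least one cell barcode segment 1;
4) pooling the at least 2 primary aliquots and dividing the pooled primary aliquots into at least 2 secondary aliquots, for example at least 8, at least 16, at least 24, at least 32, at least 40, at least 48, at least 56, at least 64, at least 72, at least 80, at least 88 or at least 96 aliquots;
5) providing each secondary aliquot with at least 1 cell barcode segment 2 or its complementary sequence, for example at least 1,000, at least 10,000, at least 100,000, at least 1,000,000 or at least 10,000,000 cell barcode segments 2 or their complementary sequences, wherein the cell barcode segment 2 or its complementary sequence in each aliquot differs in sequence and/or length from the cell barcode segment 2 or its complementary sequence in any other aliquot;
6) joining, directly or indirectly, at least 1 cell barcode segment 1 attached to a solid support in each secondary aliquot to cell barcode segment 2.
For example, steps 4)-6) may be repeated n times, where n may be 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more, to attach cell barcode segment 3, cell barcode segment 4, cell barcode segment 5 ... cell barcode segment n, producing cell barcodes whose sequences are sufficiently unique for each cell, so that the target nucleic acids in a first cell can carry a first cell barcode of unique sequence, the target nucleic acids in a second cell can carry a second cell barcode of unique sequence, and so on.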
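
The split-and-pool procedure above multiplies barcode diversity with every round: with 96 aliquots per round, three rounds give 96^3 = 884,736 possible cell barcodes. The sketch below computes that space and the expected fraction of beads that share a barcode with at least one other bead, assuming each bead lands in a uniformly random aliquot in every round (an idealization; the function names are illustrative).

```python
def barcode_space(aliquots_per_round):
    """Number of distinct cell barcodes after the given split-pool rounds.

    aliquots_per_round: e.g. [96, 96, 96] for three rounds of 96 aliquots,
    each aliquot receiving its own distinct barcode segment.
    """
    total = 1
    for n in aliquots_per_round:
        total *= n
    return total


def expected_shared_fraction(n_beads: int, n_barcodes: int) -> float:
    """Expected fraction of beads whose barcode is also carried by another
    bead, under uniformly random assignment."""
    return 1.0 - (1.0 - 1.0 / n_barcodes) ** (n_beads - 1)
```

Adding a round or increasing the number of aliquots per round enlarges the barcode space and lowers the shared fraction for a fixed number of beads.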
例如,在所述b)之后并且在c)之前,从所述离散分区中释放所述条码化的目标核酸。
例如,进一步进行c):对所述条码化的目标核酸进行测序,从而获得所述表征结果。
例如,所述表征结果可以包括所述条码化的目标核酸的核苷酸序列信息,例如包括细胞条码核苷酸序列信息,目标核酸的核苷酸序列信息、UMI序列信息。
例如,由所述条码化的目标核酸的序列组装所述单个细胞的基因组的至少一部分的连续核酸序列。
例如,基于所述单个细胞的所述基因组的至少一部分的所述核酸序列来表征所述单个细胞。
例如,所述寡核苷酸标签还包含连接子序列1,所述细胞条码区段1的5’端可以通过连接子序列1连接至固体支持物。所述连接子序列1可以包含acrydite修饰,光切割修饰,S-S 修饰,dU碱基修饰等序列,可以通过各种方法断开,将寡核苷酸标签释放。
例如,所述寡核苷酸标签还包含其他功能序列,所述其他功能序列可以位于所述细胞条码区段1和所述连接子序列1之间,例如,完全或部分的功能序列(例如,引物序列(例如,通用引物序列、靶向引物序列、随机引物序列)识别区、引物退火序列、附接序列、测序引物识别区、扩增引物识别区(例如,通用扩增引物识别区)等,以用于随后的处理。
例如,所述随后的处理包括扩增。例如,所述扩增可以包括PCR扩增(例如,Taq DNA聚合酶扩增、Super Taq DNA聚合酶扩增、LA Taq DNA聚合酶扩增、Pfu DNA聚合酶扩增、Phusion DNA聚合酶扩增、KOD DNA聚合酶扩增等)、等温扩增(例如,可以包括环介导的等温扩增(LAMP)、解旋酶依赖性扩增(HDA)、重组酶聚合酶扩增(RPA)、链置换扩增(SDA)、基于核酸序列的扩增(NASBA)、转录介导扩增(TMA)等)、T7启动子线性扩增、简并寡核苷酸引物PCR扩增(DOP-PCR)、多重置换扩增(MDA)、多次退火环状循环扩增技术(MALBAC)等。
例如,所述细胞条码也可以不包含连接子,所述细胞条码可以是一段单独的、由其他方法合成的的核酸序列。
例如,所述通用引物序列可以包括P5或其他合适的引物。通用引物(例如,P5)还可与测序装置相兼容,例如能够附接至测序装置内的流动池。例如,这类通用引物序列可提供约束在测序装置中流动池表面的寡核苷酸的互补序列,以使得条码化的目标核酸序列能够固定到该表面上以供测序。
例如,扩增引物序列,用于进行扩增或复制过程(例如,沿着目标核酸序列使引物延伸)的引物序列,以便产生扩增的条形码化目标核酸序列。
例如,测序引物序列,所得的扩增靶序列将包含这样的引物,并且容易地转移至测序系统中。例如,当采用Illumina测序系统对扩增的靶标进行测序时,所述测序引物序列可以包含R1引物序列、R2引物序列。
例如,所述寡核苷酸标签可以包含T7启动子序列。例如,所述T7启动子序列包含如SEQ ID NO:1所示的核苷酸序列(TAATACGACTCACTATAG)。
例如,所述寡核苷酸标签可以包含与SEQ ID NO:6-9中的任一项具有至少70%、71%、72%、73%、74%、75%、76%、77%、78%、79%、80%、81%、82%、83%、84%、85%、86%、87%、88%、89%、90%、91%、92%、93%、94%、95%、96%、97%、98%、99%或100%同一性的区域。
例如,所述的所述核苷酸衔接子序列可以包含P5序列。例如,所述的核苷酸衔接子序列 包含P7序列。
例如,所述细胞条码区段1和所述连接子序列1之间可以包含上述多种功能序列中的任何序列或其组合。例如,这些寡核苷酸可包括以下的任一个或多个:P5、R1和R2序列、不可切割的5’acrydite-P5、可切割的5’acrydite-SS-P5、R1c、测序引物、读取引物、通用引物、P5_U、通用读取引物和/或任意这些引物的结合位点。
例如,所述细胞条码序列包含由连接子序列间隔开的至少2个细胞条码区段。
例如,细胞条码区段1的3’端具有连接子序列2,细胞条码区段2的5’端、3’端分别具有连接子序列3和4,细胞条码区段3的5’端、3’端分别具有连接子序列5和6,细胞条码区段4的5’端、3’端分别具有连接子序列7和8,以此类推,细胞条码区段n的5’端、3’端分别有连接子序列2n-1和2n;连接子序列2与连接子序列3能够至少部分互补配对形成双链结构,连接子序列4与连接子序列5能够至少部分互补配对形成双链结构,连接子序列6与连接子序列7能够至少部分互补配对形成双链结构,以此类推,以启动细胞条码区段1、细胞条码区段2、细胞条码区段3、细胞条码区段4……细胞条码区段n的连接。
例如,使用连接反应将各细胞条码区段连接形成寡核苷酸标签。该连接可包括通过催化磷酸二酯键的形成将两个核酸区段接合在一起,例如细胞条码区段1和前文所述的功能序列,例如,连接子序列2和细胞条码区段2,连接子序列3和细胞条码区段3,连接子序列4和细胞条码区段4,连接子序列5和细胞条码区段5,连接子序列6和细胞条码区段6,以此类推。连接反应可包括DNA连接酶,诸如大肠杆菌DNA连接酶、T4 DNA连接酶、哺乳动物连接酶(例如,DNA连接酶I、DNA连接酶III、DNA连接酶IV)、热稳定连接酶等。T4 DNA连接酶可以连接含有DNA、寡核苷酸、RNA和RNA-DNA杂合体的区段。连接反应可以不包括DNA连接酶,而是采用替代物如拓扑异构酶。采用高浓度的DNA连接酶且包含PEG可实现快速连接。为了选择连接反应的有利温度,可以考虑DNA连接酶的最适温度(例如可以是37℃)以及待连接的DNA的解链温度。可将样品和条形码化的固体支持物悬浮在缓冲液中以使可能影响连接的离子作用最小化。
例如,连接酶生成寡核苷酸标签的条件下,每轮提供的细胞条码区段可以包含如下结构:细胞条码区段以及位于细胞条码区段3’端的连接子序列为双链结构,位于细胞条码区段5’端的连接子序列为突出的单链结构,通过其与前一轮细胞条码区段5’端连接子序列至少部分互补配对形成双链结构。
例如,使用连接反应将各细胞条码区段连接形成寡核苷酸标签的实例可以如图2或图4所示。
例如,通过聚合酶链式反应(PCR)将各细胞条码区段连接形成寡核苷酸标签。例如,所述聚合酶链式反应可以通过如下任意一种或多种聚合酶:Taq DNA聚合酶、Super Taq DNA聚合酶、LA Taq DNA聚合酶、UlltraPF DNA聚合酶、Tth DNA聚合酶、Pfu DNA聚合酶、VentR DNA聚合酶、Phusion DNA聚合酶、KOD DNA聚合酶、Iproof DNA聚合酶。例如,所述聚合酶链式反应中还可以包括使得上述聚合酶保持活性的缓冲液、金属离子;例如,所述聚合酶链式反应中还可以包括dNTP和或其修饰衍生物。
例如,在聚合酶链式反应(PCR)生成寡核苷酸标签的条件下,每轮提供细胞条码区段的互补序列,所述互补序列为单链结构,5’端和3’端各自具有单链结构的连接子序列,其中,5’端的连接子序列能够与前一轮连接的细胞条码区段3’端的连接子序列至少部分互补配对形成双链结构,3’端的连接子序列能够与后一轮连接的细胞条码区段5’端的连接子序列至少部分互补配对形成双链结构。
例如,通过聚合酶链式反应(PCR)将各细胞条码区段连接形成寡核苷酸标签实例可以如图1或图3所示。
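Attached target nucleic acid
In this application, the target nucleic acid comprises one or more selected from the following: DNA, RNA and cDNA. For example, the target nucleic acid comprises cDNA derived from RNA in the single cell. For example, the RNA comprises mRNA.
In this application, an oligonucleotide adapter sequence is added to the target nucleic acid, which thereby becomes an attached target nucleic acid. For example, the oligonucleotide adapter sequence is located at the 5' end of the target nucleic acid.
For example, the oligonucleotide adapter sequence may comprise a nucleotide sequence L complementary to the second portion of the second strand in the oligonucleotide tag; the length of nucleotide sequence L may be the same as or different from the length of the second portion of the second strand in the oligonucleotide tag. For example, the length of nucleotide sequence L may be 1 nucleotide or more, 2 or more, 3 or more, 5 or more, 8 or more, 10 or more, 12 or more, 15 or more, 20 or more, 22 or more, 25 or more, or 30 nucleotides or more.
For example, nucleotide sequence L can pair with the second portion of the second strand in the oligonucleotide tag to form a double-stranded structure. For example, the double-stranded structure does not exclude possible mismatches between bases. For example, the sequence of nucleotide sequence L need not be 100% complementary to the sequence of the second portion of the second strand in the oligonucleotide tag. For example, the complementarity may be 60% or more, 65% or more, 70% or more, 75% or more, 80% or more, 85% or more, 90% or more, 95% or more, 98% or more, 99% or more, or 99.5% or more. The remaining non-complementary nucleotides may be clustered or interspersed with complementary nucleotides and need not be adjacent to each other or to complementary nucleotides. For example, a polynucleotide may hybridize over one or more segments such that intervening or adjacent segments are not involved in the hybridization event (e.g., forming a hairpin structure, a "bulge", etc.).
For example, the nucleotide adapter sequence comprises a transposon end sequence. For example, the transposon end sequence is a Tn5 or modified Tn5 transposon end sequence. For example, the transposon end sequence is a Mu transposon end sequence. For example, the Tn5 or modified Tn5 transposon end sequence or the Mu transposon end sequence may comprise 15 to 25 nucleotides, for example 16, 17, 18, 19, 20, 21, 22, 23 or 24 nucleotides.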
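For example, the Tn5 mosaic end sequence A14 (Tn5MEA) and/or the Tn5 mosaic end sequence B15 (Tn5MEB) (including the complementary non-transferred sequence (NTS) set out below) may be used as the transposon end sequence.
Tn5MEA: 5'-TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG-3' (SEQ ID NO: 2)
Tn5MEB: 5'-GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG-3' (SEQ ID NO: 3)
Tn5NTS: 5'-CTGTCTCTTATACACATCT-3' (SEQ ID NO: 4)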
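For example, the RNA is reverse transcribed before step a) of the method described in the present application, producing the attached target nucleic acid. For example, in the reverse transcription, a first-strand synthesis primer is used to synthesize the first cDNA strand from the mRNA in each mRNA sample. For example, the first-strand synthesis primer includes an oligo-dT primer. For example, the first-strand synthesis primer used in the reverse transcription may be a reverse transcription primer comprising, in the 5’ to 3’ direction, the oligonucleotide adaptor sequence and a polyT sequence. For example, the reverse transcription includes hybridizing the polyT sequence to the RNA and extending the polyT sequence in a template-directed manner. For example, the first-strand synthesis primer is a random primer. For example, the first-strand synthesis primer is a mixture of oligo-dT primers and random primers. For example, the method further includes incorporating a template-switching oligonucleotide primer (TSO primer) together with the mixture of oligo-dT primers and random primers. For example, the second cDNA strand is synthesized using the TSO primer. For example, the second cDNA strand is synthesized using a second amplification primer complementary to the first cDNA strand, where the first strand extends beyond the mRNA template and therefore includes the sequence complementary to the TSO.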
For example, the target nucleic acid includes DNA derived from the single cell.
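For example, the DNA includes genomic DNA, open-chromatin DNA, protein-bound DNA regions, and/or exogenous nucleic acid linked to a protein, lipid, and/or small-molecule compound, where the protein, lipid, and/or small-molecule compound is capable of binding a target molecule in the cell. For example, the protein may include an antibody or an antigen. For example, the target molecule may include a target nucleic acid sequence to be analyzed in the cell. For example, the DNA derived from the single cell is fragmented before step a) of the method described in the present application. For example, DNA fragmentation may include separating or breaking DNA strands into small pieces or segments. For example, the DNA may be fragmented by a variety of methods, including restriction digestion and various methods that generate shear force; for example, the oligonucleotide adaptor sequence may be attached after the DNA is fragmented (in which case the attached oligonucleotide adaptor sequence does not contain the transposon end sequence). For example, restriction digestion may use restriction endonucleases to cut the DNA sequence, either by blunt-end cleavage of both strands or by staggered cleavage that produces sticky ends. For example, shear-force-mediated DNA strand breakage may include sonication, acoustic shearing, needle shearing, pipetting, or nebulization. Sonication is a type of hydrodynamic shearing that exposes DNA sequences to brief shear forces and can produce fragment sizes of about 700 bp. Acoustic shearing applies high-frequency acoustic energy to a DNA sample within a bowl-shaped transducer. Needle shearing generates shear force by passing DNA through a small-diameter needle, physically tearing the DNA into smaller segments. Nebulization force may be generated by passing DNA through the small orifice of a nebulizer unit, in which the resulting DNA fragments are collected from the fine mist leaving the unit. In general, these fragments may be of any length between about 200 and about 100000 bases. For example, the fragments will be about 200 bp to about 500 bp, about 500 bp to about 1 kb, about 1 kb to about 10 kb, or about 5 kb to about 50 kb, or about 10 kb to about 30 kb, for example about 15 kb to about 25 kb. For example, fragmentation of larger genetic components may be carried out by any conveniently available method, including commercially available shear-based fragmentation systems (for example, the Covaris fragmentation system), size-targeted fragmentation systems (for example, Blue Pippin (Sage Sciences)), enzymatic fragmentation methods (for example, DNA endonucleases and DNA exonucleases), and the like. For example, the fragmentation includes sonication, after which a sequence comprising the oligonucleotide adaptor is added to the fragmented DNA to obtain the attached target nucleic acid.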
例如,在所述片段化之后或者在所述片段化的过程中产生所述经附接的目标核酸。例如,所述片段化包括使用转座酶-核酸复合物将包含所述寡核苷酸衔接子的序列整合到所述DNA中,并释放所述转座酶以获得所述经附接的目标核酸。
For example, the transposase includes Staphylococcus aureus Tn552 (Colegio et al., J. Bacteriol., 183:2384-8, 2001; Kirby C et al., Mol. Microbiol., 43:173-86, 2002), Ty1 (Devine and Boeke, Nucleic Acids Res., 22:3765-72, 1994, and International Publication WO 95/23875), transposon Tn7 (Craig, N L, Science, 271:1512, 1996; Craig, N L, review in Curr Top Microbiol Immunol., 204:27-48, 1996), Tn10 and IS10 (Kleckner N et al., Curr Top Microbiol Immunol., 204:49-82, 1996), Mariner transposase (Lampe D J et al., EMBO J., 15:5470-9, 1996), Tc1 (Plasterk R H, Curr. Topics Microbiol. Immunol., 204:125-43, 1996), P element (Gloor, G B, Methods Mol. Biol., 260:97-114, 2004), Tn3 (Ichikawa and Ohtsubo, J Biol. Chem., 265:18829-32, 1990), bacterial insertion sequences (Ohtsubo and Sekine, Curr. Top. Microbiol. Immunol., 204:1-26, 1996), retroviruses (Brown et al., Proc Natl Acad Sci USA, 86:2525-9, 1989), and yeast retrotransposons (Boeke and Corces, Annu Rev Microbiol., 43:403-34, 1989), as well as IS5, Tn10, Tn903, IS911, and engineered forms of transposase family enzymes (Zhang et al., (2009) PLoS Genet. 5:e1000689, Epub 16 October 2009; Wilson C. et al. (2007) J. Microbiol. Methods 71:332-5).
例如,所述转座酶-核酸复合物包含转座酶以及转座子末端核酸分子,其中所述转座子末端核酸分子包含所述寡核苷酸衔接子序列。
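For example, the transposase is Mu transposase. For example, the transposase is Tn5 transposase or Tn10 transposase. The Tn5 transposase is selected from full-length Tn5 transposase, a partial functional domain of Tn5 transposase, and a Tn5 transposase mutant. The Tn10 transposase is selected from full-length Tn10 transposase, a partial functional domain of Tn10 transposase, and a Tn10 transposase mutant. For example, the Tn5 transposase mutant may be selected from: R30Q, K40Q, Y41H, T47P, E54K/V, M56A, R62Q, D97A, E110K, D188A, Y319A, R322A/K/Q, E326A, K330A/R, K333A, R342A, E344A, E345K, N348A, L372P, S438A, K439A, S445A, G462D, A466D.
For example, the two transposase molecules may bind the same or different double-stranded DNA transposons, so that the insertion site is labeled with one or two kinds of DNA. For example, the two transposase molecules (for example, Tn5, including hyperactive Tn5 carrying point mutations, or other types of transposase) may be assembled with one oligonucleotide adaptor sequence and one additional standard transposon DNA sequence into a hybrid transposition complex, or only the double-stranded structure 2 may be used to form a single-species Tn5 transposition complex. The standard transposon DNA sequence may comprise an amplification primer sequence and/or a sequencing primer sequence.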
例如,所述DNA可以包括与蛋白质结合的DNA区域,且所述转座酶-核酸复合物中还包含直接或间接识别所述蛋白质的部分。例如,所述直接或间接识别所述蛋白质的部分可以包括金黄色葡萄球菌蛋白质A(ProteinA)、链球菌蛋白质G(ProteinG)、链球菌蛋白质L(ProteinL)或其他具有结合抗体功能的蛋白类似物。例如,所述直接或间接识别所述蛋白质的部分还可以包括特异性结合所述蛋白质的抗体。例如,所述金黄色葡萄球菌蛋白质A(ProteinA)、链球菌蛋白质G(ProteinG)、链球菌蛋白质L(ProteinL)或其他具有结合抗体功能的蛋白类似物各自能够结合所述特异性结合所述蛋白质的抗体。
例如,所述转座酶与所述金黄色葡萄球菌蛋白质A(ProteinA)、链球菌蛋白质G(ProteinG)、链球菌蛋白质L(ProteinL)或其他具有结合抗体功能的蛋白类似物形成融合蛋白。
例如,所述融合蛋白与所述特异性结合所述蛋白质的抗体结合形成复合物,之后靶向所述蛋白质。
例如,所述特异性结合所述蛋白质的抗体与所述蛋白质结合,之后所述融合蛋白与所述抗体结合从而靶向所述蛋白质。
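For example, the oligonucleotide adaptor sequence may further comprise an antibody recognition sequence, which is used to identify and track the corresponding antibody among different antibodies. The antibody recognition sequence may be generated in a manner similar to random primers.
For example, the attached target nucleic acid comprises a unique molecular identifier region. The unique molecular identifier (UMI) refers to a unique nucleic acid sequence attached to each of a plurality of nucleic acid molecules. For example, when incorporated into nucleic acid molecules, UMIs can be used to correct for subsequent amplification bias by directly counting the UMIs sequenced after amplification. For example, the method includes identifying an individual nucleic acid sequence in the barcoded target nucleic acid as originating from a given nucleic acid in the target nucleic acid, based at least in part on the presence of the unique molecular identifier region. For example, the method includes determining the amount of a given nucleic acid in the target nucleic acid based on the presence of the unique molecular identifier region. UMIs can be designed, incorporated, and applied in ways known in the art, for example as disclosed in WO 2012/142213, Islam et al., Nat. Methods (2014) 11:163-166, and Kivioja, T. et al., Nat. Methods (2012) 9:72-74, each of which is incorporated herein by reference in its entirety. For example, the unique molecular identifier region is located between the oligonucleotide adaptor sequence and the target nucleic acid sequence.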
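The idea of correcting amplification bias by counting UMIs can be sketched as follows. This is an illustrative simplification, not the procedure of the cited references: it assumes each read has already been assigned a cell barcode, a UMI, and a feature (a gene or genomic bin), and it ignores sequencing errors in the UMI. The function name `count_molecules` and the example reads are hypothetical.

```python
from collections import defaultdict

def count_molecules(reads):
    """Count unique molecules per (cell barcode, feature) by collapsing UMIs.

    reads: iterable of (cell_barcode, umi, feature) tuples. Reads sharing
    all three values are treated as PCR duplicates of one original molecule.
    """
    umis = defaultdict(set)
    for cell_barcode, umi, feature in reads:
        umis[(cell_barcode, feature)].add(umi)
    return {key: len(umi_set) for key, umi_set in umis.items()}

reads = [
    ("cellA", "ACGTACG", "GAPDH"),
    ("cellA", "ACGTACG", "GAPDH"),  # PCR duplicate of the first read
    ("cellA", "TTGCAAC", "GAPDH"),  # a second original molecule
    ("cellB", "ACGTACG", "GAPDH"),  # same UMI in another cell: counted separately
]
molecule_counts = count_molecules(reads)
```

Four reads collapse to two molecules in one cell and one molecule in the other, which is the quantity the text refers to as the amount of a given nucleic acid.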
例如,所述目标核酸还可以包括外源核酸,所述外源核酸包括与蛋白、脂类和/或小分子化合物连接的外源核酸,所述蛋白、脂类和/或小分子化合物能够与细胞内的靶分子结合。例如,所述蛋白可以包括抗体、抗原。例如,所述靶分子可以包括细胞内待分析的目标核酸序列。
例如,本文所述的转座反应和方法是批量执行的,然后将生物颗粒(例如,来自单细胞的细胞核/细胞/染色质)分配,使得多个离散分区被生物颗粒(例如,细胞、细胞核、染色质或细胞珠)单独占据。例如,可以将多个生物颗粒分配到多个离散分区中,使得多个离散分区中的离散分区包括单个生物颗粒。
固体支持物
在本申请中,所述固体支持物可以包括珠粒。例如,珠粒可以是多孔的、无孔的和/或其组合。例如珠粒可以是固体的、半固体的、半流体的、流体的和/或其组合。例如,珠粒可以是可溶解的、可破坏的和/或可降解的。例如,珠粒可以是不可降解的。例如,珠粒可以是凝胶珠粒。凝胶珠粒可以是水凝胶珠粒。凝胶珠粒可以由分子前体形成,例如聚合物或单体物质。半固体珠粒可以是脂质体珠粒。固体珠粒可包含金属,包括氧化铁、金和银。例如,珠粒可以是二氧化硅珠粒。例如,所述珠粒为磁性珠粒。例如,珠粒可以是刚性的。例如,珠粒可以是柔性的和/或可压缩的。
例如,珠粒可具有任何合适的形状。例如,珠粒的形状可以包括但不限于球形、非球形、椭圆形、长圆形、无定形、圆形、圆柱形及其变形形式。
例如,珠粒可具有均匀尺寸或不均匀尺寸。例如,珠粒的直径可以是至少约10nm、100nm、500nm、1μm、5μm、10μm、20μm、30μm、40μm、50μm、60μm、70μm、80μm、90μm、100μm、250μm、500μm、1mm或更大。例如,珠粒的直径可小于约10nm、100nm、500nm、1μm、5μm、10μm、20μm、30μm、40μm、50μm、60μm、70μm、80μm、90μm、100μm、250μm、500μm、1mm或更小。例如,珠粒的直径可以在约40-75μm、30-75μm、20-75μm、40-85μm、40-95μm、20-100μm、10-100μm、1-100μm、20-250μm或20-500μm的范围内。
例如,珠粒可以以具有相对单分散尺寸分布的珠粒群体或多个珠粒的方式提供。在需要在离散分区内提供相对一致量的试剂的情况下,保持相对一致的珠粒特性(例如尺寸)可有助于整体一致性。特别地,本文所述的珠粒可具有其横截面尺寸的变异系数小于50%、小于40%、小于30%、小于20%,并且例如小于15%、小于10%、小于5%或更小的尺寸分布。
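The coefficient of variation mentioned above is the standard deviation of the bead cross-section divided by its mean. A minimal sketch, using hypothetical diameter measurements and the sample standard deviation (the text does not say which estimator is intended):

```python
import statistics

def coefficient_of_variation(diameters_um):
    """Return the CV in percent: sample standard deviation / mean * 100."""
    mean = statistics.mean(diameters_um)
    return 100.0 * statistics.stdev(diameters_um) / mean

# Hypothetical measured bead diameters in micrometers
diameters = [29.0, 30.0, 31.0, 30.0, 30.0]
cv_percent = coefficient_of_variation(diameters)
is_monodisperse = cv_percent < 5.0  # one of the thresholds listed in the text
```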
例如,珠粒可包含天然和/或合成材料。例如,珠粒可包含天然聚合物、合成聚合物或天然和合成聚合物。天然聚合物可以包括蛋白质和糖,例如脱氧核糖核酸、橡胶、纤维素、淀粉(例如,直链淀粉、支链淀粉)、蛋白质、酶、多糖、丝、聚羟基链烷酸酯、壳聚糖、葡聚糖、胶原、角叉菜胶、卵叶车前子、阿拉伯胶、琼脂、明胶、虫胶、梧桐树胶、黄原胶、玉米糖胶、瓜尔胶、刺梧桐树胶、琼脂糖、海藻酸、藻酸盐或其天然聚合物。合成聚合物可以包括丙烯酸类、尼龙、硅氧烷、氨纶、粘胶人造丝、多元羧酸、聚乙酸乙烯酯、聚丙烯酰胺、聚丙烯酸酯、聚乙二醇、聚氨酯、聚乳酸、二氧化硅、聚苯乙烯、聚丙烯腈、聚丁二烯、聚碳酸酯、聚乙烯、聚对苯二甲酸乙二醇酯、聚三氟氯乙烯、聚环氧乙烷、聚对苯二甲酸乙二醇酯、聚异丁烯、聚甲基丙烯酸甲酯、聚甲醛、聚丙烯、聚苯乙烯、聚四氟乙烯、聚乙烯醇、聚氯乙烯、聚偏二氯乙烯、聚偏二氟乙烯、聚氟乙烯和/或其组合(例如,共聚物)。珠粒也可以由除聚合物之外的材料形成,例如脂质、胶束、陶瓷、玻璃陶瓷、材料复合物、金属、其他无机材料等。
例如,珠粒可含有分子前体(例如,单体或聚合物),其可通过分子前体的聚合形成聚合物网络。例如,前体可以是已经聚合的物质,其能够通过例如化学交联进行进一步的聚合。例如,前体可包含丙烯酰胺或甲基丙烯酰胺单体、低聚物或聚合物中的一种或多种。例如,珠粒可包含预聚物,其是能够进一步聚合的低聚物。例如,可以使用预聚物制备聚氨酯珠粒。例如,珠粒可含有可进一步聚合在一起的单独聚合物。例如,可以通过不同前体的聚合产生珠粒,使得它们包含混合聚合物、共聚物和/或嵌段共聚物。例如,珠粒可在聚合物前体(例如,单体、低聚物、线性聚合物)、核酸分子(例如,寡核苷酸)、引物和其他实体之间包含共价键或离子键。例如,共价键可以是碳-碳键、硫醚键或碳-杂原子键。
例如,交联可以是永久的或可逆的,这取决于所用的特定交联剂。可逆交联可允许聚合物在适当条件下线性化或解离。例如,可逆交联还可以允许结合物质可逆地附接于珠粒表面。例如,交联剂可形成二硫键。例如,形成二硫键的化学交联剂可以是胱胺或改性的胱胺。
例如,二硫键可以在掺入珠粒的分子前体单元(例如,单体、低聚物或线性聚合物)或前体与核酸分子(例如,寡核苷酸)之间形成。例如,胱胺(包括改性的胱胺)是包含二硫键的有机试剂,其可以用作珠粒的单独单体或聚合物前体之间的交联剂。聚丙烯酰胺可以在胱胺或包含胱胺(例如,改性的胱胺)的物质存在下聚合,以产生包含二硫键的聚丙烯酰胺凝胶珠粒(例如,包含可化学还原的交联剂的可化学降解的珠粒)。二硫键可以允许在珠粒暴露于还原剂时使珠粒降解或溶解。
例如,壳聚糖(线性多糖聚合物)可以通过亲水链与戊二醛交联以形成珠粒。壳聚糖聚合物的交联可以通过由热、压力、pH变化和/或辐射引发的化学反应来实现。
例如,珠粒可以是琼脂糖、聚烯酰胺、PEG等各种单体聚合而成的单一或混合单体的大分子,或是几丁质,玻尿酸、葡聚糖等大分子凝胶,使用微流控液滴平台,在液滴中聚合为大小均一的凝胶珠粒。
例如,珠粒可包含acrydite部分,其在某些方面可用于将一个或多个核酸分子(例如,条形码序列、条形码化核酸分子、条形码化寡核苷酸、引物或其他寡核苷酸)附接到珠粒。例如,acrydite部分可以指由acrydite与一种或多种物质的反应,例如acrydite与其他单体和交联剂在聚合反应期间的反应所产生的acrydite类似物。可以修饰acrydite部分以与待附接的物质形成化学键,例如核酸分子(例如条形码序列、条形码化核酸分子、条形码化寡核苷酸、引物或其他寡核苷酸)。acrydite部分可以用能够形成二硫键的硫醇基团改性,或者可以用已经包含二硫键的基团改性。硫醇或二硫化物(通过二硫化物交换)可以用作待附接物质的锚点,或者acrydite部分的另一部分可以用于附接。例如,附接可以是可逆的,使得当二硫键断裂时(例如,在还原剂存在下),附接的物质从珠粒中释放出来。在其他情况下,acrydite部分可包含可用于附接的反应性羟基。除了二硫键之外,还可以包括其他的释放方式,例如UV光促释放,或者可以用酶释放
离散分区和微流控装置
本申请提供了用于将固体支持物(例如珠粒)与样品共分配的装置,例如,用于共同分配样品组分和珠粒至同一离散分区。例如,将所述源于单个细胞的目标核酸与所述附接有至少一个寡核苷酸标签的固体支持物共分配至所述离散分区中。
例如,该装置可以由任何合适的材料形成。例如,装置可由选自下组的材料形成:熔融 二氧化硅、钠钙玻璃、硼硅酸盐玻璃、聚(甲基丙烯酸甲酯)PMMA、PDMS、蓝宝石、硅、锗、环烯烃共聚物、聚乙烯、聚丙烯、聚丙烯酸酯、聚碳酸酯、塑料、热固性塑料、水凝胶、热塑性塑料、纸、弹性体及其组合。
例如,所述离散分区可以包括孔或微滴。例如,将所述源于单个细胞的目标核酸与所述附接有至少一个寡核苷酸标签的固体支持物共分配至所述孔或微滴中。例如,所述孔可以包括细胞培养板的上样孔或者其他任何能够与所述装置配合并适于共分配的容器孔。例如,所述离散分区为微滴。例如,其中每个所述离散分区至多包括源自单个细胞的所述目标核酸。例如,所述目标核酸位于单个细胞或细胞核中。例如,使用微流控装置将所述源于单个细胞的目标核酸与所述附接有至少一个寡核苷酸标签的固体支持物共分配至所述离散分区中。
例如,离散分区(例如,液滴或孔)包含单细胞并根据本申请所述的方法进行处理。例如,离散分区包含单细胞和/或单细胞核。可以根据本申请所述的方法分配和处理单细胞和/或单细胞核。例如,单细胞核可以是细胞的组成部分。例如,离散分区包含来自单细胞或单细胞核的染色质(例如,单染色体或基因组的其他部分),并且根据本申请所述的方法进行分配和处理。
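For example, the discrete partition further comprises a ligase, and the ligase joins the oligonucleotide tag to the attached target nucleic acid. The discrete partition is not limited to containing a ligase and may contain other enzymes as needed, for example DNA polymerase, DNA endonuclease, DNA exonuclease, terminal transferase, and a light-sensitive or pH-sensitive enzyme capable of releasing the oligonucleotide tag from the solid support. The ligase includes, but is not limited to, T4 ligase; for example, it may also include E. coli DNA ligase, T4 DNA ligase, T7 DNA ligase, mammalian ligases (for example, DNA ligase I, DNA ligase III, DNA ligase IV), thermostable ligases, and the like.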
例如,以包含流体流动通道的方式形成所述装置。可以使用任何合适的通道。例如,装置包含一个或多个流体输入通道(例如,入口通道)和一个或多个流体出口通道。例如,流体通道的内径可以为约10μm、20μm、30μm、40μm、50μm、60μm、65μm、70μm、75μm、80μm、85μm、90μm、100μm、125μm或150μm。例如,流体通道的内径可以大于10μm、20μm、30μm、40μm、50μm、60μm、65μm、70μm、75μm、80μm、85μm、90μm、100μm、125μm、150μm或更大。例如,流体通道的内径可以小于约10μm、20μm、30μm、40μm、50μm、60μm、65μm、70μm、75μm、80μm、85μm、90μm、100μm、125μm或150μm。流体通道内的体积流速可以是本领域已知的任何流速。
例如,所述微流控装置为微滴发生器。例如,可以使用微流控装置通过形成同时包含附接有至少一个寡核苷酸标签的固体支持物和样品的水性小液滴而使附接有至少一个寡核苷酸 标签的固体支持物与样品(例如,包含目标核酸样品)组合。所述水性小液滴作为离散分区。该水性小液滴可以是被油相包围的水性核心,例如,油包水乳液内的水性小液滴。该水性小液滴可含有一个或多个附接有至少一个寡核苷酸标签的固体支持物、样品、扩增试剂和还原剂。例如,该水性小液滴可包含以下的一种或多种:水、无核酸酶的水、附接有至少一个寡核苷酸标签的固体支持物、乙腈、固体支持物、凝胶固体支持物、聚合物前体、聚合物单体、聚丙烯酰胺单体、丙烯酰胺单体、可降解的交联剂、不可降解的交联剂、二硫键、acrydite部分、PCR试剂、细胞、细胞核、叶绿体、线粒体、核糖体、引物、聚合酶、条形码、多核苷酸、寡核苷酸、DNA、RNA、肽多核苷酸、互补DNA(cDNA)、双链DNA(dsDNA)、单链DNA(ssDNA)、质粒DNA、粘粒DNA、染色体DNA、基因组DNA、叶绿体DNA、线粒体DNA、核糖体RNA、病毒DNA、细菌DNA、mtDNA(线粒体DNA)、mRNA、rRNA、tRNA、nRNA、siRNA、snRNA、snoRNA、scaRNA、微RNA、dsRNA、探针、染料、有机物、乳化剂、表面活性剂、稳定剂、聚合物、适体、还原剂、引发剂、生物素标记物、荧光团、缓冲液、酸性溶液、碱性溶液、光敏感的酶、pH敏感的酶、水性缓冲液、油、盐、去污剂、离子型去污剂、非离子型去污剂,等等。总之,该水性小液滴的组成将根据特定的处理需求而改变。
水性小液滴可以具有均匀的大小或不均匀的大小。例如,水性小液滴的直径可以为约1μm、5μm、10μm、20μm、30μm、40μm、45μm、50μm、60μm、65μm、70μm、75μm、80μm、90μm、100μm、250μm、500μm或1mm。例如,流体小液滴可以具有至少约1μm、5μm、10μm、20μm、30μm、40μm、45μm、50μm、60μm、65μm、70μm、75μm、80μm、90μm、100μm、250μm、500μm、1mm或更大的直径。例如,流体小液滴可以具有小于约1μm、5μm、10μm、20μm、30μm、40μm、45μm、50μm、60μm、65μm、70μm、75μm、80μm、90μm、100μm、250μm、500μm或1mm的直径。例如,流体小液滴可以具有在约40-75μm、30-75μm、20-75μm、40-85μm、40-95μm、20-100μm、10-100μm、1-100μm、20-250μm或20-500μm的范围内的直径。
如上文所述,所述微流控装置(例如,小液滴发生器)可用于将样品与固体支持物(例如,条形码化附接有至少一个寡核苷酸标签的固体支持物的文库)以及(在需要的情况下)能够降解固体支持物的试剂(例如,如果固体支持物以二硫键连接,则是还原剂)组合。例如,可向与第一流体交叉点(例如,第一流体接合处)流体连接的第一流体输入通道提供样品(例如,核酸样)。可以向同样与第一流体交叉点流体连接的第二流体输入通道提供预形成的固体支持物(例如,附接有至少一个寡核苷酸标签的固体支持物,例如可降解的固体支持物),其中第一流体输入通道与第二流体输入通道在该第一流体交叉点交汇。样品和附接有至 少一个寡核苷酸标签的固体支持物可以在第一流体交叉点混合以形成混合物(例如,水性混合物)。例如,可向第四流体输入通道提供还原剂(或其他需要的试剂,例如表面活性剂、稳定剂、聚合物、适体、引发剂、生物素标记物、荧光团、缓冲液、酸性溶液、碱性溶液、光敏感的酶、pH敏感的酶、水性缓冲液等),该第四流体输入通道同样与第一流体交叉点流体连接,并且与第一和第二流体输入通道在第一流体交叉点交汇。然后,还原剂可以与附接有至少一个寡核苷酸标签的固体支持物和样品在第一流体交叉点混合。例如,还可以在进入微流控装置之前将还原剂(或其他需要的试剂,例如表面活性剂、稳定剂、聚合物、适体、引发剂、生物素标记物、荧光团、缓冲液、酸性溶液、碱性溶液、光敏感的酶、pH敏感的酶、水性缓冲液等)与样品和/或附接有至少一个寡核苷酸标签的固体支持物预混合,使得通过第一流体输入通道向微流控装置提供样品和/或通过第二流体输入通道向微流控装置提供附接有至少一个寡核苷酸标签的固体支持物。
例如,包含目标核酸的样品和附接有至少一个寡核苷酸标签的固体支持物混合物可以通过与第一流体交叉点(并与构成第一流体交叉点的任何流体通道)流体连接的第一出口通道离开第一流体交叉点。可以向与第一出口通道流体连接的第二流体交叉点(例如,第二流体接合处)提供混合物。例如,油(或其他合适的不混溶的)流体可以从与第二流体交叉点(并与构成该交叉点的任何流体通道)流体连接且在第二流体交叉点与第一出口通道交汇的一个或多个单独的流体输入通道进入第二流体交叉点。例如,可以在与第二流体交叉点(并与第一出口通道)流体连接且在第二流体交叉点与第一出口通道以及彼此交汇的一个或两个单独的流体输入通道中提供油(或其他合适的不混溶的流体)。油以及样品与附接有至少一个寡核苷酸标签的固体支持物的混合物可以在第二流体交叉点混合。形成的水性小液滴可在油内被运送通过从第二流体交叉点离开的第二流体出口通道。例如,形成的水性小液滴还可从第一流体交叉点离开第二出口通道的流体小液滴可被分配到孔中以供进一步处理。
例如,还可以控制包含目标核酸的样品相对于附接有至少一个寡核苷酸标签的固体支持物的占有率。这种控制如美国专利申请公开号20150292988中描述,其全部公开内容为了所有目的通过引用以全文并入本文。通常,将包含目标核酸的样品形成小液滴,使得至少50%、60%、70%、80%、90%或更多的小液滴含有不超过一个附接有至少一个寡核苷酸标签的固体支持物。另外,使得至少50%、60%、70%、80%、90%或更多的包含目标核酸的样品形成小液滴包含恰好一个附接有至少一个寡核苷酸标签的固体支持物。
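How bead occupancy depends on loading can be illustrated with a Poisson model. This is an assumption made for illustration only; the cited publication may control occupancy by other means. Here `lam` is the mean number of beads per droplet.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k beads in a droplet under Poisson loading."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def occupancy(lam: float) -> dict:
    """Fractions of droplets with at most one, exactly one, and more than one bead."""
    p0 = poisson_pmf(0, lam)
    p1 = poisson_pmf(1, lam)
    return {
        "at_most_one": p0 + p1,
        "exactly_one": p1,
        "more_than_one": 1.0 - p0 - p1,
    }

dilute = occupancy(0.1)   # sparse loading: almost no droplet holds two beads
dense = occupancy(1.0)    # loading that maximizes single occupancy under Poisson
```

Under purely random loading the exactly-one fraction peaks at 1/e (about 37%) when `lam` is 1, so single-occupancy levels of 50% or more would require bead delivery that is more regular than Poisson.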
例如,可以在混合物进入微流控装置中之前将样品与包含任何其他试剂(例如,样品扩增所需的扩增剂、还原剂等)的附接有至少一个寡核苷酸标签的固体支持物(例如,可降解 的固体支持物)预混合以产生水性反应混合物。在水性混合物进入流体装置时,该混合物可从第一流体输入通道流动并进入流体交叉点。例如,油相可以从同样与流体交叉点流体连接的第二流体输入通道(例如,与第一流体输入通道垂直或基本垂直的流体通道)进入流体交叉点。该水性混合物和油可以在流体交叉点混合,使得油包水乳液(例如,固体支持物-水-油乳液)形成。该乳液可包含在连续油相中的多个水性小液滴(例如,包含水性反应混合物的小液滴)。例如,每个水性小液滴可包含单个固体支持物(例如,附接至一组相同的条形码的凝胶固体支持物)、样品的等份(例如来自一个细胞的目标核酸)以及任何其他试剂(例如,还原剂、样品扩增所需的试剂等)的等份。例如,流体小液滴可包含多个附接有至少一个寡核苷酸标签的固体支持物。在小液滴形成时,小液滴可通过连续油相被运送通过离开流体交叉点的流体出口通道。离开出口通道的流体小液滴可被分配到孔中以供进一步处理。
在可在进入微流控装置之前将还原剂添加至样品或者可在第一流体交叉点添加还原剂的情况下,在第二流体交叉点形成的流体小液滴可含有还原剂。在这种情况下,当小液滴穿过离开第二流体交叉点的出口通道行进时,还原剂可降解或溶解流体小液滴内含有的固体支持物。
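For example, the microfluidic device may contain three discrete fluid junctions arranged in parallel. Fluid droplets may be formed at any of the three fluid junctions. The sample and the solid support with at least one oligonucleotide tag attached may be mixed at any of the three fluid junctions. A reducing agent (or any other required reagent, for example a permeabilization agent, an amplification agent, or a cleavage agent that releases the oligonucleotide tag from the solid support) may be added at any of the three fluid junctions. Oil may be added at any of the three fluid junctions.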
例如,所述微流控装置包括第一输入通道和第二输入通道,它们在与输出通道流体连接的接合处汇合。例如,出口通道可以与第三输入通道在接合处流体连接。
例如,所述方法还包括将包含所述目标核酸的样品引入所述第一输入通道,且将附接有至少一个寡核苷酸标签的所述固体支持物引入所述第二输入通道,从而在所述输出通道中生成所述样品与所述附接有至少一个寡核苷酸标签的固体支持物的混合物。
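For example, a fourth input channel may also be included, which may intersect the third input channel and the outlet channel at a junction. For example, the microfluidic device may comprise first, second, and third input channels, where the third input channel intersects the first input channel, the second input channel, or the junction of the first and second input channels. For example, the output channel is fluidly connected to the third input channel at a junction. For example, the first input channel and the second input channel form a substantially perpendicular angle with each other.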
例如,还包括将油引入所述第三输入通道,使得形成油包水乳液内的水性小滴作为所述 离散分区。例如,每个所述离散分区中至多包含来自单个细胞的所述目标核酸。
本申请的方法、组合物、装置和试剂盒可与任何合适的油一起使用。例如,油可用于产生微滴。例如,该油可以包括氟化油、硅油、矿物油、植物油及其组合。
例如,微流控装置内的水性流体也可含有醇。例如,醇可以是甘油、乙醇、甲醇、异丙醇、戊醇、乙烷、丙烷、丁烷、戊烷、己烷及其组合。该醇可以以约5%、6%、7%、8%、9%、10%、11%、12%、13%、14%、15%、16%、17%、18%、19%或20%(v/v)存在于水性流体内。例如,该醇可以以至少约5%、6%、7%、8%、9%、10%、11%、12%、13%、14%、15%、16%、17%、18%、19%、20%或更高(v/v)的浓度存在于水性流体内。例如,该醇可以以小于约5%、6%、7%、8%、9%、10%、11%、12%、13%、14%、15%、16%、17%、18%、19%或20%(v/v)存在于水性流体内。
例如,所述油也可含有表面活性剂以稳定乳液。例如,表面活性剂可以是含氟表面活性剂、Krytox润滑剂、Krytox FSH、工程化的流体、HFE-7500、硅酮化合物、含PEG的硅化合物,如bis krytoxpeg(BKP)。该表面活性剂可以以约0.1%、0.5%、1%、1.1%、1.2%、1.3%、1.4%、1.5%、1.6%、1.7%、1.8%、1.9%、2%、5%或10%(w/w)存在。例如,该表面活性剂可以以至少约0.1%、0.5%、1%、1.1%、1.2%、1.3%、1.4%、1.5%、1.6%、1.7%、1.8%、1.9%、2%、5%、10%(w/w)或更高的浓度存在。例如,该表面活性剂可以以小于约0.1%、0.5%、1%、1.1%、1.2%、1.3%、1.4%、1.5%、1.6%、1.7%、1.8%、1.9%、2%、5%或10%(w/w)存在。
例如,可向油中添加加速剂和/或引发剂。例如,加速剂可以是四甲基乙二胺(TMEDA或TEMED)。例如,引发剂可以是过硫酸铵或钙离子。该加速剂可以以约0.1%、0.2%、0.3%、0.4%、0.5%、0.6%、0.7%、0.8%、0.9%、1%、1.1%、1.2%、1.3%、1.4%、1.5%、1.6%、1.7%、1.8%、1.9%或2%(v/v)存在。例如,该加速剂可以以至少约0.1%、0.2%、0.3%、0.4%、0.5%、0.6%、0.7%、0.8%、0.9%、1%、1.1%、1.2%、1.3%、1.4%、1.5%、1.6%、1.7%、1.8%、1.9%或2%(v/v)或更高的浓度存在。例如,该加速剂可以以小于约0.1%、0.2%、0.3%、0.4%、0.5%、0.6%、0.7%、0.8%、0.9%、1%、1.1%、1.2%、1.3%、1.4%、1.5%、1.6%、1.7%、1.8%、1.9%或2%(v/v)存在。
细胞和样品
在本申请中,所述细胞为任何生物体的细胞。所述生物体的细胞可以是体外细胞(例如,已建立的培养细胞系),可以是离体细胞(来自个体的培养细胞,原代细胞)。细胞可以是体内细胞(生物个体中的细胞),例如来自各种组织中的细胞。
例如所述生物体细胞可以包括动物细胞、植物细胞、微生物细胞。例如所述植物细胞可以包括拟南芥细胞,还可以包括农业作物的细胞,例如小麦,玉米,水稻,高粱,小米,大豆等植物体细胞;所述植物细胞还可以包括水果和坚果植物的细胞,例如产生杏,橙子,柠檬,苹果,李子,梨,杏仁、核桃等的植物体。例如所述植物细胞可以是来源于植物体任意部位的细胞,例如,是根细胞,叶细胞,木质部细胞,韧皮部细胞,形成层细胞,顶端分生组织细胞,薄壁组织细胞。
例如,所述微生物细胞可以包括细菌(例如大肠杆菌,古细菌)、真菌(例如酵母)、放线菌、立克次氏体、支原体、衣原体、螺旋体细胞等。
例如,所述动物细胞可以包括无脊椎动物(例如果蝇、线虫、涡虫等)细胞、脊椎动物(例如斑马鱼、鸡、哺乳动物)细胞。
例如,所述哺乳动物细胞可以包括小鼠、大鼠、兔子、猪、狗、猫、猴子、人类等。
例如,所述动物细胞可以包括来自生物体任何组织的细胞,例如干细胞、诱导性多能干(iPS)细胞、生殖细胞(例如卵母细胞,卵子细胞,精子细胞等),成体干细胞,体细胞(例如成纤维细胞,造血细胞,心肌细胞,神经元,肌肉细胞,骨细胞,肝细胞,胰腺细胞,上皮细胞,免疫细胞以及来源于肺、脾、肾、胃、大肠、小肠等器官或组织的任何细胞)以及胚胎的体外或体内任何阶段的细胞等。
例如,所述细胞可以是来自生物体液中的细胞。例如所述生物体的体液可以包括脑脊液、房水、淋巴液、消化液(例如唾液、胃液、小肠液、胆汁等)、乳汁、血液、尿液、汗液、泪液、粪便、呼吸道分泌物、生殖器官分泌物(例如精液、宫颈黏液)等。
所述样品包括所述细胞和/或由其获得的细胞核。
例如,所述样品可以包括所述生物体的核酸分子。所述核酸分子可以是通过所属领域技术人员已知的分离核酸分子的技术手段从任意生物体分离提取的,包括DNA和RNA。例如所述核酸分子提取自上述的生物体细胞或生物体的体液。
例如,所述目标核酸可以包括来自上述任何细胞中的核酸。例如,单个细胞中的核酸。
例如,所述目标核酸可以来自于单个细胞的多核苷酸,例如,双链DNA。例如所述双链DNA可以包括基因组DNA,例如,编码DNA和非编码DNA;例如,开放染色质区域DNA,蛋白结合处DNA,线粒体DNA和叶绿体DNA,例如所述多核苷酸可以包括RNA,例如核糖体RNA,mRNA。
For example, the target nucleic acid may also come from a formalin-fixed, paraffin-embedded (Formalin-Fixed and Paraffin-Embedded, FFPE) sample containing cells.
例如,所述目标核酸还可以包括生物体基因组中含有SNP位点的序列,甲基化、羟甲基化修饰的核苷酸序列。
例如,还可以对所述细胞进行预处理。例如,所述预处理还包括使所述细胞的细胞核被暴露。例如,可以通过裂解缓冲液和浓蔗糖溶液处理从而暴露细胞核。
例如,所述细胞和/或由其暴露(获得)的所述细胞核可以被包裹在合适的基质中形成微球,所述微球作为样品进行反应。
例如,所述预处理包括固定所述细胞和/或由其暴露(获得)的所述细胞核。例如,使用固定剂对所述细胞进行固定,所述固定剂选自下组中的一种或多种:甲醛、多聚甲醛、甲醇、乙醇、丙酮、戊二醛、锇酸和重铬酸钾。
其中所述预处理包括使用去垢剂处理所述细胞或细胞核,所述去垢剂包括Triton、NP-40和/或digitonin。
例如,所述预处理还可以包括去除线粒体、叶绿体、核糖体等细胞器。
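For example, cells may be partitioned together with a lysis reagent to release the contents of the cells within the discrete partition. For example, the lysis agent is brought into contact with the cell suspension at the same time as, or immediately before, the cells are introduced into the droplet generation region through an additional channel. The lysis agent may include biologically active reagents, such as lysis enzymes used to lyse different cell types (for example, gram-positive or gram-negative bacteria, plants, yeast, mammalian cells, and so on), for example lysozyme, achromopeptidase, lysostaphin, kitalase, lyticase, and other commercially available lysis enzymes. For example, other lysis agents may also be co-partitioned with the cells so that the cell contents are released into the discrete partition. For example, a surfactant-based lysis solution may be used to lyse the cells; for example, the lysis solution may include a non-ionic surfactant such as Triton X-100 or Tween 20. For example, the lysis solution may include an ionic surfactant such as sodium lauroyl sarcosinate or sodium dodecyl sulfate (SDS). For example, other available lysis methods, such as electroporation or thermal, acoustic, or mechanical cell disruption, may also be used.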
组合物和试剂盒
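The present application also provides a composition comprising a plurality of solid supports, each of which has at least one oligonucleotide tag attached. Each oligonucleotide tag comprises a first strand and a second strand; the first strand comprises a barcode sequence and a hybridization sequence located at the 3’ end of the barcode sequence; the second strand comprises a first portion complementary to the hybridization sequence of the first strand and a second portion complementary to a sequence in the nucleic acid to be tested; and the first strand and the second strand form a partially double-stranded structure, or the second strand and the attached target nucleic acid form a partially double-stranded structure. The barcode sequence of the oligonucleotide tag comprises a common barcode domain and a variable domain; the common barcode domain is identical among the oligonucleotide tags attached to the same solid support and differs between two or more of the plurality of solid supports. The present application also provides a kit for analyzing target nucleic acid from cells, which comprises the composition described in the present application. For example, the kit may further include a transposase. For example, the kit further comprises at least one of a nucleic acid amplification agent, a reverse transcription agent, a fixative, a permeabilization agent, a ligation agent, and a lysis agent.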
不欲被任何理论所限,下文中的实施例仅仅是为了阐释本申请的方法和用途等,而不用于限制本申请发明的范围。
实施例
实施例1 检测开放染色质区域(ATAC)
(1)制备包含条码序列的核苷酸标签,其被固定于固相支持物上。
该核苷酸标签有两条链,形成部分双链结构1,如下所示:
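Strand I: solid support ~ attachment sequence - barcode sequence (barcode) - hybridization sequence (a fixed sequence that hybridizes to the complementary portion of strand II), where the barcode sequence has the form (barcode-linker)n with n greater than or equal to 1.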
具体实例:Bead-acrydite-S-S-ACACTCTTTCCCTACACGACGCTCTTCCGATCT(read1,SEQ ID NO:6)-barcode-ATCCACGTGCTTGAG(SEQ ID NO:12)
链II:杂交序列(固定序列,与链I中固定DNA序列杂交)——与转座子复合物链I的5’端互补的序列
具体实例:CGAATGCTCTGGCCTCTCAAGCACGTGGAT(SEQ ID NO:9)
固体支持物是聚丙烯酰胺微球,其通过微流控设备制备,将丙烯酰胺:Bis混合物、以及acrydite-DNA引物、APS诱发剂在微流控装置中混合成为液滴,其中含有TEMED催化剂,液滴会自发聚合成为凝胶微球,之后微球按照barcode合成方式加标签。
在连接反应中,溶液中含有10mM DTT,S-S键可以被还原从而释放引物。
(2)制备转座子复合物,组装含有DNA部分双链序列的Tn5转座子。
其中含有的一个DNA序列为A链和B链退火形成双链结构2。
链A:磷酸基团——与链II中核酸分子的链I或链II中固定DNA序列中至少部分互补的序列——(UMI)——Tn5转座酶结合的序列
具体实例:AGGCCAGAGCATTCGNNNNNNNAGATGTGTATAAGAGACAG(SEQ ID NO:5)
链B:Tn5转座酶结合的序列(与链A中的转座子蛋白(Tn5)结合的序列互补的序列)——磷酸基团
具体实例:p-CTGTCTCTTATACACATCT(SEQ ID NO:4)
其中,A链中的UMI不是必须的;(1)和(2)中的序列中可以含有修饰碱基,如5mC。
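The base-pairing relationships among the four oligonucleotides above can be checked directly from the listed sequences. The sketch below is only a consistency check on the sequences as printed (SEQ ID NO:4, 5, 9, and 12); the variable names are hypothetical, and N is treated as pairing with N.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (N stays N)."""
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]

LIGATION_LINKER = "ATCCACGTGCTTGAG"                     # SEQ ID NO:12, 3' end of strand I
STRAND_II = "CGAATGCTCTGGCCTCTCAAGCACGTGGAT"            # SEQ ID NO:9
STRAND_A = "AGGCCAGAGCATTCGNNNNNNNAGATGTGTATAAGAGACAG"  # SEQ ID NO:5
STRAND_B = "CTGTCTCTTATACACATCT"                        # SEQ ID NO:4

# The 3' half of strand II pairs with the ligation linker on the bead-bound strand I.
anchors_on_bead = STRAND_II[15:] == reverse_complement(LIGATION_LINKER)

# The 5' half of strand II is a single-stranded overhang that pairs with
# the first 15 nt of strand A, bringing strand A next to the 3' end of strand I.
captures_strand_a = STRAND_II[:15] == reverse_complement(STRAND_A[:15])

# Strand B pairs with the 19-nt Tn5 mosaic end at the 3' end of strand A.
forms_mosaic_end = STRAND_B == reverse_complement(STRAND_A[-19:])

umi_length = STRAND_A.count("N")
```

All three checks hold for the printed sequences, and the UMI in this example is 7 nt long.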
Tn5转座复合物是二聚体,两个Tn5蛋白可以结合相同或者不同的部分双链DNA转座子,使得插入位点被1种或2种DNA所标记;Tn5蛋白(可以包含点突变超活性或其他类型的转座酶)可以和以上的双链结构2,以及另外一个标准的转座子DNA组装成杂合的转座复合物,或者只使用上述双链结构2形成单一的Tn5转复合物。
(3)制备样品。可以是非固定的细胞或细胞核,甲醛(或其他固定剂)固定的细胞或细胞核,非固定或固定的组织切片等。其中,固定或非固定样品用包含有去垢剂(Triton,NP-40或Digitonin等)的缓冲液处理,还可以包括裂解细胞(非固定样品)得到细胞核的中间步骤,去垢剂裂解或通透细胞和细胞核,使得Tn5酶可以进入细胞核作用。典型的通透剂溶液可以包括Tris,蔗糖,氯化钠,去垢剂。
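(4) Transposition reaction. Tn5 enzyme buffer containing a divalent metal ion (for example, magnesium ion) is added to the sample prepared above, the assembled Tn5 transposition complex is added, and the ATAC transposition reaction is carried out (37℃, 30 minutes to 2 hours). That is, the reaction system includes: cells, nuclei, or tissue; the Tn5 transposition complex; and buffer. After the reaction, the sample is washed with buffer to remove unreacted Tn5 enzyme.
(5) Ligation reaction. T4 DNA ligase reaction buffer, T4 DNA ligase, and the nucleotide tag from step (1) are added, and the ligation reaction is carried out at a suitable temperature (4℃ to 37℃) for at least 20 minutes.
The reaction system includes: cells, nuclei, or tissue (after the transposition reaction); T4 DNA ligase; and the nucleotide tag. After the reaction, an excess of free sequence complementary to the nucleotide tag is added to the ligation reaction system to block surplus unreacted nucleotide tags.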
(6)提取细胞中的DNA.对于非固定样品,直接加入裂解液后用DNA提取试剂盒,磁珠等方法纯化;对于固定样品,加入蛋白酶K反应缓冲液,蛋白酶K,在55-65℃进行解交联后纯化DNA。
对于纯化的DNA,1)如果是使用杂合Tn5进行,产物双侧都有PCR扩增序列,可以直接扩增,获得测序文库。
2)如果是使用单一的Tn5进行,那么DNA产物只有一侧有PCR引物,我们需要对这个DNA进行打断和连接,在另一侧加入扩增引物,这可以使用单一的Tn5酶进行,也可以使用超声或酶打断,然后末端加A,连接头,最终获得测序文库。
按照上述步骤,以人293T细胞为例,取新鲜细胞,制备细胞核,用杂合Tn5进行ATAC反应,然后连接一个Illumina测序文库P5端(read1)侧的扩增序列的接头,构建文库,用read1引物和杂合Tn5中另一个DNA片段的read2引物扩增产物,最后进行分析。具体步骤 如下:
A.Tn5转座子(transposome)
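The following sequences are annealed to form duplexes:
10 uM Top1: 5’p-AGGCCAGAGCATTCGNNNNNNNAGATGTGTATAAGAGACAG (SEQ ID NO:5) (strand A)
10 uM Top2: 5’-GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG (SEQ ID NO:3) (strand A)
20 uM Bottom: 5’p-CTGTCTCTTATACACATCT (SEQ ID NO:4) (strand B)
The duplexes are then incubated with 10 uM Tn5 enzyme (purchased from Epicentre) at room temperature to assemble Tn5 transposomes at a concentration of 10 uM. The transposome formed by the Top1/Bottom duplex and Tn5 is called p-Tn5, and the transposome formed by the Top2/Bottom duplex and Tn5 is called Tn5-B.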
B.珠子制备
1)Bead上附接的序列如下所示:Bead-S-S-PCR adaptor-barcode1-linker1-barcode2-linker2-barcode3-ligation linker
Here, the PCR adaptor sequence is ACACTCTTTCCCTACACGACGCTCTTCCGATCT (SEQ ID NO:6), linker sequence 1 (Linker1) is CGACTCACTACAGGG (SEQ ID NO:7), linker sequence 2 (Linker2) is TCGGTGACACGATCG (SEQ ID NO:8), and the ligation linker sequence is ATCCACGTGCTTGAG (SEQ ID NO:12). Barcode1 = 96 different 5 bp sequences, Barcode2 = 96 different 5 bp sequences, Barcode3 = 96 different 5 bp sequences.
2)合成3x96种序列
1.PCR handle-96xbarcode1-linker1,合成96个此序列的反向互补序列;
2.linker1-96xbarcode2-linker2,合成96个此序列的反向互补序列;
3.linker2-96xbarcode3-ligation linker,合成96个此序列的反向互补序列。
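The three sets of template oligos listed above can be generated programmatically. The following is a minimal sketch: the linker and handle sequences are those given in the text, while the three 5-nt barcodes are hypothetical placeholders (the actual 96 barcodes per round are not listed), and the PCR handle is assumed to be the PCR adaptor of SEQ ID NO:6.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

PCR_HANDLE = "ACACTCTTTCCCTACACGACGCTCTTCCGATCT"  # SEQ ID NO:6 (assumed to be the handle)
LINKER1 = "CGACTCACTACAGGG"                       # SEQ ID NO:7
LINKER2 = "TCGGTGACACGATCG"                       # SEQ ID NO:8
LIGATION_LINKER = "ATCCACGTGCTTGAG"               # SEQ ID NO:12

def round_templates(upstream: str, barcodes, downstream: str) -> dict:
    """Map each barcode to the reverse-complement template oligo for one round.

    The bead-bound strand ends in `upstream`; the template anneals to it and
    is copied by the polymerase, appending barcode + downstream to the bead.
    """
    return {bc: reverse_complement(upstream + bc + downstream) for bc in barcodes}

# Hypothetical 5-nt barcodes, for illustration only
example_barcodes = ["AACGT", "CCTAG", "GTTCA"]

round1 = round_templates(PCR_HANDLE, example_barcodes, LINKER1)
round2 = round_templates(LINKER1, example_barcodes, LINKER2)
round3 = round_templates(LINKER2, example_barcodes, LIGATION_LINKER)

# Three rounds of 96 barcodes give this many possible combinations
n_combinations = 96 ** 3
```

With 96 barcodes in each of three split-and-pool rounds, the design space is 884,736 barcode combinations.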
3)微球合成:
The following amino-modified sequence is synthesized: 5’amine-S-S-ACACTCTTTCCCTACACGACGCTCTTCCGATCT (SEQ ID NO:6). Carboxyl-modified microspheres of 30 um are also used (知益, www.kbspheres.com/productshow.asp?id=903).
偶联反应:微球+50mM EDC+100uM氨基序列(SEQ ID NO:6),将氨基序列和羧基微球偶联,获得如下结构:bead-S-S-ACACTCTTTCCCTACACGACGCTCTTCCGATCT(SEQ ID NO:6)
4)附接标签
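The synthesized microspheres are divided evenly into a 96-well plate, and the PCR handle-96xbarcode1-linker1 oligos are added, one per well, for the first round of barcoding. The reaction system and procedure are as follows: 10 ul microspheres + 2 ul BstI buffer + 1 ul 10 uM dNTP + 1 ul 100 uM PCR handle-96xbarcode1-linker1, held at 95℃ for 5 min and then at 60℃ for 20 min; 1 ul BstI + 5 ul H2O are then added and the reaction is held at 60℃ for 60 min.
After the first round of barcoding, all microspheres are collected, pooled, incubated at 95℃ for 5 min to remove the complementary strand, and washed, giving microspheres carrying the first-round tag (96xbarcode1-linker1). The microspheres are then divided evenly into a 96-well plate again, and linker1-96xbarcode2-linker2 (second round) and linker2-96xbarcode3-ligation linker (third round) are added in turn, following the first-round reaction system and procedure for the second and third rounds of barcoding. This yields single-stranded microspheres carrying three barcodes. After washing, the microspheres are annealed with the complementary sequence CGAATGCTCTGGCCTCTCAAGCACGTGGAT (SEQ ID NO:9) to form a partially double-stranded structure, finally giving microspheres with the following partially double-stranded structure attached: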
Bead-S-S-ACACTCTTTCCCTACACGACGCTCTTCCGATCT(SEQ ID NO:6)-barcode1-CGACTCACTACAGGG-barcode2-TCGGTGACACGATCG(SEQ ID NO:8)-barcode3-ATCCACGTGCTTGAG(SEQ ID NO:12)
3’-TAGGTGCACGAACTCTCCGGTCTCGTAAGC-5’(SEQ ID NO:9的反向排列)
C.ATAC实验
将人293T细胞系重悬在裂解液(10mM Tris–Cl,pH 7.4;10mM NaCl;3mM MgCl2;0.01%NP-40)中裂解细胞,获得细胞核。
取10万细胞核与步骤(1)获得的p-Tn5、Tn5-B进行反应,反应体系如下所示:
25ul 2xTD Buffer(Illumina)+2.5ul 10uM p-Tn5+2.5ul 10uM Tn5-B+20ul细胞核(10万个),37℃反应30min,PBS洗涤细胞核。
D.高通量标记
使用如图8所示的微流控芯片进行细胞标记,微球通道(Bead channel):100um,细胞核通道(Nuclei Channel):50um。
准备以下溶液:
细胞核溶液1ml(100细胞核/ul浓度),包括:200ul 10xT4 DNA ligase Buffer,10ul T4 DNA ligase,10ul 1M DTT,780ul细胞核/水。
bead溶液(100bead/ul浓度):Bead in PBS。
细胞核溶液、bead溶液、油(FC40氟碳油,含有1%表面活性剂FluoroSurfactant,Ran  Biotech)在微流控芯片上形成120um直径的液滴(drop collection),37℃连接1小时。
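The droplet size and stock concentrations above determine how many nuclei and beads are expected per droplet. The following is a rough sketch, assuming spherical droplets and assuming that the nuclei stream and the bead stream each contribute half of the droplet volume; the text does not state the mixing ratio, so `stream_fraction` is a hypothetical parameter.

```python
import math

def droplet_volume_nl(diameter_um: float) -> float:
    """Volume of a spherical droplet in nanoliters (1 nL = 1e6 cubic micrometers)."""
    radius_um = diameter_um / 2.0
    volume_um3 = 4.0 / 3.0 * math.pi * radius_um ** 3
    return volume_um3 * 1e-6

def mean_per_droplet(conc_per_ul: float, droplet_nl: float, stream_fraction: float) -> float:
    """Expected number of particles per droplet from one input stream."""
    conc_per_nl = conc_per_ul / 1000.0
    return conc_per_nl * droplet_nl * stream_fraction

volume_nl = droplet_volume_nl(120.0)                 # 120 um droplets
nuclei_per_droplet = mean_per_droplet(100.0, volume_nl, 0.5)
beads_per_droplet = mean_per_droplet(100.0, volume_nl, 0.5)
```

Under these assumptions a 120 um droplet holds about 0.9 nL and carries on average roughly 0.045 nuclei and 0.045 beads, so most droplets are empty and droplets with two nuclei are rare.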
E.建库
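An equal volume of perfluorooctanol is added to the droplets from step D to break them; after centrifugation, the aqueous phase is collected, and the DNA in the aqueous phase is purified with a Qiagen DNA purification kit. The DNA is amplified with the following reaction to obtain the final sequencing library: 36 ul DNA template, 10 ul 5x PCR Buffer, 1 ul 10 mM dNTP, 1 ul 10 uM primer TrueseqD501, 1 ul 10 uM primer N701, 1 ul Taq; 94℃ 2 min, then 18 cycles of 94℃ 30 sec, 55℃ 30 sec, 72℃ 30 sec.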
引物TrueseqD501序列:AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT(SEQ ID NO:10)
引物N701序列:CAAGCAGAAGACGGCATACGAGATATCGGCTAGTCTCGTGGGCTCGG(SEQ ID NO:11)
Illumina Novaseq每个细胞测10万个PE150读长(reads)。
以上文库投入约500细胞,每个细胞测序10万个PE150reads,总数据量为15G。
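The stated data volume follows from the cell number, read depth, and read length. A minimal check, assuming that "100,000 PE150 reads" means 100,000 read pairs of 2 x 150 bases per cell (the function name is hypothetical):

```python
def sequencing_yield_gb(cells: int, read_pairs_per_cell: int, read_length: int,
                        paired: bool = True) -> float:
    """Total sequenced bases in gigabases (1 Gb = 1e9 bases)."""
    reads_per_fragment = 2 if paired else 1
    bases = cells * read_pairs_per_cell * reads_per_fragment * read_length
    return bases / 1e9

total_gb = sequencing_yield_gb(cells=500, read_pairs_per_cell=100_000, read_length=150)
```

500 cells x 100,000 pairs x 2 reads x 150 bases comes to 15 Gb, which matches the 15G quoted in the text.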
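The data were analyzed and quality-controlled with the esATAC software, with all sequencing data pooled for analysis. The results are shown in Figures 5-7: the fragment size distribution shows the nucleosome ladder typical of ATAC (see Figure 5); signal is enriched at transcription start sites (TSS) in the typical ATAC pattern (see Figures 6A and 6B); and the peaks overlap extensively with known open regions (see Figure 7). The total number of peaks is 11898, the ratio of peaks overlapping the union of DNase I hypersensitive sites (peaks overlapped with union DHS ratio) is 74.0%, the ratio of peaks overlapping the blacklist (peaks overlapped with blacklist ratio) is 0.5%, and the FRiP (fraction of reads in peaks) is 99.8%. These results indicate that the method can accurately detect Tn5-mediated ATAC insertion products in cells.
F. Demultiplexing and analysis of single-cell data
ⅰ. For the sequencing data above, the cell barcodes are first identified with the Dropseq pipeline. In read1, positions 1-45 bp hold the barcode. Given the 96x96x96 possible barcode combinations, the number of reads for each barcode is counted and a cumulative curve is plotted, from which the number of valid cells in the library is determined to be about 400, as shown in Figure 9.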
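The 45 bp barcode window is consistent with the bead oligo layout barcode1 (5) + linker1 (15) + barcode2 (5) + linker2 (15) + barcode3 (5). The sketch below shows one way to split read1 under that assumed layout and to tally reads per barcode; it is an illustration, not the Dropseq pipeline itself, and the function names and mismatch tolerance are hypothetical.

```python
from collections import Counter

LINKER1 = "CGACTCACTACAGGG"  # SEQ ID NO:7
LINKER2 = "TCGGTGACACGATCG"  # SEQ ID NO:8

def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def parse_cell_barcode(read1: str, max_linker_mismatches: int = 1):
    """Return the 15-nt combined barcode, or None if the linkers do not match."""
    if len(read1) < 45:
        return None
    bc1, linker1 = read1[0:5], read1[5:20]
    bc2, linker2 = read1[20:25], read1[25:40]
    bc3 = read1[40:45]
    if hamming(linker1, LINKER1) > max_linker_mismatches:
        return None
    if hamming(linker2, LINKER2) > max_linker_mismatches:
        return None
    return bc1 + bc2 + bc3

def reads_per_barcode(read1_sequences) -> Counter:
    """Tally reads for each combined barcode, skipping unparseable reads."""
    counts = Counter()
    for read1 in read1_sequences:
        barcode = parse_cell_barcode(read1)
        if barcode is not None:
            counts[barcode] += 1
    return counts

# Hypothetical read built from placeholder barcodes and the real linkers
example_read = "AACGT" + LINKER1 + "CCTAG" + LINKER2 + "GTTCA" + "ATCCACGTGCTTGAG"
counts = reads_per_barcode([example_read, example_read, "A" * 60])
```

Sorting the resulting counts in descending order and plotting their cumulative sum gives the kind of curve from which the number of valid cells is read off.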
ⅱ.获得每个细胞中的unique mapped reads数目,其分布如图10所示,即每一个细胞平均得到的ATAC reads数目中位值在10000左右,优于Bing Ren的约2-3000的平均值。
ⅲ.单细胞ATAC的结果,把每个read比对到基因组上位置(由dropseq程序流程获得)信息加载到IGV基因组浏览器中可视化,得到如图11所示结果,图中下部为45个单细胞的ATAC数据在基因区域的分布,图中部为45个单个细胞的ATAC数据加和在一起的结果显示,其跟图上部大量细胞(约1万细胞)的ATAC模式高度相似,处于基因转录起始位点。
ⅳ.单细胞相关性分析,通过R语言包中计算Pearson Correlation的函数得到如图12所示的结果,图中颜色的加深表示细胞之间的相关性越高,由图中的显示可以看到单细胞的ATAC信号呈现高相关性,表明了由该方法获得的单细胞数据的真实准确性。
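The correlation analysis above was done with an R function; the same quantity can be sketched in Python for clarity. The per-cell signal vectors below are hypothetical counts over a shared set of genomic bins, used only to show the calculation.

```python
import math

def pearson(x, y) -> float:
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical ATAC read counts per genomic bin for three cells
cell_a = [0, 3, 1, 5, 0, 2]
cell_b = [0, 6, 2, 10, 0, 4]   # same profile as cell_a at twice the depth
cell_c = [5, 0, 2, 0, 3, 1]    # a different accessibility profile

r_ab = pearson(cell_a, cell_b)
r_ac = pearson(cell_a, cell_c)
```

Because the coefficient is insensitive to overall depth, two cells with the same profile correlate perfectly even when one is sequenced more deeply.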
实施例2 检测DNA和蛋白质相互作用
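(1) CUT&Tag is a recent method for studying DNA-protein interactions that replaces the conventional ChIP-seq method. Its principle is to fuse Protein A (a protein of bacterial origin that binds the conserved region of antibody heavy chains from various species) to Tn5. Through the binding of Protein A to an antibody, the Tn5 enzyme is targeted to the protein bound by the antibody, and the transposition activity of Tn5 inserts DNA fragments directly into the DNA region bound by the target protein. Amplifying and sequencing this product gives the binding location of the protein directly. The molecular product of CUT&Tag is therefore the same as that of ATAC; the difference is that in ATAC the Tn5 insertion sites lie in open chromatin regions, whereas in CUT&Tag they lie around the protein of interest. This product can therefore be labeled by a method similar to the ATAC method of Example 1, using a DNA transposon similar to that used for ATAC, and a single-species or hybrid Tn5 transposition complex can likewise be assembled. The steps that differ are: the Tn5 transposition complex is assembled with a Protein A-Tn5 or Protein G-Tn5 fusion protein; and, to distinguish multiple antibodies, the DNA transposon may carry, in addition to the ATAC Tn5 sequence, an antibody identification code at a different position.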
(2)制备样品:可以是非固定的细胞或细胞核,甲醛(或其他固定剂)固定的细胞或细胞核,非固定或固定的组织切片等。其中,固定或非固定样品用包含有去垢剂(Triton,NP-40或Digitonin等)的缓冲液处理,还可以包括裂解细胞(非固定样品)得到细胞核的中间步骤,去垢剂裂解或通透细胞和细胞核,使得Tn5酶可以进入细胞核作用。
(3)抗体结合。用血清BSA等对样品进行封闭,然后加入一抗与目标蛋白结合,洗涤去除多余的一抗抗体。可以进一步用抗一抗的二抗继续结合样品(该步骤不是必须的),增加蛋白质A/G的结合位点,放大信号。如果要同时检测2种蛋白质的相互作用,可以将一抗和蛋白质A/G-Tn5融合蛋白先结合为复合物,每种抗体结合的蛋白质A/G-Tn5融合蛋白上的DNA带有不同的抗体识别码。同时将2个或多个一抗-蛋白质A/G-Tn5融合蛋白复合物直接跟细胞/组织结合,一步把Tn5带到目标蛋白周围。
(4)转座反应。用蛋白质A-Tn5融合蛋白(一抗-蛋白质A-Tn5融合蛋白复合物)结合样品,洗涤多余的酶,然后样品中加入含二价离子的Tn5反应液,进行转座反应,(37℃,30分钟-2小时)。
(5)按照实施例1中的方式进行连接反应及后续处理,构建文库,测序。
具体步骤如下:
(1)制备包含条码序列的核苷酸标签,其被固定于固相支持物上。
该核苷酸标签有两条链,形成部分双链结构1,如下所示:
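Strand I: solid support ~ attachment sequence - barcode sequence (barcode) - hybridization sequence (a fixed sequence that hybridizes to the complementary portion of strand II), where the barcode sequence has the form (barcode-linker)n with n greater than or equal to 1.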
具体实例:Bead-acrydite-S-S-ACACTCTTTCCCTACACGACGCTCTTCCGATCT(read1,SEQ ID NO:6)-barcode-ATCCACGTGCTTGAG(SEQ ID NO:12)
链II:杂交序列(固定序列,与链I中固定DNA序列杂交)——与转座子复合物链I的5’端互补的序列
具体实例:CGAATGCTCTGGCCTCTCAAGCACGTGGAT(SEQ ID NO:9)
固体支持物是聚丙烯酰胺微球,其通过微流控设备制备,将丙烯酰胺:Bis混合物、以及acrydite-DNA引物、APS诱发剂在微流控装置中混合成为液滴,其中含有TEMED催化剂,液滴会自发聚合成为凝胶微球,之后微球按照barcode合成方式加标签。
在连接反应中,溶液中含有10mM DTT,S-S键可以被还原从而释放引物。
(2)制备转座子复合物,组装含有DNA部分双链序列的pA-Tn5转座子。
其中含有的一个DNA序列为A链和B链退火形成双链结构2。
链A:磷酸基团——与链II中核酸分子的链I或链II中固定DNA序列中至少部分互补的序列——(UMI)——Tn5转座酶结合的序列
具体实例:AGGCCAGAGCATTCGNNNNNNNAGATGTGTATAAGAGACAG(SEQ ID NO:5)
链B:Tn5转座酶结合的序列(与链A中的转座子蛋白(Tn5)结合的序列互补的序列)——磷酸基团
具体实例:p-CTGTCTCTTATACACATCT(SEQ ID NO:4)
其中,A链中的UMI不是必须的;(1)和(2)中的序列中可以含有修饰碱基,如5mC。
Tn5转座复合物是二聚体,两个pA-Tn5蛋白可以结合相同或者不同的部分双链DNA转座子,使得插入位点被1种或2种DNA所标记;pA-Tn5蛋白(可以包含点突变超活性或其他类型的转座酶)可以和以上的双链结构2,以及另外一个标准的转座子DNA组装成杂合的转座复合物,或者只使用上述双链结构2形成单一的Tn5转复合物。
具体操作
等摩尔浓度的pA-Tn5蛋白和退火好的双链引物混合后在室温放置1小时以上,形成功能转座子复合物。
(3)制备样品。可以是非固定的细胞或细胞核,甲醛(或其他固定剂)固定的细胞或细 胞核,非固定或固定的组织切片等。其中,固定或非固定样品用包含有去垢剂(Triton,NP-40或Digitonin等)的缓冲液处理,还可以包括裂解细胞(非固定样品)得到细胞核的中间步骤,去垢剂裂解或通透细胞和细胞核,使得抗体及pA-Tn5酶可以进入细胞核作用。典型的通透剂溶液可以包括Tris,蔗糖,氯化钠,去垢剂。
针对目标蛋白的抗体和样本孵育,使得抗体特异结合在目标蛋白上,洗涤去除未结合的抗体。然后用pA-Tn5转座子和样本孵育,使得pA-Tn5蛋白结合在抗体上,从而定位到目标蛋白附近。
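(4) Transposition reaction. Tn5 enzyme buffer containing a divalent metal ion (for example, magnesium ion) is added to the sample prepared above, and the transposition reaction is carried out (37℃, 30 minutes to 2 hours). That is, the reaction system includes: cells, nuclei, or tissue; and buffer. After the reaction, the sample is washed with buffer to remove unreacted reagents.
(5) Ligation reaction. T4 DNA ligase reaction buffer, T4 DNA ligase, and the nucleotide tag from step (1) are added, and the ligation reaction is carried out at a suitable temperature (4℃ to 37℃) for at least 20 minutes.
The reaction system includes: cells, nuclei, or tissue (after the transposition reaction); T4 DNA ligase; and the nucleotide tag. After the reaction, an excess of free sequence complementary to the nucleotide tag is added to the ligation reaction system to block surplus unreacted nucleotide tags.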
(6)提取细胞中的DNA.对于非固定样品,直接加入裂解液后用DNA提取试剂盒,磁珠等方法纯化;对于固定样品,加入蛋白酶K反应缓冲液,蛋白酶K,在55-65℃进行解交联后纯化DNA。
对于纯化的DNA,1)如果是使用杂合Tn5进行,产物双侧都有PCR扩增序列,可以直接扩增,获得测序文库。
2)如果是使用单一的Tn5进行,那么DNA产物只有一侧有PCR引物,我们需要对这个DNA进行打断和连接,在另一侧加入扩增引物,这可以使用单一的Tn5酶进行,也可以使用超声或酶打断,然后末端加A,连接头,最终获得测序文库。
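Following the steps above and taking human 293T cells as an example, fresh cells are taken, nuclei are prepared, and the CUT&Tag reaction is performed with hybrid pA-Tn5. An adaptor carrying the amplification sequence of the P5 (read1) side of an Illumina sequencing library is then ligated, the library is constructed, and the product is amplified with the read1 primer and the read2 primer of the other DNA fragment in the hybrid pA-Tn5, followed by analysis. The specific steps are as follows: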
A.pA-Tn5转座子(transposome)
使如下序列退火形成双链:
10uM Top1 5’p-AGGCCAGAGCATTCGNNNNNNNAGATGTGTATAAGAGACAG (SEQ ID NO:5)(链A)
10uM Top2GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG(SEQ ID NO:3)(链A)
20uM Bottom 5’p-CTGTCTCTTATACACATCT(SEQ ID NO:4)(链B)
之后与10uM的pA-Tn5酶(购买自Vazyme)室温孵育,组装成10uM浓度的pA-Tn5转座子,Top1/Bottom双链与Tn5形成的转座子为p-pA-Tn5,Top2/Bottom双链与Tn5形成的转座子为pA-Tn5-B。
B.细胞标记微球制备
1)Bead上附接的序列如下所示:Bead-S-S-PCR adaptor-barcode1-linker1-barcode2-linker2-barcode3-ligation linker
Here, the PCR adaptor sequence is ACACTCTTTCCCTACACGACGCTCTTCCGATCT (SEQ ID NO:6), linker sequence 1 (Linker1) is CGACTCACTACAGGG (SEQ ID NO:7), linker sequence 2 (Linker2) is TCGGTGACACGATCG (SEQ ID NO:8), and the ligation linker sequence is ATCCACGTGCTTGAG (SEQ ID NO:12). Barcode1 = 96 different 5 bp sequences, Barcode2 = 96 different 5 bp sequences, Barcode3 = 96 different 5 bp sequences.
2)合成3x96种序列
1.PCR handle-96xbarcode1-linker1,合成96个此序列的反向互补序列;
2.linker1-96xbarcode2-linker2,合成96个此序列的反向互补序列;
3.linker2-96xbarcode3-ligation linker,合成96个此序列的反向互补序列。
3)微球合成:
The following amino-modified sequence is synthesized: 5’amine-S-S-ACACTCTTTCCCTACACGACGCTCTTCCGATCT (SEQ ID NO:6). Carboxyl-modified microspheres of 30 um are also used (知益, www.kbspheres.com/productshow.asp?id=903).
偶联反应:微球+50mM EDC+100uM氨基序列(SEQ ID NO:6),将氨基序列和羧基微球偶联,获得如下结构:bead-S-S-ACACTCTTTCCCTACACGACGCTCTTCCGATCT(SEQ ID NO:6)
4)附接标签
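The synthesized microspheres are divided evenly into a 96-well plate, and the PCR handle-96xbarcode1-linker1 oligos are added, one per well, for the first round of barcoding. The reaction system and procedure are as follows: 10 ul microspheres + 2 ul BstI buffer + 1 ul 10 uM dNTP + 1 ul 100 uM PCR handle-96xbarcode1-linker1, held at 95℃ for 5 min and then at 60℃ for 20 min; 1 ul BstI + 5 ul H2O are then added and the reaction is held at 60℃ for 60 min.
After the first round of barcoding, all microspheres are collected, pooled, incubated at 95℃ for 5 min to remove the complementary strand, and washed, giving microspheres carrying the first-round tag (96xbarcode1-linker1). The microspheres are then divided evenly into a 96-well plate again, and linker1-96xbarcode2-linker2 (second round) and linker2-96xbarcode3-ligation linker (third round) are added in turn, following the first-round reaction system and procedure for the second and third rounds of barcoding. This yields single-stranded microspheres carrying three barcodes. After washing, the microspheres are annealed with the complementary sequence CGAATGCTCTGGCCTCTCAAGCACGTGGAT (SEQ ID NO:9) to form a partially double-stranded structure, finally giving microspheres with the following partially double-stranded structure attached: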
Bead-S-S-ACACTCTTTCCCTACACGACGCTCTTCCGATCT-barcode1-CGACTCACTACAGGG-barcode2-TCGGTGACACGATCG-barcode3-ATCCACGTGCTTGAG
3’-TAGGTGCACGAACTCTCCGGTCTCGTAAGC-5’(SEQ ID NO:9的反向排列)
C. CUT&Tag experiment
将人293T细胞系重悬在裂解液(10mM Tris–Cl,pH 7.4;10mM NaCl;3mM MgCl 2;0.01%NP-40)中裂解细胞,获得细胞核。
取10万细胞核与目标蛋白抗体进行孵育,例如抗组蛋白H3K4me3的抗体(Abcam公司),结合条件如下0.05%Digitonin,20mM HEPES,pH 7.5,300mM NaCl,0.5mM Spermidine,1X Protease inhibitor(Roche)buffer中,抗体浓度1ug/100ul,在室温下结合1hr或者4摄氏度过夜结合。
用0.05%Digitonin,20mM HEPES,pH 7.5,300mM NaCl,0.5mM Spermidine,1X Protease inhibitor(Roche)buffer洗涤样品3次。
在样品中加入1ug/100ul的pA-Tn5转座子复合物,buffer条件如上,室温孵育1hr,用此buffer洗涤样品3次。
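MgCl2 is added to the buffer to a magnesium ion concentration of 20 mM, and the transposition reaction is carried out at 37℃ for 1 hr. During this step, pA-Tn5 cuts the DNA adjacent to its binding position and inserts the DNA sequence it carries.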
反应后用PBS洗涤细胞核。
D.高通量标记
使用如图8所示的微流控芯片进行细胞标记,微球通道(Bead channel):100um,细胞核通道(Nuclei Channel):50um。
准备以下溶液:
细胞核溶液1ml(100细胞核/ul浓度),包括:200ul 10xT4 DNA ligase Buffer,10ul T4  DNA ligase,10ul 1M DTT,780ul细胞核/水。
bead溶液(100bead/ul浓度):Bead in PBS。
细胞核溶液、bead溶液、油(FC40氟碳油,含有1%表面活性剂FluoroSurfactant,Ran Biotech)在微流控芯片上形成120um直径的液滴(drop collection),37℃连接1小时。
E.建库
在步骤D的液滴中加入等体积全氟辛醇破碎液滴,离心,吸取水相,使用Qiagen DNA purification kit纯化水相中的DNA,用如下反应体系扩增DNA获得最终测序文库:36ul DNA模板,10ul 5xPCR Buffer,1ul 10mM dNTP,1ul 10uM引物TrueseqD501,1ul 10uM引物N701,1ul Taq,94℃2min,94℃30sec,55℃30sec,72℃30sec,18个循环。
引物TrueseqD501序列:AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT(SEQ ID NO:10)
引物N701序列:CAAGCAGAAGACGGCATACGAGATATCGGCTAGTCTCGTGGGCTCGG(SEQ ID NO:11)
Illumina Novaseq每个细胞测10万个PE150读长(reads)。
以上文库投入约500细胞,每个细胞测序10万个PE150reads,总数据量为15G。
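The antibody used is rabbit anti-H3K4me3 from Abcam. Figure 13 shows the fragment size distribution of the CUT&Tag library. Figure 14 shows the distribution of CUT&Tag fragments around transcription start sites. Figure 15 shows the proportions of CUT&Tag fragments across genomic regions. Figure 16 shows the single-cell CUT&Tag results: the aggregated single-cell data show the typical distribution of the H3K4me3 histone modification and closely resemble the results of bulk (multi-cell) samples, supporting the accuracy of the single-cell data obtained by this method.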
实施例3 检测细胞或细胞核中转录组
(1)制备反转录引物。5’端磷酸化的并且与核苷酸标签互补的序列-UMI分子计数序列-polyT序列。核苷酸标签的制备同实施例1的方式。RT引物AGGCCAGAGCATTCGNNNNNNNTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT(SEQ ID NO:13);
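(2) Sample preparation. The sample may be unfixed cells or nuclei, cells or nuclei fixed with formaldehyde (or another fixative), unfixed or fixed tissue sections, and the like.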
固定或非固定样品用包含有去垢剂(Triton,NP-40,Digitonin,etc.)的缓冲液处理,可能包括裂解细胞(非固定样品)得到细胞核的中间步骤,去垢剂裂解或通透细胞和细胞核,使得 酶等分子生物学试剂可以进入细胞或细胞核。
(3)反转录。利用步骤(1)的反转录引物,提供反转录酶反应体系,加入链转化模板,对样品进行细胞内反转录反应,反应后细胞/核仍然是独立完整的形态。反应体系及条件如下:细胞/组织,反转录酶缓冲液,RNA酶抑制剂,dNTP,TSO链转换引物,反转录引物;50-55℃,5分钟,4℃+反转录酶,42℃。洗涤去除引物和酶体系,对细胞或组织进行核苷酸标签连接反应。结束后加入引物中和多余引物。
(4)样品后续处理。纯化mRNA/cDNA:非固定组织直接纯化mRNA/cDNA,固定组织解交联后纯化mRNA/cDNA;对mRNA/cDNA进行PCR扩增cDNA,获得cDNA文库,将cDNA文库用Tn5或者其他DNA打断方法构建成测序文库。
具体步骤如下:
制备包含条码序列的核苷酸标签,其被固定于固相支持物上,步骤同上述实施例。
制备转座子复合物,步骤同上述实施例。
制备样品。细胞核:在10mM Tris–Cl,pH 7.4;10mM NaCl;3mM MgCl 2;0.01%NP-40 buffer将组织匀浆,裂解细胞,500g 5min离心,用buffer重悬一次,500g 5min离心,重悬在上述buffer中。
逆转录
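The following reaction is set up, with final concentrations of each component as follows: 1000 nuclei/ul, 1x RT Buffer, 1 uM dNTP, 1 uM of the reverse transcription primer described above, 1 u/ul RNase inhibitor, 1 uM TSO primer (5′-AAGCAGTGGTATCAACGCAGAGTACATrGrGrG-3′ (SEQ ID NO:14), where the G at the 3′ end may be rG, rG denoting riboguanosine), and 1 unit/ul RT enzyme (Superscript II reverse transcriptase). Reaction conditions: 50℃ 5 min, 4℃ 5 min, 42℃ 60 min. The nuclei are then washed twice with PBS, centrifuging at 500 g for 5 min each time, to remove unreacted enzyme and primers.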
高通量标记
使用如图8所示的微流控芯片进行细胞标记,微球通道(Bead channel):100um,细胞核通道(Nuclei Channel):50um。
准备以下溶液:
细胞核溶液1ml(100细胞核/ul浓度),包括:200ul 10xT4 DNA ligase Buffer,10ul T4 DNA ligase,10ul 1M DTT,780ul细胞核/水。
bead溶液(100bead/ul浓度):Bead in PBS。
细胞核溶液、bead溶液、油(FC40氟碳油,含有1%表面活性剂FluoroSurfactant,Ran  Biotech)在微流控芯片上形成120um直径的液滴(drop collection),37℃连接1小时。
建库
在高通量标记的液滴中加入等体积全氟辛醇破碎液滴,离心,吸取水相,使用Qiagen DNA purification kit纯化水相中的cDNA/mRNA复合物。
用如下反应体系扩增DNA获得最终测序文库:36ul DNA模板,10ul 5xPCR Buffer,1ul 10mM dNTP,1ul 10uM引物TrueseqD501,1ul 10uM引物ISPCR(AAGCAGTGGTATCAACGCAGAGT(SEQ ID NO:15)),1ul Taq,94℃2min,94℃30sec,60℃30sec,72℃3min,18个循环。用AMPure XP磁珠1:1体积纯化扩增后cDNA,用QuBit定量。
测序文库打断
1ng cDNA,10ul 2xTD Buffer(Illumina Nextera kit),1ul Nextera enzyme(Illumina Nextera),20ul反应体系,55℃7min,加入5ul Tn5 stop buffer(Nextera kit)。
文库扩增
25ul以上反应体系,1ul 10uM引物TrueseqD501,1ul 10uM引物Nextera N701引物,1ul Taq酶。72℃5min,94℃2min,94℃30sec,60℃30sec,72℃3sec,18个循环。用AMPure XP磁珠1:1体积纯化文库。
Illumina Novaseq每个细胞测10万个PE150读长(reads)。
以上文库投入约500细胞,每个细胞测序10万个PE150reads,总数据量为15G。
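Two cell types, 293T (human) and 3T3 (mouse), were mixed and then subjected to the single-cell transcriptome experiment, and reads were mapped back according to the cell barcode. Figure 17 shows that the single-cell results clearly separate the individual cells of the two cell types. Figure 18 shows the distribution of the numbers of transcripts and genes detected in each cell. The method of the present application can be used for single-cell transcriptome detection. Similarly, a single-cell genome experiment can be performed by the method of the present application on a mixture of 293T (human) and 3T3 (mouse) cells. Figure 19 shows that, based on the proportion of reads aligning to the human or mouse genome, the single-cell results clearly separate the individual cells of the two cell types; pure human or pure mouse cells can be resolved from the mixture, and only a small fraction of cells show conflicting assignments. Figure 20 shows the genome coverage of a single human cell, arranged by chromosome, illustrating that single-cell sequencing gives different coverage in each cell and at each genomic locus. The method of the present application can be used for single-cell genome detection. The method of the present application can also be used, through genome and transcriptome detection, to distinguish the individual cells of each cell type in a mixture.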
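The species-mixing readout described above amounts to asking, for each cell barcode, what fraction of its reads align to the human versus the mouse genome. A minimal sketch, in which the 90% purity threshold and the function name are hypothetical choices rather than values from the text:

```python
def classify_cell(human_reads: int, mouse_reads: int, purity: float = 0.9) -> str:
    """Label a cell barcode as human, mouse, mixed, or empty.

    A barcode is called for one species when at least `purity` of its
    aligned reads come from that species; otherwise it is called mixed.
    """
    total = human_reads + mouse_reads
    if total == 0:
        return "empty"
    human_fraction = human_reads / total
    if human_fraction >= purity:
        return "human"
    if human_fraction <= 1.0 - purity:
        return "mouse"
    return "mixed"

# Hypothetical (human_reads, mouse_reads) per cell barcode
labels = [classify_cell(h, m) for h, m in [(950, 50), (20, 980), (500, 500), (0, 0)]]
mixed_rate = labels.count("mixed") / len([x for x in labels if x != "empty"])
```

A low proportion of mixed barcodes indicates that most droplets captured a single cell or nucleus, which is what the text describes as only a small fraction of cells showing conflicting assignments.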
实施例4 检测细胞中DNA序列、数量
(1)核苷酸标签的制备同实施例1的方式。唯一的差别是5’amine-S-S- ACACTCTTTCCCTACACGACGCTCTTCCGATCT(SEQ ID NO:6)这一和微球偶联的序列中,所有的C碱基被替换为5mC修饰的碱基。
(2)制备样品。样品:固定的单细胞或细胞核。样品处理方式:细胞或细胞核用一定浓度的SDS和/或其他去垢剂,在加热条件下处理一定时间,把结合在DNA上面的蛋白去除掉,但是并不解开交联,因此DNA还固定在细胞结构中。
(3)转座反应。对上述处理好的样品加上包含2价金属离子例如(镁离子)的Tn5酶缓冲液,加入组装好的Tn5酶,对基因组进行转座反应(37℃,30分钟-2小时)。体系:细胞/核,Tn5缓冲液,Tn5转座复合物,37℃。之后用缓冲液对样品进行洗涤,去除未反应的Tn5酶。
(4)连接反应。加入T4 DNA连接酶反应缓冲液,连接步骤(1)中的核苷酸标签,T4 DNA连接酶,核苷酸标签,在适当温度(4℃-37℃)下进行连接反应20分钟以上。
反应体系包括:细胞或细胞核或组织(转座反应后的);T4 DNA连接酶,核苷酸标签,T4 DNA连接酶。反应后在连接反应体系中加入过量的游离和核苷酸标签互补序列,封闭多余未反应的核苷酸标签。
(5)获取DNA。加入蛋白酶K反应缓冲液,蛋白酶K,在55-65℃进行解交联后纯化DNA,从而得到标记的全基因组DNA。后续进行如下处理:
1)对DNA进行直接测序,得到全基因组序列信息,包括基因组不同区域的拷贝数信息(CNV),或者是点突变信息(SNV)。
2)对DNA进行5mC检测,例如用亚硫酸氢盐转化法(Bisulfite conversion)或者NEB酶转化法(enzymatic conversion)(NEB),或基于MspI酶切的还原亚硫酸氢盐测序(reduced bisulfite sequence)等方法检测基因组上5mC信息。在对修饰C进行转化时,连接引物则设计为抵御转化的碱基或修饰碱基,从而保证扩增。
3) Detecting 5hmC in the DNA: the 5hmC sites are modified with beta-galactose transferase, and 5hmC is detected with a downstream method.
4)对其他DNA修饰碱基的检测。
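The reason the bead-bound primer is made with 5mC in this Example can be shown with an in-silico conversion. The sketch below models bisulfite treatment in the simplest way (every unmethylated C reads as T after conversion and amplification, while methylated C is protected); it is an idealized illustration, and the function name is hypothetical.

```python
def bisulfite_convert(seq: str, methylated_positions=frozenset()) -> str:
    """Convert unmethylated C to T; C at methylated positions (0-based) is kept."""
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq)
    )

ADAPTOR = "ACACTCTTTCCCTACACGACGCTCTTCCGATCT"  # SEQ ID NO:6

# Ordinary adaptor: every C is converted, so the TrueseqD501 primer site is lost
unprotected = bisulfite_convert(ADAPTOR)

# Adaptor synthesized with 5mC at every C: the primer site survives conversion
all_c_positions = frozenset(i for i, base in enumerate(ADAPTOR) if base == "C")
protected = bisulfite_convert(ADAPTOR, all_c_positions)
```

Only the protected adaptor still matches the 3' end of primer TrueseqD501 after conversion, which is why the linker is designed to resist conversion.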
具体步骤如下:
样品处理
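Cells are fixed with 4% formaldehyde in 1x PBS at room temperature for 10 min; glycine solution is added to a final concentration of 0.1 M and the fixation is quenched at room temperature for 5 min; the cells are washed twice with PBS and pelleted by centrifugation at 500 g for 5 min. Fixed cells can be stored at -80℃ or -20℃. The cells are thawed at room temperature, 10 mM Tris with 0.2% SDS is added, and the cells are treated at 42℃ for 10 min, then washed three times with PBS. 100,000 nuclei are reacted with the p-Tn5 and Tn5-B prepared in the ATAC experiment of the Example above, in the following reaction system: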
25ul 2xTD Buffer(Illumina)+2.5ul 10uM p-Tn5+2.5ul 10uM Tn5-B+20ul细胞核(10万个),37℃反应30min,PBS洗涤细胞核。
高通量标记
使用如图8所示的微流控芯片进行细胞标记,微球通道(Bead channel):100um,细胞核通道(Nuclei Channel):50um。
准备以下溶液:
细胞核溶液1ml(100细胞核/ul浓度),包括:200ul 10xT4 DNA ligase Buffer,10ul T4 DNA ligase,10ul 1M DTT,780ul细胞核/水。
bead溶液(100bead/ul浓度):Bead in PBS。
细胞核溶液、bead溶液、油(FC40氟碳油,含有1%表面活性剂FluoroSurfactant,Ran Biotech)在微流控芯片上形成120um直径的液滴(drop collection),37℃连接1小时。
基因组建库
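An equal volume of perfluorooctanol is added to the droplets from high-throughput labeling to break them; after centrifugation, the aqueous phase is collected, and SDS and Proteinase K are added to the aqueous phase to final concentrations of 1% and 20 ug/ml, respectively, followed by incubation at 55℃ for 2 hr. The DNA in the aqueous phase is purified with a Qiagen DNA purification kit and amplified with the following reaction to obtain the final sequencing library: 36 ul DNA template, 10 ul 5x PCR Buffer, 1 ul 10 mM dNTP, 1 ul 10 uM primer TrueseqD501, 1 ul 10 uM primer N701, 1 ul Taq; 94℃ 2 min, then 18 cycles of 94℃ 30 sec, 55℃ 30 sec, 72℃ 30 sec.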
引物TrueseqD501序列:AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT(SEQ ID NO:10)
引物N701序列:CAAGCAGAAGACGGCATACGAGATATCGGCTAGTCTCGTGGGCTCGG(SEQ ID NO:11)
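The relationship between the amplification primer TrueseqD501 (SEQ ID NO:10) and the bead-coupled adaptor (SEQ ID NO:6) can be checked directly from the two sequences: the 3' part of the primer is identical to the adaptor, so the primer pairs with the complement of the adaptor on every ligated fragment. The helper name below is hypothetical; the sequences are copied from the text.

```python
TRUSEQ_D501 = "AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT"  # SEQ ID NO:10
BEAD_ADAPTOR = "ACACTCTTTCCCTACACGACGCTCTTCCGATCT"                           # SEQ ID NO:6


def split_primer(primer, adaptor):
    """Split a primer into its 5' tail and the 3' part shared with the adaptor.

    Raises ValueError if the primer does not end with the adaptor sequence.
    """
    if not primer.endswith(adaptor):
        raise ValueError("primer does not end with the adaptor sequence")
    return primer[: -len(adaptor)], adaptor


tail, shared = split_primer(TRUSEQ_D501, BEAD_ADAPTOR)
print("5' tail added by PCR:", tail)
print("3' part shared with the bead adaptor:", shared)
```

The 5' tail is the portion introduced only during PCR, while the shared 3' portion is what was ligated onto the target from the bead.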
Sequencing was on an Illumina Novaseq, with 100,000 PE150 reads per cell.
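Single-cell genome analysis
5mC methylation sequencing
The DNA obtained above is first converted with a conversion kit such as the EpiTect Fast Bisulfite Conversion Kit or the NEB Enzymatic Methylation conversion kit. Taking the Qiagen kit as an example, prepare the bisulfite conversion reagents according to the manual:
the above DNA, 85 µL Bisulfite solution, 35 µL DNA protection Buffer, and H2O to a total volume of 140 µL.
95°C 5 min, 60°C 10 min, 95°C 5 min, 60°C 10 min, hold at 20°C. Column-purify the converted DNA following the steps in the manual.
DNA amplification
The DNA was amplified in the following reaction to obtain the final sequencing library: 36 µL DNA template, 10 µL 5x PCR Buffer, 1 µL 10 mM dNTP, 1 µL 10 µM primer TrueseqD501, 1 µL 10 µM primer N701, 1 µL Taq; 94°C 2 min, followed by 18 cycles of 94°C 30 sec, 55°C 30 sec, 72°C 30 sec.
Sequence of primer TrueseqD501: AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT (SEQ ID NO:10)
Sequence of primer N701: CAAGCAGAAGACGGCATACGAGATATCGGCTAGTCTCGTGGGCTCGG (SEQ ID NO:11)
Sequencing was on an Illumina Novaseq, with 100,000 PE150 reads per cell.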
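Single-cell methylation analysis
5hmC methylation sequencing
5hmC enrichment of the recovered DNA was performed with the Thermo Fisher EpiJET 5-hmC Enrichment Kit, followed by library construction and sequencing.
Recovered DNA, 12.5 µL 4X Enzyme Reaction Buffer, 10 µL 5-hmC Modifying Enzyme, water to 50 µL; react at 30°C for 1 hr. Purify the DNA with magnetic beads at a 1:1 volume ratio.
40 µL eluted sample, 10 µL 10x biotin conjugation buffer, 50 µL biotin reagent; 50°C for 5 min; add 100 µL elution buffer to stop the reaction, then column-purify the DNA with the kit.
DNA amplification
The DNA was amplified in the following reaction to obtain the final sequencing library: 36 µL DNA template, 10 µL 5x PCR Buffer, 1 µL 10 mM dNTP, 1 µL 10 µM primer TrueseqD501, 1 µL 10 µM primer N701, 1 µL Taq; 94°C 2 min, followed by 18 cycles of 94°C 30 sec, 55°C 30 sec, 72°C 30 sec.
Sequence of primer TrueseqD501: AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT (SEQ ID NO:10)
Sequence of primer N701: CAAGCAGAAGACGGCATACGAGATATCGGCTAGTCTCGTGGGCTCGG (SEQ ID NO:11)
Sequencing was on an Illumina Novaseq, with 100,000 PE150 reads per cell.
Two cell types, 293T (human) and 3T3 (mouse), were mixed and subjected to the single-cell methylation experiment. Figure 21 shows that, based on the proportion of sequenced reads aligning to the human or mouse genome, the single-cell results clearly separate individual cells of the two types. Figure 22 shows the single-cell methylation distribution: the aggregated single-cell data closely resemble the results from a multi-cell (bulk) sample, indicating that the single-cell methylation data obtained with this method are accurate.
The method of the present application can also be used for single-cell 5hmC sequencing. Figure 23 shows the distribution of 5hmC-modified sites in single cells. The single-cell 5hmC modification data obtained in the present application are accurate.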
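Example 5 Simultaneous detection of the transcriptome and ATAC from the same cell
A dT primer and a Tn5 enzyme carrying the same 5' ligation sequence are used. Nuclei are prepared and subjected to the RT (reverse transcription) reaction; after the RT reaction system has been washed away, the Tn5 ATAC reaction is carried out, so that the mRNA and the ATAC fragments in the cell are both labelled. They are then ligated to the primers released from the beads. The ATAC DNA and the RT mRNA/cDNA mixture are recovered.
From this mixture, the Tn5 library and the cDNA library are amplified separately, using the universal primer on the ligated adaptor together with a Tn5-specific or cDNA-specific primer, and the libraries are constructed and sequenced.
The specific steps are as follows:
Human 293T cells were resuspended in lysis buffer (10 mM Tris-Cl, pH 7.4; 10 mM NaCl; 3 mM MgCl2; 0.01% NP-40) to lyse the cells and obtain nuclei.
100,000 nuclei were reacted with the p-Tn5 and Tn5-B obtained in the Examples of the present application, in the following reaction system:
25 µL 2x TD Buffer (Illumina), 2.5 µL 10 µM p-Tn5, 2.5 µL 10 µM Tn5-B, 20 µL nuclei (100,000); react at 37°C for 30 min, then wash the nuclei with PBS.
The nuclei obtained above were subjected to the following RT reaction:
1000 nuclei/µL, 1x RT Buffer, 1 µM dNTP, 1 µM of the above reverse transcription primer, 1 U/µL RNase inhibitor, 1 µM TSO primer (5'-AAGCAGTGGTATCAACGCAGAGTACATrGrGrG (SEQ ID NO:14)-3', in which the G at the 3' end may be rG, where rG denotes riboguanosine), 1 unit/µL RT enzyme (Superscript II reverse transcriptase). Reaction conditions: 50°C 5 min, 4°C 5 min, 42°C 60 min. The nuclei were then washed twice with PBS, pelleting at 500 g for 5 min, to remove unreacted enzyme and primers.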
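High-throughput labelling
Cells were labelled with the microfluidic chip shown in Figure 8; bead channel: 100 µm, nuclei channel: 50 µm.
Prepare the following solutions:
Nuclei solution, 1 mL (100 nuclei/µL), comprising: 200 µL 10x T4 DNA ligase buffer, 10 µL T4 DNA ligase, 10 µL 1 M DTT, 780 µL nuclei/water.
Bead solution (100 beads/µL): beads in PBS.
The nuclei solution, the bead solution and oil (FC40 fluorocarbon oil containing 1% FluoroSurfactant, Ran Biotech) were used to form droplets 120 µm in diameter on the microfluidic chip (drop collection), and ligation was carried out at 37°C for 1 hour.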
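Library construction
An equal volume of perfluorooctanol was added to the labelled droplets to break them; after centrifugation the aqueous phase was collected, and the ATAC DNA and mRNA/cDNA in it were purified with a Qiagen DNA purification kit.
Library amplification: simultaneous amplification of ATAC DNA and mRNA/cDNA
The DNA was amplified in the following reaction to obtain the final sequencing library: 36 µL DNA template, 10 µL 5x PCR Buffer, 1 µL 10 mM dNTP, 1 µL 10 µM primer TrueseqD501, 1 µL 10 µM primer N701, 1 µL 10 mM ISPCR primer, 1 µL Taq; 72°C 5 min, 94°C 2 min, followed by 12 cycles of 94°C 30 sec, 55°C 30 sec, 72°C 3 min.
Sequence of primer TrueseqD501: AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT (SEQ ID NO:10)
Sequence of primer N701: CAAGCAGAAGACGGCATACGAGATATCGGCTAGTCTCGTGGGCTCGG (SEQ ID NO:11)
ISPCR primer: AAGCAGTGGTATCAACGCAGAGT (SEQ ID NO:15)
The mixed library above was purified with AMPure beads at 1:1 and quantified.
ATAC library amplification
1 ng of the above DNA as template, 10 µL 5x PCR Buffer, 1 µL 10 mM dNTP, 1 µL 10 µM primer TrueseqD501, 1 µL 10 µM primer N701, 1 µL Taq; 94°C 2 min, followed by 18 cycles of 94°C 30 sec, 55°C 30 sec, 72°C 30 sec. The library was purified with AMPure at 1:1, quantified and sequenced.
cDNA amplification and sequencing library construction
1 ng of the above DNA as template, 10 µL 5x PCR Buffer, 1 µL 10 mM dNTP, 1 µL 10 µM primer TrueseqD501, 1 µL 10 mM ISPCR primer, 1 µL Taq; 94°C 2 min, followed by 18 cycles of 94°C 30 sec, 55°C 30 sec, 72°C 30 sec. The library was purified with AMPure at 1:1, quantified and sequenced.
Sequencing library fragmentation
1 ng cDNA, 10 µL 2x TD Buffer (Illumina Nextera kit), 1 µL Nextera enzyme (Illumina Nextera), in a 20 µL reaction; 55°C for 7 min. Add 5 µL Tn5 stop buffer (Nextera kit).
Library amplification
25 µL of the above reaction, 1 µL 10 µM primer TrueseqD501, 1 µL 10 µM Nextera N701 primer, 1 µL Taq enzyme; 72°C 5 min, 94°C 2 min, followed by 18 cycles of 94°C 30 sec, 60°C 30 sec, 72°C 3 sec.
The library was purified with AMPure XP beads at a 1:1 volume ratio.
Sequencing was on an Illumina Novaseq, with 100,000 PE150 reads per cell.
Simultaneous analysis of transcriptome and ATAC.
The present application was used to detect the transcriptome and ATAC of the same cell simultaneously. Figure 24 shows that individual cells of the two cell types can be well distinguished on the basis of either the transcriptome or the ATAC genome data. The method of the present application is accurate when used to detect the transcriptome and ATAC of the same cell simultaneously.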
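Example 6 Simultaneous detection of the transcriptome and CUT&Tag from the same cell
The specific steps are as follows:
Human 293T cells were resuspended in lysis buffer (10 mM Tris-Cl, pH 7.4; 10 mM NaCl; 3 mM MgCl2; 0.01% NP-40) to lyse the cells and obtain nuclei.
100,000 nuclei were incubated with an antibody against the target protein, for example an anti-histone H3K4me3 antibody (Abcam), under the following binding conditions: in 0.05% Digitonin, 20 mM HEPES, pH 7.5, 300 mM NaCl, 0.5 mM Spermidine, 1X Protease inhibitor (Roche) buffer, at an antibody concentration of 1 µg/100 µL, binding at room temperature for 1 hr or at 4°C overnight.
The sample was washed 3 times with 0.05% Digitonin, 20 mM HEPES, pH 7.5, 300 mM NaCl, 0.5 mM Spermidine, 1X Protease inhibitor (Roche) buffer.
pA-Tn5 transposome complex was added to the sample at 1 µg/100 µL in the same buffer and incubated at room temperature for 1 hr, and the sample was washed 3 times with this buffer.
MgCl2 was added to the buffer to a magnesium ion concentration of 20 mM and transposition was carried out at 37°C for 1 hr; during this step pA-Tn5 cuts the DNA adjacent to its binding position and inserts the DNA sequence it carries.
After the reaction the nuclei were washed with PBS.
The nuclei obtained above were subjected to the following RT reaction, with the final concentrations of the components as follows: 1000 nuclei/µL, 1x RT Buffer, 1 µM dNTP, 1 µM of the above reverse transcription primer, 1 U/µL RNase inhibitor, 1 µM TSO primer (5'-AAGCAGTGGTATCAACGCAGAGTACATrGrGrG (SEQ ID NO:14)-3', in which the G at the 3' end may be rG, where rG denotes riboguanosine), 1 unit/µL RT enzyme (Superscript II reverse transcriptase). Reaction conditions: 50°C 5 min, 4°C 5 min, 42°C 60 min. The nuclei were then washed twice with PBS, pelleting at 500 g for 5 min, to remove unreacted enzyme and primers.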
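High-throughput labelling
Cells were labelled with the microfluidic chip shown in Figure 8; bead channel: 100 µm, nuclei channel: 50 µm.
Prepare the following solutions:
Nuclei solution, 1 mL (100 nuclei/µL), comprising: 200 µL 10x T4 DNA ligase buffer, 10 µL T4 DNA ligase, 10 µL 1 M DTT, 780 µL nuclei/water.
Bead solution (100 beads/µL): beads in PBS.
The nuclei solution, the bead solution and oil (FC40 fluorocarbon oil containing 1% FluoroSurfactant, Ran Biotech) were used to form droplets 120 µm in diameter on the microfluidic chip (drop collection), and ligation was carried out at 37°C for 1 hour.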
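Library construction
An equal volume of perfluorooctanol was added to the labelled droplets to break them; after centrifugation the aqueous phase was collected, and the CUT&Tag DNA and mRNA/cDNA in it were purified with a Qiagen DNA purification kit.
Library amplification: simultaneous amplification of CUT&Tag DNA and mRNA/cDNA
The DNA was amplified in the following reaction to obtain the final sequencing library: 36 µL DNA template, 10 µL 5x PCR Buffer, 1 µL 10 mM dNTP, 1 µL 10 µM primer TrueseqD501, 1 µL 10 µM primer N701, 1 µL 10 mM ISPCR primer, 1 µL Taq; 72°C 5 min, 94°C 2 min, followed by 12 cycles of 94°C 30 sec, 55°C 30 sec, 72°C 3 min.
Sequence of primer TrueseqD501: AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT (SEQ ID NO:10)
Sequence of primer N701: CAAGCAGAAGACGGCATACGAGATATCGGCTAGTCTCGTGGGCTCGG (SEQ ID NO:11)
ISPCR primer: AAGCAGTGGTATCAACGCAGAGT (SEQ ID NO:15)
The mixed library above was purified with AMPure beads at 1:1 and quantified.
CUT&Tag library amplification
1 ng of the above DNA as template, 10 µL 5x PCR Buffer, 1 µL 10 mM dNTP, 1 µL 10 µM primer TrueseqD501, 1 µL 10 µM primer N701, 1 µL Taq; 94°C 2 min, followed by 18 cycles of 94°C 30 sec, 55°C 30 sec, 72°C 30 sec.
The library was purified with AMPure at 1:1, quantified and sequenced.
cDNA amplification and sequencing library construction
1 ng of the above DNA as template, 10 µL 5x PCR Buffer, 1 µL 10 mM dNTP, 1 µL 10 µM primer TrueseqD501, 1 µL 10 mM ISPCR primer, 1 µL Taq; 94°C 2 min, followed by 18 cycles of 94°C 30 sec, 55°C 30 sec, 72°C 30 sec.
The library was purified with AMPure at 1:1, quantified and sequenced.
Sequencing library fragmentation
1 ng cDNA, 10 µL 2x TD Buffer (Illumina Nextera kit), 1 µL Nextera enzyme (Illumina Nextera), in a 20 µL reaction; 55°C for 7 min. Add 5 µL Tn5 stop buffer (Nextera kit).
Library amplification
25 µL of the above reaction, 1 µL 10 µM primer TrueseqD501, 1 µL 10 µM Nextera N701 primer, 1 µL Taq enzyme; 72°C 5 min, 94°C 2 min, followed by 18 cycles of 94°C 30 sec, 60°C 30 sec, 72°C 3 sec.
The library was purified with AMPure XP beads at a 1:1 volume ratio.
Sequencing was on an Illumina Novaseq, with 100,000 PE150 reads per cell.
Simultaneous analysis of transcriptome and CUT&Tag.
The present application was used to detect the transcriptome and CUT&Tag of the same cell simultaneously. Figure 25 shows that individual cells of the two cell types can be well distinguished on the basis of either the transcriptome or the CUT&Tag data. The method of the present application is accurate when used to detect the transcriptome and CUT&Tag of the same cell simultaneously.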
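Example 7 Simultaneous detection of the transcriptome and genome from the same cell
The sample is treated as for detection of genomic DNA alone: the nuclei are first stripped (strip), then subjected to the Tn5 transposition reaction and then to the RT (reverse transcription) reaction, after which they are processed as in Example 5.
The specific steps are as follows:
Sample treatment
Cells were fixed with 4% formaldehyde in 1x PBS at room temperature for 10 min; glycine solution was added to a final concentration of 0.1 M to quench for 5 min at room temperature; the cells were washed twice with PBS and pelleted at 500 g for 5 min. Fixed cells can be stored at -80°C or -20°C. The cells were thawed at room temperature and treated with 10 mM Tris, 0.2% SDS at 42°C for 10 min, then washed three times with PBS. 100,000 nuclei were reacted with the p-Tn5 and Tn5-B prepared in the ATAC experiment of the Example above, in the following reaction system:
25 µL 2x TD Buffer (Illumina) + 2.5 µL 10 µM p-Tn5 + 2.5 µL 10 µM Tn5-B + 20 µL nuclei (100,000); react at 37°C for 30 min, then wash the nuclei with PBS.
The nuclei obtained above were subjected to the following RT reaction:
1000 nuclei/µL, 1x RT Buffer, 1 µM dNTP, 1 µM of the above reverse transcription primer, 1 U/µL RNase inhibitor, 1 µM TSO primer (5'-AAGCAGTGGTATCAACGCAGAGTACATrGrGrG (SEQ ID NO:14)-3', in which the G at the 3' end may be rG, where rG denotes riboguanosine), 1 unit/µL RT enzyme (Superscript II reverse transcriptase). Reaction conditions: 50°C 5 min, 4°C 5 min, 42°C 60 min. The nuclei were then washed twice with PBS, pelleting at 500 g for 5 min, to remove unreacted enzyme and primers.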
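High-throughput labelling
Cells were labelled with the microfluidic chip shown in Figure 8; bead channel: 100 µm, nuclei channel: 50 µm.
Prepare the following solutions:
Nuclei solution, 1 mL (100 nuclei/µL), comprising: 200 µL 10x T4 DNA ligase buffer, 10 µL T4 DNA ligase, 10 µL 1 M DTT, 780 µL nuclei/water.
Bead solution (100 beads/µL): beads in PBS.
The nuclei solution, the bead solution and oil (FC40 fluorocarbon oil containing 1% FluoroSurfactant, Ran Biotech) were used to form droplets 120 µm in diameter on the microfluidic chip (drop collection), and ligation was carried out at 37°C for 1 hour.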
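Library construction
An equal volume of perfluorooctanol was added to the labelled droplets to break them; after centrifugation the aqueous phase was collected to recover the DNA. Proteinase K reaction buffer and proteinase K were added, the crosslinks were reversed at 55-65°C and the DNA was purified, giving labelled whole-genome DNA. Subsequent processing was as follows:
The genomic DNA and mRNA/cDNA in the aqueous phase were purified with a Qiagen DNA purification kit.
Library amplification: simultaneous amplification of genomic DNA and mRNA/cDNA
The DNA was amplified in the following reaction to obtain the final sequencing library: 36 µL DNA template, 10 µL 5x PCR Buffer, 1 µL 10 mM dNTP, 1 µL 10 µM primer TrueseqD501, 1 µL 10 µM primer N701, 1 µL 10 mM ISPCR primer, 1 µL Taq; 72°C 5 min, 94°C 2 min, followed by 12 cycles of 94°C 30 sec, 55°C 30 sec, 72°C 3 min.
Sequence of primer TrueseqD501: AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT (SEQ ID NO:10)
Sequence of primer N701: CAAGCAGAAGACGGCATACGAGATATCGGCTAGTCTCGTGGGCTCGG (SEQ ID NO:11)
ISPCR primer: AAGCAGTGGTATCAACGCAGAGT (SEQ ID NO:15)
The mixed library above was purified with AMPure beads at 1:1 and quantified.
Genomic library amplification
1 ng of the above DNA as template, 10 µL 5x PCR Buffer, 1 µL 10 mM dNTP, 1 µL 10 µM primer TrueseqD501, 1 µL 10 µM primer N701, 1 µL Taq; 94°C 2 min, followed by 18 cycles of 94°C 30 sec, 55°C 30 sec, 72°C 30 sec. The library was purified with AMPure at 1:1, quantified and sequenced.
cDNA amplification and sequencing library construction
1 ng of the above DNA as template, 10 µL 5x PCR Buffer, 1 µL 10 mM dNTP, 1 µL 10 µM primer TrueseqD501, 1 µL 10 mM ISPCR primer, 1 µL Taq; 94°C 2 min, followed by 18 cycles of 94°C 30 sec, 55°C 30 sec, 72°C 30 sec. The library was purified with AMPure at 1:1, quantified and sequenced.
Sequencing library fragmentation
1 ng cDNA, 10 µL 2x TD Buffer (Illumina Nextera kit), 1 µL Nextera enzyme (Illumina Nextera), in a 20 µL reaction; 55°C for 7 min. Add 5 µL Tn5 stop buffer (Nextera kit).
Library amplification
25 µL of the above reaction, 1 µL 10 µM primer TrueseqD501, 1 µL 10 µM Nextera N701 primer, 1 µL Taq enzyme; 72°C 5 min, 94°C 2 min, followed by 18 cycles of 94°C 30 sec, 60°C 30 sec, 72°C 3 sec.
The library was purified with AMPure XP beads at a 1:1 volume ratio.
Sequencing was on an Illumina Novaseq, with 100,000 PE150 reads per cell.
Simultaneous analysis of transcriptome and genome
When used to detect the transcriptome and genome of the same cell simultaneously, the present application distinguishes individual cells of the two cell types well, and is accurate in the simultaneous detection of the transcriptome and genome of the same cell.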
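Example 8 Simultaneous detection of the transcriptome and DNA modifications from the same cell
The specific steps are as follows:
Sample treatment
Cells were fixed with 4% formaldehyde in 1x PBS at room temperature for 10 min; glycine solution was added to a final concentration of 0.1 M to quench for 5 min at room temperature; the cells were washed twice with PBS and pelleted at 500 g for 5 min. Fixed cells can be stored at -80°C or -20°C. The cells were thawed at room temperature and treated with 10 mM Tris, 0.2% SDS at 42°C for 10 min, then washed three times with PBS. 100,000 nuclei were reacted with the p-Tn5 and Tn5-B prepared in the ATAC experiment of the Example above, in the following reaction system:
25 µL 2x TD Buffer (Illumina) + 2.5 µL 10 µM p-Tn5 + 2.5 µL 10 µM Tn5-B + 20 µL nuclei (100,000); react at 37°C for 30 min, then wash the nuclei with PBS.
The nuclei obtained above were subjected to the following RT reaction:
1000 nuclei/µL, 1x RT Buffer, 1 µM dNTP, 1 µM of the above reverse transcription primer, 1 U/µL RNase inhibitor, 1 µM TSO primer (5'-AAGCAGTGGTATCAACGCAGAGTACATrGrGrG (SEQ ID NO:14)-3', in which the G at the 3' end may be rG, where rG denotes riboguanosine), 1 unit/µL RT enzyme (Superscript II reverse transcriptase). Reaction conditions: 50°C 5 min, 4°C 5 min, 42°C 60 min. The nuclei were then washed twice with PBS, pelleting at 500 g for 5 min, to remove unreacted enzyme and primers.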
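High-throughput labelling
Cells were labelled with the microfluidic chip shown in Figure 8; bead channel: 100 µm, nuclei channel: 50 µm.
Prepare the following solutions:
Nuclei solution, 1 mL (100 nuclei/µL), comprising: 200 µL 10x T4 DNA ligase buffer, 10 µL T4 DNA ligase, 10 µL 1 M DTT, 780 µL nuclei/water.
Bead solution (100 beads/µL): beads in PBS.
The nuclei solution, the bead solution and oil (FC40 fluorocarbon oil containing 1% FluoroSurfactant, Ran Biotech) were used to form droplets 120 µm in diameter on the microfluidic chip (drop collection), and ligation was carried out at 37°C for 1 hour.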
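Library construction
An equal volume of perfluorooctanol was added to the labelled droplets to break them; after centrifugation the aqueous phase was collected to recover the DNA. Proteinase K reaction buffer and proteinase K were added and the crosslinks were reversed at 55-65°C.
The DNA and mRNA/cDNA in the aqueous phase were purified with a Qiagen DNA purification kit.
The above library was split into 2 portions, for methylation sequencing and transcriptome sequencing respectively.
For the methylation library, bisulfite sequencing or 5hmC sequencing was performed.
Methylation sequencing
The DNA obtained above is first converted with a conversion kit such as the EpiTect Fast Bisulfite Conversion Kit or the NEB Enzymatic Methylation conversion kit. Taking the Qiagen kit as an example, prepare the bisulfite conversion reagents according to the manual:
the above DNA, 85 µL Bisulfite solution, 35 µL DNA protection Buffer, and H2O to a total volume of 140 µL.
95°C 5 min, 60°C 10 min, 95°C 5 min, 60°C 10 min, hold at 20°C. Column-purify the converted DNA following the steps in the manual.
DNA amplification
The DNA was amplified in the following reaction to obtain the final sequencing library: 36 µL DNA template, 10 µL 5x PCR Buffer, 1 µL 10 mM dNTP, 1 µL 10 µM primer TrueseqD501, 1 µL 10 µM primer N701, 1 µL Taq; 94°C 2 min, followed by 18 cycles of 94°C 30 sec, 55°C 30 sec, 72°C 30 sec.
Sequence of primer TrueseqD501: AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT (SEQ ID NO:10)
Sequence of primer N701: CAAGCAGAAGACGGCATACGAGATATCGGCTAGTCTCGTGGGCTCGG (SEQ ID NO:11)
Sequencing was on an Illumina Novaseq, with 100,000 PE150 reads per cell.
5hmC methylation sequencing: 5hmC enrichment of the recovered DNA was performed with the Thermo Fisher EpiJET 5-hmC Enrichment Kit, followed by library construction and sequencing.
Recovered DNA, 12.5 µL 4X Enzyme Reaction Buffer, 10 µL 5-hmC Modifying Enzyme, water to 50 µL; react at 30°C for 1 hr. Purify the DNA with magnetic beads at a 1:1 volume ratio.
40 µL eluted sample, 10 µL 10x biotin conjugation buffer, 50 µL biotin reagent; 50°C for 5 min; add 100 µL elution buffer to stop the reaction, then column-purify the DNA with the kit.
DNA amplification
The DNA was amplified in the following reaction to obtain the final sequencing library: 36 µL DNA template, 10 µL 5x PCR Buffer, 1 µL 10 mM dNTP, 1 µL 10 µM primer TrueseqD501, 1 µL 10 µM primer N701, 1 µL Taq; 94°C 2 min, followed by 18 cycles of 94°C 30 sec, 55°C 30 sec, 72°C 30 sec.
Sequence of primer TrueseqD501: AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT (SEQ ID NO:10)
Sequence of primer N701: CAAGCAGAAGACGGCATACGAGATATCGGCTAGTCTCGTGGGCTCGG (SEQ ID NO:11)
Sequencing was on an Illumina Novaseq, with 100,000 PE150 reads per cell.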
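cDNA amplification and sequencing library construction
The DNA and cDNA/mRNA recovered above as template, 10 µL 5x PCR Buffer, 1 µL 10 mM dNTP, 1 µL 10 µM primer TrueseqD501, 1 µL 10 mM ISPCR primer, 1 µL Taq; 94°C 2 min, followed by 18 cycles of 94°C 30 sec, 55°C 30 sec, 72°C 30 sec.
The library was purified with AMPure at 1:1, quantified and sequenced.
Sequencing library fragmentation
1 ng cDNA, 10 µL 2x TD Buffer (Illumina Nextera kit), 1 µL Nextera enzyme (Illumina Nextera), in a 20 µL reaction; 55°C for 7 min. Add 5 µL Tn5 stop buffer (Nextera kit).
Library amplification
25 µL of the above reaction, 1 µL 10 µM primer TrueseqD501, 1 µL 10 µM Nextera N701 primer, 1 µL Taq enzyme; 72°C 5 min, 94°C 2 min, followed by 18 cycles of 94°C 30 sec, 60°C 30 sec, 72°C 3 sec.
The library was purified with AMPure XP beads at a 1:1 volume ratio.
Sequencing was on an Illumina Novaseq, with 100,000 PE150 reads per cell.
Simultaneous analysis of transcriptome and methylation.
The present application was used to detect the transcriptome and methylation of the same cell simultaneously. Figure 26 shows that both the transcriptome and the methylome of the same cell match well with gene models and known methylation sites. The method of the present application is accurate when used to detect the transcriptome and methylation of the same cell simultaneously.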
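Example 9 Spatial multi-omics technology platform
(1) Spatial array chip: the chip carries clusters of DNA oligos at fixed spacing, with the following structure:
Slide surface - releasable linker - PCR adaptor - barcode - linker arm
The chip is hybridized with a complementary single strand, which converts the oligo array into the following structure:
Slide surface - releasable linker - PCR adaptor - barcode - linker arm
linker arm --- complementary strand
The spatial array is synthesized by in situ microarray synthesis (Affymetrix, NimbleGen) or by other methods, including transfer from an existing array by PCR and extension by sequential labelling.
(2) Tissue section preparation: a cryosection of unfixed tissue is mounted on a coverslip, 1% formaldehyde is added to fix the tissue, and the section is washed.
(3) Permeabilization: the tissue is treated with a detergent-containing buffer.
(4) Reverse transcription reaction mix is added on top of the tissue, and in situ RT (reverse transcription) is carried out with a reverse transcription primer that carries a 5' phosphate modification and a 5' extension complementary to the oligos on the chip.
(5) The reverse transcription system is washed away, and Tn5 enzyme carrying a 5' phosphate modification is added to the slide for the in situ ATAC reaction.
(6) The ATAC reaction system is washed away, DNA ligase buffer and DNA ligase are added on top of the tissue, and the tissue is then pressed onto the DNA oligo array so that the two are in close contact. The DNA oligos are released from the slide and transferred onto the tissue section, where the ligation reaction labels the cDNA and the Tn5 products.
(7) After the reaction, the tissue is imaged.
(8) The reaction is stopped, the tissue is digested with protease, the DNA is recovered, and libraries of the cDNA and ATAC DNA are constructed and sequenced as in the preceding Examples.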
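The specific steps are as follows:
Using Affymetrix technology, a 100 x 100 array of primer spots, each 5 µm in size with 5 µm spacing, was synthesized on a glass/silicon substrate, with a total area of 1 cm x 1 cm and 10,000 DNA oligo spots in all. Figure 27 shows one such spatial array chip, in which the DNA array can be a regularly arranged dT primer array hybridized with the primer FAM-AAAAAAAAAAAAAAAAAAAAAAAA (SEQ ID NO:17). The specific array DNA sequence is:
S-S-ACACTCTTTCCCTACACGACGCTCT (SEQ ID NO:16)-NNNNNNNN-ATCCACGTGCTTGAG (SEQ ID NO:12)
In the array DNA sequence, NNNNNNNN is a specific 8 bp primer sequence; each spot on the array corresponds to one specific 8 bp sequence.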
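The one-barcode-per-spot layout described above can be sketched as a lookup from the 8-nt barcode to a spot position. The assignment order (barcodes enumerated alphabetically, row by row), the 10 µm pitch derived from a 5 µm spot plus a 5 µm gap, and all function names are assumptions for illustration; the text does not say how barcodes are assigned to spots.

```python
from itertools import product

ROWS = COLS = 100            # 100 x 100 spots, as in the text
SPOT_UM = 5.0                # spot size
PITCH_UM = 10.0              # assumed centre-to-centre distance: 5 um spot + 5 um gap
PCR_ADAPTOR = "ACACTCTTTCCCTACACGACGCTCT"   # SEQ ID NO:16
LINKER_ARM = "ATCCACGTGCTTGAG"              # SEQ ID NO:12


def build_barcode_map():
    """Assign the first ROWS*COLS 8-mers (alphabetical order) to spots, row by row."""
    barcodes = ("".join(p) for p in product("ACGT", repeat=8))
    return {bc: divmod(i, COLS) for i, bc in zip(range(ROWS * COLS), barcodes)}


def spot_center_um(row, col):
    """(x, y) of the spot centre in micrometres from the array corner."""
    return (col * PITCH_UM + SPOT_UM / 2, row * PITCH_UM + SPOT_UM / 2)


def extract_barcode(oligo):
    """Return the 8-nt barcode flanked by the PCR adaptor and linker arm, or None."""
    start = oligo.find(PCR_ADAPTOR)
    if start < 0:
        return None
    bc_start = start + len(PCR_ADAPTOR)
    barcode = oligo[bc_start:bc_start + 8]
    if len(barcode) == 8 and oligo.startswith(LINKER_ARM, bc_start + 8):
        return barcode
    return None


barcode_map = build_barcode_map()
oligo = PCR_ADAPTOR + "AAAACGCC" + LINKER_ARM   # made-up spot oligo
barcode = extract_barcode(oligo)
row, col = barcode_map[barcode]
print(barcode, (row, col), spot_center_um(row, col))
```

An 8-nt barcode allows 4^8 = 65,536 distinct sequences, which is enough to give each of the 10,000 spots its own sequence.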
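The primer CGAATGCTCTGGCCTCTCAAGCACGTGGAT (SEQ ID NO:9) was added on top of the slide and hybridized to the slide in 1 M NaCl, 10 mM Tris at room temperature for 1 hr, so that the primers on the array annealed into partially double-stranded primers.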
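The partially double-stranded structure formed in this step can be verified from the sequences themselves: the 3' half of SEQ ID NO:9 is the reverse complement of the linker arm (SEQ ID NO:12), leaving the 5' half single-stranded. The variable names below are illustrative; that the single-stranded overhang is the part that pairs with the adaptor on the cDNA and Tn5 products is inferred from the described workflow rather than stated on this line.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


LINKER_ARM = "ATCCACGTGCTTGAG"                   # SEQ ID NO:12, 3' end of the array oligo
SPLINT = "CGAATGCTCTGGCCTCTCAAGCACGTGGAT"        # SEQ ID NO:9, hybridized strand

paired = reverse_complement(LINKER_ARM)          # part of the splint that pairs with the arm
is_partial_duplex = SPLINT.endswith(paired)
overhang = SPLINT[: len(SPLINT) - len(paired)]   # 5' part left single-stranded

print("pairs with linker arm:", paired)
print("single-stranded overhang:", overhang)
```

The check confirms a 15-nt duplex with the linker arm and a 15-nt single-stranded overhang on the hybridized strand.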
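OCT-embedded tissue was sectioned on a cryostat and mounted on a poly-lysine-coated slide.
The tissue was fixed with 1% formaldehyde at room temperature for 10 minutes, and the slide was washed with PBS.
The tissue on the slide was treated with lysis buffer (10 mM Tris-Cl, pH 7.4; 10 mM NaCl; 3 mM MgCl2; 0.01% NP-40) at room temperature for 5 min.
The slide was reacted with the p-Tn5 and Tn5-B obtained in the Examples of the present application, in the following reaction system:
25 µL 2x TD Buffer (Illumina), 2.5 µL 10 µM p-Tn5, 2.5 µL 10 µM Tn5-B, 20 µL nuclei (100,000); react at 37°C for 30 min, then wash the section with PBS.
RT reaction on the section
1000 nuclei/µL, 1x RT Buffer, 1 µM dNTP, 1 µM of the above reverse transcription primer, 1 U/µL RNase inhibitor, 1 µM TSO primer (5'-AAGCAGTGGTATCAACGCAGAGTACATrGrGrG (SEQ ID NO:14)-3', in which the G at the 3' end may be rG, where rG denotes riboguanosine), 1 unit/µL RT enzyme (Superscript II reverse transcriptase). Reaction conditions: 50°C 5 min, 4°C 5 min, 42°C 60 min. The section was then washed with PBS.
The reacted tissue was brought into contact with the synthesized primer-array slide, and 1x T4 ligase buffer and 1 unit/µL T4 DNA ligase were added, so that the partially double-stranded adaptors on the slide were ligated to the RT products and ATAC products on the tissue section.
Recovery of cDNA and ATAC DNA. Proteinase K reaction buffer and proteinase K were added on top of the section, the crosslinks were reversed at 55-65°C, and the genomic DNA and reverse-transcribed mRNA/cDNA were then purified with a Qiagen kit.
The ATAC DNA and mRNA/cDNA in the aqueous phase were purified with a Qiagen DNA purification kit.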
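Library amplification: simultaneous amplification of ATAC DNA and mRNA/cDNA
The DNA and cDNA were amplified in the following reaction: 36 µL DNA template, 10 µL 5x PCR Buffer, 1 µL 10 mM dNTP, 1 µL 10 µM primer TrueseqD501, 1 µL 10 µM primer N701, 1 µL 10 mM ISPCR primer, 1 µL Taq; 72°C 5 min, 94°C 2 min, followed by 12 cycles of 94°C 30 sec, 55°C 30 sec, 72°C 3 min.
Sequence of primer TrueseqD501: AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT (SEQ ID NO:10)
Sequence of primer N701: CAAGCAGAAGACGGCATACGAGATATCGGCTAGTCTCGTGGGCTCGG (SEQ ID NO:11)
ISPCR primer: AAGCAGTGGTATCAACGCAGAGT (SEQ ID NO:15)
The mixed library above was purified with AMPure beads at 1:1 and quantified.
ATAC library amplification
1 ng of the above DNA as template, 10 µL 5x PCR Buffer, 1 µL 10 mM dNTP, 1 µL 10 µM primer TrueseqD501, 1 µL 10 µM primer N701, 1 µL Taq; 94°C 2 min, followed by 18 cycles of 94°C 30 sec, 55°C 30 sec, 72°C 30 sec. The library was purified with AMPure at 1:1, quantified and sequenced.
cDNA amplification and sequencing library construction
1 ng of the above DNA as template, 10 µL 5x PCR Buffer, 1 µL 10 mM dNTP, 1 µL 10 µM primer TrueseqD501, 1 µL 10 mM ISPCR primer, 1 µL Taq; 94°C 2 min, followed by 18 cycles of 94°C 30 sec, 55°C 30 sec, 72°C 30 sec. The library was purified with AMPure at 1:1, quantified and sequenced.
Sequencing library fragmentation
1 ng cDNA, 10 µL 2x TD Buffer (Illumina Nextera kit), 1 µL Nextera enzyme (Illumina Nextera), in a 20 µL reaction; 55°C for 7 min. Add 5 µL Tn5 stop buffer (Nextera kit).
Library amplification
25 µL of the above reaction, 1 µL 10 µM primer TrueseqD501, 1 µL 10 µM Nextera N701 primer, 1 µL Taq enzyme; 72°C 5 min, 94°C 2 min, followed by 18 cycles of 94°C 30 sec, 60°C 30 sec, 72°C 3 sec.
The library was purified with AMPure XP beads at a 1:1 volume ratio.
Sequencing was on an Illumina Novaseq, with 100,000 PE150 reads per cell.
Simultaneous analysis of transcriptome and genome.
Figure 28 shows the HE-stained section overlaid with the spatial array chip; the colour intensity of each dot indicates the number of genes measured. The method of the present application can be used for research on spatial multi-omics technology platforms.
The foregoing detailed description is provided by way of explanation and example and is not intended to limit the scope of the appended claims. Many variations of the embodiments presently set out in this application will be apparent to those of ordinary skill in the art and remain within the scope of the appended claims and their equivalents.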

Claims (114)

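  1. A method for analyzing target nucleic acid from a cell, the method comprising:
    a) providing a discrete partition comprising:
    i. target nucleic acid derived from a single cell, wherein at least part of the target nucleic acid has had an oligonucleotide adaptor sequence added to it to become attached target nucleic acid; and
    ii. a solid support to which at least one oligonucleotide tag is attached, wherein each oligonucleotide tag comprises a first strand and a second strand, the first strand comprising a barcode sequence and a hybridization sequence located at the 3' end of the barcode sequence, the second strand comprising a first portion complementary to the hybridization sequence of the first strand and a second portion complementary to the oligonucleotide adaptor sequence attached to the target nucleic acid, and the first strand and the second strand form a partially double-stranded structure or the second strand and the attached target nucleic acid form a partially double-stranded structure;
    b) in the discrete partition, ligating the oligonucleotide tag to the attached target nucleic acid, thereby producing barcoded target nucleic acid.
  2. The method according to claim 1, wherein the oligonucleotide tag is releasably attached to the solid support.
  3. The method according to any one of claims 1-2, comprising releasing the at least one oligonucleotide tag from the solid support and, in b), ligating the released oligonucleotide tag to the attached target nucleic acid, thereby producing barcoded target nucleic acid.
  4. The method according to any one of claims 1-3, wherein the oligonucleotide tag is attached, directly or indirectly, to the solid support via the 5' end of its first strand.
  5. The method according to any one of claims 1-4, wherein the discrete partition further comprises a ligase, and the ligase ligates the oligonucleotide tag to the attached target nucleic acid.
  6. The method according to claim 5, wherein the ligase comprises T4 ligase.
  7. The method according to any one of claims 1-6, wherein, in the barcoded target nucleic acid, the target nucleic acid sequence is located at the 3' end of the barcode sequence.
  8. The method according to any one of claims 1-7, wherein the solid support is a bead.
  9. The method according to claim 8, wherein the bead is a magnetic bead.
  10. The method according to any one of claims 1-9, wherein the discrete partition is a well or a droplet.
  11. The method according to any one of claims 1-10, wherein the barcode sequence comprises a cell barcode sequence, and the oligonucleotide tags attached to the same solid support comprise the same cell barcode sequence.
  12. The method according to claim 11, wherein the cell barcode sequence comprises at least 2 cell barcode segments separated by a linker sequence.
  13. The method according to any one of claims 1-12, wherein a) comprises co-partitioning the target nucleic acid derived from a single cell and the solid support to which at least one oligonucleotide tag is attached into the discrete partition.
  14. The method according to any one of claims 1-13, wherein b) comprises ligating the hybridization sequence of the first strand of the oligonucleotide tag to the oligonucleotide adaptor attached to the target nucleic acid, thereby producing the barcoded target nucleic acid.
  15. The method according to any one of claims 1-14, wherein b) comprises hybridizing the second portion of the second strand of the oligonucleotide tag to the oligonucleotide adaptor attached to the target nucleic acid, and ligating the hybridization sequence of the first strand of the oligonucleotide tag to the oligonucleotide adaptor attached to the target nucleic acid, thereby producing the barcoded target nucleic acid.
  16. The method according to any one of claims 1-15, wherein the attached target nucleic acid comprises a unique molecular identification region.
  17. The method according to claim 16, wherein the unique molecular identification region is located between the oligonucleotide adaptor sequence and the target nucleic acid sequence.
  18. The method according to any one of claims 1-17, wherein the oligonucleotide tag further comprises an amplification primer recognition region.
  19. The method according to claim 18, wherein the amplification primer recognition region is a universal amplification primer recognition region.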
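  20. The method according to any one of claims 1-19, further comprising:
    c) obtaining a characterization result of the barcoded target nucleic acid; and
    d) identifying the sequence of the target nucleic acid as being derived from the single cell, based at least in part on the presence of the same cell barcode sequence in the characterization result obtained in c).
  21. The method according to claim 20, further comprising, after b) and before c), releasing the barcoded target nucleic acid from the discrete partition.
  22. The method according to any one of claims 20-21, wherein c) comprises sequencing the barcoded target nucleic acid, thereby obtaining the characterization result.
  23. The method according to any one of claims 20-22, further comprising assembling, from the sequences of the barcoded target nucleic acid, a contiguous nucleic acid sequence of at least a portion of the genome of the single cell.
  24. The method according to claim 23, wherein the single cell is characterized based on the nucleic acid sequence of the at least a portion of the genome of the single cell.
  25. The method according to any one of claims 1-24, wherein each discrete partition contains target nucleic acid derived from at most a single cell.
  26. The method according to any one of claims 20-25, further comprising identifying an individual nucleic acid sequence in the barcoded target nucleic acid as being derived from a given nucleic acid in the target nucleic acid, based at least in part on the presence of the unique molecular identification region.
  27. The method according to any one of claims 20-26, wherein the target nucleic acid comprises exogenous nucleic acid, the exogenous nucleic acid comprising exogenous nucleic acid linked to a protein, lipid and/or small-molecule compound, the protein, lipid and/or small-molecule compound being capable of binding a target molecule within the cell.
  28. The method according to claim 27, further comprising determining the amount of a given nucleic acid in the target nucleic acid based on the presence of the unique molecular identification region.
  29. The method according to any one of claims 1-28, comprising pretreating the cell before a).
  30. The method according to claim 29, wherein the pretreatment comprises fixing the cell.
  31. The method according to claim 30, wherein the cell is fixed with a fixative selected from one or more of the following group: formaldehyde, paraformaldehyde, methanol, ethanol, acetone, glutaraldehyde, osmic acid and potassium dichromate.
  32. The method according to any one of claims 29-31, wherein the pretreatment comprises exposing the nucleus of the cell.
  33. The method according to any one of claims 29-32, wherein the pretreatment comprises treating the cell with a detergent, the detergent comprising Triton, NP-40 and/or digitonin.
  34. The method according to any one of claims 1-33, wherein the target nucleic acid comprises one or more selected from the following group: DNA, RNA and cDNA.
  35. The method according to any one of claims 20-34, further comprising, after b) and before c), amplifying the barcoded target nucleic acid.
  36. The method according to claim 35, comprising, after b) and before c), releasing the barcoded target nucleic acid from the discrete partition, the amplification being carried out after the barcoded target nucleic acid has been released from the discrete partition.
  37. The method according to any one of claims 35-36, wherein amplification primers are used in the amplification, and the amplification primers comprise a random priming sequence.
  38. The method according to claim 37, wherein the random priming sequence is a random hexamer.
  39. The method according to any one of claims 35-38, wherein the amplification comprises at least partially hybridizing the random priming sequence to the barcoded target nucleic acid and extending the random priming sequence in a template-directed manner.
  40. The method according to any one of claims 1-39, comprising releasing at least a portion of the target nucleic acid from the single cell in the discrete partition to outside the cell and, in b), ligating the released target nucleic acid to the oligonucleotide tag, thereby producing barcoded target nucleic acid.
  41. The method according to any one of claims 1-40, comprising causing at least a portion of the oligonucleotide tags released from the solid support to enter the single cell and, in b), to be ligated to the target nucleic acid, thereby producing barcoded target nucleic acid.
  42. The method according to any one of claims 1-41, comprising using a microfluidic device to co-partition the target nucleic acid derived from a single cell and the solid support to which at least one oligonucleotide tag is attached into the discrete partition.
  43. The method according to claim 42, wherein the discrete partition is a droplet and the microfluidic device is a droplet generator.
  44. The method according to any one of claims 42-43, wherein the microfluidic device comprises a first input channel and a second input channel that meet at a junction in fluid connection with an output channel.
  45. The method according to claim 44, wherein the method further comprises introducing a sample comprising the target nucleic acid into the first input channel and introducing the solid support to which at least one oligonucleotide tag is attached into the second input channel, thereby generating a mixture of the sample and the solid support in the output channel.
  46. The method according to claim 45, wherein the output channel is in fluid connection with a third input channel at a junction.
  47. The method according to claim 46, further comprising introducing oil into the third input channel so that aqueous droplets within a water-in-oil emulsion are formed as the discrete partitions.
  48. The method according to claim 47, wherein each discrete partition contains target nucleic acid from at most a single cell.
  49. The method according to any one of claims 44-48, wherein the first input channel and the second input channel form a substantially perpendicular angle with each other.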
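  50. The method according to any one of claims 1-49, wherein the target nucleic acid comprises cDNA derived from RNA in the single cell.
  51. The method according to claim 50, wherein the RNA comprises mRNA.
  52. The method according to any one of claims 34-51, comprising reverse transcribing the RNA before a) and producing the attached target nucleic acid.
  53. The method according to claim 52, wherein a reverse transcription primer is used in the reverse transcription, the reverse transcription primer comprising, in the 5' to 3' direction, the oligonucleotide adaptor sequence and a polyT sequence.
  54. The method according to claim 53, wherein the reverse transcription comprises hybridizing the polyT sequence to the RNA and extending the polyT sequence in a template-directed manner.
  55. The method according to any one of claims 1-54, wherein the target nucleic acid comprises DNA derived from the single cell.
  56. The method according to claim 55, wherein the DNA comprises genomic DNA, open chromatin DNA, protein-bound DNA regions and/or exogenous nucleic acid linked to a protein, lipid and/or small-molecule compound, the protein, lipid and/or small-molecule compound being capable of binding a target molecule within the cell.
  57. The method according to claim 56, comprising fragmenting the DNA derived from the single cell before a).
  58. The method according to claim 57, wherein the attached target nucleic acid is produced after or during the fragmentation.
  59. The method according to any one of claims 57-58, wherein the fragmentation comprises breaking by sonication and then adding a sequence comprising the oligonucleotide adaptor to the broken DNA, thereby obtaining the attached target nucleic acid.
  60. The method according to any one of claims 57-59, wherein the fragmentation comprises cutting with a DNA endonuclease or exonuclease and then adding a sequence comprising the oligonucleotide adaptor to the broken DNA, thereby obtaining the attached target nucleic acid.
  61. The method according to any one of claims 57-60, wherein the fragmentation comprises using a transposase-nucleic acid complex to integrate a sequence comprising the oligonucleotide adaptor into the DNA, and releasing the transposase to obtain the attached target nucleic acid.
  62. The method according to claim 61, wherein the transposase-nucleic acid complex comprises a transposase and a transposon end nucleic acid molecule, wherein the transposon end nucleic acid molecule comprises the oligonucleotide adaptor sequence.
  63. The method according to any one of claims 61-62, wherein the transposase comprises Tn5.
  64. The method according to any one of claims 61-63, wherein the DNA comprises protein-bound DNA regions, and the transposase-nucleic acid complex further comprises a moiety that directly or indirectly recognizes the protein.
  65. The method according to claim 64, wherein the moiety that directly or indirectly recognizes the protein comprises one or more of the following group: an antibody that specifically binds the protein, and protein A or protein G.
  66. A composition comprising: a plurality of solid supports, each solid support having at least one oligonucleotide tag attached to it, wherein each oligonucleotide tag comprises a first strand and a second strand, the first strand comprising a barcode sequence and a hybridization sequence located at the 3' end of the barcode sequence, the second strand comprising a first portion complementary to the hybridization sequence of the first strand and a second portion complementary to a sequence in the nucleic acid to be tested, and the first strand and the second strand form a partially double-stranded structure or the second strand and the attached target nucleic acid form a partially double-stranded structure; the barcode sequence of the oligonucleotide tag comprises a common barcode domain and a variable domain, the common barcode domain being identical among the oligonucleotide tags attached to the same solid support, and the common barcode domain differing between two or more of the plurality of solid supports.
  67. A kit for analyzing target nucleic acid from a cell, comprising the composition according to claim 66.
  68. The kit according to claim 67, comprising a transposase.
  69. The kit according to any one of claims 67-68, further comprising at least one of a nucleic acid amplification agent, a reverse transcription agent, a fixative, a permeabilizing agent, a ligation agent and a lysis agent.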
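  70. A method for amplifying target nucleic acid from a cell, the method comprising:
    a) providing a discrete partition comprising: i. target nucleic acid derived from a single cell, wherein at least part of the target nucleic acid has had an oligonucleotide adaptor sequence added to it to become attached target nucleic acid; and ii. a solid support to which at least one oligonucleotide tag is attached, wherein each oligonucleotide tag comprises a first strand and a second strand, the first strand comprising a barcode sequence and a hybridization sequence located at the 3' end of the barcode sequence, the second strand comprising a first portion complementary to the hybridization sequence of the first strand and a second portion complementary to the oligonucleotide adaptor sequence attached to the target nucleic acid, and the first strand and the second strand form a partially double-stranded structure or the second strand and the attached target nucleic acid form a partially double-stranded structure;
    b) in the discrete partition, ligating the oligonucleotide tag to the attached target nucleic acid, thereby producing barcoded target nucleic acid; and
    c) amplifying the barcoded target nucleic acid.
  71. The method according to claim 70, wherein the oligonucleotide tag is releasably attached to the solid support.
  72. The method according to claim 71, comprising releasing the at least one oligonucleotide tag from the solid support and, in b), ligating the released oligonucleotide tag to the attached target nucleic acid, thereby producing barcoded target nucleic acid.
  73. The method according to any one of claims 70-72, wherein the oligonucleotide tag is attached, directly or indirectly, to the solid support via the 5' end of its first strand.
  74. The method according to any one of claims 70-73, wherein the discrete partition further comprises a ligase, and the ligase ligates the oligonucleotide tag to the attached target nucleic acid.
  75. The method according to claim 74, wherein the ligase comprises T4 ligase.
  76. The method according to any one of claims 70-75, wherein, in the barcoded target nucleic acid, the target nucleic acid sequence is located at the 3' end of the barcode sequence.
  77. The method according to any one of claims 70-76, wherein the solid support is a bead.
  78. The method according to any one of claims 70-77, wherein the discrete partition is a well or a droplet.
  79. The method according to any one of claims 70-78, wherein the barcode sequence comprises a cell barcode sequence, and the oligonucleotide tags attached to the same solid support comprise the same cell barcode sequence.
  80. The method according to claim 79, wherein the cell barcode sequence comprises at least 2 cell barcode segments separated by a linker sequence.
  81. The method according to any one of claims 70-80, wherein a) comprises co-partitioning the target nucleic acid derived from a single cell and the solid support to which at least one oligonucleotide tag is attached into the discrete partition.
  82. The method according to any one of claims 70-81, wherein b) comprises ligating the hybridization sequence of the first strand of the oligonucleotide tag to the oligonucleotide adaptor attached to the target nucleic acid, thereby producing the barcoded target nucleic acid.
  83. The method according to any one of claims 70-82, wherein b) comprises hybridizing the second portion of the second strand of the oligonucleotide tag to the oligonucleotide adaptor attached to the target nucleic acid, and ligating the hybridization sequence of the first strand of the oligonucleotide tag to the oligonucleotide adaptor attached to the target nucleic acid, thereby producing the barcoded target nucleic acid.
  84. The method according to any one of claims 70-83, wherein the attached target nucleic acid comprises a unique molecular identification region.
  85. The method according to claim 84, wherein the unique molecular identification region is located between the oligonucleotide adaptor sequence and the target nucleic acid sequence.
  86. The method according to any one of claims 70-85, wherein the oligonucleotide tag further comprises an amplification primer recognition region.
  87. The method according to claim 86, wherein the amplification primer recognition region is a universal amplification primer recognition region.
  88. The method according to claim 87, comprising, after b) and before c), releasing the barcoded target nucleic acid from the discrete partition, the amplification being carried out after the barcoded target nucleic acid has been released from the discrete partition.
  89. The method according to any one of claims 70-88, wherein amplification primers are used in the amplification, and the amplification primers comprise a random priming sequence.
  90. The method according to claim 89, wherein the random priming sequence is a random hexamer.
  91. The method according to any one of claims 70-90, wherein the amplification comprises at least partially hybridizing the random priming sequence to the barcoded target nucleic acid and extending the random priming sequence in a template-directed manner.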
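  92. A method for sequencing target nucleic acid from a cell, the method comprising:
    a) providing a discrete partition comprising: i. target nucleic acid derived from a single cell, wherein at least part of the target nucleic acid has had an oligonucleotide adaptor sequence added to it to become attached target nucleic acid; and ii. a solid support to which at least one oligonucleotide tag is attached, wherein each oligonucleotide tag comprises a first strand and a second strand, the first strand comprising a barcode sequence and a hybridization sequence located at the 3' end of the barcode sequence, the second strand comprising a first portion complementary to the hybridization sequence of the first strand and a second portion complementary to the oligonucleotide adaptor sequence attached to the target nucleic acid, and the first strand and the second strand form a partially double-stranded structure or the second strand and the attached target nucleic acid form a partially double-stranded structure;
    b) in the discrete partition, ligating the oligonucleotide tag to the attached target nucleic acid, thereby producing barcoded target nucleic acid; and
    c) sequencing the barcoded target nucleic acid.
  93. The method according to claim 92, wherein the oligonucleotide tag is releasably attached to the solid support.
  94. The method according to claim 93, comprising releasing the at least one oligonucleotide tag from the solid support and, in b), ligating the released oligonucleotide tag to the attached target nucleic acid, thereby producing barcoded target nucleic acid.
  95. The method according to any one of claims 92-94, wherein the oligonucleotide tag is attached, directly or indirectly, to the solid support via the 5' end of its first strand.
  96. The method according to any one of claims 92-95, wherein the discrete partition further comprises a ligase, and the ligase ligates the oligonucleotide tag to the attached target nucleic acid.
  97. The method according to claim 96, wherein the ligase comprises T4 ligase or T7 ligase.
  98. The method according to any one of claims 92-97, wherein, in the barcoded target nucleic acid, the target nucleic acid sequence is located at the 3' end of the barcode sequence.
  99. The method according to any one of claims 92-98, wherein the solid support is a bead.
  100. The method according to any one of claims 92-99, wherein the discrete partition is a well or a droplet.
  101. The method according to any one of claims 92-100, wherein the barcode sequence comprises a cell barcode sequence, and the oligonucleotide tags attached to the same solid support comprise the same cell barcode sequence.
  102. The method according to claim 101, wherein the cell barcode sequence comprises at least 2 cell barcode segments separated by a linker sequence.
  103. The method according to any one of claims 92-102, wherein a) comprises co-partitioning the target nucleic acid derived from a single cell and the solid support to which at least one oligonucleotide tag is attached into the discrete partition.
  104. The method according to any one of claims 92-103, wherein b) comprises ligating the hybridization sequence of the first strand of the oligonucleotide tag to the oligonucleotide adaptor attached to the target nucleic acid, thereby producing the barcoded target nucleic acid.
  105. The method according to any one of claims 92-104, wherein b) comprises hybridizing the second portion of the second strand of the oligonucleotide tag to the oligonucleotide adaptor attached to the target nucleic acid, and ligating the hybridization sequence of the first strand of the oligonucleotide tag to the oligonucleotide adaptor attached to the target nucleic acid, thereby producing the barcoded target nucleic acid.
  106. The method according to any one of claims 92-105, wherein the attached target nucleic acid comprises a unique molecular identification region.
  107. The method according to claim 106, wherein the unique molecular identification region is located between the oligonucleotide adaptor sequence and the target nucleic acid sequence.
  108. The method according to any one of claims 92-107, wherein the oligonucleotide tag further comprises an amplification primer recognition region.
  109. The method according to claim 108, wherein the amplification primer recognition region is a universal amplification primer recognition region.
  110. The method according to any one of claims 92-109, further comprising assembling, from the sequences of the barcoded target nucleic acid, a contiguous nucleic acid sequence of at least a portion of the genome of the single cell.
  111. The method according to claim 110, wherein the single cell is characterized based on the nucleic acid sequence of the at least a portion of the genome of the single cell.
  112. The method according to any one of claims 92-111, wherein each discrete partition contains target nucleic acid derived from at most a single cell.
  113. The method according to any one of claims 92-112, further comprising identifying an individual nucleic acid sequence in the barcoded target nucleic acid as being derived from a given nucleic acid in the target nucleic acid, based at least in part on the presence of the unique molecular identification region.
  114. The method according to claim 113, further comprising determining the amount of a given nucleic acid in the target nucleic acid based on the presence of the unique molecular identification region.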
PCT/CN2021/097800 2020-06-03 2021-06-02 分析来自细胞的目标核酸的方法 Ceased WO2021244557A1 (zh)

Priority Applications (5)

Application Number Priority Date Filing Date Title
EP21817287.2A EP4163390A4 (en) 2020-06-03 2021-06-02 METHOD FOR ANALYZING THE TARGET NUCLEIC ACID OF A CELL
CN202180039759.9A CN116234926A (zh) 2020-06-03 2021-06-02 分析来自细胞的目标核酸的方法
CA3181004A CA3181004A1 (en) 2020-06-03 2021-06-02 Method for analyzing target nucleic acid from cell
JP2022574773A JP7853705B2 (ja) 2020-06-03 2021-06-02 細胞に由来する標的核酸を解析する方法
US18/000,665 US20230212648A1 (en) 2020-06-03 2021-06-02 Method for analyzing target nucleic acid from cell

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN202010495334.6 2020-06-03
CN202010495334 2020-06-03
CN202010506791.0 2020-06-05
CN202010506791 2020-06-05

Publications (1)

Publication Number Publication Date
WO2021244557A1 true WO2021244557A1 (zh) 2021-12-09

Family

ID=78830665

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/097800 Ceased WO2021244557A1 (zh) 2020-06-03 2021-06-02 分析来自细胞的目标核酸的方法

Country Status (5)

Country Link
US (1) US20230212648A1 (zh)
EP (1) EP4163390A4 (zh)
CN (1) CN116234926A (zh)
CA (1) CA3181004A1 (zh)
WO (1) WO2021244557A1 (zh)

