WO2025175229A1 - Multi-step methods for detecting nucleic acids - Google Patents

Multi-step methods for detecting nucleic acids

Info

Publication number
WO2025175229A1
Authority
WO
WIPO (PCT)
Prior art keywords
amr
sample
nucleic acids
microbe
genetic marker
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/US2025/016120
Other languages
English (en)
Inventor
Fred C. Christians
Frederick S. NOLTE
Jamilla AKHUND-ZADE
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Karius Inc
Original Assignee
Karius Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6888: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms
    • C12Q1/689: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms for bacteria
    • C12Q1/6895: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms for plants, fungi or algae

Definitions

  • AMR: antimicrobial resistance
  • compositions for performing a multi-step assay for detecting nucleic acids in a sample from a subject comprising: providing a first aliquot of the sample from the subject, wherein the sample comprises nucleic acids comprising a first target nucleic acid and a second target nucleic acid that is different from the first target nucleic acid; performing a sequencing assay on the nucleic acids to produce sequence reads comprising sequence reads associated with the first target nucleic acid; analyzing the sequence reads, thereby obtaining an identification of the first target nucleic acid; after the identification of the first target nucleic acid is obtained, providing a second aliquot of the sample comprising nucleic acids comprising the first target nucleic acid and the second target nucleic acid; introducing primers into the second aliquot or to nucleic acids derived from the second aliquot wherein the primers specifically target the second target nucleic acid and do not target the first target nucleic acid; and conducting an a
  • the methods disclosed herein further comprise performing a high-throughput sequencing assay on the amplicons associated with the second target nucleic acid.
  • the second target nucleic acid is not detected by a sequencing assay.
  • the first nucleic acid is associated with a genome of an organism but not a phenotype of interest of the organism and the second nucleic acid is associated with the phenotype of interest of the organism.
  • the first target nucleic acid comprises at least 2, at least 3, at least 4, at least 5, at least 10, at least 15, at least 20, or at least 25 first target nucleic acids; or the second target nucleic acid comprises at least 2, at least 3, at least 4, at least 5, at least 10, at least 15, at least 20, or at least 25 second target nucleic acids.
  • the first target nucleic acid or the second target nucleic acid are not negative or positive controls for any step of the multi-step assay.
  • the primers comprise multiple primers targeting multiple target nucleic acids.
  • the first target nucleic acid comprises nucleic acids from an animal, a virus, a bacterium, a protozoan, a fungus, an archaeon, or an alga.
  • the second target nucleic acid comprises nucleic acids from an animal, a virus, a bacterium, a protozoan, a fungus, an archaeon, or an alga.
  • the second target nucleic acid comprises a cancer marker.
  • the first target nucleic acid is associated with a carrier microbe harboring a target genetic marker and the second target nucleic acid comprises a sequence associated with the target genetic marker.
  • the methods disclosed herein further comprise providing a second aliquot for the amplification reaction when the abundance of the microbial sequence reads from the gene cassette is below the threshold. In some embodiments, the methods disclosed herein further comprise quantifying mcfNA sequencing reads from the one or more microbes. In some embodiments, the methods disclosed herein further comprise comparing the abundance of the mcfNA sequencing reads from the one or more microbes to a threshold. In some embodiments, the methods disclosed herein further comprise providing a second aliquot for the amplification reaction when the abundance of the mcfNA sequencing reads from the one or more microbes is below the threshold.
  • the methods disclosed herein further comprise performing size selection of the nucleic acids in the sample.
  • physically manipulating the sample comprises performing size selection of the nucleic acids in the sample.
  • the second aliquot comprises at least 500 µL of plasma.
  • detecting the carrier microbe comprises determining an abundance of the carrier microbe.
  • the abundance is expressed as molecules of mcfNA per microliter of sample (MPM).
  • the methods disclosed herein further comprise calculating an AMR gene copy number from the amplification or sequencing of the AMR genetic marker.
  • the methods disclosed herein further comprise linking the AMR genetic marker to the carrier microbe using the abundance of the mcfNA from the carrier microbe and the AMR gene copy number.
  • the AMR gene copy number is an episomal gene copy number.
  • the methods disclosed herein comprise introducing at least 200 primers targeting the plurality of AMR genetic markers into the second aliquot of the sample.
  • the methods disclosed herein further comprise generating sequence reads from the cfNA from the subject, wherein the sequence reads comprise microbial sequencing reads derived from the microbe infecting the subject. In some embodiments, the sequence reads further comprise microbial sequencing reads from one or more carrier microbes. In some embodiments, the methods disclosed herein further comprise calculating an abundance of the mcfNA from the microbe in the sample and an abundance of the mcfNA from the one or more carrier microbes. In some embodiments, the abundance is expressed as molecules of mcfNA per 100 nanoliters of sample. In some embodiments, the methods disclosed herein further comprise calculating an AMR gene copy number for each of the one or more carrier microbes.
  • the methods disclosed herein further comprise spiking one or more control molecules into the sample at a known concentration.
  • the one or more control molecules are synthetic oligonucleotides.
  • the control molecules comprise whole assay internal control (WINC) molecules.
  • the methods disclosed herein further comprise spiking at least 25,000 unique WINC molecules at known concentrations.
  • the methods disclosed herein further comprise generating a report listing the carrier microbes or pathogen detected in the subject.
  • the report further comprises the abundance of microbial cell-free DNA (mcfDNA) from microbes detected in the sample or the antimicrobial resistance of the microbes infecting the subject.
  • mcfDNA: microbial cell-free DNA
  • the methods disclosed herein further comprise performing high-throughput sequencing on the amplicons associated with the AMR genetic marker.
  • the sequencing assay on the first nucleic acids comprises a high-throughput sequencing assay.
  • the methods disclosed herein further comprise conducting a polymerase chain reaction (PCR) to amplify the AMR genetic marker, thereby producing amplicons associated with the AMR genetic marker.
  • the PCR comprises multiplex PCR, random PCR (rPCR), non-biased PCR, Nested PCR, Hot Start PCR, or Assembly PCR.
  • the methods disclosed herein further comprise attaching an adapter sequence to the second nucleic acids.
  • the primers comprise an adapter sequence.
  • the methods disclosed herein further comprise physically manipulating the sample to produce a fraction of cfNA enriched for degraded cfNA, wherein the fraction of cfNA comprises the AMR genetic marker.
  • the degraded cfNA comprises ultra short cfNA, single stranded cfNA, or nicked double stranded cfNA.
  • the ultra short cfNA comprises cfNA fragments that are less than 100 nucleotides in length.
  • the ultra short cfNA comprises cfNA fragments that are from 30 nucleotides to 70 nucleotides in length.
  • the methods disclosed herein further comprise introducing primers targeting housekeeping genes into the second aliquot of the sample. In some embodiments, the methods disclosed herein further comprise determining antimicrobial resistance of the microbe infecting the subject.
  • the sample comprises plasma.
  • the first nucleic acids are DNA.
  • the second nucleic acids are DNA.
  • the sample is not subjected to a process that primarily causes cell lysis.
  • the methods disclosed herein further comprise preparing a library from the amplicons associated with the AMR genetic marker.
  • the primers are added directly to the second aliquot.
  • the sample comprises cell-free nucleic acids.
  • the methods disclosed herein comprise adding at least 200 primers targeting a plurality of AMR genetic markers into the second aliquot of the sample. In some embodiments, the methods disclosed herein further comprise introducing primers targeting housekeeping genes into the second aliquot of the sample. In some embodiments, the second aliquot of the sample comprises at least 500 µL of plasma. In some embodiments, detecting the pathogen comprises detecting the abundance of mcfNA from the pathogen in the sample over a threshold. In some embodiments, the methods disclosed herein further comprise calculating a positive percent agreement (PPA), negative percent agreement (NPA), diagnostic yield (DY), or any combination thereof. In some embodiments, the methods disclosed herein further comprise detecting cfDNA from one or more housekeeping genes.
  • the sample comprises a biological sample obtained from the subject.
  • the biological sample is a whole blood sample, a plasma sample, a serum sample, a cerebrospinal fluid sample, a synovial fluid sample, a urine sample, a stool sample, a bronchoalveolar lavage sample, or any combination thereof.
  • the sample is a plasma sample.
  • the AMR genetic marker comprises a gene, a genetic element, a genetic cassette, an allele, a mutation, or any combination thereof.
  • the sequencing comprises sequencing by synthesis.
  • the subject is an animal. In some embodiments, the subject is a human. In some embodiments, the subject has an infection by the microbe infecting the subject harboring the AMR genetic marker.
  • the methods disclosed herein further comprise administering an anti-infective agent to the subject. In some embodiments, the subject has been treated with an anti-infective agent for an infection. In some embodiments, the methods disclosed herein further comprise adjusting the anti-microbial agent received by the subject at least in part based on the antimicrobial resistance of the microbe infecting the subject. In some embodiments, the methods disclosed herein further comprise determining antimicrobial resistance of a microbe.
  • the methods disclosed herein further comprise detecting one or more carrier microbes from the sequencing assay. In some embodiments, the methods disclosed herein further comprise identifying the one or more carrier microbes as the microbe infecting the subject. In some embodiments, the methods disclosed herein further comprise linking the AMR genetic marker to one of the carrier microbes or the microbe infecting the subject. In some embodiments, the methods disclosed herein further comprise obtaining an abundance of the one or more carrier microbes based on an abundance of carrier microbe sequences in the sample. In some embodiments, conducting a statistical analysis comprises using the abundance of the microbial nucleic acids of the one or more potential carrier microbes. In some embodiments, conducting a statistical analysis comprises using the AMR gene copy number.
  • the first nucleic acids are DNA. In some embodiments, the second nucleic acids are DNA. In some embodiments, the sample does not undergo cell lysis. In some embodiments, intact microbes are not actively lysed. In some embodiments, the amplicons associated with the AMR genetic marker undergo direct library preparation. In some embodiments, the sample comprises cell-free nucleic acids. In some embodiments, the sample comprises cell-free DNA. In some embodiments, the first nucleic acids comprise cell-free nucleic acids. In some embodiments, the second nucleic acids comprise cell-free nucleic acids. In some embodiments, the methods disclosed herein comprise adding primers targeting a plurality of AMR genetic markers into the second aliquot of the sample.
  • the AMR genetic marker is associated with one or more genes selected from the group consisting of: SCCmec, mecA, mecC, vanA, vanB, blaCTX-M, blaKPC, OXA-48-like, OXA-23, NDM, VIM, IMP, and mcr-1.
  • the AMR genetic marker provides resistance to an anti-microbial agent selected from the group consisting of: methicillin, vancomycin, cephalosporin, carbapenem, and oxyimino-cephalosporin/aztreonam.
  • the statistical analysis comprises a generalized linear model, a maximum-likelihood estimation, a probit model, a logistic regression, a linear probability model, a linear regression, a complementary log-log model, a Poisson regression, a support vector machine, a decision tree, a random forest, a neural network, a gradient boosted model, a Bayesian model, a hidden Markov model, or any combination thereof.
  • the method further comprises comparing the sequence read data to the amplification data.
  • the method further comprises calculating estimated deduplicated templates (EDT) for an AMR gene, wherein calculating EDT comprises calculating a number of unique DNA sequencing reads from a microorganism present in a sequenced library.
  • EDT: estimated deduplicated templates
  • the method further comprises calculating an EDT for a housekeeping gene, wherein calculating EDT comprises calculating a number of unique DNA sequencing reads from a microorganism present in a sequenced library.
  • the methods disclosed herein further comprise determining a presence of an organism in a sample by calculating a ratio of an AMR estimated deduplicated templates (EDT) to a housekeeping gene EDT.
  • compositions for performing a multi-step sample processing procedure for detecting nucleic acids in a sample from a subject providing a first aliquot of the sample from the subject, wherein the sample comprises one or more first nucleic acids, wherein the one or more first nucleic acids comprise one or more microbial nucleic acids derived from one or more carrier microbes harboring an antimicrobial resistance (AMR) genetic marker; performing a sequencing assay on the one or more first nucleic acids to produce sequence reads comprising microbial nucleic acid sequence reads from the one or more carrier microbes; detecting the one or more carrier microbes by analyzing the microbial nucleic acid sequence reads; after at least one carrier microbe is detected, providing a second aliquot of the sample from the subject, wherein the second aliquot comprises second nucleic acids from at least one AMR genetic marker; annealing primers to the second nucleic acids that target at least one AMR genetic marker; and amplifying
  • the carrier microbe comprises a virus, a bacterium, a protozoan, a fungus, an archaeon, or an alga.
  • the target genetic marker comprises an antimicrobial resistance (AMR) genetic marker.
  • the methods disclosed herein further comprise comparing an abundance of the microbial sequence reads from the AMR genetic marker to a threshold value.
  • the methods disclosed herein further comprise comparing an abundance of the microbial sequence reads from the gene cassette to a threshold value.
  • the methods disclosed herein further comprise providing a second aliquot for the amplification reaction when the abundance of the microbial sequence reads from the AMR genetic marker is below the threshold.
  • the degraded cfNA comprise ultra short cfNA, single stranded cfNA, nicked double stranded cfNA, or any combination thereof. In some embodiments, the degraded cfNA comprise cfNA fragments that are less than 100 nucleotides in length. In some embodiments, the degraded cfNA comprise cfNA fragments that are from 30 nucleotides to 70 nucleotides in length. In some embodiments, physically manipulating the sample comprises performing size selection of the nucleic acids in the sample.
  • the microbe infecting the subject or the carrier microbe comprises a virus, a bacterium, a protozoan, a fungus, an archaeon, or an alga. In some embodiments, the microbe infecting the subject or the carrier microbe comprises a gram-positive bacterium or a fungus. In some embodiments, the microbe infecting the subject or the carrier microbe is a microbe listed in Table 1. In some embodiments, the microbe infecting the subject or the carrier microbe harbors at least two AMR genetic markers. In some embodiments, the antimicrobial resistance is phenotypic antimicrobial resistance. In some embodiments, the sequencing comprises next-generation sequencing or a sequencing method beyond next-generation sequencing.
  • the microbe comprises a microbe infecting the subject. In some embodiments, the microbe comprises a carrier microbe. In some embodiments, the methods disclosed herein comprise determining whether an AMR gene is carried by a microbe infecting the subject or by a carrier microbe. In some embodiments, the determining comprises linking an AMR genetic marker to one of the carrier microbes or the microbe infecting the subject. In some embodiments, the determining comprises determining a copy number of the AMR genetic marker in the microbe. In some embodiments, the linking comprises performing a statistical analysis on the sequence reads.
  • compositions for assaying nucleic acids in a sample from a subject comprising: providing a first aliquot of the sample from the subject, wherein the first aliquot comprises first nucleic acids, and the first nucleic acids comprise an AMR genetic marker and a gene cassette comprising the AMR genetic marker derived from one or more carrier microbes harboring the AMR genetic marker; performing a sequencing assay on the first nucleic acids to produce sequence reads comprising microbial nucleic acid sequence reads from the one or more carrier microbes; detecting the AMR genetic marker or the gene cassette comprising the AMR genetic marker in the sample and quantifying microbial sequence reads from the AMR genetic marker and/or microbial sequence reads from the gene cassette; after and based on the quantification, providing a second aliquot of the sample from the subject, wherein the second aliquot comprises second nucleic acids, wherein the second nucleic acids comprise the AMR genetic marker; introducing primers targeting the AMR genetic
  • the first target nucleic acid comprises nucleic acids from an animal, a virus, a bacterium, a protozoa, a fungus, an archaea, or an algae.
  • the second target nucleic acid comprises nucleic acids from an animal, a virus, a bacterium, a protozoa, a fungus, an archaea, or an algae.
  • the second target nucleic acid comprises a cancer marker.
  • the first target nucleic acid is associated with a carrier microbe harboring a target genetic marker and the second target nucleic acid comprises a sequence associated with the target genetic marker.
  • the methods disclosed herein further comprise performing size selection of the nucleic acids in the sample.
  • physically manipulating the sample comprises performing size selection of the nucleic acids in the sample.
  • the second aliquot comprises at least 500 pl of plasma.
  • detecting the carrier microbe comprises determining an abundance of the carrier microbe.
  • the methods disclosed herein further comprise generating sequence reads from the cfNA from the subject, wherein the sequence reads comprise microbial sequencing reads derived from the microbe infecting the subject. In some embodiments, the sequence reads further comprise microbial sequencing reads from one or more carrier microbes. In some embodiments, the methods disclosed herein further comprise calculating an abundance of the mcfNA from the microbe in the sample and an abundance of the mcfNA from the one or more carrier microbes. In some embodiments, the abundance is expressed as molecules of mcfNA per 100 nanoliters of sample. In some embodiments, the methods disclosed herein further comprise calculating an AMR gene copy number for each of the one or more carrier microbes.
  • the methods disclosed herein further comprise detecting one or more carrier microbes from the sequencing assay. In some embodiments, the methods disclosed herein further comprise identifying the one or more carrier microbes as the microbe infecting the subject. In some embodiments, the methods disclosed herein further comprise linking the AMR genetic marker to one of the carrier microbes or the microbe infecting the subject. In some embodiments, the methods disclosed herein further comprise obtaining an abundance of the one or more carrier microbes based on an abundance of carrier microbe sequences in the sample. In some embodiments, conducting a statistical analysis comprises using the abundance of the microbial nucleic acids of the one or more potential carrier microbes. In some embodiments, conducting a statistical analysis comprises using the AMR gene copy number.
  • the methods disclosed herein further comprise spiking one or more control molecules into the sample at a known concentration.
  • the one or more control molecules are synthetic oligonucleotides.
  • the control molecules comprise whole assay internal control (WINC) molecules.
  • the methods disclosed herein further comprise spiking at least 25,000 unique WINC molecules at known concentrations.
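As a non-limiting illustration, control molecules spiked in at a known concentration can serve as a reference for converting read counts into an absolute abundance, such as molecules per microliter (MPM). The sketch below assumes a simple proportional scaling between microbial reads and control reads. That scaling, the function name, and the parameter names are hypothetical and are not taken from the disclosure.

```python
def molecules_per_microliter(microbe_reads, control_reads, control_molecules_per_ul):
    """Estimate abundance in molecules per microliter (MPM).

    Assumption (not from the source): microbial and control molecules are
    recovered with equal efficiency, so read counts scale with concentration.
    """
    if control_reads <= 0:
        raise ValueError("no control reads recovered; abundance cannot be normalized")
    # Multiply first so that integer-valued inputs stay exact where possible.
    return microbe_reads * control_molecules_per_ul / control_reads


# Hypothetical numbers: 240 microbial reads, 1200 control reads,
# control spiked at 50 molecules per microliter.
mpm = molecules_per_microliter(240, 1200, 50.0)
print(mpm)
```

If the controls are not recovered at all, the function raises an error rather than reporting an abundance, since no normalization is possible in that case.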
  • the methods disclosed herein further comprise generating a report listing the carrier microbes or pathogen detected in the subject.
  • the report further comprises the abundance of microbial cell-free DNA (mcfDNA) from microbes detected in the sample or the antimicrobial resistance of the microbes infecting the subject.
  • the sample comprises a biological sample obtained from the subject.
  • the biological sample is a whole blood sample, a plasma sample, a serum sample, a cerebrospinal fluid sample, a synovial fluid sample, a urine sample, a stool sample, a bronchoalveolar lavage sample, or any combination thereof.
  • the sample is a plasma sample.
  • the AMR genetic marker comprises a gene, a genetic element, a genetic cassette, an allele, a mutation, or any combination thereof.
  • the method further comprises calculating an EDT for a housekeeping gene, wherein calculating EDT comprises calculating a number of unique DNA sequencing reads from a microorganism present in a sequenced library.
  • the methods disclosed herein further comprise determining a presence of an organism in a sample by calculating a ratio of an AMR estimated deduplicated templates (EDT) to a housekeeping gene EDT.
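As a non-limiting illustration, the EDT calculation and the AMR-to-housekeeping ratio described above can be sketched as follows. The sketch assumes that each read is represented by a hashable value (here, its sequence string) and that identical values are duplicates of a single template. The function names and the read representation are hypothetical.

```python
def estimated_deduplicated_templates(reads):
    """Count unique reads; identical reads are treated as copies of one template."""
    return len(set(reads))


def amr_to_housekeeping_ratio(amr_reads, housekeeping_reads):
    """Ratio of AMR EDT to housekeeping-gene EDT, or None if undefined."""
    housekeeping_edt = estimated_deduplicated_templates(housekeeping_reads)
    if housekeeping_edt == 0:
        return None  # no housekeeping signal, so the ratio cannot be computed
    return estimated_deduplicated_templates(amr_reads) / housekeeping_edt


# Hypothetical reads: three unique AMR templates, two unique housekeeping templates.
amr = ["ACGT", "ACGT", "TTGA", "GGCC"]
housekeeping = ["AAAA", "AAAA", "CCCC"]
print(amr_to_housekeeping_ratio(amr, housekeeping))
```

In practice, deduplication might instead key on alignment coordinates or molecular identifiers; the exact key is an implementation choice that this sketch does not attempt to settle.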
  • the carrier microbe comprises a virus, a bacterium, a protozoa, a fungus, an archaea, or an algae.
  • the target genetic marker comprises an antimicrobial resistance (AMR) genetic marker.
  • the methods disclosed herein further comprise comparing an abundance of the microbial sequence reads from the AMR genetic marker to a threshold value.
  • the methods disclosed herein further comprise comparing an abundance of the microbial sequence reads from the gene cassette to a threshold value.
  • the methods disclosed herein further comprise providing a second aliquot for the amplification reaction when the abundance of the microbial sequence reads from the AMR genetic marker is below the threshold.
  • the methods disclosed herein further comprise providing a second aliquot for the amplification reaction when the abundance of the microbial sequence reads from the gene cassette is below the threshold. In some embodiments, the methods disclosed herein further comprise quantifying mcfNA sequencing reads from the one or more microbes. In some embodiments, the methods disclosed herein further comprise comparing the abundance of the mcfNA sequencing reads from the one or more microbes to a threshold. In some embodiments, the methods disclosed herein further comprise providing a second aliquot for the amplification reaction when the abundance of the mcfNA sequencing reads from the one or more microbes is below the threshold.
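As a non-limiting illustration, the threshold comparisons above can be expressed as a small decision helper. The text states each comparison as a separate embodiment; combining them so that any single below-threshold measurement triggers the second aliquot is an assumption of this sketch. The measurement names and numeric values are likewise hypothetical.

```python
def below_threshold(abundances, thresholds):
    """Return the names of measurements whose abundance is below its threshold."""
    return [
        name
        for name, limit in thresholds.items()
        if abundances.get(name, 0.0) < limit  # a missing measurement counts as zero
    ]


def provide_second_aliquot(abundances, thresholds):
    """Decide whether to run the amplification reaction on a second aliquot."""
    return len(below_threshold(abundances, thresholds)) > 0


# Hypothetical first-assay results and thresholds.
abundances = {"amr_marker": 3, "gene_cassette": 40, "microbe_mcfna": 500}
thresholds = {"amr_marker": 10, "gene_cassette": 25, "microbe_mcfna": 100}
print(below_threshold(abundances, thresholds), provide_second_aliquot(abundances, thresholds))
```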
  • the sample comprises cell-free DNA.
  • the first nucleic acids comprise microbial cell-free nucleic acids.
  • the second nucleic acids comprise microbial cell-free nucleic acids.
  • intact microbes are not actively lysed prior to performing the sequencing assay.
  • the microbial nucleic acids comprise microbial cell-free nucleic acids (mcfNA) from the microbe.
  • the microbial nucleic acids comprise nucleic acids associated with a microbial cell.
  • the methods disclosed herein comprise enriching for at least 75%, at least 80%, at least 85%, or at least 90% of the mcfNA in the sample.
  • the degraded cfNA comprise ultra short cfNA, single stranded cfNA, nicked double stranded cfNA, or any combination thereof. In some embodiments, the degraded cfNA comprise cfNA fragments that are less than 100 nucleotides in length. In some embodiments, the degraded cfNA comprise cfNA fragments that are from 30 nucleotides to 70 nucleotides in length. In some embodiments, physically manipulating the sample comprises performing size selection of the nucleic acids in the sample.
  • the first nucleic acids are DNA. In some embodiments, the second nucleic acids are DNA. In some embodiments, the sample does not undergo cell lysis. In some embodiments, intact microbes are not actively lysed. In some embodiments, the amplicons associated with the AMR genetic marker undergo direct library preparation. In some embodiments, the sample comprises cell-free nucleic acids. In some embodiments, the sample comprises cell-free DNA. In some embodiments, the first nucleic acids comprise cell-free nucleic acids. In some embodiments, the second nucleic acids comprise cell-free nucleic acids. In some embodiments, the methods disclosed herein comprise adding primers targeting a plurality of AMR genetic markers into the second aliquot of the sample.
  • the microbe infecting the subject or the carrier microbe comprises a virus, a bacterium, a protozoa, a fungus, an archaea, or an algae. In some embodiments, the microbe infecting the subject or the carrier microbe comprises a gram-positive bacterium or a fungus. In some embodiments, the microbe infecting the subject or the carrier microbe is a microbe listed in Table 1. In some embodiments, the microbe infecting the subject or the carrier microbe harbors at least two AMR genetic markers. In some embodiments, the antimicrobial resistance is phenotypic antimicrobial resistance. In some embodiments, the sequencing comprises next-generation sequencing or a sequencing method beyond next-generation sequencing.
  • the statistical analysis comprises a generalized linear model, a maximum-likelihood estimation, a probit model, a logistic regression, a linear probability, a linear regression, a complementary log-log, a Poisson regression, a support vector machine, a decision tree, a random forest, a neural network, a gradient boosted model, a Bayesian model, a hidden Markov model, or any combination thereof.
  • the method further comprises comparing the sequence read data to the amplification data.
  • the method further comprises calculating estimated deduplicated templates (EDT) for an AMR gene, wherein calculating EDT comprises calculating a number of unique DNA sequencing reads from a microorganism present in a sequenced library.
  • the methods disclosed herein further comprise performing a high-throughput sequencing assay on the amplicons associated with the second target nucleic acid.
  • the second target nucleic acid is not detected by a sequencing assay.
  • the first nucleic acid is associated with a genome of an organism but not a phenotype of interest of the organism and the second nucleic acid is associated with the phenotype of interest of the organism.
  • the methods disclosed herein further comprise performing high-throughput sequencing on the amplicons associated with the AMR genetic marker.
  • the sequencing assay on the first nucleic acids comprises a high-throughput sequencing assay.
  • the methods disclosed herein further comprise conducting a polymerase chain reaction (PCR) to amplify the AMR genetic marker, thereby producing amplicons associated with the AMR genetic marker.
  • the PCR comprises multiplex PCR, random PCR (rPCR), non-biased PCR, Nested PCR, Hot Start PCR, or Assembly PCR.
  • the methods disclosed herein further comprise attaching an adapter sequence to the second nucleic acids.
  • the primers comprise an adapter sequence.
  • the abundance is expressed as molecules of mcfNA per microliter of sample (MPM).
  • the methods disclosed herein further comprise calculating an AMR gene copy number from the amplification or sequencing of the AMR genetic marker.
  • the methods disclosed herein further comprise linking the AMR genetic marker to the carrier microbe using the abundance of the mcfNA from the carrier microbe and the AMR gene copy number.
  • the AMR gene copy number is an episomal gene copy number.
  • the methods disclosed herein comprise introducing at least 200 primers targeting the plurality of AMR genetic markers into the second aliquot of the sample.
  • the methods disclosed herein further comprise introducing primers targeting housekeeping genes into the second aliquot of the sample. In some embodiments, the methods disclosed herein further comprise determining antimicrobial resistance of the microbe infecting the subject.
  • the sample comprises plasma.
  • the first nucleic acids are DNA.
  • the second nucleic acids are DNA.
  • the sample is not subjected to a process that primarily causes cell lysis.
  • the methods disclosed herein further comprise preparing a library from the amplicons associated with the AMR genetic marker.
  • the primers are added directly to the second aliquot.
  • the sample comprises cell-free nucleic acids.
  • the methods disclosed herein further comprise calculating a probability of each of the one or more potential carrier microbes being the microbe harboring the AMR genetic marker, thereby identifying the potential carrier microbe as the microbe infecting the subject. In some embodiments, the methods disclosed herein further comprise determining the antimicrobial resistance of the microbe infecting the subject based on the calculated probability. In some embodiments, the detecting comprises performing a sequencing assay. In some embodiments, the sequencing assay comprises a high-throughput sequencing assay. In some embodiments, the amplification reaction comprises introducing primers targeting a plurality of AMR genetic markers. In some embodiments, the amplification reaction produces amplicons associated with the AMR genetic marker. In some embodiments, the sample comprises plasma.
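As a non-limiting illustration, a probability that each potential carrier microbe harbors the AMR genetic marker could be computed along the following lines. The sketch assumes a Poisson likelihood for the observed AMR template count, a uniform prior over candidates, and an expected AMR count equal to an assumed copy number multiplied by the candidate's housekeeping-gene template count. None of these modeling choices, names, or numbers come from the disclosure.

```python
import math


def carrier_probabilities(amr_count, housekeeping_counts, copies_per_genome=1.0):
    """Posterior probability per candidate under a uniform prior (hypothetical model)."""
    log_likelihood = {}
    for microbe, housekeeping in housekeeping_counts.items():
        expected = copies_per_genome * housekeeping
        if expected <= 0:
            raise ValueError(f"expected AMR count must be positive for {microbe}")
        # Poisson log-probability of observing amr_count given the expected count.
        log_likelihood[microbe] = (
            amr_count * math.log(expected) - expected - math.lgamma(amr_count + 1)
        )
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(log_likelihood.values())
    weights = {m: math.exp(v - peak) for m, v in log_likelihood.items()}
    total = sum(weights.values())
    return {m: w / total for m, w in weights.items()}


# Hypothetical counts: 20 AMR templates and two candidate carriers.
probabilities = carrier_probabilities(20, {"microbe_A": 18, "microbe_B": 2})
print(max(probabilities, key=probabilities.get))
```

The candidate whose expected count lies closest to the observed AMR count receives the highest probability. Any of the other statistical models named in the disclosure could be substituted for the Poisson likelihood.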
  • FIG. 3 Distribution of index correct read counts across 108 clinical samples.
  • FIG. 4 Distribution of EDTs in negative control samples for each AMR target gene.
  • FIGs. 5A-5B Probit fits for estimating CTX-M presence LoD using the (FIG. 5A) original dilution series and (FIG. 5B) simulated dilution series assuming one CTX-M gene copy. Light grey lines reflect the probit model fit and dark grey lines reflect the observed data of fraction positive calls per concentration.
  • FIG. 7 shows a table of clinical validation performance summary.
  • the present disclosure provides, in some embodiments, methods of performing a plurality of assays in order to detect whether a subject is infected with a microbe that contains an antimicrobial resistance (AMR) genetic marker.
  • the methods provided herein are particularly useful for determining whether a detected microbe carries an AMR genetic marker.
  • the assays are performed sequentially.
  • the methods comprise performing unbiased high-throughput sequencing (HTS) to detect a potential carrier microbe, followed by performing targeted sequencing in order to detect an AMR genetic marker in a sample.
  • the method comprises performing a targeted sequencing assay to detect both a carrier microbe and its associated AMR genetic marker, if present.
  • the methods provided herein comprise performing a multi-step assay for detecting an AMR genetic marker by analyzing at least two aliquots of the same sample in at least two sequential assays.
  • the first assay is a high throughput sequencing assay.
  • the second assay is a targeted PCR amplification assay, which can, in some cases, be followed by a second high throughput sequencing assay.
  • carrier microbes and AMR genetic markers are detected in the same assay.
  • the carrier microbes are detected using primers specific for housekeeping genes and the AMR markers are detected using primers specific for sets of AMR genes.
  • targeted amplification for multiple targets is performed, followed by, in some embodiments, performing a statistical analysis to link detected carrier microbes with AMR genetic markers.
  • the methods comprise linking the AMR genetic marker to its carrier microbe based on the copy number of both the AMR genetic marker and the one or more microbe-specific housekeeping genes.
  • the method comprises analyzing certain features or criteria to determine that a carrier microbe harbors the AMR genetic marker.
  • such criteria can comprise a copy number of the AMR marker, the size of the AMR marker, the size of a gene cassette comprising the AMR marker, an abundance of mcfNA associated with the AMR marker, an abundance of mcfNA associated with the carrier microbe, or an abundance of mcfNA associated with the AMR marker in comparison to an abundance of mcfNA associated with one or more microbial-specific housekeeping genes.
  • the methods comprise performing an HTS assay such as a sequencing-by-synthesis assay in order to detect and/or quantify nucleic acids associated with the one or more potential carrier microbes.
  • a targeted assay is then performed to detect the AMR genetic marker, particularly if a carrier microbe was identified in the first high throughput sequencing assay.
  • detection of a carrier microbe in the first high throughput sequencing assay triggers obtaining a second aliquot of the sample and performing a targeted assay to detect nucleic acids in the second aliquot associated with the AMR genetic marker.
  • the targeted assay can, for example, comprise performing a PCR reaction with primers specific for one or more AMR genetic markers, optionally followed by high throughput sequencing to detect the AMR genetic markers.
  • the methods comprise calculating abundances of microbial nucleic acids detected in the targeted assay, such as estimated deduplicated transcripts or templates (EDT) or a copy number.
  • the methods comprise calculating an EDT of one or more AMR genetic markers, a copy number of the one or more AMR genetic markers, or an EDT of one or more microbial-specific housekeeping genes.
  • the methods comprise linking an AMR genetic marker to its carrier microbe based on the results from the targeted assay.
  • the methods comprise linking an AMR genetic marker to its carrier microbe using the copy number of the AMR genetic marker and the abundances of the one or more microbial-specific housekeeping genes.
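As a non-limiting illustration, linking by copy number can be sketched as computing, for each detected microbe, the AMR copy number implied by the ratio of AMR abundance to that microbe's housekeeping-gene abundance, and keeping the microbes for which the implied value is plausible. The plausibility bounds, function name, and example values below are hypothetical.

```python
def link_marker_to_carrier(amr_edt, housekeeping_edt_by_microbe,
                           min_copies=0.5, max_copies=10.0):
    """Return {microbe: implied AMR copy number} for plausible carriers.

    Assumption (not from the source): a microbe is a plausible carrier when the
    implied copy number lies within [min_copies, max_copies].
    """
    links = {}
    for microbe, housekeeping_edt in housekeeping_edt_by_microbe.items():
        if housekeeping_edt <= 0:
            continue  # microbe not detected by its housekeeping genes
        implied_copies = amr_edt / housekeeping_edt
        if min_copies <= implied_copies <= max_copies:
            links[microbe] = implied_copies
    return links


# Hypothetical targeted-assay results.
print(link_marker_to_carrier(40, {"microbe_A": 20, "microbe_B": 4000}))
```

Here the marker is linked to the first microbe (about two copies per genome equivalent), while the second is excluded because the marker is far too scarce relative to its housekeeping genes.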
  • criteria associated with an HTS assay that detected a carrier microbe or an AMR gene cassette are analyzed in order to determine whether a subsequent assay such as a targeted sequencing assay is performed.
  • the methods comprise comparing the amount or abundance of mcfNA (e.g., MPM) associated with a carrier microbe, an AMR gene cassette, and/or an AMR gene with a threshold value.
  • the methods comprise proceeding to the targeted assay for the AMR genetic marker based on the comparison with the threshold value.
  • the high-throughput nature of the methods provided herein generally enables comprehensive profiling of the subject’s microbiome, infectome, and/or resistome.
  • the methods provided herein can further detect co-infections in a patient with complex infections.
  • the methods can also comprise treating patients based on the identification of the AMR genetic markers, which can reduce or inform the use of broad-spectrum antibiotics and allow for a more targeted approach to treatment of infectious diseases.
  • the methods can comprise treating the patients who have received one class of antimicrobial treatment with a different class of antimicrobial treatments to overcome AMR.
  • the one or more genetic markers are associated with an infectious disease. In some embodiments, the one or more genetic markers are associated with a non-communicable disease. In some embodiments, the one or more genetic markers comprise a genetic marker from a prokaryotic organism. In some embodiments, the one or more genetic markers comprise a genetic marker from a eukaryotic organism.
  • the mutation comprises a point mutation (e.g., missense, nonsense, silent, frameshift), a structural mutation (e.g., insertion, deletion, duplication, inversion, translocation), repeat expansions (e.g., trinucleotide repeats, MSI), a gain-of-function mutation, a loss-of-function mutation, a dominant negative mutation, or any combination thereof.
  • the genetic markers comprise a microbial genetic marker.
  • the microbial genetic marker is from a virus, a bacterium, a protozoa, a fungus, an archaea, or an algae.
  • the microbial genetic marker comprises an antimicrobial resistance (AMR) marker.
  • the genetic markers comprise a mammalian genetic marker.
  • the mammalian genetic marker is from a human.
  • the genetic marker is associated with a human disease.
  • the genetic marker comprises a cancer marker.
  • An AMR genetic marker can be associated with one or more AMR genes that enable microorganisms to survive exposure to anti-microbial agents, leading to treatment failure and persistent infections.
  • the AMR genetic marker can be of any size within a microbial genome.
  • the AMR genetic marker comprises a genetic locus, a gene, an allele, a genetic variant, a gene element (e.g., a regulatory element), a gene cassette, an operon, an integron, a resistance island, a resistance supercluster, a chromosome, a plasmid, or any combination thereof.
  • the AMR genetic marker is associated with one or more AMR genes.
  • the AMR genetic marker is associated with resistance to one or more anti-microbial or anti-infective agents.
  • the AMR genetic marker can be associated with resistance to any class of anti-microbial agents.
  • the AMR genetic marker is associated with resistance to a β-lactam drug (e.g., carbapenems, methicillin), oxyimino-cephalosporins/aztreonam, an aminoglycoside drug (e.g., gentamicin), a macrolide-lincosamide-streptogramin (MLS) drug (e.g., clindamycin), a tetracycline drug (e.g., doxycycline), a fluoroquinolone drug (e.g., ciprofloxacin), a glycopeptide drug (e.g., vancomycin), a polymyxin drug (e.g., colistin), rifamycin (e.g., rifampin
  • the AMR genetic markers can be present in a carrier microbe at any copy number.
  • copy number can refer to the number of times a microbial gene or genomic region (e.g., the AMR genetic marker) is present in a microbe.
  • the copy number of the AMR genetic marker comprises a chromosomal copy number, an integron copy number, or an episomal (or plasmid) copy number.
  • the copy number of the AMR genetic marker is 1.
  • the copy number of the AMR genetic marker is at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, or at least 10.
  • the copy number of the AMR genetic marker is calculated by PCR.
  • the methods described herein can comprise performing a multi-step assay on a sample from a subject.
  • the multi-step assay comprises performing a first high-throughput sequencing (HTS) assay on nucleic acids from the sample followed by performing a targeted sequencing assay.
  • the targeted sequencing assay can comprise performing a PCR reaction on the sample, optionally followed by a second high-throughput sequencing assay.
  • performing a targeted sequencing assay is predicated on the results of a first HTS assay.
  • results from a first HTS assay may meet certain criteria before a targeted sequencing assay is performed.
  • an analysis is performed to link the nucleic acids detected in the targeted sequencing assay with nucleic acids detected in the first high-throughput sequencing assay.
  • linking results from the two assays can provide phenotypic characteristics related to a genome identified in the first high throughput sequencing assay.
  • the methods provided herein comprise performing a multi-step assay on a sample in order to detect a microbe of interest.
  • a first HTS assay is performed in order to detect a carrier microbe; in some embodiments, the first HTS is an unbiased sequencing assay.
  • a targeted sequencing assay is performed if the results of the first HTS assay meet certain criteria.
  • the targeted sequencing assay is performed if one or more microbes known to carry an AMR marker are detected in the first HTS.
  • a targeted sequencing assay is performed in order to detect the AMR marker.
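As a non-limiting illustration, the trigger described above can be sketched as a set intersection between the microbes detected in the first HTS assay and a reference list of microbes known to carry AMR markers. The reference list, names, and function are hypothetical placeholders.

```python
def microbes_triggering_targeted_assay(detected_microbes, known_amr_carriers):
    """Return detected microbes that are known AMR carriers, sorted for stable output."""
    return sorted(set(detected_microbes) & set(known_amr_carriers))


# Hypothetical first-assay detections and reference list.
detected = ["microbe_A", "microbe_B"]
known_carriers = {"microbe_B", "microbe_C"}
triggers = microbes_triggering_targeted_assay(detected, known_carriers)
run_targeted_assay = len(triggers) > 0
print(triggers, run_targeted_assay)
```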
  • the at least two sequencing assays can be the same sequencing assay. In some embodiments, the at least two sequencing assays can be different sequencing assays. In some embodiments, the at least two sequencing assays are performed in parallel. In some embodiments, the at least two sequencing assays are performed sequentially. In some embodiments, methods comprise performing a first high throughput sequencing assay before performing a targeted amplification or a second high throughput sequencing assay. In some embodiments, the second high throughput assay is a targeted sequencing assay. In some embodiments, only a targeted amplification or sequencing assay is performed. In some cases, an unbiased sequencing method is performed prior to the targeted assay.
  • a first aliquot can be taken out of the subject’s sample and analyzed by a first sequencing assay in the methods described herein.
  • a second aliquot can be taken out of the subject’s sample and analyzed by a second sequencing assay (e.g., a targeted sequencing assay).
  • the entire remaining sample is analyzed as the second aliquot in an amplification or a second sequencing assay.
  • a portion or fraction of the remaining sample is analyzed as the second aliquot in the amplification or the second sequencing assay.
  • each assay within the workflow analyzes a different aliquot of the sample.
  • an aliquot of the sample can be any measured portion of the sample.
  • an aliquot is a portion of an original sample, such as at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, or 90% of the sample.
  • an aliquot is a portion of an original sample, such as less than 10%, 20%, 30%, 40%, 50%, 60%, 70%, or 80% of the sample.
  • the methods comprise performing a first sequencing assay to analyze a first aliquot of the sample from the subject in order to detect first target nucleic acids, wherein the first aliquot comprises the first target nucleic acids.
  • the methods described herein further comprise, in some embodiments, after detecting the first target nucleic acids, performing an amplification or a second sequencing assay to analyze a second aliquot of the sample in order to detect second target nucleic acids, wherein the second aliquot comprises the second target nucleic acids.
  • the first target nucleic acids and the second target nucleic acids comprise any one of the genetic markers provided herein.
  • the first target nucleic acids are associated with a carrier microbe or an AMR cassette; in some cases, the second target nucleic acids are AMR markers.
  • the first target nucleic acids and the second target nucleic acids comprise a genetic marker associated with an infection or a non-communicable disease or disorder.
  • the first target nucleic acids and the second target nucleic acids comprise the same genetic marker.
  • the first target nucleic acids and the second target nucleic acids comprise different genetic markers.
  • the methods comprise detecting a microbe using an HTS assay followed by performing a targeted sequencing assay.
  • an unbiased HTS is not performed prior to performing the targeted sequencing assay.
  • the methods comprise using unbiased HTS to detect a carrier microbe potentially harboring an AMR genetic marker.
  • the carrier microbe, but not the AMR marker, is detected by the unbiased HTS.
  • the methods comprise detecting two or more carrier microbes harboring an AMR genetic marker.
  • the methods comprise detecting at least three carrier microbes.
  • the estimated deduplicated reads (EDR) value is calculated for one or more microbes.
  • an estimated fraction of reads may be calculated by dividing the number of reads aligned to the region of interest by the total number of reads in the sequence data.
  • calculating the EDR comprises identifying duplicate reads.
  • the EDR per microbe is calculated by removing duplicate reads and then multiplying the total deduplicated reads by the estimated fraction of reads generated from a microbe.
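The EDR arithmetic described above can be sketched in a few lines of Python. This is a minimal illustration, not the applicant's implementation: the function names are hypothetical, and treating reads with identical alignment coordinates as duplicates is an assumption made only for this example.

```python
def estimated_fraction(reads_aligned_to_region: int, total_reads: int) -> float:
    """Fraction of reads attributed to a region or microbe of interest."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return reads_aligned_to_region / total_reads


def count_deduplicated(reads) -> int:
    """Count reads after collapsing duplicates.

    Assumption for this sketch: each read is a hashable tuple such as
    (reference, start, end, strand), and identical tuples are duplicates.
    """
    return len(set(reads))


def estimated_deduplicated_reads(total_deduplicated_reads: int,
                                 fraction_from_microbe: float) -> float:
    """EDR per microbe: deduplicated read total times the microbe's read fraction."""
    return total_deduplicated_reads * fraction_from_microbe
```

For example, if 250 of 1000 reads align to a microbe and 800 reads remain after deduplication, the sketch gives a fraction of 0.25 and an EDR of 200.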
  • a genomic region may be a window of the genome within a given range of continuous nucleotides between a start point and an end point.
  • the range may be 100 bp, 200 bp, 300 bp, 400 bp, 500 bp, 600 bp, 700 bp, 800 bp, 900 bp, 1 kbp, 10 kbp, 20 kbp, 30 kbp, 40 kbp, 50 kbp, 60 kbp, 70 kbp, 80 kbp, 90 kbp, 100 kbp, 1 Mbp, 2 Mbp, 3 Mbp, 4 Mbp, 5 Mbp, or any combination thereof.
  • the standard has a known copy number.
  • the standard is the synthetic spike-in nucleic acids.
  • the standard is a selected region of the genome.
  • the genomic region of interest may be compared to the standard. Examples of comparisons include read depth (coverage) analysis, ratio of normalized read counts, B-allele frequency analysis, Control-FREEC (control-free copy number estimation from coverage), read depth per probe, hidden Markov model analysis, or sequencing depth windowing.
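One of the comparisons listed above, the ratio of normalized read counts against a standard of known copy number, might look like the following. This is a sketch under stated assumptions: read counts are normalized by length only, and the function names are hypothetical rather than taken from the source.

```python
def normalized_depth(read_count: int, length_bp: int) -> float:
    """Reads per base pair, a simple length-normalized read count."""
    if length_bp <= 0:
        raise ValueError("length_bp must be positive")
    return read_count / length_bp


def estimate_copy_number(region_reads: int, region_length_bp: int,
                         standard_reads: int, standard_length_bp: int,
                         standard_copy_number: float) -> float:
    """Scale the standard's known copy number by the depth ratio.

    The standard could be a synthetic spike-in or a selected genomic region,
    as long as its copy number is known.
    """
    standard_depth = normalized_depth(standard_reads, standard_length_bp)
    if standard_depth == 0:
        raise ValueError("standard has no reads; cannot normalize")
    region_depth = normalized_depth(region_reads, region_length_bp)
    return standard_copy_number * region_depth / standard_depth
```

A region with twice the length-normalized depth of a single-copy standard would be estimated at about two copies.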
  • the methods comprise detecting a gene cassette (such as a gene cassette described herein) using a sequencing assay.
  • the methods comprise detecting at least one gene cassette.
  • the methods detect two or more gene cassettes.
  • the methods detect at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 15, at least 20, at least 25, at least 30, at least 35, at least 40, at least 45, at least 50, at least 60, at least 70, at least 90, at least 100, at least 200, at least 300, at least 400, at least 500, at least 600, at least 700, at least 800, at least 900, or at least 1000 gene cassettes.
  • the methods comprise detecting a gene cassette harboring at least one AMR genetic marker.
  • the methods comprise detecting an empty gene cassette that does not harbor any AMR genetic marker.
  • the methods comprise detecting gene cassettes from one or more carrier microbes harboring at least one AMR genetic marker. In some embodiments, the methods comprise detecting gene cassettes from the same carrier microbe. In some embodiments, the methods comprise detecting an integrated gene cassette. In some embodiments, the methods comprise detecting an episomal gene cassette.
  • the methods described herein comprise detecting an AMR genetic marker using a sequencing assay. In some embodiments, the methods comprise detecting two or more AMR genetic markers. In some embodiments, the methods comprise detecting at least three AMR genetic markers.
  • the methods comprise detecting at least 5, at least 10, at least 20, at least 25, at least 30, at least 35, at least 40, at least 45, at least 50, at least 55, at least 60, at least 65, at least 70, at least 75, at least 80, at least 85, at least 90, at least 95, at least 100, at least 200, at least 300, at least 400, at least 500, at least 600, at least 700, at least 800, at least 900, at least 1000, at least 1100, at least 1200, at least 1300, at least 1400, at least 1500, at least 1600, at least 1700, at least 1800, at least 1900, at least 2000, at least 2100, at least 2200, at least 2300, at least 2400, at least 2500, at least 2600, at least 2700, at least 2800, at least 2900, or at least 3000 AMR genetic markers.
  • the methods comprise detecting AMR genetic markers from one or more carrier microbes harboring at least one AMR genetic marker. In some embodiments, the AMR genetic markers detected are from the same carrier microbe. In some embodiments, the methods comprise detecting an integrated AMR genetic marker. In some embodiments, the methods comprise detecting an episomal AMR genetic marker.
  • the methods comprise 1) detecting a carrier microbe harboring an AMR genetic marker in a first sequencing assay, wherein the first sequencing assay is performed on a first aliquot of a sample; and then 2) detecting the AMR genetic marker in a second sequencing assay, wherein the second sequencing assay is performed on a second aliquot of a sample.
  • the methods comprise 1) detecting an AMR genetic cassette in the first sequencing assay, wherein the first sequencing assay is performed on a first aliquot of a sample; and then 2) detecting the AMR genetic marker in a second sequencing assay; wherein the second sequencing assay is performed on a second aliquot of a sample.
  • the methods comprise 1) detecting a carrier microbe and not detecting an AMR genetic marker in the first sequencing assay, wherein the first sequencing assay is performed on a first aliquot of a sample; and then 2) detecting the AMR genetic marker in a second sequencing assay; wherein the second sequencing assay is performed on a second aliquot of a sample.
  • the methods comprise 1) detecting a carrier microbe and not detecting an AMR gene cassette in the first sequencing assay, wherein the first sequencing assay is performed on a first aliquot of a sample; and then 2) detecting the AMR genetic marker in a second sequencing assay; wherein the second sequencing assay is performed on a second aliquot of a sample.
  • the methods further comprise providing a sufficient amount of the second aliquot for the second sequencing assay.
  • the second aliquot is the entire remaining sample after the first aliquot has been removed from the sample.
  • the methods comprise detecting an AMR genetic marker harbored by a microbe selected from the group consisting of: Staphylococcus aureus, S. epidermidis, S. lugdunensis, Enterococcus faecalis, E. faecium, Enterobacter cloacae complex, Escherichia coli, Klebsiella aerogenes, K. pneumoniae, K. oxytoca, Proteus mirabilis, P. vulgaris, Salmonella bongori, S. enterica, Serratia marcescens, Pseudomonas aeruginosa, Acinetobacter baumannii, and A. calcoaceticus in the second sequencing assay.
  • the methods comprise performing a second sequencing assay after the first sequencing assay detects a microbe selected from the group consisting of: Staphylococcus aureus, S. epidermidis, S. lugdunensis, Enterococcus faecalis, E. faecium, Enterobacter cloacae complex, Escherichia coli, Klebsiella aerogenes, K. pneumoniae, K. oxytoca, Proteus mirabilis, P. vulgaris, Salmonella bongori, S. enterica, Serratia marcescens, Pseudomonas aeruginosa, Acinetobacter baumannii, and A. calcoaceticus.
  • criteria that are met before performing a second sequencing assay comprise (1) presence of a microbe known to harbor an AMR marker higher than a threshold value (e.g., a threshold value provided herein) and AMR cassette presence is “indeterminate”; (2) presence of a microbe known to harbor an AMR marker lower than a threshold value (e.g., a threshold value provided herein) and with confounding microbes present below a threshold value (e.g., a threshold value provided herein), and AMR cassette presence indeterminate; or (3) gram negative microbes greater than a threshold value (e.g., a threshold value provided herein).
  • the methods comprise performing a first sequencing assay on a first aliquot of a sample and, based on the results of the first sequencing assay, determining whether to perform the second sequencing assay on a second aliquot of the sample.
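The three criteria for triggering a second sequencing assay can be expressed as a small decision function. The sketch below is illustrative only: the `FirstAssayResult` fields, the function name, and the reading of "presence" as an abundance greater than zero are assumptions, and threshold values are left to the caller because the source lists many alternatives.

```python
from dataclasses import dataclass


@dataclass
class FirstAssayResult:
    carrier_mpm: float        # abundance of a microbe known to harbor an AMR marker
    confounder_mpm: float     # abundance of confounding microbes
    gram_negative_mpm: float  # abundance of gram-negative microbes
    cassette_call: str        # "present", "absent", or "indeterminate"


def should_run_second_assay(result: FirstAssayResult,
                            carrier_threshold: float,
                            confounder_threshold: float,
                            gram_negative_threshold: float) -> bool:
    """Return True if any of the three criteria is met."""
    indeterminate = result.cassette_call == "indeterminate"
    # (1) carrier above threshold and cassette presence indeterminate
    criterion_1 = result.carrier_mpm > carrier_threshold and indeterminate
    # (2) carrier present but below threshold, confounders below their
    #     threshold, and cassette presence indeterminate
    criterion_2 = (0 < result.carrier_mpm < carrier_threshold
                   and result.confounder_mpm < confounder_threshold
                   and indeterminate)
    # (3) gram-negative microbes above their threshold
    criterion_3 = result.gram_negative_mpm > gram_negative_threshold
    return criterion_1 or criterion_2 or criterion_3
```

Keeping the thresholds as parameters lets the same function be reused with different cutoffs for different organism groups.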
  • the methods comprise 1) detecting nucleic acids from a potential carrier microbe harboring an AMR genetic marker in a first sequencing assay, wherein the first sequencing assay is performed on a first aliquot of a sample; 2) quantifying the nucleic acids from the potential carrier microbe in the first aliquot; 3) determining the likelihood that the potential carrier microbe comprises the AMR genetic marker; and 4) determining to perform an amplification or a second sequencing assay on a second aliquot of the sample to detect the AMR genetic marker.
  • determining the likelihood in 3) comprises quantifying microbial sequence reads associated with the potential carrier microbe.
  • determining the likelihood in 3) comprises quantifying sequence reads associated with the AMR genetic marker.
  • determining the likelihood in 3) comprises comparing the sequence reads associated with the AMR genetic marker to the microbial sequence reads associated with the potential carrier microbe.
  • determining the likelihood in 3) comprises using a statistical model.
  • the threshold abundance value for gram-positive bacteria is at least about 50 MPM, at least about 100 MPM, at least about 150 MPM, at least about 200 MPM, at least about 300 MPM, at least about 400 MPM, at least about 500 MPM, at least about 600 MPM, at least about 700 MPM, at least about 800 MPM, at least about 900 MPM, at least about 1000 MPM, at least about 1100 MPM, at least about 1200 MPM, at least about 1300 MPM, at least about 1400 MPM, at least about 1500 MPM, at least about 1600 MPM, at least about 1700 MPM, at least about 1800 MPM, at least about 1900 MPM, at least about 2000 MPM, at least about 2500 MPM, at least about 3000 MPM, at least about 3500 MPM, at least about 4000 MPM, at least about 4500 MPM, at least about 5000 MPM, at least about 5500 MPM, at least about 6000 MPM, at least about 6500 MPM, at least about
  • the abundance threshold value for gram-negative bacteria is at least about 4500 MPM, at least about 5000 MPM, at least about 5500 MPM, at least about 6000 MPM, at least about 6500 MPM, at least about 7000 MPM, at least about 7500 MPM, at least about 8000 MPM, at least about 8500 MPM, at least about 9000 MPM, at least about 9500 MPM, at least about 10,000 MPM, at least about 15,000 MPM, at least about 20,000 MPM, at least about 25,000 MPM, at least about 30,000 MPM, at least about 35,000 MPM, at least about 40,000 MPM, at least about 45,000 MPM, or at least about 50,000 MPM.
  • the threshold for a carrier microbe harboring blaCTX-M or blaKPC is at least about 5000 MPM, at least about 5500 MPM, at least about 6000 MPM, at least about 6500 MPM, at least about 7000 MPM, at least about 7500 MPM, at least about 8000 MPM, at least about 8500 MPM, at least about 9000 MPM, at least about 9500 MPM, at least about 10,000 MPM, at least about 15,000 MPM, or at least about 20,000 MPM.
  • the abundance threshold value is from 50 MPM to 20,000 MPM, from 50 MPM to 10,000 MPM, from 50 MPM to 9,500 MPM, from 50 MPM to 5,000 MPM, from 100 MPM to 10,000 MPM, from 200 MPM to 8,000 MPM, from 300 MPM to 7,000 MPM, from 400 MPM to 6,000 MPM, from 500 MPM to 5,000 MPM, from 1,000 MPM to 4,000 MPM, from 1,500 MPM to 3,000 MPM, from 2,000 MPM to 2,500 MPM, or from 2,500 MPM to 2,000 MPM.
  • the methods provided herein comprise detecting one or more carrier microbes harboring an AMR genetic marker using a first sequencing assay on a first aliquot of a sample from a subject, and then based on the results from the first sequencing assay, determining to perform an amplification assay or a second sequencing assay on a second aliquot of the same sample.
  • the methods comprise, after the first sequencing assay on the first aliquot of sample, providing a second aliquot of sample with a sufficient amount of cfNA for the amplification assay or the second sequencing assay.
  • the second aliquot of sample comprises at least 0.05 ng, at least 0.10 ng, at least 0.15 ng, at least 0.20 ng, at least 0.25 ng, at least 0.30 ng, at least 0.35 ng, at least 0.40 ng, at least 0.45 ng, at least 0.50 ng, at least 0.55 ng, at least 0.60 ng, at least 0.65 ng, at least 0.70 ng, at least 0.75 ng, at least
  • the methods comprise, after the first sequencing assay on the first aliquot of sample, providing a second aliquot of sample with sufficient volume for the amplification assay or the second sequencing assay.
  • the second aliquot of sample comprises at least 50 µl, at least 100 µl, at least 150 µl, at least 200 µl, at least 250 µl, at least 300 µl, at least 350 µl, at least 400 µl, at least 450 µl, at least 500 µl, at least 550 µl, at least 600 µl, at least 650 µl, at least 700 µl, at least 750 µl, at least 800 µl, at least 850 µl, at least 900 µl, at least 950 µl, or at least 1 ml of the sample.
  • the one or more methicillin-resistance carriers comprise Staphylococcus pseudintermedius, Staphylococcus fleurettii, Staphylococcus epidermidis, Staphylococcus schleiferi, Staphylococcus lugdunensis, Staphylococcus cohnii, Staphylococcus caprae, Staphylococcus warneri, Staphylococcus saprophyticus, Staphylococcus aureus, Staphylococcus haemolyticus, Staphylococcus capitis, Staphylococcus hominis, Staphylococcus pettenkoferi, or Staphylococcus simulans.
  • the methods provided herein can comprise detecting Staphylococcus aureus (S. aureus) in a high throughput sequencing assay performed on an aliquot of a sample from a patient, wherein the aliquot of the sample comprises mcfNA from the S. aureus.
  • the methods comprise performing the amplification assay or the second sequencing assay when the result from the first sequencing assay is indeterminate. In some embodiments, the methods comprise determining that the result from the first sequencing assay is indeterminate and performing a second sequencing assay or an amplification assay if the SCCmec cassette in the sample is likely to be a non-mec SCC, as determined from alignments to alleles known to lack mecA and mecC. In some embodiments, the result from the first sequencing assay is indeterminate if the SCC fragments in the clinical sample have originated from a microbe other than S. aureus, as determined by comparing the abundances of known cross-reacting or interfering species. In some embodiments, the interfering species comprise coagulase-negative staphylococci that harbor sequences highly homologous to SCCmec.
  • if the methods do not comprise detecting an abundance of sequencing reads from S. aureus over a threshold provided herein, the amplification assay or the second sequencing assay targeting mecA and mecC will be performed. In some cases, if the methods do not comprise detecting a carrier microbe harboring blaCTX-M or blaKPC above the threshold, the amplification assay or the second sequencing assay targeting blaCTX-M or blaKPC will be performed. In some cases, if the methods do not comprise detecting a carrier microbe harboring vanA or vanB above the threshold, the amplification assay or the second sequencing assay targeting vanA or vanB will be performed.
  • the threshold of mcfNA from a confounding species comprises at most 200 MPM, at most 180 MPM, at most 160 MPM, at most 140 MPM, at most 120 MPM, at most 100 MPM, at most 80 MPM, at most 60 MPM, at most 40 MPM, at most 20 MPM, at most 10 MPM, at most 8 MPM, at most 6 MPM, at most 4 MPM, at most 2 MPM, or at most 1 MPM.
  • the methods comprise detecting one or more Gram-negative bacteria using the first sequencing assay on the first aliquot of a sample from a subject, wherein the one or more Gram-negative bacteria are potential carrier microbes of an AMR genetic marker, and then based on the results from the first sequencing assay, determining to perform an amplification assay or a second sequencing assay on a second aliquot of the same sample.
  • the one or more Gram-negative bacteria comprise Enterobacter cloacae complex, Escherichia coli, Klebsiella pneumoniae, Klebsiella oxytoca, Proteus mirabilis, Proteus vulgaris, Salmonella enterica, Salmonella bongori, Serratia marcescens, Enterobacter aerogenes, Pseudomonas aeruginosa, Acinetobacter baumannii, Acinetobacter calcoaceticus, or a combination thereof.
  • the amplification assay or the second sequencing assay is performed only when the methods detect a quantity of mcfNA from the one or more Gram-negative bacteria that passes a threshold provided herein.
  • the methods comprise discounting a result from a step of the multi-step process if there are multiple contaminants detected by the step. In some embodiments, the methods comprise discounting a result from a sequencing assay if there are multiple contaminants detected by the sequencing assay. In some embodiments, contaminants comprise contaminating nucleic acids or microbes. In some embodiments, contaminating nucleic acids arise from sample collection, sample site (e.g., skin), or reagents (e.g., buffers, buffer components, water, beads, etc.).
  • the methods provided herein comprise using the abundance of S. aureus combined with detection of an SCCmec cassette in order to identify whether the S. aureus harbors an AMR genetic marker.
  • a number of reads mapping to SCCmec is compared to the number of reads mapping to the S. aureus genome and a statistical model is used to determine the likelihood of either the presence or absence of methicillin resistance.
  • the methods further comprise quantifying sequencing reads from the high throughput sequencing assay, wherein the sequencing reads comprise S. aureus mcfNA sequencing reads. The abundance of the S. aureus mcfNA in the sample can be calculated based on the abundance of the S. aureus mcfNA sequencing reads.
  • the methods comprise detecting an SCCmec cassette from the sample and quantifying microbial sequencing reads from the SCCmec cassette.
  • the SCCmec cassette is detected by unbiased HTS; in some embodiments, it is detected by targeted sequencing.
  • the SCCmec cassette is an AMR gene cassette of a known size found in S. aureus.
  • the SCCmec cassette in certain strains of S. aureus can harbor mecA or mecC, which are AMR genes associated with resistance to methicillin.
  • the SCCmec cassette can be empty and does not comprise the AMR genetic marker of interest.
  • the methods comprise detecting a non-empty SCCmec cassette.
  • the methods provided herein can further comprise determining the likelihood that mecA or mecC is present in the S. aureus in the patient based on the amount of microbial sequencing reads mapped to S. aureus and the amount of microbial sequencing reads from the SCCmec cassette.
  • because the SCCmec cassette in S. aureus is an integrated region of the genome, the sequencing read count from S. aureus and the sequencing read count from the SCCmec cassette can be reliably correlated.
  • an estimated deduplicated reads value of greater than 1000, 2000, 3000, 4000, or 5000 can allow the methods to determine whether the SCCmec cassette likely harbors mecA or mecC by analyzing the S. aureus sequencing reads.
  • the methods can compare the SCCmec cassette sequencing reads with the S. aureus genome sequencing reads to determine this likelihood.
  • if there is a high likelihood that the SCCmec cassette harbors mecA or mecC, the method will determine that the S. aureus is methicillin-resistant S. aureus (MRSA).
  • MSSA: methicillin-susceptible S. aureus.
  • a beta-binomial distribution can be used to determine the probability of getting X SCCmec reads given Y S. aureus reads.
  • a distribution comprises a distribution of ratio of SCCmec length to S. aureus genome length for MRSA.
  • a similar model can be built for MSSA.
  • Bayesian inference can be used to calculate the probability of MRSA given the observed number of S. aureus reads and the observed number of SCCmec reads.
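The beta-binomial and Bayesian steps above can be combined as in the following sketch. It makes several assumptions that the source does not state: the number of SCCmec reads X is modeled as beta-binomial with Y trials, each class (MRSA, MSSA) has its own hypothetical shape parameters (alpha, beta) whose mean reflects the expected SCCmec-to-genome read ratio, and the prior probability of MRSA is supplied by the caller. Only the Python standard library is used.

```python
import math


def log_beta(a: float, b: float) -> float:
    """Natural log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def beta_binomial_pmf(k: int, n: int, alpha: float, beta: float) -> float:
    """P(K = k) for a beta-binomial with n trials and shape (alpha, beta)."""
    if k < 0 or k > n:
        return 0.0
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return math.exp(log_choose + log_beta(k + alpha, n - k + beta)
                    - log_beta(alpha, beta))


def posterior_mrsa(sccmec_reads: int, s_aureus_reads: int,
                   mrsa_params: tuple, mssa_params: tuple,
                   prior_mrsa: float = 0.5) -> float:
    """Bayes' rule: P(MRSA | X SCCmec reads, Y S. aureus reads)."""
    like_mrsa = beta_binomial_pmf(sccmec_reads, s_aureus_reads, *mrsa_params)
    like_mssa = beta_binomial_pmf(sccmec_reads, s_aureus_reads, *mssa_params)
    numerator = prior_mrsa * like_mrsa
    denominator = numerator + (1.0 - prior_mrsa) * like_mssa
    if denominator == 0.0:
        raise ValueError("observation has zero likelihood under both models")
    return numerator / denominator
```

With illustrative parameters such as (8, 992) for MRSA and (1, 999) for MSSA, a sample with many SCCmec reads relative to S. aureus reads yields a posterior near 1, and a sample with very few yields a posterior near 0. Real parameters would need to be fitted to reference data.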
  • linkage of an AMR marker with a carrier microbe is based on analyzing an EDT value of an AMR marker and the abundance of a carrier microbe detected by HTS. In some embodiments, linkage of an AMR marker with a carrier microbe is based on analyzing an EDT value of an AMR marker and an EDT value of a microbe-specific gene, such as a microbe-specific housekeeping gene.
  • the methods provided herein can be especially useful for linking an AMR marker associated with a carrier microbe.
  • targeted sequencing of an AMR marker is based on the sequence of the AMR marker itself (e.g., primers directed to the AMR marker), without giving information about its associated carrier microbe.
  • linking a carrier microbe with an AMR genetic marker is especially useful for samples in which nucleic acids from multiple carrier microbes, nucleic acids from multiple AMR markers, or a combination thereof are detected. In such cases, it can be challenging to determine which carrier microbe is linked to which AMR genetic marker. In some cases, when multiple species of carrier microbes are detected, the methods provided herein identify which species of the carrier microbe is associated with a particular AMR marker.
  • the methods provided herein comprise linking an AMR genetic marker with its carrier microbe via an unbiased high throughput sequencing (HTS) assay and a targeted sequencing assay.
  • the methods comprise 1) detecting one or more potential carrier microbes harboring an AMR genetic marker via an unbiased HTS assay; 2) detecting the AMR genetic marker via a targeted sequencing assay; and 3) performing statistical analysis to link the AMR genetic marker with its carrier microbe.
  • the methods comprise detecting an episomal AMR genetic marker in the targeted sequencing assay.
  • performing the statistical analysis comprises calculating an abundance of mcfNA from the one or more potential carrier microbes detected in the unbiased HTS assay.
  • performing the statistical analysis comprises calculating an abundance of mcfNA associated with the AMR genetic marker.
  • the methods comprise linking the AMR genetic marker with its carrier microbe based on the abundance of mcfNA from the one or more potential carrier microbes, the abundance of mcfNA associated with the AMR genetic marker, the copy number of the AMR genetic marker, the abundances of microbial-specific housekeeping genes detected in the targeted sequencing assay, or any combination thereof.
  • the methods comprise linking the AMR genetic marker with its carrier microbe based on the abundance of mcfNA from the one or more potential carrier microbes and the copy number of the AMR genetic marker.
  • the methods comprise linking the AMR genetic marker with its carrier microbe based on results from the unbiased HTS assay and results from the targeted sequencing assay.
  • the methods provided herein comprise performing a statistical analysis to link one or more AMR genetic markers to one or more carrier microbes harboring the AMR genetic markers.
  • the methods comprise linking one AMR genetic marker to one carrier microbe.
  • the methods comprise linking multiple AMR genetic markers to multiple carrier microbes; for example, the methods can comprise linking each of one or more carrier microbes to its corresponding AMR marker.
  • the statistical analysis comprises a generalized linear model.
  • the statistical analysis comprises a maximum-likelihood estimation.
  • the statistical analysis comprises a probit model.
  • the statistical analysis comprises a maximum-likelihood probit model.
  • Examples of other models that may be used in the statistical analysis include logistic regression, linear probability, linear regression, complementary log-log, Poisson regression, support vector machine, decision tree, random forest, neural network, gradient boosted model, Bayesian models, hidden Markov models, or any combination thereof.
  • the methods comprise using a model that uses the observed abundance of the carrier microbes (e.g., MPM), copy number of AMR gene, ratio of AMR gene length to microbe genome length, a measure of PCR reaction efficiency from the abundance of an internal control (PCR WINC), or any combination thereof.
  • a model is prepared that gives an expected AMR abundance, such as estimated deduplicated templates or transcripts (EDT).
  • the methods comprise using a model (e.g., a model described herein) and/or probabilities for AMR prevalence.
  • the model or probabilities for AMR prevalence are derived from empirical observations for combinations of observed carrier microbe and AMR gene copy number.
  • the model or probabilities for AMR prevalence are derived from known observations reported in scientific literature for combinations of observed carrier microbe and AMR gene copy number.
  • a combination of observed carrier microbe and AMR gene copy number is used to determine the likelihood of this combination given the observed AMR EDT.
  • thresholds to this final likelihood for microbe-AMR combination are applied in order to determine presence, indeterminate, or absence.
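The linkage steps above (expected AMR abundance, likelihood of each microbe-AMR combination, then thresholding) could be prototyped as follows. Everything specific here is an assumption for illustration: the multiplicative form of the expected EDT, the `scale` factor, the Poisson likelihood, the prevalence priors, and the 0.9 / 0.1 cutoffs are not taken from the source.

```python
import math


def expected_amr_edt(carrier_mpm: float, amr_copy_number: float,
                     length_ratio: float, pcr_efficiency: float,
                     scale: float = 1.0) -> float:
    """Hypothetical expected AMR EDT from the model inputs named in the text."""
    return scale * carrier_mpm * amr_copy_number * length_ratio * pcr_efficiency


def poisson_pmf(k: int, lam: float) -> float:
    """P(K = k) for a Poisson distribution with mean lam."""
    if lam <= 0:
        return 1.0 if k == 0 else 0.0
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def link_posteriors(observed_edt: int, candidates: dict) -> dict:
    """Normalize prior-weighted likelihoods across candidate combinations.

    candidates maps a label to (expected_edt, prior_prevalence).
    """
    weights = {name: prior * poisson_pmf(observed_edt, expected)
               for name, (expected, prior) in candidates.items()}
    total = sum(weights.values())
    if total == 0.0:
        raise ValueError("observed EDT has zero likelihood for all candidates")
    return {name: w / total for name, w in weights.items()}


def call_linkage(probability: float, present_at: float = 0.9,
                 absent_at: float = 0.1) -> str:
    """Apply thresholds to the final likelihood."""
    if probability >= present_at:
        return "present"
    if probability <= absent_at:
        return "absent"
    return "indeterminate"
```

In this sketch the priors stand in for the empirically or literature-derived AMR prevalence probabilities, so a combination that is rarely observed needs stronger read evidence before it is called present.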
  • the methods comprise performing a statistical analysis to link an AMR genetic marker with a carrier microbe based on detecting or quantifying an AMR marker that is integrated into a chromosome or based on quantifying an AMR marker that is an episomal marker.
  • integrated or episomal AMR markers are detected by an unbiased HTS assay or a targeted sequencing assay.
  • the methods comprise performing a statistical analysis to link an AMR genetic marker with a carrier microbe based on detecting an abundance of an AMR cassette.
  • a first assay detects larger genomic targets, such as SCCmec. If the larger genomic targets are known to be integrated in the microbial genome, the methods comprise identifying the linkage to a carrier microbe using a statistical model or without a statistical model. In some embodiments, the methods comprise detecting genes or portions of plasmids by an assay (e.g., unbiased HTS assay) and then using a statistical model to assign linkage without necessarily performing a targeted sequencing assay.
  • the methods comprise detecting an integrated AMR genetic marker that has a copy number that tracks the copy number of a microbial genome; in some cases, the AMR genetic marker is integrated and has one copy per microbial genome. In some cases, multiple AMR genetic markers are integrated into the same microbial genome. In some embodiments, the methods comprise detecting an episomal AMR genetic marker. In some cases, the episomal AMR genetic marker is not present in a one-to-one ratio with the microbial genome (e.g., microbial DNA, RNA), microbial gene, or microbial chromosome. In some embodiments, the methods detect an integrated AMR cassette that may comprise the AMR genetic marker.
  • the methods detect an episomal AMR cassette. In some embodiments, the methods detect a non-empty AMR cassette. In some embodiments, the methods detect the integrated AMR genetic marker or the integrated AMR cassette in the unbiased HTS assay. In some embodiments, the methods detect the episomal AMR genetic marker or the episomal AMR cassette in the unbiased HTS assay. In some embodiments, the methods detect the episomal AMR genetic marker or the episomal AMR cassette in the targeted HTS assay. In some embodiments, the methods detect the integrated AMR genetic marker or the integrated AMR cassette in the targeted HTS assay.
  • the methods can detect the AMR genetic marker or the carrier microbes by high throughput sequencing, next generation sequencing (NGS), massively parallel sequencing, or sequencing by synthesis. In some embodiments, the methods detect the AMR genetic marker by an unbiased sequencing assay. In some embodiments, the methods detect the AMR genetic marker by a targeted sequencing assay.
  • a first assay can detect whether disease state is present or absent and a second assay can define the specific disease state (e.g., inflammation) or site of disease (e.g., a site of infection).
  • the methods provided herein comprise linking an AMR genetic marker with its carrier microbe via a targeted sequencing assay that detects both the carrier microbe as well as the AMR genetic marker.
  • the methods comprise 1) detecting the AMR genetic marker via a targeted sequencing assay; 2) detecting one or more microbial-specific genes such as microbe-specific housekeeping genes via the targeted assay; and 3) performing statistical analysis to link the AMR genetic marker with its carrier microbe.
  • the AMR marker detected by the targeted sequencing assay is an episomal AMR genetic marker.
  • the AMR marker detected by the targeted sequencing assay is an integrated AMR genetic marker.
  • performing the statistical analysis comprises calculating a copy number of the AMR genetic marker or calculating abundances of the microbial-specific housekeeping genes detected in the targeted sequencing assay.
  • a statistical model for linking a carrier microbe with an AMR genetic marker using a targeted assay is similar to a statistical model used for linking a carrier microbe detected by HTS with an AMR marker detected by a targeted assay. Some examples of such models are provided herein.
  • housekeeping gene abundance (EDT) and AMR-to-housekeeping-gene ratio are used in a statistical model.
  • the MPM value in a statistical model described herein can be substituted with the EDT value when using the targeted sequencing assay to detect a carrier microbe and its associated AMR marker.
  • the method comprises detecting microbe-specific housekeeping genes by a targeted sequencing assay.
  • the microbe-specific housekeeping genes are detected by HTS or unbiased HTS.
  • the microbe-specific housekeeping genes are specific for different species of a microbe.
  • the microbe-specific housekeeping genes are specific for different species of Staphylococcus, e.g., S. aureus and/or S. epidermidis.
  • the microbe-specific genes comprise A-SA, B-SA, C-SA, D-SA or any combination thereof.
  • the microbe-specific genes comprise a housekeeping (HK) gene listed in Supplemental Table 6 in the Examples.
  • the methods comprise detecting one or more microbial-specific housekeeping genes in the targeted sequencing assay in order to detect potential carrier microbes. In some embodiments, the methods comprise detecting a carrier microbe based on the detection of the one or more microbial-specific housekeeping genes in the targeted assay. In some embodiments, the methods comprise performing a statistical analysis in order to link a carrier microbe (detected by a housekeeping gene) with an AMR genetic marker.
  • performing the statistical analysis comprises calculating an abundance of the AMR genetic marker, a copy number of the AMR genetic marker, and/or abundances of the microbial-specific housekeeping genes detected in the targeted sequencing assay.
  • the abundance of the AMR genetic marker or the microbial-specific housekeeping genes comprises a relative abundance, such as an estimated deduplicated template or transcript (EDT).
  • performing the statistical analysis comprises calculating a ratio of the EDT of AMR genetic marker to the EDT of the microbial-specific housekeeping genes.
  • performing the statistical analysis comprises detecting a carrier microbe based on the abundances of the one or more microbial-specific housekeeping genes.
  • the methods comprise linking the AMR genetic marker with its carrier microbe based on the abundance of mcfNA associated with the AMR genetic marker, the copy number of the AMR genetic marker, the abundances of the microbial-specific housekeeping genes detected in the targeted sequencing assay, the ratio of the AMR EDT to housekeeping gene EDT, or any combination thereof.
  • the methods comprise linking the AMR genetic marker with its carrier microbe based on the copy number of the AMR genetic marker and the abundances of the microbial-specific housekeeping genes detected in the targeted assay.
  • the methods comprise linking the AMR genetic marker with its carrier microbe based only on results from the targeted sequencing assay.
  • a determination of “indeterminate” may be arrived at if the ratio of AMR EDT to housekeeping gene EDT for a carrier microbe departs from the expected ratio. For example, a lack of adequate housekeeping gene EDT would lower confidence in an AMR absence call, making the call indeterminate.
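The ratio-based linking and the "indeterminate" outcome described above can be illustrated with a minimal sketch. It is not the statistical model of the disclosure: the function name `link_amr_to_carriers`, the expected ratio, the fold tolerance, and the minimum housekeeping-gene EDT are all hypothetical values chosen only for illustration.

```python
from typing import Dict, Optional, Tuple

# Each result is (ratio, call); ratio is None when it cannot be trusted.
Result = Tuple[Optional[float], str]


def link_amr_to_carriers(
    amr_edt: float,
    hk_edt_by_microbe: Dict[str, float],
    expected_ratio: float = 1.0,  # hypothetical expected AMR:HK EDT ratio
    fold_tolerance: float = 4.0,  # hypothetical allowed fold departure
    min_hk_edt: float = 10.0,     # hypothetical minimum housekeeping signal
) -> Dict[str, Result]:
    """Classify each candidate carrier from AMR EDT and housekeeping-gene EDT."""
    calls: Dict[str, Result] = {}
    low = expected_ratio / fold_tolerance
    high = expected_ratio * fold_tolerance
    for microbe, hk_edt in hk_edt_by_microbe.items():
        # Too little housekeeping signal: neither presence nor absence
        # of the marker can be called with confidence.
        if hk_edt <= 0 or hk_edt < min_hk_edt:
            calls[microbe] = (None, "indeterminate")
            continue
        ratio = amr_edt / hk_edt
        if amr_edt == 0:
            calls[microbe] = (ratio, "amr_not_detected")
        elif low <= ratio <= high:
            calls[microbe] = (ratio, "linked")
        else:
            # Ratio departs from the expected ratio for this carrier.
            calls[microbe] = (ratio, "indeterminate")
    return calls


if __name__ == "__main__":
    example = {"S. aureus": 50.0, "S. epidermidis": 2.0}
    for name, (ratio, call) in link_amr_to_carriers(40.0, example).items():
        print(name, ratio, call)
```

In practice the expected ratio would depend on the marker (for example, episomal markers may be present at more than one copy per genome), so a single fixed ratio is a simplification.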
  • the methods further comprise preparing the sample using any of the sample preparation methods. In some embodiments, the methods further comprise preparing the nucleic acids in the sample for a sequencing assay. In some embodiments, the methods further comprise preparing the nucleic acids in the sample before a sequencing assay. In some embodiments, the methods further comprise preparing the nucleic acids in the sample after a first sequencing assay but before a second sequencing assay. In some embodiments, the methods further comprise performing a sequencing assay without preparing the nucleic acids. In some embodiments, preparing the nucleic acids in the sample comprises extracting, physically enriching, concentrating, amplifying the nucleic acids, or attaching one or more adapters to the nucleic acids.
  • amplifying the nucleic acids comprises performing a polymerase chain reaction (PCR). In some embodiments, amplifying the nucleic acids comprises attaching one or more adapters to the nucleic acids. In some embodiments, attaching one or more adapters to the nucleic acids comprises performing a ligase reaction or a PCR.
  • the methods comprise 1) performing a first high throughput sequencing assay on nucleic acids in a first aliquot of a sample, wherein the nucleic acids have been extracted from the first aliquot; and 2) performing a second high throughput sequencing assay on nucleic acids in a second aliquot of the sample, wherein the nucleic acids have been extracted from the second aliquot.
  • the methods comprise 1) performing a first high throughput sequencing assay on nucleic acids in a first aliquot of a sample, wherein the nucleic acids have not been extracted from the first aliquot; and 2) performing a second high throughput sequencing assay on nucleic acids in a second aliquot of the sample, wherein the nucleic acids have not been extracted from the second aliquot.
  • the methods comprise 1) performing a first high throughput sequencing assay on nucleic acids in a first aliquot of a sample, wherein the nucleic acids have been extracted from the first aliquot; and 2) performing a second high throughput sequencing assay on nucleic acids in a second aliquot of the sample, wherein the nucleic acids have not been extracted from the second aliquot.
  • the first high throughput sequencing assay is not a targeted sequencing assay.
  • the second high throughput sequencing assay is a targeted sequencing assay.
  • the methods further comprise extracting nucleic acids from a nucleic acid library before the sequencing assay.
  • the methods provided herein comprise 1) performing a first sequencing assay on nucleic acids in a first aliquot of a sample, wherein the nucleic acids in the first aliquot have been extracted; 2) extracting nucleic acids from a second aliquot of the sample; and after 2), 3) performing a second sequencing assay on the nucleic acids extracted from the second aliquot.
  • the methods provided herein comprise 1) extracting nucleic acids from a first aliquot of a sample; 2) performing a first sequencing assay on the nucleic acids extracted from the first aliquot; and after 2), 3) performing a second sequencing assay on nucleic acids from a second aliquot, wherein the nucleic acids from the second aliquot have not been extracted.
  • the methods provided herein comprise 1) extracting nucleic acids from a first aliquot of a sample; 2) performing a first sequencing assay on the nucleic acids extracted from the first aliquot; and after 2), 3) performing a second sequencing assay on nucleic acids from a second aliquot, wherein the nucleic acids from the second aliquot have been extracted.
  • a sample can comprise a biological sample obtained or collected from a subject.
  • a biological sample can comprise cells.
  • a biological sample can be substantially cell-free.
  • a biological sample can comprise a biological fluid.
  • a biological fluid can comprise a bodily fluid of a subject (e.g., blood), a fluid obtained from the subject via a medical procedure (e.g., lavage), or any fluid obtained from processing a biopsy of the subject (e.g., serous fluid).
  • a biological fluid can comprise a bodily fluid.
  • a bodily fluid can comprise a whole blood, a plasma, a serum, a lymph, a synovial fluid, a cerebrospinal fluid (CSF), a saliva, a gastric juice, a bile, a pancreatic juice, an intestinal fluid, a respiratory tract mucosal secretion, a semen, a cervical mucus, a vaginal secretion, a urine, a sebum, a breast milk, an amniotic fluid, a pericardial fluid, a pleural fluid, a peritoneal fluid, or any combination thereof.
  • a biological fluid can be processed from a bodily fluid.
  • the one or more carrier microbes comprise a methicillin-resistance carrier.
  • the methicillin-resistance carrier comprises Staphylococcus pseudintermedius, Staphylococcus fleurettii, Staphylococcus epidermidis, Staphylococcus schleiferi, Staphylococcus lugdunensis, Staphylococcus cohnii, Staphylococcus caprae, Staphylococcus warneri, Staphylococcus saprophyticus, Staphylococcus aureus, Staphylococcus haemolyticus, Staphylococcus capitis, Staphylococcus hominis, Staphylococcus pettenkoferi, or Staphylococcus simulans.
  • the one or more carrier microbes comprise a Gram-negative carbapenem or extended-spectrum beta-lactamase (ESBL) resistance carrier.
  • the Gram-negative carbapenem or extended-spectrum beta-lactamase (ESBL) resistance carrier comprises Enterobacter cloacae complex, Escherichia coli, Klebsiella pneumoniae, Klebsiella oxytoca, Proteus mirabilis, Proteus vulgaris, Salmonella enterica, Salmonella bongori, Serratia marcescens, Enterobacter aerogenes, Pseudomonas aeruginosa, Acinetobacter baumannii, Acinetobacter calcoaceticus, Aeromonas caviae, Aeromonas hydrophila, Citrobacter amalonaticus, Citrobacter freundii, Citrobacter koseri, Pluralibacter gergoviae, Enterobacter hormaechei,
  • a confounding microbe comprises a carrier microbe.
  • a confounding microbe can refer to any microbe that can harbor nucleic acids that are homologous to the target nucleic acids (e.g., AMR gene or housekeeping gene).
  • the confounding microbe is a carrier microbe harboring an AMR cassette or an AMR genetic marker.
  • a “microbe” can refer to a living microorganism or a non-living microscopic entity.
  • a living microorganism can comprise a bacterium, a protozoan, a fungus, an archaeon, an alga, a parasite, or any other living microorganism.
  • a non-living microscopic entity can comprise a virus, a live virus, a replicating virus, or an attenuated virus.
  • non-host nucleic acids can be derived from a microbe.
  • target nucleic acids can be derived from a plurality of microbes.
  • a microbe can be pathogenic to a subject (e.g., a pathogen), a commensal microbe of a subject, or a microbe present in a general environment.
  • a pathogen can comprise any pathogenic or virulent microbe.
  • a commensal microbe of a subject can comprise a microbe that inhabits any location in or on a subject without causing any symptom of a disease or disorder.
  • a microbe of a general environment can comprise a microbe at or near a sample collection site or a microbe at or near a location of a subject.
  • a microbe of a general environment comprises a commensal microbe.
  • a commensal microbe of a subject may become a pathogen to the subject.
  • a commensal microbe of a first subject may be a pathogen to a second subject.
  • a microbe of a general environment of a subject may become a pathogen to a subject.
  • a microbe of a general environment of a first subject may be a pathogen to a second subject.
  • a pathogen can cause an infection or disease comprising gastrointestinal infections (e.g., Escherichia coli, Salmonella spp., Clostridioides difficile), urinary infections (e.g., Escherichia coli), skin infections (e.g., Staphylococcus aureus, including MRSA), strep throat (or scarlet fever, rheumatic fever) (e.g., Streptococcus pyogenes), tuberculosis (e.g., Mycobacterium tuberculosis), gonorrhea (e.g., Neisseria gonorrhoeae), cholera (e.g., Vibrio cholerae), Lyme disease (e.g., Borrelia burgdorferi), ulcers or stomach cancer (e.g., Helicobacter pylori), syphilis (e.g., Treponema pallidum),
  • Pannonibacter, Rotavirus, Turicella, Cardiovirus, Propionimicrobium, Furovirus, Naumovozyma, Closterovirus, Fluoribacter, Zeavirus, Clavispora, Megrivirus,
  • cfNA comprise circulating cfNA in a subject’s bloodstream.
  • the nucleic acids comprise circulating cfDNA, circulating cfRNA, cfDNA, cfRNA, circulating DNA, circulating RNA, or any combination thereof.
  • a sample of a body fluid can comprise a cfNA.
  • cfNA includes vesicle-associated cfNA, such as cfNA associated with exosomes, extracellular vesicles, microvesicles, apoptotic bodies, or any combination thereof.
  • cfNA do not comprise vesicle-associated cfNA, such as cfNA associated with exosomes, extracellular vesicles, microvesicles, apoptotic bodies, or any combination thereof.
  • cfNAs can also be associated with proteins or other cellular constituents, outside of an intact cell.
  • cfNA can comprise free-floating nucleosome-associated cfNA.
  • cfNAs can arise from various biological processes, including cell death (apoptosis, necrosis) or active secretion.
  • cfNA can comprise viral nucleic acids that are not encapsulated by a capsid, that are fragmented, or a combination thereof.
  • the methods provided herein can, in some embodiments, be practiced using viral nucleic acids derived from whole viruses floating in a bodily fluid.
  • an mcfNA can comprise a bacterial cfNA, a fungal cfNA, a viral cfNA, a protozoan cfNA, an archaeal cfNA, an algal cfNA, or any combination thereof.
  • a sample can comprise a non-microbial nucleic acid (e.g., a non-microbial cell-free nucleic acid).
  • a sample can comprise mcfNAs from one or more species of microbes.
  • a sample can comprise mcfNAs from at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 16, at least 17, at least 18, at least 19, at least 20, at least 21, at least 22, at least 23, at least 24, at least 25, at least 26, at least 27, at least 28, at least 29, or at least 30 species of microbes.
  • a sample can comprise a mixture of nucleic acids.
  • a sample can comprise target nucleic acids (e.g., target cfNAs) and/or non-target nucleic acids.
  • the cfNAs comprise cfNA derived from a subject or microbial genome.
  • the cfNAs are derived from a housekeeping gene, e.g., a human or mammalian housekeeping gene, a microbial housekeeping gene, a microbe-specific housekeeping gene, and/or a microbe non-specific housekeeping gene.
  • a sample can further comprise contaminant nucleic acids.
  • contaminant nucleic acids can comprise nucleic acids from a general environment (e.g., a sample collection site).
  • contaminant nucleic acids are introduced during any step of sample processing.
  • the contaminant nucleic acids comprise contaminating microbial nucleic acids, contaminating nucleic acids from a different sample, contaminating host nucleic acids, or any combination thereof.
  • cell-free nucleic acids can comprise a mixture of cfNAs.
  • a mixture of cfNAs can comprise cfNAs originated from one or more organisms.
  • a mixture of cfNAs can comprise microbial nucleic acids (e.g., mcfNAs) originated from one or more species of microbes.
  • a cfNA can comprise a double-stranded nucleic acid (dsNA), a single-stranded nucleic acid (ssNA), a nicked double-stranded nucleic acid, or a combination thereof.
  • a cfNA can comprise a cell-free DNA (cfDNA), a cell-free RNA (cfRNA), a cell-free DNA-RNA hybrid (cfDNA-RNA), or a combination thereof.
  • hcfNA can comprise host cell-free DNA (hcfDNA), host cell-free RNA (hcfRNA), host cell-free DNA-RNA hybrid (hcfDNA-RNA), or a combination thereof.
  • mcfNA can comprise microbial cell-free DNA (mcfDNA), microbial cell-free RNA (mcfRNA), microbial cell-free DNA-RNA hybrid (mcfDNA-RNA), or any combination thereof.
  • the cell-free sample is devoid of all types of microbes, including eukaryotic or prokaryotic cells.
  • the cell-free sample can be obtained from a biological sample provided herein.
  • the cell-free sample is a plasma, which has been processed in order to remove blood cells, subject cells, and/or intact microbes, or fragments thereof.
  • the cell-free sample can be obtained by a sample preparation process, such as centrifuging, ultracentrifuging, or filtering a biological sample.
  • a cfNA can comprise a host nucleic acid, a non-host nucleic acid, a target nucleic acid, or a combination thereof.
  • a cfNA can be derived from a host (host cell free nucleic acids or “hcfNA”) or a non-host.
  • a hcfNA can be derived from nuclear nucleic acids, mitochondrial nucleic acids, exosomal nucleic acids, fetal nucleic acids, or any combination thereof.
  • a sample can comprise a host nucleic acid (e.g., a host cell-free nucleic acid).
  • a host is any subject provided herein.
  • ultrashort nucleic acids comprise nucleic acids less than 100 bp, less than 90 bp, less than 80 bp, less than 70 bp, less than 60 bp, less than 50 bp, less than 40 bp, or less than 30 bp in length. In some embodiments, at least 70%, 75%, 80%, 85%, 90%, or 95% of the mcfNA are degraded nucleic acids.
  • a cfNA as disclosed herein or fragments thereof can be less than about 10 bp, less than about 15 bp, less than about 20 bp, less than about 25 bp, less than about 30 bp, less than about 35 bp, less than about 40 bp, less than about 45 bp, less than about 50 bp, less than about 55 bp, less than about 60 bp, less than about 65 bp, less than about 70 bp, less than about 75 bp, less than about 80 bp, less than about 85 bp, less than about 90 bp, less than about 95 bp, less than about 100 bp, less than about 105 bp, less than about 110 bp, less than about 115 bp, less than about 120 bp, less than about 125 bp, less than about 130 bp, less than about 135 bp, less than about 140 bp, less than about 145 bp,
  • the cfNAs provided herein comprise ultra short cfNAs.
  • ultra short nucleic acids refer to subnucleosomal nucleic acids or fragments shorter than subnucleosomal nucleic acids.
  • ultra short cfNAs can be from about 10 bp to about 100 bp long.
  • the ultra short cfNAs can be from about 30 bp to about 80 bp long.
  • the ultra short cfNAs can be from about 40 bp to about 60 bp long.
  • the ultra short cfNAs are about 50 bases long.
  • mcfNA can be present at higher concentrations relative to hcfNA at lengths that fall outside a nucleosomal interval.
  • mcfNA can be enriched relative to hcfNA by enriching for cfNA of less than 180bp, less than 170bp, less than 160bp, less than 150bp, less than 140bp, less than 130bp, less than 120bp, less than 110bp, less than 100bp, less than 90bp, less than 80bp, less than 70bp, less than 60bp, less than 50bp, less than 40bp, less than 30bp, or less than 20bp.
  • enriching for mcfNA can comprise enriching for cfNA between 10-180 bp.
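The effect of length-based enrichment can be illustrated in silico. The sketch below assumes that fragment lengths and an origin label are already known for each fragment (in a real sample the origin is only known after alignment); the helper names `size_select` and `microbial_fraction` and the toy data are hypothetical.

```python
from typing import List, Tuple

# (length in bp, origin label); the labels are illustrative only.
Fragment = Tuple[int, str]


def size_select(fragments: List[Fragment], min_bp: int = 10,
                max_bp: int = 180) -> List[Fragment]:
    """Keep fragments whose length falls inside [min_bp, max_bp]."""
    return [f for f in fragments if min_bp <= f[0] <= max_bp]


def microbial_fraction(fragments: List[Fragment]) -> float:
    """Fraction of fragments labeled as microbial (0.0 for an empty list)."""
    if not fragments:
        return 0.0
    microbial = sum(1 for _, origin in fragments if origin == "microbial")
    return microbial / len(fragments)


if __name__ == "__main__":
    toy = [(45, "microbial"), (60, "microbial"), (167, "host"), (170, "host"),
           (166, "host"), (52, "microbial"), (330, "host"), (80, "host")]
    print("before:", microbial_fraction(toy))
    print("after :", microbial_fraction(size_select(toy, 10, 100)))
```

With the toy data, restricting to 10 to 100 bp removes the host fragments near the nucleosomal length while retaining the shorter microbial fragments, which raises the microbial fraction.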
  • a cfNA can comprise any nucleic acid that is not encapsulated by a cell (e.g., a eukaryotic or microbial cell).
  • a cfNA can originate from any nucleic acids.
  • a cfNA can comprise a plurality of chemical forms of deoxyribonucleic acid (DNA), ribonucleic acid (RNA), or DNA/RNA hybrid.
  • nucleic acids can comprise a plurality of structural forms of DNA, RNA, or DNA/RNA hybrid.
  • a cfNA can comprise linear nucleic acids or circular nucleic acids.
  • nucleic acids can comprise a mixture of nucleic acids from various sources.
  • nucleic acids can be derived from a plurality of biological fluids.
  • nucleic acids can be from a plurality of organisms.
  • nucleic acids can be from a subject.
  • nucleic acids can be from one or more species of microbes.
  • nucleic acids can comprise environmental nucleic acids.
  • environmental nucleic acids can comprise any nucleic acid at or near a sample collection site, or any nucleic acid introduced by personnel, equipment or a reagent used in collecting and/or processing a sample from a subject.
  • process control molecules can be used for normalizing the signal in a sample to account for variations in sample processing or to control process performance.
  • process control molecules can include whole assay internal control (WINC) molecules. In some embodiments, at least 10,000, at least 15,000, at least 20,000, at least 25,000, at least 30,000, at least 35,000, at least 40,000, at least 45,000, or at least 50,000 unique WINC molecules are spiked into the sample.
  • process control molecules can include sample identifiers.
  • process control molecules can comprise dephosphorylation control molecules, denaturation control molecules, and/or ligation control molecules. In some embodiments, multiple different types or sets of control molecules can be added to a sample.
  • spiked initial sample refers to an initial sample to which process control molecules (or synthetic spike-ins) have been added prior to the start of generating a sequencing library.
  • sequence diversity controls refers to degenerate pools, or pools of nucleic acids with diverse sequences, which degenerate pools can often be used for diversity assessment, abundance calculation, and/or determination of information transfer efficiency.
  • As used herein, “size controls,” “length controls,” “GC Spike-in Panel” or “GC size/length controls” refers to nucleic acids that are size or length or GC-content markers, which can be used for abundance normalization, development, and/or analysis purposes and other purposes.
  • a sample comprising nucleic acids can be prepared prior to a sequencing assay.
  • a raw biological sample comprising whole blood can be processed by centrifugation to generate an initial sample of plasma.
  • whole blood can be collected in a K2-EDTA tube.
  • whole blood draws are not pooled.
  • a tube can be gently inverted multiple times after draw.
  • a tube can be centrifuged at about 1200 RCF (g), about 1400 RCF (g), about 1600 RCF (g), or more after draw in order to separate plasma from the blood. The centrifugation can occur at ambient temperature. In some cases, the centrifugation occurs for greater than 5 minutes, 7 minutes, 10 minutes, 15 minutes, or 20 minutes. In some embodiments, for tubes containing less than 4 mL, the tube manufacturer’s instructions for centrifugation speed and time can be used.
  • the plasma fraction can be transferred into a new tube.
  • the plasma is subjected to centrifugation a second time to remove residual cells (e.g., mammalian cells and microbial cells).
  • the additional centrifugation can be conducted at, e.g., about 1400 RCF (g), about 1600 RCF (g), about 1800 RCF (g), about 2000 RCF (g), or more.
  • the methods described herein comprise enriching a population of cfNA.
  • enriching a population of cfNA comprises bioinformatically enriching or physically enriching.
  • enriching a population of cfNA comprises isolating, extracting, or selectively amplifying a desired population of cfNA (e.g., target cfNA) from the initial sample.
  • enriching a population of cfNA does not comprise isolating or extracting the desired population of cfNA from the initial sample.
  • enriching a population of cfNA does not comprise amplifying the desired population of cfNA.
  • enriching a population of cfNA comprises removing the undesired cfNA population (e.g., non-target cfNA or contaminants) from the initial sample.
  • enriching cfNA can comprise differentiating and/or selecting the cfNA by one or more characteristics comprising size, sequence, GC content, secondary structure, biological source, or protein-binding.
  • the methods comprise enriching microbial cfNA (mcfNA) in the biological sample.
  • the methods provided herein comprise enriching for at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, or at least 95% of the mcfNA in the biological sample.
  • enriching mcfNA comprises enriching cfNA that are less than about 20 bp, 30 bp, 40 bp, 50 bp, 60 bp, 70 bp, 80 bp, 90 bp, 100 bp, 110 bp, 120 bp, 130 bp, 140 bp, 150 bp, 160 bp, 170 bp, 180 bp, 190 bp, 200 bp, 210 bp, 220 bp, 230 bp, 240 bp, or 250 bp in length.
  • enriching mcfNA comprises amplifying the nucleic acids in the initial sample with primers containing non-human nucleic acid sequences. In some embodiments, enriching mcfNA comprises removing non-microbial cfNA. In some embodiments, enriching mcfNA comprises removing nucleosome-bound cfNA. In some embodiments, the methods comprise enriching non-microbial cfNA (e.g., host cfNA) in the biological sample. In some embodiments, the methods comprise enriching mammalian cfNA in the biological sample.
  • the methods comprise enriching cfNA by size selection.
  • size selection can comprise removing nucleic acids not in the desired size range.
  • the desired size range comprises an artificial or engineered threshold.
  • size selection comprises separating nucleic acids by size via chromatography (e.g., size-exclusion chromatography), electrophoresis (e.g., gel or capillary electrophoresis), centrifugation (e.g., density-gradient centrifugation), filtration (e.g., membrane ultrafiltration), magnetic bead-based methods (e.g., SPRI beads), affinity-based methods (e.g., streptavidin beads), or any combination thereof.
  • the methods can comprise selectively removing nucleic acid fragments greater than about 500 bp, about 450 bp, about 400 bp, about 350 bp, about 300 bp, about 250 bp, about 200 bp, about 150 bp, about 140 bp, about 130 bp, about 120 bp, about 110 bp, about 100 bp, about 90 bp, about 80 bp, about 70 bp, or about 60 bp in length.
  • the methods can comprise selectively enriching nucleic acid fragments at most about 20 bp, about 30 bp, about 40 bp, about 50 bp, about 60 bp, about 70 bp, about 80 bp, about 90 bp, about 100 bp, about 110 bp, about 120 bp, about 130 bp, about 140 bp, about 150 bp, about 160 bp, about 170 bp, about 180 bp, about 190 bp, about 200 bp, about 210 bp, about 220 bp, about 230 bp, about 240 bp, or about 250 bp in length.
  • the methods can comprise selectively enriching nucleic acid fragments of about 10 bp to about 20 bp, about 10 bp to about 30 bp, about 10 bp to about 40 bp, about 10 bp to about 50 bp, about 10 bp to about 60 bp, about 10 bp to about 70 bp, about 10 bp to about 80 bp, about 10 bp to about 90 bp, about 10 bp to about 100 bp, about 10 bp to about 110 bp, about 10 bp to about 120 bp, about 10 bp to about 130 bp, about 10 bp to about 140 bp, about 10 bp to about 150 bp, about 10 bp to about 160 bp, about 10 bp to about 170 bp, about 10 bp to about 180 bp, about 10 bp to about 190 bp, about 10 bp to about 200 bp, about
  • the methods can comprise selectively enriching nucleic acid fragments of about 20 bp to about 250 bp, about 20 bp to about 200 bp, about 20 bp to about 150 bp, about 20 bp to about 100 bp, about 20 bp to about 90 bp, about 20 bp to about 80 bp, about 20 bp to about 70 bp, about 20 bp to about 60 bp, about 20 bp to about 50 bp, about 30 bp to about 250 bp, about 30 bp to about 200 bp, about 30 bp to about 150 bp, about 30 bp to about 100 bp, about 30 bp to about 90 bp, about 30 bp to about 80 bp, about 30 bp to about 70 bp, about 30 bp to about 60 bp, about 30 bp to about 50 bp, about 40 bp to about 250 bp, about 40 bp to about.
  • the methods provided herein comprise performing process 1 to prepare a sample for high throughput sequencing assay.
  • Process 1 provides an example of preparing a sequencing library from the double-stranded cfDNA in the original sample.
  • a control molecule can be added to an initial sample of plasma to generate a spiked plasma sample.
  • nucleic acid extraction can be performed on a spiked plasma sample to generate purified and concentrated cfDNA.
  • a library preparation process can be performed on a purified and concentrated cfDNA sample.
  • the library preparation can comprise attaching (e.g., by ligation) double-stranded adapters to double-stranded cfDNA.
  • library preparation can comprise performing unbiased amplification on the sample.
  • a sample preparation method does not comprise extracting nucleic acids from a raw or initial sample. For example, in some cases, nucleic acids can be extracted during or following library preparation, if at all.
  • an adapter pair can be attached to cfDNA fragments in a sample such as by ligation or PCR amplification.
  • a pair of adapters can comprise a p5 adapter that is attached to a 5’ end of a molecule and a p7 adapter that is attached to a 3’ end of a molecule.
  • a p5 and p7 sequence can allow a nucleic acid library to bind and generate clusters on a flow cell.
  • the cfDNA can be attached to adapters comprising identifier sequences that can differentiate between multiple samples.
  • the multiple samples comprise a plurality of patient samples and/or control samples (e.g., positive control, negative control).
  • samples can be pooled after barcoding, then sequenced, then demultiplexed to assign each cluster to its sample.
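The pool, sequence, and demultiplex flow above can be sketched as follows. This is a simplified illustration that assumes exact barcode matches and represents each read as a `(barcode, sequence)` pair; the function name `demultiplex`, the barcodes, and the sample names are hypothetical, and production demultiplexers typically also tolerate sequencing errors in the barcode.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def demultiplex(
    reads: Iterable[Tuple[str, str]],
    barcode_to_sample: Dict[str, str],
) -> Dict[str, List[str]]:
    """Assign each (barcode, sequence) read to its sample by exact barcode match."""
    by_sample: Dict[str, List[str]] = defaultdict(list)
    for barcode, sequence in reads:
        # Reads with an unknown barcode are kept apart rather than discarded.
        sample = barcode_to_sample.get(barcode, "undetermined")
        by_sample[sample].append(sequence)
    return dict(by_sample)


if __name__ == "__main__":
    sample_sheet = {"ACGT": "patient_1", "TGCA": "negative_control"}
    pooled = [("ACGT", "GGATTC"), ("TGCA", "CCTTAA"),
              ("ACGT", "TTGACA"), ("NNNN", "AAAAAA")]
    for sample, seqs in demultiplex(pooled, sample_sheet).items():
        print(sample, seqs)
```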
  • the methods described herein can comprise attaching a splint adapter (e.g., a splint oligonucleotide) to the single-stranded cfDNA in the sample.
  • the splint adapter can comprise dsDNA with a ssDNA overhang.
  • the ssDNA overhang can comprise random or degenerate nucleotides (e.g., NNNNNN).
  • the ssDNA overhang can comprise a specific sequence capable of hybridizing to a target molecule.
  • the splint adapter can randomly hybridize to cfDNA via the single-stranded DNA overhang within the splint adapter.
  • attaching the splint adapter can further comprise ligating the hybridized adapter to cfDNA, e.g., using an enzyme (e.g., ligase).
  • the ligase comprises T4 ligase, CircLigase II, CircLigase ssDNA Ligase, Splint ligase, any engineered variants thereof, any natural variants thereof, or any combination thereof.
  • exemplary detectable labels include radiolabels, fluorescent labels, protein labels, dye labels, enzymatic labels, etc.
  • the detectable label can be an optically detectable label, such as a fluorescent label.
  • Exemplary fluorescent labels include cyanine, rhodamine, fluorescein, coumarin, BODIPY®, Alexa Fluor®, or conjugated multi-dyes.
  • a reference value can comprise an abundance of sequence reads generated from any nucleic acids in the sample.
  • the reference value may be an abundance of sequence reads from microbial cell- free DNA (mcfDNA) or of sequence reads from a synthetic spike-in molecule (or process control molecule).
  • the methods further comprise calculating the relative abundance of a sequence read at least in part by comparing the abundance of the sequence read to the abundance of another sequence read.
  • the methods further comprise calculating the normalized abundance of a sequence read at least in part by comparing the abundance of the sequence read from mcfDNA in the sample to the abundance of sequence reads from synthetic spike-in molecules.
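One way to picture spike-in normalization is to scale the microbial read count by the ratio of spike-in molecules added to spike-in reads recovered. The sketch below is an assumption-laden illustration, not the calculation used in the disclosure: the function name, the per-microliter unit, and all numbers are hypothetical, and it assumes that microbial and spike-in molecules are recovered with equal efficiency.

```python
def normalized_abundance(
    microbe_reads: int,
    spikein_reads: int,
    spikein_molecules_added: float,
    plasma_volume_ul: float,
) -> float:
    """Estimate microbial molecules per microliter of plasma.

    Assumes spike-in and microbial molecules are converted to reads with
    the same efficiency, so the read ratio mirrors the molecule ratio.
    """
    if spikein_reads <= 0:
        raise ValueError("no spike-in reads recovered; cannot normalize")
    if plasma_volume_ul <= 0:
        raise ValueError("plasma volume must be positive")
    return (microbe_reads * spikein_molecules_added) / (
        spikein_reads * plasma_volume_ul
    )


if __name__ == "__main__":
    # 200 microbial reads, 10,000 spike-in reads,
    # 1,000,000 spike-in molecules added to 500 microliters of plasma.
    print(normalized_abundance(200, 10_000, 1_000_000, 500))
```

Because the spike-in passes through the same processing as the sample, losses during extraction or library preparation affect numerator and denominator alike and largely cancel.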
  • the methods further comprise mapping a sequence read to a reference sequence.
  • the reference sequence can comprise a microbial sequence.
  • the reference sequence comprises a non-microbial sequence.
  • the reference sequence comprises a genomic sequence.
  • the methods further comprise mapping a sequence read to a reference genome.
  • a method disclosed herein can comprise identifying a species of microbe by mapping a sequence read to a reference genome.
  • the methods further comprise identifying the source of nucleic acids in the sample from which the sequence read originated.
  • the methods further comprise performing a bioinformatic analysis.
  • sequence reads can be aligned to a reference sequence comprising artifactual sequences.
  • regions that show irregularities in read coverage when multiple samples are aligned can be masked or removed as an artifact.
  • the detection of such irregular coverage can be done by various metrics, such as the ratio between coverage of a specific nucleotide and the average coverage of the entire contig within which this nucleotide is found.
  • a sequence that is represented at greater than about 5×, about 10×, about 25×, about 50×, or about 100× the average coverage of the reference sequence comprising artifactual sequences can be artifactual.
  • a binomial test can be applied to provide a per-base likelihood of coverage given the overall coverage of the contig.
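The two coverage metrics above, the ratio of per-base coverage to contig average and a per-base binomial test, can be sketched together. The sketch assumes a simple null model in which every sequenced base is equally likely to land on any position of a non-empty contig; the function names, the 5× threshold taken from the listed options, and the toy coverage vector are illustrative, and the exact binomial sum is only practical for small totals.

```python
from math import comb
from typing import List


def coverage_ratios(coverage: List[int]) -> List[float]:
    """Ratio of each position's coverage to the contig's mean coverage."""
    mean = sum(coverage) / len(coverage)
    return [c / mean if mean > 0 else 0.0 for c in coverage]


def flag_artifacts(coverage: List[int], fold_threshold: float = 5.0) -> List[int]:
    """Indices whose coverage exceeds fold_threshold times the contig mean."""
    ratios = coverage_ratios(coverage)
    return [i for i, r in enumerate(ratios) if r > fold_threshold]


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p), computed by direct summation."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def per_base_p_values(coverage: List[int]) -> List[float]:
    """Probability of coverage at least this high under uniform placement."""
    n = sum(coverage)          # total sequenced bases on the contig
    p = 1.0 / len(coverage)    # uniform chance of landing on one position
    return [binomial_upper_tail(c, n, p) for c in coverage]


if __name__ == "__main__":
    toy = [5, 6, 4, 5, 90, 5, 6, 4, 5, 5]
    print("flagged positions:", flag_artifacts(toy))
    print("p-value at spike :", per_base_p_values(toy)[4])
```

Flagged positions could then be masked before abundance is computed, so that a pile-up at one locus does not inflate the estimate for the whole contig.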
  • each high confidence read can align to multiple organisms in the given microbial database.
  • an algorithm can be used to compute the most likely organism (for example, see Lindner et al. Nucl. Acids Res. (2013) 41 (1): e10, which is referenced herein in its entirety).
  • the microbial sequence comprises a region of a microbial genome.
  • the reference sequence comprises an artifactual sequence.
  • a sequencing can generate at least 100, at least 250, at least 500, at least 750, at least 1000, at least 1500, at least 2500, at least 3000, at least 3500, at least 4000, at least 4500, at least 5000, at least 5500, at least 6000, at least 7000, at least 8000, at least 9000, at least 10,000, at least 12,500, at least 15,000, at least 17,500, at least 20,000, at least 30,000, at least 40,000, at least 50,000, at least 60,000, at least 70,000, at least 80,000, at least 90,000, at least 100,000, at least 200,000, at least 300,000, at least 400,000, at least 500,000, at least 600,000, at least 700,000, at least 800,000, at least 900,000, at least 1,000,000, at least 2,000,000, at least 3,000,000
  • a bioinformatics analysis can include, without limitation, assembling sequence data, detecting and quantifying sequence reads, distinguishing populations of nucleic acids, detecting the presence and measuring the abundance of microbial nucleic acids, comparing sequence reads, comparing abundances of sequence reads, identifying contaminant nucleic acids from the sample collection site, identifying target nucleic acids (e.g., cell-free nucleic acids), identifying host nucleic acids, generating fragment length profiles of microbial nucleic acids, generating fragment length profiles of control process molecules, comparing fragment length profiles of the microbial nucleic acids, detecting site of infection, detecting the state of infection, detecting the risk of organ rejection in a transplant patient, determining the eligibility of a subject for a transplant, and/or detecting potential for drug resistance.
  • the computer system 601 also includes memory or memory location 610 (e.g., random-access memory, read-only memory, flash memory), electronic storage unit 615 (e.g., hard disk), communication interface 620 (e.g., network adapter) for communicating with one or more other systems, and peripheral devices 625, such as cache, other memory, data storage and/or electronic display adapters.
  • the memory 610, storage unit 615, interface 620, and peripheral devices 625 are in communication with the CPU 605 through a communication bus (solid lines), such as a motherboard.
  • the storage unit 615 can be a data storage unit (or data repository) for storing data.
  • the computer system 601 can be operatively coupled to a computer network (“network”) 630 with the aid of the communication interface 620.
  • the network 630 can be the Internet, an internet and/or extranet, or an intranet and/or extranet that is in communication with the Internet.
  • the network 630 in some embodiments is a telecommunication and/or data network.
  • the network 630 can include one or more computer servers, which can enable distributed computing, such as cloud computing.
  • the network 630, in some embodiments with the aid of the computer system 601, can implement a peer-to-peer network, which can enable devices coupled to the computer system 601 to behave as a client or a server.
  • the CPU 605 can execute a sequence of machine-readable instructions, which can be embodied in a program or software.
  • the instructions can be stored in a memory location, such as the memory 610.
  • the instructions can be directed to the CPU 605, which can subsequently program or otherwise configure the CPU 605 to implement methods of the present disclosure. Examples of operations performed by the CPU 605 can include fetch, decode, execute, and writeback.
  • the CPU 605 can be part of a circuit, such as an integrated circuit. One or more other components of the system 601 can be included in the circuit. In some embodiments, the circuit is an application specific integrated circuit (ASIC).
  • the storage unit 615 can store files, such as drivers, libraries, and saved programs.
  • the storage unit 615 can store user data, e.g., user preferences and user programs.
  • the computer system 601 in some embodiments can include one or more additional data storage units that are external to the computer system 601, such as located on a remote server that is in communication with the computer system 601 through an intranet or the Internet.
  • the computer system 601 can communicate with one or more remote computer systems through the network 630.
  • the computer system 601 can communicate with a remote computer system of a user.
  • remote computer systems include personal computers, slate or tablet PCs, telephones, smart phones, or personal digital assistants.
  • the user can access the computer system 601 via the network 630.
  • the computer system 601 can include or be in communication with an electronic display 635 that comprises a user interface (UI) 640 for providing an output of a report, which can include a diagnosis of a subject or a therapeutic intervention for the subject.
  • Examples of UIs include, without limitation, a graphical user interface (GUI) and a web-based user interface.
  • the analysis can be provided as a report.
  • the report can be provided to a subject, to a health care professional, a lab-worker, or other individual.
  • Methods and systems of the present disclosure can be implemented by way of one or more algorithms.
  • An algorithm can be implemented by way of software upon execution by the central processing unit 605.
  • the algorithm can, for example, facilitate the enrichment, sequencing and/or detection of pathogen or microbe or other target nucleic acids.
  • Information about a patient or subject can be entered into a computer system, for example, patient background, patient medical history, or medical scans.
  • the computer system can be used to analyze results from a method described herein, report results to a patient or doctor, or come up with a treatment plan.
  • the methods provided comprise detecting a clinically-relevant genetic marker.
  • the genetic marker is associated with an infection or a non-communicable disease.
  • the methods are useful for detecting an antimicrobial resistance (AMR) marker.
  • the methods are useful for detecting a cancer marker.
  • the methods are useful for detecting a disease site.
  • the methods provided herein are useful for informing diagnostic and treatment decisions for infectious diseases in clinical settings.
  • the methods provided herein can be useful for rapid detection of the microbe causing the infection without performing microbial culture.
  • the methods can be useful for detecting an AMR genetic marker, an AMR gene cassette, and/or a carrier microbe harboring the AMR genetic marker.
  • the methods comprise generating a report listing the microbes detected by the methods.
  • the methods comprise generating a report listing the microbes and probabilities of the microbes being resistant or susceptible to antimicrobial treatment for clinical use.
  • the methods comprise generating a report on the microbes and their resistance or susceptibility to antimicrobial treatment.
  • the methods provided herein are useful for linking the AMR genetic marker to carrier microbes harboring the AMR genetic marker.
  • detection of the AMR genetic marker can provide additional phenotypic characteristics of the carrier microbe.
  • a phenotypic characteristic is resistance to a particular AMR drug or class of AMR drugs.
  • the methods comprise performing a first sequencing assay to detect one or more carrier microbes harboring the AMR genetic marker and based on this detection, perform a second sequencing assay to link the AMR genetic marker to a specific carrier microbe. For example, many carrier microbes can harbor the same AMR gene, making it important to identify the exact microbes in the infection that carry the gene to better inform treatment.
  • the mecA gene can be carried by multiple Staphylococcus species, such as S. aureus and S. epidermidis, so mecA could be present in either or both microbes in the infection.
  • the methods provided herein can be useful for determining the likely etiology of the infection and establishing the microbial species containing the mecA gene.
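The mecA example can be sketched as a lookup that intersects the known carriers of a marker with the microbes detected in the first assay. Everything here is a hypothetical illustration: the table contains only the two species named above, and the rule that a linking assay is needed only when more than one candidate carrier is present is an assumption made for the sketch, not a statement of the disclosed decision logic.

```python
# Hypothetical lookup table; only the carriers named in the text are listed.
KNOWN_CARRIERS = {
    "mecA": {"Staphylococcus aureus", "Staphylococcus epidermidis"},
}

def candidate_carriers(marker, detected_microbes, known_carriers=KNOWN_CARRIERS):
    """Detected microbes that are known to be able to carry the marker."""
    return sorted(known_carriers.get(marker, set()) & set(detected_microbes))

def needs_linking_assay(marker, detected_microbes):
    """Assumed rule: the marker is ambiguous when several candidates co-occur."""
    return len(candidate_carriers(marker, detected_microbes)) > 1
```

When both Staphylococcus species are detected, the marker cannot be attributed from the first assay alone, which is the situation the second, linking assay addresses.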
  • the methods further comprise administering a therapeutic or developing a treatment plan based on the identification of a microbial species, such as a microbial species carrying an AMR marker.
  • the methods are useful for guiding the selection of antimicrobial treatment. For example, these methods can help determine the appropriate class of antimicrobial drugs for an infection based on the results of AMR detection. Once a link is established between an AMR genetic marker and the detected carrier microbe, the treatment plan for the infection can be adjusted accordingly based on the results generated by the methods.
  • the methods comprise modifying the antimicrobial drug class administered to a subject when microbes harboring resistance genes are detected.
  • the methods comprise administering an alternative regimen to the subject. For example, if the methods detect that Staphylococcus aureus infecting the patient harbors the SCCmec cassette, mecA gene, and/or mecC gene, this indicates that the infection is caused by methicillin-resistant Staphylococcus aureus (MRSA), which is resistant to all β-lactams except ceftaroline and ceftobiprole.
  • the clinical treatment can shift from β-lactams to alternative primary regimens such as vancomycin, linezolid, daptomycin, or ceftobiprole.
  • VRE vancomycin-resistant Enterococcus
  • the clinical treatment would shift from vancomycin or penicillin-based therapies to alternative regimens such as linezolid or daptomycin.
  • ESBL extended-spectrum β-lactamase
  • ESBL-producing bacteria are generally resistant to penicillins, penicillin-BLI combinations, most cephalosporins (except cephamycins), and aztreonam.
  • the clinical treatment would shift from β-lactam antibiotics to alternative regimens such as ceftolozane-tazobactam, ertapenem, imipenem-cilastatin, meropenem, or aminoglycosides.
  • KPC Klebsiella pneumoniae carbapenemase
  • the methods provided herein are useful for identification of multiple species involved in an infection or a co-infection.
  • the methods comprise non-biased sequencing of microbial nucleic acids and can identify many microbes present in an infection. This is particularly useful for diagnosing and treating patients susceptible to multiple infections, including co-infections and complex infections, such as those occurring in immunocompromised individuals.
  • the methods comprise identifying microbes from different kingdoms within an infection, such as a viral, bacterial, and fungal infection occurring simultaneously. For example, immunocompromised patients, such as those with human immunodeficiency virus (HIV), may experience co-infections involving multiple pathogens.
  • the methods identify multiple resistant bacteria in difficult-to-treat infections. For instance, chronic infections in cystic fibrosis patients are particularly challenging to manage due to the persistence of multidrug-resistant bacteria.
  • the methods are useful for identifying resistance markers across a broad range of microbes. For example, the methods can detect clinically relevant AMR genes in Cytomegalovirus (CMV) and other DNA viruses, the rpoB gene in Mycobacterium tuberculosis, additional classes of carbapenemase and ESBL genes in Gram-negative bacteria, and antifungal resistance genes in Aspergillus spp. and other molds.
  • the methods provided herein are useful for rapid diagnosis of life-threatening infections, particularly in immunocompromised or critically ill patients, such as those undergoing chemotherapy, organ transplants, or long-term immunosuppressive therapy.
  • these methods can be used to diagnose endocarditis, pneumonia in immunocompromised hosts (ICH), invasive fungal infections, febrile neutropenia, and/or fever of unknown origin (FUO).
  • the methods described herein offer a rapid and comprehensive approach to identifying causative pathogens, improving targeted treatment, and minimizing the use of broad-spectrum antimicrobials.
  • the methods provided herein can be used independently of microbial culturing or can be used to confirm the results of a microbial culture.
  • the methods can offer various advantages over traditional culturing, including faster turnaround times and the ability to detect microbes that are difficult to culture.
  • the methods described herein are more sensitive and/or more accurate than microbial culturing. For example, even when a patient’s blood culture fails to show any organism, the methods provided herein can detect microbial nucleic acids from pathogens causing the infection in the patient’s blood sample, thereby identifying the etiology of the infection.
  • the methods provided herein can be used independently of other detection methods, such as methods that depend solely on PCR.
  • the methods provided herein can offer various advantages over methods that solely use PCR.
  • the methods provided herein can detect multiple microbial nucleic acids in a biological sample in a high-throughput manner, providing unique insights into the patient's microbiome, infectome, and resistome that may be unavailable with other detection methods.
  • the methods described herein more accurately detect the microbe infecting the patient and more accurately link the detection of an AMR genetic marker to its carrier microbe.
  • the methods provided herein are useful for monitoring the development of AMR genetic markers in an infected patient or a population of patients over time, providing longitudinal data on microbial resistance.
  • the methods monitor AMR genetic marker development at clinical locations, such as hospital wards.
  • the methods may be useful for preventing the spread of antimicrobial-resistant infections in clinical settings.
  • the methods can help reduce the acquisition and spread of antimicrobial resistance due to the inappropriate use of antimicrobials, both in individual patients and among populations.
  • the methods provided herein are useful for treating a patient who has a cancer, wherein the cancer harbors a cancer marker or multiple markers for cancer.
  • a cancer marker can be detected by high-throughput sequencing, followed by a targeted assay on a different aliquot of the sample in order to detect additional cancer markers.
  • the methods comprise determining, adjusting, or administering a cancer therapy provided herein to the patient.
  • an mcfNA in a subject’s bodily fluid can originate from a microbe living in or on a subject.
  • an mcfNA can be detected in a subject that has been exposed to an infectious disease.
  • an mcfNA can be detected in a healthy individual.
  • detection of mcfNA by the methods disclosed herein can be performed as a biomarker of infection.
  • mcfNAs can be associated with communicable and/or non-communicable diseases.
  • mcfNAs can be associated with a range of diseases and conditions of a subject described herein, including an infection, an inflammatory bowel disease (IBD), a Kawasaki disease (KD), a human immunodeficiency virus (HIV), a cardiovascular disease (CVD), a cystic fibrosis (CF), a pneumonia, a sepsis, a cancer, a gastric cancer (GC), a hepatocellular carcinoma (HCC), a melanoma, or any combination thereof.
  • the methods provided herein comprise detecting, diagnosing, treating, monitoring, staging, or prognosing a disease or a disorder.
  • the disease or disorder is an infection or another medical indication related to an infection.
  • the methods are for determining an infection site or identifying a source of an infection.
  • the methods are for determining the biological relationship between a microbe and a host (e.g., the subject herein).
  • the methods are for detecting or identifying a commensal microbe of the subject.
  • a commensal microbe of a first subject may be a potential pathogen to a second subject.
  • a commensal microbe of an animal may be a potential pathogen to a human.
  • methods disclosed herein can be used in conjunction with one or more medical tests.
  • the methods described herein are for determining an eligibility of a subject in transplantation.
  • the subject described herein can be a donor or a recipient of the transplantation.
  • the transplantation comprises organ transplantation, tissue transplantation, composite tissue transplantation, living donor transplantation, or xenotransplantation.
  • the methods are for determining the eligibility of an animal donor for xenotransplant.
  • the methods are for determining the eligibility of a human donor for organ transplant.
  • the methods are for determining the eligibility of a human transplant recipient.
  • the methods described herein are for individualized treatment for an infected subject or a subject who is susceptible or at risk for infections (e.g., immunosuppressed, immunocompromised, living conditions, or genetic variations resulting in increased susceptibility for infection).
  • individualized treatment can include predicting if an infection will progress to an invasive disease stage, monitoring the efficacy of a therapy in a subject, modifying a therapeutic regimen depending on the subject’s response to the therapy, and determining a pathogen’s resistance to a particular therapeutic.
  • the methods disclosed herein can be used to detect, diagnose, predict, or prognose a pathogen’s resistance to a particular therapeutic.
  • the methods disclosed herein can further comprise sequencing of the subject’s DNA for genetic variations that are associated with resistance to therapeutics or to a particular therapeutic.
  • the methods provided herein can comprise, in some embodiments, determining a subject’s response to a particular treatment.
  • samples can be collected serially at various times before or during the course of the infection to determine the pathogen’s and subject's response to a treatment.
  • a treatment plan is individually tailored to the subject.
  • samples can be collected at various timepoints following administration of a treatment.
  • samples can be collected at various timepoints following discontinuation of a treatment or adjustment of a treatment.
  • serially collected samples are compared to each other to determine whether the infection is improving or worsening in the subject.
  • the methods can comprise maintaining, discontinuing, modifying, or adjusting the dose of a particular treatment.
  • modifying the treatment may comprise replacing a therapeutic drug with a different or alternative therapeutic drug, particularly if an AMR marker is detected following treatment.
  • the methods comprise discontinuing an antimicrobial treatment if an AMR gene or cassette is identified.
  • the methods comprise changing a treatment to a different therapy, or alternative regimen.
  • the methods can comprise adjusting a dose of current treatment, either up or down, based on detection of an AMR marker.
  • the methods described herein can be used to adjust a therapeutic regimen.
  • the subject can be administered a drug to treat an infection.
  • methods provided herein can be used to track or monitor the efficacy of the drug treatment.
  • the therapeutic regimen can be adjusted, depending on upward or downward course of the infection. For example, if the methods provided herein indicate that an infection is not improving with drug treatment, the therapeutic regimen can be adjusted by changing the type of drug or treatment, discontinuing the use of the drug, continuing the use of the drug, increasing the dose of the drug, or adding a new drug or treatment to the subject’s therapeutic regimen.
  • the methods provided herein are useful for treating a patient who has an infection by one or more carrier microbes harboring an AMR marker.
  • the methods comprise determining, adjusting, changing, or administering an antimicrobial treatment of a subject based on identification of multiple microbes.
  • a treatment can involve administering a drug or other therapy to reduce or eliminate the colonization or invasive disease associated with an infection.
  • the subject can be treated prophylactically to prevent the development of an infection.
  • the methods provided herein can comprise performing procedures or administering a treatment to improve or reduce the symptoms of an infection.
  • antibiotics such as ampicillin, sulbactam, penicillin, vancomycin, gentamycin, aminoglycosides, clindamycin, cephalosporin, metronidazole, timentin, ticarcillin, clavulanic acid, cefoxitin
  • antiretroviral drugs e.g., highly active antiretroviral therapy (HAART), reverse transcriptase inhibitors, nucleoside/nucleotide reverse transcriptase inhibitors (NRTIs), Nonnucleoside RT inhibitors, and/or protease inhibitors
  • immunoglobulins or any variant or combination thereof.
  • the methods described herein are for detection, monitoring, diagnosis, prognosis, treatment, prediction, or prevention of colonization by the microbes disclosed herein.
  • the disclosure also provides methods to detect, monitor, diagnose, prognose, treat, predict, or prevent invasive disease caused by the microbes described herein.
  • the methods of the disclosure can be applied to any pathogen that has various stages of infection, including early stages.
  • the methods can be especially useful for pathogens that have a colonization stage and an invasive disease stage.
  • the invasive disease stage can be caused by the pathogen infection.
  • the invasive disease stage can be associated with the pathogen infection.
  • the methods can be used to distinguish populations of nucleic acids or to detect a microbe in a subject.
  • the methods provide a more comprehensive view of the state and diversity of the infection or symbiotic microbes in a subject.
  • the identification of both RNA and DNA in a sample can be useful to detect RNA and DNA type viruses, or to detect bacterial, protist, parasitic or fungal genomic DNA and/or gene expression products, e.g., mRNA.
  • Such processes can also differentiate between latent infection (e.g., which might be indicated by the presence of integrated retroviral DNA) and active infection (e.g., which might be indicated by the presence of viral RNA from intact viral particles).
  • Such processes can also detect drug resistance and/or the origin of infection. Such processes can also be used to analyze host response. Such analyses can include analysis of cell-free, circulating nucleic acids, e.g., for microbial or viral infection identification.
Subjects
  • the sample described herein has been obtained from a subject.
  • the subject can comprise a human or a non-human animal.
  • a subject can comprise a male or a female.
  • a subject can be of any age.
  • a subject can be a child.
  • a subject can comprise an embryo or a fetus.
  • a subject can comprise a healthy subject.
  • a subject can have, be suspected of having, or be at risk of having a disease or a disorder.
  • the subject has an infection by a microbe.
  • the subject is infected by a pathogenic microbe, a commensal microbe, or a combination thereof.
  • the subject is infected by a carrier microbe harboring one or more AMR genetic markers provided herein.
  • the subject has a medical indication related to an infection or an abnormal immune response to an infection.
  • the subject is wholly or partially immunocompromised or abnormally susceptible to infections.
  • the subject has a reoccurring infection by a microbe, a secondary infection by another microbe, or a co-infection by two or more microbes.
  • the subject has a disorder or a disease comprising a cancer, transplantation, a surgery, a burn, an infection, a malnourishment, a chronic kidney disease, diabetes mellitus, an autoimmune disease or disorder, or an immune disorder (e.g., an acquired immunodeficiency syndrome (AIDS)).
  • a subject can have a medical condition.
  • a medical condition can comprise pregnancy, lactation, menopause, frailty, malnutrition, graft or organ rejection, chronic fatigue, or a combination thereof.
  • the subject is taking an immunosuppressant agent.
  • the immunosuppressant agent may comprise chemotherapy, radiation, corticosteroids, transplant medications, or certain biologics.
  • the subject is taking an anti-infective agent provided herein.
  • the anti-infective agent comprises an antibacterial drug (e.g., an antibiotic), an antiviral drug, an antifungal drug, an antiparasitic drug, or any combination thereof.
  • the subject is taking one anti-infective agent without improvement of symptoms and needs a different anti-infective regimen.
  • the subject is taking one anti-infective agent but has developed an infection by the one or more carrier microbes harboring an AMR genetic marker.
  • the subject has a co-infection of one or more microbes, wherein the one or more microbes comprise one or more carrier microbes harboring an AMR genetic marker.
  • a subject can have or be at an elevated risk for developing an infection.
  • a subject can have or be at an elevated risk for developing a cancer.
  • a subject can be eligible as a recipient of transplantation or be an actual recipient of a transplanted organ or graft.
  • a subject can be an organ donor or preparing to be an organ donor.
  • a subject can comprise an animal organ donor for use in a xenotransplant or an animal being prepared for organ donation in a xenotransplant.
  • the subject can also be referred to as a “host.”
  • “host” refers to an organism that harbors another organism or microbe.
  • a living thing (e.g., a mammal such as a human being) can be a host that harbors a microbe, the microbe being the non-host.
  • “host nucleic acids” and all derivative terms such as “host cell-free nucleic acids”, “host cell-free DNA”, etc. refer to nucleic acids derived from the host genome.
  • a host genome can comprise nucleic acids derived from a nucleus, a mitochondrion, a cytoplasm, an exosome, cell-free nucleic acids derived from any of these, or any combination thereof.
  • the target nucleic acids comprise host nucleic acids.
  • the host nucleic acids comprise a genetic marker associated with a disease or disorder of the host.
  • the disease or disorder comprises an infectious disease or a non-communicable disease.
  • the host nucleic acid can comprise a genetic marker associated with a cancer.
  • a subject can comprise an animal.
  • an animal can comprise a vector for disease transmission from which a sample is being tested to determine a presence or absence of a pathogen in the animal.
  • a disease vector can comprise an animal that has come into contact with a human subject.
  • an animal coming into contact with a human subject can comprise an animal biting a human, a human ingesting an animal or a secretion of an animal, or a combination thereof.
  • an animal can comprise a mammal, a bird, a reptile, an amphibian, a fish, an insect, or an arachnid.
  • an animal can comprise a research animal, an animal for medical use (e.g., xenotransplant donor), a companion animal, a farm animal, a working animal, a performance animal, or a wild animal.
  • a mammal can comprise a non-human primate (e.g., a macaque or rhesus monkey), a rodent, a carnivore (e.g., a canine or a feline), a bat, a cetacean (e.g., a dolphin), an ungulate, or an insectivore (e.g., a hedgehog).
  • an ungulate can comprise a swine, a sheep, a cow, a deer, or a horse.
  • the methods are for detecting, diagnosing, treating, monitoring, staging, or prognosing an infectious disease or an infection.
  • the infection may be caused by any microbe, including but not limited to pathogenic or commensal microbes.
  • the infection is caused by a carrier microbe harboring the one or more AMR genetic markers.
  • the methods are for detecting, diagnosing, treating, monitoring, staging or prognosing a non-communicable disease or disorder in a subject.
  • the non-communicable disease or disorder is associated with altered or abnormal gene expression.
  • the non-communicable disease or disorder is associated with a genetic alteration (e.g., genetic marker).
  • the non-communicable disease is associated with a cancer genetic marker.
  • a non-communicable disease or disorder comprises a cancer, an autoimmune disease, a neurodegenerative disease, diabetes, arthritis, a heart disease, hepatitis, a kidney disease, multiple sclerosis, a disease of the integumentary system, an anemia, Duchenne muscular dystrophy, hemophilia, Down syndrome, Angelman syndrome, Prader-Willi syndrome, Rett syndrome, Fragile X syndrome, Li-Fraumeni syndrome, a cardiomyopathy, obesity, an inflammatory bowel disease, celiac disease, an autism spectrum disorder, a prion disease, asthma, a genetic defect, osteoporosis, neurofibromatosis, beta-thalassemia, Marfan syndrome, and cystic fibrosis.
  • the methods are for detecting, diagnosing, treating, monitoring, staging, or prognosing a medical indication related to an infection.
  • a medical indication related to an infection comprises any disease, disorder or procedure (e.g., medical or surgical) that renders a subject immunocompromised or susceptible to infections.
  • the medical indication related to an infection may cause a new or reoccurring infection in a subject.
  • the medical indication related to an infection can comprise any disease for which an immunosuppressant (e.g., chemotherapy, radiation, corticosteroids, transplant medications, or certain biologics) or an anti-infective agent is required as a treatment.
  • a medical indication related to an infection may comprise cancers, transplantation, surgeries, burns, infections, malnourishment, chronic kidney diseases, diabetes mellitus, autoimmune diseases, or immune disorders (e.g., acquired immunodeficiency syndrome (AIDS)).
  • the methods further comprise administering a treatment to the subject.
  • the treatment can comprise a pharmacological treatment or a non-pharmacological treatment for the disease or disorder that the subject has.
  • the pharmacological treatments may comprise a small molecule drug or a biologic drug.
  • the biologic drug comprises a peptide, a protein, an antibody or an antigen binding fragment thereof, nucleic acids, a cell-based therapy, or any combination thereof.
  • the drug treatment may comprise medications to be administered via a plurality of routes.
  • the drug treatment is administered orally, topically, or by injections. Topical administration may comprise intranasal administration, intravaginal administration, transdermal administration, or inhalation.
  • the treatment comprises an antimicrobial or an anti-infective agent.
  • the antimicrobial agent comprises an antibacterial drug (e.g., an antibiotic), an antiviral drug, an antifungal drug, an antiparasitic drug, or any combination thereof.
  • the antifungal drug comprises a polyene (e.g., Amphotericin B, Nystatin), an azole (e.g., Fluconazole, Voriconazole, Posaconazole), an echinocandin (e.g., Caspofungin, Micafungin), a pyrimidine analog (e.g., flucytosine), an allylamine (e.g., terbinafine), Griseofulvin, Ciclopirox, Tavaborole, or any combination thereof.
  • the methods described herein further comprise adjusting a treatment that the subject has received at the time of sample collection.
  • the term "about” as used herein generally means plus or minus ten percent (10%) of a value, inclusive of the value, unless otherwise indicated by the context of the usage.
  • “about 100” refers to any number from 90 to 110 and includes the number 100, unless otherwise indicated by the context in which the term is used.
  • the term “about” a range refers to that range minus 10% of its lowest value and plus 10% of its greatest value.
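The "about" convention defined above can be written out as a small helper. This is only an illustrative sketch: the function names `about` and `about_range` are hypothetical and do not come from the source, and the 10% tolerance is the default stated in the definition.

```python
def about(value, tolerance=0.10):
    """Return the (low, high) interval covered by 'about value'."""
    delta = abs(value) * tolerance
    return value - delta, value + delta


def about_range(low, high, tolerance=0.10):
    """'About' a range: lowest value minus 10%, greatest value plus 10%."""
    return low - abs(low) * tolerance, high + abs(high) * tolerance


# "about 100" covers 90 to 110, inclusive of 100.
low, high = about(100)
print(low, high)
```

Context can override the default tolerance, which is why it is exposed here as a parameter rather than hard-coded.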
  • the terms “increased”, “increasing”, or “increase” are used herein to generally mean an increase by a statistically significant amount.
  • the terms “increased,” or “increase,” mean an increase of at least 10% as compared to a reference level, for example an increase of at least about 10%, at least about 20%, or at least about 30%, or at least about 40%, or at least about 50%, or at least about 60%, or at least about 70%, or at least about 80%, or at least about 90% or up to and including a 100% increase or any increase between 10-100% as compared to a reference level, standard, or control.
  • Other examples of "increase” include an increase of at least 2-fold, at least 5-fold, at least 10-fold, at least 20-fold, at least 50-fold, at least 100-fold, at least 1000-fold or more as compared to a reference level.
  • the term “decreased” means a reduction by at least 10% as compared to a reference level, for example a decrease by at least about 20%, or at least about 30%, or at least about 40%, or at least about 50%, or at least about 60%, or at least about 70%, or at least about 80%, or at least about 90% or up to and including a 100% decrease (e.g., absent level or non-detectable level as compared to a reference level), or any decrease between 10-100% as compared to a reference level.
  • when applied to a marker or symptom, these terms mean a statistically significant decrease in such level.
  • the decrease can be, for example, at least 10%, at least 20%, at least 30%, at least 40% or more, and is preferably down to a level accepted as within the range of normal for an individual without a given disease.
  • the term “adapter” or “portions of an adapter” refers to a chemically synthesized, single-stranded or double-stranded oligonucleotide that can be attached, e.g., covalently (e.g., by ligation or primer extension) or non-covalently (e.g., by hybridization), to the ends of nucleic acid molecules, such as DNA or RNA molecules.
  • Adapter sequences can be of any length.
  • Adapter can refer to either a full-length adapter or a portion of the adapter, e.g., partial adapters can be attached in some embodiments before the full-length adapters are introduced by, e.g., indexing primers in amplification steps.
  • 3'-end adapters and 5'-end adapters can be full-length or a portion of an adapter sequence that are attached to the opposite ends of a target nucleic acid, a copy of a target nucleic acid, or a target nucleic acid complement.
  • 3'-end adapter and 5'-end adapter sequences end up being attached to the opposite ends of, e.g., a template that can be sequenced that comprises a target nucleic acid, a copy of a target nucleic acid, and/or a target nucleic acid complement.
  • the 3'-end adapter and 5'-end adapter sequences can be the same or they can be different.
  • nucleic acid sequence is often expressed in “bases” or “bases in length.” For single-stranded nucleic acids, this indicates the number of nucleotides (nt). For double-stranded nucleic acids, this indicates the number of base pairs (bp). Sometimes, “bases” is used interchangeably with “nucleotides (nt)” or “base pairs (bp)” depending on the context.
  • detect refers to quantitative or qualitative detection, including, without limitation, detection by identifying the presence, absence, quantity, frequency, concentration, sequence, form, structure, origin, or amount of an analyte.
  • microbe generally refers to archaea, bacteria, fungi, protists, parasites, viruses, or other entities that are usually detectable using a microscope (e.g., an optical microscope or electron microscopy).
  • the term “microorganism” refers to a uni- or multi- cellular organism, such as, for example, a microscopic organism or macroscopic organism including but not limited to bacteria, fungi, protists, and parasites.
  • Microbes herein can be a prokaryote or a eukaryote. Microbes are often pathogens responsible for disease, but can also exist in a non-pathogenic, symbiotic, commensalistic, mutualistic, or amensalistic relationship with a host, such as a human.
  • Sequencing can be performed by various systems currently available, such as, without limitation, a sequencing system by Illumina®, Pacific Biosciences®, Oxford Nanopore®, Genia Technologies®, or Life Technologies® and others.
  • Such devices can provide a plurality of raw genetic data corresponding to the genetic information of a host (e.g., human), a non-host (e.g., a pathogen, an organ donor), a host-derived variant genetic sequence (e.g., a single nucleotide polymorphism), and/or combinations thereof as generated by the device from a sample provided by the subject.
  • the term “derived from” encompasses the terms “originated from,” “obtained from,” “obtainable from” and “created from,” and generally indicates that one specified material finds its origin in another specified material or has features that can be described with reference to the specified material.
  • a sample can be derived from a blood draw
  • a nucleic acid can be derived from a sample
  • a sequence read can be derived from sequencing a nucleic acid, or any combination thereof.
  • the phrase “uniformly distributed” refers to a distribution that is continuous or uniform between members of a family such that for each member of a family there is a predictable or symmetric interval between them.
  • non-uniformly distributed refers to a distribution of members of a family that does not have a predictable or symmetric interval between them.
  • copy number refers to the number of times a particular gene or genomic region is present in the genome of an organism. In some embodiments, “copy number” refers to the number of times a microbial gene or genomic region (e.g., an AMR gene) is present in a microbe. As used herein, “copy number” can also refer to “copy number variations (CNVs)” in eukaryotes.
  • cell-free nucleic acids (cfNAs) can be found in biological samples such as blood, urine, cerebrospinal fluid, and synovial fluid.
  • cfNAs can be free-floating, such as cfDNA fragments in plasma.
  • cfNAs can also be associated with cells but not contained within intact cells, as seen in vesicle-associated cfNA or nucleosome-associated cfNA.
  • cfNAs can arise from various biological processes, including cell death (apoptosis, necrosis), active secretion, or viral shedding.
  • cell-free sample refers to a sample devoid, or almost devoid, of cells.
  • the cell-free sample is devoid of any human cells.
  • the cell-free sample is devoid of any microbial cells.
  • the cell-free sample is devoid of all types of cells, including eukaryotic or prokaryotic cells.
  • the cell-free sample can be obtained from a biological sample provided herein.
  • the cell-free sample is a plasma, which is almost devoid of blood cells.
  • the cell-free sample can be obtained by the sample preparation process described herein, such as centrifuging a biological sample.
  • Microbial cell-free DNA was extracted from 250 µL of the plasma, converted to DNA libraries, and sequenced on an Illumina NextSeq® 500 or NovaSeq® at a CAP-accredited and CLIA-certified laboratory according to previously validated methods (Karius®, Redwood City, CA). Any of the over 1,000 organisms included in the Karius® clinical reportable range found to be present above a predefined statistical threshold were reported as previously described.
  • the quantity for each organism identified was expressed in molecules per microliter (MPM), representing the number of cfDNA molecules from the reported microorganism present per microliter of plasma (absolute quantity) and estimated deduplicated reads (EDR), representing the number of unique DNA sequencing reads from the reported microorganism present in the sequenced library (relative quantity).
  • WINC whole assay internal control
  • MRSA Methicillin-resistant S. aureus
  • MSSA Methicillin-susceptible S. aureus
  • the determination was based on the number of reads that align to a collection of SCCmec (44 SCCmec variants and 4 non-mec SCC variants), a 20-70 kb genetic element known to harbor the mecA or mecC genes.
  • the number of reads was compared to the number of reads mapping to the S. aureus genome and a statistical model was used to determine the likelihood of either the presence or absence of methicillin resistance.
  • indeterminate was also reported if the SCC in the clinical sample was likely to be a non-mec SCC or if SCC fragments in the clinical sample may have originated from a microbe other than S. aureus, as determined by comparing the abundances of a set of “interfering” species, known to harbor sequences highly homologous to SCCmec, to that of S. aureus.
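The source does not disclose the statistical model used by the SCCmec caller. The sketch below shows one plausible shape for such a comparison, a Poisson log-likelihood ratio between "SCCmec present" and "background only", purely as an illustration. Every name and number in it (`call_methicillin`, the 40 kb element length, the 2.8 Mb genome length, the background rate, and the decision threshold) is an assumption and not the patented method. The non-mec SCC and interfering-species checks described above are not modeled.

```python
import math


def poisson_logpmf(k, lam):
    """Log of the Poisson probability of observing k events at rate lam."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def call_methicillin(sccmec_reads, saureus_reads,
                     scc_len=40_000,        # assumed; source gives 20-70 kb
                     genome_len=2_800_000,  # assumed S. aureus genome size
                     background=0.5,        # assumed background read rate
                     threshold=3.0):        # assumed decision threshold
    """Return 'MRSA', 'MSSA', or 'indeterminate' (illustrative only)."""
    # If SCCmec is present, its reads should scale with S. aureus reads
    # by the ratio of element length to genome length.
    expected_present = saureus_reads * scc_len / genome_len + background
    llr = (poisson_logpmf(sccmec_reads, expected_present)
           - poisson_logpmf(sccmec_reads, background))
    if llr > threshold:
        return "MRSA"
    if llr < -threshold:
        return "MSSA"
    return "indeterminate"


print(call_methicillin(sccmec_reads=30, saureus_reads=2000))
print(call_methicillin(sccmec_reads=0, saureus_reads=2000))
print(call_methicillin(sccmec_reads=0, saureus_reads=50))
```

With few S. aureus reads, zero SCCmec reads is weak evidence of absence, so the sketch returns an indeterminate result in that regime.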
  • the number of unique template molecules was estimated from the amplicon alignments for each marker and housekeeping gene and from the unique molecular identifiers for the quality control oligonucleotides. Background EDT levels of each marker and housekeeping gene were estimated per sequencing batch using the batch controls. An upper bound estimate of background signal was subtracted from the observed EDT for each gene to generate a corrected EDT estimate.
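The background correction described above can be sketched as follows. The source states only that an upper-bound estimate of background is subtracted from the observed EDT; the specific upper bound used here (mean plus `k` standard deviations of the batch controls) and the floor at zero are assumptions made for illustration, as are the function names.

```python
from statistics import mean, stdev


def background_upper_bound(control_edts, k=3.0):
    """Assumed upper bound: mean + k * SD of batch-control EDT values."""
    if not control_edts:
        return 0.0
    if len(control_edts) < 2:
        return float(control_edts[0])
    return mean(control_edts) + k * stdev(control_edts)


def corrected_edt(observed_edt, control_edts, k=3.0):
    """Subtract the background upper bound, never going below zero."""
    return max(0.0, observed_edt - background_upper_bound(control_edts, k))


batch_controls = [2, 4, 6]                # hypothetical control EDT values
print(corrected_edt(25, batch_controls))  # signal well above background
print(corrected_edt(5, batch_controls))   # signal within background
```

Subtracting an upper bound rather than the mean is conservative: it reduces false presence calls at the cost of a higher presence LoD for genes with noisy reagent background.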
  • Corrected AMR marker genes and WINC EDT were used as input into the AMR caller statistical model.
  • For each AMR marker gene, the AMR caller identifies all of the pathogens detected by the test known to carry that gene, referred to here as “carriers”, and their respective MPM values.
  • the statistical model of the AMR caller evaluates the likelihood of all the possible combinations of carriers and AMR gene copy number given an established prior to output a probability of linkage of AMR to each carrier. For carriers that qualified for the follow-on AMR testing, the test AMR caller reported:
  • the limit of detection (LoD) of the MRSA/MSSA calls made by the SCCmec caller was established by measuring sensitivity using in-silico modeling based on an established dilution series of an S. aureus genome spiked into a human cfDNA background representing the 10th percentile of human cfDNA concentration observed in commercial samples.
  • the 10 half-log dilutions ranged from 316,000 MPM down to 10 MPM, in addition to un-spiked samples.
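A half-log series steps down by a factor of 10^0.5 (about 3.16) per level. The short sketch below reproduces the ten levels stated above, from roughly 316,000 MPM (10^5.5) down to 10 MPM (10^1); the function name is hypothetical.

```python
def half_log_series(top_exponent=5.5, n_levels=10):
    """Concentrations spaced half a log10 unit apart, highest first."""
    return [10 ** (top_exponent - 0.5 * i) for i in range(n_levels)]


levels = half_log_series()
for mpm in levels:
    print(f"{mpm:,.0f} MPM")
```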
  • Ten genome assemblies were chosen (5 MRSA and 5 MSSA) that were geographically and genotypically diverse, publicly available, and tested for phenotypic susceptibility (Table 1).
  • an in-silico dilution series was created from the established dilution series by swapping existing reads with simulated reads from the genome.
  • a probit analysis was performed to establish a 95% LoD for each genome after subsampling to the unique WINC QC minimum (25,000) as well as the typical level (300,000) observed in production samples.
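As a rough illustration of how a 95% LoD can be read off a dilution series, the sketch below fits a straight line to probit-transformed hit rates against log10 concentration and solves for the 95% point. This is a simplified least-squares variant, not the maximum-likelihood probit regression a validation study would normally use, and the source does not specify its fitting procedure. The continuity adjustment, the function name, and the example data are all assumptions.

```python
import math
from statistics import NormalDist, mean


def probit_lod(concentrations, hits, replicates, target=0.95):
    """Estimate the concentration at which `target` of replicates are called."""
    nd = NormalDist()
    xs, zs = [], []
    for conc, hit, n in zip(concentrations, hits, replicates):
        # Continuity adjustment (an assumption) keeps 0% and 100% finite.
        rate = (hit + 0.5) / (n + 1.0)
        xs.append(math.log10(conc))
        zs.append(nd.inv_cdf(rate))
    # Ordinary least squares of probit(rate) on log10(concentration).
    mx, mz = mean(xs), mean(zs)
    slope = (sum((x - mx) * (z - mz) for x, z in zip(xs, zs))
             / sum((x - mx) ** 2 for x in xs))
    intercept = mz - slope * mx
    return 10 ** ((nd.inv_cdf(target) - intercept) / slope)


# Hypothetical dilution series: 20 replicates per level.
mpm = [10, 31.6, 100, 316, 1000]
called = [1, 5, 12, 18, 20]
lod95 = probit_lod(mpm, called, [20] * 5)
print(f"95% LoD is roughly {lod95:.0f} MPM")
```

When the call rate jumps sharply from 0% to more than 95% between adjacent levels, a fit of this kind is poorly constrained, which matches the poor probit fits the source later reports for MSSA.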
  • the final LoD values presented in S. aureus MPM and EDR were averaged across the genomes.
  • the LoD for calling AMR presence and absence was calculated using contrived laboratory samples containing multiplexed mixtures of sheared genomes of microbes carrying the target AMR genes from the panel (Supplemental Table 2).
  • the multiplexed mixtures were spiked in 8 half-log dilutions into a low human cfDNA plasma background (Supplemental Table 3).
  • the vanB variant carried by the ATCC strain used in the genome mix shares 95% nucleotide identity with the vanB used to design targeting primers; 9 of 16 primer-binding sites contain a mutation. To compensate for the loss of signal due to imperfect primer binding sites, vanB concentration in the genome mixes was selectively increased five-fold.
  • LoD values for presence of vanB were reported as a function of this increased concentration.
  • samples from the dilution series of a “partner” genome were used: mecA(-)/S. aureus mecC(+), vanA(-)/E. faecalis vanB(+), vanB(-)/E. faecium vanA(+), blaKPC(-)/E. coli blaCTX-M(+), and blaCTX-M(-)/K. pneumoniae blaKPC(+).
  • Replicates were tested over 10 assay runs across 10 days to incorporate variance in typical laboratory conditions.
  • the LoD was estimated using a probit analysis and reported in units of pathogen MPM.
  • each read set was subsampled to 300,000, 3.4 million, and 7.8 million reads, representing the 10th, 25th, and 50th percentile of typical clinical sample sequencing depths (Supplemental Fig. 1), and the LoD was recalculated. Lower sequencing depth is expected to increase the LoD.
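Downsampling a read set to a fixed depth can be sketched as random sampling without replacement. The helper below is illustrative: its name, the fixed seed, and the toy read identifiers are assumptions. In the study the target depths were 300,000, 3.4 million, and 7.8 million reads.

```python
import random


def subsample_reads(reads, depth, seed=0):
    """Randomly keep `depth` reads without replacement (all if fewer exist)."""
    if depth >= len(reads):
        return list(reads)
    rng = random.Random(seed)  # fixed seed so the subsample is reproducible
    return rng.sample(reads, depth)


reads = [f"read{i}" for i in range(1000)]  # toy stand-in for a read set
subset = subsample_reads(reads, 300)
print(len(subset))
```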
  • the calibrated background subtraction method was applied based on non-down sampled batch controls to down sampled LoD samples. After evaluating all sample quality control metrics, two replicates were removed from all subsequent analyses that use LoD samples due to failure of the cross-contamination quality control metrics.
  • in-silico samples were created using available AMR alleles from curated NCBI® nucleotide sequences available in the Comprehensive Antibiotic Resistance Database (CARD).
  • an allele from the desired on-panel AMR gene was chosen from the set of available nucleotide sequences at random. From the LoD experiment, the empirically tested concentration closest to, but not below, twice the LoD for calling the presence of the gene at median sequencing depth was selected.
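The selection rule above, "closest to, but not below, twice the LoD", amounts to taking the smallest tested concentration that is at least 2 × LoD. A minimal sketch follows, with a hypothetical function name and hypothetical tested concentrations:

```python
def pick_concentration(tested_mpm, lod_mpm):
    """Smallest tested concentration that is >= 2 x LoD, or None if none qualifies."""
    eligible = [c for c in tested_mpm if c >= 2 * lod_mpm]
    return min(eligible) if eligible else None


tested = [10, 31.6, 100, 316, 1000, 3160]   # hypothetical half-log levels
print(pick_concentration(tested, lod_mpm=425))
```

Returning `None` when no tested level reaches twice the LoD makes the edge case explicit rather than silently falling back to a lower concentration.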
  • the AMR target amplicons and their corresponding AMR target primers, referred to as “activated” primers (i.e., primers that successfully amplified a template molecule), were identified.
  • the nucleotide sequences of the amplicons were then swapped with nucleotide sequences corresponding to the chosen AMR allele.
  • the inclusivity of housekeeping genes was considered (Supplemental Methods), and the housekeeping genes were found to have high percent identity matches to all the target species assemblies tested. Given the high inclusivity of housekeeping genes for their target species, the “read swapping” protocol was not carried out for them. For each on-panel AMR gene, 50 such in-silico samples were created.
  • AMR genes and primers were mapped to microbe, fungi, and viral reference genomes from the Test pathogen reference genome database and the human reference genome to generate a list of cross-reactive sequences for each on-panel AMR gene. High identity matches (>95% identity, >99% coverage) to the AMR target genes or alleles of the AMR target genes are excluded.
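The exclusion rule above can be expressed as a filter over alignment hits: hits with more than 95% identity and more than 99% coverage are treated as the target gene or one of its alleles and dropped, and the rest are kept as candidate cross-reactive sequences. The `Hit` record and its field names are hypothetical, chosen only for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    subject: str      # name of the aligned reference sequence
    identity: float   # percent identity of the alignment
    coverage: float   # percent of the AMR gene covered by the alignment


def candidate_cross_reactive(hits, max_identity=95.0, max_coverage=99.0):
    """Drop hits that look like the AMR target itself; keep the rest."""
    return [h for h in hits
            if not (h.identity > max_identity and h.coverage > max_coverage)]


hits = [
    Hit("target allele", 99.5, 100.0),   # excluded: true target
    Hit("off-target homolog", 88.0, 99.5),
    Hit("partial match", 97.0, 60.0),
]
kept = candidate_cross_reactive(hits)
print([h.subject for h in kept])
```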
  • an AMR (-) genome from the appropriate target species or a cross-reactive sequence from the above lists was chosen. Then, the empirically tested concentration closest to, but not below, twice the LoD for calling the absence of the AMR gene at median sequencing depth from the LoD experiment was selected. Using relaxed parameters that allow less-than-perfect primer matches, the activated primers were mapped to the chosen AMR (-) genome or cross-reactive sequence to identify putative primer-binding regions and a “read swapping” protocol was performed as described above. This is a conservative approach that does not fully account for the selectivity of primers in a reaction. 100 such in-silico samples were created per AMR gene (50 with AMR-negative genomes and 50 with putative cross-reactive sequences).
  • a total of 115 residual patient plasma samples which had one of the 18 target pathogens detected in the Test, concordant culture results, applicable orthogonal AST results, and adequate residual volume were included in the clinical validation study.
  • the study sites, number of samples with orthogonal data by site, and the data or sample collection protocols that met the inclusion criteria are shown in Table 2.
  • the orthogonal culture results were obtained from a wide variety of sample types including 58 blood, 8 sputum, 7 BAL, 7 wound, 5 tissue/aspirate, and 2 urine.
  • Descriptive statistics used to assess test performance characteristics were calculated by accepted methods.
  • Diagnostic yield (DY) was defined as the percentage of tests that yielded an actionable result either detected or not detected.
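Diagnostic yield as defined above is a simple proportion. The sketch below uses a hypothetical function name and hypothetical counts, and it assumes that every non-actionable result is an indeterminate call.

```python
def diagnostic_yield(detected, not_detected, indeterminate):
    """Percentage of tests with an actionable (detected / not detected) result."""
    total = detected + not_detected + indeterminate
    if total == 0:
        raise ValueError("no tests to evaluate")
    return 100.0 * (detected + not_detected) / total


# Hypothetical counts, not taken from the study.
print(diagnostic_yield(detected=40, not_detected=45, indeterminate=15))
```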
  • the copy number of the AMR genes was estimated using the Test sequencing data (without targeting) for 100,000 MPM of contrived sheared pathogen genome carrying the AMR gene.
  • the copy number was estimated using known ratios between the AMR gene and the pathogen genome length, the Test estimated pathogen read count, and the AMR read counts from a custom alignment to the AMR genes (Supplemental Table 2).
  • the observed AMR gene EDT was scaled by the copy number (e.g., with five gene copies, 20 EDT becomes 4 EDT), and the LoD was recalculated.
  • Each gene copy was assumed to contribute the same EDT signal as all the others, which is a conservative assumption, given that it does not allow for attribution of most of the EDT signal to a single gene copy.
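One way to read the copy number estimate and the per-copy scaling described above is as a ratio of read densities followed by an equal split of signal across copies. The sketch below follows that reading; the function names and the example read counts and lengths are assumptions, while the 20 EDT over five copies example comes from the text.

```python
def estimate_copy_number(amr_reads, amr_gene_len, pathogen_reads, genome_len):
    """Ratio of read density on the AMR gene to read density on the genome."""
    amr_density = amr_reads / amr_gene_len
    genome_density = pathogen_reads / genome_len
    return amr_density / genome_density


def per_copy_edt(observed_edt, copy_number):
    """Split the EDT signal equally across gene copies."""
    return observed_edt / copy_number


# Example from the text: five gene copies, 20 EDT becomes 4 EDT.
print(per_copy_edt(20, 5))

# Hypothetical counts: 10 reads on a 1 kb gene, 20,000 reads on a 5 Mb genome.
print(estimate_copy_number(10, 1_000, 20_000, 5_000_000))
```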
  • Housekeeping gene targets for S. aureus, S. epidermidis, and E. faecium were used in the targeted AMR gene assay as a secondary check for species presence and judgment of AMR presence/absence. If the ratio of AMR EDT to housekeeping gene EDT for one of the three species departed from the expected relationship, the AMR call was “indeterminate”. For example, lack of adequate housekeeping gene EDT would lower confidence in an AMR absence call, making it indeterminate.
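The secondary check described above can be sketched as a gate applied after the primary AMR call. The source does not give the expected AMR-to-housekeeping relationship or any thresholds, so the minimum housekeeping EDT and the acceptable ratio window below are placeholders, and the function name is hypothetical.

```python
def housekeeping_check(amr_call, amr_edt, hk_edt,
                       min_hk_edt=10.0,           # assumed minimum signal
                       ratio_range=(0.1, 10.0)):  # assumed acceptable window
    """Downgrade an AMR call to 'indeterminate' if housekeeping signal disagrees."""
    # Too little housekeeping signal: species presence is not confirmed,
    # so neither a presence nor an absence call can be trusted.
    if hk_edt < min_hk_edt:
        return "indeterminate"
    if amr_call == "detected":
        low, high = ratio_range
        ratio = amr_edt / hk_edt
        if not (low <= ratio <= high):
            return "indeterminate"
    return amr_call


print(housekeeping_check("not detected", amr_edt=0, hk_edt=2))
print(housekeeping_check("detected", amr_edt=50, hk_edt=40))
print(housekeeping_check("detected", amr_edt=500, hk_edt=20))
```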
  • Housekeeping genes were chosen from species-specific core protein coding genes and filtered to ones where the nucleotide sequence would not cross-react with genomes of other pathogens. Only housekeeping genes for S. aureus, S. epidermidis, and E. faecium showed high species specificity and were then incorporated into the targeted AMR gene assay.
  • Housekeeping genes were checked against the Test pathogen reference genome database and evaluated for the percent identity, alignment length, and number of species reference genome assemblies they aligned to. An ideal species-specific housekeeping gene would align fully at 100% identity to all of the species reference genomes in the database, but no other genomes.
  • Except for C-SA, all the chosen housekeeping genes align fully with almost 100% identity to all the available target species assemblies in the database (Supplemental Table 5). C-SA is one of 9 housekeeping genes for S. aureus and its presence or absence does not strongly bias the total housekeeping EDT recovery.
  • Table 1 Target Bacteria and associated AMR markers.
  • Table 2. Study sites providing data or specimens for the clinical validation of the Test AMR markers.
  • Table 3 Limit of detection characterization for SCCmec.
  • Table 4 LoD by target and sequencing depth for the AMR gene panel.
  • Table 11 Correlation of blaCTX-M results with ESBL phenotype for 49 Gram-negative bacilli detections. The 95% confidence intervals for performance metric estimates are in brackets.
  • a Yes indicates resistance to at least one of the following antimicrobials: cefotaxime, ceftazidime, ceftriaxone, cefepime (oxyimino-beta-lactams), or aztreonam (monobactam); No indicates susceptible or intermediate results for all key indicator drugs. Not all drugs were tested for each sample.
  • b Includes 4 detections of Escherichia coli and 1 detection of Klebsiella pneumoniae.
  • c Includes 3 detections of Enterobacter cloacae complex and 1 detection of E. coli.
  • FN false negative
  • TN true negative
  • S susceptible
  • R resistant
  • b ESBL and AmpC β-lactamases can be distinguished phenotypically based on differential susceptibility to cefepime.
  • Supplemental Table 1 Staphylococcus aureus genomes used for SCCmec LoD assessment.
  • the presence and absence LoD in MPM were sensitive to sequencing depth, with the LoD increasing approximately 10-fold at the minimum depth. This is expected, as higher pathogen abundance is needed to make a determination in a shallowly sequenced sample.
  • the presence LoD is 3,916 MPM and absence LoD is 683 MPM.
  • LoD was not strongly sensitive to lower sequencing depths - this is because the targeted amplification method creates many amplicons from the original fragment making “surplus” signal.
  • Presence LoD is largely influenced by the length of the targeted region on the gene, as well as the gene copy number and gene variant.
  • the vanB presence LoD is a conservative estimate, a limitation imposed by the vanB gene variant in the genome mixture (see Methods). Since only one microbial genome was assessed to determine the gene LoD, LoD was also characterized assuming one copy of the AMR gene (Table 4), to give a conservative upper bound for the LoD as several of the panel AMR genes were carried in multiple copies by the microbes (Table 2). For AMR genes that were carried in multiple copies, except blaCTX-M, LoD increased when assuming one copy.
  • LoD was also impacted by the presence of reagent background signal (Supplemental Fig. 2), especially the absence LoD.
  • For blaCTX-M, downsampling to lower sequencing depths decreased the blaCTX-M reagent background signal, leading to more effective background subtraction and improved probit fits to the observed data.
  • The blaCTX-M absence LoD could not be estimated using a probit model for samples downsampled to the 25th percentile of sequencing depth because there was no concentration at which there was a >95% absence call rate.
  • Inclusivity for blaCTX-M was limited to 54% in this analysis (based on a single sampling of 50 alleles) because the targeted approach has primers with perfect homology only to a single clade of blaCTX-M variants (blaCTX-M-3 clade), but not the other highly divergent clades (Fig. 2).
  • the sensitivity for blaCTX-M variants is expected to be 67-75%.
  • Common blaCTX-M variants captured by AMR gene enrichment include blaCTX-M-3, blaCTX-M-15, blaCTX-M-32, blaCTX-M-55, and blaCTX-M-1.
  • Common CTX-M variants poorly captured or missed by AMR gene enrichment include blaCTX-M-2, blaCTX-M-9, blaCTX-M-14, blaCTX-M-27, and blaCTX-M-65. Except for the highly divergent blaCTX-M, the AMR assay is largely insensitive to variation in gene sequences.
  • Test AMR detection assays may be impeded by the presence of non-AMR sequences that are a match for the primers and similar in sequence to the AMR gene.
  • an in-silico approach was chosen to assess a wide array of possible cross-reactive sequences. An exclusivity of 100% was observed (table in FIG. 7), showing high assay specificity for the AMR markers tested. From a literature and database search, two allotypes of mecC were identified, mecC1 and mecC2, which have 93.7% and 96.3% sequence identity to mecC, respectively. When simulating samples using these allotypes, mecC was called in all simulations, due to the very high sequence similarity.
  • the single AMR gene tests required PCR amplification prior to sequencing to enrich target sequences.
  • Other mNGS assays have also used selective enrichment of AMR genes prior to sequencing to enhance their detection.
  • the LoDs for both the presence and absence of all AMR markers were determined. In a typical production sample (unique WINC 300,000), the presence LoD is 3,916 MPM and absence LoD is 683 MPM for the SCCmec. The higher presence LoD is likely due to interfering species in the human background which cause no-calls for MRSA assemblies when SCCmec and S. aureus are low.
  • the probit fits did not fit the observed data patterns very well for MSSA because there was a sharp boundary between no absence calls and >95% absence calls.
  • the presence LoDs ranged from 425 to 6,107 MPM and the absence LoDs ranged from 472 to 22,680 MPM for the single AMR gene assays.
  • the vanB and blaCTX-M assays represented the biggest LoD challenges for the presence and absence of these genes at 6,107 and 22,680 MPM, respectively.
  • the vanB variant carried by the ATCC strain used in the genome mix shares 95% nucleotide identity with the vanB used to design targeting primers. The loss of signal due to imperfect primer-binding sites was compensated for by increasing the vanB concentration in the genome mix five-fold.
  • the high LoD for blaCTX-M was a result of elevated reagent background signal for this gene. All of the AMR marker assays had within- and between-run precision of 100%.
  • the assays not only detect vanA, vanB, mecA, and mecC but also differentiate between them. While vanA encodes resistance to vancomycin and other glycopeptides, vanB encodes resistance to vancomycin only. Although both mecA and mecC encode resistance to methicillin, mecC is much less common and more difficult to detect phenotypically, and as a result may be missed and reported as MSSA. In addition, this feature may increase the understanding of the prevalence of mecC. However, neither vanB nor mecC genes were detected in the clinical validation samples.
  • the Test AMR marker detection covers 8 of 10 bacterial species driving global AMR-attributable deaths and all of the ESKAPE pathogens, which are important drivers of multidrug resistance, and the panel of 7 markers detects resistance to 5 different classes of antibiotics, including anti-staphylococcal semisynthetic penicillins, glycopeptides, oxyimino-cephalosporins, aztreonam and carbapenems. Resistance to these drugs is among CDC’s top urgent and serious AMR threats in the US. In addition, it was estimated from a commercial sample database that the target bacteria comprise approximately 18% of all reported detections and thus could be eligible for AMR marker detection.
  • the estimate of the PPA for blaCTX-M was only 55.5% based on the initial criteria of resistance to any oxyimino-beta-lactam or aztreonam to label samples as putative ESBL producers.
  • ESBLs are not readily distinguishable phenotypically from each other or from de-repressed or plasmid-encoded AmpC β-lactamases.
  • ESBL and AmpC β-lactamases can be distinguished phenotypically based on differential susceptibility to cefepime.
  • Cefepime is usually not hydrolyzed by AmpC β-lactamases, whereas ESBL producers most often have elevated cefepime MICs and are interpreted as susceptible dose-dependent or resistant to cefepime.
  • the PPA and DY improved to 83.3% and 97.1%, respectively.
  • the remaining false negative was an E. cloacae complex detection that was resistant to cefepime, ceftriaxone and aztreonam.
  • OXA-48-like and metallo-β-lactamase (MBL) encoding genes were detected among seven (1.7%) and 12 (2.7%) isolates, respectively.
  • Among MBL genes, blaNDM-1 was the most common, but blaNDM-5, blaVIM-1 and blaIMP-1 were also identified.
  • One hundred and sixty-nine (37.6%) isolates did not produce carbapenemases.
  • orthogonal phenotypic AST was performed with the Vitek2 and MicroScan Walkaway systems. In some embodiments, these systems are used by clinical microbiology laboratories. In some embodiments, these systems provide a better estimate of the performance of these assays in clinical practice, such as compared to other clinical methods.
  • DY was defined for each AMR marker as the percentage of tests that yielded a clinically actionable result (detected/not detected) and ranged from 56.8% for blaKPC to 83.3% for vanA. While some might consider the DY low, the assays were optimized for accuracy. This is best demonstrated by the results for methicillin resistance in staphylococci and cefepime resistance in gram-negative bacilli in which the CA were estimated to be 95.2% and 97.1%, respectively.
  • Based on earlier identification of the cause of infection and increased diagnostic yield, mcfDNA sequencing results led to a change in antimicrobial management in 32%-72% of the reported cases.
  • The addition of AMR markers to the Test will provide clinicians more opportunities for optimization of therapy; the greatest clinical impact is anticipated in those patients with infections in which the microbe is detected only by mcfDNA sequencing and in immunocompromised hosts who are particularly vulnerable to infections with multidrug-resistant bacteria.
  • Faster identification of drug-resistant organisms has the potential to prevent the spread of multidrug resistance, improve patient outcomes by getting patients on the right targeted therapy faster, and potentially reduce mortality and hospitalization costs.
  • the results can be reported within a clinically relevant time frame with results being available 1 to 3 days after receipt of the sample in the laboratory.

Abstract

The invention relates to improved methods and systems for detecting AMR genetic markers using multi-step sequencing assays. The invention also relates to methods and systems for preparing samples for use in high-throughput sequencing assays to detect AMR genetic markers.
PCT/US2025/016120 2024-02-16 2025-02-14 Multi-step methods for detecting nucleic acids Pending WO2025175229A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202463554922P 2024-02-16 2024-02-16
US63/554,922 2024-02-16

Publications (1)

Publication Number Publication Date
WO2025175229A1 true WO2025175229A1 (fr) 2025-08-21

Family

ID=94968914

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2025/016120 Pending WO2025175229A1 (fr) 2024-02-16 2025-02-14 Multi-step methods for detecting nucleic acids

Country Status (1)

Country Link
WO (1) WO2025175229A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US12601089B2 (en) 2018-11-21 2026-04-14 Karius, Inc. Direct-to-library methods, systems, and compositions

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019178157A1 (fr) * 2018-03-16 2019-09-19 Karius, Inc. Sample series to differentiate target nucleic acids from contaminant nucleic acids
US20210301356A1 (en) * 2013-11-07 2021-09-30 The Board Of Trustees Of The Leland Stanford Junior University Cell-free nucleic acids for the analysis of the human microbiome and components thereof
WO2022150725A1 (fr) * 2021-01-11 2022-07-14 Karius, Inc. Rapid non-invasive detection and serial monitoring of infections in subjects using microbial cell-free DNA sequencing

Non-Patent Citations (9)

* Cited by examiner, † Cited by third party
Title
BOOLCHANDANI MANISH ET AL: "Sequencing-based methods and resources to study antimicrobial resistance", NATURE REVIEWS GENETICS, NATURE PUBLISHING GROUP, GB, vol. 20, no. 6, 18 March 2019 (2019-03-18), pages 356 - 370, XP036785112, ISSN: 1471-0056, [retrieved on 20190318], DOI: 10.1038/S41576-019-0108-4 *
DONGSHENG HAN: "Liquid biopsy for infectious diseases: a focus on microbial cell-free DNA sequencing", THERANOSTICS, vol. 10, no. 12, 1 January 2020 (2020-01-01), AU, pages 5501 - 5513, XP093169327, ISSN: 1838-7640, DOI: 10.7150/thno.45554 *
EICHENBERGER EMILY M ET AL: "Microbial Cell-Free DNA Identifies the Causative Pathogen in Infective Endocarditis and Remains Detectable Longer Than Conventional Blood Culture in Patients with Prior Antibiotic Therapy", CLINICAL INFECTIOUS DISEASES, vol. 76, no. 3, 10 June 2022 (2022-06-10), US, pages e1492 - e1500, XP093273084, ISSN: 1058-4838, Retrieved from the Internet <URL:https://academic.oup.com/cid/article-pdf/76/3/e1492/50496787/ciac426.pdf> DOI: 10.1093/cid/ciac426 *
GOVENDER KUMEREN N. ET AL: "Metagenomic Sequencing as a Pathogen-Agnostic Clinical Diagnostic Tool for Infectious Diseases: a Systematic Review and Meta-analysis of Diagnostic Test Accuracy Studies", JOURNAL OF CLINICAL MICROBIOLOGY, vol. 59, no. 9, 18 August 2021 (2021-08-18), US, XP093273429, ISSN: 0095-1137, Retrieved from the Internet <URL:https://journals.asm.org/doi/pdf/10.1128/JCM.02916-20> DOI: 10.1128/JCM.02916-20 *
LEWINSKI MICHAEL A. ET AL: "Exploring the Utility of Multiplex Infectious Disease Panel Testing for Diagnosis of Infection in Different Body Sites", THE JOURNAL OF MOLECULAR DIAGNOSTICS, vol. 25, no. 12, 1 December 2023 (2023-12-01), pages 857 - 875, XP093273164, ISSN: 1525-1578, DOI: 10.1016/j.jmoldx.2023.08.005 *
LINDNER ET AL., NUCL. ACIDS RES, vol. 41, no. 1, 2013, pages e10
ONICIUC ELENA A. ET AL: "The Present and Future of Whole Genome Sequencing (WGS) and Whole Metagenome Sequencing (WMS) for Surveillance of Antimicrobial Resistant Microorganisms and Antimicrobial Resistance Genes across the Food Chain", GENES, vol. 9, no. 5, 22 May 2018 (2018-05-22), US, pages 268, XP093273770, ISSN: 2073-4425, DOI: 10.3390/genes9050268 *
RODINO KYLE G. ET AL: "Status check: next-generation sequencing for infectious-disease diagnostics", THE JOURNAL OF CLINICAL INVESTIGATION, vol. 134, no. 4, 15 February 2024 (2024-02-15), US, XP093273737, ISSN: 1558-8238, DOI: 10.1172/JCI178003 *
SMOLLIN MATTHEW ET AL: "#35: Rapid, Non-invasive Detection and Serial Monitoring of Invasive Fungal Infections in Immunocompromised Children Using the Karius Test (a Plasma-based Microbial Cell-free DNA Sequencing Test)", JOURNAL OF THE PEDIATRIC INFECTIOUS DISEASES SOCIETY, vol. 10, no. Supplement_2, 28 June 2021 (2021-06-28), pages S2 - S3, XP093273118, ISSN: 2048-7207, DOI: 10.1093/jpids/piab031.004 *

Similar Documents

Publication Publication Date Title
US20210403986A1 (en) Detection and prediction of infectious disease
Li et al. High‐throughput metagenomics for identification of pathogens in the clinical settings
US20240409989A1 (en) Methods for assessing risk using total and specific cell-free dna
US20240344111A1 (en) Sample series to differentiate target nucleic acids from contaminant nucleic acids
CN108368542B (zh) Methods for genome assembly, haplotype phasing, and target-independent nucleic acid detection
JP2024069344A (ja) Diagnosis of sepsis
US20230040907A1 (en) Diagnostic assay for urine monitoring of bladder cancer
CN104212890A (zh) Method for diagnosing infectious disease pathogens and their drug susceptibility
CN106029903A (zh) Methods and probes for identifying alleles of genes
WO2017187178A1 (fr) Method for the detection of carbapenemase-producing bacteria
WO2025175229A1 (fr) Multi-step methods for detecting nucleic acids
WO2024006765A2 (fr) Universal method for parasite and eukaryotic endosymbiont identification
EP3377654B1 (fr) Systems and methods for identifying and distinguishing genetic samples
WO2025160484A1 (fr) Microbial and human cell-free DNA biomarkers for diagnosing and assessing the severity of inflammatory bowel disease
WO2023097231A1 (fr) Multiplexed fluorescence in situ hybridization method capable of rapidly detecting billions of targets
WO2025231437A1 (fr) Prediction of cytokine release syndrome during CAR-T cell therapy
KR20230110606A (ko) Method for identifying infectious agents

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 25711338

Country of ref document: EP

Kind code of ref document: A1