WO2025072084A1 - Updating records based on consensus annotations of genetic variants - Google Patents

Updating records based on consensus annotations of genetic variants

Info

Publication number
WO2025072084A1
WO2025072084A1 PCT/US2024/047949 US2024047949W WO2025072084A1 WO 2025072084 A1 WO2025072084 A1 WO 2025072084A1 US 2024047949 W US2024047949 W US 2024047949W WO 2025072084 A1 WO2025072084 A1 WO 2025072084A1
Authority
WO
WIPO (PCT)
Prior art keywords
population
records
annotation
processors
variant
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/US2024/047949
Other languages
English (en)
Inventor
Garrett Michael Frampton
Dexter JIN
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Foundation Medicine Inc
Original Assignee
Foundation Medicine Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Foundation Medicine Inc
Publication of WO2025072084A1

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/20 Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/40 Population genetics; Linkage disequilibrium
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B50/00 ICT programming tools or database systems specially adapted for bioinformatics
    • G16B50/10 Ontologies; Annotations
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/156 Polymorphic or mutational markers
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H15/00 ICT specially adapted for medical reports, e.g. generation or transmission thereof
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H20/00 ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance
    • G16H20/10 ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance relating to drugs or medications, e.g. for ensuring correct administration to patients

Definitions

  • Genetic mutations and other variants are relevant to human health.
  • specific types of genetic variants are associated with common characteristics.
  • a specific type of genetic variant may be associated with a particular type or subtype of cancer.
  • a specific type of genetic variant is associated with cancers that are resistant to some therapies and responsive to other therapies. These associations can be highly pertinent to cancer treatment and management.
  • a sample obtained from a patient can be analyzed and sequenced in order to identify genetic variants of the patient.
  • a report summarizing the genetic variants, and any known associations between the genetic variants and clinically relevant characteristics, can be generated and output.
  • on-going cancer research leads to new relevant information being generated rapidly.
  • the characteristics of a given genetic variant reported at one time may be quickly out-of-date due to additional research findings.
  • FIG. 1 illustrates an example environment for efficiently distributing updated annotations of genetic variants.
  • FIG. 2 illustrates example records stored in at least one patient database.
  • FIG. 3 illustrates an example report summarizing predicted categories of a cancer of a subject.
  • FIG. 4 illustrates an example process for updating annotations in patient records.
  • FIG. 5 illustrates an example environment for sequencing various nucleic acid molecules.
  • FIG. 6 illustrates one or more devices configured to perform various operations described herein.
  • Various implementations of the present disclosure relate to techniques for identifying and distributing consensus annotations associated with genetic variants. Records of multiple samples associated with the same genetic variant may be stored in one or more databases. Each record may include one or more annotations associated with the variant, which can describe associations between the variant and various pathologies (e.g., types or subtypes of cancers), whether the pathologies are responsive to treatments, whether the pathologies are resistant to treatments, and other clinically relevant characteristics. As additional instances of the genetic variant are identified in a population, additional records may be added. Earlier records may, however, indicate out-of-date annotations in view of later records. It may be beneficial to update the earlier records with updated annotations based on the later records.
  • a live database may be designed that includes entries corresponding to variant identifiers, rather than entries corresponding to instances of observed variants. While this may minimize memory resources, it presents other problems.
  • implementations of the present disclosure enable automatically updating patient records without the use of a live database and while minimizing inconsistency between recent annotations. Accordingly, implementations of the present disclosure can reduce the amount of processing and memory resources utilized to track records of genetic variants throughout a population. Further, implementations of the present disclosure relate to identifying whether recent annotations represent consensus annotations, prior to distributing the consensus annotations to earlier records that may not yet have included the consensus annotations or that may include older annotations that are now outdated in view of the consensus annotations. Thus, implementations may prevent the distribution of inaccurate annotations of genetic variants. For at least these reasons, implementations of the present disclosure provide significant improvements to the technical fields of bioinformatics, particularly related to genetic analysis.
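The update flow described above (check whether recent annotations agree, then propagate the agreed annotation to earlier records) can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the `Record` fields, the `find_consensus` and `update_records` helpers, the window of five most recent records, and the 0.8 agreement threshold are all hypothetical choices made for the example.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Record:
    sample_id: str
    variant_id: str   # hypothetical identifier shared by all instances of a variant
    annotation: str   # e.g., a clinically relevant characterization of the variant
    timestamp: int    # larger value = more recent record (assumed convention)


def find_consensus(records: List[Record], recent_n: int = 5,
                   min_agreement: float = 0.8) -> Optional[str]:
    """Return the annotation shared by at least `min_agreement` of the
    `recent_n` most recent records, or None if no annotation reaches it."""
    recent = sorted(records, key=lambda r: r.timestamp)[-recent_n:]
    if not recent:
        return None
    annotation, count = Counter(r.annotation for r in recent).most_common(1)[0]
    return annotation if count / len(recent) >= min_agreement else None


def update_records(records: List[Record], variant_id: str) -> int:
    """Overwrite differing annotations for one variant with the consensus
    annotation. Returns the number of records that were changed."""
    same_variant = [r for r in records if r.variant_id == variant_id]
    consensus = find_consensus(same_variant)
    if consensus is None:
        return 0  # no consensus among recent records: leave everything untouched
    changed = 0
    for record in same_variant:
        if record.annotation != consensus:
            record.annotation = consensus
            changed += 1
    return changed
```

In this sketch, records are only rewritten once recent annotations agree, which mirrors the stated goal of not distributing annotations that have not reached consensus.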
  • deoxyribonucleic acid may refer to a polymer of nucleotides (also referred to as “nucleobases”) containing deoxyribose.
  • the nucleotides in DNA include cytosine (C), guanine (G), adenine (A), and thymine (T).
  • Each DNA nucleotide includes a deoxyribose and a phosphate group.
  • An example single-stranded DNA (ssDNA) molecule includes a chain of covalently bonded DNA nucleotides.
  • the phosphate group of the mth nucleotide is covalently bonded to the deoxyribose of the (m-1)th nucleotide, wherein m is a positive integer greater than 2 and less than or equal to the number of DNA nucleotides in the chain.
  • DNA is double-stranded and includes two ssDNA molecules that are complementary to one another and coiled around each other in a double helix form.
  • the nucleotides of one ssDNA molecule are hydrogen bonded to the nucleotides of the other ssDNA molecule.
  • adenine (A) and thymine (T) hydrogen bond to each other
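Complementary pairing between two ssDNA molecules can be illustrated with a short sketch. The helper names `reverse_complement` and `are_complementary` are hypothetical, and the code relies on the standard pairing rules (A with T, C with G) and on sequences written 5' to 3'.

```python
# Standard DNA base pairing: A pairs with T, and C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Return the sequence of the complementary strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def are_complementary(strand_a: str, strand_b: str) -> bool:
    """True if two equal-length strands (both written 5' to 3') pair at every base."""
    return len(strand_a) == len(strand_b) and reverse_complement(strand_a) == strand_b.upper()
```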
  • ribonucleic acid may refer to a polymer of nucleotides containing ribose.
  • the nucleotides in RNA include cytosine (C), guanine (G), adenine (A), and uracil (U).
  • Each RNA nucleotide includes a ribose and a phosphate group.
  • in an RNA molecule, the phosphate group of the nth nucleotide is covalently bonded to the ribose of the (n-1)th nucleotide, wherein n is a positive integer greater than 2 and less than or equal to the number of RNA nucleotides in the chain.
  • Messenger RNA is a type of RNA molecule that is synthesized (or "transcribed”) by RNA polymerase (an enzyme) to be complementary to a gene encoded in a DNA sequence, and is also used by a ribosome to synthesize a polypeptide or protein.
  • mRNA is therefore an example of a “coding RNA.”
  • intron sequences are removed from an mRNA via a process known as "RNA splicing.”
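Transcription and splicing, as described above, can be sketched on strings. This is a simplified illustration: the function names, the convention of writing the DNA template strand 3' to 5', and the half-open intron intervals are assumptions made for the example.

```python
from typing import List, Tuple

# Each DNA template base is paired with its complementary RNA base (U replaces T).
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}


def transcribe(template_3_to_5: str) -> str:
    """Return the RNA (written 5' to 3') complementary to a DNA template
    strand that is written 3' to 5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())


def splice(pre_mrna: str, introns: List[Tuple[int, int]]) -> str:
    """Remove introns given as non-overlapping half-open [start, end) intervals."""
    kept = []
    position = 0
    for start, end in sorted(introns):
        kept.append(pre_mrna[position:start])  # keep the exon before this intron
        position = end                         # skip over the intron
    kept.append(pre_mrna[position:])           # keep the final exon
    return "".join(kept)
```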
  • MicroRNA (miRNA) molecules are single-stranded RNA molecules that perform post-transcriptional gene expression regulation.
  • a miRNA may bind to a complementary mRNA molecule, thereby cleaving, destabilizing, or otherwise preventing the mRNA molecule from being translated into a polypeptide or protein by a ribosome.
  • a miRNA has a length in a range of 21 to 23 RNA nucleotides.
  • non-coding RNA may refer to a type of RNA that is not translated into a protein.
  • non-coding RNA examples include miRNA, transfer RNA (tRNA), and ribosomal RNA (rRNA).
  • the term "functional RNA,” and its equivalents, may refer to any RNA molecule that impacts a biological process.
  • functional RNA may include mRNA, miRNA, tRNA, rRNA, and the like.
  • base may refer to a monomer of a polymer.
  • a base of DNA or RNA is a nucleotide.
  • a base pair may refer to a pair of complementary DNA nucleotides, which are hydrogen-bonded to one another in a double-stranded DNA molecule.
  • a base pair includes a first base in a first ssDNA and a second base in a second ssDNA, wherein the first and second bases are complementary and hydrogen-bonded to one another.
  • As used herein, the terms “nucleotide,” “nucleobase,” “nucleic acid,” “nucleic acid molecule,” and their equivalents, may refer to an organic molecule that includes a nitrogenous base, a sugar, and a phosphate group. In various cases, a nucleotide is a monomer of DNA or RNA. A nucleotide, for instance, is a chemical structure.
  • 3' end may refer to a terminus of a single-stranded nucleotide polymer that includes a base whose third carbon in its deoxyribose or ribose is bound to a hydroxyl group while being unbound to another base.
  • the terms “5' end,” “5-prime end,” and their equivalents may refer to a terminus of a single-stranded nucleotide polymer that includes a base whose fifth carbon in its deoxyribose or ribose ring is unbound to another base. In some cases, the fifth carbon is bound to a phosphate group.
  • the "length” of a polymer refers to a number of covalently bonded monomers that are included in the polymer.
  • the length of a DNA molecule may be the number of covalently bonded nucleotides in at least one strand of the DNA molecule and/or the number of base pairs in the DNA molecule.
  • the length of an RNA molecule may be the number of covalently bonded nucleotides in the RNA molecule.
  • the term "gene,” and its equivalents, refers to a sequence of DNA nucleotides that is transcribed into a functional RNA.
  • the functional RNA for instance, is RNA that is translated into a polypeptide or protein (e.g., mRNA) or that has some other biological function (e.g., miRNA, tRNA, etc.).
  • a gene is "expressed” when it is used as a template to generate a functional RNA.
  • a subject for instance, has numerous genes contained in the subject's genome. A gene may include both introns and exons.
  • the term "intron,” and its equivalents, may refer to a subset of DNA nucleotides in a gene that is not used to code for any functional RNA that is expressed by the organism.
  • the term “exon,” and its equivalents may refer to a subset of DNA nucleotides in a gene that is used to code for a functional RNA.
  • an exon may encode a polypeptide or protein that is expressed by the organism.
  • a gene can be represented in data (e.g., as data representative of the sequence of DNA nucleotides in the gene) or as a chemical structure (e.g., as the sequence of DNA nucleotides itself).
  • the term "genome,” and its equivalents, refers to the aggregate of genes of a subject.
  • a genome represents the sequences of several linear DNA molecules that are present in a subject's chromosomes.
  • a "reference genome” refers to an aggregation of genes of one or more reference subjects.
  • a genome is represented in data.
  • pangenome refers to an aggregate set of genes from multiple subgroups (e.g., strains) within a population (e.g., a clade) of subjects.
  • a pangenome indicates genes that are present in all subjects within the population, as well as genes that are present in some of the subjects of the population.
  • a pangenome is represented in data, for instance.
  • transcriptome refers to the aggregate of RNA sequences of a subject. In some cases, a transcriptome is limited to mRNA sequences. In various examples, a transcriptome is represented in data.
  • genomic DNA may refer to DNA molecules that are obtained from a chromosome and/or nucleus of a cell.
  • DNA fragment may refer to DNA molecules that are excised and/or broken off from a larger DNA molecule.
  • cell-free DNA may refer to DNA fragments that are non-encapsulated and obtained outside of cells within a sample (e.g., a liquid biopsy sample).
  • circulating tumor DNA may refer to a cfDNA molecule that originates from a cancer cell.
  • end motif may refer to a sequence of nucleotides extending from a 3' or 5' end of a DNA or RNA molecule.
  • the end motif is shorter than a length of the DNA or RNA molecule.
  • the end motif may have a length in a range of 5 to 30 bases or base pairs, a range of 3 to 30 bases or base pairs, or a range of 1 to 30 base pairs.
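As a small illustration of the end-motif definition, the sketch below extracts the motifs at both ends of a sequence. The function name `end_motifs`, the default length of 4, and the assumption that the sequence string is written 5' to 3' are choices made for this example only.

```python
from typing import Tuple


def end_motifs(seq: str, k: int = 4) -> Tuple[str, str]:
    """Return the (5' end motif, 3' end motif) of length k for a sequence
    written 5' to 3'. The motif must be shorter than the molecule."""
    if not 1 <= k < len(seq):
        raise ValueError("motif length must be at least 1 and shorter than the sequence")
    return seq[:k], seq[-k:]
```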
  • the term "promoter,” and its equivalents may refer to a portion of a DNA molecule that binds one or more proteins in order to initiate transcription of a gene.
  • the promoter is located "upstream” of the gene.
  • the promoter is located between the 5' end of the DNA molecule and the gene.
  • a promoter may include one or more binding sites for RNA polymerase, and/or one or more transcription factor binding sites.
  • a promoter includes one or more CpG islands.
  • a promoter for instance, includes a transcription start site.
  • CpG island may refer to a continuous portion of a DNA molecule whose sequence includes greater than a threshold amount (e.g., greater than 50%) of G-C base pairs.
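The threshold-based definition above can be sketched as a sliding-window scan. This is only an illustration of that definition: the function names, the window size, and the 0.5 default threshold are assumptions, and definitions commonly used in the literature also consider the frequency of CpG dinucleotides, which this sketch omits.

```python
from typing import List


def gc_fraction(seq: str) -> float:
    """Fraction of bases in the sequence that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def gc_rich_windows(seq: str, window: int = 200, threshold: float = 0.5) -> List[int]:
    """Return the start index of every window whose G/C fraction exceeds the threshold."""
    return [
        start
        for start in range(len(seq) - window + 1)
        if gc_fraction(seq[start:start + window]) > threshold
    ]
```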
  • the term “enhancer,” and its equivalents may refer to a portion of a DNA molecule that binds one or more proteins in order to increase the chance that a gene will be transcribed. For instance, an enhancer includes one or more transcription factor binding sites. In various cases, an enhancer includes one or more CpG islands.
  • cancer may refer to a condition of a subject in which particular cells (referred to as “cancer cells”) divide uncontrollably in the subject's body.
  • a cancer is characterized by a location or tissue type from which the cancer cells originated.
  • a cancer is characterized by a location or tissue type in which the cancer cells are located.
  • As used herein, the terms “tumor,” “neoplasm,” and their equivalents, may refer to a mass of tissue including cancer cells.
  • tissue of origin refers to a differentiated type of tissue from which cancer cells in the body of a subject began dividing uncontrollably in the subject's body.
  • liquid biopsy may refer to a process of obtaining a fluid sample from a subject's body. The sample, for instance, can be referred to as a "liquid biopsy sample.” Examples of fluids that are sampled from the body include blood, plasma, cerebrospinal fluid, sputum, stool, urine, lymphatic fluid, and saliva.
  • tissue biopsy may refer to a process of obtaining a sample of cells from a subject's body.
  • a tissue biopsy in various cases, is performed by cutting a mass of cells from the subject's body.
  • a tissue biopsy is a procedure performed by a surgeon, interventional radiologist, interventional cardiologist, or other specialized clinician.
  • the terms “tissue sample” and “tissue biopsy sample” can be used to refer to the sample of cells obtained using a tissue biopsy.
  • the term "subject,” and its equivalents, may refer to a human or non-human animal.
  • a subject that is receiving care from at least one care provider may be referred to as a "patient.”
  • variant may refer to a difference between a subject genetic sequence and a reference sequence.
  • a variant may correspond to a difference between one or more nucleotides in a genome of a subject and one or more corresponding nucleotides in at least one reference genome or pangenome.
  • a variant may be characterized by its identity (e.g., what nucleotides are different), its position (e.g., where are the nucleotides located in the genome, what chromosome contains the nucleotides, what gene contains the nucleotides, etc.), its length (e.g., how many nucleotides are different from the reference sequence), its type (e.g., substitution, insertion, deletion, copy number alteration, rearrangement of fusion, etc.), and other features that indicate its significance and/or relevance.
  • a variant represents any apparent alteration in a sequence that has been read from a nucleic acid molecule with respect to the reference sequence, such as reads cleaved by restriction enzymes (RE).
  • a variant can be represented in data (e.g., by data characterizing the variant) or as a chemical structure (e.g., the nucleotides themselves).
  • the term "mutation,” and its equivalents, may refer to a change in a gene.
  • substitution can refer to a nucleotide in a subject sequence that is different than an equivalent nucleotide (e.g., a nucleotide at the same position) in a reference sequence.
  • insertion can refer to a nucleotide in a subject sequence that is added with respect to a reference sequence.
  • deletion can refer to the removal of a nucleotide from a nucleotide sequence.
  • CNA: copy number alteration
  • CNV: copy number variation
  • the terms “rearrangement of fusion,” “fusion rearrangement,” “translocation,” and their equivalents can refer to a change in the relative position of one or more portions of a reference sequence, thereby generating a gene that was not present in the reference sequence.
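The way a variant is characterized above (position, length, and type) can be represented in data along the following lines. This is a hypothetical sketch: the `Variant` class, its field names, the 1-based position convention, and the rule of classifying by comparing reference and alternate allele lengths are assumptions for illustration, and copy number alterations and rearrangements are not covered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    chromosome: str
    position: int  # 1-based position in the reference sequence (assumed convention)
    ref: str       # nucleotides in the reference sequence
    alt: str       # nucleotides observed in the subject sequence

    @property
    def kind(self) -> str:
        """Classify the variant by comparing allele lengths."""
        if len(self.ref) == len(self.alt):
            return "substitution"
        return "insertion" if len(self.alt) > len(self.ref) else "deletion"

    @property
    def length(self) -> int:
        """Number of nucleotides that differ from, are added to, or are
        removed from the reference sequence."""
        if self.kind == "substitution":
            return sum(r != a for r, a in zip(self.ref, self.alt))
        return abs(len(self.alt) - len(self.ref))
```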
  • the term “sequencing,” and its equivalents may refer to a process of identifying the order and identity of monomers in a polymer chain, such as the order and identity of nucleotides in a DNA or RNA molecule.
  • the terms “whole genome sequencing,” “WGS,” and their equivalents, may refer to the process of sequencing an entire genome of a subject, including the introns and exons of the genes of the subject.
  • the term “whole exome sequencing,” and its equivalents, may refer to the process of sequencing all exons (i.e., the exome) of a subject.
  • targeted sequencing and its equivalents, may refer to the process of sequencing a portion of the genome of a subject, such as sequencing a single gene of the subject.
  • targeted sequencing may entail detecting the presence and/or sequences of a panel of predetermined genes.
  • Various techniques can be utilized to sequence a DNA or RNA molecule, such as massively parallel sequencing (MPS), nanopore sequencing, direct sequencing, Sanger sequencing, or next-generation sequencing (NGS).
  • Sequencing is performed on physical molecules (e.g., RNA or DNA) and is used to generate data.
  • Examples of NGS include, for instance, massively parallel sequencing and nanopore sequencing.
  • nanopore sequencing may refer to a technique for identifying the order and identity of monomers in a polymer chain by transporting the polymer chain from a first space to a second space, wherein the first space and the second space are separated by a substrate, by directing the polymer chain through a small hole (known as a "nanopore”) embedded in the substrate, and monitoring a relative electrical signal (e.g., a voltage or current) between the first space and the second space.
  • the term "sensor,” and its equivalents, may refer to a physical device or other apparatus that is configured to detect one or more detection signals.
  • detection signal may refer to a physical signal that can be identified, characterized, or otherwise perceived by a sensor.
  • sequence read data may refer to data that is indicative of an order and identity of monomers in a polymer, such as the order and identity of nucleotides in a DNA or RNA sequence.
  • sequence read data is generated via a sequencing operation.
  • image may refer to a 2D or 3D array of data indicative of an array of pixels or voxels.
  • ligating may refer to a process of joining two molecules together, for example, with a chemical bond.
  • the term "adapter,” and its equivalents may refer to an oligonucleotide that can be ligated to a target nucleic acid molecule. In various cases, an adapter prepares the target nucleic acid molecule for sequencing.
  • the term "bait molecule,” and its equivalents may refer to a nucleic acid molecule having a region that is complementary to a region of a target molecule (e.g., cfDNA).
  • a bait molecule includes, for instance, a nucleic acid molecule that can hybridize to (i.e., is complementary to) a target molecule and can be used to capture the target molecule.
  • the bait molecule is a capture oligonucleotide (or capture probe). In some instances, the bait molecule is suitable for solution phase hybridization to the target molecule. In some instances, the bait molecule is suitable for solid phase hybridization to the target molecule. In some instances, the bait molecule is suitable for both solution-phase and solid-phase hybridization to the target molecule.
  • the design and construction of bait molecules is described in more detail in, e.g., International Patent Application Publication No. WO 2020/236941.
  • the term "amplifying,” and its equivalents may refer to a process of generating copies of a target molecule, such as a nucleic acid molecule.
  • hybridization may refer to a process by which two complementary single-stranded nucleic acid molecules bind to one another, thereby forming a double-stranded nucleic acid molecule.
  • double-stranded nature of the nucleic acid molecule is maintained under stringent hybridization conditions.
  • Exemplary stringent hybridization conditions include an overnight incubation at 42 °C in a solution including 50% formamide, 5X SSC (750 mM NaCl, 75 mM trisodium citrate), 50 mM sodium phosphate (pH 7.6), 5X Denhardt's solution, 10% dextran sulfate, and 20 µg/ml denatured, sheared salmon sperm DNA, followed by washing the filters in 0.1X SSC at 50 °C.
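The SSC strengths named in the hybridization conditions above scale linearly, so the composition of the 0.1X wash buffer follows from the stated 5X composition (750 mM NaCl, 75 mM trisodium citrate). The short sketch below performs that arithmetic; the function name `ssc_concentrations_mm` is hypothetical.

```python
from typing import Dict

# Composition of 5X SSC as stated in the text, in millimolar (mM).
SSC_5X_MM = {"NaCl": 750.0, "trisodium citrate": 75.0}


def ssc_concentrations_mm(fold: float) -> Dict[str, float]:
    """Scale the stated 5X SSC composition linearly to another strength."""
    return {component: mm / 5.0 * fold for component, mm in SSC_5X_MM.items()}
```

Under this scaling, 1X SSC corresponds to 150 mM NaCl and 15 mM trisodium citrate, and the 0.1X wash corresponds to about 15 mM NaCl and 1.5 mM trisodium citrate.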
  • the term "complementary,” and its equivalents may refer to a state of two single-stranded nucleic acid molecules with respective sequences that cause the nucleic acid molecules to spontaneously hybridize to one another.
  • One nucleic acid molecule, for instance, may have a sequence that causes each nucleic acid to hydrogen bond to a respective nucleic acid in the other nucleic acid molecule.
  • therapy may refer to a composition or process that can be used to remediate a health problem.
  • Cancer therapies, for instance, include surgery, radiotherapy, chemotherapy, immunotherapy, cell-based therapies, and the like.
  • cancer therapies include abemaciclib (Verzenio), abiraterone acetate (Zytiga), acalabrutinib (Calquence), ado-trastuzumab emtansine (Kadcyla), afatinib dimaleate (Gilotrif), aldesleukin (Proleukin), alectinib (Alecensa), alemtuzumab (Campath), alitretinoin (Panretin), alpelisib (Piqray), amivantamab-vmjw (Rybrevant), anastrozole (Arimidex), apalutamide (Erleada), asciminib hydrochloride (Scemblix), atezolizumab (Tecentriq), avapritinib (Ayvakit), avelumab (Bavencio), axicabtagene ciloleucel (Yescarta
  • treatment-responsive may refer to a type of cancer cells that can be substantially killed using a predetermined type of therapy.
  • cancer cells of a subject may be responsive to a particular treatment if, after the subject is administered the treatment, the cancer cells are diminished by a particular progression level (e.g., radiographic progression level, marker-based progression level, such as prostate-specific antigen (PSA) progression, etc.).
  • treatment-resistant may refer to a type of cancer that cannot be substantially killed using a predetermined type of therapy.
  • metastasis profile may refer to a propensity of a type of cancer to metastasize into one or more differentiated tumor types besides the cancer's tissue of origin.
  • the metastasis profile can further indicate the type of tissue in which the cancer can or is likely to metastasize.
  • clinical trial may refer to a research study used to evaluate a hypothesis based on participation by one or more subjects.
  • a clinical trial can be used to assess the efficacy and/or safety of a proposed therapy.
  • a clinical trial may be performed in furtherance of approval of a treatment by a regulatory authority (e.g., the United States Food & Drug Administration (FDA)).
  • FIG. 1 illustrates an example environment 100 for efficiently distributing updated annotations of genetic variants.
  • a subject 102 presents to a clinical environment with at least one symptom or characteristic associated with cancer.
  • the subject 102 may be a patient of the clinical environment.
  • the subject 102 is a human, but implementations are not so limited.
  • the subject 102 could be a non-human animal.
  • the clinical environment includes a hospital, a health clinic, or a veterinary clinic.
  • a care provider 104 obtains a sample 106 from the subject 102.
  • the care provider 104 is a clinician responsible for caring for the subject 102.
  • the care provider 104 is a physician, a physician's assistant, a nurse, a resident, a medical student, a medical technician, or the like.
  • the sample 106 in various cases, can be a liquid biopsy sample, a tissue biopsy sample, or a combination thereof.
  • the sample 106 is obtained by obtaining a fluid from the subject 102 in a liquid biopsy procedure.
  • fluids in the sample 106 include blood, plasma, cerebrospinal fluid, sputum, stool, urine, lymphatic fluid, or saliva.
  • the sample 106 is obtained by intravenously extracting blood and/or plasma from the subject 102.
  • a liquid biopsy sample of the subject 102 may include circulating tumor cells (CTCs).
  • a tissue biopsy sample is obtained by physically excising a portion of a tissue of the subject 102.
  • a portion of a lesion in the body of the subject 102 may be surgically removed from the subject 102.
  • the lesion is a tumor.
  • the sample 106 includes nucleic acid molecules 108.
  • the nucleic acid molecules 108 include DNA and/or RNA.
  • the nucleic acid molecules 108 are disposed inside of cells in the sample 106.
  • the nucleic acid molecules 108 may include cellular DNA and/or RNA.
  • the nucleic acid molecules 108 are disposed outside of cells of the subject 102.
  • the nucleic acid molecules 108 may include cell-free DNA (cfDNA) and/or circulating tumor DNA (ctDNA).
  • the nucleic acid molecules 108 include mRNA, microRNA, non-coding RNA, functional RNA, or any combination thereof.
  • sequences of the nucleic acid molecules 108 in the sample 106 are indicative of a cancer of the subject 102.
  • the nucleic acid molecules 108 may include ctDNA released from cells in a tumor of the subject 102.
  • the nucleic acid molecules 108 include genomic DNA (gDNA) of the subject 102, which may include sequences indicative of a type of cancer that the subject 102 has developed or is predisposed toward.
  • the nucleic acid molecules 108 are sequenced by a sequencer 110.
  • the nucleic acid molecules 108 are extracted from the sample 106.
  • Various types of techniques can be used by the sequencer 110 to sequence the nucleic acid molecules 108.
  • the sequencer 110 may perform Sanger sequencing, second-generation sequencing (e.g., sequencing-by-synthesis), third-generation sequencing (e.g., nanopore sequencing), or any combination thereof.
  • the sequencer 110 identifies methylated cytosines in the nucleic acid molecules 108 using methylation sequencing (methyl-seq).
  • the sequencer 110 generates complementary DNA (cDNA) to RNA molecules in the nucleic acid molecules 108 and sequences the cDNA in order to identify the sequences of the RNA.
  • the sequencer 110 divides the nucleic acid molecules 108 into fragments.
  • the sequencer 110, in some instances, ligates adapters onto the fragments and/or nucleic acid molecules 108.
  • the adapters include amplification primers, flow cell adapter sequences, substrate adapter sequences, sample index sequences, or any combination thereof.
  • the sequencer 110 amplifies the fragments and/or nucleic acid molecules 108.
  • the sequencer 110 performs a polymerase chain reaction (PCR) amplification technique, a non-PCR amplification technique, an isothermal amplification technique, or any combination thereof.
  • the sequencer 110 captures the amplified molecules onto a substrate and performs sequencing-by-synthesis in order to identify the sequences of the amplified molecules.
  • the sequencer 110 includes one or more photosensors configured to detect light signals emitted by fluorescently tagged nucleotide triphosphates (NTPs) that are added to nucleic acid molecules synthesized by DNA polymerase using the amplified molecules as templates.
  • the sequencer 110 includes one or more sensors that detect signals indicative of the sequences of the nucleic acid molecules 108.
  • the sequencer 110 outputs sequence read data 112 indicative of sequences of the nucleic acid molecules 108.
  • the sequence read data 112 indicates an identity and/or order of bases in the nucleic acid molecules 108.
  • the sequence read data 112 indicates the presence and/or location of methylated cytosines in the nucleic acid molecules 108.
  • a variant identifier 114 identifies the presence of a variant in the nucleic acid molecules 108 by analyzing the sequence read data 112.
  • the variant identifier 114 may compare the sequences indicated in the sequence read data 112 to one or more reference sequences, such as one or more reference genomes and/or a pangenome.
  • the variant identifier 114 is configured to map the sequences indicated in the sequence read data 112 to regions in the reference sequence(s). For instance, the variant identifier 114 may determine that a sequence indicated in the sequence read data 112 is indicative of the copy of a predetermined gene carried by the subject 102.
  • the variant identifier 114 may identify one or more differences between a sequence in the sequence read data 112 and the reference sequence(s). In various cases, the difference(s) is representative of one or more variants indicated by the sequence read data 112.
  • the variant identifier 114 outputs a variant indication 116 based on the identified variant.
  • the variant indication 116 may indicate a position and/or type of variant identified from the sequence read data 112. Examples of variants identified in the variant indication 116 include nucleotide substitutions, nucleotide additions, nucleotide deletions, structural variants, and copy number variants.
  • the variant indication 116 may indicate a chromosome on which the variant is observed, a position on the chromosome where the variant was detected, a region (e.g., a gene, promoter, enhancer, transcription factor binding site, hotspot, etc.) where the variant was detected, or any combination thereof.
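The comparison of a read against a reference sequence can be illustrated with a minimal Python sketch. It assumes the read has already been aligned to the reference with no insertions or deletions, and it detects nucleotide substitutions only. The names `VariantIndication` and `identify_substitutions`, and the chromosome and coordinates in the usage lines, are hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariantIndication:
    chromosome: str    # chromosome on which the variant is observed
    position: int      # 1-based position on the chromosome (assumed convention)
    variant_type: str  # e.g. "C>T" for a C-to-T substitution


def identify_substitutions(reference, read, chromosome, start):
    """Compare an already-aligned read to the reference, base by base.

    `start` is the chromosome position of the first base of `reference`.
    Only substitutions are reported; indels and structural variants would
    need a real aligner and are outside this sketch.
    """
    indications = []
    for offset, (ref_base, read_base) in enumerate(zip(reference, read)):
        if ref_base != read_base:
            indications.append(
                VariantIndication(chromosome, start + offset, f"{ref_base}>{read_base}")
            )
    return indications


# Illustrative values only.
found = identify_substitutions("ACGTCCA", "ACGTTCA", "chr17", 100)
print(found)
```

A production variant caller would also weigh base quality, read depth, and alignment uncertainty, none of which is modeled here.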
  • the variant indication 116 further includes one or more annotations.
  • the annotation(s) are prestored in the variant identifier 114.
  • the variant identifier 114 may store or otherwise access a table of annotations associated with various variants. The table, for instance, is indexed according to variant.
  • the variant identifier 114 may identify an entry of the table corresponding to the identified variant, which may also include the annotation(s).
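One possible shape for a table of annotations indexed by variant is a dictionary keyed on the variant's chromosome, position, and type. The sketch below is an assumption made for illustration: the key layout, the function name `lookup_annotations`, and the sample entry are hypothetical.

```python
# Hypothetical annotation table indexed by (chromosome, position, variant type).
ANNOTATION_TABLE = {
    ("chr17", 104, "C>T"): ["resistant to slugimib"],
}


def lookup_annotations(chromosome, position, variant_type):
    """Return a copy of the stored annotations for a variant, or an empty list."""
    return list(ANNOTATION_TABLE.get((chromosome, position, variant_type), []))


print(lookup_annotations("chr17", 104, "C>T"))
print(lookup_annotations("chr1", 1, "A>G"))
```

Returning a copy keeps callers from mutating the stored entry by accident.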
  • the annotations stored in the variant identifier 114 are received from one or more external devices. For instance, one or more clinical providers may input the annotations into the variant identifier 114.
  • the annotations may be provided as user input via an external device, for instance based on an analysis of the sequence read data 112 associated with the variant.
  • the annotation(s), for instance, indicate characteristics of the variant.
  • the annotation(s) indicate an association between the variant and at least one disease, such as at least one type of cancer.
  • the annotation(s) include an indication of an effective therapy for the disease(s), such as a treatment that the disease(s) is responsive to.
  • the annotation(s) indicate an expected progression of the disease(s) associated with the variant, such as a metastasis profile associated with the variant.
  • the annotation(s) include an identifier of the variant, such as a name that is used to refer to the variant in the medical or research community.
  • the annotation(s) include an association between the variant and a particular ancestry (e.g., whether the variant is typically associated with individuals having Ashkenazi heritage).
  • the annotation(s), in some examples, indicate whether the variant is a germline variant.
  • the annotation(s) may include a class identifier, indicating a class or category of the variant.
  • the annotation(s) include other information, such as custom notes associated with the variant.
  • the variant indication 116 is provided to a report generator 118.
  • the report generator 118 generates an original report 120 based on the variant indication 116.
  • the original report 120 can be printed, output, transmitted over a network, displayed via a user interface, or otherwise provided to one or more end users or devices.
  • the original report 120 can be provided to a clinical device 124.
  • the care provider 104 may diagnose and/or initiate a treatment of the subject 102. In some cases, the care provider 104 communicates a prognosis of the subject 102 based on the original report 120.
  • the original report 120 includes an outdated annotation 122 corresponding to the identified variant.
  • the outdated annotation 122 may indicate clinically acceptable information associated with the identified variant at the time of the original report 120.
  • additional data related to the identified variant becomes available over time.
  • additional research studies, clinical trials, longitudinal studies, and other research associated with the subject 102 and/or other subjects in a population may indicate that the outdated annotation 122 is erroneous or incomplete.
  • a record system 126 is configured to track variants across a population.
  • the record system 126 includes one or more population databases 128 that store various indications of variants of a population.
  • the population includes one or more individuals other than the subject 102.
  • the population databases 128, for instance, store various information about variants of the individuals in the population.
  • the population databases 128 may indicate the variants, as well as annotations associated with the variants.
  • the variant identifier 114 may access the population databases 128 in order to identify annotations, such as the outdated annotation 122, to provide to the report generator 118.
  • the population database(s) 128 include a variant bin 130.
  • the record system 126 may sort various records into collections of records (or “bins”) according to variant.
  • the variant bin 130 may correspond to the specific type of variant of the subject 102 that was identified by the variant identifier 114 and indicated in the original variant indication 116.
  • the variant bin 130 includes n records (also referred to as "patient records”) corresponding to the variant, where n is an integer greater than one.
  • the n records correspond to n individuals in the population that share the same type of variant.
  • the record system 126 generates the first record among the n records based on the original variant indication 116.
  • the record system 126, in various cases, generates additional records among the n records based on population variant indications 132.
  • the population variant indications 132 may be received by the record system 126 from one or more external devices.
  • the population variant indications 132 identify variants observed from the population.
  • the population variant indications 132 may further indicate annotations of the variants observed from the population.
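Sorting records into bins according to variant can be sketched as a grouping step. The record fields (`chromosome`, `position`, `type`, `subject`) and the function name `bin_records` are assumptions chosen for this example; the disclosure does not specify a record schema.

```python
from collections import defaultdict


def bin_records(records):
    """Group records into bins keyed by (chromosome, position, variant type)."""
    bins = defaultdict(list)
    for record in records:
        key = (record["chromosome"], record["position"], record["type"])
        bins[key].append(record)
    return dict(bins)


# Illustrative records only.
records = [
    {"subject": "s1", "chromosome": "chr17", "position": 104, "type": "C>T"},
    {"subject": "s2", "chromosome": "chr17", "position": 104, "type": "C>G"},
    {"subject": "s3", "chromosome": "chr17", "position": 104, "type": "C>T"},
]
bins = bin_records(records)
print({key: [r["subject"] for r in group] for key, group in bins.items()})
```

Records at the same position but with a different variant type land in separate bins, so a consensus is only ever computed over records that share the same variant.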
  • the records stored in the variant bin 130 include first to nth timestamps 134-1 to 134-n.
  • the first to nth timestamps 134-1 to 134-n indicate times associated with the observed variant of the subject 102 and the population.
  • the first to nth timestamps 134-1 to 134-n indicate times at which the variant was observed in samples of the population (e.g., the time at which the variant identifier 114 identified the variant), times at which the variant was reported to the record system 126 (e.g., times at which the variant indication 116 and the population variant indications 132 were received by the record system 126), times at which the annotations were generated (e.g., times at which a study indicating that a variant is associated with a treatment-responsive therapy was published), times at which reports indicating the variants were output (e.g., a time at which the original report 120 was output to the clinical device 124), or any combination thereof.
  • Each of the n records also includes one or more annotations associated with the variant.
  • the first record, corresponding to the subject 102, includes the outdated annotation 122.
  • annotations stored in the variant bin 130 for additional records may be different than the outdated annotation 122.
  • the record system 126 may monitor the population database(s) 128 and/or the variant bin 130 in order to identify a consensus annotation 136. For instance, the record system 126 may review the annotations stored in the variant bin 130 at a predetermined frequency (e.g., every day, week, month, etc.) and/or in response to an event (e.g., in response to a new record being added to the variant bin 130).
  • the nth record, for example, includes the consensus annotation 136.
  • the consensus annotation 136 includes an annotation that is shared by multiple records stored in the variant bin 130.
  • the consensus annotation 136 is stored in greater than a threshold number (e.g., m, wherein m is a positive integer) of records stored in the variant bin 130.
  • the threshold number of records may be the m most recent records stored in the variant bin 130.
  • the record system 126 may identify the consensus annotation 136 among records whose associated timestamps 134-1 to 134-n indicate times later than a threshold time point.
  • the consensus annotation 136 may represent a commonly accepted annotation in a set of recent records due to recently accepted research.
  • the consensus annotation 136 may accordingly differ from annotations, such as the outdated annotation 122, that are included in older records.
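The consensus rule described above (an annotation stored in more than a threshold number of records, optionally restricted to records after a threshold time or to the most recent records) can be sketched as follows. This is one reading of the description, not a definitive implementation: the function name `find_consensus`, its parameters, and the single-string `annotation` field are assumptions.

```python
from collections import Counter
from datetime import datetime


def find_consensus(records, threshold_count, threshold_time=None, most_recent=None):
    """Return the annotation stored in more than `threshold_count` of the
    considered records, or None if no annotation clears the threshold.

    threshold_time: if given, only records with later timestamps are considered.
    most_recent:    if given, only that many of the latest records are considered.
    """
    pool = sorted(records, key=lambda r: r["timestamp"])
    if threshold_time is not None:
        pool = [r for r in pool if r["timestamp"] > threshold_time]
    if most_recent is not None:
        pool = pool[-most_recent:] if most_recent > 0 else []
    counts = Counter(r["annotation"] for r in pool)
    if not counts:
        return None
    annotation, count = counts.most_common(1)[0]
    return annotation if count > threshold_count else None


# Illustrative records only: two older records and three newer ones.
records = [
    {"timestamp": datetime(2021, 1, 5), "annotation": "A"},
    {"timestamp": datetime(2021, 2, 9), "annotation": "A"},
    {"timestamp": datetime(2021, 10, 1), "annotation": "B"},
    {"timestamp": datetime(2021, 10, 8), "annotation": "B"},
    {"timestamp": datetime(2021, 11, 2), "annotation": "B"},
]
print(find_consensus(records, 2, threshold_time=datetime(2021, 9, 1)))
```

The comparison is strictly greater than the threshold, matching the "greater than a threshold number" wording; a deployment might instead require a fraction of the bin.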
  • the record system 126 is configured to update annotations in the variant bin 130 based on the consensus annotation 136.
  • the record system 126 rewrites annotations in the variant bin 130 that are not the consensus annotation 136.
  • the record system 126 may replace the outdated annotation 122 in the first record with the consensus annotation 136.
  • the record system 126 appends the consensus annotation 136 to records in the variant bin 130 that do not include the consensus annotation 136.
  • the record system 126 may also, or alternately, append the consensus annotation 136 to records that already include the outdated annotation 122, such that the records include both the outdated annotation 122 and the consensus annotation 136.
  • a user or system can have access to both the outdated annotation 122 and the newer appended consensus annotation 136 in a particular record when the user or system accesses that particular record, and can determine how the annotations were updated over time.
  • the record system 126 may flag the outdated annotation 122 as being an older annotation, and flag the appended consensus annotation 136 as being a current annotation.
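The two update strategies, replacing an older annotation or appending the consensus annotation alongside it with flags, can be compared in a short sketch. The record layout (a list of `{"text", "status"}` entries), the flag values `"older"` and `"current"`, and the function name `apply_consensus` are hypothetical.

```python
import copy


def apply_consensus(record, consensus, mode="append"):
    """Return an updated copy of `record` that carries the consensus annotation.

    mode="replace": older annotations are overwritten by the consensus.
    mode="append":  older annotations are kept and flagged "older";
                    the consensus is added and flagged "current".
    """
    updated = copy.deepcopy(record)
    if any(a["text"] == consensus for a in updated["annotations"]):
        return updated  # already carries the consensus; nothing to do
    if mode == "replace":
        updated["annotations"] = [{"text": consensus, "status": "current"}]
    elif mode == "append":
        for annotation in updated["annotations"]:
            annotation["status"] = "older"
        updated["annotations"].append({"text": consensus, "status": "current"})
    else:
        raise ValueError(f"unknown mode: {mode}")
    return updated


# Illustrative record only.
first_record = {"annotations": [{"text": "resistant to therapy 1", "status": "current"}]}
appended = apply_consensus(first_record, "responsive to therapy 2", mode="append")
replaced = apply_consensus(first_record, "responsive to therapy 2", mode="replace")
print(appended["annotations"])
print(replaced["annotations"])
```

Appending preserves the history of how the annotation changed, at the cost of requiring readers of the record to honor the status flag.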
  • the record system 126 may further output an indication of the consensus annotation 136. For instance, in response to modifying the first record to include the consensus annotation 136, the record system 126 may transmit an indication of the consensus annotation 136 to the report generator 118.
  • the report generator 118 may generate an updated report 138 based on the consensus annotation 136.
  • the updated report 138 can be printed, output, transmitted over a network, displayed via a user interface, or otherwise provided to one or more end users or devices.
  • the report generator 118 may output the updated report 138 to the clinical device 124, or some other clinical device associated with the subject 102 and/or the care provider 104.
  • the subject 102 and/or care provider 104 may update a treatment, diagnosis, prognosis, or any combination thereof, based on the consensus annotation 136.
  • FIG. 1 illustrates various elements that can be embodied in one or more computing devices.
  • the functions of the sequencer 110, the variant identifier 114, the report generator 118, the clinical device 124, the record system 126, the population database(s) 128, the variant bin 130, or any combination thereof are performed by one or more processors in at least one computing device.
  • Examples of computing devices include server computers, desktop computers, laptop computers, tablet computers, mobile phones, wearable devices, Internet of Things (IoT) devices, and the like.
  • instructions for performing at least a portion of the functions of these elements are stored in memory and/or in a non-transitory computer readable medium. The instructions, for instance, are executed by the processor(s).
  • FIG. 1 also illustrates various types of data.
  • the sequence read data 112, the original variant indication 116, the original report 120, the outdated annotation 122, the variant bin 130, the population variant indications 132, the first to nth timestamps 134-1 to 134-n, the consensus annotation 136, the updated report 138, or any combination thereof, includes data.
  • the various types of data illustrated in FIG. 1 may be stored, such as in memory or in non-transitory computer readable media.
  • at least a portion of the data is transmitted or otherwise output by one or more computing devices.
  • a computing device may transmit one or more communication signals to another computing device, wherein the communication signal(s) encode at least a portion of the data.
  • Examples of communication signals include electromagnetic signals, optical signals, ultrasonic signals, and electrical signals.
  • communication signals can be transmitted wirelessly and/or in a wired fashion.
  • the communication signals, for instance, are transmitted over one or more wireless channels and/or one or more wired channels (e.g., optical cabling, electrical cabling, etc.).
  • the communication signal(s) are transmitted over one or more communication networks.
  • a communication network, for instance, may be defined according to one or more physical channels, such as one or more frequency spectra.
  • a communication network is defined according to one or more communication protocols and/or standards.
  • Examples of communication networks include fiber optic networks, Institute of Electrical and Electronics Engineers (IEEE) networks (e.g., WI-FI™ networks, WiMAX networks, BLUETOOTH™ networks, etc.), cellular networks (e.g., a 3rd Generation Partnership Project (3GPP) radio network, such as a Long Term Evolution (LTE) network or a New Radio (NR) network; or a cellular core network such as a 3rd Generation (3G) core, a 4th Generation (4G) core, a 5th Generation (5G) core, etc.), ultrasonic networks, and the like.
  • the data is broadcast from one device to multiple other devices.
  • the subject 102 may present to the care provider 104 with breast cancer.
  • the care provider 104 may order, or perform, a tissue biopsy procedure on a breast tumor of the subject 102 in order to obtain the sample 106.
  • the nucleic acid molecules 108 may be obtained from cancer cells in the sample 106 of the tumor.
  • the sequencer 110 generates sequence read data 112 indicating sequences of the nucleic acid molecules 108.
  • the variant identifier 114 may identify, based on the sequence read data 112, that the cancer cells in the tumor of the subject 102 have a genetic variant associated with HER2 negative status.
  • the variant identifier 114 further determines, with reference to one or more databases, that the genetic variant identified is associated with cancers that are resistant to a first immunotherapy.
  • the report generator 118 may indicate, in the original report 120, the genetic variant.
  • the outdated annotation 122 may indicate that the genetic variant is associated with cancers that are resistant to the first immunotherapy.
  • the care provider 104 may review the original report 120 and recommend a chemotherapy regimen for the subject 102 based on the original report 120.
  • the record system 126 further stores the outdated annotation 122 in a first record of the variant bin 130 associated with the subject 102, wherein the variant bin 130 stores records associated with the genetic variant associated with HER2 negative status.
  • the first record may further include the first timestamp 134-1, which indicates the time at which the original report 120 was output to the clinical device 124. In the weeks following the time indicated in the first timestamp 134-1, a clinical trial may demonstrate the effectiveness of a second immunotherapy on cancers associated with the genetic variant associated with the variant bin 130. Additional records are added to the variant bin 130, most of which may indicate that the variant is associated with cancers that are responsive to the second immunotherapy.
  • the record system 126 may identify the consensus annotation 136 indicating the responsiveness of the second immunotherapy. Upon identifying the consensus annotation 136, the record system 126 may edit the first record (associated with the subject 102) to include the consensus annotation 136.
  • the record system 126 may output an indication of the consensus annotation 136 to the report generator 118.
  • the report generator 118 may generate and output the updated report 138 that includes the consensus annotation 136 indicating the responsiveness of the second immunotherapy.
  • the clinical device 124 outputs a push notification indicating the updated report 138 to the care provider 104.
  • the care provider 104 may view the updated report 138. Accordingly, the care provider 104 may update the treatment regimen of the subject 102 to include the second immunotherapy, which may result in superior outcomes for the subject 102.
  • FIG. 2 illustrates example records 200 stored in at least one patient database.
  • the example records 200 include various records stored in the population database(s) 128 described above with reference to FIG. 1.
  • thirty-four records are included in the example records 200.
  • a record system (e.g., the record system 126) is configured to identify a consensus annotation based on the example records 200.
  • the example records 200 include timestamps 202 associated with each observed variant.
  • the timestamps 202 indicate the date and time at which reports on the observed variants were output.
  • the example records 200 further identify the observed variants by indicating positions 204 and types 206 of the observed variants.
  • the example records 200 indicate variants at the same position, but implementations are not so limited.
  • multiple types 206 of variants are indicated in the example records 200. For instance, record six and record twenty-two are C>G variants, whereas the rest of the records are C>T variants.
  • the variants are binned. For example, record six and record twenty-two are omitted from a bin corresponding to the C>T variant.
  • the record system may be configured to identify the consensus annotation based on annotations 208 of the portion of the example records 200 associated with the bin. For example, multiple early records among the example records 200 include annotations 208 indicating that the variant is associated with cancers that are "resistant to slugimib." However, the majority of the later records occurring after September 1, 2021 include annotations 208 indicating that the variant is associated with cancers that are "resistant to slugimib" and "responsive to titanamib." In various cases, the record system determines that greater than a threshold number (e.g., ten) of records with timestamps after a threshold point in time (e.g., September 1, 2021) include the same annotation. The record system may therefore determine that this shared annotation is a consensus annotation.
  • the record system may update other records among the example records 200 to indicate the consensus annotation. For instance, the record system may rewrite at least one of the annotations 208 in records 1-5, 7-11, 21, or 23-25 to include the consensus annotation. In some cases, the record system further updates the timestamps 202 to indicate the time at which the example records 200 were rewritten to include the consensus annotation.
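The FIG. 2 scenario can be worked through with the example thresholds named above (ten records; September 1, 2021). The records below are synthetic stand-ins, not the thirty-four records of FIG. 2, and the rewrite date is invented for illustration. Annotations are modeled as sets so that a record can carry both "resistant to slugimib" and "responsive to titanamib".

```python
from collections import Counter
from datetime import datetime

THRESHOLD_COUNT = 10                     # example threshold number from the text
THRESHOLD_TIME = datetime(2021, 9, 1)    # example threshold point in time

OLD = frozenset({"resistant to slugimib"})
NEW = frozenset({"resistant to slugimib", "responsive to titanamib"})

# Synthetic C>T bin: 5 early records with the old annotation, 12 later
# records with the newer annotation. Not the actual FIG. 2 data.
records = [{"timestamp": datetime(2021, 3, d), "annotations": OLD} for d in range(1, 6)]
records += [{"timestamp": datetime(2021, 10, d), "annotations": NEW} for d in range(1, 13)]

# Count annotations among records with timestamps after the threshold time.
recent = Counter(r["annotations"] for r in records if r["timestamp"] > THRESHOLD_TIME)
annotation, count = recent.most_common(1)[0]
consensus = annotation if count > THRESHOLD_COUNT else None

# Rewrite records that lack the consensus and stamp them with the rewrite time.
rewritten_at = datetime(2021, 11, 1)     # hypothetical rewrite time
if consensus is not None:
    for record in records:
        if record["annotations"] != consensus:
            record["annotations"] = consensus
            record["timestamp"] = rewritten_at

print(sorted(consensus), sum(r["timestamp"] == rewritten_at for r in records))
```

Here twelve later records share the newer annotation, which exceeds the threshold of ten, so the five early records are rewritten and their timestamps updated.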
  • FIG. 3 illustrates an example report 300 summarizing predicted categories of a cancer of a subject. In various cases, the report 300 is the original report 120 or the updated report 138 described above with reference to FIG. 1. The report 300, for instance, may be displayed to a patient and/or care provider. In some cases, the report 300 is generated based on features of a sample (e.g., a liquid biopsy sample) obtained from the subject.
  • the report 300 includes a tissue origin 302 of the cancer.
  • the tissue origin 302, for instance, indicates a histological tissue type 304, a primary site 306, a cell subtype 307, or any combination thereof, of the cancer.
  • the report 300 includes one or more therapy indicators 308.
  • the therapy indicator(s) 308 convey whether the cancer is predicted to be resistant to one or more predetermined therapies and/or whether the cancer is predicted to be responsive to one or more predetermined therapies.
  • the report 300 includes one or more prognostic indicators 310.
  • the prognostic indicator(s) 310, for instance, indicate a prognosis of the subject in view of the categorized cancer.
  • the prognostic indicator(s) 310 may indicate a survivability, a recoverability, a quality of life indicator, or other information indicative of the prognosis of the subject.
  • the report 300 may include a trial qualification 312 of the subject.
  • the trial qualification 312 indicates whether the subject is predicted to qualify for a predetermined clinical trial.
  • the report 300 includes a metastasis profile 314 of the subject.
  • the metastasis profile 314 indicates a likelihood that the cancer will metastasize (e.g., at a particular point in time), one or more tissues in which the cancer is predicted to metastasize, or the like.
  • the report 300 includes recommended follow-up tests 316.
  • the report 300 may include a recommendation to perform whole genome sequencing on the subject, particularly in cases where the cancer cannot be categorized above a threshold certainty.
  • the report 300 may include a genomic profile 318 of the subject.
  • the genomic profile 318 includes or is generated based on the results of genetic analyses of the subject.
  • FIG. 4 illustrates an example process 400 for updating annotations in patient records.
  • the process 400, for instance, is performed by an entity that includes at least one processor, at least one database, at least one computing device, the record system 126, or any combination thereof.
  • the entity identifies a consensus annotation of a genetic variant of a first set of population records.
  • the consensus annotation is shared by the first set of population records.
  • the consensus annotation is shared by greater than a threshold number of the first set of population records.
  • the genetic variant, in various cases, may include a nucleotide substitution, a nucleotide addition, a nucleotide deletion, a structural variant, or a number of copies of a predetermined nucleotide sequence.
  • the entity groups, among population records indicating multiple types of genetic variants, the population records indicating the genetic variant associated with the first set of population records. For instance, the population records indicate their respective genetic variants, and the entity bins the population records by genetic variant.
  • the population records indicate at least one of a chromosome where the genetic variant is detected, a position on the chromosome where the genetic variant is detected, a type of the genetic variant, or any combination thereof.
  • the entity updates a second set of population records based on the consensus annotation.
  • the second set of population records correspond to the same genetic variant.
  • the second set of population records are grouped with the first set of population records.
  • the second set of population records may have excluded the consensus annotation prior to performance of 404.
  • the second set of population records was generated before the first set of population records.
  • the first set of population records and the second set of population records include respective timestamps, wherein the timestamps of the first set of population records indicate later times than the timestamps of the second set of population records.
  • the first set of population records are associated with times after a threshold time.
  • the second set of population records are associated with at least one time before the threshold time.
  • the entity generates a report of a subject corresponding to the second set of population records based on the consensus annotation. For instance, the subject corresponds to one of the second set of population records.
  • the report, in various cases, indicates the consensus annotation.
  • the report is output to at least one external device.
  • the second set of population records can be updated without the use of a linked database.
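The later steps of the process 400 (updating the second set of population records and generating a report for a subject) can be sketched once a consensus annotation is in hand. The record fields, the plain-text report format, and the function name `update_and_report` are assumptions for this example; the second set is taken here to be the records dated before the threshold time that lack the consensus annotation.

```python
from datetime import date


def update_and_report(records, consensus, threshold_time, subject_id):
    """Update older records that lack the consensus, then build a report line.

    Returns (subjects whose records were updated, report text or None).
    Records are modified in place.
    """
    updated_subjects = []
    for record in records:
        if record["time"] < threshold_time and record["annotation"] != consensus:
            record["annotation"] = consensus
            updated_subjects.append(record["subject"])

    report = None
    for record in records:
        if record["subject"] == subject_id:
            report = (
                f"Subject {subject_id}: variant {record['variant']}; "
                f"annotation: {record['annotation']}"
            )
    return updated_subjects, report


# Illustrative bin of records for a single variant.
bin_of_records = [
    {"subject": "s1", "variant": "C>T", "time": date(2021, 3, 1), "annotation": "A"},
    {"subject": "s2", "variant": "C>T", "time": date(2021, 10, 1), "annotation": "B"},
    {"subject": "s3", "variant": "C>T", "time": date(2021, 10, 9), "annotation": "B"},
]
updated, report = update_and_report(bin_of_records, "B", date(2021, 9, 1), "s1")
print(updated)
print(report)
```

Because every record in the bin already shares the same variant, the update needs no join against a separate variant table.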
  • FIG. 5 illustrates an example environment 500 for sequencing various nucleic acid molecules 502.
  • the nucleic acid molecules 502 include cfDNA and/or gDNA.
  • the nucleic acid molecules 502 may include ctDNA.
  • the nucleic acid molecules 502, in various cases, are extracted from a sample, such as a biological sample obtained from a subject.
  • the nucleic acid molecules 502 include DNA that is complementary to RNA present in the sample.
  • the nucleic acid molecules 502, in various cases, are ligated with adapters 504.
  • the adapters 504 are hybridized to the nucleic acid molecules 502.
  • the adapters 504, for example, include additional nucleic acid molecules.
  • the adapters 504 have a shorter length than the nucleic acid molecules 502 being sequenced.
  • the adapters 504 include amplification primers, flow cell adapter sequences, substrate adapter sequences, or sample index sequences.
  • Although FIG. 5 illustrates adapters 504 being ligated to one end of each of the nucleic acid molecules 502, implementations are not so limited.
  • the adapters 504 may be ligated to both ends of each of the nucleic acid molecules 502.
  • the nucleic acid molecules 502 ligated with the adapters 504 are amplified in order to generate amplified molecules 506.
  • Various amplification techniques can be performed.
  • the amplified molecules 506 are generated using PCR, a non-PCR amplification technique, an isothermal amplification technique, or any combination thereof.
  • Amplified molecules 506 may be captured by bait molecules 510 and sequenced.
  • the amplified molecules 506 are sequenced via sequencing-by-synthesis.
  • fluorescently tagged deoxyribonucleotide triphosphates (dNTP) 512 are utilized to synthesize a strand that is complementary to DNA strands bound to the substrate 508.
  • when a dNTP 512 is added to the strand (e.g., by an enzyme), the dNTP 512 emits an optical signal 514.
  • the frequency of the optical signal 514 is dependent on the type of dNTP 512 from which the optical signal 514 is emitted.
  • the amplified molecules 506 are sequenced via nanopore sequencing. For instance, the amplified molecules 506 are directed through a nanopore 516 extending through a substrate 518. In various cases, the amplified molecules 506 are negatively charged, such that they can be directed through the nanopore 516 by imposing an electrical field across the substrate 518. In various cases, the amplified molecules 506 and the nanopore 516 are in the presence of a charged solution. Thus, charged solutes traveling through the nanopore 516 can be monitored by reviewing an electrical signal (e.g., a current) sensed between electrodes 520 on either side of the substrate 518.
  • the individual bases within the amplified molecule 506 will block the nanopore 516, which may decrease the amount of charged solutes traveling through the nanopore 516 and, consequently, the magnitude of the electrical signal detected by the electrodes 520.
  • Each of the four types of bases within the amplified molecules 506 may block the nanopore 516 to a different extent. Therefore, the sequence of the nucleic acid molecules 502 can be derived by analyzing the measured electrical signal with respect to time as the amplified molecules 506 are directed through the nanopore 516.
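Deriving a sequence from a nanopore current trace can be illustrated with a deliberately simplified toy model. It assumes the signal has already been segmented into one current reading per base and that each base type has a single characteristic current level. Both are simplifications, since real nanopore signals depend on several neighboring bases at once and are decoded with statistical or learned models. The current levels and the function name `call_bases` are invented for illustration.

```python
# Hypothetical mean current level (arbitrary units) observed while each
# base type occupies the pore. Real devices do not expose such a table.
CURRENT_LEVELS = {"A": 1.0, "C": 2.0, "G": 3.0, "T": 4.0}


def call_bases(signal):
    """Assign each per-base current reading to the base with the nearest level."""
    return "".join(
        min(CURRENT_LEVELS, key=lambda base: abs(CURRENT_LEVELS[base] - reading))
        for reading in signal
    )


# Illustrative noisy readings, one per base.
print(call_bases([1.1, 3.9, 2.2, 2.9]))
```

The nearest-level rule captures the idea that different bases block the pore to different extents, which is all this sketch is meant to show.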
  • FIG. 6 illustrates one or more devices 600 configured to perform various operations described herein.
  • the device(s) 600 include one or more processor(s) 602.
  • the processor(s) 602 includes a central processing unit (CPU), a graphics processing unit (GPU), both CPU and GPU, or other processing unit or component known in the art.
  • the processor(s) 602 is operably connected to memory 604.
  • the memory 604 is volatile (such as random access memory (RAM)), non-volatile (such as read only memory (ROM), flash memory, etc.) or some combination of the two.
  • the memory 604 stores instructions that, when executed by the processor(s) 602, cause the processor(s) 602 to perform various operations.
  • the memory 604 stores methods, threads, processes, applications, objects, modules, any other sort of executable instruction, or a combination thereof.
  • the memory 604 stores files, databases, or a combination thereof.
  • the memory 604 includes, but is not limited to, RAM, ROM, electrically erasable programmable read-only memory (EEPROM), flash memory, or any other memory technology.
  • the memory 604 includes one or more of CD-ROMs, digital versatile discs (DVDs), content-addressable memory (CAM), or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other (e.g., non-transitory) medium which can be used to store the desired information and which can be accessed by the processor(s) 602.
  • the memory 604 stores instructions that, when executed by the processor(s) 602, cause the processor(s) 602 to perform operations of the variant identifier 114, the report generator 118, the record system 126, or any combination thereof.
  • the processor(s) 602 is operably connected to one or more input devices 606 and one or more output devices 608. Collectively, the input device(s) 606 and the output device(s) 608 function as an interface between at least one user and the device(s) 600.
  • the input device(s) 606 is configured to receive an input from a user and includes at least one of a keypad, a cursor control, a touch-sensitive display, a voice input device (e.g., a microphone), a haptic feedback device (e.g., a gyroscope), or any combination thereof.
  • the output device(s) 608 includes at least one of a display, a speaker, a haptic output device, a printer, or any combination thereof.
  • the processor(s) 602 causes a display among the input device(s) 606 to visually output various data described herein.
  • the input device(s) 606 includes one or more touch sensors
  • the output device(s) 608 includes a display screen
  • the touch sensor(s) are integrated with the display screen.
  • the processor(s) 602 is operably connected to one or more transceivers 610 that transmit and/or receive data over one or more communication networks 612.
  • the transceiver(s) 610 includes a network interface card (NIC), a network adapter, a local area network (LAN) adapter, or a physical, virtual, or logical address to connect to the various external devices and/or systems.
  • the transceiver(s) 610 includes any sort of wireless transceivers capable of engaging in wireless communication (e.g., radio frequency (RF) communication).
  • the communication network(s) 612 includes one or more wireless networks that include a 3rd Generation Partnership Project (3GPP) network, such as a Long Term Evolution (LTE) radio access network (RAN) (e.g., over one or more LTE bands), a New Radio (NR) RAN (e.g., over one or more NR bands), or a combination thereof.
  • the transceiver(s) 610 includes other wireless modems, such as a modem for engaging in WI-FI®, WIGIG®, WIMAX®, BLUETOOTH®, or infrared communication over the communication network(s) 612.
  • the device(s) 600 may further include the sequencer 110.
  • the sequencer 110 includes one or more fluidic circuits 614 configured to receive a sample 616 derived from a subject 618.
  • the sequencer 110, in various cases, may be configured to generate data indicative of one or more sequences of nucleic acid molecules (e.g., DNA and/or RNA) present in the sample 616.
  • the sequencer 110 introduces one or more reagents 619 to the fluidic circuit(s) 614 in order to prepare for and perform sequencing of the nucleic acid molecules.
  • the sequencer 110 may include one or more sensors 620 configured to measure or otherwise detect detection signals from the fluidic circuit(s) 614, which may be indicative of the sequences of the nucleic acid molecules.
  • the sensor(s) 620 may further include one or more ADCs.
  • the sequencer 110 outputs sequence read data to the processor(s) 602 for additional processing.
  • a method including: providing a plurality of nucleic acid molecules obtained from a sample from a subject in a population; ligating one or more adapters onto one or more nucleic acid molecules from the plurality of nucleic acid molecules; amplifying the one or more ligated nucleic acid molecules from the plurality of nucleic acid molecules; capturing all or a subset of the amplified nucleic acid molecules; sequencing, by a sequencer, the captured nucleic acid molecules to obtain a plurality of sequence reads that represent the captured nucleic acid molecules; receiving, at one or more processors, sequence read data for the plurality of sequence reads; identifying, using the one or more processors, a genetic variant by comparing the sequence read data to a reference sequence; generating, using the one or more processors, an example record including an indication of the genetic variant and an example annotation indicating information associated with the genetic variant; storing the example record in one or more databases; identifying, using the one or more processors, population records, corresponding to the genetic variant
  • timestamp data indicates that the first set of the population records was generated after the second set of the population records
  • the first set of the population records includes a different annotation than the second set of the population records
  • the one or more processors identify the different annotation from the first set of the population records as the consensus annotation.
  • identifying the population records corresponding to the genetic variant includes: identifying, using the one or more processors, a set of database records that: correspond to different genetic variants, and indicate variant attributes of the different genetic variants; grouping, using the one or more processors, the set of database records into a plurality of bins based on one or more of the variant attributes; and selecting, using the one or more processors, a group of database records associated with a bin, of the plurality of bins, as the population records corresponding to the genetic variant.
  • a method including: identifying, using one or more processors, population records of a genetic variant shared by samples of a population of subjects, the population records respectively including indications of the genetic variant and annotations of the genetic variant; determining, using the one or more processors, that a first set of the population records share a consensus annotation among the annotations, the first set of the population records corresponding to greater than a threshold number of subjects in the population; in response to determining that the first set of the population records includes the consensus annotation, updating, using the one or more processors, a second set of the population records to include the consensus annotation; and in response to updating the second set of the population records: identifying, using the one or more processors, an example subject among the subjects corresponding to an example population record among the second set of population records; and generating, using the one or more processors, a report based on the genetic variant and the consensus annotation.
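The consensus step of the method above can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the `PopulationRecord` class, its field names, the variant string format, and the rule of picking the annotation backed by the most distinct subjects are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass
class PopulationRecord:
    subject_id: str
    variant: str      # e.g. "chr7:55191822:T>G"; this string format is an assumption
    annotation: str


def find_consensus(records, threshold):
    """Return the annotation shared by more than `threshold` distinct subjects, else None."""
    subjects_by_annotation = {}
    for rec in records:
        subjects_by_annotation.setdefault(rec.annotation, set()).add(rec.subject_id)
    # Ties are resolved in favor of the annotation encountered first.
    best, subjects = max(
        subjects_by_annotation.items(),
        key=lambda item: len(item[1]),
        default=(None, set()),
    )
    return best if len(subjects) > threshold else None


def apply_consensus(records, threshold):
    """Update records lacking the consensus annotation; return it and the updated subject IDs."""
    consensus = find_consensus(records, threshold)
    updated = []
    if consensus is None:
        return None, updated
    for rec in records:
        if rec.annotation != consensus:
            rec.annotation = consensus
            updated.append(rec.subject_id)
    return consensus, updated


records = [
    PopulationRecord("s1", "chr7:55191822:T>G", "pathogenic"),
    PopulationRecord("s2", "chr7:55191822:T>G", "pathogenic"),
    PopulationRecord("s3", "chr7:55191822:T>G", "pathogenic"),
    PopulationRecord("s4", "chr7:55191822:T>G", "unknown significance"),
]
print(apply_consensus(records, threshold=2))  # ('pathogenic', ['s4'])
```

The returned subject IDs identify the "second set" whose records changed, which is the group for which an updated report would then be generated.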
  • identifying the population records of the genetic variant shared by the samples of the population of subjects includes: identifying, using the one or more processors, sequence read data of an example sample obtained from the example subject; identifying, using the one or more processors, the genetic variant by analyzing the sequence read data; and generating, using the one or more processors, the example population record indicating the genetic variant and an annotation of the genetic variant.
  • the one or more bait molecules include one or more nucleic acid molecules, each including a region that is complementary to a region of a captured nucleic acid molecule.
  • amplifying the one or more ligated nucleic acid molecules includes performing a polymerase chain reaction (PCR) amplification technique, a non-PCR amplification technique, or an isothermal amplification technique.
  • sequencing the captured nucleic acid molecules includes use of a massively parallel sequencing (MPS) technique, whole genome sequencing (WGS), whole exome sequencing, targeted sequencing, direct sequencing, or Sanger sequencing.
  • sequencing the captured nucleic acid molecules includes next generation sequencing (NGS).
  • sequencing the captured nucleic acid molecules includes sequencing-by-synthesis or nanopore sequencing.
  • generating, using the amplified ligated molecules, the detection signals includes simultaneously: synthesizing, by a polymerase using fluorescently tagged nucleotide triphosphates (NTPs), a synthesized nucleic acid molecule that is complementary to one of the amplified ligated molecules, and wherein detecting, by the at least one sensor, the detection signals includes: detecting, by at least one optical sensor, optical signals emitted by the fluorescently tagged NTPs upon binding to the synthesized nucleic acid molecule.
  • RNA includes messenger RNA, microRNA, or non-coding RNA.
  • identifying the population records of the genetic variant shared by the samples of the population of subjects includes: grouping, using the one or more processors, a plurality of records of the plurality of genetic variants of the samples into one or more bins respectively corresponding to the plurality of the genetic variants, an example bin among the one or more bins including the population records corresponding to the particular genetic variant.
  • the example population record includes: an identity field indicating at least one of a chromosome where the genetic variant is detected, a position on the chromosome where the genetic variant is detected, or a type of the genetic variant.
  • the type of the genetic variant includes a nucleotide substitution, a nucleotide addition, a nucleotide deletion, a structural variant, or a number of copies of a predetermined nucleotide sequence.
  • updating the second set of the population records to include the consensus annotation includes: adding, using the one or more processors, the consensus annotation to the example population record.
  • updating the second set of the population records to include the consensus annotation includes: replacing, using the one or more processors, an outdated annotation in the example population record with the consensus annotation.
  • a timestamp associated with the example population record includes: a time field indicating a time at which information was added to an annotation of the example population record.
  • updating the second set of population records to include the consensus annotation includes adding, using the one or more processors, the consensus annotation to the example population record, and the method further includes: generating, using the one or more processors, a second report indicating prognostic information of an individual based at least in part on the consensus annotation added to the example population record.
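The record layout and binning described in the preceding clauses can be sketched as below. The `Record` class, its field names, and the choice of `(chromosome, position, variant_type)` as the bin key are assumptions made for illustration; the source only states that records carry an identity field and a time field and that records are grouped into bins per variant.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Record:
    chromosome: str        # identity field: chromosome where the variant is detected
    position: int          # identity field: position on the chromosome
    variant_type: str      # identity field: e.g. "substitution", "deletion"
    annotation: str
    annotated_at: datetime  # time field: when the annotation was last written


def bin_records(records):
    """Group records into bins keyed by the (assumed) identity fields."""
    bins = defaultdict(list)
    for rec in records:
        bins[(rec.chromosome, rec.position, rec.variant_type)].append(rec)
    return dict(bins)


def replace_outdated(bin_records_, consensus, now):
    """Replace any non-consensus annotation in one bin and refresh its time field."""
    changed = 0
    for rec in bin_records_:
        if rec.annotation != consensus:
            rec.annotation = consensus
            rec.annotated_at = now
            changed += 1
    return changed


old = datetime(2020, 1, 1)
records = [
    Record("chr7", 55191822, "substitution", "pathogenic", old),
    Record("chr7", 55191822, "substitution", "unknown", old),
    Record("chr17", 7675088, "substitution", "pathogenic", old),
]
bins = bin_records(records)
n = replace_outdated(bins[("chr7", 55191822, "substitution")], "pathogenic", datetime(2024, 9, 23))
print(len(bins), n)  # 2 1
```

Keeping the timestamp next to the annotation makes it possible to tell which records were generated before or after others, which the clauses above rely on when a newer set of records carries a different annotation.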
  • a system including: at least one processor; and memory storing instructions that, when executed by the at least one processor, cause the at least one processor to perform operations including: the method of any of clauses 6 to 59.
  • a non-transitory computer-readable medium storing instructions for performing operations that include: the method of any of clauses 6 to 59.
  • TMB Tumor mutational burden
  • Mb megabase
  • germline (inherited) variants are excluded when determining TMB, given that the immune system has a higher likelihood of recognizing these as self.
  • driver mutations are excluded from a TMB calculation.
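A TMB calculation with these exclusions can be sketched as follows, assuming the common convention of expressing TMB as counted mutations per megabase of sequenced territory. The `Mutation` class, its boolean flags, and the function name are hypothetical; how germline and driver status are determined is outside the scope of the sketch.

```python
from dataclasses import dataclass


@dataclass
class Mutation:
    gene: str
    is_germline: bool
    is_driver: bool


def tumor_mutational_burden(mutations, sequenced_bases):
    """Return mutations per megabase, skipping germline and driver mutations."""
    if sequenced_bases <= 0:
        raise ValueError("sequenced_bases must be positive")
    counted = sum(1 for m in mutations if not m.is_germline and not m.is_driver)
    return counted / (sequenced_bases / 1_000_000)


mutations = [
    Mutation("GENE_A", is_germline=False, is_driver=False),
    Mutation("GENE_B", is_germline=False, is_driver=False),
    Mutation("GENE_C", is_germline=True, is_driver=False),   # excluded: germline
    Mutation("GENE_D", is_germline=False, is_driver=True),   # excluded: driver
]
print(tumor_mutational_burden(mutations, sequenced_bases=2_000_000))  # 1.0
```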
  • Microsatellites are highly polymorphic DNA-repeat regions.
  • microsatellite refers to a repetitive nucleic acid having repeat units of less than about 10 base pairs or nucleotides in length.
  • a microsatellite refers to a tract of tandemly repeated (i.e. adjacent) DNA motifs ranging from one to six or up to ten nucleotides, with each motif repeated 5 to 50 times.
  • Microsatellite instability refers to genetic instability in the microsatellite regions.
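As an illustration of the definition above, the sketch below scans a sequence for tandem repeats with a motif of one to six nucleotides repeated at least five times. The regular expression, the `find_microsatellites` name, and the decision to ignore the upper bound of 50 repeats and motifs longer than six nucleotides are simplifying assumptions for the example.

```python
import re

# Motif of 1-6 nucleotides (lazy, so the shortest motif is tried first) followed by
# at least four further copies, i.e. five or more tandem repeats in total.
MICROSATELLITE = re.compile(r"([ACGT]{1,6}?)\1{4,}")


def find_microsatellites(sequence):
    """Return (start, motif, copies) for each tandem repeat found."""
    hits = []
    for match in MICROSATELLITE.finditer(sequence.upper()):
        motif = match.group(1)
        copies = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, copies))
    return hits


print(find_microsatellites("GTCACACACACATTG"))  # [(2, 'CA', 5)]
```

Comparing the copy count at the same locus between two samples is one simple way to look for the instability mentioned above, although that comparison is not shown here.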
  • a viral status test refers to a test that identifies the presence of viral RNA or DNA in a subject.
  • the test can identify viral load and/or viral identity.
  • the viral status test can identify the presence of viral RNA or DNA associated with the occurrence of certain cancers.
  • viruses include Hepatitis B Virus (HBV) and Hepatitis C Virus (HCV), Kaposi Sarcoma-Associated Herpesvirus (KSHV), Merkel Cell Polyomavirus (MCV), Human Papillomavirus (HPV), Human Immunodeficiency Virus Type 1 (HIV-1, or HIV), Human T-Cell Lymphotropic Virus Type 1 (HTLV-1), and Epstein-Barr Virus (EBV).
  • "hotspot" mutations give rise to oncological outcomes.
  • PhyloP, SIFT, Grantham, COSMIC and PolyPhen-2 are in silico tools that can be used to assess pathogenicity of identified variants.
  • Exemplary hotspot genes and mutations include EGFR exon 19 activating mutation, EGFR exon 19 deletion, EGFR exon 19 insertion, EGFR exon 19 sensitizing mutation, EGFR exon 20 activation mutation, EGFR exon 20 insertion, EGFR G719 mutation, EGFR L858R mutation, EGFR L861 mutation, EGFR S768 mutation, EGFR T790M mutation, C797 mutation, KIT activating mutation, KRAS activating mutation, MET activating mutation, NRAS activating mutation, and PMS2 promoter mutations, among many others.
  • Hotspot mutations also occur in the following genes: AKT2, BRCA1, BRCA2, ERC1, NSD1, POLH, PPM1G, PTEN, RAD18, RAD51, RAD51B, RB1, TERT, TP53, TP53BP1, ALK, ARMT1, ATAD5, ATG7, ATIC, AXL, BIRC6, BRD3, BRD4, CAPRIN1, CCAR2, CCDC6, CDK5RAP2, CHD9, CIT, CTNNB1, CUL1, EBF1, EIF3E, HIP1, HMGA2, IRF2BP2, NOTCH1, NOTCH4, NPM1, OFD1, TACC1, TACC3, TERF2, TMEM106B, UBE2L3, USP10, WRDR48, YAP1, ZEB2, and ZMYND8.
  • a "DNA methylation test" refers to an assay, which can be commercially available, for distinguishing methylated versus unmethylated cytosine loci in DNA.
  • Techniques for measuring cytosine methylation include bisulfite-based methylation assays. The addition of bisulfite to DNA results in the deamination of unmethylated cytosine and its ultimate conversion to the nucleotide uracil. Uracil has similar binding properties to thymine in the DNA sequence. Previously methylated cytosine does not undergo similar chemical conversion on exposure to bisulfite. Bisulfite assays can thus be used to discriminate previously methylated versus unmethylated cytosine.
  • An exemplary quantitative methylation detection assay, combined bisulfite restriction analysis (COBRA), combines bisulfite treatment with restriction analysis and uses methylation-sensitive restriction endonucleases, gel electrophoresis, and detection based on labeled hybridization probes (Xiong and Laird, Nucleic Acids Res. 1997; 25: 2532-4).
  • Another exemplary detection assay is the methylation-specific polymerase chain reaction (MSPCR) for amplification of DNA segments of interest. This assay can be performed after sodium bisulfite conversion of cytosine and uses methylation-sensitive probes.
  • QM Quantitative Methylation
  • MethyLight™ Quantitative Methylation
  • pyrosequencing can be used to detect marker methylation.
  • Pyrosequencing is a method of DNA sequencing that relies on detection of the release of pyrophosphates as DNA is synthesized (and is therefore a "sequencing by synthesis" technique).
  • a DNA sample can be incubated with sodium bisulfite, converting unmethylated cytosine to uracil. The presence of uracil will result in thymine incorporation during PCR amplification. Therefore, sequencing results that include thymine at a nucleotide position that is known to encode cytosine can be interpreted as unmethylated sites.
  • cytosines present in the sequencing results indicate that the site was methylated in the original DNA sample, because methylation protects cytosine from conversion to uracil upon treatment.
  • Bisulfite treatment can also be performed on control samples with known methylation patterns, to reduce or eliminate false positive results.
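The interpretation rule described above (thymine at a known cytosine position means unmethylated; a retained cytosine means methylated) can be sketched as below. Both functions and their names are hypothetical, and the sketch assumes a read that is already aligned base-for-base to the reference on the same strand, with complete conversion.

```python
def bisulfite_convert(sequence, methylated_positions):
    """Simulate bisulfite treatment followed by PCR: unmethylated C is read as T."""
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(sequence)
    )


def call_methylation(reference, read):
    """Classify each reference cytosine from the base seen in the converted read."""
    calls = {}
    for i, (ref_base, read_base) in enumerate(zip(reference, read)):
        if ref_base != "C":
            continue
        if read_base == "C":
            calls[i] = "methylated"      # protected from conversion
        elif read_base == "T":
            calls[i] = "unmethylated"    # converted to uracil, amplified as thymine
        else:
            calls[i] = "ambiguous"       # sequencing error or true variant
    return calls


reference = "ACGTCCGA"
read = bisulfite_convert(reference, methylated_positions={1})
print(read)                               # ACGTTTGA
print(call_methylation(reference, read))  # {1: 'methylated', 4: 'unmethylated', 5: 'unmethylated'}
```

Running the same calls on a control sample with a known methylation pattern is one way to check for incomplete conversion, in line with the control step mentioned above.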
  • Commercially available pyrosequencing machines include PyroMark Q96 (Qiagen, Hilden, Germany). For more details on methods to use pyrosequencing for measurement of methylation, see Delaney et al. Methods Mol Biol. 2015 1343: 249-264. Pyrosequencing is especially useful for detecting methylation in the CpG sites within genes.
  • a protein marker is detected by contacting a sample with reagents (e.g., antibodies), generating complexes of reagent and marker(s), and detecting the complexes.
  • detecting and measuring protein levels can use methods including agglutination, chemiluminescence, electrochemiluminescence (ECL), enzyme-linked immunoassays (ELISA), immunoassay, immunoblotting, immunodiffusion, immunoelectrophoresis, immunofluorescence, immunohistochemistry, immunoprecipitation, mass spectrometry, and western blot. See also, e.g., E.
  • Read depth refers to the number of times that a specific genomic site is sequenced during a sequencing run.
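Read depth at a site can be computed directly from aligned read coordinates. The sketch below assumes each read is given as a half-open interval `(start, end)` on a single reference sequence; the representation and the `read_depth` name are illustrative choices rather than anything specified in the source.

```python
def read_depth(reads, position):
    """Count the reads whose half-open interval [start, end) covers `position`."""
    return sum(1 for start, end in reads if start <= position < end)


reads = [(0, 100), (50, 150), (120, 200)]
print(read_depth(reads, 60))   # 2
print(read_depth(reads, 110))  # 1
print(read_depth(reads, 250))  # 0
```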

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Medical Informatics (AREA)
  • Theoretical Computer Science (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biotechnology (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Genetics & Genomics (AREA)
  • Bioethics (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Molecular Biology (AREA)
  • Databases & Information Systems (AREA)
  • Analytical Chemistry (AREA)
  • Chemical & Material Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Ecology (AREA)
  • Public Health (AREA)
  • Physiology (AREA)
  • Evolutionary Computation (AREA)
  • Epidemiology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

Techniques for updating genetic variant annotations are described. An example method includes identifying population records of a genetic variant shared by samples of a population of subjects, the population records respectively including indications of the genetic variant and annotations of the genetic variant. The example method further includes determining that a first set of the population records share a consensus annotation. The first set of the population records corresponds to greater than a threshold number of subjects in the population. The example method includes updating a second set of the population records to include the consensus annotation. Further, a report based on the genetic variant and the consensus annotation is generated.
PCT/US2024/047949 2023-09-25 2024-09-23 Updating records based on consensus annotations of genetic variants Pending WO2025072084A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202363540325P 2023-09-25 2023-09-25
US63/540,325 2023-09-25

Publications (1)

Publication Number Publication Date
WO2025072084A1 (fr) 2025-04-03

Family

ID=95202052

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2024/047949 Pending WO2025072084A1 (fr) 2023-09-25 2024-09-23 Updating records based on consensus annotations of genetic variants

Country Status (1)

Country Link
WO (1) WO2025072084A1 (fr)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140088942A1 (en) * 2012-09-27 2014-03-27 Ambry Genetics Molecular genetic diagnostic system
US20150154354A1 (en) * 2011-10-31 2015-06-04 The Scripps Research Institute Systems and Methods for Genomic Annotation and Distributed Variant Interpretation
US20190205502A1 (en) * 2017-09-07 2019-07-04 Regeneron Pharmaceuticals, Inc. Systems and methods for leveraging relatedness in genomic data analysis
US20190390253A1 (en) * 2016-12-22 2019-12-26 Guardant Health, Inc. Methods and systems for analyzing nucleic acid molecules
US20200135296A1 (en) * 2018-10-31 2020-04-30 Ancestry.Com Dna, Llc Estimation of phenotypes using dna, pedigree, and historical data
US20200395100A1 (en) * 2015-10-09 2020-12-17 Guardant Health, Inc. Population based treatment recommender using cell free dna
US20210233664A1 (en) * 2018-10-17 2021-07-29 Tempus Labs Data Based Cancer Research and Treatment Systems and Methods
US20220399087A1 (en) * 2021-06-11 2022-12-15 Rady Children's Hospital Research Center Method and system for improved management of genetic diseases

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
ATZENI ROSSANO, MASSIDDA MATTEO, FOTIA GIORGIO, UVA PAOLO: "VariantAlert: A web‐based tool to notify updates in genetic variant annotations", HUMAN MUTATION, vol. 43, no. 12, 1 December 2022 (2022-12-01), US , pages 1808 - 1815, XP093301265, ISSN: 1059-7794, DOI: 10.1002/humu.24495 *
BRENT S. PEDERSEN, RYAN M. LAYER, AARON R. QUINLAN: "Vcfanno: fast, flexible annotation of genetic variants", GENOME BIOLOGY, vol. 17, no. 1, 1 December 2016 (2016-12-01), XP055374338, DOI: 10.1186/s13059-016-0973-5 *
PAN QI, LIU YUE-JUAN, BAI XUE-FENG, HAN XIAO-LE, JIANG YONG, AI BO, SHI SHAN-SHAN, WANG FAN, XU MING-CONG, WANG YUE-ZHU, ZHAO JUN,: "VARAdb: a comprehensive variation annotation database for human", NUCLEIC ACIDS RESEARCH, vol. 49, no. D1, 8 January 2021 (2021-01-08), GB , pages D1431 - D1444, XP093301264, ISSN: 0305-1048, DOI: 10.1093/nar/gkaa922 *
XIAO CHANG, KAI WANG: "wANNOVAR: annotating genetic variants for personal genomes via the web", JOURNAL OF MEDICAL GENETICS, BMJ PUBLISHING GROUP LTD, ENGLAND, vol. 49, no. 7, 1 July 2012 (2012-07-01), England, pages 433 - 436, XP009562287, ISSN: 0022-2593, DOI: 10.1136/jmedgenet-2012-100918 *

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 24873372; Country of ref document: EP; Kind code of ref document: A1