WO2010143941A1 - Classification and risk-assignment of childhood acute myeloid leukaemia (AML) by gene expression signatures - Google Patents
- Publication number
- WO2010143941A1 (application PCT/NL2009/050334)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- aml
- genes
- subject
- class
- probe sets
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6883—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
- C12Q1/6886—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2411—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/106—Pharmacogenomics, i.e. genetic variability in individual responses to drugs and drug metabolism
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/112—Disease subtyping, staging or classification
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/136—Screening for pharmacological compounds
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/158—Expression markers
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/16—Primer sets for multiplex assays
Definitions
- the present invention is in the field of medicine.
- the invention relates in particular to methods of genetic analysis for the classification, diagnosis and prognosis of leukemia, more especially acute myeloid leukaemia (AML), even more especially pediatric AML.
- the invention relates to nucleic acid expression profiles as obtained from cells of AML patients, which profiles group by similarity into a plurality of distinct and defined clusters that characterize different classes of AML.
- the invention relates to the use of such expression profiles and compositions in diagnosis and therapy of AML and specifically in the prediction of prognostically important AML classes.
- the invention further relates to methods for the diagnosis of AML and for the determination of the prognosis of a subject affected by AML and to kits of parts comprising sets of nucleic acid probes suitable for performing methods of the invention either by means of genomics or proteomics.
- AML is a heterogeneous disease that accounts for 15-20% of the acute leukemias in children (Downing J.R. and Shannon K.M., Cancer Cell 2002;2:437-45).
- Pediatric AML is classified according to the WHO classification, which is based on morphology (FAB M0-M7), immunophenotyping and non-random genetic aberrations (Vardiman J.W. et al., Blood 2002;100:2292-302). Over recent decades the outcome of pediatric AML has improved and current overall survival rates are in the range of 50-70% (Hann I.M. et al., Ann Hematol 2004;83 Suppl 1:S108-12).
- Type-I mutations often reflect molecular mutation hotspots in specific genes (FLT3, C-KIT, NRAS, KRAS, PTPN11 and NF1) involved in the proliferation of the hematopoietic cells (Balgobind B.V. et al., Blood 2008;111:4322-8; Renneville A. et al., Leukemia 2008;22:915-31).
- Type-II mutations are often chromosomal rearrangements of transcription factors leading to impaired differentiation of the hematopoietic cell (t(15;17), t(8;21), inv(16) or MLL-rearrangements). While t(15;17), t(8;21) and inv(16) are considered prognostically favorable subtypes, children with MLL-rearranged AML have an inferior outcome (Raimondi et al., supra).
- cytogenetic aberrations are part of the diagnostic screening.
- Conventional karyotyping can reveal these aberrations, but additionally FISH and RT-PCR are necessary.
- conventional karyotyping sometimes fails, and FISH and RT-PCR can give false-negative results. Therefore, patients could benefit from more sensitive diagnostic methods, such as gene expression microarrays.
- the introduction of genome-wide microarrays that quantify gene expression (mRNA) levels has accelerated the search for new markers of disease and outcome in cancer.
- Classical examples are gene expression profiles suitable to predict stage and prognosis of breast cancer (Van 't Veer, L.J.
- a double-loop cross validation approach generated a highly stable and accurate classifier with high predictive value in the cross-validation set as well as in a totally independent pediatric ALL set (PCT/NL2008/050373; Den Boer, M.L. et al, 2009, Lancet Oncology 10:125-134).
- One of the aims of the present invention is to provide an accurate classification tool for the diagnosis of pediatric AML. It is another aim to classify AML patients in which specific abnormal genotypes have not been found and to distinguish these groups not only from the cytogenetically well-defined AML classes, but also to define new molecular and prognostic subgroups within these unclassified AML types. The presence of additional molecular and prognostic classes in AML, not recognizable with currently available methods, may provide important insights into their pathophysiology. Therefore, it is an aim of the present invention to provide a more complete way of classification and risk-assignment of children with AML.
- gene expression signatures can be used for the classification of children with newly diagnosed AML with high accuracy to predict MLL-gene rearranged AML, t(8;21), inv(16), t(15;17) and t(7;12). Furthermore, the present invention has been able to identify specific gene expression signatures for FLT3-ITD within the subset of cases with t(15;17) and normal cytogenetics. Importantly, a high expression of HOXB-cluster related genes was found in cases with FLT3-ITD and normal cytogenetics.
- the present invention shows that classification by gene expression signatures was able to correctly predict several important cytogenetic and molecular subtypes of pediatric AML. Based on these findings, the present invention now provides in a first aspect a method for producing a classification scheme for childhood acute myeloid leukaemia (AML) comprising the steps of: a) providing a plurality of reference samples, said reference samples comprising cell samples from a plurality of reference subjects suffering from childhood AML, with known class of childhood AML, and optionally known prognosis; b) providing reference profiles by establishing a gene expression profile, matched with parameters for class, and optionally prognosis for each of said reference samples individually; c) constructing a classifier based upon the gene expression profiles of said reference samples, according to a double-loop cross-validation procedure, comprising:
- the classifier construction of said reference profiles is performed based on the information of genes that are differentially-expressed between profiles, and in an even more preferred embodiment of such a method, the classifier construction of said reference profiles is performed on the basis of the information of the genes of table 5, which table is provided hereinbelow.
- the classifier construction and accuracy calculation are performed using the said reference samples by means of a double-loop cross-validation method according to the concept described by Wessels L, Reinders M, Hart A, Veenman C, Dai H, et al. (2005). A protocol for building and evaluating predictors of disease state based on microarray data. Bioinformatics 21: 3755-62.
- the double-loop cross-validation method used herein is described in detail in the Examples below.
- the present invention provides a method for classifying a sample of a subject suffering from childhood AML, comprising the steps of: a) providing a classification scheme for classes of childhood AML according to the method for producing a classification scheme as described above; b) providing a subject profile by establishing a gene expression profile for said subject; c) applying the classifier to the subject profile; d) assigning to said sample the childhood AML class based on the result of step c.
- said gene expression profile comprises the expression parameters of a set of genes represented by at least the first 15 probe sets of Table 5, still more preferably the first 30 probe sets of Table 5 and even more preferably all 75 probe sets of Table 5.
- step d) comprises the steps of: a) isolation of RNA from a sample of said subject; b) preparation of antisense RNA to the RNA of step a); c) hybridisation of said antisense RNA to an oligonucleotide microarray comprising 15 to 75 probe sets as defined in Table 5.
- the present invention provides a method of determining the prognosis for a subject suffering from childhood AML, said method comprising the steps of: a) performing a method for classifying a sample of a subject suffering from childhood AML according to the present invention, and b) assigning to said subject the prognosis corresponding to the established childhood AML class of said subject.
- the present invention provides an oligonucleotide microarray, comprising at least 15 oligonucleotide probes, more preferably at least 30, even more preferably at least 30-75 oligonucleotide probes, capable of hybridizing under stringent conditions to the genes corresponding to the first 15 probe sets, more preferably the first 30, even more preferably the first 30-75 probe sets associated with childhood AML as defined in Table 5.
- the present invention provides a kit-of-parts comprising an oligonucleotide microarray according to the invention and means for comparing a gene expression profile determined by using said microarray with a database of childhood AML reference expression profiles and optionally an instruction for performing a diagnostic or prognostic method of the present invention.
- the present invention provides a method for classifying the AML class of an AML affected subject, where an AML class may include one or more AML cytogenetic subtypes, but each subtype belongs to one class only.
- the said classifying method comprises the steps of: a) providing a classification scheme for AML by producing such a scheme according to the method of the invention; b) providing a subject profile by establishing a gene expression profile for said subject; c) applying the said classification scheme to the said reference profiles, yielding a classifier; and d) applying the said classifier to the said subject profile determining its AML class.
- the present invention provides a method for diagnosing AML class in an AML affected subject comprising: a) providing a classification scheme for AML according to a method of the invention; b) defining class-specific genes for each AML class by selecting those genes of which the expression level characterizes the clustered position of the corresponding AML class among the various AML classes within said scheme; c) determining the level of expression of a sufficient number of said class-specific genes in an AML affected subject; d) establishing whether the level of expression of said class-specific genes in said subject shares sufficient similarity to the level of expression that characterizes an individual AML class to thereby determine the presence of AML corresponding to said class in said subject.
- said class-specific genes may comprise all genes comprised in said gene expression profile as listed in Table 5.
- said class-specific genes comprise a set of at least the first 15 probe sets that represent genes of Table 5, more preferably at least the first 30 probe sets representing genes of Table 5, still more preferably at least the first 50 probe sets of the genes of Table 5.
- said class-specific genes comprise a set of at least 60 probe sets of the genes of Table 5, still more preferably all 75 probe sets of the genes of Table 5.
- at least the first 15 probe sets as listed in Table 5 are used. It is most preferred that the genes are selected from the genes having the highest ranking (i.e.
- the gene set comprises the genes covered by probe sets numbered 1-15 in Table 5, more preferably from genes numbered 1-30 or even more preferably from genes numbered 1-60, or still more preferably the gene set comprising genes covered by all 75 probe sets listed in Table 5.
- the present invention further provides a classification scheme for childhood AML, said scheme comprising a plurality of distinct AML classes that are differentiated on the basis of similarity clustering of gene expression profiles obtained from a plurality of reference subjects affected by AML.
- Said classification scheme is for instance obtainable by a method of the invention for producing such a scheme.
- said classification scheme is obtained by a method involving a support-vector machine applied to the said reference gene expression profiles based on, for instance, gene chip array-acquired values for hybridization intensities for each gene, such as for instance those obtainable by using an Affymetrix gene chip.
- Analysis of gene expression profiles obtained by using such gene chips preferably involves variance-stabilizing normalization (vsn) of all intensity values in order to guarantee comparability between the various arrayed specimens.
- vsn: variance-stabilizing normalization
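As a rough illustration of what variance stabilization does, the sketch below applies an arsinh (generalized-log) transform of the kind the vsn approach is built on. The `offset` and `scale` values are hypothetical placeholders: in practice such calibration parameters are estimated from the data for each array, and this sketch does not perform that estimation.

```python
import math

def arsinh_transform(intensities, offset=0.0, scale=1.0):
    """Apply h(x) = asinh(offset + scale * x) to raw intensities.

    offset and scale are hypothetical per-array calibration
    parameters; a real vsn fit estimates them from the data.
    """
    return [math.asinh(offset + scale * x) for x in intensities]

# Hypothetical raw hybridization intensities for one array.
raw = [0.0, 50.0, 500.0, 5000.0, 50000.0]
stabilized = arsinh_transform(raw, offset=0.0, scale=0.01)

# For large x the transform behaves like log(2 * scale * x);
# near zero it is approximately linear, so it stays finite at x = 0.
for x, h in zip(raw, stabilized):
    print(f"{x:>8.1f} -> {h:.3f}")
```

Unlike a plain logarithm, the transform is defined at zero and for small negative background-corrected values, which is one reason it is preferred for low-intensity probes.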
- For each gene and each class an empirical Bayes linear regression model (or comparable method) is applied and a p-value calculated.
- Per class the genes with the smallest p-values are identified and selected.
- the support-vector machine is applied to the reference samples, using a double-loop cross-validation to determine both the final number of genes that will be considered, and its corresponding accuracy.
- the final classifier is produced by choosing the best class-discriminating genes using all reference samples, using the number of genes determined, and then training it on those samples.
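The gene-selection and double-loop cross-validation steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of the Examples: a nearest-centroid model stands in for the support-vector machine, a simple one-vs-rest separation score stands in for the empirical Bayes p-values, and the data are synthetic. All function names are hypothetical.

```python
import random
from statistics import mean, pstdev

def rank_genes(X, y):
    """Rank gene indices by a one-vs-rest separation score (best first)."""
    scores = []
    for g in range(len(X[0])):
        best = 0.0
        for c in set(y):
            inside = [row[g] for row, lab in zip(X, y) if lab == c]
            outside = [row[g] for row, lab in zip(X, y) if lab != c]
            spread = pstdev(inside) + pstdev(outside) + 1e-9
            best = max(best, abs(mean(inside) - mean(outside)) / spread)
        scores.append(best)
    return sorted(range(len(scores)), key=lambda g: -scores[g])

def train(X, y, genes):
    """Nearest-centroid model (stand-in for an SVM) on the chosen genes."""
    return {c: [mean(r[g] for r, lab in zip(X, y) if lab == c) for g in genes]
            for c in set(y)}

def predict(model, genes, row):
    def dist(c):
        return sum((row[g] - m) ** 2 for g, m in zip(genes, model[c]))
    return min(sorted(model), key=dist)

def folds(n, k):
    return [list(range(i, n, k)) for i in range(k)]

def cv_accuracy(X, y, n_genes, k):
    """Cross-validated accuracy; genes are re-selected inside every fold."""
    hits = 0
    for test in folds(len(X), k):
        tr = [i for i in range(len(X)) if i not in test]
        Xtr, ytr = [X[i] for i in tr], [y[i] for i in tr]
        genes = rank_genes(Xtr, ytr)[:n_genes]
        model = train(Xtr, ytr, genes)
        hits += sum(predict(model, genes, X[i]) == y[i] for i in test)
    return hits / len(X)

def double_loop(X, y, candidates, k_outer=3, k_inner=3):
    """Outer loop estimates accuracy; inner loop picks the gene count."""
    hits = 0
    for test in folds(len(X), k_outer):
        tr = [i for i in range(len(X)) if i not in test]
        Xtr, ytr = [X[i] for i in tr], [y[i] for i in tr]
        # max() returns the first maximum, so ties go to the earliest candidate.
        n = max(candidates, key=lambda c: cv_accuracy(Xtr, ytr, c, k_inner))
        genes = rank_genes(Xtr, ytr)[:n]
        model = train(Xtr, ytr, genes)
        hits += sum(predict(model, genes, X[i]) == y[i] for i in test)
    # Final classifier: choose the gene count and the genes on all samples.
    n_final = max(candidates, key=lambda c: cv_accuracy(X, y, c, k_inner))
    genes = rank_genes(X, y)[:n_final]
    return hits / len(X), genes, train(X, y, genes)

# Synthetic data: genes 0 and 1 separate the classes, genes 2-5 are noise.
rng = random.Random(0)
y = ["A", "B"] * 6
X = [[rng.gauss(5.0 if lab == "B" else 0.0, 0.5) for _ in range(2)]
     + [rng.gauss(0.0, 1.0) for _ in range(4)] for lab in y]
accuracy, genes, model = double_loop(X, y, candidates=[1, 2, 4])
print(accuracy, genes)
```

The point of the two loops is that the test samples of the outer loop never influence either the gene ranking or the choice of gene count, so the reported accuracy is not inflated by selection bias.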
- the present invention further provides genes that are modulated
- genes and the proteins they encode are useful for diagnostic and prognostic purposes, and may also be used as targets for screening therapeutic compounds that modulate AML, such as antibodies, small chemical compounds and antisense oligonucleotides.
- the methods of detecting nucleic acids of the invention or their encoded proteins can be used for a number of purposes. Examples include early detection of AML, detection of subclones of another subtype within AML cases with a different main subtype, monitoring and early detection of relapse following treatment of AML, monitoring response to therapy of AML, determining prognosis of AML, directing therapy of AML, and selection of therapy.
- Other aspects of the invention will become apparent to the skilled artisan by the following description of the invention.
- the present invention provides a method of detecting an AML-associated transcript in one or more cells from a patient, the method comprising contacting a biological sample from the patient with a polynucleotide, such as an oligonucleotide, that selectively hybridizes to a sequence at least 80% identical to a sequence of a gene as shown in Table 5.
- a polynucleotide such as an oligonucleotide
- the polynucleotide selectively hybridizes to a sequence at least 95% identical to a sequence of a gene as shown in Table 5.
- the polynucleotide comprises a sequence of a gene as shown in Table 5.
- the biological sample used in such methods of detection is a tissue sample or body liquids (e.g. peripheral blood, cerebrospinal fluid).
- the biological sample comprises isolated nucleic acids, e.g., mRNA.
- the polynucleotide is labeled, e.g., with a fluorescent label.
- the polynucleotide is immobilized on a solid surface.
- Figure 1 Hierarchical clustering of cytogenetic subgroups of pediatric AML by gene expression profiles.
- Heat map shows which probe sets/genes are relatively over-expressed (in red, light grey) and which probe sets/genes are relatively under-expressed (in green, dark grey) compared to the mean expression of all probe sets (see color legend).
- FLT3-ITD negative and positive cases are indicated by a number 0 and 1, respectively at the Y-axis.
- classifying is used in its art-recognized meaning and thus refers to arranging or ordering items, i.e. gene expression profiles, by classes or categories or dividing them into logically hierarchical classes and sub-classes based on the characteristics they have in common and/or that distinguish them.
- classifying refers to assigning, to a class or kind, an unclassified item.
- a "class" then being a grouping of items, based on one or more characteristics, attributes, properties, qualities, effects, parameters, etc., which they have in common, for the purpose of classifying them according to an established system or scheme.
- classification scheme is used in its art-recognized meaning and thus refers to a list of classes arranged according to a set of pre-established principles, for the purpose of organizing items in a collection or into groups based on their similarities and differences.
- classifier refers to a collection of items jointly defining the method. This collection is composed of: 1. the mathematical model used to predict the AML class using gene expression values as input; 2. the way reporters, such as expression levels of genes, are ordered when being selected for inclusion in the classifier; 3. the way such reporters are eliminated from the reporter set, which defines the basis for the classifier.
- clustering refers to the activity of collecting, assembling and/or uniting into a cluster or clusters items with the same or similar elements, a "cluster" referring to a group or number of the same or similar items, i.e. gene expression profiles, gathered or occurring closely together based on similarity of characteristics. "Clustered" indicates an item has been subjected to clustering.
- the term "clustered position" refers to the location of an individual item, i.e. a gene expression profile, in amongst a number of clusters, said location being determined by clustering said item with at least a number of items from known clusters.
- the process of clustering used in a method of the present invention may be any mathematical process known to compare items for similarity in characteristics, attributes, properties, qualities, effects, parameters, etc.
- Statistical analysis such as for instance multivariate analysis, or other methods of analysis may be used.
- methods of analysis such as self-organising maps, hierarchical clustering, multidimensional scaling, principal component analysis, supervised learning, k-nearest neighbours, support vector machines, discriminant analysis, partial least square methods and/or Pearson's correlation coefficient analysis are used.
- Pearson's correlation coefficient analysis, significance analysis of microarrays (SAM) and/or prediction analysis of microarrays (PAM) are used to cluster gene expression profiles according to similarity.
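To make the idea of similarity clustering concrete, the following sketch groups a handful of expression profiles by average-linkage hierarchical clustering, using 1 minus Pearson's correlation coefficient as the distance. The sample names and expression values are invented for illustration, and the choice of average linkage is an assumption rather than something the text specifies.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length profiles."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(va * vb)

def cluster_profiles(profiles, n_clusters):
    """Average-linkage agglomerative clustering on 1 - Pearson distance."""
    names = list(profiles)
    dist = {(p, q): 1.0 - pearson(profiles[p], profiles[q])
            for p in names for q in names}
    clusters = [[name] for name in names]

    def linkage(c1, c2):
        return sum(dist[p, q] for p in c1 for q in c2) / (len(c1) * len(c2))

    # Repeatedly merge the two closest clusters until n_clusters remain.
    while len(clusters) > n_clusters:
        pairs = [(i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]

# Hypothetical profiles: five expression values per sample.
profiles = {
    "s1": [1, 2, 3, 4, 5],
    "s2": [2, 3, 4, 5, 7],
    "s3": [5, 4, 3, 2, 1],
    "s4": [6, 5, 3, 2, 2],
}
result = cluster_profiles(profiles, n_clusters=2)
print(result)  # [['s1', 's2'], ['s3', 's4']]
```

Correlation-based distance groups profiles by the shape of their expression pattern rather than by absolute intensity, which is why s1 and s2 cluster together despite differing in level.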
- the present invention now provides several methods to accurately identify known as well as newly discovered diagnostically, prognostically and therapeutically relevant subgroups of pediatric AML.
- the basis of these methods resides in the measurement of (AML-specific) gene expression in subjects suffering from leukemia.
- the methods and compositions of the invention thus provide tools useful in choosing a therapy for leukemia patients, including methods for assigning a leukemia patient to an AML class or cluster, methods of choosing a therapy for an AML patient, and methods of determining the survival prognosis for an AML patient.
- the methods of the invention comprise in various aspects the steps of establishing a gene expression profile of subject samples, for instance of reference subjects suffering from leukemia or of a subject diagnosed or classified as having AML.
- the expression profiles of the present invention are generated from samples from subjects affected by AML, including subjects having AML, subjects suspected of having AML, subjects having a propensity to develop AML, or subjects who have previously had AML, or subjects undergoing therapy for AML.
- the samples from the subject used to generate the expression profiles of the present invention can be derived from a variety of sources including, but not limited to, single cells, a collection of cells, tissue, cell culture, bone marrow, blood, or other bodily fluids.
- the tissue or cell source may include a tissue biopsy sample, a cell sorted population, cell culture, or a single cell.
- Sources for the sample of the present invention include cells from peripheral blood or bone marrow, such as blast cells from peripheral blood or bone marrow.
- Samples may comprise at least 20%, at least 30%, at least 40%, at least 50%, at least 55%, at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, or at least 95% cells having differential expression in AML classes, with a preference for samples having a high percentage of such cells.
- these cells are blast cells, such as leukemic cells.
- the percentage of a sample that constitutes blast cells may be determined by methods well known in the art; see, for example, the methods described in WO 03/083140.
- Gene expression profiling or “expression profiling” is used herein in its art-recognised meaning and refers to a method for measuring the transcriptional state (mRNA) or the translational state (protein) of a plurality of genes in a cell. Depending on the method used, such measurements may involve the genome-wide assessment of gene expression, but also the measurement of the expression level of selected genes, resulting in the establishment of a “gene expression profile” or “expression profile”, which terms are used in that meaning hereinbelow.
- an "expression profile" comprises one or more values corresponding to a measurement of the relative abundance of a gene expression product. Such values may include measurements of RNA levels or protein abundance.
- the expression profile can comprise values representing the measurement of the transcriptional state or the translational state of the gene.
- the transcriptional state of a sample includes the identities and relative abundance of the RNA species, especially mRNAs present in the sample. Preferably, a substantial fraction of all constituent RNA species in the sample are measured, but at least a sufficient fraction to characterize the transcriptional state of the sample is measured.
- the transcriptional state can be conveniently determined by measuring transcript abundance by any of several existing gene expression technologies.
- Translational state includes the identities and relative abundance of the constituent protein species in the sample. As is known to those of skill in the art, the transcriptional state and translational state are often related.
- Each value in the expression profiles as determined and embodied in the present invention is a measurement representing the absolute or the relative expression level of a differentially-expressed gene.
- the expression levels of these genes may be determined by any method known in the art for assessing the expression level of an RNA or protein molecule in a sample.
- expression levels of RNA may be monitored using a membrane blot (such as used in hybridization analysis such as Northern blot, and the like), or microwells, sample tubes, gels, beads or fibres (or any solid support comprising bound nucleic acids). See U.S. Patent Nos. 5,770,722, 5,874,219, 5,744,305, 5,677,195 and 5,445,934, to which explicit reference is made.
- the gene expression monitoring system may also comprise nucleic acid probes in solution.
- microarrays are used to measure the values to be included in the expression profiles. Microarrays are particularly well suited for this purpose because of the reproducibility between different experiments. DNA microarrays provide one method for the simultaneous measurement of the expression levels of large numbers of genes. Each array consists of a reproducible pattern of capture probes attached to a solid support. Labeled RNA or DNA is hybridized to complementary probes on the array and then detected by laser scanning. Hybridization intensities for each probe on the array are determined and converted to a quantitative value representing relative gene expression levels. See, the Experimental section. See also, U.S. Pat. Nos.
- High-density oligonucleotide arrays are particularly useful for determining the gene expression profile for a large number of RNAs in a sample.
- RNA isolated from the sample is converted to labeled cRNA and then hybridized to an oligonucleotide array. Each sample is hybridized to a separate array. Relative transcript levels are calculated by reference to appropriate controls present on the array and in the sample. See, for example, the Experimental section.
- the values in the expression profile are obtained by measuring the abundance of the protein products of the differentially-expressed genes.
- the abundance of these protein products can be determined, for example, using antibodies specific for the protein products of the differentially-expressed genes.
- antibody refers to an immunoglobulin molecule or immunologically active portion thereof, i.e., an antigen-binding portion.
- immunologically active portions of immunoglobulin molecules include F(ab) and F(ab')2 fragments which can be generated by treating the antibody with an enzyme such as pepsin.
- the antibody can be a polyclonal, monoclonal, recombinant, e.g., a chimeric or humanized, fully human, non-human, e.g., murine, or single chain antibody. In a preferred embodiment it has effector function and can fix complement.
- the antibody can be coupled to a toxin or imaging agent.
- a full-length protein product from a differentially-expressed gene, or an antigenic peptide fragment of the protein product can be used as an immunogen.
- Preferred epitopes encompassed by the antigenic peptide are regions of the protein product of the differentially-expressed gene that are located on the surface of the protein, e.g., hydrophilic regions, as well as regions with high antigenicity.
- the antibody can be used to detect the protein product of the differentially-expressed gene in order to evaluate the abundance and pattern of expression of the protein.
- These antibodies can also be used diagnostically to monitor protein levels in tissue as part of a clinical testing procedure, e.g., to determine the efficacy of a given therapy. Detection can be facilitated by coupling (i.e., physically linking) the antibody to a detectable substance (i.e., antibody labeling).
- detectable substances include various enzymes, prosthetic groups, fluorescent materials, luminescent materials, bioluminescent materials, and radioactive materials.
- suitable enzymes include horseradish peroxidase, alkaline phosphatase, β-galactosidase, or acetylcholinesterase; examples of suitable prosthetic group complexes include streptavidin/biotin and avidin/biotin; examples of suitable fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride, quantum dots or phycoerythrin; an example of a luminescent material includes luminol; examples of bioluminescent materials include luciferase, luciferin, and aequorin, and examples of suitable radioactive material include ¹²⁵I, ¹³¹I, ³⁵S or ³H.
- the subject profile is compared to the reference profile to determine whether the subject expression profile is sufficiently similar to the reference profile.
- the subject expression profile is compared to a plurality of reference expression profiles to select the reference expression profile that is most similar to the subject expression profile. Any method known in the art for comparing two or more data sets to detect similarity between them may be used to compare the subject expression profile to the reference expression profiles.
- the subject expression profile and the reference profile are compared using a supervised learning algorithm such as the support vector machine (SVM) algorithm, prediction by collective likelihood of emerging patterns (PCL) algorithm, the k-nearest neighbour algorithm, or the Artificial Neural Network algorithm.
- SVM: support vector machine
- PCL: prediction by collective likelihood of emerging patterns
- a subject expression profile shows "statistically significant similarity" or "sufficient similarity" to a reference profile
- statistical tests may be performed to determine whether the similarity between the subject expression profile and the reference expression profile is likely to have been achieved by a random event. Any statistical test that can calculate the likelihood that the similarity between the subject expression profile and the reference profile results from a random event can be used.
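One test of this kind can be sketched as a permutation test. The example below assumes Pearson correlation as the similarity measure and shuffles the gene order of the subject profile to estimate how often a similarity at least as high arises by chance. The profile values, the number of permutations and the choice of Pearson correlation are all illustrative assumptions; the text does not prescribe a particular test.

```python
import random
import statistics


def pearson(a, b):
    """Pearson correlation between two equal-length profiles."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = (sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b)) ** 0.5
    return num / den


def permutation_p_value(subject, reference, n_perm=2000, seed=0):
    """Estimate the probability that a random gene assignment is at least as similar."""
    rng = random.Random(seed)
    observed = pearson(subject, reference)
    shuffled = list(subject)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pearson(shuffled, reference) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical expression values for eight probe sets.
reference = [8.0, 2.1, 5.4, 6.7, 3.2, 9.1, 4.4, 7.3]
subject = [7.6, 2.5, 5.0, 6.9, 3.6, 8.8, 4.9, 7.0]
observed, p_value = permutation_p_value(subject, reference)
print(round(observed, 3), p_value)
```

A small estimated p-value indicates that the observed similarity is unlikely to result from a random event.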
- the method used for comparing a subject expression profile to one or more reference profiles is preferably carried out by applying the classifier, constructed as herein described, on the new subject expression profile. Also, in order to identify the AML class reference profile that is most similar to the subject expression profile, as performed in the methods for establishing the AML class of a subject having leukemia, i.e.
- profiles are clustered according to similarity and it is determined whether the subject profile corresponds to a known class of reference profiles.
- this method is used wherein the clustered position of the subject profile, obtained after performing the clustering analysis of the present invention, is compared to any known AML class. If the clustered position of the subject profile is within a cluster of reference profiles, i.e. forms a cluster therewith after performing the similarity clustering method, it is said that the AML of the subject corresponds to the AML class of reference profiles.
- the expression profiles comprise values representing the expression levels of genes that are differentially-expressed in AML classes.
- differentially-expressed as used herein means that the measured expression levels of a particular gene have a p-value less than 0.05, when comparing said expression levels of patients in a known AML class to said expression levels of patients in remaining AML classes.
- p-value refers to a value obtained from any statistical test that can be used to compare two independent groups of values, such as for example the Student's t-test, the Wilcoxon test and empirical-Bayes linear regression.
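As a sketch of how such a p-value might be obtained, the example below implements a two-sided Wilcoxon rank-sum test using the normal approximation. It is a simplified version: it uses midranks for ties but applies neither a tie correction to the variance nor a continuity correction, and the expression values are hypothetical.

```python
from statistics import NormalDist


def midranks(values):
    """Return 1-based ranks, averaging the ranks of tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def rank_sum_p_value(group_a, group_b):
    """Two-sided Wilcoxon rank-sum p-value (normal approximation)."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = midranks(list(group_a) + list(group_b))
    u = sum(ranks[:n_a]) - n_a * (n_a + 1) / 2
    mean_u = n_a * n_b / 2
    sd_u = (n_a * n_b * (n_a + n_b + 1) / 12) ** 0.5
    z = (u - mean_u) / sd_u
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical expression levels of one gene in one AML class versus the remaining classes.
in_class = [9.1, 8.7, 9.4, 8.9, 9.0, 9.3]
remaining = [5.2, 6.1, 5.8, 6.4, 5.5, 6.0, 5.9, 6.2]
p_value = rank_sum_p_value(in_class, remaining)
differentially_expressed = p_value < 0.05
print(p_value, differentially_expressed)
```

When many genes are tested at once, a multiple-testing adjustment would normally be applied before the 0.05 threshold is used; that step is omitted here.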
- the expression level may be up-regulated or down-regulated in a sample from a subject in comparison with a sample from a normal blood or bone marrow sample, or in comparison with the mean or median of all AML patients.
- the present invention provides groups of genes that are differentially-expressed in samples of patients in different AML classes. Values representing the expression levels of the nucleic acid molecules detected by the probes were analyzed as described in the Experimental section using the statistical package R (version 2.3.1) and its packages vsn, e1071, globaltest, limma, multtest and marray (R Development Core Team. R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria, ISBN 3-900051-07-0, http://www.R-project.org (2006)). Probe sets that were most discriminative for the AML classes were used to perform a hierarchical clustering of patients using GeneMaths 2.0 software (Applied Maths, Sint-Martens-Latem, Belgium).
- the present invention thus provides a method of classifying AML subtypes. Using this method, an independent group of 80 AML samples could be accurately classified. A similar method has been described in PCT/NL2008/050373 for the classification of pediatric ALL subtypes, which for the method as applied serves as a priority document.
- the present invention thus provides a comprehensive classification of AML covering previously identified and defined classes. Further analysis of the minimum number of genes that defined or predicted these classes resulted in the establishment of class-specific genes or signature genes.
- the methods of the present invention comprise in some aspects the step of defining class-specific genes by selecting those genes of which the expression level characterizes the position of the corresponding AML class within a classification scheme of the present invention.
- the methods of the present invention comprise in some aspects the step of establishing whether the level of expression of class-predictive genes in a subject shares sufficient similarity to the level of expression that is characteristic for an individual AML class. This step is necessary in determining the presence of that particular AML class in a subject under investigation, in which case the expression of that gene is used as a predictive marker. Whether the level of expression of class-predictive genes in a subject shares sufficient similarity to the level of expression of that particular gene in an individual AML class is preferably determined by the classifier function used, which is preferably the support vector machine.
- the present invention also reveals gene expression profiles comprising values representing the expression levels of genes in the various identified AML classes.
- these expression profiles comprise the values representing the differential expression levels.
- the expression profiles of the invention comprise one or more values representing the expression level of a gene typical of a subject belonging to a defined AML class.
- Each expression profile contains a sufficient number of values such that the profile can be used to distinguish subjects belonging to different AML classes, and thus also distinguishing subjects with different prognosis.
- the expression profile comprises more than one or two values corresponding to a differentially-expressed gene, for example at least 3 values, at least 4 values, at least 5 values, at least 6 values, at least 7 values, at least 8 values, at least 9 values, at least 10 values, at least 11 values, at least 12 values, at least 13 values, at least 14 values, at least 15 values, at least 16 values, at least 17 values, at least 18 values, at least 19 values, at least 20 values, at least 22 values, at least 25 values, at least 27 values, at least 30 values, at least 35 values, at least 40 values, at least 45 values, at least 50 values, or at least 75 values.
- the diagnostic accuracy of assigning a subject to an AML class will vary based on the number of values contained in the expression profile. Generally, the number of values contained in the expression profile is selected such that the diagnostic accuracy is at least 85%, at least 87%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99%, as calculated using methods described elsewhere herein, with an obvious preference for higher percentages of diagnostic accuracy.
- the diagnostic accuracy of assigning a subject to an AML class will vary based on the strength of the correlation between the expression levels of the differentially-expressed genes within that specific AML class.
- where the values in the expression profiles represent the expression levels of genes whose expression is strongly correlated with that specific AML class, it may be possible to use fewer values (genes) in the expression profile and still obtain an acceptable level of diagnostic or prognostic accuracy.
- the strength of the correlation between the expression level of a differentially-expressed gene and a specific AML class may be determined by a statistical test of significance.
- the empirical Bayes linear model used to select genes in some embodiments of the present invention assigns a standardized fold-change value to each differentially-expressed gene, indicating the strength of the correlation of the expression of that gene to a specific AML class.
- the Wilcoxon statistic metric and Student's t-test both provide a value or score indicative of the strength of the correlation between the expression of the gene and its specific AML class. These scores may be used to select the genes of which the expression levels have the greatest correlation with a particular AML class to increase the diagnostic or prognostic accuracy of the methods of the invention, or in order to reduce the number of values contained in the expression profile while maintaining the diagnostic or prognostic accuracy of the expression profile.
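Score-based selection of this kind can be sketched as follows. The example computes a pooled-variance Student's t-statistic per probe set for one class against all other samples and keeps the probe sets with the largest absolute scores. The probe set names, expression values and the cut-off of two probe sets are hypothetical and serve only to show the mechanics.

```python
import statistics


def student_t(in_class, others):
    """Pooled-variance Student's t-statistic for two independent groups."""
    n1, n2 = len(in_class), len(others)
    m1, m2 = statistics.fmean(in_class), statistics.fmean(others)
    v1, v2 = statistics.variance(in_class), statistics.variance(others)
    pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    return (m1 - m2) / (pooled * (1 / n1 + 1 / n2)) ** 0.5


def top_probe_sets(expression, in_class_samples, k):
    """Rank probe sets by |t| for one class versus the rest and return the top k."""
    scores = {}
    for probe, values in expression.items():
        inside = [v for i, v in enumerate(values) if i in in_class_samples]
        outside = [v for i, v in enumerate(values) if i not in in_class_samples]
        scores[probe] = abs(student_t(inside, outside))
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical data: seven samples, of which samples 0-2 belong to the class of interest.
expression = {
    "probe_1": [9.0, 9.2, 8.8, 4.1, 4.0, 3.9, 4.2],  # strongly up in the class
    "probe_2": [5.0, 5.3, 4.8, 5.1, 4.9, 5.2, 5.0],  # unchanged
    "probe_3": [2.0, 2.4, 1.9, 6.0, 6.5, 5.8, 6.1],  # strongly down in the class
    "probe_4": [7.0, 6.0, 8.0, 6.5, 7.5, 7.0, 6.8],  # noisy, unchanged
}
selected = top_probe_sets(expression, {0, 1, 2}, k=2)
print(selected)
```

Using the absolute value of the score keeps both up-regulated and down-regulated probe sets as candidates for the reduced profile.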
- a database is kept wherein the expression profiles of reference subjects are collected and to which database new profiles can be added and classified using the already existing classifier such as to provide the predicted class of said new profile.
- new profiles added to the database may be used together with already existing profiles to re-build the said classifier, so as to improve the diagnostic and prognostic accuracy of the methods of the invention.
- the methods of the invention involve the classification of a subject affected by pediatric AML into one AML class, comprising the steps of: 1. providing an expression profile from a sample from a subject affected by pediatric AML; 2. calculating the distance of this subject expression profile to each one of the AML classes with known prognoses, or classes with known response to therapy; and 3. assigning the subject expression profile to the AML class to which it is closest.
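The three steps above can be sketched as a nearest-centroid assignment. This is a minimal sketch that assumes each AML class is summarized by the mean of its reference profiles and that distance is Euclidean; the text does not fix either choice, and the class names and values are hypothetical.

```python
import math
from statistics import fmean


def centroid(profiles):
    """Mean expression value per gene across the reference profiles of one class."""
    return [fmean(column) for column in zip(*profiles)]


def assign_class(subject, class_profiles):
    """Return the closest class and the distance of the subject to every class."""
    distances = {
        name: math.dist(subject, centroid(profiles))
        for name, profiles in class_profiles.items()
    }
    closest = min(distances, key=distances.get)
    return closest, distances


# Step 1: a hypothetical subject expression profile (two probe sets).
subject = [7.5, 2.8]

# Hypothetical reference profiles for two classes with known prognosis.
class_profiles = {
    "class_A": [[8.0, 2.0], [8.4, 2.2]],
    "class_B": [[3.0, 7.0], [3.4, 6.6]],
}

# Steps 2 and 3: compute the distances and assign the closest class.
assigned, distances = assign_class(subject, class_profiles)
print(assigned, distances)
```

The prognosis or therapy response recorded for the assigned class can then be looked up for the subject.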
- the prognosis of a subject affected by pediatric AML can be predicted by determining to which AML class with an established prognosis, such as a good prognosis or a bad prognosis, the expression profile from the subject is closest. Whenever a subject's expression profile can be classified into one of the AML classes, a preferred intervention strategy, or therapeutic treatment can then be proposed for said subject, and said subject can be treated according to said assigned strategy. As a result, treatment of a subject with AML can be optimized according to the identified class.
- the assignment of a subject affected by AML to an AML class is used in a method of choosing a therapy for the subject affected by AML.
- a therapy refers to a course of treatment intended to reduce or eliminate the effects or symptoms of a disease, in this case AML.
- a therapy regime will typically comprise, but is not limited to, a prescribed dosage of one or more drugs or hematopoietic stem cell transplantation. Therapies, ideally, will be beneficial and reduce the disease state but in many instances the effect of a therapy will have non-desirable effects as well.
- the present invention provides a method of determining the prognosis for an AML patient, said method comprising the steps of providing a classification scheme for AML by producing such a scheme according to a method of the invention and determining the prognosis for each AML class in said scheme based on clinical records for the AML subjects comprised in said class.
- the present invention provides for the assignment of the various clinical data recorded to reference subjects affected by AML. This assignment preferably occurs in a database. This has the advantage that once a new subject is identified as belonging to a particular AML class, then the prognosis that is assigned to that class may be assigned to that subject.
- compositions that are useful in determining the gene expression profile for a subject affected by AML and selecting a reference profile that is similar to the subject expression profile.
- These compositions include arrays comprising a substrate having capture probes that can bind specifically to nucleic acid molecules that are differentially-expressed in AML classes.
- a computer-readable medium having digitally encoded reference profiles useful in the methods of the claimed invention.
- the present invention provides arrays comprising capture probes for detection of polynucleotides (transcriptional state) or for detection of proteins (translational state) in order to detect differentially-expressed genes of the invention.
- by "array" is intended a solid support or substrate with peptide or nucleic acid probes attached to said support or substrate.
- Arrays typically comprise a plurality of different nucleic acid or peptide capture probes that are coupled to a surface of a substrate in different, known locations. These arrays, also described as "microarrays" or colloquially "chips", have been generally described in the art, and reference is made to U.S. Patent Nos. 5,143,854, 5,445,934, 5,744,305, 5,677,195, 6,040,193, 5,424,186, 6,329,143, and 6,309,831 and Fodor et al. (1991) Science 251:767-77. These arrays may generally be produced using mechanical synthesis methods or light directed synthesis methods which incorporate a combination of photolithographic methods and solid phase synthesis methods. Typically, "oligonucleotide microarrays" will be used for determining the transcriptional state, whereas "peptide microarrays" will be used for determining the translational state of a cell.
- "Nucleic acid" or "oligonucleotide" or "polynucleotide" or grammatical equivalents used herein means at least two nucleotides covalently linked together. Oligonucleotides are typically from about 5, 6, 7, 8, 9, 10, 12, 15, 25, 30, 40, 50 or more nucleotides in length, up to about 100 nucleotides in length. Nucleic acids and polynucleotides are polymers of any length, including longer lengths, e.g., 200, 300, 500, 1000, 2000, 3000, 5000, 7000, 10,000, etc.
- a nucleic acid of the present invention will generally contain phosphodiester bonds, although in some cases, nucleic acid analogues are included that may have alternate backbones, comprising, e.g., phosphoramidate, phosphorothioate, phosphorodithioate, or O-methylphosphoroamidite linkages (see Eckstein, Oligonucleotides and Analogues: A Practical Approach, Oxford University Press); and peptide nucleic acid backbones and linkages.
- Other analogous nucleic acids include those with positive backbones; non-ionic backbones, and non-ribose backbones, including those described in U.S. Pat. Nos. 5,235,033 and 5,034,506, and
- nucleic acids containing one or more carbocyclic sugars are also included within one definition of nucleic acids. Modifications of the ribose-phosphate backbone may be done for a variety of reasons, e.g. to increase the stability and half-life of such molecules in physiological environments or as probes on a biochip. Mixtures of naturally occurring nucleic acids and analogues can be made; alternatively, mixtures of different nucleic acid analogues, and mixtures of naturally occurring nucleic acids and analogues may be made.
- PNA: peptide nucleic acids
- These backbones are substantially non-ionic under neutral conditions, in contrast to the highly charged phosphodiester backbone of naturally occurring nucleic acids. This results in two advantages.
- the PNA backbone exhibits improved hybridization kinetics. PNAs have larger changes in the melting temperature (Tm) for mismatched versus perfectly matched base pairs. DNA and RNA typically exhibit a 2-4 °C drop in Tm for an internal mismatch. With the non-ionic PNA backbone, the drop is closer to 7-9 °C.
- Tm: melting temperature
- hybridization of the bases attached to these backbones is relatively insensitive to salt concentration.
- PNAs are not degraded by cellular enzymes, and thus can be more stable.
- the nucleic acids may be single stranded or double stranded, as specified, or contain portions of both double stranded or single stranded sequence.
- the depiction of a single strand also defines the sequence of the complementary strand; thus the sequences described herein also provide the complement of the sequence.
- the nucleic acid may be DNA, both genomic and cDNA, RNA or a hybrid, where the nucleic acid may contain combinations of deoxyribo- and ribo-nucleotides, and combinations of bases, including uracil, adenine, thymine, cytosine, guanine, inosine, xanthine, hypoxanthine, isocytosine, isoguanine, etc.
- Transcript typically refers to a naturally occurring RNA, e.g., a pre-mRNA, hnRNA, or mRNA.
- nucleoside includes nucleotides and nucleoside and nucleotide analogues, and modified nucleosides such as amino modified nucleosides.
- nucleoside includes non-natural occurring analogue structures. Thus, e.g. the individual units of a peptide nucleic acid, each containing a base, are referred to herein as a nucleoside.
- nucleic acid probe or oligonucleotide is defined as a nucleic acid capable of binding to a target nucleic acid of complementary sequence through one or more types of chemical bonds, usually through complementary base pairing, usually through hydrogen bond formation.
- a probe may include natural (i.e., A, G, C, or T) or modified bases (7-deazaguanosine, inosine, etc.).
- the bases in a probe may be joined by a linkage other than a phosphodiester bond, so long as it does not functionally interfere with hybridization.
- probes may be peptide nucleic acids in which the constituent bases are joined by peptide bonds rather than phosphodiester linkages. It will be understood by one of skill in the art that probes may bind target sequences lacking complete complementarity with the probe sequence depending upon the stringency of the hybridization conditions.
- the probes are preferably directly labeled such as with isotopes, chromophores, lumiphores, chromogens, or indirectly labeled such as with biotin to which a streptavidin complex may later bind or with enzymatic labels.
- By assaying for the hybridization of the probe to its target nucleic acid sequence one can detect the presence or absence of the select sequence or subsequence. Diagnosis or prognosis may be based at the genomic level, or at the level of RNA or protein expression.
- oligonucleotide probes that can be used in diagnostic methods of the present invention.
- such probes are immobilised on a solid surface as to form an oligonucleotide microarray of the invention.
- the oligonucleotide probes useful in methods of the present invention are capable of hybridizing under stringent conditions to AML-associated nucleic acids, such as to one or more of the genes selected from Table 5.
- arrays may be fabricated on a surface of virtually any shape or even a multiplicity of surfaces.
- Arrays may be peptides or nucleic acids on beads, gels, polymeric surfaces, fibers such as fiber optics, glass or any other appropriate substrate, for the purpose of which reference is made to U.S. Pat. Nos. 5,770,358, 5,789,162,
- arrays may be packaged in such a manner as to allow for diagnostics or other manipulation of an all-inclusive device. Reference is for example made to U.S. Pat. Nos. 5,856,174 and 5,922,591.
- the arrays provided by the present invention comprise capture probes that can specifically bind a nucleic acid molecule that is differentially-expressed in AML classes. These arrays can be used to measure the expression levels of nucleic acid molecules to thereby create an expression profile for use in methods of determining the therapeutic treatment and prognosis for AML patients.
- each capture probe in the array detects a nucleic acid molecule selected from the nucleic acid molecules of the probe sets designated in Table 5.
- the designated nucleic acid molecules include those differentially-expressed between AML classes.
- the arrays of the invention comprise a substrate having a plurality of addresses, where each address has a capture probe that can specifically bind a target nucleic acid molecule.
- the number of addresses on the substrate varies with the purpose for which the array is intended.
- the arrays may be low-density arrays or high-density arrays and may contain 4 or more, 8 or more, 12 or more, 16 or more, 20 or more, 24 or more, 32 or more, 48 or more, 64 or more, 72 or more, 80 or more, or 96 or more addresses, or 192 or more, 288 or more, 384 or more, 768 or more, 1536 or more, 3072 or more, 6144 or more, 9216 or more, 12288 or more, 15360 or more, or 18432 or more addresses.
- the substrate has no more than 12, 24, 48, 96, 192, or 384 addresses, no more than 500, 600, 700, 800, or 900 addresses, or no more than 1000, 1200, 1600, 2400, or 3600 addresses.
- the array comprises no more than 500 unique addresses, more preferably no more than 100 unique addresses. If an array is used to also determine other diseases, such as, for instance pediatric ALL (see PCT/NL2008/050373), the array would of course contain more addresses.
- the invention also provides a computer-readable medium comprising one or more digitally encoded expression profiles, where each profile has one or more values representing the expression of a gene that is differentially-expressed in an AML class.
- the preparation and use of such profiles is well within the reach of the skilled person (see e.g. WO 03/083140).
- the digitally-encoded expression profiles are comprised in a database. See, for example, U.S. Patent No. 6,308,170.
- kits useful for predicting the responsiveness to therapy or the survival chances in subjects affected by pediatric AML comprise an array and a computer readable medium.
- the array comprises a substrate having addresses, where each address has a capture probe that can specifically bind a nucleic acid molecule (by using an oligonucleotide array) or a peptide (by using a peptide array) that is differentially-expressed in an AML class.
- the results are converted into a computer-readable medium that has digitally-encoded expression profiles containing values representing the expression level of a nucleic acid molecule detected by the array.
- the amounts of various kinds of nucleic acid molecules contained in a nucleic acid sample can be simultaneously determined.
- mRNA in the sample is labelled, or labelled cDNA is prepared by using mRNA as a template, and the labelled mRNA or cDNA is subjected to hybridization with the array, so that mRNAs being expressed in the sample are simultaneously detected, whereby their expression levels can be determined.
- Genes whose expression is altered due to AML can be found by determining the expression levels of various genes in blood or bone marrow samples of patients classified into certain types as described above, and comparing those expression levels with the expression level in a control tissue.
- the method for determining the expression levels of genes is not particularly limited, and any of techniques for confirming alterations of the gene expressions mentioned above can be suitably used.
- the method using the array is especially preferable because the expressions of a large number of genes can be simultaneously determined. Suitable arrays are commercially available, e.g., from Affymetrix.
- mRNA is prepared from blood or bone marrow, and then reverse transcription is carried out with the resulting mRNA as a template.
- labelled cDNA can be obtained by using, for instance, any suitable labelled primers or labelled nucleotides.
- as the labelling substance, there can be used substances such as radioisotopes, fluorescent substances, chemiluminescent substances, substances with a fluorophore, and the like.
- the fluorescent substance includes Cy2, FluorX, Cy3, Cy3.5, Cy5, Cy5.5, Cy7, fluorescein isothiocyanate (FITC), Texas Red, Rhodamine and the like.
- samples to be tested (cancer samples to be tested in the present selection method) and a sample to be used as a control are each labelled with different fluorescent substances, using two or more fluorescent substances, from the viewpoint of enabling simultaneous detection.
- labelling of the samples is carried out by labelling mRNA in the samples, cDNA derived from the mRNA, or nucleic acids produced by transcription or amplification from cDNA.
- the hybridization is carried out between the above-mentioned labelled cDNA and the array to which a nucleic acid corresponding to a suitable gene or its fragment is immobilized.
- the hybridization may be performed according to any known processes under conditions that are appropriate for the array and the labelled cDNA to be used. For instance, the hybridization can be performed under the conditions described in Molecular Cloning, A laboratory manual, 2nd ed., 9.52-9.55 (1989).
- the hybridization between the nucleic acids derived from the samples and the array is carried out, under the above-mentioned hybridization conditions.
- the degradation of mRNA may take place due to actions of ribonuclease.
- the mRNA levels in both of these samples are adjusted using a standard gene with relatively little alterations in expressions.
- genes exhibiting differential expression levels in both samples can be detected.
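The adjustment to a standard gene described above can be illustrated with a short calculation. In this sketch, the signal of each gene is divided by the signal of a reference gene in the same sample, and the normalized test and control values are then compared as log2 ratios. The gene names, the signal intensities and the use of a single reference gene named "REF" are assumptions made for illustration only.

```python
import math


def normalize_to_reference(signals, reference_gene):
    """Divide every signal by the signal of the reference (standard) gene."""
    ref = signals[reference_gene]
    return {gene: value / ref for gene, value in signals.items()}


def log2_ratios(test_signals, control_signals, reference_gene):
    """Log2 ratio of normalized test signal to normalized control signal per gene."""
    test = normalize_to_reference(test_signals, reference_gene)
    control = normalize_to_reference(control_signals, reference_gene)
    return {
        gene: math.log2(test[gene] / control[gene])
        for gene in test
        if gene != reference_gene
    }


# Hypothetical raw signal intensities; the test sample yielded twice as much mRNA overall.
test_signals = {"REF": 2000.0, "gene_x": 8000.0, "gene_y": 500.0}
control_signals = {"REF": 1000.0, "gene_x": 1000.0, "gene_y": 250.0}

ratios = log2_ratios(test_signals, control_signals, "REF")
print(ratios)
```

After normalization, gene_y no longer appears changed, because its doubled raw signal only reflected the higher overall mRNA level in the test sample, while gene_x remains four-fold higher.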
- a signal which is appropriate depending upon the method of labelling used is detected for the array which is subjected to hybridization with the nucleic acid sample labelled by the method as described above, whereby the expression levels in the samples to be tested can be compared with the expression level in the control sample for each of the genes on the array.
- genes thus obtained which have a significant difference in signal intensities are genes whose expression is altered specifically for certain pediatric AML classes.
- the present invention also provides a computer-readable medium comprising a plurality of digitally-encoded expression profiles wherein each profile of the plurality has a plurality of values, each value representing the expression of a gene that is differentially-expressed in an AML class.
- the invention also provides for the storage and retrieval of a collection of data relating to AML specific gene expression data of the present invention, including sequences and expression levels in a computer data storage apparatus, which can include magnetic disks, optical disks, magneto-optical disks, DRAM, SRAM, SGRAM, SDRAM, RDRAM, DDR RAM, magnetic bubble memory devices, and other data storage devices, including CPU registers and on-CPU data storage arrays.
- the data records are stored as a bit pattern in an array of magnetic domains on a magnetizable medium or as an array of charge states or transistor gate states, such as an array of cells in a DRAM device (e.g., each cell comprised of a transistor and a charge storage area, which may be on the transistor).
- kits are also provided by the invention.
- such kits may include any or all of the following: assay reagents, buffers, AML class-specific nucleic acids or antibodies, hybridization probes and/or primers, antisense polynucleotides, ribozymes, arrays, antibodies, Fab fragments, capture peptides etc.
- the kits may include instructional materials containing directions (i.e., protocols) for the practice of the methods of this invention. While the instructional materials typically comprise written or printed materials, they are not limited to such. Any medium capable of storing such instructions and communicating them to an end user is contemplated by this invention.
- Such media include, but are not limited to electronic storage media (e.g., magnetic discs, tapes, cartridges, chips), optical media (e.g., CD ROM), and the like.
- Such media may include addresses to internet sites that provide such instructional materials.
- One such internet site may provide a database of AML reference expression profiles useful for performing similarity clustering of a newly determined subject expression profile with a large set of reference profiles of AML subjects comprised in said database.
- the database includes clinically relevant data such as patient prognosis, effects of methods of treatment and other characteristics relating to the AML patient.
- kits comprising an array of the invention and a computer-readable medium having digitally-encoded reference profiles with values representing the expression of nucleic acid molecules detected by the arrays. These kits are useful for assigning a pediatric AML patient to an AML class.
- kits for screening for modulators of AML-associated sequences can be prepared from readily available materials and reagents.
- kits can comprise one or more of the following materials: an AML-associated polypeptide or polynucleotide, reaction tubes, and instructions for testing AML-associated activity.
- the kit may comprise an array for detecting AML-associated genes, specifically cluster-defining genes according to the invention.
- a wide variety of kits and components can be prepared according to the present invention, depending upon the intended user of the kit and the particular needs of the user.
- a kit-of-parts comprises an oligonucleotide microarray according to the invention and means for comparing a gene expression profile determined by using said microarray with a database of AML reference expression profiles.
- the present invention also comprises kits of parts suitable for performing a method of the invention as well as the use of the various products of the invention, including databases, microarrays, oligonucleotide probes and classification schemes in diagnostic or prognostic methods of the invention.
- the methods and compositions of the invention may be used to screen test compounds to identify therapeutic compounds useful for the treatment of childhood AML.
- the test compounds are screened in a sample comprising primary cells or a cell line representative of a particular AML class.
- the expression levels in the sample of one or more of the differentially-expressed genes of the invention are measured using methods described elsewhere herein. Values representing the expression levels of the differentially-expressed genes are used to generate a subject expression profile.
- This subject expression profile is then compared to a reference profile associated with the AML class represented by the sample to determine the similarity between the subject expression profile and the reference expression profile. Differences between the subject expression profile and the reference expression profile may be used to determine whether the test compound has anti-leukemogenic activity.
- test compounds of the present invention can be obtained using any of the numerous approaches in combinatorial library methods known in the art, including: biological libraries; spatially addressable parallel solid phase or solution phase libraries; synthetic library methods requiring deconvolution; the 'one-bead one-compound' library method; and synthetic library methods using affinity chromatography selection.
- biological libraries are limited to polypeptide libraries, while the other four approaches are applicable to polypeptide, non-peptide oligomer or small molecule libraries of compounds (Lam (1997) Anticancer Drug Res. 12:145).
- Candidate compounds include, for example, 1) peptides such as soluble peptides, including Ig-tailed fusion peptides and members of random peptide libraries (see, e.g., Lam et al. (1991) Nature 354:82-84; Houghten et al.
- antibodies e.g., polyclonal, monoclonal, humanized, anti-idiotypic, chimeric, and single chain antibodies as well as Fab, F(ab')2, Fab expression library fragments, and epitope binding fragments of antibodies
- small organic and inorganic molecules e.g., molecules obtained from combinatorial and natural product libraries; 5) zinc analogs; 6) leukotriene A4 and derivatives; 7) classical aminopeptidase inhibitors and derivatives of such inhibitors, such as bestatin and arphamenine A and B and derivatives; 8) and artificial peptide substrates and other substrates, such as those disclosed herein above and derivatives thereof.
- the present invention discloses a number of genes that are differentially-expressed in AML classes. These differentially-expressed genes are shown in Table 5. Because the expression of these genes is associated with AML risk factors, these genes may play a role in leukemogenesis. Accordingly, these genes and their gene products are potential therapeutic targets that are useful in methods of screening test compounds to identify therapeutic compounds for the treatment of AML.
- the differentially-expressed genes of the invention may be used in cell-based screening assays involving recombinant host cells expressing the differentially-expressed gene product.
- the recombinant host cells are then screened to identify compounds that can activate the product of the differentially-expressed gene (i.e. agonists) or inactivate the product of the differentially-expressed gene (i.e. antagonists).
- any of the leukemogenic functions mediated by the product of the differentially-expressed gene may be used as an endpoint in the screening assay for identifying therapeutic compounds for the treatment of AML.
- Such endpoint assays include assays for cell proliferation, assays for modulation of the cell cycle, assays for the expression of markers indicative of AML, and assays for the expression level of genes differentially-expressed in AML classes as described above.
- Modulators of the activity of a product of a differentially-expressed gene identified according to the drug-screening assays provided above can be used to treat a subject with AML. These methods of treatment include the step of administering the modulators of the activity of a product of a differentially-expressed gene, in a pharmaceutical composition as described herein, to a subject in need of such treatment.
- This invention shows that expression profiling can identify transcripts associated with cytogenetic and molecular aberrations, therapeutic response and survival after diagnosis in patients suffering from childhood AML. As described above this knowledge can be used to identify patient classes with a high likelihood to respond to treatment and patient classes with favorable prognosis as well as a high likelihood for not responding to treatment and having an unfavorable outcome.
- MATERIALS and METHODS. Patients: Viably frozen bone marrow or peripheral blood samples from 237 random pediatric AML patients were included in this study. Informed consent was obtained according to local law and regulations, in accordance with the Declaration of Helsinki. Leukemic cells were isolated and enriched from these samples as previously described (Den Boer M.L. et al., J Clin Oncol 2003;21:3262-8). All resulting samples contained 80% or more leukemic cells, as determined morphologically by cytospins stained with May-Grünwald-Giemsa (Merck, Darmstadt, Germany).
- Leukemic cells were used for DNA and RNA extraction, and a minimum of 5 × 10^6 leukemic cells were lysed in Trizol reagent (Gibco BRL, Life Technologies, Breda, the Netherlands). Genomic DNA and total cellular RNA were isolated according to the manufacturer's protocol, with minor modifications (Van Vlierberghe P. et al., Blood 2008;111:4668-80).
- Cytogenetics: Leukemic samples were routinely investigated for cytogenetic aberrations by standard chromosome-banding analysis, and screened for recurrent non-random genetic aberrations characteristic of AML, including t(15;17), inv(16), t(8;21) and MLL-rearrangements, using RT-PCR and/or fluorescent in situ hybridization (FISH). Screening was done by each study group or, where data were lacking, in our laboratory. In addition, all patients under the age of 18 months were screened for t(7;12) by FISH.
- the probes used were five cosmid clones covering the breakpoints in the ETV6 gene and a PAC clone (RP5-1121A15) containing the HLXB9 gene (von Bergh A.R. et al., Genes Chromosomes Cancer 2006;45:731-9).
- a probe in the serpinB2 gene was used as external control (Balgobind et al, submitted).
- the patients' samples were analyzed with MLPA according to the manufacturer's protocol.
- Data were analyzed using GeneMarker v1.5 (Softgenetics, State College, USA).
- The integrity of total RNA was checked using the Agilent 2100 Bioanalyzer (Agilent, Santa Clara, USA). cDNA and biotinylated cRNA were synthesized, hybridized and processed on the Affymetrix Human Genome U133 Plus 2.0 Array (Affymetrix, Santa Clara, USA) according to the manufacturer's guidelines. Data acquisition was performed using expresso (Bioconductor package affy) and probe-set intensities were normalized using variance stabilization normalization (Bioconductor package vsn) in the statistical data analysis environment R, version 2.2.0.
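The normalization itself was run in R with the vsn package. As a rough illustration only, the Python sketch below shows the kind of arsinh ("generalized log") transform that variance stabilization is built on. The offset `a` and scale `b` are hypothetical placeholders: the real package estimates calibration parameters from the array data, which this sketch does not attempt.

```python
import math


def glog(intensity, a=0.0, b=1.0):
    """Arsinh transform of one probe intensity.

    a (offset) and b (scale) are illustrative assumptions, not values
    from the study; vsn fits them to the data.
    """
    return math.asinh(a + b * intensity)


def normalize(intensities, a=0.0, b=1.0):
    """Apply the transform to a list of raw probe-set intensities."""
    return [glog(x, a, b) for x in intensities]
```

Near zero the transform is approximately linear, and for large intensities it approaches a logarithm (`asinh(x)` tends to `log(2x)`), which is why it tames the intensity-dependent variance of microarray signals.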
- A double-loop cross-validation approach was used to build the classifier on the training set of 157 samples, as done by Den Boer et al. Briefly, in a first stage the minimum number of probe sets for the classifier was estimated via cross-validation (inner loop) within 2/3 of the samples. This split was repeated 100 times at random, with the remaining 1/3 of the samples set aside for the second stage.
- First-stage cross-validation involved creating subsets of approximately 9/10 and 1/10 of the samples. This subset division was also repeated 100 times at random.
- The performance of such a classifier was then estimated (outer loop) on the remaining 1/3 of the samples, yielding an estimate of its prediction accuracy.
- After the minimum number of probe sets was estimated and the performance of the classifier was assessed, we selected the final list of discriminant probe sets and built the classifier on all 157 samples in the training cohort. This classifier was then applied to the independent validation cohort, yielding independent measures of performance.
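The two nested loops can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the nearest-centroid classifier, the between/within-class variance score used to rank probe sets, and all function names are stand-ins chosen for brevity, whereas the text describes a backward selection procedure whose details are not given here.

```python
import random
from statistics import mean


def rank_features(X, y):
    """Order feature indices by a between/within-class variance ratio (assumed score)."""
    labels = sorted(set(y))
    scores = []
    for f in range(len(X[0])):
        groups = [[x[f] for x, lab in zip(X, y) if lab == g] for g in labels]
        means = [mean(g) for g in groups]
        grand = mean(x[f] for x in X)
        between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
        within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
        scores.append(between / (within + 1e-9))
    return sorted(range(len(X[0])), key=lambda f: -scores[f])


def accuracy(X_tr, y_tr, X_te, y_te, k):
    """Fit a nearest-centroid classifier on the top-k features; score it on the test part."""
    feats = rank_features(X_tr, y_tr)[:k]
    cents = {}
    for label in set(y_tr):
        rows = [x for x, lab in zip(X_tr, y_tr) if lab == label]
        cents[label] = [mean(r[f] for r in rows) for f in feats]

    def predict(x):
        def dist(c):
            return sum((x[f] - c[i]) ** 2 for i, f in enumerate(feats))
        return min(cents, key=lambda lab: dist(cents[lab]))

    return sum(predict(x) == lab for x, lab in zip(X_te, y_te)) / len(y_te)


def split(idx, frac, rng):
    """Randomly split a list of indices into a `frac` part and the remainder."""
    idx = idx[:]
    rng.shuffle(idx)
    cut = round(len(idx) * frac)
    return idx[:cut], idx[cut:]


def double_loop_cv(X, y, sizes, n_outer=100, n_inner=100, seed=0):
    """Return the mean outer-loop accuracy and the feature count chosen per outer split."""
    rng = random.Random(seed)

    def take(rows, idx):
        return [rows[i] for i in idx]

    outer_acc, chosen = [], []
    for _ in range(n_outer):
        # Outer loop: 2/3 of the samples for model building, 1/3 held out.
        train, test = split(list(range(len(y))), 2 / 3, rng)
        inner = {k: [] for k in sizes}
        for _ in range(n_inner):
            # Inner loop: 9/10 vs 1/10 split within the 2/3 part.
            fit, held = split(train, 0.9, rng)
            for k in sizes:
                inner[k].append(accuracy(take(X, fit), take(y, fit),
                                         take(X, held), take(y, held), k))
        best = max(mean(v) for v in inner.values())
        k_min = min(k for k in sizes if mean(inner[k]) >= best)
        chosen.append(k_min)
        outer_acc.append(accuracy(take(X, train), take(y, train),
                                  take(X, test), take(y, test), k_min))
    return mean(outer_acc), chosen
```

Calling `double_loop_cv(X, y, sizes=[5, 10, 15])` on a list of expression vectors `X` with subtype labels `y` returns the estimated accuracy together with the feature count chosen in each outer split. Keeping the held-out third out of the feature-count selection is what prevents the accuracy estimate from being optimistically biased.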
- An empirical Bayes linear regression model was also used to estimate the effect of molecular aberrations controlling for the cytogenetic subgroups.
- The sensitivity to classify patients in each subgroup is defined as the number of cases correctly classified out of the total number of true positive cases in each subgroup, whereas the specificity is defined as the number of negative cases that were correctly predicted out of the total number of true negative cases for that subtype.
- The prediction accuracy is defined as the number of correctly classified negative and positive cases out of the total number of cases in each cohort.
- The positive predictive value (PPV) is defined as the proportion of positive test cases that is also true positive for the predicted subtype.
- The negative predictive value (NPV) is defined as the proportion of negative test cases that is also true negative for the predicted subtype.
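These five definitions translate directly into ratios of confusion-matrix counts. The Python sketch below is illustrative: the function name is an assumption, and the counts in the usage note are hypothetical, not taken from Table 3.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Diagnostic test values for one subtype from confusion-matrix counts.

    tp/fn: cases that truly have the subtype, predicted positive/negative.
    tn/fp: cases that truly lack the subtype, predicted negative/positive.
    """
    return {
        "sensitivity": tp / (tp + fn),   # correctly classified / all true positives
        "specificity": tn / (tn + fp),   # correctly predicted negatives / all true negatives
        "ppv": tp / (tp + fp),           # true positives among positive test results
        "npv": tn / (tn + fn),           # true negatives among negative test results
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }
```

For example, with hypothetical counts `diagnostic_metrics(tp=39, fp=0, tn=40, fn=1)`, a single false negative lowers sensitivity to 39/40 and NPV to 40/41 while specificity and PPV stay at 1.0.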
- The classifier was constructed by selecting the most statistically significantly discriminative probe sets for each of the 5 cytogenetic subtypes: MLL-rearranged, t(8;21), inv(16), t(15;17) and t(7;12)-positive AML.
- A double-loop cross-validation approach was used that also included a backward selection procedure to keep the number of probe sets needed for the most accurate classification to a minimum, in order to avoid over-fitting of data, as previously published [Den Boer et al., Lancet Oncology 2009; Wessels et al., Bioinformatics 2005].
- the minimum number needed for highest predictive accuracy was determined to be 75 probe sets, i.e. 15 probe sets per cytogenetic subtype.
- the true accuracy of the classifier was tested in the independent validation cohort of 80 patients.
- The true sensitivity, specificity, positive predictive value, negative predictive value and accuracy in this validation cohort were 98%, 100%, 100%, 97% and 99%, respectively (Table 3B).
- Only one MLL-rearranged AML case was misclassified as AML-other.
- Hierarchical cluster analysis also illustrated the discriminative value of the selected probe sets in the independent validation cohort (Figure 1B). So far, only patients at initial diagnosis of AML were included.
- the classifier was also suitable to predict the subtype of relapsed and secondary AML cases. All 9 MLL-rearranged cases were correctly predicted by our classifier as well as by hierarchical clustering.
- Double-loop cross-validation for 8 subtypes revealed a median sensitivity and accuracy of 77% and 87%, respectively.
- This drop in the predictive value of gene expression signatures was mainly caused by misclassified NPM1-mutated and MLL-PTD cases.
- 7/8 cases with an NPM1 mutation, 4/6 cases with a CEBPα mutation and all 3 MLL-PTD cases were misclassified.
- An internal tandem duplication in FLT3 (FLT3-ITD) and mutations in C-KIT and in genes involved in the RAS-pathway (NRAS, KRAS and PTPN11) were observed in 44% of all cases.
- For FLT3-ITD, a large number of discriminative probe sets were found (Table 4). However, since FLT3-ITD is often found in t(15;17)-positive cases, many probe sets were mainly discriminative for t(15;17) instead of being specific for FLT3-ITD.
- We applied the empirical Bayes linear regression model while adjusting for cytogenetic subtype.
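To see why the adjustment matters, the Python sketch below compares a naive mutated-versus-wild-type difference with the ordinary least-squares effect from a model that gives each cytogenetic subtype its own intercept. It is a simplified stand-in: it omits the empirical Bayes variance moderation used in the study, and the function names and toy values are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def naive_effect(expr, mutated):
    """Mean expression of mutated cases minus mean of wild-type cases."""
    pos = [y for y, x in zip(expr, mutated) if x == 1]
    neg = [y for y, x in zip(expr, mutated) if x == 0]
    return mean(pos) - mean(neg)


def adjusted_effect(expr, mutated, subtype):
    """OLS slope for a 0/1 mutation indicator with one intercept per subtype.

    Computed by centering expression and mutation status within each
    subtype. Raises ZeroDivisionError if no subtype contains both
    mutated and wild-type cases.
    """
    groups = defaultdict(list)
    for y, x, s in zip(expr, mutated, subtype):
        groups[s].append((x, y))
    num = den = 0.0
    for pairs in groups.values():
        x_bar = mean(x for x, _ in pairs)
        y_bar = mean(y for _, y in pairs)
        num += sum((x - x_bar) * (y - y_bar) for x, y in pairs)
        den += sum((x - x_bar) ** 2 for x, _ in pairs)
    return num / den


# Toy probe set whose level depends only on subtype, not on FLT3-ITD status.
expr = [10, 10, 10, 10, 2, 2, 2, 2]
flt3_itd = [1, 1, 1, 0, 0, 0, 0, 1]
subtype = ["t(15;17)"] * 4 + ["other"] * 4
```

With these toy values the naive difference is 4 although FLT3-ITD has no effect within either subtype, and the adjusted effect is 0. This is the confounding pattern described above for probe sets that track t(15;17) rather than FLT3-ITD.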
- Cytogenetic aberrations have prognostic value in pediatric AML and, hence, the genetic subtypes are used as risk stratification in most current pediatric AML treatment protocols.
- Current diagnostic strategies, including conventional karyotyping, FISH and/or RT-PCR, cannot always identify these specific aberrations quickly and accurately.
- microarray-based gene expression profiling was explored as a diagnostic tool.
- a gene expression signature of 75 probe sets predicted the most important cytogenetic aberrations in an independent pediatric AML cohort with 99% accuracy and a positive and negative predictive value of 100% and 97%, respectively.
- Unique gene-expression signatures were found for FLT3-ITD AML, which differ between cytogenetic subtypes.
- The detected signature should have a high positive predictive value (PPV) and negative predictive value (NPV).
- The sensitivity and specificity of such a signature should be assessed in an independent and representative cohort, since the absence of a validation cohort can easily result in over-interpretation of data (Michiels et al., Lancet 2005;365:488-92).
- the predicted subgroups are associated with hematopoietic lineages, i.e. t(8;21) with FAB M2, t(15;17) with FAB M3 and inv(16) with M4eo.
- FAB M2 cases without t(8;21) were misclassified as t(8;21)-positive. This was also the case for M3 without t(15;17) and M4 without inv(16).
- The 75 probe sets included probe sets for genes involved in the specific translocations, e.g. the probe sets for RUNX1T1 were highly discriminative for t(8;21), those for MYH11 for inv(16) and the probe set for HLXB9 for t(7;12).
- MLL-rearranged AML forms a heterogeneous group in which the MLL gene can fuse to many different translocation partners, and different MLL-fusion genes have a different impact on clinical outcome (Meyer et al., Leukemia 2006;20:777-84).
- The fusion products resulting from these different partners may each have different biological effects and hence may affect gene expression levels.
- The probe sets selected in this study did not reveal that MLL-rearranged cases cluster together based on the fusion partner.
- Most of the probe sets discriminative for MLL-rearrangements have not previously been linked to this aberration.
- 6/15 probe sets were located in non-coding regions, of which 4 were in a specific region on chromosome 10. This is of interest, since these regions can no longer be considered junk DNA, but are probably involved in the regulation of other genes, e.g. as miRNAs.
- the cytogenetic subgroups that could be correctly predicted in our study comprise 50% of all AML patients in our cohort.
- the remaining patients were characterized by normal cytogenetics (CN-AML), other cytogenetic aberrations or cytogenetic failures.
- This subgroup has been further characterized by recurrent molecular aberrations, i.e. NPM1, CEBPα and MLL-PTD.
- The heterogeneity of pediatric AML is further illustrated by molecular aberrations detected in different cytogenetic subgroups, i.e. FLT3-ITD, C-KIT and mutations in the RAS-pathway.
- The FLT3-ITD subgroup could be predicted using gene expression signatures that differ between FLT3-ITD-positive cases with CN-AML and those with t(15;17)-positive AML.
- the same mutations can play different roles in the leukemogenesis of pediatric AML.
- The genes of the HOXB cluster were over-expressed in all patients with FLT3-ITD and CN-AML, but not in those with t(15;17).
- HOXB genes were also identified as discriminating genes for adult FLT3-ITD-positive AML patients without further detailed study of the underlying cytogenetic subtype of these cases (Verhaak et al., Haematologica 2009; 94:131-4).
- HOXB upregulation has been correlated with NPM1 mutations in CN-AML (Mullighan et al., Leukemia 2007;21:2000-9). Based on the current findings, HOXB overexpression is not restricted to NPM1-mutated cases but is also found in patients with FLT3-ITD and CN-AML.
- A specific gene expression signature consisting of 75 probe sets could accurately identify 5 cytogenetic subgroups in pediatric AML. All remaining patients (covering ~50% of all AML cases) are predicted as AML-other cases, which in fact reflects a highly heterogeneous subtype. It remains to be determined whether underlying (to be discovered) genetic aberrations in this subset of AML will result in distinct gene expression patterns that can be used for classification.
- The major cytogenetic subtypes can be accurately predicted by a gene expression profile generated by one single assay. If used as a new diagnostic tool, classification by gene expression profiling could reduce the number of current diagnostic procedures (cytomorphology, FISH, RT-PCR, karyotyping) by at least 50%. To do so, prospective studies are needed that determine the feasibility of obtaining gene expression profiles in clinical practice and that compare the results with current diagnostic strategies. This knowledge is a prerequisite before applying gene expression profiling as a novel diagnostic tool.
- Table 3 Diagnostic test values for the classification of pediatric AML by gene expression profile consisting of 75 probe sets
- AML-other: CN-AML, AML with other abnormalities or unknown cytogenetics; no specific probe sets were selected for this category
- Table 5a and Table 5b List of the 75 probe sets (split up for convenience; top 15 probe sets of Table 5a correspond to a preferred embodiment of the invention).
- Table 5 provides probe set identifiers from Affymetrix HGU133 plus 2.0 microarray chip representing reporters used in the classifier. Genes mapped by the probe sets are also given. Probe sets are ordered according to importance in the classifier. Probe sets numbered 1 up to and including 15 are essential for the classifier to perform minimally.
- Table 5a Probe set identifiers from Affymetrix HGU133 plus 2.0 microarray chip representing reporters used in the classifier. Listed are identifiers 1-15.
Abstract
The present invention relates to genetic analysis methods for the classification, diagnosis and prognosis of acute myeloid leukemia (AML). The invention provides a method for producing a classification scheme for childhood acute myeloid leukemia (AML), comprising the steps of: a) obtaining a plurality of reference samples, said reference samples comprising cell samples from a plurality of reference subjects suffering from childhood AML, whose childhood AML class is known and whose prognosis is optionally known; b) obtaining reference profiles by establishing a gene expression profile, correlated with class parameters and, optionally, a prognosis, for each of said reference samples individually; and c) constructing a classifier based on the gene expression profiles of said reference samples, in accordance with a double-loop cross-validation protocol.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/NL2009/050334 WO2010143941A1 (fr) | 2009-06-12 | 2009-06-12 | Classification and risk assessment of childhood acute myeloid leukemia (AML) by gene expression signatures |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2010143941A1 true WO2010143941A1 (fr) | 2010-12-16 |
Family
ID=43309047
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/NL2009/050334 Ceased WO2010143941A1 (fr) | 2009-06-12 | 2009-06-12 | Classification and risk assessment of childhood acute myeloid leukemia (AML) by gene expression signatures |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2010143941A1 (fr) |
Cited By (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2012156515A1 (fr) | 2011-05-18 | 2012-11-22 | Rheinische Friedrich-Wilhelms-Universität Bonn | Molecular analysis of acute myeloid leukemia |
| CN109657731A (zh) * | 2018-12-28 | 2019-04-19 | 长沙理工大学 | An anti-interference classification method for a droplet digital PCR instrument |
| CN120277548A (zh) * | 2025-06-11 | 2025-07-08 | 之江实验室 | A method for subtyping acute myeloid leukemia based on prototype contrastive learning |
-
2009
- 2009-06-12 WO PCT/NL2009/050334 patent/WO2010143941A1/fr not_active Ceased
Non-Patent Citations (6)
| Title |
|---|
| BALGOBIND ET AL.: "Identification of gene expression signature accurately predicting cytogenetic subtypes in pediatric acute myeloid leukemia", 50TH ANNUAL MEETING AND EXPOSITION, ONLINE PROGRAMS AND ABSTRACTS, 10 January 2009 (2009-01-10), pages 1 - 2, XP002545345, Retrieved from the Internet <URL:http://ash.confex.com/ash/2008/webprogram/Paper6686.html> [retrieved on 20090908] * |
| DEN BOER M L ET AL: "A subtype of childhood acute lymphoblastic leukaemia with poor treatment outcome: a genome-wide classification study", LANCET ONCOLOGY, LANCET PUBLISHING GROUP, LONDON, GB, vol. 10, no. 2, 1 February 2009 (2009-02-01), pages 125 - 134, XP025940624, ISSN: 1470-2045, [retrieved on 20090201] * |
| ROSS MARY E ET AL: "Gene expression profiling of pediatric acute myelogenous leukemia.", BLOOD 1 DEC 2004, vol. 104, no. 12, 1 December 2004 (2004-12-01), pages 3679 - 3687, XP002545344, ISSN: 0006-4971 * |
| TAN AIK CHOON ET AL: "Simple decision rules for classifying human cancers from gene expression profiles", BIOINFORMATICS (OXFORD), vol. 21, no. 20, October 2005 (2005-10-01), pages 3896 - 3904, XP002545348, ISSN: 1367-4803 * |
| WESSELS LODEWYK F A ET AL: "A protocol for building and evaluating predictors of disease state based on microarray data", BIOINFORMATICS (OXFORD), vol. 21, no. 19, October 2005 (2005-10-01), pages 3755 - 3762, XP002517267, ISSN: 1367-4803 * |
| YAGI TOMOHITO ET AL: "Identification of a gene expression signature associated with pediatric AML prognosis.", BLOOD 1 SEP 2003, vol. 102, no. 5, 1 September 2003 (2003-09-01), pages 1849 - 1856, XP002545343, ISSN: 0006-4971 * |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| JP4980878B2 (ja) | Classification, diagnosis, and prognosis of acute myeloid leukemia by gene expression profiling | |
| US11795496B2 (en) | Epigenetic chromosome interactions | |
| CA2585561C (fr) | ESR1, PGR, BCL2 and SCUBE2 group score as indicators of breast cancer prognosis and prediction of treatment response | |
| CA2585571C (fr) | Predicting response to chemotherapy using gene expression markers | |
| JP6404304B2 (ja) | Prognosis prediction for melanoma cancer | |
| WO2020146554A2 (fr) | Genomic profiling similarity | |
| US20160060704A1 (en) | Methods and Compositions for Diagnosis of Glioblastoma or a Subtype Thereof | |
| AU2021265878A1 (en) | Immunotherapy response signature | |
| WO2004097051A2 (fr) | Methods and apparatus for diagnosing AML and MDS | |
| CA2736124A1 (fr) | Pathways underlying pancreatic tumorigenesis and a hereditary pancreatic cancer gene | |
| JP2010502227A (ja) | Method for predicting distant metastasis of lymph node-negative primary breast cancer using gene expression analysis of biological pathways | |
| US9890430B2 (en) | Copy number aberration driven endocrine response gene signature | |
| JP2011509689A (ja) | Molecular staging and prognosis of stage II and III colon cancer | |
| EP2419540B1 (fr) | Methods and gene expression signature for assessing RAS pathway activity | |
| JP2007508812A (ja) | Materials and methods relating to breast cancer classification | |
| CA2959670A1 (fr) | Compositions, methods and kits for the diagnosis of a gastroenteropancreatic neuroendocrine neoplasm | |
| CA2912445A1 (fr) | Methods for predicting the risk of recurrence of node-positive early-stage breast cancer | |
| JP2008545399A (ja) | Leukemia disease genes and uses thereof | |
| CA2753971C (fr) | Accelerated progression relapse test | |
| US20190024184A1 (en) | Distinguishing metastatic-lethal prostate cancer from indolent prostate cancer using methylation status of epigenetic markers | |
| WO2010143941A1 (fr) | Classification and risk assessment of childhood acute myeloid leukemia (AML) by gene expression signatures | |
| CN117355616A (zh) | DNA methylation biomarkers for hepatocellular carcinoma | |
| US20050287541A1 (en) | Microarray for predicting the prognosis of neuroblastoma and method for predicting the prognosis of neuroblastoma | |
| CN118222713A (zh) | Use of biomarkers in detecting glioma-associated TLS | |
| US20120264633A1 (en) | Methods for detecting thrombocytosis using biomarkers |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 09788191; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 09788191; Country of ref document: EP; Kind code of ref document: A1 |