CN114678122A - A cancer risk prediction method, system, device and medium - Google Patents
- Publication number
- CN114678122A (application CN202210149664.9A)
- Authority
- CN
- China
- Prior art keywords
- cancer risk
- risk prediction
- cancer
- plasma sample
- detection data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/30—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N1/00—Sampling; Preparing specimens for investigation
- G01N1/28—Preparing specimens for investigation including physical details of (bio-)chemical methods covered elsewhere, e.g. G01N33/50, C12Q
- G01N1/34—Purifying; Cleaning
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N1/00—Sampling; Preparing specimens for investigation
- G01N1/28—Preparing specimens for investigation including physical details of (bio-)chemical methods covered elsewhere, e.g. G01N33/50, C12Q
- G01N1/40—Concentrating samples
- G01N1/4077—Concentrating samples by other techniques involving separation of suspended solids
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N21/00—Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
- G01N21/62—Systems in which the material investigated is excited whereby it emits light or causes a change in wavelength of the incident light
- G01N21/63—Systems in which the material investigated is excited whereby it emits light or causes a change in wavelength of the incident light optically excited
- G01N21/65—Raman scattering
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/50—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for simulation or modelling of medical disorders
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N1/00—Sampling; Preparing specimens for investigation
- G01N1/28—Preparing specimens for investigation including physical details of (bio-)chemical methods covered elsewhere, e.g. G01N33/50, C12Q
- G01N1/40—Concentrating samples
- G01N1/4077—Concentrating samples by other techniques involving separation of suspended solids
- G01N2001/4088—Concentrating samples by other techniques involving separation of suspended solids filtration
Landscapes
- Health & Medical Sciences (AREA)
- Engineering & Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- Pathology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- General Physics & Mathematics (AREA)
- Medical Informatics (AREA)
- Public Health (AREA)
- Chemical & Material Sciences (AREA)
- Analytical Chemistry (AREA)
- Immunology (AREA)
- Data Mining & Analysis (AREA)
- Biochemistry (AREA)
- Molecular Biology (AREA)
- Databases & Information Systems (AREA)
- Epidemiology (AREA)
- Primary Health Care (AREA)
- Theoretical Computer Science (AREA)
- Computational Linguistics (AREA)
- Biophysics (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
- Investigating Or Analysing Biological Materials (AREA)
Abstract
The invention relates to a cancer risk prediction method, system, device and medium. The method comprises obtaining Raman detection data of a plasma sample, and inputting the data into a pre-trained cancer risk prediction model based on a BP neural network to obtain a prediction result. Combining Raman detection data with an artificial intelligence algorithm to build the cancer risk prediction model greatly shortens the detection time: a cancer risk prediction result can be obtained in about 15 minutes and used to decide whether follow-up precise testing is needed. The detection method is simple to operate, detects multiple substances in a single test, is low in cost, and generates little medical waste. The prediction method has high specificity, high sensitivity and high accuracy. Rapid, large-scale, high-throughput detection of small-molecule metabolites can be achieved at relatively low cost. The method can achieve a better prediction effect if combined with clinical test results.
Description
Technical Field
The present invention relates to the technical field of medical data management, and in particular, to a cancer risk prediction method, system, computer device, and computer-readable storage medium.
Background
A common feature of cancer cell metabolism is the ability to acquire necessary nutrients from an often nutrient-poor environment and to use these nutrients to maintain viability and build new biomass. Alterations in intracellular and extracellular metabolites that accompany cancer-related metabolic reprogramming have profound effects on gene expression, cell differentiation, and the tumor microenvironment. From a metabolite perspective, cancer-associated metabolic changes can be summarized as six hallmarks: (1) deregulated uptake of glucose and amino acids; (2) use of opportunistic modes of nutrient acquisition; (3) use of glycolysis/TCA cycle intermediates for biosynthesis and NADPH production; (4) increased demand for nitrogen; (5) alterations in metabolite-driven gene regulation; and (6) metabolic interactions with the microenvironment. While few tumors display all six hallmarks at once, most show several of them, so the particular hallmarks exhibited by an individual tumor may ultimately contribute to better tumor classification and help guide treatment.
Although metabolic changes characteristic of tumors were first observed almost a century ago, cancer metabolism has become a topic of renewed interest in the past decade. With the help of new biochemical and molecular biological tools, the study of cancer cell metabolism has expanded our understanding of the mechanisms and functional consequences of tumor-associated metabolic alterations at various stages of tumorigenesis. Changes at the gene level have been studied extensively, including gene deletion or insertion, gene copy number alteration, gene mutation, non-coding RNA, and modifications such as methylation and acetylation. However, these changes at the gene level, the protein level, and the levels of transcriptional and post-transcriptional regulation must ultimately manifest in some form as metabolites. Thus, metabolites are the common pool downstream of the expression of all genes.
Glucose and glutamine are the two major nutrients that support mammalian cell survival and biosynthesis. Through catabolism of glucose and glutamine, cells maintain a pool of carbon intermediates that serve as building blocks for assembling various macromolecules. These nutrients also feed the electron transport chain to drive ATP production, and supply reducing power in the form of the cofactor NADPH for various biosynthetic reactions and for maintaining cellular redox capacity. Tumors consume markedly more glucose than non-proliferating normal tissue, a phenomenon first described by the German physiologist Otto Warburg more than 90 years ago. It has since been demonstrated in various tumor settings and has been shown to correlate with poor tumor prognosis. Positron emission tomography (PET) imaging of the uptake of the radioactive fluorine-labeled glucose analogue 18F-fluorodeoxyglucose (18F-FDG) has been used successfully in the clinic for tumor diagnosis and staging, as well as for monitoring the response to therapy.
Glutamine is the second major metabolite that supports tumor growth. It provides not only carbon but also reduced nitrogen for de novo biosynthesis of many different nitrogen-containing compounds. Thus, glutamine provides the nitrogen required for the biosynthesis of purine and pyrimidine nucleotides, glucosamine 6-phosphate and non-essential amino acids. Glutamine has also been reported to play a role in the uptake of essential amino acids. While non-essential amino acids can be produced de novo by mammalian cells, essential amino acids must be obtained from an external source. Interestingly, the import of the essential amino acid leucine into cells via the plasma membrane-localized neutral amino acid antiporter LAT1 is coupled to the simultaneous efflux of glutamine. In this manner, intracellular glutamine can facilitate the import of a variety of LAT1 substrates, including leucine, isoleucine, valine, methionine, tyrosine, tryptophan, and phenylalanine.
A review covering 31 targeted and non-targeted serum metabolite studies reported that various metabolites change significantly during the development of thyroid cancer, including glucose, fructose, galactose, mannose, malonic acid, inosine, cholesterol, arachidonic acid, glycosylation (immunoglobulin G [IgG] Fc-glycosylation), choline derivatives, lactic acid, and fatty acids. Among these, citrate is considered the most important biomarker, followed by lactic acid.
It follows that metabolites have long been studied in the context of tumor alterations, and their association with tumors rests on a broad and robust research basis. However, no method for predicting cancer based on the above metabolites has been reported so far.
The prior art has the following defects:
1) the detection process is complex: at present, protein-based markers basically rely on enzyme-linked immunosorbent assay, in which the concentration of a substance is finally judged by color development;
2) the detection component is single: at present, multiple tubes of blood are often drawn to test different clinical serum or plasma indexes, and different tumor markers are detected separately;
3) the detection time is long: the detection time of different serum markers varies from 15 minutes to 12 hours or even longer;
4) the detection cost is high: each item costs tens to hundreds of yuan, and the total for multiple items adds up;
5) environmental pollution: a large amount of kit waste, namely medical waste, is generated in the detection process; its treatment is time-consuming, and the environment is easily polluted.
At present, no effective overall solution exists for the problems of complex detection process, single detection component, multiple detection items, high detection cost, long detection time and environmental pollution in the related art for detecting serological tumor markers.
Disclosure of Invention
The present application aims to overcome the disadvantages in the prior art, and provide a cancer risk prediction method, system, device and storage medium based on metabolite detection in serum, so as to solve at least the problems of complex detection process, single detection component, multiple detection items, high detection cost, long detection time and environmental pollution existing in the related art.
In order to achieve the purpose, the technical scheme adopted by the application is as follows:
in a first aspect, the present invention provides a method for predicting cancer risk, comprising:
acquiring Raman detection data of a plasma sample;
inputting the Raman detection data into a cancer risk prediction model which is trained in advance and is based on a BP neural network artificial intelligence algorithm so as to obtain a cancer risk prediction result;
Wherein the accuracy rate of the cancer risk prediction result is greater than 91.48%, the sensitivity is greater than 88.70%, and the specificity is greater than 95.98%.
In some embodiments thereof, the cancer comprises renal cancer, gastric cancer, cervical cancer, rectal cancer, prostate cancer, lung cancer, ovarian cancer, breast cancer.
In some of these embodiments, the Raman detection data comprises pan-cancer Raman detection data comprising at least:
| No. | ID | Name |
| 1 | M365T412_2 | Lactose |
| 2 | M487T404 | Blood group B trisaccharide |
| 3 | M203T272 | DL-tryptophan |
| 4 | M173T309 | Gly-Val |
| 5 | M437T96 | HC toxin |
| 6 | M291T33 | DL-norleucine methyl ester |
| 7 | M697T235 | Izenamide C |
In some of these embodiments, the Raman detection data further comprises:
renal cancer Raman detection data comprising at least:
| No. | ID | Name |
| 1 | M751T38 | 1-myristoyl-2-palmitoyl-sn-glycero-3-phosphocholine |
| 2 | M141T292 | Kojic acid |
| 3 | M865T137 | PC(16:1e/17-HDoHE) |
| 4 | M791T137 | PE(18:1e/10-HDoHE) |
| 5 | M671T141_2 | 1-Hexadecanoyl-2-(9Z,12Z-octadecadienoyl)-sn-glycero-3-phosphoric acid |
| 6 | M309T25_2 | Mestranol |
| 7 | M841T140 | PC(16:0e/8-HEPE) |
| 8 | M580T180 | 1-behenoyl-2-hydroxy-sn-glycero-3-phosphocholine |
| 9 | M538T185 | 1-Palmitoyllysophosphatidylcholine |
| 10 | M193T82_1 | Trans-3'-hydroxycotinine |
In some of these embodiments, the Raman detection data further comprises:
gastric cancer Raman detection data comprising at least:
In some of these embodiments, the Raman detection data further comprises:
cervical cancer Raman detection data comprising at least:
In some of these embodiments, the Raman detection data further comprises:
rectal cancer Raman detection data comprising at least:
In some of these embodiments, the Raman detection data further comprises:
prostate cancer Raman detection data comprising at least:
In some of these embodiments, the Raman detection data further comprises:
lung cancer Raman detection data comprising at least:
In some of these embodiments, the Raman detection data further comprises:
ovarian cancer Raman detection data comprising at least:
| No. | ID | Name |
| 1 | M187T44 | 1-hydroxy-2-naphthoic acid |
| 2 | M355T37_1 | Fumarprotocetraric acid |
| 3 | M159T119_2 | 3-hydroxyoctanoic acid |
| 4 | M498T44 | Taurochenodeoxycholate |
| 5 | M182T40 | 4-pyridoxic acid |
| 6 | M300T204 | N-Acetyl-D-Glucosamine 6-Phosphate |
| 7 | M204T47 | N,N'-diacetylchitobiose |
| 8 | M191T255 | 5-methyl-5-phenylhydantoin |
| 9 | M174T570 | Ala-Thr-Arg |
| 10 | M130T582 | D-Pipecolinic acid |
| 11 | M196T258 | 1-deoxy-1-(methylamino)-D-galactitol |
| 12 | M283T72 | Hexaethylene glycol |
In some of these embodiments, the Raman detection data further comprises:
breast cancer Raman detection data comprising at least:
In some of these embodiments, prior to obtaining Raman detection data of the plasma sample, the method further comprises:
obtaining a blood sample;
centrifuging the blood sample to obtain an initial plasma sample;
subjecting the initial plasma sample to a membrane filtration-based centrifugation process to obtain a plasma sample;
wherein the time for performing the membrane filtration-based centrifugation on the initial plasma sample is less than or equal to 5 min, the volume of the initial plasma sample is less than or equal to 450 μl, and the volume of the plasma sample is less than or equal to 5 μl.
In some of these embodiments, the blood sample is centrifuged at 2000-3000 rpm for 3-5 min to obtain the initial plasma sample.
In some of these embodiments, the blood sample is centrifuged at 3000 rpm for 2 min to obtain the initial plasma sample.
In some of these embodiments, the membrane filtration-based centrifugation of the initial plasma sample is performed at 10000-15000 rpm for 3-6 min to obtain the plasma sample.
In some of these embodiments, the membrane filtration-based centrifugation of the initial plasma sample is performed at 12000 rpm for 3 min to obtain the plasma sample after ultrafiltration.
In some embodiments, the plasma sample has a volume of 2 to 5 μl.
In some of these embodiments, the membrane treatment is an ultrafiltration treatment.
In some of these embodiments, after obtaining the plasma sample, the method further comprises:
placing the plasma sample on an aluminized slide and drying it.
In some of these embodiments, the slide is pre-placed in an environment at 37 ℃.
In some of these embodiments, the plasma sample is dried on the aluminized slide for 30 seconds.
In some of these embodiments, the plasma sample is dried on the aluminized slide at 37 ℃.
In some of these embodiments, the Raman detection parameters for obtaining Raman detection data of a plasma sample are:
excitation wavelength: 532 nm;
power: 10-14 mW;
grating: 1200 g/mm;
single spectrum integration time: 16-20 s;
objective lens: 100×/0.9.
In some of these embodiments, the Raman detection parameters for obtaining the Raman detection data of the plasma sample further comprise:
5-8 spectra are collected for each plasma sample.
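The acquisition settings listed above can be captured as a small configuration object. The following is a minimal sketch only: the class name `RamanAcquisition`, its field names, the default values chosen inside the stated ranges, and the point-wise averaging of the repeated spectra are all assumptions made for illustration. The source states only the parameter ranges and that 5-8 spectra are collected per sample, not how they are combined.

```python
from dataclasses import dataclass
from statistics import fmean


@dataclass(frozen=True)
class RamanAcquisition:
    """Hypothetical container for the acquisition settings listed in the text."""
    excitation_nm: float = 532.0
    power_mw: float = 12.0              # assumed default inside the 10-14 mW range
    grating_per_mm: int = 1200          # "1200 g/mm" in the text
    integration_s: float = 18.0         # assumed default inside the 16-20 s range
    objective: str = "100x/0.9"
    spectra_per_sample: int = 5         # the text allows 5-8

    def validate(self) -> None:
        """Raise ValueError if a setting falls outside the ranges given in the text."""
        if not 10 <= self.power_mw <= 14:
            raise ValueError("power must be 10-14 mW")
        if not 16 <= self.integration_s <= 20:
            raise ValueError("integration time must be 16-20 s")
        if not 5 <= self.spectra_per_sample <= 8:
            raise ValueError("5-8 spectra are expected per plasma sample")


def average_spectra(spectra):
    """Point-wise mean of the repeated spectra of one sample (an assumed way to combine them)."""
    if not 5 <= len(spectra) <= 8:
        raise ValueError("5-8 spectra are expected per plasma sample")
    length = len(spectra[0])
    if any(len(s) != length for s in spectra):
        raise ValueError("all spectra must have the same number of points")
    return [fmean(column) for column in zip(*spectra)]
```

Calling `RamanAcquisition().validate()` passes with the assumed defaults, while a power of 20 mW would raise `ValueError`.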
In some embodiments, inputting the Raman detection data into a pre-trained cancer risk prediction model based on a BP neural network artificial intelligence algorithm to obtain a cancer risk prediction result comprises:
processing the Raman detection data, and mapping the Raman detection data into 1024-dimensional initial feature vectors;
inputting 1024-dimensional initial feature vectors into a pre-trained cancer risk prediction model based on a BP neural network artificial intelligence algorithm to obtain 4-dimensional final feature vectors;
and inputting the 4-dimensional final feature vector into a classification function for processing so as to obtain a cancer risk prediction result.
In some of these embodiments, the classification function is a softmax classification function.
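The three steps above (mapping to a 1024-dimensional vector, obtaining a 4-dimensional vector from the model, and applying softmax) can be sketched as follows. This is an illustration under stated assumptions: the source does not say how the spectrum is mapped to 1024 dimensions, so linear-interpolation resampling with min-max scaling is assumed here, and `model` is a hypothetical stand-in for the trained network. The meaning of the four output classes is not given in the source, so only the class index is returned.

```python
import math


def resample_to_1024(intensities, size=1024):
    """Assumed preprocessing: linear interpolation to `size` points, then min-max scaling."""
    n = len(intensities)
    if n < 2:
        raise ValueError("need at least two intensity values")
    out = []
    for i in range(size):
        pos = i * (n - 1) / (size - 1)      # fractional index into the raw spectrum
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        out.append(intensities[lo] * (1 - frac) + intensities[hi] * frac)
    low, high = min(out), max(out)
    span = high - low
    return [(v - low) / span if span else 0.0 for v in out]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(intensities, model):
    """Return (class index, class probabilities) for one spectrum.

    `model` is a hypothetical callable mapping 1024 features to 4 raw scores.
    """
    features = resample_to_1024(intensities)
    scores = model(features)
    if len(scores) != 4:
        raise ValueError("the model is expected to return a 4-dimensional vector")
    probs = softmax(scores)
    return probs.index(max(probs)), probs
```

For example, `predict(spectrum, model)` with a model that returns the scores `[0.0, 2.0, 1.0, -1.0]` selects class index 1, since softmax preserves the ordering of the scores.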
In some of these embodiments, the method of training a cancer risk prediction model comprises:
constructing a cancer risk prediction model, wherein the cancer risk prediction model comprises an input layer, a hidden layer and an output layer, the input layer has 1024 inputs and 512 outputs, the hidden layer has 512 inputs and 128 outputs, and the output layer has 128 inputs and 4 outputs;
constructing a training set and a test set according to Raman detection data, wherein the ratio of the training set to the test set is 3:1;
inputting the training set into the cancer risk prediction model for training to obtain an output result;
inputting the output result into a classification function for processing to obtain a cancer risk prediction result;
and testing the cancer risk prediction result to detect the effectiveness of training, and iterating the cancer risk prediction model according to the effectiveness detection result until the training is completed.
In some of these embodiments, testing the cancer risk prediction result comprises:
performing the test once every 10 rounds of training.
In some of these embodiments, the method for training a cancer risk prediction model further comprises:
calculating a loss value by using a cross entropy objective function;
the weights of the cancer risk prediction model are optimized using a stochastic gradient descent method and a back-propagation algorithm.
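The training procedure described above can be sketched in plain Python. The layer sizes (1024, 512, 128, 4), the 3:1 split, the test every 10 rounds, the cross-entropy loss, and stochastic gradient descent with back-propagation come from the text. Everything else is an assumption for illustration: the ReLU activation, the weight initialization, the learning rate, treating one "round" as one pass over the training set, and all function names.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    scale = 1.0 / math.sqrt(n_in)           # assumed initialization range
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return {"W": weights, "b": [0.0] * n_out}


def softmax(logits):
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def forward(layers, x):
    """Return the activations of every layer; ReLU on all but the last (assumed)."""
    acts = [list(x)]
    for i, layer in enumerate(layers):
        z = [sum(w * a for w, a in zip(row, acts[-1])) + b
             for row, b in zip(layer["W"], layer["b"])]
        if i < len(layers) - 1:
            z = [max(0.0, v) for v in z]
        acts.append(z)
    return acts


def sgd_step(layers, x, label, lr):
    """One back-propagation update on a single sample; returns the cross-entropy loss."""
    acts = forward(layers, x)
    probs = softmax(acts[-1])
    loss = -math.log(max(probs[label], 1e-12))
    # Gradient of softmax + cross-entropy with respect to the output scores.
    delta = [p - (1.0 if k == label else 0.0) for k, p in enumerate(probs)]
    for i in range(len(layers) - 1, -1, -1):
        layer, a_in = layers[i], acts[i]
        # Propagate the error to the previous layer before changing the weights.
        grad_in = ([sum(row[k] * d for row, d in zip(layer["W"], delta))
                    for k in range(len(a_in))] if i > 0 else [])
        for j, d in enumerate(delta):
            row = layer["W"][j]
            for k, a in enumerate(a_in):
                row[k] -= lr * d * a
            layer["b"][j] -= lr * d
        # ReLU derivative: the error passes only where the activation was positive.
        delta = [g if a > 0.0 else 0.0 for g, a in zip(grad_in, a_in)]
    return loss


def accuracy(layers, dataset):
    hits = 0
    for x, y in dataset:
        scores = forward(layers, x)[-1]
        hits += scores.index(max(scores)) == y
    return hits / len(dataset)


def train(samples, sizes=(1024, 512, 128, 4), rounds=30, lr=0.05, seed=0):
    """samples: list of (feature_vector, class_index). Returns (layers, test history)."""
    rng = random.Random(seed)
    data = list(samples)
    rng.shuffle(data)
    cut = len(data) * 3 // 4                 # 3:1 training-to-test ratio
    train_set, test_set = data[:cut], data[cut:]
    layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
    history = []
    for r in range(1, rounds + 1):
        rng.shuffle(train_set)
        for x, y in train_set:
            sgd_step(layers, x, y, lr)
        if r % 10 == 0:                      # test once every 10 rounds
            history.append((r, accuracy(layers, test_set)))
    return layers, history
```

With the full 1024-input layer sizes this pure-Python version is slow; it is meant to make the update rule explicit, and passing smaller `sizes` is enough to exercise it on toy data.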
In a second aspect, the present invention provides a cancer risk prediction system comprising:
a plasma sample obtaining device for performing centrifugation on a blood sample to obtain an initial plasma sample, and performing centrifugation based on membrane filtration on the initial plasma sample to obtain a plasma sample; wherein the time for performing the membrane filtration-based centrifugation treatment on the initial plasma sample is less than or equal to 8 min;
a Raman detection device for performing Raman detection on the plasma sample to acquire Raman detection data;
a cancer risk prediction device for acquiring the Raman detection data of the plasma sample and inputting the Raman detection data into a pre-trained cancer risk prediction model based on a BP neural network artificial intelligence algorithm to obtain a cancer risk prediction result; different weights may be set for different tumors in combination with clinical information, including patient medical history and clinical serological indexes, to help diagnose different tumors more accurately;
wherein the cancer risk prediction result has an accuracy of greater than 91.48%, a sensitivity of greater than 88.70%, and a specificity of greater than 95.98%.
In some of these embodiments, the plasma sample acquisition device comprises:
the anticoagulation blood taking unit is used for placing a blood sample;
a centrifugation unit for centrifuging the blood sample of the anticoagulation blood taking unit to obtain an initial plasma sample, and centrifuging the initial plasma sample to obtain a plasma sample;
a membrane filtration unit for performing a membrane treatment on the initial plasma sample to obtain the plasma sample while the centrifugation unit performs a centrifugation treatment on the initial plasma sample.
In some of these embodiments, the anticoagulation blood taking unit is an EDTA-K2 anticoagulation unit.
In some of these embodiments, the membrane filtration unit is an ultrafiltration centrifuge tube.
In some of these embodiments, further comprising:
In some of these embodiments, the Raman detection device employs confocal laser Raman technology or surface-enhanced Raman scattering technology.
In a third aspect, the invention provides a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the cancer risk prediction method as described above when executing the computer program.
In a fourth aspect, the present invention provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a cancer risk prediction method as described above.
Compared with the related art, the cancer risk prediction method, system, device and storage medium provided by the embodiments of the present application combine Raman detection with an artificial intelligence (AI) cancer risk prediction model based on a BP neural network. This greatly shortens the detection time: a cancer risk prediction result can be obtained in about 15 minutes, and whether to perform subsequent precise testing can be decided according to the prediction result. The detection method is simple and easy to operate, detects multiple substances in a single test, and yields a highly accurate cancer risk prediction result. The detection cost is low, and no large amount of medical waste is generated, which avoids environmental pollution. The prediction method has high specificity, high sensitivity and high accuracy. Raman detection enables rapid, large-scale detection of small-molecule metabolites with high throughput and high accuracy. If the method is further combined with information such as clinical serology and imaging, and adjusted according to the weights of different cancer indexes or imaging changes during cancer occurrence or metastasis, it has very broad application prospects in the precise diagnosis and/or prediction of tumor metastasis and recurrence, and is therefore suitable for large-scale popularization and application.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
FIG. 1 is flowchart (1) of a cancer risk prediction method according to an embodiment of the present application;
FIG. 2 is flowchart (2) of a cancer risk prediction method according to an embodiment of the present application;
FIG. 3 is flowchart (3) of a cancer risk prediction method according to an embodiment of the present application;
FIG. 4 is a flowchart of a method of training a cancer risk prediction model according to an embodiment of the present application;
FIG. 5 is a block diagram of a cancer risk prediction system according to an embodiment of the present application;
FIG. 6 is a block diagram of a plasma sample acquiring device according to an embodiment of the present application;
FIG. 7 shows the AUC curves for the seven pan-cancer serum metabolites.
Wherein the reference numerals are:
500. a cancer risk prediction system;
510. a cancer risk prediction device;
520. a plasma sample acquiring device;
521. an anticoagulation blood taking unit;
522. a centrifugal unit;
523. a membrane filtration unit;
530. a Raman detection device.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be described and illustrated below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments provided in the present application without any inventive step are within the scope of protection of the present application.
It is obvious that the drawings in the following description are only examples or embodiments of the application, and that a person skilled in the art can apply the application to other similar contexts on the basis of these drawings without inventive effort. Moreover, it should be appreciated that such a development effort might be complex and tedious, but would nevertheless be a routine undertaking of design, fabrication, and manufacture for those of ordinary skill having the benefit of this disclosure, and should not be regarded as departing from the scope of this disclosure.
Reference in the specification to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the specification. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is to be expressly and implicitly understood by one of ordinary skill in the art that the embodiments described herein may be combined with other embodiments without conflict.
Unless otherwise defined, technical or scientific terms referred to herein should have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terms "a," "an," "the" and similar referents used in describing the invention do not limit quantity and may indicate either the singular or the plural. The terms "including," "comprising," "having," and any variations thereof are intended to cover a non-exclusive inclusion; for example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to the listed steps or elements, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus. Reference to "connected," "coupled," and the like in this application is not intended to be limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. The term "plurality" as used herein means two or more. "And/or" describes an association relationship of associated objects, meaning that three relationships may exist; for example, "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship. The terms "first," "second," "third," and the like herein merely distinguish similar objects and do not denote a particular ordering of the objects.
Example 1
FIG. 1 is flowchart (1) of a cancer risk prediction method according to an embodiment of the present invention. As shown in FIG. 1, the cancer risk prediction method includes the following steps:
s102, acquiring Raman detection data of a plasma sample;
s104, inputting the Raman detection data into a cancer risk prediction model which is trained in advance and based on a BP neural network artificial intelligence algorithm to obtain a cancer risk prediction result.
Wherein the cancer risk prediction result has an accuracy of greater than 91.48%, a sensitivity of greater than 88.70%, and a specificity of greater than 95.98%.
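The three figures of merit quoted above follow from a binary confusion matrix. The sketch below shows the standard definitions and a check against the stated thresholds; the function names and the example counts are hypothetical and are not results from the source.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    total = tp + fp + tn + fn
    if total == 0 or tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return {
        "accuracy": (tp + tn) / total,       # correct calls over all cases
        "sensitivity": tp / (tp + fn),       # true positive rate
        "specificity": tn / (tn + fp),       # true negative rate
    }


def exceeds_stated_thresholds(metrics):
    """Compare against the thresholds quoted in the text (91.48%, 88.70%, 95.98%)."""
    return (metrics["accuracy"] > 0.9148
            and metrics["sensitivity"] > 0.8870
            and metrics["specificity"] > 0.9598)


# Hypothetical counts for illustration only.
example = classification_metrics(tp=90, fp=3, tn=97, fn=10)
```

With these illustrative counts, accuracy is 187/200, sensitivity is 90/100 and specificity is 97/100, so `exceeds_stated_thresholds(example)` evaluates to `True`.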
In step S102, the Raman detection parameters for the Raman detection data are:
excitation wavelength: 532 nm;
power: 10-14 mW;
grating: 1200 g/mm;
single spectrum integration time: 16-20 s;
objective lens: 100×/0.9.
Processing the Raman detection data with the BP neural network algorithm takes 1-2 min.
Further, the Raman detection parameters for obtaining the Raman detection data of the plasma sample further comprise:
5-8 spectra are collected for each plasma sample.
In step S104, the time for obtaining the cancer risk prediction result is 0-2 min.
Preferably, the time for obtaining the cancer risk prediction result is 30 s-1 min.
In some of these embodiments, the Raman detection data comprises pan-cancer Raman detection data comprising at least:
Using the above 7 metabolites, whether a subject is at risk of cancer can be predicted.
In some of these embodiments, the Raman detection data further comprises:
renal cancer Raman detection data comprising at least:
The above 10 metabolites can assist in predicting whether a subject is at risk of renal cancer.
In some of these embodiments, the Raman detection data further comprises:
gastric cancer Raman detection data comprising at least:
The above 8 metabolites can assist in predicting whether a subject is at risk of gastric cancer.
In some of these embodiments, the Raman detection data further comprises:
cervical cancer Raman detection data comprising at least:
The above 9 metabolites can assist in predicting whether a subject is at risk of cervical cancer.
In some of these embodiments, the Raman detection data further comprises:
rectal cancer Raman detection data comprising at least:
The above 13 metabolites can assist in predicting whether a subject is at risk of rectal cancer.
In some of these embodiments, the Raman detection data further comprises:
prostate cancer Raman detection data comprising at least:
The above 39 metabolites can assist in predicting whether a subject is at risk of prostate cancer.
In some of these embodiments, the Raman detection data further comprises:
lung cancer Raman detection data comprising at least:
The above 9 metabolites can assist in predicting whether a subject is at risk of lung cancer.
In some of these embodiments, the Raman detection data further comprises:
ovarian cancer Raman detection data comprising at least:
The above 12 metabolites can assist in predicting whether a subject is at risk of ovarian cancer.
In some of these embodiments, the Raman detection data further comprises:
breast cancer Raman detection data comprising at least:
The above 10 metabolites can assist in predicting whether a subject is at risk of breast cancer.
Fig. 2 is a flowchart (II) of a cancer risk prediction method according to an embodiment of the present invention. As shown in Fig. 2, prior to acquiring the Raman detection data of the plasma sample, the method further comprises the following steps:
Step S202, obtaining a blood sample;
Step S204, centrifuging the blood sample to obtain an initial plasma sample;
Step S206, subjecting the initial plasma sample to membrane-filtration-based centrifugation to obtain the plasma sample.
The membrane-filtration-based centrifugation of the initial plasma sample takes 8 min or less, the volume of the initial plasma sample is 450 μl or less, and the volume of the plasma sample is 5 μl or less.
In step S202, obtaining a blood sample means placing the collected blood into an anticoagulant blood collection tube.
In step S204, the working parameters for centrifuging the blood sample to obtain the initial plasma sample are 2000-5000 rpm for 3-5 min.
Preferably, the working parameters for centrifuging the blood sample to obtain the initial plasma sample are 3000 rpm for 2 min.
In step S206, the initial plasma sample is placed in an ultrafiltration centrifuge tube, and the membrane treatment is ultrafiltration.
In step S206, the initial plasma sample undergoes membrane treatment and centrifugation simultaneously; the working parameters for obtaining the plasma sample are 10000-15000 rpm for 3-6 min.
Preferably, the working parameters for the simultaneous membrane treatment and centrifugation of the initial plasma sample are 12000 rpm for 5 min.
Preferably, the volume of the plasma sample is 2-5 μl.
More preferably, the volume of the plasma sample is 2 μl.
Further, after step S206, the method further comprises the following step:
Step S208, placing the plasma sample on an aluminized glass slide for drying.
The plasma sample is dried on the aluminized glass slide at 37 °C.
The drying time is 30 s-2 min.
Through steps S202-S206, a plasma sample for subsequent detection can be obtained within 8 min (generally 7-12 min, preferably 7-8 min), which greatly shortens the time needed to obtain the sample.
Fig. 3 is a flowchart (III) of a cancer risk prediction method according to an embodiment of the present invention. As shown in Fig. 3, inputting the Raman detection data into the pre-trained cancer risk prediction model based on a BP neural network artificial intelligence algorithm to obtain the cancer risk prediction result includes:
Step S302, processing the Raman detection data and mapping it into a 1024-dimensional initial feature vector;
Step S304, inputting the 1024-dimensional initial feature vector into the pre-trained cancer risk prediction model based on a BP neural network artificial intelligence algorithm to obtain a 4-dimensional final feature vector;
Step S306, inputting the 4-dimensional final feature vector into a classification function to obtain the cancer risk prediction result.
In step S302, the Raman detection data comprises wavenumber and intensity, wherein the wavenumber range is 279-2186 cm⁻¹, a total of 1024 points.
In step S306, the classification function is a softmax classification function.
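Steps S302 and S306 can be illustrated with a short sketch. It assumes, beyond what the text states, that the 1024 points are evenly spaced over 279-2186 cm⁻¹ and that a raw spectrum is mapped onto them by linear interpolation; the function names are hypothetical.

```python
import math

GRID_START, GRID_END, GRID_POINTS = 279.0, 2186.0, 1024


def wavenumber_grid():
    """Evenly spaced wavenumbers (assumption) covering 279-2186 cm^-1."""
    step = (GRID_END - GRID_START) / (GRID_POINTS - 1)
    return [GRID_START + i * step for i in range(GRID_POINTS)]


def resample(wavenumbers, intensities):
    """Linearly interpolate a spectrum (sorted by wavenumber) onto the grid."""
    out = []
    j = 0
    for w in wavenumber_grid():
        # Move j forward until the segment [j, j + 1] reaches w.
        while j < len(wavenumbers) - 2 and wavenumbers[j + 1] < w:
            j += 1
        w0, w1 = wavenumbers[j], wavenumbers[j + 1]
        t = min(1.0, max(0.0, (w - w0) / (w1 - w0)))  # clamp at the edges
        out.append(intensities[j] + t * (intensities[j + 1] - intensities[j]))
    return out


def softmax(logits):
    """Numerically stable softmax over the 4-dimensional final feature vector."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Synthetic raw spectrum, for illustration only.
raw_w = [200.0 + 1.5 * i for i in range(1401)]
raw_i = [2.0 * w + 1.0 for w in raw_w]
vector = resample(raw_w, raw_i)
probabilities = softmax([0.2, 1.5, -0.3, 0.1])
print(len(vector), probabilities.index(max(probabilities)))
```

The class with the largest softmax probability is taken as the prediction; the four logits above are made-up values, not model output.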
Fig. 4 is a flowchart of a method of training a cancer risk prediction model according to an embodiment of the present invention. As shown in Fig. 4, the training method of the cancer risk prediction model includes:
Step S402, constructing a cancer risk prediction model, wherein the cancer risk prediction model comprises an input layer, a hidden layer and an output layer, the input of the input layer is 1024, the output of the input layer is 512, the input of the hidden layer is 512, the output of the hidden layer is 128, the input of the output layer is 128, and the output of the output layer is 4;
Step S404, constructing a training set and a test set from the Raman detection data, wherein the ratio of the training set to the test set is 3:1;
Step S406, inputting the training set into the cancer risk prediction model for training to obtain an output result;
Step S408, inputting the output result into a classification function to obtain a cancer risk prediction result;
Step S410, testing the cancer risk prediction result to check the effectiveness of the training, and iterating the cancer risk prediction model according to the result of that check until training is finished.
In step S408, the classification function is a softmax classification function.
In step S410, a test is performed every 10 training rounds.
In step S410, the cancer risk prediction model is tested using the test set to check the effectiveness of the training.
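The layer dimensions of step S402 (1024 to 512, 512 to 128, 128 to 4) can be sketched as a plain forward pass. The text does not name an activation function, so the sigmoid used here is an assumption, and the weights are random and untrained; the snippet only illustrates the shapes involved.

```python
import math
import random

LAYER_SIZES = [1024, 512, 128, 4]


def init_layers(sizes, seed=0):
    """Create one (weights, biases) pair per fully connected layer."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        scale = 1.0 / math.sqrt(n_in)
        weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
                   for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def forward(layers, x):
    """Propagate a 1024-dimensional vector through the network."""
    for index, (weights, biases) in enumerate(layers):
        z = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        # The last layer returns raw scores; softmax is applied afterwards.
        x = z if index == len(layers) - 1 else [sigmoid(v) for v in z]
    return x


layers = init_layers(LAYER_SIZES)
final_vector = forward(layers, [0.5] * 1024)
print(len(final_vector))
```

The 4 raw scores returned by `forward` correspond to the 4-dimensional final feature vector that is passed to the softmax classification function.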
Further, the training method of the cancer risk prediction model further comprises:
Step S412, calculating a loss value using a cross-entropy objective function;
Step S414, optimizing the weights of the cancer risk prediction model using stochastic gradient descent and the back-propagation algorithm.
Through steps S402 to S414, a cancer risk prediction model can be trained whose prediction results have an accuracy greater than 91.48%, a sensitivity greater than 88.70% and a specificity greater than 95.98%; accuracy, sensitivity and specificity are greatly improved while the prediction time is reduced.
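Steps S412 and S414 can be illustrated on a single softmax layer. This is a sketch under simplifying assumptions: one training example, one linear layer and a learning rate of 0.1 chosen for illustration. With a softmax output and cross-entropy loss, the gradient with respect to each raw score is the predicted probability minus the one-hot target, which is the quantity back-propagated through earlier layers.

```python
import math


def softmax(logits):
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, target):
    """Loss for one example whose true class index is `target`."""
    return -math.log(probs[target])


def sgd_step(weights, biases, x, target, lr=0.1):
    """One SGD update of a linear + softmax layer; returns the loss before the update."""
    logits = [sum(w * v for w, v in zip(row, x)) + b
              for row, b in zip(weights, biases)]
    probs = softmax(logits)
    loss = cross_entropy(probs, target)
    for k in range(len(weights)):
        # dL/dlogit_k = p_k - 1 if k is the true class, else p_k
        grad = probs[k] - (1.0 if k == target else 0.0)
        for i in range(len(x)):
            weights[k][i] -= lr * grad * x[i]
        biases[k] -= lr * grad
    return loss


weights = [[0.0] * 3 for _ in range(4)]
biases = [0.0] * 4
losses = [sgd_step(weights, biases, [1.0, 0.5, -0.5], target=2) for _ in range(20)]
print(round(losses[0], 4), losses[-1] < losses[0])
```

With all weights at zero the four classes are equally likely, so the first loss equals ln 4; repeated updates then reduce the loss on this example.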
Fig. 5 is a block diagram of a cancer risk prediction system according to an embodiment of the present invention. As shown in Fig. 5, the cancer risk prediction system 500 includes a cancer risk prediction device 510 for acquiring Raman detection data of a plasma sample and inputting the Raman detection data into a pre-trained cancer risk prediction model to obtain a cancer risk prediction result.
In some embodiments, the cancer risk prediction device 510 includes, but is not limited to, a mobile terminal, a cloud server, a local server, a computer, a notebook, and the like.
Further, the cancer risk prediction system 500 further comprises a plasma sample acquiring device 520 and a Raman detection device 530. The plasma sample acquiring device 520 centrifuges the blood sample to obtain an initial plasma sample, and performs membrane treatment and centrifugation on the initial plasma sample simultaneously to obtain the plasma sample; the Raman detection device 530 performs Raman detection on the plasma sample to obtain the Raman detection data.
In some embodiments, the Raman detection device 530 employs a confocal laser Raman technique or a surface-enhanced Raman scattering technique.
Fig. 6 is a block diagram of a plasma sample acquiring device according to an embodiment of the present invention. As shown in Fig. 6, the plasma sample acquiring device 520 includes an anticoagulant blood collection unit 521, a centrifugation unit 522, and a membrane filtration unit 523. The anticoagulant blood collection unit 521 holds the blood sample; the centrifugation unit 522 centrifuges the blood sample in the anticoagulant blood collection unit 521 to obtain an initial plasma sample, and centrifuges the initial plasma sample to obtain the plasma sample; the membrane filtration unit 523 performs membrane treatment on the initial plasma sample while the centrifugation unit 522 centrifuges it, to obtain the plasma sample.
In some of these embodiments, the anticoagulant blood collection unit 521 is an anticoagulant blood collection tube containing EDTA-K2 anticoagulant.
In some of these embodiments, the membrane filtration unit 523 is an ultrafiltration centrifuge tube.
Preferably, the membrane filtration unit 523 is a Millipore UFC503096 Amicon Ultra-30K ultrafiltration centrifuge tube.
In addition, the cancer risk prediction method of the embodiments of the present application may be implemented by a computer device. Components of the computer device may include, but are not limited to, a processor and a memory storing computer program instructions.
In some embodiments, the processor may include a Central Processing Unit (CPU) or an Application-Specific Integrated Circuit (ASIC), or may be configured as one or more integrated circuits that implement embodiments of the present application.
In some embodiments, the memory may include mass storage for data or instructions. By way of example, and not limitation, the memory may include a Hard Disk Drive (HDD), a floppy disk drive, a Solid State Drive (SSD), flash memory, an optical disk, a magneto-optical disk, tape, or a Universal Serial Bus (USB) drive, or a combination of two or more of these. The memory may include removable or non-removable (or fixed) media, where appropriate. The memory may be internal or external to the data processing apparatus, where appropriate. In a particular embodiment, the memory is a non-volatile memory. In particular embodiments, the memory includes Read-Only Memory (ROM) and Random Access Memory (RAM). The ROM may be mask-programmed ROM, Programmable ROM (PROM), Erasable PROM (EPROM), Electrically Erasable PROM (EEPROM), Electrically Alterable ROM (EAROM), or flash memory (FLASH), or a combination of two or more of these, where appropriate. The RAM may be a Static Random-Access Memory (SRAM) or a Dynamic Random-Access Memory (DRAM), where the DRAM may be a Fast Page Mode Dynamic Random-Access Memory (FPMDRAM), an Extended Data Output Dynamic Random-Access Memory (EDODRAM), a Synchronous Dynamic Random-Access Memory (SDRAM), and the like.
The memory may be used to store or cache various data files that need to be processed and/or used for communication, as well as possibly computer program instructions, executed by the processor.
The processor reads and executes the computer program instructions stored in the memory to implement any one of the cancer risk prediction methods in the above embodiments.
In some of these embodiments, the computer device may also include a communication interface and a bus. The processor, the memory and the communication interface are connected through a bus and complete mutual communication.
The communication interface is used to implement communication among the modules, devices, units and/or equipment in the embodiments of the application. The communication interface may also carry out data communication with other components, such as external equipment, image/data acquisition equipment, a database, external storage, an image/data processing workstation and the like.
A bus comprises hardware, software, or both, coupling the components of a computer device to one another. A bus includes, but is not limited to, at least one of the following: a data bus, an address bus, a control bus, an expansion bus, and a local bus. By way of example, and not limitation, a bus may include an Accelerated Graphics Port (AGP) or other graphics bus, an Enhanced Industry Standard Architecture (EISA) bus, a Front-Side Bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a Low Pin Count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a PCI-Express (PCIe) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association Local Bus (VLB), or another suitable bus, or a combination of two or more of these. A bus may include one or more buses, where appropriate. Although specific buses are described and shown in the embodiments of the application, any suitable buses or interconnects are contemplated by the application.
The computer device may perform the cancer risk prediction method in the embodiments of the present application.
In addition, in combination with the cancer risk prediction method in the above embodiments, the present application may provide a computer-readable storage medium to implement the method. The computer-readable storage medium has computer program instructions stored thereon; the computer program instructions, when executed by a processor, implement any of the cancer risk prediction methods of the above embodiments.
The cancer risk prediction method, the cancer risk prediction system, the computer device and the computer storage medium have the following advantages:
1) Sample pretreatment is simple: compared with the conventional serum centrifugation method, only one membrane treatment step is added, so the plasma filtrate is separated rapidly during centrifugation for subsequent Raman detection; the centrifugation takes only 7-12 minutes, which is shorter than the conventional serum centrifugation method;
2) The detection time is short: Raman single-point detection takes about 20 seconds at most, and even multi-point detection (such as 3-5 points) takes no more than 2 minutes, far less than current serological tests, whose individual indices take from half an hour to half a day;
3) The detected components are comprehensive: all small-molecule metabolites in serum can be detected, the results are highly consistent with mass spectrometric results, and the resulting judgment is reliable;
4) The detection is timely: Raman detection quickly analyzes the overall change in metabolites, so that early and rapid warning of cancer is possible and a doctor can follow up with targeted, precise examinations;
5) The detection cost is low: the detection of the invention requires only one sample processing tube and one aluminum detection sheet or surface-enhanced substrate, so the cost is very low;
6) Low carbon and little pollution: very little medical waste is generated during detection, making the method far more environmentally friendly and lower in carbon emissions than current detection methods.
Example 2
This embodiment is a specific application of the present invention.
1. Standard calibration
1.1 Sample treatment
Blood samples are obtained and centrifuged separately (3000 rpm, 2 min) to obtain the corresponding initial plasma samples, and 1.5 ml of each initial plasma sample is transferred into an EP tube;
Each initial plasma sample is subjected to membrane-filtration-based centrifugation to obtain the corresponding plasma sample: 450 μl of the initial plasma sample is placed in the inner tube of a Millipore UFC503096 Amicon Ultra-30K ultrafiltration centrifuge tube and centrifuged at 12000 rpm for 2.5-3 min.
1.2 Raman detection
2 μl of the plasma sample is placed on an aluminized glass slide (preheated to 37 °C) for 30 s-1 min, during which the focus of the Raman instrument is adjusted, and Raman detection is performed immediately afterwards. The Raman detection conditions are: WITec alpha300 R laser confocal Raman instrument; excitation wavelength: 532 nm; power: 10-14 mW; grating: 1200 g/mm; single-spectrum integration time: 16-20 s; objective: 100x/0.9; 5-8 spectra are collected from each plasma sample;
Limit calibration and normalization are applied to all spectra; using the R language, the mean spectrum is computed and PCA is performed over all spectra of each plasma sample, and the difference peaks are analyzed statistically.
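The normalization and averaging of the 5-8 spectra of one plasma sample can be sketched as follows. The embodiment performs this analysis in the R language; the Python below is only an illustration, and min-max scaling is an assumed choice because the text does not specify the normalization. The function names are hypothetical.

```python
def min_max_normalize(spectrum):
    """Scale intensities to the range 0-1 (assumed normalization)."""
    lo, hi = min(spectrum), max(spectrum)
    if hi == lo:
        return [0.0 for _ in spectrum]
    return [(v - lo) / (hi - lo) for v in spectrum]


def mean_spectrum(spectra):
    """Point-by-point mean of several spectra sharing one wavenumber axis."""
    count = len(spectra)
    return [sum(column) / count for column in zip(*spectra)]


# Three short made-up spectra standing in for the 5-8 real ones.
replicates = [
    [10.0, 20.0, 30.0],
    [12.0, 18.0, 36.0],
    [8.0, 24.0, 40.0],
]
normalized = [min_max_normalize(s) for s in replicates]
print(mean_spectrum(normalized))
```

Normalizing before averaging keeps a single bright spectrum from dominating the sample mean.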
1.3 Mass spectrometric detection as the reference standard for calibrating the methodology
The plasma sample is subjected to mass spectrometric detection to obtain a mass spectrometric detection result.
The mass spectrometric detection mainly covers untargeted metabolites; PCA and LDA are performed on the whole plasma sample, together with signaling pathway analysis of the differential metabolites. Of the 126 samples in total, on 25% of the data (i.e., 32 cases) the accuracy of the Raman analysis and that of the mass spectrometric analysis agree, both being 90.6% (29/32). This shows that serum results based on Raman detection are as accurate as mass spectrometry and that the method is scientifically reliable.
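The comparison above amounts to computing the accuracy of each method on the same 32 cases. The sketch below uses invented labels (the real per-sample calls are not given in the text) arranged so that each method is correct on 29 of 32 cases, which reproduces the reported 90.6%.

```python
def accuracy(predicted, truth):
    """Fraction of cases in which the prediction matches the diagnosis."""
    hits = sum(p == t for p, t in zip(predicted, truth))
    return hits / len(truth)


def flip(labels, wrong_positions):
    """Copy the labels, inverting the calls at the given positions."""
    return [1 - y if i in wrong_positions else y for i, y in enumerate(labels)]


# Hypothetical ground truth for the 32 cases: 1 = cancer, 0 = control.
truth = [1] * 24 + [0] * 8
raman_calls = flip(truth, {0, 10, 30})        # 3 invented errors
mass_spec_calls = flip(truth, {1, 10, 31})    # 3 invented errors

print(accuracy(raman_calls, truth), accuracy(mass_spec_calls, truth))
```

Equal accuracy does not require the two methods to err on the same samples; in this invented example they share only one error.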
Analysis of the tumorigenic mechanisms of serum metabolites shared by all tumors and of those specific to individual tumors
1. Lactose
Lactose is a carbohydrate peculiar to human and other mammalian milk and is a disaccharide composed of glucose and galactose. Current research has focused on lactose intolerance, and the correlation between lactose and cancer has rarely been addressed. In this study, the lactose concentration was increased on average 15.82-fold across the 8 tumors, 29.40-fold in cervical cancer and 4.99-fold in ovarian cancer, which shows how widespread the increase is among tumors. The main source of lactose is milk. This does not, of course, support the conclusion that "lactose causes cancer", but it at least suggests that elevated serum lactose correlates significantly with carcinogenesis, possibly because of certain metabolites ingested together with milk, although this is not certain. The "China Study" (the Chinese health survey report), which The New York Times ranked at the pinnacle of epidemiology, was begun in the 1970s by T. Colin Campbell, a professor at Cornell University who has been called the "Einstein of nutrition", and the Chinese academician Chen Junshi; it spanned more than 30 years and more than 40 countries and has been called "the most comprehensive nutritional survey in history". It showed that animal proteins, including those in milk, are initiating factors for a variety of chronic diseases including tumors, and that casein in milk in particular is an important serum metabolite for initiating and activating tumor gene expression.
2. Blood group B trisaccharide
Trisaccharide is the general term for compounds composed of three monosaccharide molecules linked by glycosidic bonds. A PubMed search for blood group B trisaccharide returns only 8 documents from 2001 to 2021, none of which mentions any tumor-related association. Many saccharides are produced during sugar metabolism, such as manninotriose, gentianose and rutinose, and they are essentially unaddressed in the current literature.
3. Tryptophan
Research shows that the kynurenine/tryptophan ratio in serum can serve as a marker of prostate cancer and ovarian cancer occurrence, and that a combination of serum histidine and plasma tryptophan can serve as biomarkers for detecting clear cell renal cell carcinoma (ccRCC). In the serum of breast cancer patients the tryptophan level is significantly increased, and adding tryptophan (100 μM) to the culture medium of breast cancer cell lines significantly inhibits IL-10 secretion by CD4+ T cells. Tryptophan-mediated inhibition of IL-10 secretion by CD4+ T cells is a potential pathogenic mechanism of breast cancer. In this study, the serum tryptophan concentrations in renal, breast and ovarian cancer were 4.32-fold, 4.13-fold and 3.791-fold those of normal subjects, respectively.
4. HC toxin
HC toxin is a cyclic tetrapeptide first isolated from the secondary metabolites of Helminthosporium carbonum, and its anti-tumor activity is superior to that of other histone deacetylase (HDAC) inhibitors. HDACs regulate the degree of histone acetylation and thereby, through epigenetic modification, regulate gene expression. Excessive HDAC activity makes the nucleosome structure too compact and suppresses gene expression, which is a cause of many cancers. Histone deacetylase inhibitors (HDACi) effectively inhibit HDAC activity and are potential cancer therapeutics. HC toxin, an HDAC inhibitor directed against class I, IIa and IIb HDACs, plays an important role in meningioma biology and as a targeting mechanism, and also acts broadly in other tumors. Research shows that HC toxin shifts neuroblastoma (NB) cells toward a benign, differentiated phenotype, which is associated with activation of the retinoblastoma (RB) suppressor network. Across the 8 cancers in this study, HC toxin was significantly reduced to 0.106-fold, so its inhibition of HDAC was significantly weakened.
5. Glycine-valine dimer
The 20 amino acids are the basic building blocks of human proteins, and glycine, one of the non-essential amino acids, is essential for the growth of both normal tissues and tumors. During tumor growth, new blood vessels are important structures supplying the tumor with energy, and forming the vessel walls requires large amounts of elastin synthesized from glycine, valine and other amino acids; glycine in particular accounts for almost 1/4 of elastin. Elevated serum glycine therefore provides an important material basis for the growth of tumor vessel walls. In the journal Nature last year, researchers at the Beatson cancer institute in the UK reported that, in mouse models of colorectal cancer, melanoma and lymphoma, a serine- and glycine-free diet improved the survival of the mice by half, demonstrating the important role of glycine and valine in promoting the development of cancer. Two α-amino acids can be cyclized in vivo to form a 2,5-diketopiperazine (DKP), which is regarded as an important class of compounds because of its biological activity against various diseases (including cancers). Novel furan-diketopiperazine derivatives effectively inhibit microtubule polymerization and can be considered potential leads for anticancer drug development, and phenoxy-diketopiperazine derivatives show marked cellular activity as anti-microtubule agents and have clear application prospects against tumors. The results of this study show that the glycine-valine dimer decreased significantly, to an average of 0.174-fold across the 8 tumors.
6. DL-norleucine methyl ester
The literature is almost blank. A search for "norleucin methyl ester" finds only 10 documents, from 1969-1979; one of them, for example, reports that diazoacetyl-DL-norleucine methyl ester and pepstatin are specific inhibitors of acid proteases from brain, kidney, skeletal muscle and insectivorous plants. The results of this study show that DL-norleucine methyl ester was on average 0.02-fold across the 8 tumors.
7. Izenamide C
Izenamides are linear depsipeptides extracted from marine cyanobacteria. Izenamides A and B are novel inhibitors of cathepsin D, and the literature reports that the depsipeptide cathepsin D inhibitors izenamides A, B and C were first synthesized in 2019. Cathepsins are a class of proteases found in the cells (particularly the lysosomes) of various animal tissues and are major members of the cysteine protease family; more than 20 cathepsins have been found in the biological world, of which at least 10 are present in the human body, and they are closely related to various major human diseases, such as tumors (e.g., colon cancer), osteoporosis and arthritis. Research has shown that, compared with normal controls, α1-antitrypsin (A1AT) and cathepsin D (CTSD) are significantly decreased and increased, respectively, at the tissue and serum levels in colorectal cancer specimens, and that immunohistochemical analysis of A1AT and CTSD together distinguishes 96.77% of CRC tissues from normal tissues (P < 0.0001). In this study, the concentration of izenamide C was found to be significantly reduced, to 0.008-fold.
In conclusion, the seven metabolites serve as prediction markers that jointly predict the occurrence of tumors. Their AUC values were 0.9637, 0.9357, 0.9077, 0.9018, 0.8988, 0.8476 and 0.6363, respectively. Overall, some of these metabolites supply the material basis of tumorigenesis, providing raw materials for tumor growth and development, while others act as inhibitors and are significantly reduced. The imbalance between these two classes of metabolites makes an accurate determination of tumor occurrence highly probable.
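An AUC value such as those quoted above can be computed for a single marker as the probability that a randomly chosen cancer sample scores higher than a randomly chosen control. The sketch below uses invented intensity values; how the reported AUCs were actually computed is not stated in the text.

```python
def auc(scores_positive, scores_negative):
    """Pairwise AUC: share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    wins = 0.0
    for p in scores_positive:
        for n in scores_negative:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_positive) * len(scores_negative))


# Made-up marker levels, for illustration only.
cancer = [3.0, 4.0, 5.0]
control = [1.0, 2.0, 3.0]
print(auc(cancer, control))
```

For a marker that decreases in cancer, such as the inhibitory metabolites described above, the scores can be negated before the call so that an AUC above 0.5 still indicates discrimination.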
Among the specific tumors:
There are 10 specifically altered metabolites in renal cancer, among which 1-palmitoyl lysophosphatidylcholine is increased 17.692-fold and 1-behenoyl-2-hydroxy-sn-glycero-3-phosphocholine is significantly increased 70.962-fold, while the phospholipids PC (16:1e/17-HDoHE), PE (18:1e/10-HDoHE) and PC (16:0e/8-HEPE) are significantly decreased, to 0.014-fold, 0.029-fold and 0.016-fold, respectively.
There are 8 specifically altered metabolites in gastric cancer, among which the 7.184-fold increase in ethyl imidazole-2-carboxylate and the decrease to 0.071-fold in 1-palmitoyl-2-thiopalmitoyl phosphatidylcholine are significant.
There are 9 specifically altered metabolites in cervical cancer, among which indoxyl sulfate and cyclohexylamine are 0.119-fold and 0.193-fold, respectively, while 2-tert-butyl-6-methylphenol and trans-4-(aminomethyl)cyclohexanecarboxylic acid are 30.653-fold and 32.896-fold higher, respectively.
There are 13 specifically altered metabolites in rectal cancer, among which deoxythymidine 5'-phosphate (dTMP), stachyose and 3'-fucosyllactose are increased 87.35-fold, 70.57-fold and 66.217-fold, respectively.
There are 9 specifically altered metabolites in lung cancer, among which β-hydroxyethyltheophylline and 3-ethylphenol are elevated 108.637-fold and 11.187-fold, respectively, and two dipeptides, Glu-Pro and Ala-Pro, are elevated 14.76-fold and 11.646-fold, respectively.
There are 12 specifically altered metabolites in ovarian cancer, among which taurochenodeoxycholate, N,N'-diacetylchitobiose and fumarprotocetraric acid are 153.807-fold, 162.187-fold and 17.183-fold, respectively.
There are 10 specifically altered metabolites in breast cancer, among which swietenocoumarin B, cyclo(proline-leucine) and the tripeptide Pro-Phe-Arg are 116.380-fold, 4.490-fold and 3.222-fold, respectively.
There are 39 specifically altered metabolites in prostate cancer, among which cellobiose, disialyllactose and aconitic acid are elevated 243.043-fold, 96.584-fold and 16.253-fold, respectively.
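The fold values quoted above compare metabolite levels in a cancer group with those in the control group. The text does not define the calculation, so the sketch below assumes the common convention of dividing the group means; the values are invented.

```python
def fold_change(cancer_levels, control_levels):
    """Ratio of mean level in the cancer group to mean level in the controls (assumed definition)."""
    mean_cancer = sum(cancer_levels) / len(cancer_levels)
    mean_control = sum(control_levels) / len(control_levels)
    return mean_cancer / mean_control


def direction(fold):
    """Label a fold value as an increase or a decrease relative to controls."""
    if fold > 1.0:
        return "increased"
    if fold < 1.0:
        return "decreased"
    return "unchanged"


# Made-up levels, for illustration only.
up = fold_change([30.0, 34.0], [2.0, 2.0])
down = fold_change([1.0, 1.0], [10.0, 10.0])
print(up, direction(up))
print(down, direction(down))
```

Under this convention a value above 1 (for example 17.692-fold) denotes an increase and a value below 1 (for example 0.014-fold) denotes a decrease.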
The specific metabolite changes are shown in Fig. 7 and in Tables 1-9 below:
Table 1. Changes in the 7 metabolites of the general Raman detection data
NA: no reference available
Table 2. Changes in the 10 metabolites of the renal cancer Raman detection data
Table 3. Changes in the 8 metabolites of the gastric cancer Raman detection data
Table 4. Changes in the 9 metabolites of the cervical cancer Raman detection data
Table 5. Changes in the 13 metabolites of the rectal cancer Raman detection data
Table 6. Changes in the 39 metabolites of the prostate cancer Raman detection data
Table 7. Changes in the 9 metabolites of the lung cancer Raman detection data
Table 8. Changes in the 12 metabolites of the ovarian cancer Raman detection data
Table 9. Changes in the 10 metabolites of the breast cancer Raman detection data
1.4 Standard calibration
Raman detection is performed on standard substances of the metabolites to build a library of standard Raman peak spectra used for identifying the various tumors.
2. Cancer risk prediction model
2.1 Construction of the cancer risk prediction model
A cancer risk prediction model is constructed using a BP neural network. The model comprises an input layer, a hidden layer and an output layer; the input of the input layer is 1024 and its output is 512, the input of the hidden layer is 512 and its output is 128, and the input of the output layer is 128 and its output is 4.
2.2 Construction of data sets
The Raman detection data of the different plasma samples and standard samples are uniformly mapped into 1024-dimensional initial feature vectors and stored in txt files; each contains two rows of data, one being the wavenumber and the other the intensity, and the wavenumbers are identical across files, covering 279-2186 cm⁻¹ with a total of 1024 points;
The Raman detection data are divided into a training set and a test set in a ratio of 3:1.
The Raman detection data can be classified into 4 categories, i.e., tumor groups and Con (control group).
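The 3:1 division can be sketched as a shuffled split. Rounding the test quarter up is an assumption, chosen because it yields the 32 test cases mentioned elsewhere in the text for 126 samples; the text does not say whether the split was stratified by class, so none is applied here.

```python
import random


def split_3_to_1(samples, seed=0):
    """Shuffle the samples and return (training set, test set) in a 3:1 ratio."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_test = (len(shuffled) + 3) // 4  # one quarter, rounded up (assumption)
    return shuffled[n_test:], shuffled[:n_test]


sample_ids = list(range(126))  # stand-ins for the 126 samples
train_set, test_set = split_3_to_1(sample_ids)
print(len(train_set), len(test_set))
```

Fixing the seed makes the split reproducible, so the same test cases are used each time the model is evaluated.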
2.3 Training the cancer risk prediction model
The training set is input into the cancer risk prediction model to obtain an output result;
The output result is input into a softmax classification function to obtain a cancer risk prediction result.
2.4 Iterating the cancer risk prediction model
The test set is input into the trained cancer risk prediction model to obtain an output result;
The output result is input into a softmax classification function to obtain a cancer risk prediction result;
The cancer risk prediction results are compared with the original results of the test set to verify the effectiveness of the training.
The cancer risk prediction model is trained on the training set for 300 rounds, with a test performed every 10 rounds to verify the effectiveness of the training.
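The schedule of 300 training rounds with a test every 10 rounds can be sketched as a loop that takes the training and evaluation routines as arguments. The stub functions below are placeholders standing in for the real model; only the scheduling logic is illustrated.

```python
def run_training(train_one_round, evaluate, total_rounds=300, test_every=10):
    """Train for `total_rounds`, recording an evaluation every `test_every` rounds."""
    history = []
    for round_index in range(1, total_rounds + 1):
        train_one_round()
        if round_index % test_every == 0:
            history.append((round_index, evaluate()))
    return history


# Placeholder routines: a counter instead of a real model.
state = {"rounds": 0}


def fake_train():
    state["rounds"] += 1


def fake_evaluate():
    return state["rounds"]  # placeholder for a test-set metric


history = run_training(fake_train, fake_evaluate)
print(len(history), history[0], history[-1])
```

With these settings the model is evaluated 30 times, at rounds 10, 20, and so on up to 300.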
2.5 Trained cancer risk prediction model
The experimental data of the cancer risk prediction model after continued training are as follows:
Brief description of the prediction model algorithm
It can be seen that the sensitivity, specificity and accuracy for the eight tumors were 79.21%, 95.21% and 86.93%, respectively, in the Raman-based assay, and 89.70%, 95.98% and 91.48%, respectively, after optimization by the BP neural network AI algorithm. Thus, on the one hand, the Raman-based serological detection results and the mass spectrometric detection results are consistent during model training, the methodology is reliable, and the occurrence of cancer is judged comprehensively and accurately; on the other hand, the AI-based neural network algorithm markedly improves the sensitivity and specificity of the Raman-based serological detection results.
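Sensitivity, specificity and accuracy follow from the counts of a confusion matrix. The sketch below uses invented counts, since the underlying counts are not given in the text, and shows only the standard definitions.

```python
def screening_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp: cancer samples predicted as cancer
    fn: cancer samples predicted as control
    tn: control samples predicted as control
    fp: control samples predicted as cancer
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts, for illustration only.
sensitivity, specificity, accuracy = screening_metrics(tp=90, fn=10, tn=95, fp=5)
print(sensitivity, specificity, accuracy)
```

Sensitivity reflects how many cancer cases are caught, specificity how many controls are correctly cleared, and accuracy the overall share of correct calls.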
The main research process of the invention is as follows:
Mass spectrometry and Raman detection are performed on each sample, and the results are fused and refined by an AI-based BP neural network algorithm, so that Raman detection, mass spectrometry and artificial intelligence algorithms are integrated and the influence of clinical medication is deducted. A cancer detection model is established using 75% of the data and is finally integrated into the Raman detection workflow in the form of an AI algorithm, so that the probability of cancer occurrence or metastasis/recurrence can be rapidly detected and predicted from Raman-based serological detection.
The method uses a normal control group as the comparison for the cancer group. After detection, the Raman detection data of the serum of patients or of the normal control group are imported into the pre-trained cancer risk prediction model, and a cancer risk prediction result is obtained quickly (within 2 minutes). Out of a total of 126 samples, the accuracy of the Raman analysis and of the mass spectrometry analysis agreed well on 25% of the data (i.e., 32 cases), both being 90.6% (29/32).
Therefore, there was little difference between direct detection of metabolites in serum using Raman and detection by the gold standard, mass spectrometry, which confirms the reliability of the method for predicting carcinogenesis according to the present invention.
The invention first establishes a pan-cancer model using data from a plurality of patients with clinically confirmed cancer and optimizes an AI algorithm based on a BP neural network. Seven serum metabolites are taken as common metabolites, and, when predicting other cancers, metabolites with significant differential expression can be combined with them as auxiliary metabolites for predicting those cancers. The method gives accurate and reliable cancer risk prediction results and has broad application prospects.
The most important advantages of the invention are:
Raman detection can detect the changes of metabolites in a cancer sample and qualitatively analyze serum metabolites, which facilitates early screening and early diagnosis of cancer. In our experiment, Raman detection and mass spectrometry detection were each carried out on serum samples from the same cancer patients, and a deep neural network model was used for comparison. The results show that the accuracy of the Raman detection result is consistent with that of the mass spectrometry result, which reflects the accuracy and reliability of Raman detection of serum. In addition, Raman detection enables rapid, large-scale detection of small-molecule metabolites with high throughput and high accuracy. Because its detection cost is relatively low, it has promising application prospects in the accurate diagnosis and metastasis/recurrence prediction of various cancers and is suitable for large-scale popularization and application. If the sample size can be enlarged, and the weights of Raman data and clinical information further optimized by combining the medical histories of different cancer patients with different cancer-specific serological markers, the probability of occurrence, metastasis, recurrence and the like of different cancers can be predicted more accurately, providing a convenient, rapid and accurate prediction system for the early warning of more patients and for the development of tumor metastasis, recurrence and the like.
The technical features of the embodiments described above may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the embodiments described above are not described, but should be considered as being within the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments express only several embodiments of the present application, and their description is specific and detailed, but they should not be understood as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.
Claims (10)
1. A method for predicting cancer risk, comprising:
acquiring Raman detection data of the plasma sample;
inputting the Raman detection data into a cancer risk prediction model which is trained in advance and is based on a BP neural network artificial intelligence algorithm so as to obtain a cancer risk prediction result;
wherein the cancer risk prediction result has an accuracy of greater than 91.48%, a sensitivity of greater than 88.70%, and a specificity of greater than 95.98%.
2. A cancer risk prediction method according to claim 1, wherein the Raman detection data comprises generic Raman detection data comprising at least:
.
3. The cancer risk prediction method according to claim 2, wherein the Raman detection data further comprises:
renal cancer Raman detection data comprising at least:
; and/or
gastric cancer Raman detection data comprising at least:
; and/or
cervical cancer Raman detection data comprising at least:
; and/or
rectal cancer Raman detection data comprising at least:
; and/or
prostate cancer Raman detection data comprising at least:
; and/or
lung cancer Raman detection data comprising at least:
; and/or
ovarian cancer Raman detection data comprising at least:
; and/or
breast cancer Raman detection data comprising at least:
4. A method for predicting cancer risk according to any of claims 1 to 3, wherein before acquiring Raman detection data of a plasma sample, the method further comprises:
obtaining a blood sample;
centrifuging the blood sample to obtain an initial plasma sample;
subjecting the initial plasma sample to a membrane filtration-based centrifugation process to obtain a plasma sample;
wherein the time for performing the membrane filtration-based centrifugation on the initial plasma sample is less than or equal to 5 min, the volume of the initial plasma sample is less than or equal to 450 μL, and the volume of the plasma sample is less than or equal to 5 μL.
5. The method for predicting cancer risk according to any one of claims 1 to 4, wherein inputting the Raman detection data into a pre-trained cancer risk prediction model based on a BP neural network artificial intelligence algorithm to obtain a cancer risk prediction result comprises:
processing the Raman detection data, and mapping the Raman detection data into 1024-dimensional initial feature vectors;
inputting 1024-dimensional initial feature vectors into a pre-trained cancer risk prediction model based on a BP neural network artificial intelligence algorithm to obtain 4-dimensional final feature vectors;
and inputting the 4-dimensional final feature vector into a classification function for processing so as to obtain a cancer risk prediction result.
6. The method for predicting cancer risk according to any one of claims 1 to 5, wherein the method for training the cancer risk prediction model based on the BP neural network artificial intelligence algorithm comprises the following steps:
constructing a cancer risk prediction model, wherein the cancer risk prediction model comprises an input layer, a hidden layer and an output layer, the input layer has 1024 inputs and 512 outputs, the hidden layer has 512 inputs and 128 outputs, and the output layer has 128 inputs and 4 outputs;
constructing a training set and a test set according to Raman detection data, wherein the ratio of the training set to the test set is 3:1;
inputting the training set into the cancer risk prediction model for training to obtain an output result;
inputting the output result into a classification function for processing to obtain a cancer risk prediction result;
and testing the cancer risk prediction result to detect the effectiveness of training, and iterating the cancer risk prediction model according to the effectiveness detection result until the training is finished.
7. A cancer risk prediction system comprising:
a plasma sample obtaining device for performing centrifugation on a blood sample to obtain an initial plasma sample, and performing centrifugation based on membrane filtration on the initial plasma sample to obtain a plasma sample; wherein the time for performing the membrane filtration-based centrifugation treatment on the initial plasma sample is less than or equal to 5 min;
a Raman detection device for performing Raman detection on the plasma sample to acquire Raman detection data;
a cancer risk prediction device for acquiring the Raman detection data of the plasma sample and inputting the Raman detection data into a pre-trained cancer risk prediction model based on a BP neural network artificial intelligence algorithm to obtain a cancer risk prediction result;
wherein the cancer risk prediction result has an accuracy of greater than 91.48%, a sensitivity of greater than 88.70%, and a specificity of greater than 95.98%.
8. The cancer risk prediction system of claim 7, wherein the plasma sample acquiring device comprises:
an anticoagulation blood-taking unit for holding a blood sample;
a centrifugation unit for centrifuging the blood sample of the anticoagulation blood-taking unit to obtain an initial plasma sample, and centrifuging the initial plasma sample to obtain a plasma sample;
a membrane filtration unit for performing a membrane treatment on the initial plasma sample to obtain the plasma sample while the centrifugation unit performs a centrifugation treatment on the initial plasma sample.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor when executing the computer program implements a cancer risk prediction method as defined in any one of claims 1 to 6.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the method for cancer risk prediction according to any one of claims 1 to 6.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN2022101448077 | 2022-02-17 | ||
| CN202210144807 | 2022-02-17 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN114678122A true CN114678122A (en) | 2022-06-28 |
Family
ID=82071829
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202210149664.9A Pending CN114678122A (en) | 2022-02-17 | 2022-02-18 | A cancer risk prediction method, system, device and medium |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN114678122A (en) |
Citations (10)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| GB201220573D0 (en) * | 2012-11-15 | 2013-01-02 | Univ Central Lancashire | Methods of diagnosing proliferative disorders |
| CN104142320A (en) * | 2013-06-08 | 2014-11-12 | 李龙江 | Serum surface enhanced Raman spectrum based parotid tumor diagnosis technology |
| CN107741416A (en) * | 2017-09-14 | 2018-02-27 | 复旦大学 | The SERS probes of Multiple Antibodies mark and substrate and its preparation method and application |
| US20180068083A1 (en) * | 2014-12-08 | 2018-03-08 | 20/20 Gene Systems, Inc. | Methods and machine learning systems for predicting the likelihood or risk of having cancer |
| CN109253998A (en) * | 2018-10-25 | 2019-01-22 | 珠海彩晶光谱科技有限公司 | Metal-wrappage-antibody composite nanoparticle quantitative detection tumor marker method based on Raman enhancing |
| CN111812078A (en) * | 2020-08-27 | 2020-10-23 | 上海交通大学医学院附属仁济医院 | Artificial intelligence-assisted early diagnosis of prostate tumors based on surface-enhanced Raman spectroscopy |
| CN112201346A (en) * | 2020-10-12 | 2021-01-08 | 哈尔滨工业大学(深圳) | Cancer survival prediction method, apparatus, computing device and computer readable storage medium |
| CN112820403A (en) * | 2021-02-25 | 2021-05-18 | 中山大学 | Deep learning method for predicting prognosis risk of cancer patient based on multiple groups of mathematical data |
| WO2021127814A1 (en) * | 2019-12-23 | 2021-07-01 | 深圳市人民医院 | Photo-nanovaccine for cancer treatment, preparation method therefor and application thereof |
| CN113801938A (en) * | 2021-10-28 | 2021-12-17 | 安徽同科生物科技有限公司 | Biomarker combination and application thereof in predicting prognosis of endometrial cancer patient |
-
2022
- 2022-02-18 CN CN202210149664.9A patent/CN114678122A/en active Pending
Non-Patent Citations (3)
| Title |
|---|
| 刘伟;赵元黎;余发军;张凤秋;王雷鸣;刘丹;: "恶性骨肿瘤患者红细胞的拉曼光谱研究", 激光与红外, no. 02, 20 February 2008 (2008-02-20) * |
| 陈荣;李永增;冯尚源;黄祖芳;谢树森;: "人体组织拉曼光谱研究进展", 激光与光电子学进展, no. 01, 10 January 2008 (2008-01-10) * |
| 雷锦誌;张晗;李永志;康熙雄;: "基于AI技术的癌症风险动态预警系统的应用研究", 中国卫生检验杂志, no. 05, 10 March 2020 (2020-03-10) * |
Cited By (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN115985503A (en) * | 2023-03-20 | 2023-04-18 | 西南石油大学 | Cancer Prediction System Based on Ensemble Learning |
| CN115985503B (en) * | 2023-03-20 | 2023-07-21 | 西南石油大学 | Cancer Prediction System Based on Ensemble Learning |
| CN117476222A (en) * | 2023-09-28 | 2024-01-30 | 上海交通大学医学院附属瑞金医院 | A risk prediction method, system and electronic device for liver metastasis of intestinal cancer |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |


















































