Disclosure of Invention
In order to solve the technical problems existing in the prior art, the invention provides a traditional Chinese medicine tongue image identification method and system based on image processing.
The invention provides a traditional Chinese medicine tongue image identification method based on image processing, which comprises the following steps:
S1, collecting tongue surface and tongue bottom images through image collecting equipment;
S2, performing a first image preprocessing operation on the tongue surface and tongue bottom images to obtain a first preprocessed tongue surface image and a first preprocessed tongue bottom image;
S3, performing a second image preprocessing operation on the first preprocessed tongue surface image and the first preprocessed tongue bottom image to obtain a second preprocessed tongue surface image and a second preprocessed tongue bottom image;
The second image preprocessing operation specifically comprises: for the first preprocessed tongue surface image, adjusting its brightness by a linear change method; and for the first preprocessed tongue bottom image, performing illumination correction by a block correction method;
The illumination correction of the first preprocessed tongue bottom image by the block correction method is specifically as follows:
Sa, dividing the first preprocessed tongue bottom image into a plurality of small blocks;
Sb, adopting an improved Gaussian weight function formula to independently carry out illumination estimation on each small block;
Sc, carrying out image correction on each small block according to the pixel local illumination estimated value of each small block;
Sd, carrying out image fusion on the corrected small blocks, thereby completing the illumination correction of the first preprocessed tongue bottom image by the block correction method.
S4, inputting the second preprocessed tongue surface image and the second preprocessed tongue bottom image into a tongue image recognition model to obtain a tongue image recognition result.
Preferably, in Sb, a local weighted average method is used to calculate the illumination estimation component: within the neighborhood of each pixel of each small block, different weights are assigned according to the distance from the center pixel, and the illumination estimation component of the pixel is obtained as the weighted average of the gray values of the pixels in the neighborhood. The specific formula is as follows:
$$I_{\text{illumination}}(x, y) = \frac{\sum_{(i,j) \in N(x,y)} w(i,j)\, I(i,j)}{\sum_{(i,j) \in N(x,y)} w(i,j)};$$
where N(x, y) is the neighborhood centered on pixel (x, y), I(i, j) is the gray value of pixel (i, j), w(i, j) is the weight of pixel (i, j), and I_illumination(x, y) is the illumination estimation component of the center pixel (x, y).
Preferably, the improved Gaussian weight function formula is:
[formula not reproduced in the source text];
where w(i, j) is the weight, (x, y) is the position of the center pixel, (i, j) is the pixel position in the neighborhood, σ is the standard deviation of the Gaussian weight function, and the two gradient terms are, respectively, the gradient value of each image at pixel (i, j) and the maximum value of the gradient of each image.
Preferably, Sd is specifically as follows: in the overlapping regions of the small blocks, different weights are assigned according to the distance between each pixel and the center of the small block, and smooth fusion is carried out; the weights are calculated with the improved Gaussian weight function formula, and the fused pixel value I_fused(i, j) of the overlapping region is calculated from the weights;
The specific formula is as follows:
$$I_{\text{fused}}(i, j) = \frac{\sum_{k} w_k(i,j)\, I_{\text{corrected},k}(i,j)}{\sum_{k} w_k(i,j)};$$
where I_corrected,k(i, j) is the gray value of the corrected image of the k-th small block at pixel (i, j), and w_k(i, j) is the weight of the k-th small block at pixel (i, j).
Preferably, the first image preprocessing operation includes image denoising, image segmentation, and image enhancement.
Preferably, the image denoising removes random noise in the image by median filtering; the image segmentation extracts the tongue contour in the tongue surface image by a Canny edge detection algorithm and extracts the tongue contour in the tongue bottom image by a segmentation method based on a color threshold.
Preferably, the tongue outline in the tongue bottom image is extracted by a segmentation method based on a color threshold, specifically, the tongue bottom image is converted from RGB to HSV color space by using a color space conversion function, the color threshold is set according to the color characteristics of the tongue bottom image, the tongue bottom image is segmented by using the color threshold, the pixels meeting the conditions are set to be white, and other pixels are set to be black, so that a tongue area is extracted.
Preferably, the color threshold is set according to the color characteristics of the sublingual image, specifically, a histogram analysis tool is used for observing the color distribution of the sublingual image in an HSV color space, then a typical color range of the sublingual is selected according to the observation result, and the color threshold range is set according to the typical color range.
Preferably, the color threshold range is:
H (hue): 0 to 30 degrees;
S (saturation): 0 to 100;
V (brightness): 100 to 255.
According to another aspect of the present invention, there is provided an image processing-based tongue image recognition system of traditional Chinese medicine, the system adopting the above-mentioned image processing-based tongue image recognition method, the system comprising:
the image acquisition equipment, which is used for acquiring tongue surface and tongue bottom images;
the first preprocessing module, which is used for performing a first image preprocessing operation on the tongue surface and tongue bottom images to obtain a first preprocessed tongue surface image and a first preprocessed tongue bottom image;
the second preprocessing module, which is used for performing a second image preprocessing operation on the first preprocessed tongue surface image and the first preprocessed tongue bottom image to obtain a second preprocessed tongue surface image and a second preprocessed tongue bottom image;
the tongue image recognition module, which is used for inputting the second preprocessed tongue surface image and the second preprocessed tongue bottom image into a tongue image recognition model to obtain a tongue image recognition result.
The embodiment of the invention has the following technical effects:
The invention first acquires the tongue bottom image and the tongue surface image, and then performs conventional image preprocessing and illumination correction on them. Because the tongue surface and the tongue bottom differ in structure and position, they differ in sensitivity to illumination conditions, so different illumination corrections are applied: the brightness of the tongue surface image is adjusted by a linear change method, and the tongue bottom image is corrected by a block correction method. During illumination correction of the tongue bottom image, illumination estimation is performed independently on each small block with an improved Gaussian weight function formula, which introduces gradient calculation into the illumination estimation of block images. Since the details contained in each block differ, the gradient term of the block image allows the improvement term of the Gaussian weight function to be determined from the image details of each tongue bottom block. This makes the illumination estimation more targeted and allows the detail information of the tongue bottom image to be better retained.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below. It will be apparent that the described embodiments are only some, but not all, embodiments of the invention. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the invention, are within the scope of the invention.
Embodiment 1: FIG. 1 shows a flowchart of a traditional Chinese medicine tongue image identification method based on image processing. As shown in FIG. 1, the method comprises the following steps:
S1, collecting tongue surface and tongue bottom images through image collecting equipment;
The image acquisition equipment comprises a high-resolution camera, a uniform light source and image acquisition software;
The high-resolution camera is used for shooting tongue surface and tongue bottom images, the high-resolution camera is required to have high resolution so as to ensure that fine features of tongue images, such as tongue color, tongue coating color, tongue shape, tongue coating quality and the like, can be clearly captured, and meanwhile, the high-resolution camera is required to have an automatic focusing function so as to adapt to tongue positions and shapes of different patients.
The uniform light source is used for providing stable illumination conditions, can ensure the color accuracy and consistency of tongue image, and reduce the image quality degradation caused by uneven illumination.
The image acquisition software is used for controlling shooting parameters of the camera and the acquisition process of the image, and has the function of automatically adjusting shooting angles and light rays, so that the acquired tongue image is clear and stable.
Wherein, the process of collecting the lingual surface and the lingual bottom image comprises the following steps:
Equipment initialization: before the tongue surface and tongue bottom images are acquired, the image acquisition equipment is started, and the high-resolution camera and the uniform light source are confirmed to be in a normal working state.
Tongue image acquisition: the subject whose tongue image is to be acquired extends the tongue, and an operator controls the camera through the image acquisition software to capture the tongue images, which comprise the tongue surface image and the tongue bottom image. The image acquisition software can automatically adjust the shooting angle of the camera and the intensity of the light source so that the acquired tongue images are clear and stable.
S2, performing first image preprocessing operation on the lingual surface and the lingual bottom image to obtain a first preprocessed lingual surface image and a first preprocessed lingual bottom image;
The first image preprocessing operation comprises image denoising, image segmentation and image enhancement;
In this step, the image denoising removes random noise in the image by median filtering. The basic principle is that each pixel value in the image is replaced by the median of the values in that pixel's neighborhood, which smooths the image and removes noise.
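The median replacement described above can be illustrated with a minimal sketch. It uses only the Python standard library; the function name `median_filter`, the 3 x 3 window, and the clamped border handling are assumptions made for illustration, since the text does not specify them.

```python
from statistics import median


def median_filter(img, k=3):
    """Replace each pixel with the median of its k x k neighborhood.

    img is a list of rows of gray values. Border pixels are handled by
    clamping coordinates to the image (an assumption, not from the text).
    """
    h, w = len(img), len(img[0])
    r = k // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy in range(-r, r + 1)
                for dx in range(-r, r + 1)
            ]
            out[y][x] = median(window)
    return out


# A single bright outlier in a flat region is removed by the filter.
noisy = [
    [10, 10, 10],
    [10, 255, 10],
    [10, 10, 10],
]
denoised = median_filter(noisy)
```

In practice an optimized library routine would likely be used instead of nested Python loops; the sketch only shows the principle.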
The image segmentation extracts the tongue contour in the tongue surface image by a Canny edge detection algorithm and extracts the tongue contour in the tongue bottom image by a segmentation method based on a color threshold;
Further, extracting the tongue contour in the tongue surface image by the Canny edge detection algorithm specifically comprises: performing gradient calculation on the tongue surface image to obtain the gradient intensity and direction of each pixel; for each pixel, checking the neighboring pixels along its gradient direction and setting its gradient value to 0 if it is not a local maximum (non-maximum suppression); setting two thresholds, a high threshold for detecting strong edges and a low threshold for detecting weak edges; and connecting the strong edges and the weak edges by an edge tracking algorithm to form complete edges. Through these steps, the Canny edge detection algorithm can effectively extract the tongue contour in the tongue surface image.
Specifically, the gradient calculation uses a Sobel operator: the gradients of the tongue surface image in the horizontal and vertical directions are calculated respectively, and the gradient intensity and direction of each pixel are then calculated from them.
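As an illustration of the gradient step only (not the full Canny pipeline), the following sketch computes Sobel gradient intensity and direction in plain Python. The function name `sobel_gradient` and the choice to leave border pixels at zero are assumptions for the example.

```python
import math

# Standard 3 x 3 Sobel kernels for horizontal (x) and vertical (y) gradients.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_gradient(img):
    """Return (magnitude, direction) maps for a gray image given as rows.

    Direction is atan2(gy, gx) in radians. Border pixels are left at 0
    (an assumption made to keep the sketch short).
    """
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    ang = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0
            for dy in range(3):
                for dx in range(3):
                    p = img[y + dy - 1][x + dx - 1]
                    gx += SOBEL_X[dy][dx] * p
                    gy += SOBEL_Y[dy][dx] * p
            mag[y][x] = math.hypot(gx, gy)
            ang[y][x] = math.atan2(gy, gx)
    return mag, ang


# Vertical step edge: dark on the left, bright on the right.
step = [[0, 0, 100, 100] for _ in range(4)]
mag, ang = sobel_gradient(step)
```

For the step edge above, the interior pixels have a purely horizontal gradient, so the direction is 0 radians and the vertical component vanishes.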
The blood vessel texture and the color characteristics of the tongue bottom image are pronounced, so the tongue contour can be extracted through color threshold segmentation. Extracting the tongue contour in the tongue bottom image by the segmentation method based on a color threshold specifically comprises converting the tongue bottom image from RGB to the HSV color space by using a color space conversion function;
The tongue bottom is usually light in color (pink or light red), while its vein texture is more pronounced and darker, so the tongue contour can be effectively extracted from the color characteristics of the tongue bottom image. To perform color segmentation well, the tongue bottom image is usually converted from the RGB color space to the HSV color space. HSV is a common choice because it separates color information from brightness information, which makes it convenient to set a color threshold. In the HSV color space, H (Hue) represents the type of color, S (Saturation) represents the purity of the color, and V (Value) represents the brightness of the color.
Observing the color distribution of the tongue bottom image in an HSV color space by using a histogram analysis tool, selecting a typical color range of the tongue bottom according to an observation result, and setting a proper color threshold range according to the typical color range;
For example, typical colors of the tongue bottom are usually between pink and light red, corresponding to H values from 0 to 30 degrees, S values from 0 to 100, and V values from 100 to 255. A suitable color threshold range is therefore:
H (hue): 0 to 30 degrees;
S (saturation): 0 to 100;
V (brightness): 100 to 255;
The tongue bottom image is then segmented with this color threshold range: pixels within the range are extracted and set to white (255), and all other pixels are set to black (0), which generates a binary image from which the tongue region is extracted. The extracted contour is refined through morphological operations such as dilation, erosion, opening, and closing, which remove noise and irregular parts.
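The thresholding step can be sketched with the standard-library `colorsys` module. The sketch assumes that H is expressed in degrees (0 to 360), S on a 0 to 100 scale, and V on a 0 to 255 scale; the text gives the numeric ranges but not the scales, so these conventions and the name `segment_by_color` are assumptions. Morphological refinement is not shown.

```python
import colorsys

# Threshold ranges taken from the text; the scales are assumed (see above).
H_RANGE = (0.0, 30.0)     # hue, degrees
S_RANGE = (0.0, 100.0)    # saturation, 0-100 scale
V_RANGE = (100.0, 255.0)  # brightness, 0-255 scale


def segment_by_color(rgb_img):
    """Return a binary mask: 255 where the pixel is inside all three ranges."""
    mask = []
    for row in rgb_img:
        mask_row = []
        for r, g, b in row:
            # colorsys works on floats in [0, 1] and returns h, s, v in [0, 1].
            h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            h_deg, s_pct, v_255 = h * 360.0, s * 100.0, v * 255.0
            inside = (
                H_RANGE[0] <= h_deg <= H_RANGE[1]
                and S_RANGE[0] <= s_pct <= S_RANGE[1]
                and V_RANGE[0] <= v_255 <= V_RANGE[1]
            )
            mask_row.append(255 if inside else 0)
        mask.append(mask_row)
    return mask


# One pinkish pixel, one blue pixel, one very dark red pixel.
pixels = [[(230, 150, 140), (20, 40, 200), (30, 10, 10)]]
mask = segment_by_color(pixels)
```

Only the pinkish pixel falls inside all three ranges: the blue pixel fails the hue test and the dark pixel fails the brightness test. Note that some libraries store HSV on other scales (for example, OpenCV's 8-bit images use H from 0 to 179 and S, V from 0 to 255), so thresholds would need rescaling there.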
Through the steps, the tongue outline in the sublingual image can be effectively extracted, the outline is perfected through morphological operation, and more accurate data support is provided for subsequent feature extraction and analysis.
Image enhancement is an important step in image preprocessing. It aims to improve the visual quality of the tongue surface image and the tongue bottom image, enhance the useful information in the images, and improve their identifiability. Histogram equalization is a common image enhancement technique that is particularly suitable for improving image contrast, thereby enhancing the texture and color characteristics of the image. Further, the image enhancement of the tongue surface and tongue bottom images is specifically:
counting the number of pixels at each gray value in the tongue surface and tongue bottom images to obtain a histogram; calculating the cumulative distribution function (CDF) of the histogram to obtain the cumulative histogram; and mapping the gray values of the original image to new gray values according to the cumulative histogram so that the new gray values are distributed more uniformly, thereby achieving image enhancement.
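The three steps (histogram, cumulative histogram, gray-value mapping) can be sketched as follows. The normalization that maps the smallest occupied gray level to 0 and the largest to 255 is one common convention and is an assumption here; the function name `equalize_histogram` is likewise illustrative.

```python
def equalize_histogram(img, levels=256):
    """Histogram-equalize a gray image given as rows of ints in [0, levels)."""
    h, w = len(img), len(img[0])

    # Step 1: count the pixels at each gray value.
    hist = [0] * levels
    for row in img:
        for v in row:
            hist[v] += 1

    # Step 2: cumulative histogram (unnormalized CDF).
    cdf, total = [], 0
    for count in hist:
        total += count
        cdf.append(total)

    # Step 3: map old gray values to new ones through the CDF.
    n = h * w
    cdf_min = next(c for c in cdf if c > 0)
    if n == cdf_min:
        return [row[:] for row in img]  # constant image: nothing to stretch
    lut = [
        max(0, round((c - cdf_min) / (n - cdf_min) * (levels - 1)))
        for c in cdf
    ]
    return [[lut[v] for v in row] for row in img]


low_contrast = [[50, 50], [100, 150]]
enhanced = equalize_histogram(low_contrast)
```

After equalization the narrow input range 50 to 150 is stretched over the full 0 to 255 range while the ordering of gray values is preserved.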
S3, performing second image preprocessing operation on the first preprocessed tongue surface image and the first preprocessed tongue bottom image to obtain a second preprocessed tongue surface image and a second preprocessed tongue bottom image;
The tongue surface and the tongue bottom have different structural characteristics, the tongue surface is relatively flat, the tongue bottom has more folds and vascular textures, so that when the tongue surface image and the tongue bottom image are shot, the tongue surface and the tongue bottom have different reflection and absorption characteristics on illumination, and the sensitivity of the tongue surface and the tongue bottom to illumination conditions is different, so that effective illumination correction is necessary for accurately extracting tongue image features.
The second image preprocessing operation specifically comprises the steps of adjusting the brightness of the first preprocessed tongue surface image by adopting a linear change method, and realizing illumination correction on the first preprocessed tongue bottom image by adopting a block correction method;
A target brightness range is set, and a linear transformation formula is used to adjust the gray values of the first preprocessed tongue surface image into the target brightness range, thereby adjusting the brightness of the first preprocessed tongue surface image;
In this embodiment, the target brightness range is [mean-offset, mean+offset], where mean is the target average brightness of the first preprocessed tongue surface image and offset is a preset offset value;
The linear transformation formula is as follows:
[formula not reproduced in the source text];
where I_old is the gray value of the first preprocessed tongue surface image and I_new is the adjusted gray value. With this formula, the brightness of the image can be adjusted into the target range so that the overall brightness of the image is more uniform. FIG. 3 shows a comparison of the tongue image before and after preprocessing; as can be seen from FIG. 3, the first preprocessing flow of this embodiment achieves accurate adjustment of the tongue image brightness.
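Because the linear transformation formula itself is not reproduced in the text, the following is only a sketch of one plausible linear mapping: a min-max rescaling of the gray values onto [mean-offset, mean+offset]. The use of the image minimum and maximum, and the name `linear_brightness_adjust`, are assumptions and may differ from the formula actually intended.

```python
def linear_brightness_adjust(img, mean, offset):
    """Linearly map gray values onto [mean - offset, mean + offset].

    ASSUMPTION: min-max rescaling; the source does not give the formula.
    """
    lo, hi = mean - offset, mean + offset
    flat = [v for row in img for v in row]
    i_min, i_max = min(flat), max(flat)
    if i_max == i_min:
        # A constant image has no range to stretch; place it at the target mean.
        return [[float(mean)] * len(row) for row in img]
    scale = (hi - lo) / (i_max - i_min)
    return [[lo + (v - i_min) * scale for v in row] for row in img]


adjusted = linear_brightness_adjust([[0, 50], [100, 200]], mean=128, offset=40)
```

With mean = 128 and offset = 40, the darkest input pixel maps to 88 and the brightest to 168, and intermediate values keep their relative spacing.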
Because the tongue bottom image has more folds and vascular textures, its illumination correction needs to handle local illumination changes more carefully. Therefore, as shown in FIG. 2, illumination correction of the first preprocessed tongue bottom image by the block correction method is implemented as follows:
Sa, dividing the first preprocessed tongue bottom image into a plurality of small blocks;
Specifically, in this step, the size of the small blocks may be adjusted according to the resolution of the first preprocessed tongue bottom image, and the image may be divided into small blocks of 32×32 or 64×64, for example, and a certain overlapping area is ensured between the small blocks during the blocking so as to perform smoothing in a subsequent fusion process, where the size of the overlapping area may be adjusted according to the size of the small blocks, and is typically 1/4 to 1/2 of the side length of the small blocks.
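A minimal sketch of overlapping block division follows. The default block size 32 and overlap 8 (one quarter of the side length) follow the ranges mentioned above; the function names, and the choice to place a final block flush with the image border when the stride does not divide evenly, are assumptions.

```python
def block_origins(length, block, overlap):
    """Start offsets along one axis so neighbouring blocks share `overlap` pixels."""
    if block >= length:
        return [0]
    step = block - overlap  # requires overlap < block
    starts = list(range(0, length - block + 1, step))
    if starts[-1] + block < length:
        starts.append(length - block)  # last block sits flush with the border
    return starts


def split_into_blocks(img, block=32, overlap=8):
    """Return a list of (y0, x0, tile) covering the whole image."""
    h, w = len(img), len(img[0])
    blocks = []
    for y0 in block_origins(h, block, overlap):
        for x0 in block_origins(w, block, overlap):
            tile = [row[x0:x0 + block] for row in img[y0:y0 + block]]
            blocks.append((y0, x0, tile))
    return blocks


image = [[0] * 80 for _ in range(80)]
tiles = split_into_blocks(image, block=32, overlap=8)
```

For an 80 x 80 image with 32 x 32 blocks and an 8-pixel overlap, the origins along each axis are 0, 24, and 48, giving nine blocks in total.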
Sb, adopting an improved Gaussian weight function formula to independently carry out illumination estimation on each small block;
In this step, a local weighted average method is adopted to estimate the illumination component: within the neighborhood of each pixel of each small block, different weights are assigned according to the distance from the center pixel (the closer the pixel, the larger the weight), and the illumination estimation component of the pixel is obtained as the weighted average of the gray values of the pixels in the neighborhood. The specific formula is as follows:
$$I_{\text{illumination}}(x, y) = \frac{\sum_{(i,j) \in N(x,y)} w(i,j)\, I(i,j)}{\sum_{(i,j) \in N(x,y)} w(i,j)};$$
where N(x, y) is the neighborhood centered on pixel (x, y), I(i, j) is the gray value of pixel (i, j), w(i, j) is the weight of pixel (i, j), and I_illumination(x, y) is the illumination estimation component of the center pixel (x, y);
In this step, the determination of the weights has a large influence on the calculated illumination estimate. In the prior art, the weights are generally calculated with a Gaussian function, namely
$$w(i, j) = \exp\left(-\frac{(i - x)^2 + (j - y)^2}{2\sigma^2}\right);$$
However, a tongue bottom image contains more detail information, such as folds and vascular textures, and the Gaussian weight function can lose some of this detail when processing the edges and detailed parts of the image. To address this, the present embodiment provides an improved technical scheme for determining the weights with a Gaussian weight function;
wherein the improved Gaussian weight function formula is:
[formula not reproduced in the source text];
where w(i, j) is the weight, (x, y) is the position of the center pixel, (i, j) is the pixel position in the neighborhood, σ is the standard deviation of the Gaussian weight function, and the two gradient terms are, respectively, the gradient value of each image at pixel (i, j) and the maximum value of the gradient of each image;
The improved Gaussian weight function formula introduces gradient calculation into the specific application scenario of illumination estimation for block images. Because the details contained in each block differ, introducing the gradient term of the block image pixels allows the improvement term of the Gaussian weight function to be determined from the image details of each tongue bottom block when illumination estimation is performed on its pixels. This makes the illumination estimation more targeted, so that the estimated pixels better retain the detail information of the tongue bottom image.
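The weighted-average estimate can be sketched as below. The plain Gaussian weight follows the prior-art form described above. The gradient modulation is NOT the formula of this embodiment, which is not reproduced in the text: the factor (1 - g / g_max) is a hypothetical stand-in chosen only to show how a gradient term normalized by the maximum gradient could reduce the influence of edge pixels. All function and parameter names are illustrative.

```python
import math


def gaussian_weight(i, j, x, y, sigma):
    """Plain Gaussian weight of neighbour (i, j) relative to centre (x, y)."""
    return math.exp(-((i - x) ** 2 + (j - y) ** 2) / (2.0 * sigma ** 2))


def estimate_illumination(block, radius=2, sigma=1.5, grad=None):
    """Weighted local mean of gray values around every pixel of a block.

    grad: optional gradient-magnitude map of the same size. When given,
    each weight is multiplied by (1 - g / g_max). This modulation is a
    HYPOTHETICAL form, assumed for illustration only.
    """
    h, w = len(block), len(block[0])
    g_max = max(max(row) for row in grad) if grad is not None else 0.0
    out = [[0.0] * w for _ in range(h)]
    for x in range(h):
        for y in range(w):
            num = den = 0.0
            for i in range(max(0, x - radius), min(h, x + radius + 1)):
                for j in range(max(0, y - radius), min(w, y + radius + 1)):
                    wt = gaussian_weight(i, j, x, y, sigma)
                    if grad is not None and g_max > 0:
                        wt *= 1.0 - grad[i][j] / g_max  # assumed modulation
                    num += wt * block[i][j]
                    den += wt
            # Fall back to the pixel itself if every weight was suppressed.
            out[x][y] = num / den if den > 0 else float(block[x][y])
    return out


# A dark detail pixel on a bright background, with a strong gradient there.
spot = [[100] * 5 for _ in range(5)]
spot[2][2] = 0
grad_map = [[0.0] * 5 for _ in range(5)]
grad_map[2][2] = 10.0

plain_estimate = estimate_illumination(spot)[2][2]
modulated_estimate = estimate_illumination(spot, grad=grad_map)[2][2]
```

With the plain Gaussian weights, the dark detail pixel pulls the illumination estimate at its own position below the background level. Under the assumed modulation, the high-gradient pixel contributes nothing, so the estimate stays at the background level and the detail is not absorbed into the illumination component.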
Sc, carrying out image correction on each small block according to the pixel local illumination estimated value of each small block;
The corrected image of each small block is obtained by dividing the gray value of each pixel of the original image by the corresponding illumination estimation component.
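The division step can be sketched in a few lines. The small `eps` guard against division by zero and the function name `correct_block` are assumptions; the text does not say how the resulting ratio is rescaled for display, so no rescaling is applied here.

```python
def correct_block(block, illumination, eps=1e-6):
    """Divide each gray value by its illumination estimate.

    eps guards against division by zero (an assumption, not from the text).
    The result is a ratio, roughly 1.0 where the pixel matches its
    estimated local illumination.
    """
    return [
        [block[r][c] / max(illumination[r][c], eps) for c in range(len(block[0]))]
        for r in range(len(block))
    ]


corrected = correct_block([[50, 100], [200, 100]], [[100, 100], [100, 100]])
```

Pixels darker than their local illumination estimate map to values below 1, and brighter pixels map to values above 1, so slow illumination variation is divided out while local detail remains.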
And Sd, carrying out image fusion on each corrected small block to realize illumination correction on the first preprocessed tongue bottom image by adopting a block correction method.
In the fusion process, the main point to note is the smoothing of the edges of the small blocks to avoid obvious splicing marks. Sd is therefore specifically as follows: in the overlapping regions of the small blocks, different weights are assigned according to the distance between each pixel and the center of the small block, and smooth fusion is carried out; the weights are calculated with the improved Gaussian weight function formula, and the fused pixel values of the overlapping regions are calculated from the weights;
The specific formula is as follows:
$$I_{\text{fused}}(i, j) = \frac{\sum_{k} w_k(i,j)\, I_{\text{corrected},k}(i,j)}{\sum_{k} w_k(i,j)};$$
where I_corrected,k(i, j) is the gray value of the corrected image of the k-th small block at pixel (i, j), and w_k(i, j) is the weight of the k-th small block at pixel (i, j).
In the step, the weight is gradually reduced in the edge area of the small block, so that obvious splicing marks are avoided, and meanwhile, the gray value transition of the fused image in the edge area is ensured to be natural through the smooth attenuation characteristic of the improved Gaussian weight function.
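The weighted fusion of overlapping blocks can be sketched as a normalized accumulation. For simplicity the helper below uses a plain Gaussian of the distance to the block center as a stand-in for the improved weight function, whose formula is not reproduced in the text; the names `center_gaussian_weights` and `fuse_blocks` are illustrative.

```python
import math


def center_gaussian_weights(size, sigma):
    """Weights that decay with distance from the centre of a size x size block.

    Stand-in for the improved weight function (an assumption).
    """
    c = (size - 1) / 2.0
    return [
        [math.exp(-((r - c) ** 2 + (col - c) ** 2) / (2.0 * sigma ** 2))
         for col in range(size)]
        for r in range(size)
    ]


def fuse_blocks(shape, blocks):
    """Blend corrected blocks into one image by normalized weighted sum.

    blocks: iterable of (y0, x0, tile, weights), where tile and weights
    have the same size and (y0, x0) is the block's top-left corner.
    """
    h, w = shape
    num = [[0.0] * w for _ in range(h)]
    den = [[0.0] * w for _ in range(h)]
    for y0, x0, tile, weights in blocks:
        for r, row in enumerate(tile):
            for c, value in enumerate(row):
                wk = weights[r][c]
                num[y0 + r][x0 + c] += wk * value
                den[y0 + r][x0 + c] += wk
    return [
        [num[r][c] / den[r][c] if den[r][c] > 0 else 0.0 for c in range(w)]
        for r in range(h)
    ]


# Two 1 x 2 blocks overlapping in the middle pixel of a 1 x 3 image.
block_a = (0, 0, [[10, 10]], [[1.0, 0.25]])
block_b = (0, 1, [[20, 20]], [[0.75, 1.0]])
fused = fuse_blocks((1, 3), [block_a, block_b])
```

In the overlapping pixel the result is (0.25 x 10 + 0.75 x 20) / (0.25 + 0.75) = 17.5, a smooth transition between the two block values rather than a hard seam.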
S4, inputting the second preprocessed tongue surface image and the second preprocessed tongue bottom image into a tongue image recognition model to obtain a tongue image recognition result.
In tongue image recognition, a Convolutional Neural Network (CNN) is one of the most commonly used deep learning models, and the CNN has strong feature extraction capability and can automatically learn complex patterns and features in an image, so that the image recognition model of the embodiment is a convolutional neural network model.
The convolutional neural network model comprises a plurality of convolutional layers, a pooling layer and a full-connection layer, wherein the convolutional layers can automatically extract features in an image, the convolutional layers slide on the image through convolutional kernels to extract local features, each convolutional kernel extracts a specific feature such as edges, textures and the like, the pooling layer is used for reducing the space dimension of a feature map, reducing the calculated amount and simultaneously keeping important features, the full-connection layer flattens the feature map into a one-dimensional vector, classification tasks are carried out through a multi-layer neural network, and the output of the full-connection layer is the final prediction result of the image recognition model.
When the convolutional neural network model is adopted for tongue image recognition, the convolutional neural network model needs to be trained by using labeled tongue image data, and in the training process, the weight of the convolutional neural network model is updated through a back propagation algorithm, so that a loss function is minimized, and in the embodiment, the loss function adopts a cross entropy loss function;
In addition, in the training process, super parameters of the convolutional neural network model, such as learning rate, batch size, optimizer and the like, are adjusted so as to improve the training effect of the model.
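For reference, the cross entropy loss for a single labeled sample can be worked through in a few lines. This is a generic illustration of the loss that training minimizes, not the network or training code of this embodiment; the function names and example logits are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target_index):
    """Negative log-probability assigned to the labeled class."""
    return -math.log(softmax(logits)[target_index])


# The loss is small when the labeled class has the highest score ...
loss_confident = cross_entropy([5.0, 0.0, 0.0], target_index=0)
# ... and large when another class is scored higher.
loss_wrong = cross_entropy([0.0, 5.0, 0.0], target_index=0)
```

Back propagation updates the network weights in the direction that lowers this value averaged over the training batch.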
The tongue image recognition result comprises tongue color (pale red, pale white, red, dark red), tongue coating color (white, yellow, gray, black), and tongue shape (fat, thin, tooth-marked, cracked).
By inputting the tongue images into the trained deep learning model, automatic traditional Chinese medicine tongue image recognition can be realized. The convolutional neural network model has strong feature extraction capability and can automatically learn complex patterns and features in the images, which improves the accuracy and efficiency of recognition and provides strong support for traditional Chinese medicine tongue image recognition.
Embodiment 2: The present invention further provides a traditional Chinese medicine tongue image recognition system based on image processing. The system adopts the image processing-based tongue image recognition method of Embodiment 1 and comprises:
the image acquisition equipment, which is used for acquiring tongue surface and tongue bottom images;
the first preprocessing module, which is used for performing a first image preprocessing operation on the tongue surface and tongue bottom images to obtain a first preprocessed tongue surface image and a first preprocessed tongue bottom image;
the second preprocessing module, which is used for performing a second image preprocessing operation on the first preprocessed tongue surface image and the first preprocessed tongue bottom image to obtain a second preprocessed tongue surface image and a second preprocessed tongue bottom image;
the tongue image recognition module, which is used for inputting the second preprocessed tongue surface image and the second preprocessed tongue bottom image into a tongue image recognition model to obtain a tongue image recognition result.
Embodiment 3: The present invention also provides an electronic device comprising one or more processors and a memory.
The processor may be a Central Processing Unit (CPU) or other form of processing unit having data processing and/or instruction execution capabilities, and may control other components in the electronic device to perform the desired functions.
The memory may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, Random Access Memory (RAM) and/or cache memory (cache), and the like. The non-volatile memory may include, for example, Read-Only Memory (ROM), hard disk, flash memory, and the like. One or more computer program instructions may be stored on the computer-readable storage medium and executed by the processor to perform the image processing-based traditional Chinese medicine tongue image recognition method of any of the embodiments of the present application described above and/or other desired functions. Various content such as initial arguments, thresholds, etc. may also be stored in the computer-readable storage medium.
In one example, the electronic device may also include an input device and an output device, which are interconnected by a bus system and/or other form of connection mechanism (not shown). The input device may comprise, for example, a keyboard, a mouse, etc. The output device can output various information to the outside, including the tongue image recognition result and the like. The output device may include, for example, a display, speakers, a printer, and a communication network and remote output devices connected thereto, etc.
Of course, components such as buses, input/output interfaces, etc. are omitted for simplicity. In addition, the electronic device may include any other suitable components depending on the particular application.
In addition to the methods and apparatus described above, embodiments of the present application may also be a computer program product comprising computer program instructions which, when executed by a processor, cause the processor to implement the functions of a method for identifying a tongue image of traditional Chinese medicine based on image processing provided by any of the embodiments of the present application.
The computer program product may write program code for performing operations of embodiments of the present application in any combination of one or more programming languages, including an object oriented programming language such as Java, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device, partly on a remote computing device, or entirely on the remote computing device or server.
In addition, an embodiment of the present application may also be a computer readable storage medium, on which computer program instructions are stored, which when executed by a processor, cause the processor to implement a method for identifying a tongue image of traditional Chinese medicine based on image processing provided by any embodiment of the present application.
The computer readable storage medium may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium may include, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of a readable storage medium include an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
It should be noted that the above embodiments are merely for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the above embodiments, it should be understood by those skilled in the art that the technical solution described in the above embodiments may be modified or some or all of the technical features may be equivalently replaced, and these modifications or substitutions do not deviate the essence of the corresponding technical solution from the technical solution of the embodiments of the present invention.