CN118317092A - Image encoding method, image encoding device, storage medium and electronic device
- Publication number
- CN118317092A (CN202410744396.4A)
- Authority
- CN
- China
- Prior art keywords
- target
- determining
- region
- value
- interest
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/167—Position within a video image, e.g. region of interest [ROI]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/124—Quantisation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/136—Incoming video signal characteristics or properties
- H04N19/137—Motion inside a coding unit, e.g. average field, frame or block difference
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
An embodiment of the invention provides an image encoding method, an image encoding device, a storage medium and an electronic device. The method comprises: dividing a target image based on image features of the target image to obtain a target region of interest, an extended region of interest and other regions; for each target region, when the target region is not one of the other regions, determining a target region-of-interest integrated value indicating the importance of the target region and determining an initial offset value of the target region based on that integrated value, and when the target region is one of the other regions, determining a first calculated value and determining the initial offset value of the target region based on the first calculated value; adjusting the initial offset value based on coding information of coded images included in the video in which the target image is located to obtain a target offset value; and adjusting an initial quantization parameter based on the target offset value to obtain a target quantization parameter, and encoding the target region based on the target quantization parameter.
Description
Technical Field
Embodiments of the present invention relate to the field of communications, and in particular to an image encoding method, an image encoding device, a storage medium, an electronic device, and a computer program product.
Background
In the related art, an image is generally encoded using a predicted motion trajectory. However, encoding an image in this way may cause subjective quality problems, such as a tailing effect along the trajectory left behind by a moving object.
The related art therefore has the problem of a poor image encoding effect.
In view of the above problems in the related art, no effective solution has been proposed at present.
Disclosure of Invention
Embodiments of the invention provide an image encoding method, an image encoding device, a storage medium, an electronic device and a computer program product, so as to at least solve the problem of a poor image encoding effect in the related art.
According to an embodiment of the present invention, there is provided an image encoding method, including: dividing a target image based on image features of the target image to obtain a target region of interest, an extended region of interest and other regions, wherein the other regions are the regions of the target image other than the target region of interest and the extended region of interest; and performing the following operations for each target region included in the target region of interest, the extended region of interest and the other regions, so as to encode the target image: in a case where the target region is not one of the other regions, determining a target region-of-interest integrated value indicating a degree of importance of the target region, and determining an initial offset value of the target region based on the target region-of-interest integrated value; in a case where the target region is one of the other regions, determining a first calculated value, and determining the initial offset value of the target region based on the first calculated value, wherein the first calculated value includes at least one of: a first motion detection value indicating a target motion type of a target motion present in the target region, and a first texture calculation value indicating a target texture complexity of the target region; adjusting the initial offset value based on coding information of coded images included in the video in which the target image is located, to obtain a target offset value; and adjusting an initial quantization parameter based on the target offset value to obtain a target quantization parameter, and encoding the target region based on the target quantization parameter.
According to another embodiment of the present invention, there is provided an image encoding device, including: a dividing module, configured to divide a target image based on image features of the target image to obtain a target region of interest, an extended region of interest and other regions, wherein the other regions are the regions of the target image other than the target region of interest and the extended region of interest; and an encoding module, configured to perform the following operations for each target region included in the target region of interest, the extended region of interest and the other regions, so as to encode the target image: in a case where the target region is not one of the other regions, determining a target region-of-interest integrated value indicating a degree of importance of the target region, and determining an initial offset value of the target region based on the target region-of-interest integrated value; in a case where the target region is one of the other regions, determining a first calculated value, and determining the initial offset value of the target region based on the first calculated value, wherein the first calculated value includes at least one of: a first motion detection value indicating a target motion type of a target motion present in the target region, and a first texture calculation value indicating a target texture complexity of the target region; adjusting the initial offset value based on coding information of coded images included in the video in which the target image is located, to obtain a target offset value; and adjusting an initial quantization parameter based on the target offset value to obtain a target quantization parameter, and encoding the target region based on the target quantization parameter.
According to a further embodiment of the invention, there is also provided a computer readable storage medium having stored therein a computer program, wherein the computer program is arranged to perform the steps of any of the method embodiments described above when run.
According to a further embodiment of the invention, there is also provided an electronic device comprising a memory having stored therein a computer program and a processor arranged to run the computer program to perform the steps of any of the method embodiments described above.
According to yet another embodiment of the present application, there is also provided a computer program product comprising a computer program which, when executed by a processor, implements the steps of the method described in the various embodiments of the application.
According to the invention, the target image is divided based on its image features to obtain a target region of interest, an extended region of interest and other regions, wherein the other regions are the regions of the target image other than the target region of interest and the extended region of interest. The following operations are performed for each target region included in the target region of interest, the extended region of interest and the other regions, so as to encode the target image: in a case where the target region is not one of the other regions, a target region-of-interest integrated value indicating a degree of importance of the target region is determined, and an initial offset value of the target region is determined based on the target region-of-interest integrated value; in a case where the target region is one of the other regions, a first calculated value is determined, and the initial offset value of the target region is determined based on the first calculated value, wherein the first calculated value includes at least one of: a first motion detection value indicating a target motion type of a target motion present in the target region, and a first texture calculation value indicating a target texture complexity of the target region; the initial offset value is adjusted based on coding information of coded images included in the video in which the target image is located, to obtain a target offset value; and the initial quantization parameter is adjusted based on the target offset value to obtain a target quantization parameter, and the target region is encoded based on the target quantization parameter.
The target image can thus be divided according to its image features into a target region of interest, an extended region of interest and other regions; a target offset value for adjusting the initial quantization parameter is determined for each target region; the initial quantization parameter is adjusted with the target offset value to obtain a target quantization parameter; and each target region is encoded with its corresponding target quantization parameter to complete the encoding of the target image. Because the quantization parameter is adjusted adaptively according to the image features, the target region of interest, the extended region of interest and the other regions can be encoded with different target quantization parameters. This solves the problem of a poor image encoding effect in the related art and improves image encoding quality.
Drawings
FIG. 1 is a block diagram of the hardware configuration of a mobile terminal for an image encoding method according to an embodiment of the present invention;
FIG. 2 is a flowchart of an image encoding method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the sub-regions in each of N consecutive frames according to an embodiment of the invention;
FIG. 4 is a schematic diagram of a combined region according to an embodiment of the invention;
FIG. 5 is a schematic diagram of a target region of interest according to an embodiment of the invention;
FIG. 6 is a schematic diagram of an extended region of interest according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of a historical region of interest of an encoded image according to an embodiment of the present invention;
FIG. 8 is a schematic diagram of a target region of interest of a target image according to an embodiment of the invention;
FIG. 9 is a second schematic diagram of an extended region of interest according to an embodiment of the present invention;
FIG. 10 is a flowchart of an image encoding method according to an embodiment of the present invention;
FIG. 11 is a block diagram of the structure of an image encoding device according to an embodiment of the present invention.
Detailed Description
Embodiments of the present invention will be described in detail below with reference to the accompanying drawings in conjunction with the embodiments.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order.
The method embodiments provided in the embodiments of the present application may be performed in a mobile terminal, a computer terminal or similar computing device. Taking the mobile terminal as an example, fig. 1 is a block diagram of a hardware structure of the mobile terminal according to an embodiment of the present application. As shown in fig. 1, a mobile terminal may include one or more (only one is shown in fig. 1) processors 102 (the processor 102 may include, but is not limited to, a microprocessor MCU or a processing device such as a programmable logic device FPGA) and a memory 104 for storing data, wherein the mobile terminal may also include a transmission device 106 for communication functions and an input-output device 108. It will be appreciated by those skilled in the art that the structure shown in fig. 1 is merely illustrative and not limiting of the structure of the mobile terminal described above. For example, the mobile terminal may also include more or fewer components than shown in fig. 1, or have a different configuration than shown in fig. 1.
The memory 104 may be used to store a computer program, for example, a software program of application software and a module, such as a computer program corresponding to a method for encoding an image in an embodiment of the present invention, and the processor 102 executes the computer program stored in the memory 104 to perform various functional applications and data processing, that is, to implement the above-described method. Memory 104 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory remotely located relative to the processor 102, which may be connected to the mobile terminal via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission means 106 is arranged to receive or transmit data via a network. Specific examples of the network described above may include a wireless network provided by a communication provider of the mobile terminal. In one example, the transmission device 106 includes a network adapter (Network Interface Controller, simply referred to as a NIC) that can connect to other network devices through a base station to communicate with the internet. In one example, the transmission device 106 may be a Radio Frequency (RF) module, which is used to communicate with the internet wirelessly.
This embodiment provides an image encoding method. Fig. 2 is a flowchart of an image encoding method according to an embodiment of the present invention. As shown in fig. 2, the flow includes the following steps:
Step S202, dividing a target image based on image features of the target image to obtain a target region of interest, an extended region of interest and other regions, wherein the other regions are regions except the target region of interest and the extended region of interest in the target image;
Step S204, for each target region included in the target region of interest, the extended region of interest and the other regions, performing the following operations to encode the target image: in a case where the target region is not one of the other regions, determining a target region-of-interest integrated value indicating a degree of importance of the target region, and determining an initial offset value of the target region based on the target region-of-interest integrated value; in a case where the target region is one of the other regions, determining a first calculated value, and determining the initial offset value of the target region based on the first calculated value, wherein the first calculated value includes at least one of: a first motion detection value indicating a target motion type of a target motion present in the target region, and a first texture calculation value indicating a target texture complexity of the target region; adjusting the initial offset value based on coding information of coded images included in the video in which the target image is located, to obtain a target offset value; and adjusting an initial quantization parameter based on the target offset value to obtain a target quantization parameter, and encoding the target region based on the target quantization parameter.
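The control flow of step S204 can be sketched as follows. This is only an illustrative outline in Python: the `Region` class, the function `target_qp`, and the three mapping callbacks are hypothetical names, and the toy mappings in the usage lines are placeholders, since this passage does not specify how the values are mapped to offsets.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Region:
    kind: str                   # "roi", "extended_roi" or "other" (labels are assumptions)
    roi_value: float = 0.0      # stand-in for the region-of-interest integrated value
    motion_value: float = 0.0   # stand-in for the first motion detection value
    texture_value: float = 0.0  # stand-in for the first texture calculation value


def target_qp(
    region: Region,
    initial_qp: int,
    offset_from_roi: Callable[[float], int],
    offset_from_first_value: Callable[[float, float], int],
    adjust_with_history: Callable[[int], int],
) -> int:
    """Return the target QP for one region, following the branch in step S204."""
    if region.kind != "other":
        # ROI and extended ROI: initial offset from the integrated value.
        initial_offset = offset_from_roi(region.roi_value)
    else:
        # Other regions: initial offset from the first calculated value.
        initial_offset = offset_from_first_value(region.motion_value, region.texture_value)
    # Adjust the initial offset using information from already coded images.
    target_offset = adjust_with_history(initial_offset)
    return initial_qp + target_offset


# Toy placeholder mappings, NOT taken from the source.
def roi_map(value: float) -> int:
    return -int(value)


def other_map(motion: float, texture: float) -> int:
    return int(motion + texture)


def limit(offset: int) -> int:
    return max(-3, min(3, offset))


regions = [
    Region("roi", roi_value=4),
    Region("extended_roi", roi_value=2),
    Region("other", motion_value=1, texture_value=2),
]
qps = [target_qp(r, 30, roi_map, other_map, limit) for r in regions]
```

Passing the mappings in as callbacks keeps the sketch neutral about the actual formulas, which would have to come from the rest of the description.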
In the above embodiment, the image encoding method may be applied to video encoding, where it may be used to encode each frame of a video. Video coding standards include H.264/AVC, H.265/HEVC, H.266/VVC, VP8, VP9, AV1 and AVS; their main purpose is to compress captured video signals into data in a standard format that is convenient to transmit or store. To apply video coding in real scenes and improve subjective quality while reducing the bit rate, techniques such as rate control, ROI coding and motion detection are widely used in practical encoding. Rate control keeps the encoder's bit rate within a certain range by controlling some of the encoder parameters. In general, the higher the bit rate, the higher the subjective quality of the video and the lower the compression ratio; conversely, the lower the bit rate, the lower the subjective quality and the higher the compression ratio. Rate control is a way of balancing subjective quality against compression ratio. Common rate control modes include constant bit rate (CBR), variable bit rate (VBR) and adaptive variable bit rate (AVBR).
In the above embodiment, the target image may be any frame included in the video. The image features may include texture features, for example temporal texture information and spatial texture information; they may further include region of interest information, as well as color information, shape information, depth information, motion information and the like. Dividing the target image based on its image features to obtain the target region of interest, the extended region of interest and the other regions may include: determining the target region of interest based on the region of interest information included in the image features; determining a region whose distance from the target region of interest is smaller than a set parameter as the extended region of interest; and determining the regions of the target image other than the target region of interest and the extended region of interest as the other regions. The other regions may also be referred to as non-regions of interest. A region of interest (ROI) is a region selected from the image that is the focus of analysis and attention; delineating it for further processing can reduce processing time and increase accuracy. The target region of interest in the image can be found by applying thresholding and an edge detection algorithm. Deep learning and computer vision techniques may also be used, with a model trained to detect and identify regions of interest in the image automatically. Feature point detection algorithms may also be used to find salient feature points in the image and determine the ROI from them.
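The distance-based division described above can be illustrated with a small sketch. It assumes, purely for illustration, that the image is handled as a grid of coding blocks addressed by `(row, col)` and that distance is measured as Chebyshev distance in blocks; the source fixes neither the unit nor the metric, and the function name `divide_blocks` is hypothetical.

```python
def divide_blocks(rows, cols, roi_blocks, max_dist):
    """Split a rows x cols block grid into ROI, extended ROI and other regions.

    A non-ROI block joins the extended ROI when its Chebyshev distance to
    some ROI block is smaller than max_dist (metric chosen for illustration).
    """
    roi = set(roi_blocks)
    extended = set()
    for r in range(rows):
        for c in range(cols):
            if (r, c) in roi:
                continue
            if any(max(abs(r - rr), abs(c - cc)) < max_dist for rr, cc in roi):
                extended.add((r, c))
    all_blocks = {(r, c) for r in range(rows) for c in range(cols)}
    other = all_blocks - roi - extended
    return roi, extended, other


# A 5x5 grid with a single ROI block in the centre and a distance limit of 2 blocks.
roi, extended, other = divide_blocks(5, 5, [(2, 2)], max_dist=2)
```

The three returned sets are disjoint by construction and together cover the whole grid.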
The salient feature points in the target region of interest may be greater than the first parameter value, and the salient feature points in the extended region of interest may be greater than the second parameter value and less than the first parameter value, wherein the first parameter value is greater than the second parameter value.
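As a rough illustration of this two-threshold rule, the sketch below classifies a region from a single saliency measure. Whether that measure is a count of salient feature points or some other score is not stated here, so `measure` is a generic stand-in; the name `classify_by_saliency`, the handling of values exactly equal to a threshold, and the mapping of low values to the other regions are all assumptions.

```python
def classify_by_saliency(measure, first_param, second_param):
    """Classify a region by comparing a saliency measure with two thresholds."""
    if first_param <= second_param:
        raise ValueError("first_param must be greater than second_param")
    if measure > first_param:
        return "roi"
    if measure > second_param:
        # Source: greater than the second value and less than the first.
        # A measure exactly equal to first_param also lands here (assumption).
        return "extended_roi"
    # Not stated in this passage; assumed to fall into the other regions.
    return "other"


labels = [classify_by_saliency(m, first_param=10, second_param=4) for m in (12, 7, 2)]
```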
In the above embodiment, the image is divided by its image features into the ROI region, the extended region of interest and the non-ROI region, and a QP offset value is calculated for each region. After the QP offset value has been adjusted and limited using information from already encoded frames, it is used for encoding, so that the encoding strikes the best balance among objective results, subjective quality and user requirements. Determining the extended region of interest, that is, planning how the ROI is extended, improves subjective quality around the ROI and effectively reduces tailing and blocking effects around it.
In the above embodiment, a target offset value may be determined for the target region of interest, the initial quantization parameter of the target region of interest may be adjusted according to that target offset value to obtain a target quantization parameter, and the target region of interest may be encoded according to that target quantization parameter. Likewise, a target offset value may be determined for the extended region of interest, its initial quantization parameter adjusted accordingly to obtain a target quantization parameter, and the extended region of interest encoded according to that target quantization parameter. The same applies to the other regions: a target offset value is determined, the initial quantization parameter of the other regions is adjusted accordingly to obtain a target quantization parameter, and the other regions are encoded according to it. The initial quantization parameters of the target region of interest, the extended region of interest and the other regions may be predetermined parameters, or may be the initial parameters of the coding scheme used to encode the image. The three may be the same or different. The quantization parameter (QP) is the quantization parameter used in the encoding process.
In the above embodiment, the target region may contain target motion, and the target motion detection value of the target region may be determined according to the type of that motion. A target texture calculation value may be determined according to the target texture complexity, and a target region-of-interest integrated value may be determined according to the degree of importance of the target region. The initial offset value of the target region may then be determined from the target motion detection value, the target texture calculation value and the target region-of-interest integrated value.
In the above embodiment, the initial offset value may be adjusted according to the encoded images in the video in which the target image is located. The encoded images may consist of one frame or multiple frames, which is not limited in the present invention. Adjusting the initial offset value may include raising or lowering it, and may also include keeping it unchanged.
In the above embodiment, adjusting the initial quantization parameter based on the target offset value may include adding the target offset value to the initial quantization parameter to obtain the target quantization parameter, or subtracting the target offset value from the initial quantization parameter to obtain the target quantization parameter.
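A minimal sketch of this adjustment is given below. The function name `apply_offset` and its parameters are hypothetical. The clamping step is an addition not described in this passage: it is included because codecs restrict QP to a valid range (0 to 51 in H.264/AVC and in 8-bit H.265/HEVC), and the defaults should be changed for other codecs or bit depths.

```python
def apply_offset(initial_qp, target_offset, subtract=False, qp_min=0, qp_max=51):
    """Combine the initial QP with the target offset and clamp to a valid range.

    subtract=False adds the offset to the initial QP; subtract=True subtracts it.
    The clamp to [qp_min, qp_max] is an assumption, not part of the source text.
    """
    qp = initial_qp - target_offset if subtract else initial_qp + target_offset
    return max(qp_min, min(qp_max, qp))


roi_qp = apply_offset(30, -4)                   # lower QP, finer quantization
same_qp = apply_offset(30, 4, subtract=True)    # same result via subtraction
background_qp = apply_offset(50, 5)             # clamped at the upper bound
```

Under the additive convention a negative offset lowers the QP and therefore spends more bits on the region; under the subtractive convention the sign is reversed.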
Optionally, the above steps may be executed by a background processor or another device with similar processing capability, or by a machine that integrates at least an image acquisition device and a data processing device. The image acquisition device may include an image acquisition module such as a camera, and the data processing device may include a terminal such as a computer or a mobile phone, but neither is limited thereto.
In one exemplary embodiment, dividing the target image based on image features of the target image to obtain a target region of interest, an extended region of interest and other regions includes: determining the target region of interest based on region of interest information included in the image features; determining the extended region of interest based on temporal texture information or spatial texture information included in the image features; and determining the regions of the target image other than the target region of interest and the extended region of interest as the other regions. In this embodiment, the region of interest information may include the region occupied by the ROI, its area ratio, its duration, its type, its level and the like. Temporal texture information refers to image texture features that appear as the image changes over time; it may describe how the image changes with time, motion or action, such as the movement, deformation or vibration of an object. Spatial texture information refers to the spatial relationships and distribution characteristics among the pixels of an image. Spatial texture typically describes the texture of an image through statistics of the gray values or color distribution among pixels, and methods for determining spatial texture information include gray-level co-occurrence matrices, local binary patterns, histograms of oriented gradients and the like. Analyzing and extracting the spatial texture of an image helps in understanding its structure and characteristics, and the result can be used for image classification, object detection, image segmentation and other applications.
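Of the spatial texture descriptors named above, the local binary pattern is the simplest to illustrate. The sketch below computes the basic 8-neighbour pattern for the interior pixels of a grayscale image stored as a list of lists. The neighbour ordering and the `>=` comparison are one common convention rather than anything specified in the source, and the function names are hypothetical.

```python
# Clockwise neighbour offsets starting at the top-left pixel (a convention).
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img, r, c):
    """Return the 8-bit local binary pattern code of interior pixel (r, c)."""
    center = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(NEIGHBOURS):
        # Set the bit when the neighbour is at least as bright as the centre.
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """Histogram of LBP codes over all interior pixels (256 bins)."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist


flat = [[5] * 4 for _ in range(4)]          # uniform patch, no texture
edge = [[9, 9, 9], [0, 5, 0], [0, 0, 0]]    # bright row above the centre pixel
flat_hist = lbp_histogram(flat)
edge_code = lbp_code(edge, 1, 1)
```

A histogram concentrated in a few bins suggests a smooth or regular region, while a spread-out histogram suggests more complex texture; how such a statistic would feed the texture calculation value is not specified in this passage.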
In the above embodiment, the ROI area may be divided by an algorithm, and the target image is analyzed to divide the ROI area, i.e., the target region of interest, the ROI extension area, i.e., the extension region of interest, and the non-ROI area, i.e., the other area. The region where the ROI is located can be divided by acquiring the information of the region of the ROI, the area ratio, the type, the grade and the like of the current frame according to the prior information of the ROI, namely the region information of the region of interest. The ROI region can be expanded according to priori information such as time domain texture information, space domain texture information and the like, and the ROI expansion region is divided; the remaining region is taken as a non-ROI region.
It should be noted that the target region of interest, the extended region of interest, and the other regions may be independent regions, and the three regions have no intersection.
In one exemplary embodiment, determining the extended region of interest based on temporal texture information included in the image features comprises: determining N frames of continuous images that are included in the video and located before the target image, wherein the last of the N frames is adjacent to the target image, each of the N frames includes the target object included in the target image, and N is an integer greater than or equal to 1; determining a motion trajectory of the target object based on the temporal texture information of the N frames of continuous images; determining the sub-region occupied by the motion trajectory in each of the N frames; combining the sub-regions of all frames to obtain a combined region; determining the overlapping region of the combined region and the target region of interest; and determining the part of the combined region other than the overlapping region as the extended region of interest. In this embodiment, the extended region may be divided according to the motion trajectory of the object; for example, the area covered by the object's trajectory over the last several frames that does not fall within the ROI region may be assigned to the ROI extension region. The N frames of continuous images located before the target image are taken from the video. For example, if the target image is the 7th frame of the video, the N frames are selected from the 1st to 6th frames, and every selected frame must include the target object of the target image. If only the 4th to 6th frames include the target object, the 4th, 5th, and 6th frames are determined as the N frames of continuous images. The N frames of continuous images may be multiple frames or a single frame: when only one frame located before the target image in the video includes the target object, N is 1.
In the above embodiment, the motion trail of the target object may be determined according to the temporal texture information of each frame of image included in the N frames of continuous images, the sub-regions of the motion trail in the image may be determined, the plurality of sub-regions may be combined to obtain a combined region, the overlapping region of the combined region and the target region of interest may be determined, and the region other than the overlapping region in the combined region may be determined as the extended region of interest.
In the above embodiment, when N = 3, the N frames of continuous images include 3 frames; the sub-region in each of the N frames may be the shaded portion in fig. 3, the combined region may be the shaded portion in fig. 4, the target region of interest may be the shaded portion in fig. 5, and the determined extended region of interest is the shaded portion in fig. 6.
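The combination step described above can be sketched with plain set operations. This is only an illustrative sketch: representing each region as a set of block indices, and the function name `extended_region_from_trajectory`, are assumptions made here and are not taken from the patent.

```python
from typing import Iterable, Set, Tuple

# (column, row) index of an image block; the block granularity is an assumption.
Block = Tuple[int, int]


def extended_region_from_trajectory(
    sub_regions: Iterable[Set[Block]], target_roi: Set[Block]
) -> Set[Block]:
    """Union the per-frame trajectory sub-regions, then drop the ROI overlap."""
    combined: Set[Block] = set()
    for region in sub_regions:
        combined |= region              # combined region over the N frames
    overlap = combined & target_roi     # part already covered by the target ROI
    return combined - overlap           # remainder is the extended ROI


# N = 3 previous frames, each listing the blocks covered by the trajectory.
previous_frames = [{(0, 0), (1, 0)}, {(1, 0), (2, 0)}, {(2, 0), (3, 0)}]
roi = {(3, 0), (4, 0)}
extended = extended_region_from_trajectory(previous_frames, roi)
```

Because the overlap is removed explicitly, the extended region never intersects the target region of interest, which matches the requirement that the three regions have no intersection.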
In one exemplary embodiment, determining the extended region of interest based on spatial texture information included in the image features includes: determining a first region, included in the target image, whose distance from the target region of interest is smaller than a first threshold; determining, based on the spatial texture information of the first region, a texture complexity indicating how complex the texture included in the first region is; and determining a second region, included in the first region, whose texture complexity is greater than a second threshold as the extended region of interest. In this embodiment, the extended region may also be divided from the surroundings of the ROI region: the spatial texture information around the ROI region of the current frame, that is, around the target region of interest, is calculated, and the texture-complex regions around the ROI region are assigned to the ROI extension region.
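A minimal sketch of this spatial-texture extension follows. Several details are assumptions for illustration only: the image is split into square blocks, distance is measured as Chebyshev distance in block units, and texture complexity is approximated by the variance of gray values in a block. The patent does not prescribe any of these choices, and the function name is hypothetical.

```python
from statistics import pvariance
from typing import List, Set, Tuple

Block = Tuple[int, int]  # (column, row) block index


def extend_roi_by_texture(
    gray: List[List[int]],
    roi_blocks: Set[Block],
    block: int = 2,
    dist_threshold: int = 2,       # "first threshold", in block units (assumed)
    texture_threshold: float = 100.0,  # "second threshold" on variance (assumed)
) -> Set[Block]:
    """Return non-ROI blocks near the ROI whose gray-value variance is high."""
    if not roi_blocks:
        return set()
    rows, cols = len(gray) // block, len(gray[0]) // block
    extended: Set[Block] = set()
    for by in range(rows):
        for bx in range(cols):
            if (bx, by) in roi_blocks:
                continue
            # Distance from this block to the nearest ROI block.
            dist = min(max(abs(bx - rx), abs(by - ry)) for rx, ry in roi_blocks)
            if dist >= dist_threshold:
                continue  # outside the first region
            pixels = [
                gray[y][x]
                for y in range(by * block, (by + 1) * block)
                for x in range(bx * block, (bx + 1) * block)
            ]
            if pvariance(pixels) > texture_threshold:
                extended.add((bx, by))  # second region: complex texture
    return extended


# 4x6 image: block (1, 0) and block (2, 0) are textured, the rest are flat.
image = [
    [10, 10, 0, 255, 0, 255],
    [10, 10, 255, 0, 255, 0],
    [50, 50, 50, 50, 50, 50],
    [50, 50, 50, 50, 50, 50],
]
texture_extension = extend_roi_by_texture(image, roi_blocks={(0, 0)})
```

In this toy input, block (2, 0) is textured but lies too far from the ROI, so only the adjacent textured block is added.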
In one exemplary embodiment, dividing the target image based on image features of the target image to obtain a target region of interest, an extended region of interest, and other regions includes: determining the target region of interest based on region of interest information included in the image features; determining a historical region of interest of each encoded image included in the video; determining the union of the historical regions of interest to obtain a third region; determining a fourth region that is included in the third region and coincides with the target region of interest; determining the part of the third region other than the fourth region as the extended region of interest; and determining the regions of the target image other than the target region of interest and the extended region of interest as the other regions. In this embodiment, the region of interest information may include the region where the ROI is located, its occupied-area ratio, duration, type, level, and the like. The target region of interest may be determined from the region of interest information, and a historical region of interest is determined for each encoded image included in the video. The encoded images may be multiple frames or a single frame. The union of the ROIs of the encoded frames gives the area covered by the ROI regions of the encoded frames, and this coverage area is assigned to the ROI extension region. When there is one encoded frame, the historical region of interest of the encoded image can be seen in fig. 7, the target region of interest of the target image in fig. 8, and the extended region of interest in fig. 9.
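The three-way partition using historical ROIs can be sketched as follows. As before, treating regions as sets of block indices and the name `partition_with_history` are assumptions chosen for illustration.

```python
from typing import Iterable, Set, Tuple

Block = Tuple[int, int]  # (column, row) block index; granularity is an assumption


def partition_with_history(
    frame_blocks: Set[Block],
    target_roi: Set[Block],
    history_rois: Iterable[Set[Block]],
) -> Tuple[Set[Block], Set[Block], Set[Block]]:
    """Split a frame into (target ROI, extended ROI, other regions)."""
    third_region: Set[Block] = set()
    for historical in history_rois:
        third_region |= historical              # union of historical ROIs
    fourth_region = third_region & target_roi   # part coinciding with target ROI
    extended = third_region - fourth_region     # extended region of interest
    other = frame_blocks - target_roi - extended
    return target_roi, extended, other


frame = {(x, 0) for x in range(6)}
current_roi = {(2, 0), (3, 0)}
encoded_frame_rois = [{(1, 0), (2, 0)}]  # a single encoded frame
roi_part, ext_part, other_part = partition_with_history(
    frame, current_roi, encoded_frame_rois
)
```

The three returned sets are pairwise disjoint and together cover the whole frame.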
In one exemplary embodiment, determining the target motion detection value for indicating a motion type of the target motion present in the target region includes: determining a target motion type of the target motion present in the target region based on temporal texture information of the target region; determining a first correspondence between motion types and motion detection values; and determining the target motion detection value that corresponds to the target motion type in the first correspondence. In this embodiment, the temporal texture may be classified by motion, including but not limited to four cases: stationary, small motion, medium motion, and large motion, denoted by md. A first correspondence may exist between the motion type and the motion detection value. The first correspondence may be expressed as [formula image not reproduced], where md represents the motion detection value. For example, the target motion type present in the target region may be determined according to the temporal texture information of the target region, and when the target motion type is stationary, the target motion detection value may be 0.
In one exemplary embodiment, determining a target texture calculation value indicative of a target texture complexity of the target region comprises: determining the target texture complexity of the target region based on spatial texture information of the target region; determining a second correspondence between texture complexity and texture calculation values; and determining the target texture calculation value that corresponds to the target texture complexity in the second correspondence. In this embodiment, the target texture complexity of the target region may be determined according to the spatial texture information of the target region, and the spatial texture may be classified into cases including, but not limited to, simple, slightly complex, moderately complex, and highly complex, denoted by tex. The second correspondence may be expressed as [formula image not reproduced], where tex denotes the texture calculation value. For example, when the target texture complexity is simple, the target texture calculation value is 0.
In one exemplary embodiment, determining a target region of interest integrated value indicative of a degree of importance of the target region includes: determining a second calculated value of the target region of interest integrated value, wherein the second calculated value comprises at least one of: a first level corresponding to a target area of the target region of interest, a second level corresponding to a target duration for which the target region of interest appears in the video, and a third level corresponding to an importance degree of the target region of interest, wherein the importance degree of the target region of interest is determined based on the objects included in the target region of interest and a correspondence between objects and importance degrees; determining, for each first target calculation value included in the second calculated value, a first product of the first target calculation value and a first weight of the first target calculation value; and mapping the target region of interest integrated value based on the first product. In this embodiment, the target region of interest integrated value may be determined according to one or more of the target area of the target region of interest, the target duration for which it appears in the video, and its importance degree. When only one of these options is used, the target region of interest integrated value may be mapped from the level corresponding to that option, from the first product of that level and the first weight, or from that first product together with other values. The mapping includes, but is not limited to, adding, subtracting, multiplying, dividing, convolving, rounding, and the like.
In the above embodiment, when the target region of interest integrated value is determined according to several of the target area of the target region of interest, the target duration for which it appears in the video, and its importance degree, the sum of the values corresponding to those options may be determined as the target region of interest integrated value; alternatively, the first product of each option and its weight may be determined, and the target region of interest integrated value may be mapped from the resulting first products. The mapping may include, but is not limited to, adding, subtracting, multiplying, dividing, convolving, rounding, and the like.
In one exemplary embodiment, determining a target region of interest integrated value indicative of a degree of importance of the target region includes: determining a first level corresponding to a target area of the target region of interest; determining a second level corresponding to a target duration for which the target region of interest appears in the video; determining a third level corresponding to the importance degree of the target region of interest, wherein the importance degree of the target region of interest is determined based on the objects included in the target region of interest and the correspondence between objects and importance degrees; determining a target first weight corresponding to the first level, a target second weight corresponding to the second level, and a target third weight corresponding to the third level; determining a first product of the first level and the target first weight, a second product of the second level and the target second weight, and a third product of the third level and the target third weight; and determining the sum of the first product, the second product, and the third product as the target region of interest integrated value. In the present embodiment, the acquired ROI information includes, but is not limited to, its location area, occupied duration, ROI level, and the like. The target region of interest integrated value may be determined from the region of interest information. For example, at 1080P resolution the ROI region starts at coordinate (0, 0) and ends at (300, 400), so the divided region occupies 300×400 = 120000 pixels and the ratio is 120000/(1920×1080) = 0.058, that is, an occupied-area ratio of 5.8%. With the area ratio divided into 10 levels from 1 to 10 in steps of 10%, the area-ratio level is 1.
If the ROI is present for 30 seconds within a period of 1 minute, the duration ratio is 50%; with the duration ratio divided into 10 levels from 1 to 10 in steps of 10%, the duration level is 5. The ROI importance has n = 3 classes: important, unimportant, and neglect. The code rate investment can be increased for important types, unimportant areas are not processed, and the code rate investment is reduced for areas which are ignored. The set level values may be -10, 0, and 10, corresponding to unimportant, neglect, and important, respectively. In this example the ROI is classified as important. The ROI value must integrate the three conditions above into a single comprehensive ROI value. Assuming that in the comprehensive calculation the area ratio has a weight of 20%, the duration ratio a weight of 20%, and the importance class a weight of 60%, the ROI is calculated as ROI = 1×0.2 + 5×0.2 + 10×0.6 = 7.2. After fusion, the ROI integrated value is therefore 7.2.
It should be noted that the above-mentioned target first weight, target second weight, and target third weight are only exemplary, and the target first weight, target second weight, and target third weight may be set to other values, which is not limited by the present invention.
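The worked example above can be reproduced with a short sketch. The helper names are hypothetical, and rounding a ratio up to the next 10% step is an assumption: it is one binning rule that agrees with both example values (5.8% gives level 1, 50% gives level 5), but the patent does not state the rule explicitly.

```python
import math


def ratio_to_level(ratio: float, step: float = 0.10, max_level: int = 10) -> int:
    """Map a ratio in [0, 1] to a level 1..max_level (assumed: round up per step)."""
    level = math.ceil(ratio / step - 1e-9)  # small epsilon guards float error
    return min(max(level, 1), max_level)


def roi_integrated_value(area_level, duration_level, importance_level,
                         weights=(0.2, 0.2, 0.6)):
    """Weighted sum of the three levels; the weights are the example's values."""
    w_area, w_duration, w_importance = weights
    return (area_level * w_area
            + duration_level * w_duration
            + importance_level * w_importance)


# ROI from (0, 0) to (300, 400) in a 1920x1080 frame.
area_ratio = (300 * 400) / (1920 * 1080)
area_level = ratio_to_level(area_ratio)
# ROI present for 30 s within a 60 s period.
duration_level = ratio_to_level(30 / 60)
# "Important" class uses level value 10 in the example.
roi_value = roi_integrated_value(area_level, duration_level, 10)
```

Passing a different `weights` tuple changes the balance between area, duration, and importance without touching the level computation.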
In one exemplary embodiment, determining the initial offset value for the target region based on the target region of interest integrated value comprises: mapping the initial offset value based on the target region of interest integrated value if the target region is the target region of interest; and mapping the initial offset value based on the target region of interest integrated value and a first preset constant if the target region is the extended region of interest. In this embodiment, mapping the initial offset value according to the target region of interest integrated value may consist of taking the integrated value itself as the initial offset value, or of performing an operation on the integrated value and taking the result as the initial offset value, where the operation includes, but is not limited to, adding, subtracting, multiplying, dividing, convolving, rounding, and the like. For example, the integrated value may be added to itself, multiplied by, divided by, or reduced by a certain coefficient, convolved with itself, or rounded up or down. The method may further include comparing the operation result with a mapping table and determining the initial offset value that corresponds to the operation result in the mapping table. The mapping table may be a predetermined table that contains a one-to-one correspondence between operation results and offset values.
In the above embodiment, mapping the initial offset value based on the target integrated value of interest and the first preset constant may include performing an operation on the target integrated value of interest with the first preset constant, and determining the operation result as the initial offset value, where the operation includes, but is not limited to, adding, subtracting, multiplying, dividing, convolving, rounding, and the like. For example, the method may further include comparing the operation result with a mapping table, and determining an initial offset value corresponding to the operation result in the mapping table. The mapping table may be a predetermined table, where the mapping table includes a one-to-one correspondence between each operation result and the offset value.
In one exemplary embodiment, determining the initial offset value for the target region based on the target region of interest composite value comprises: determining a third calculated value if the target region is not the other region, wherein the third calculated value includes at least one of: a second motion detection value for indicating a target motion type of a target motion present in the target region, a second texture calculation value for indicating a target texture complexity of the target region; determining, for each second target calculation value included in the third calculation value, a second product of the second target calculation value and a second weight of the second target calculation value, mapping the initial offset value based on the target region of interest integrated value and the second product, if the target region is the target region of interest; and if the target region is the extended region of interest, determining a third product of the second target calculated value and a third weight of the second target calculated value for each second target calculated value included in the third calculated value, and mapping the initial offset value based on the target region of interest integrated value, the third product and a second preset constant. In this embodiment, when the target region is the target region of interest or the extended region of interest, the initial offset value may be mapped according to the target region of interest integrated value and the second motion detection value, or according to the target region of interest integrated value and the second texture calculation value, or according to the target region of interest integrated value, the second motion detection value and the second texture calculation value. Wherein the mapping may include, but is not limited to, adding, subtracting, multiplying, dividing, convolving, rounding, and the like.
In the above-described embodiment, the second motion detection value for indicating the target motion type of the target motion present in the target region may be determined, or the second texture calculation value for indicating the target texture complexity of the target region may be determined, or both may be determined. A second product of the second motion detection value and/or the second texture calculation value and its corresponding weight may then be determined, and the initial offset value is mapped according to the target region of interest integrated value and the second product. For example, the sum, the difference, the product, the ratio, or the convolution of the target region of interest integrated value and the second product may be determined as the initial offset value, or the result of such an operation may be rounded up or rounded down to obtain the initial offset value. The operation result may also be compared with a mapping table, and the initial offset value that corresponds to the operation result in the mapping table may be determined. The mapping table may be a predetermined table that contains a one-to-one correspondence between operation results and offset values.
In one exemplary embodiment, determining the initial offset value for the target region based on the target region of interest integrated value comprises: if the target region is the target region of interest, determining a fourth product of the target motion detection value and a target fourth weight corresponding to the target motion detection value, determining a fifth product of the target texture calculation value and a target fifth weight corresponding to the target texture calculation value, and determining the sum of the fourth product, the fifth product, and the target region of interest integrated value as the initial offset value; if the target region is the extended region of interest, determining a sixth product of the target motion detection value and a target sixth weight corresponding to the target motion detection value, determining a seventh product of the target texture calculation value and a target seventh weight corresponding to the target texture calculation value, and determining the sum of the sixth product, the seventh product, the target region of interest integrated value, and a preset constant as the initial offset value; and if the target region is one of the other regions, determining an eighth product of the target motion detection value and a target eighth weight corresponding to the target motion detection value, determining a ninth product of the target texture calculation value and a target ninth weight corresponding to the target texture calculation value, and determining the sum of the eighth product and the ninth product as the initial offset value. In this embodiment, the QP initial offset value is calculated through a different mapping for each region. The factors that mainly affect the size of the QP offset value include, but are not limited to, temporal texture, spatial texture, ROI level, and similar information. The complexity of the current frame can be divided into a number of classes according to the magnitude of the spatial texture, and the motion of the current frame can be divided into a number of classes according to the temporal texture. The ROI information may include, but is not limited to, the following: the location area as element s1, the occupied area as element s2, the occupied duration as element s3, and the ROI level as element s4. The classification is obtained according to the following formula:
$$N_3 = f_1(s_1, s_2, s_3, s_4)$$
where $f_1$ is a mapping function curve, which may be a linear function or a nonlinear function, and $N_3$ is the classification result. The rest of the reference information is [symbol not reproduced]. From the above m kinds of information, the QP initial offset value Offset can be calculated for each region:
[formula image not reproduced]
where f2, f3, and f4 are mapping function curves, which may be linear functions or nonlinear functions, and Offset is the calculated offset value.
In the above embodiment, the target fourth weight, the target fifth weight, the target sixth weight, the target seventh weight, the target eighth weight, and the target ninth weight are predetermined weights. The fourth, fifth, sixth, seventh, eighth, and ninth weights may be the same, may be different, or may be partially the same, and the invention is not limited in this respect.
In the above embodiment, the initial offset value may be expressed, with the weights written symbolically as $w_4$ to $w_9$, as
$$\text{Offset} = \begin{cases} w_4 \cdot md + w_5 \cdot tex + ROI, & \text{ROI region} \\ w_6 \cdot md + w_7 \cdot tex + ROI + d, & \text{ROI extension} \\ w_8 \cdot md + w_9 \cdot tex, & \text{non-ROI region} \end{cases}$$
where the ROI region case gives the initial offset value of the target region of interest, the ROI extension case gives the initial offset value of the extended region of interest, and the non-ROI region case gives the offset value of the other regions. md represents the target motion detection value, tex represents the target texture calculation value, ROI represents the target region of interest integrated value, and d represents a preset constant. At this time, the fourth weight, the fifth weight, the sixth weight, the seventh weight, the eighth weight, and the ninth weight are 0.3, 0.2, 0.7, and 0.3, respectively.
When the motion detection value md is 3, the texture calculation value tex is 2, the ROI integrated value is 7.2, and the constant d is -1, the QP offset value calculated according to the model is [result not reproduced].
It should be noted that, the fourth weight, the fifth weight, the sixth weight, the seventh weight, the eighth weight, and the ninth weight in the above formula are only exemplary, and the fourth weight, the fifth weight, the sixth weight, the seventh weight, the eighth weight, and the ninth weight may be other values, which is not limited in this invention.
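The region-dependent calculation described in this embodiment can be sketched as below. The structure (a weighted sum of md and tex, plus the ROI integrated value for ROI regions, plus a constant d for the extension region) follows the embodiment; the concrete weight assignment in `HYPOTHETICAL_WEIGHTS` is an assumption made only so the sketch runs, since the text does not give an unambiguous value for each of the six weights. The function and dictionary names are also hypothetical.

```python
# Per-region (motion weight, texture weight). These pairings are assumed for
# illustration and are not confirmed by the source text.
HYPOTHETICAL_WEIGHTS = {
    "roi": (0.3, 0.2),       # fourth and fifth weights
    "extended": (0.3, 0.2),  # sixth and seventh weights
    "other": (0.7, 0.3),     # eighth and ninth weights
}


def initial_qp_offset(region, md, tex, roi_value=0.0, d=0.0,
                      weights=HYPOTHETICAL_WEIGHTS):
    """Initial QP offset for one region of the frame."""
    w_md, w_tex = weights[region]
    texture_term = w_md * md + w_tex * tex
    if region == "roi":
        return texture_term + roi_value
    if region == "extended":
        return texture_term + roi_value + d
    if region == "other":
        return texture_term
    raise ValueError(f"unknown region: {region!r}")


# Inputs taken from the example: md = 3, tex = 2, ROI value = 7.2, d = -1.
offset_roi = initial_qp_offset("roi", md=3, tex=2, roi_value=7.2)
offset_ext = initial_qp_offset("extended", md=3, tex=2, roi_value=7.2, d=-1)
offset_other = initial_qp_offset("other", md=3, tex=2)
```

The numbers this sketch produces depend entirely on the assumed weights and should not be read as the result stated in the patent.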
In one exemplary embodiment, determining an initial offset value for the target region based on the first calculated value includes: determining a fourth product of the third target calculation value and a fourth weight for each third target calculation value included in the first calculation value; the initial offset value is mapped based on the fourth product. In this embodiment, for other regions, the first motion detection value and/or the first texture calculation value of the other regions may be determined. The initial offset value may be mapped according to a fourth product of the first motion detection value and the fourth weight, may be mapped according to a fourth product of the first texture calculation value and the fourth weight, and may be mapped according to a fourth product of the first motion detection value and the fourth weight and a fourth product of the first texture calculation value and the fourth weight. Wherein the mapping may include, but is not limited to, adding, subtracting, multiplying, dividing, convolving, rounding, and the like. For example, when there is a fourth product, that is, there is a fourth product of the first motion detection value and the fourth weight, or a fourth product of the first texture calculation value and the fourth weight, the fourth product may be determined as an initial offset value, and the fourth product may be further operated with a coefficient to obtain an operation result, and the operation result may be determined as an initial offset value, where the operation includes, but is not limited to, adding, subtracting, multiplying, dividing, convolving, rounding, and the like. 
When there are two fourth products, that is, there is a fourth product of the first motion detection value and the fourth weight and a fourth product of the first texture calculation value and the fourth weight, the two fourth products may be operated to obtain an operation result, the operation result is determined to be an initial offset value, and the operation includes, but is not limited to, addition, subtraction, multiplication, division, convolution, rounding, and the like. The operation result can be compared with the mapping table, and an initial offset value corresponding to the operation result in the mapping table is determined. The mapping table may be a predetermined table, where the mapping table includes a one-to-one correspondence between each operation result and the offset value.
In an exemplary embodiment, adjusting the initial offset value based on encoding information of an encoded image included in the video in which the target image is located to obtain the target offset value includes at least one of the following four alternatives.
First alternative: determining a first peak signal-to-noise ratio of the target image and a second peak signal-to-noise ratio of the encoded image, and determining a first difference between the first peak signal-to-noise ratio and the second peak signal-to-noise ratio; determining the sum of the initial offset value and a first constant as the target offset value if the first difference is greater than a first set value; determining the difference between the initial offset value and a second constant as the target offset value if the first difference is less than a second set value; and determining the initial offset value as the target offset value if the first difference is greater than the second set value and less than the first set value, wherein the first set value is greater than the second set value.
Second alternative: determining a first complexity of the spatial texture of the target image and a second complexity of the spatial texture of the encoded image, and determining a second difference between the first complexity and the second complexity; determining the difference between the initial offset value and a third constant as the target offset value if the second difference is greater than a third set value; determining the sum of the initial offset value and a fourth constant as the target offset value if the second difference is less than a fourth set value; and determining the initial offset value as the target offset value if the second difference is greater than the fourth set value and less than the third set value, wherein the third set value is greater than the fourth set value.
Third alternative: determining the first peak signal-to-noise ratio, the second peak signal-to-noise ratio, and the first difference as in the first alternative; determining the sum of the initial offset value and the first constant as a first intermediate offset value if the first difference is greater than the first set value, determining the difference between the initial offset value and the second constant as the first intermediate offset value if the first difference is less than the second set value, and determining the initial offset value as the first intermediate offset value if the first difference is greater than the second set value and less than the first set value, wherein the first set value is greater than the second set value; then determining the first complexity, the second complexity, and the second difference as in the second alternative; determining the difference between the first intermediate offset value and the third constant as the target offset value if the second difference is greater than the third set value, determining the sum of the first intermediate offset value and the fourth constant as the target offset value if the second difference is less than the fourth set value, and determining the first intermediate offset value as the target offset value if the second difference is greater than the fourth set value and less than the third set value, wherein the third set value is greater than the fourth set value.
Fourth alternative: determining the first complexity, the second complexity, and the second difference as in the second alternative; determining the difference between the initial offset value and the third constant as a second intermediate offset value if the second difference is greater than the third set value, determining the sum of the initial offset value and the fourth constant as the second intermediate offset value if the second difference is less than the fourth set value, and determining the initial offset value as the second intermediate offset value if the second difference is greater than the fourth set value and less than the third set value, wherein the third set value is greater than the fourth set value; then determining the first peak signal-to-noise ratio, the second peak signal-to-noise ratio, and the first difference as in the first alternative; determining the sum of the second intermediate offset value and the first constant as the target offset value if the first difference is greater than the first set value, determining the difference between the second intermediate offset value and the second constant as the target offset value if the first difference is less than the second set value, and determining the second intermediate offset value as the target offset value if the first difference is greater than the second set value and less than the first set value, wherein the first set value is greater than the second set value.
In this embodiment, the acquired QP initial offset value may be adjusted using the information of the encoded frames, which includes, but is not limited to, PSNR, spatial texture, and the quantization parameter QP of the encoded frame.
In the above embodiment, the second peak signal-to-noise ratio (PSNR) of the ROI region of the encoded previous frame may be compared with the first PSNR of the ROI region of the current frame. If the PSNR difference is higher than the maximum critical value, the QP offset value is adjusted upward; if it is lower than the minimum critical value, the QP offset value is adjusted downward; if it lies between the maximum and minimum critical values, no adjustment is required. The adjustment formula may be $\mathrm{Offset}_{new} = f_5(\mathrm{offset}, \mathrm{PSNR}_{ROI}, \mathrm{PSNR}_{max}, \mathrm{PSNR}_{min}, \mathrm{PSNR}_{last})$, where offset is the QP initial offset value; $\mathrm{PSNR}_{ROI}$ is the PSNR of the ROI region of the current frame, that is, the first peak signal-to-noise ratio of the target image; $\mathrm{PSNR}_{max}$ (the first set value) and $\mathrm{PSNR}_{min}$ (the second set value) are preset critical values for the ROI region; and $\mathrm{PSNR}_{last}$ is the average PSNR of the ROI region of the encoded frames, that is, the second peak signal-to-noise ratio.
In the above embodiment, the spatial texture complexity of the ROI region of the encoded previous frame, or of several encoded frames, may be compared with the spatial texture complexity of the ROI region of the current frame. If the texture complexity difference is higher than the maximum critical value of complexity for the ROI region, the QP offset value is adjusted downward; if it is lower than the minimum critical value, the QP offset value is adjusted upward. The adjustment formula may be $\mathrm{Offset}_{new} = f_6(\mathrm{offset}, \mathrm{TEX}_{ROI}, \mathrm{TEX}_{max}, \mathrm{TEX}_{min}, \mathrm{TEX}_{last})$, where the maximum and minimum critical values of the spatial texture are $\mathrm{TEX}_{max}$ (the third set value) and $\mathrm{TEX}_{min}$ (the fourth set value); $\mathrm{TEX}_{ROI}$ is the first complexity, that is, the spatial texture complexity of the ROI region of the current frame; and $\mathrm{TEX}_{last}$ is the second complexity, that is, the average spatial texture complexity of the ROI region of the encoded frames.
In the above embodiment, adjusting the initial offset value by $\mathrm{Offset}_{new} = f_5(\mathrm{offset}, \mathrm{PSNR}_{ROI}, \mathrm{PSNR}_{max}, \mathrm{PSNR}_{min}, \mathrm{PSNR}_{last})$ may be referred to as mode one, and adjusting it by $\mathrm{Offset}_{new} = f_6(\mathrm{offset}, \mathrm{TEX}_{ROI}, \mathrm{TEX}_{max}, \mathrm{TEX}_{min}, \mathrm{TEX}_{last})$ as mode two. The initial offset value may be adjusted by mode one alone to obtain the target offset value, or by mode two alone. It may also be adjusted by mode one followed by mode two: the initial offset value after adjustment by mode one is defined as a first intermediate offset value, which is input into mode two as its initial offset value and adjusted by mode two to obtain the target offset value. Similarly, the initial offset value may be adjusted by mode two followed by mode one: the initial offset value after adjustment by mode two is defined as a second intermediate offset value, which is input into mode one as its initial offset value and adjusted by mode one to obtain the target offset value.
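In the above embodiment, the adjustment formula $\mathrm{Offset}_{new} = f_5(\mathrm{offset}, \mathrm{PSNR}_{ROI}, \mathrm{PSNR}_{max}, \mathrm{PSNR}_{min}, \mathrm{PSNR}_{last})$ between the PSNR and the QP offset value in mode one can be expressed as
$$\mathrm{Offset}_{new} = \begin{cases} \mathrm{offset} + c_1, & \mathrm{PSNR}_{ROI} - \mathrm{PSNR}_{last} > \mathrm{PSNR}_{max} \\ \mathrm{offset} - c_2, & \mathrm{PSNR}_{ROI} - \mathrm{PSNR}_{last} < \mathrm{PSNR}_{min} \\ \mathrm{offset}, & \mathrm{PSNR}_{min} < \mathrm{PSNR}_{ROI} - \mathrm{PSNR}_{last} < \mathrm{PSNR}_{max} \end{cases}$$
where the maximum and minimum PSNR critical values are denoted $\mathrm{PSNR}_{max}$ and $\mathrm{PSNR}_{min}$; the first peak signal-to-noise ratio of the ROI region is $\mathrm{PSNR}_{ROI}$ and the second peak signal-to-noise ratio is $\mathrm{PSNR}_{last}$; and $c_1$ and $c_2$ denote the first constant and the second constant. When the encoded image comprises a plurality of frames, the second peak signal-to-noise ratio may be the average of the peak signal-to-noise ratios of those frames.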
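In the above embodiment, the adjustment formula $\mathrm{Offset}_{new} = f_6(\mathrm{offset}, \mathrm{TEX}_{ROI}, \mathrm{TEX}_{max}, \mathrm{TEX}_{min}, \mathrm{TEX}_{last})$ between the spatial texture and the QP offset value in mode two can be expressed as
$$\mathrm{Offset}_{new} = \begin{cases} \mathrm{offset} - c_3, & \mathrm{TEX}_{ROI} - \mathrm{TEX}_{last} > \mathrm{TEX}_{max} \\ \mathrm{offset} + c_4, & \mathrm{TEX}_{ROI} - \mathrm{TEX}_{last} < \mathrm{TEX}_{min} \\ \mathrm{offset}, & \mathrm{TEX}_{min} < \mathrm{TEX}_{ROI} - \mathrm{TEX}_{last} < \mathrm{TEX}_{max} \end{cases}$$
where the maximum and minimum critical values of the spatial texture are $\mathrm{TEX}_{max}$ (the third set value) and $\mathrm{TEX}_{min}$ (the fourth set value); $\mathrm{TEX}_{ROI}$ is the first complexity of the spatial texture within the ROI region; $\mathrm{TEX}_{last}$ is the second complexity, that is, the spatial texture complexity of the ROI region of the encoded image; and $c_3$ and $c_4$ denote the third constant and the fourth constant. When the encoded image comprises a plurality of frames, the second complexity may be the average of the spatial texture complexities of the ROI regions of those frames.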
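The two adjustment modes and their chaining can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the default constants of 1, the order in which each difference is taken (current ROI minus encoded ROI), and the treatment of values that fall exactly on a threshold are all hypothetical.

```python
def adjust_by_psnr(offset, psnr_roi, psnr_last, psnr_max, psnr_min, c1=1, c2=1):
    """Mode one: adjust the offset from the PSNR difference (c1, c2 are assumed constants)."""
    diff = psnr_roi - psnr_last  # assumed order: current ROI minus encoded ROI
    if diff > psnr_max:
        return offset + c1
    if diff < psnr_min:
        return offset - c2
    return offset  # between the thresholds; boundary values are left unchanged here


def adjust_by_texture(offset, tex_roi, tex_last, tex_max, tex_min, c3=1, c4=1):
    """Mode two: adjust the offset from the texture complexity difference."""
    diff = tex_roi - tex_last  # assumed order, as above
    if diff > tex_max:
        return offset - c3
    if diff < tex_min:
        return offset + c4
    return offset


def adjust_offset(offset, psnr, tex, order=("psnr", "tex")):
    """Chain the modes; the value handed from one mode to the next is the intermediate offset."""
    steps = {"psnr": (adjust_by_psnr, psnr), "tex": (adjust_by_texture, tex)}
    for name in order:
        func, kwargs = steps[name]
        offset = func(offset, **kwargs)
    return offset


# Illustrative inputs only; none of these numbers come from the source.
psnr = dict(psnr_roi=40.0, psnr_last=36.0, psnr_max=2.0, psnr_min=-2.0)
tex = dict(tex_roi=10.0, tex_last=20.0, tex_max=5.0, tex_min=-5.0)
print(adjust_offset(-4, psnr, tex))                         # mode one, then mode two
print(adjust_offset(-4, psnr, tex, order=("tex", "psnr")))  # mode two, then mode one
print(adjust_offset(-4, psnr, tex, order=("psnr",)))        # mode one alone
```

Passing the mode order as a tuple keeps all four combinations described above (each mode alone, and both orderings) in one code path.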
In one exemplary embodiment, adjusting the initial quantization parameter based on the target offset value, to obtain the target quantization parameter, includes at least one of the following:
(1) determining a first quantization parameter of the encoded image; determining a third difference between the first quantization parameter and a fifth constant, and a first sum of the first quantization parameter and a sixth constant; determining a target sum of the initial quantization parameter and the target offset value; determining the third difference as the target quantization parameter if the target sum is less than the third difference; determining the first sum as the target quantization parameter if the target sum is greater than the first sum; and determining the target sum as the target quantization parameter if the target sum is greater than or equal to the third difference and less than or equal to the first sum;
(2) determining a second quantization parameter at the macroblock level of the encoded image; determining a fourth difference between the second quantization parameter and a seventh constant, and a second sum of the second quantization parameter and an eighth constant; determining a target sum of the initial quantization parameter and the target offset value; determining the fourth difference as the target quantization parameter if the target sum is less than the fourth difference; determining the second sum as the target quantization parameter if the target sum is greater than the second sum; and determining the target sum as the target quantization parameter if the target sum is greater than or equal to the fourth difference and less than or equal to the second sum.
In this embodiment, upper and lower limits may be imposed on the QP offset value of the target region according to the magnitude relationship among the first quantization parameter QP of the encoded image, the initial quantization parameter QP of the target region, and the macroblock QP in the target region, so as to ensure that the difference between the macroblock QP in the target region of the current frame and both the macroblock QP in the target region of the previous frame and the whole-frame QP lies between the maximum and minimum critical values.
For example, taking the target region of interest as the target region: if the QP of the target region of the current frame is the initial quantization parameter $QP_{ROI}$, the QP of the previous frame is the first quantization parameter $QP_{frame}$, and the macroblock QP is the second quantization parameter $QP_{mb}$, the following limits are imposed: $QP_{frame} - 10 < QP_{ROI} + \mathrm{offset} < QP_{frame} + 10$ and $QP_{mb} - 5 < QP_{ROI} + \mathrm{offset} < QP_{mb} + 5$.
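The limiting step amounts to clamping the sum of the initial QP and the offset into a window around the previous frame QP and a window around the macroblock QP. The sketch below is one possible reading, with assumptions labelled: the function name is hypothetical, the two windows are applied one after the other (the text allows either or both), and the bounds are treated as inclusive as in the embodiment that uses "greater than or equal to", whereas the numeric example uses strict inequalities.

```python
def clamp_qp(qp_init, offset, qp_frame, qp_mb, frame_margin=10, mb_margin=5):
    """Clamp qp_init + offset into the frame-level window, then the macroblock-level window.

    The margins 10 and 5 are taken from the example; applying both windows
    in sequence is an assumption made for this sketch.
    """
    target = qp_init + offset
    # Frame-level limit: stay within qp_frame +/- frame_margin.
    target = max(qp_frame - frame_margin, min(target, qp_frame + frame_margin))
    # Macroblock-level limit: stay within qp_mb +/- mb_margin.
    target = max(qp_mb - mb_margin, min(target, qp_mb + mb_margin))
    return target


# Illustrative values only.
print(clamp_qp(30, -15, qp_frame=32, qp_mb=31))  # large negative offset is limited
print(clamp_qp(30, -2, qp_frame=32, qp_mb=31))   # small offset passes through
```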
In the above embodiment, after determining the target quantization parameter, the adjusted quantization parameter QP may be applied to the encoder for encoding, and finally the encoded image or video sequence is obtained.
The following describes an encoding method of an image with reference to the embodiment:
Fig. 10 is a flowchart of a method for encoding an image according to an embodiment of the present invention. As shown in Fig. 10, the flow includes:
Step S1002, acquiring prior information (corresponding to the image features described above).
The prior information of the current frame is obtained by software, hardware, or other means. It includes, but is not limited to, temporal texture information, spatial texture information, and ROI information. The temporal texture information may include the result of motion detection; the spatial texture information may include the spatial texture complexity of the current frame and the spatial texture complexity of the ROI region; the ROI information may include the region, occupied-area ratio, duration, type, level, and similar attributes of the ROI.
Step S1004, dividing the region according to the prior information, and building a mapping model according to the divided region and calculating the QP offset value.
Region division specifically comprises the following steps:
(1) Acquiring information such as the ROI region, area ratio, type, and grade of the current frame according to the ROI prior information, and dividing off the region where the ROI is located;
(2) Expanding the ROI region according to prior information such as motion detection and spatial texture, and dividing off the ROI extension region;
(3) The remaining region is taken as a non-ROI region.
The specific methods for dividing the ROI expansion region in the step (2) include, but are not limited to, the following three methods:
Method 1: the extension region is divided by combining the motion trail with the object.
(1) The track route is obtained by accumulating the tracks produced by motion detection;
(2) Combining the object, the part of the track from the last few frames that does not fall within the ROI region is divided into the ROI extension region;
Method 2: the extension region is divided by the union of the ROI regions.
(1) Acquiring the ROI regions of the encoded frames according to the ROI information;
(2) Taking the union of the ROI regions of the encoded frames, and dividing the area covered by that union into the ROI extension region;
Method 3: the extension region is divided by the spatial texture around the ROI region.
(1) Calculating texture information around the ROI region of the current frame;
(2) Expanding the regions of complex texture around the ROI region into the ROI extension region.
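Method 2 can be illustrated with plain set operations. This is a minimal sketch assuming that regions are represented as sets of macroblock coordinates; the helper names and the rectangle convention (end coordinates exclusive) are hypothetical. The extension region is taken as the union of the historical ROI regions minus the part that coincides with the current ROI, and everything else in the frame becomes the non-ROI region.

```python
def rect_to_blocks(x0, y0, x1, y1):
    """Return the set of (x, y) block coordinates in a rectangle; x1 and y1 are exclusive."""
    return {(x, y) for x in range(x0, x1) for y in range(y0, y1)}


def split_regions(frame_blocks, current_roi, history_rois):
    """Split a frame into ROI, ROI extension region, and non-ROI region.

    history_rois is a list of block sets, one per already-encoded frame.
    """
    union = set().union(*history_rois)                 # union of historical ROI regions
    extended = (union - current_roi) & frame_blocks    # drop the overlap with the current ROI
    other = frame_blocks - current_roi - extended      # the remaining region
    return current_roi, extended, other


# Illustrative 4x4-block frame.
frame = rect_to_blocks(0, 0, 4, 4)
roi_now = rect_to_blocks(1, 1, 3, 3)
roi_prev = rect_to_blocks(0, 1, 2, 3)  # the object was one block further left
roi, ext, rest = split_regions(frame, roi_now, [roi_prev])
print(sorted(ext), len(rest))
```

The same union-and-subtract step would also fit Method 1 if the accumulated motion track were supplied as the historical block sets.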
The specific steps of establishing a mapping model according to the divided areas and calculating the QP offset value are as follows:
The QP initial offset value is calculated, with a different mapping used for each region. The factors that mainly affect the size of the QP offset value include, but are not limited to, temporal texture, spatial texture, and ROI level. The complexity of the current frame can be divided into a number of classes according to the magnitude of the spatial texture, and the motion of the current frame can be divided into a number of classes according to the temporal texture. The ROI information may include, but is not limited to, the following elements: the location region, denoted $s_1$; the occupied area, denoted $s_2$; the occupied duration, denoted $s_3$; and the ROI grade, denoted $s_4$. The classification is obtained according to the following formula:
$$N_3 = f_1(s_1, s_2, s_3, s_4)$$
where $f_1$ is a mapping function, which may be linear or nonlinear, and $N_3$ is the classification result. The rest of the reference information is . From the above m kinds of information, the QP initial offset value Offset can be calculated for each region, respectively:
where $f_2$, $f_3$, and $f_4$ are mapping functions, which may be linear or nonlinear, and Offset is the calculated offset value.
According to the temporal texture, the motion can be divided into cases including, but not limited to, stationary, small motion, medium motion, and large motion, denoted by md.
According to the spatial texture, the complexity can be divided into cases including, but not limited to, simple, slightly complex, moderately complex, and highly complex, denoted by tex.
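The numeric values assigned to each motion and texture class are not reproduced in the text, so the following sketch uses hypothetical ones: classes are numbered 0 to 3 in increasing order, and the measurements (fraction of moving blocks, mean gradient magnitude) and their thresholds are assumptions chosen only to show how a measurement could be mapped to md or tex.

```python
import bisect

# Class names follow the text; their numeric codes (0..3) are an assumption.
MOTION_LABELS = ("stationary", "small motion", "medium motion", "large motion")
TEXTURE_LABELS = ("simple", "slightly complex", "moderately complex", "highly complex")

# Hypothetical thresholds separating adjacent classes.
MOTION_THRESHOLDS = (0.01, 0.10, 0.30)   # fraction of blocks flagged by motion detection
TEXTURE_THRESHOLDS = (5.0, 15.0, 30.0)   # mean gradient magnitude of the region


def classify(value, thresholds):
    """Return the index of the class that value falls into (thresholds must be sorted)."""
    return bisect.bisect_right(thresholds, value)


def motion_detection_value(moving_fraction):
    """Map a motion measurement to md."""
    return classify(moving_fraction, MOTION_THRESHOLDS)


def texture_value(mean_gradient):
    """Map a texture measurement to tex."""
    return classify(mean_gradient, TEXTURE_THRESHOLDS)


md = motion_detection_value(0.5)
tex = texture_value(20.0)
print(md, MOTION_LABELS[md], tex, TEXTURE_LABELS[tex])
```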
The acquired ROI information includes, but is not limited to, the following: its location region, occupied duration, ROI grade, and the like.
Suppose that, at 1080P resolution, the ROI region starts at coordinates (0, 0) and ends at (300, 400); this determines the divided region. The occupied area is 300×400 = 120000 pixels and its proportion of the frame is 120000/(1920×1080) ≈ 0.058, so the occupied-area ratio is 5.8%. With the ratio divided into 10 grades, 1 to 10, in steps of 10%, the grade of the occupied-area ratio is 1.
Suppose the ROI is present for 30 seconds within a period of 1 minute, so the duration ratio is 50%. With the ratio divided into 10 grades, 1 to 10, in steps of 10%, the grade of the duration is 5.
The ROI class is classified with n = 3 into three types: important, unimportant, and neglected. The bit rate investment can be increased for the important type, unimportant regions are not processed, and the bit rate investment is reduced for neglected regions. The grade values are set to -10, 0, and 10; in this embodiment the ROI is classified as important, which corresponds to the value 10 used in the calculation below.
The ROI value needs to integrate these three conditions into a single comprehensive ROI value. Assuming that, in the comprehensive value calculation, the area ratio has a weight of 20%, the duration ratio a weight of 20%, and the class classification a weight of 60%, the ROI value is calculated as ROI = 1×0.2 + 5×0.2 + 10×0.6 = 7.2. After fusion, the ROI composite value is therefore 7.2.
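The arithmetic of this worked example can be checked with a short script. The function names are hypothetical, and rounding a ratio up to the next 10% step is an assumption that happens to reproduce the grades 1 and 5 given in the example.

```python
import math


def ratio_to_grade(ratio, step=0.10, levels=10):
    """Map a ratio in [0, 1] to a grade from 1 to levels, one grade per step."""
    grade = math.ceil(round(ratio / step, 9))  # round first to absorb float noise
    return min(max(grade, 1), levels)


def roi_composite(area_grade, duration_grade, importance, weights=(0.2, 0.2, 0.6)):
    """Weighted sum of the three grades; the weights are those of the example."""
    w_area, w_duration, w_importance = weights
    return area_grade * w_area + duration_grade * w_duration + importance * w_importance


area_ratio = (300 * 400) / (1920 * 1080)   # about 0.058
area_grade = ratio_to_grade(area_ratio)    # grade of the occupied-area ratio
duration_grade = ratio_to_grade(30 / 60)   # 30 s out of 1 minute
roi_value = roi_composite(area_grade, duration_grade, importance=10)
print(area_grade, duration_grade, round(roi_value, 2))
```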
The three kinds of information are fused and mapped to the quantization parameter QP offset value. In this example, the motion detection value md is 3, the texture calculation value tex is 2, the ROI composite value is 7.2, and the constant d is -1; the QP offset value is then calculated from these inputs according to the above model.
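One way to picture the fusion is the weighted-sum form used in the embodiment where the initial offset is the sum of weighted motion and texture terms, plus the ROI composite value for the region of interest, plus a further constant for the extended region. The sketch below follows that form; the weights are not given in the text, so the defaults of 1.0 are placeholders, the function and region names are hypothetical, and the printed numbers are not the result stated in the source. How the resulting value maps onto an actual QP change, including its sign, depends on the mapping functions chosen.

```python
def initial_offset(region, md, tex, roi_value=0.0, w_md=1.0, w_tex=1.0, d=-1.0):
    """Weighted-sum sketch of the initial QP offset for one region.

    region: "roi", "extended", or "other" (hypothetical labels).
    w_md, w_tex: placeholder weights; the text allows different weights per region.
    d: preset constant applied only to the extended region.
    """
    base = w_md * md + w_tex * tex
    if region == "roi":
        return base + roi_value
    if region == "extended":
        return base + roi_value + d
    if region == "other":
        return base
    raise ValueError(f"unknown region: {region!r}")


for name in ("roi", "extended", "other"):
    print(name, round(initial_offset(name, md=3, tex=2, roi_value=7.2), 2))
```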
Step S1006, adjusting QP initial offset value according to the encoded frame information.
The second peak signal-to-noise ratio (PSNR) of the ROI region of the encoded previous frame is compared with the first PSNR of the ROI region of the current frame. If the PSNR difference is higher than the maximum critical value, the QP offset value is adjusted upward; if it is lower than the minimum critical value, the QP offset value is adjusted downward; if it lies between the maximum and minimum critical values, no adjustment is required. The adjustment formula may be $\mathrm{Offset}_{new} = f_5(\mathrm{offset}, \mathrm{PSNR}_{ROI}, \mathrm{PSNR}_{max}, \mathrm{PSNR}_{min}, \mathrm{PSNR}_{last})$, where offset is the QP initial offset value; $\mathrm{PSNR}_{ROI}$ is the PSNR of the ROI region of the current frame, that is, the first peak signal-to-noise ratio of the target image; $\mathrm{PSNR}_{max}$ (the first set value) and $\mathrm{PSNR}_{min}$ (the second set value) are preset critical values for the ROI region; and $\mathrm{PSNR}_{last}$ is the average PSNR of the ROI region of the encoded frames, that is, the second peak signal-to-noise ratio.
The spatial texture complexity of the ROI region of the encoded previous frame, or of several encoded frames, is compared with the spatial texture complexity of the ROI region of the current frame. If the texture complexity difference is higher than the maximum critical value of complexity for the ROI region, the QP offset value is adjusted downward; if it is lower than the minimum critical value, the QP offset value is adjusted upward. The adjustment formula may be $\mathrm{Offset}_{new} = f_6(\mathrm{offset}, \mathrm{TEX}_{ROI}, \mathrm{TEX}_{max}, \mathrm{TEX}_{min}, \mathrm{TEX}_{last})$, where the maximum and minimum critical values of the spatial texture are $\mathrm{TEX}_{max}$ (the third set value) and $\mathrm{TEX}_{min}$ (the fourth set value); $\mathrm{TEX}_{ROI}$ is the first complexity, that is, the spatial texture complexity of the ROI region of the current frame; and $\mathrm{TEX}_{last}$ is the second complexity, that is, the average spatial texture complexity of the ROI region of the encoded frames.
Upper and lower limits are imposed on the QP offset value in the current ROI according to the magnitude relationship among the QP of the encoded previous frame, the QP in the ROI, and the macroblock QP in the current ROI, so as to ensure that the difference between the macroblock QP in the current ROI and both the previous frame's macroblock QP and the whole-frame QP lies between the maximum and minimum critical values.
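Specifically, the adjustment between the PSNR and the QP offset value is as follows:
$$\mathrm{Offset}_{new} = \begin{cases} \mathrm{offset} + c_1, & \mathrm{PSNR}_{ROI} - \mathrm{PSNR}_{last} > \mathrm{PSNR}_{max} \\ \mathrm{offset} - c_2, & \mathrm{PSNR}_{ROI} - \mathrm{PSNR}_{last} < \mathrm{PSNR}_{min} \\ \mathrm{offset}, & \mathrm{PSNR}_{min} < \mathrm{PSNR}_{ROI} - \mathrm{PSNR}_{last} < \mathrm{PSNR}_{max} \end{cases}$$
where the maximum and minimum PSNR critical values are denoted $\mathrm{PSNR}_{max}$ and $\mathrm{PSNR}_{min}$; the first peak signal-to-noise ratio of the ROI region is $\mathrm{PSNR}_{ROI}$ and the second peak signal-to-noise ratio is $\mathrm{PSNR}_{last}$; and $c_1$ and $c_2$ denote the first constant and the second constant. When the encoded image comprises a plurality of frames, the second peak signal-to-noise ratio may be the average of the peak signal-to-noise ratios of those frames.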
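The adjustment between the spatial texture and the QP offset value is as follows:
$$\mathrm{Offset}_{new} = \begin{cases} \mathrm{offset} - c_3, & \mathrm{TEX}_{ROI} - \mathrm{TEX}_{last} > \mathrm{TEX}_{max} \\ \mathrm{offset} + c_4, & \mathrm{TEX}_{ROI} - \mathrm{TEX}_{last} < \mathrm{TEX}_{min} \\ \mathrm{offset}, & \mathrm{TEX}_{min} < \mathrm{TEX}_{ROI} - \mathrm{TEX}_{last} < \mathrm{TEX}_{max} \end{cases}$$
where the maximum and minimum critical values of the spatial texture are $\mathrm{TEX}_{max}$ (the third set value) and $\mathrm{TEX}_{min}$ (the fourth set value); $\mathrm{TEX}_{ROI}$ is the first complexity of the spatial texture within the ROI region; $\mathrm{TEX}_{last}$ is the second complexity, that is, the spatial texture complexity of the ROI region of the encoded image; and $c_3$ and $c_4$ denote the third constant and the fourth constant. When the encoded image comprises a plurality of frames, the second complexity may be the average of the spatial texture complexities of the ROI regions of those frames.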
If the QP of the target region of the current frame is the initial quantization parameter $QP_{ROI}$, the QP of the previous frame is the first quantization parameter $QP_{frame}$, and the macroblock QP is the second quantization parameter $QP_{mb}$, the following limits are imposed: $QP_{frame} - 10 < QP_{ROI} + \mathrm{offset} < QP_{frame} + 10$ and $QP_{mb} - 5 < QP_{ROI} + \mathrm{offset} < QP_{mb} + 5$.
Step S1008, the data is sent to an encoder for encoding.
The adjusted quantization parameter QP is applied to the encoder for encoding, and the encoded image or video sequence is finally obtained.
In the foregoing embodiment, the ROI region, the ROI extension region, and the non-ROI region are divided according to the prior information, and a QP offset value is calculated for each region. After the QP offset value has been adjusted and limited using the encoded frame information, it is used for encoding, so that the objective result, the subjective quality, and the user requirements of the encoding are balanced as well as possible. The ROI region is expanded to a certain extent using the motion trail of the ROI and the target, the historical ROI regions, the spatial texture characteristics, and the like, which improves subjective quality and removes the tailing effect. In the method of mapping QP offset values from the ROI information, a corresponding mapping model is established for each of the divided regions, and the QP offset value is calculated from information such as the ROI grade and the temporal and spatial texture, which improves subjective quality while keeping the compression rate substantially unchanged. The quantization parameter QP is adaptively adjusted by comparing parameters such as the PSNR, QP, and texture of the ROI region of the current frame with those of the same region of the encoded frames, so that the quantization parameter QP better matches the current scene. Subjective quality is improved without increasing the bit rate, thereby balancing subjective quality, compression rate, and user requirements.
From the description of the above embodiments, it will be clear to a person skilled in the art that the method according to the above embodiments may be implemented by means of software plus the necessary general hardware platform, but of course also by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) comprising instructions for causing a terminal device (which may be a mobile phone, a computer, a server, or a network device, etc.) to perform the method according to the embodiments of the present invention.
The present embodiment also provides an image encoding device, which is used to implement the foregoing embodiments and preferred embodiments, and is not described in detail. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. While the means described in the following embodiments are preferably implemented in software, implementation in hardware, or a combination of software and hardware, is also possible and contemplated.
Fig. 11 is a block diagram of a structure of an apparatus for encoding an image according to an embodiment of the present invention, as shown in fig. 11, the apparatus including:
The dividing module 1102 is configured to divide a target image based on image features of the target image to obtain a target region of interest, an extended region of interest, and other regions, where the other regions are regions of the target image other than the target region of interest and the extended region of interest;
An encoding module 1104, configured to perform the following operations for each target region included in the target region of interest, the extended region of interest, and the other regions, to encode the target image: if the target region is not the other region, determining a target region of interest integrated value indicating a degree of importance of the target region, and determining an initial offset value of the target region based on the target region of interest integrated value; if the target region is the other region, determining a first calculated value, and determining the initial offset value of the target region based on the first calculated value, wherein the first calculated value includes at least one of: a first motion detection value for indicating a target motion type of a target motion existing in the target region, and a first texture calculation value for indicating a target texture complexity of the target region; adjusting the initial offset value based on encoding information of an encoded image included in the video in which the target image is located, to obtain a target offset value; and adjusting an initial quantization parameter based on the target offset value to obtain a target quantization parameter, and encoding the target region based on the target quantization parameter.
In an exemplary embodiment, the partitioning module 1102 may implement partitioning the target image based on image features of the target image to obtain a target region of interest, an extended region of interest, and other regions by: determining the target region of interest based on region of interest information included in the image features; determining the extended region of interest based on temporal texture information or spatial texture information included in the image features; and determining the areas except the target region of interest and the extended region of interest included in the target image as the other areas.
In an exemplary embodiment, the partitioning module 1102 may determine the extended region of interest based on temporal texture information included in the image features by: determining N frames of continuous images that are included in the video and precede the target image, wherein the last of the N frames of continuous images is adjacent to the target image, each of the N frames of continuous images includes the target object included in the target image, and N is an integer greater than or equal to 1; determining a motion trail of the target object based on the temporal texture information of the N frames of continuous images; determining the sub-region occupied by the motion trail in each of the N frames of continuous images; combining the sub-regions of all the frames to obtain a combined region; determining an overlapping region of the combined region and the target region of interest; and determining the part of the combined region other than the overlapping region as the extended region of interest.
In an exemplary embodiment, the partitioning module 1102 may enable determining the extended region of interest based on spatial texture information included in the image features by: determining that a first region, of which the distance from the target region of interest is smaller than a first threshold, is included in the target image; determining texture complexity for indicating a complexity level of a texture included in the first region based on the spatial texture information of the first region; and determining a second region, included in the first region, in which the texture complexity is greater than a second threshold, as the extended region of interest.
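The texture-based expansion can be sketched as a filter over the blocks near the ROI. The representation is an assumption: block complexities are held in a dictionary keyed by (x, y) block coordinates, distance is measured in blocks using the Chebyshev metric, and the function name is hypothetical. Blocks closer to the ROI than the first threshold form the first region, and those among them whose complexity exceeds the second threshold form the extended region of interest.

```python
def texture_extension(complexity, roi, max_dist, min_complexity):
    """Return the blocks near the ROI whose texture complexity is high.

    complexity: dict mapping (x, y) block coordinates to a complexity score.
    roi: set of (x, y) blocks belonging to the target region of interest.
    max_dist: first threshold (distance to the ROI must be smaller than this).
    min_complexity: second threshold (complexity must be greater than this).
    """
    if not roi:
        return set()
    extended = set()
    for (bx, by), value in complexity.items():
        if (bx, by) in roi:
            continue
        # Chebyshev distance to the nearest ROI block (an assumed metric).
        dist = min(max(abs(bx - rx), abs(by - ry)) for rx, ry in roi)
        if dist < max_dist and value > min_complexity:
            extended.add((bx, by))
    return extended


# Illustrative 5x5-block frame with a single-block ROI in the centre.
grid = {(x, y): 0.0 for x in range(5) for y in range(5)}
grid[(1, 2)] = 9.0   # adjacent to the ROI and complex
grid[(0, 2)] = 9.0   # complex but too far away
grid[(3, 2)] = 1.0   # adjacent but simple
print(texture_extension(grid, roi={(2, 2)}, max_dist=2, min_complexity=5.0))
```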
In an exemplary embodiment, the partitioning module 1102 may implement partitioning the target image based on image features of the target image to obtain a target region of interest, an extended region of interest, and other regions by: determining the target region of interest based on region of interest information included in the image features; determining a historical region of interest of an encoded image included in the video; determining a union of the historical interesting areas to obtain a third area; determining a fourth region which is included in the third region and coincides with the target region of interest; determining a region other than the fourth region included in the third region as the extended region of interest; and determining the areas except the target region of interest and the extended region of interest included in the target image as the other areas.
In one exemplary embodiment, the encoding module 1104 may implement determining the target motion detection value for indicating the type of motion of the target motion present in the target region by: determining a target motion type of the target motion existing in the target region based on time domain texture information of the target region; determining a first correspondence between the motion type and the motion detection value; and determining the target motion detection value corresponding to the target motion type, which is included in the first corresponding relation.
In one exemplary embodiment, the encoding module 1104 may determine a target texture calculation value indicating a target texture complexity of the target region by: determining the target texture complexity of the target region based on spatial texture information of the target region; determining a second correspondence between texture complexity and texture calculation values; and determining the target texture calculation value that corresponds to the target texture complexity in the second correspondence.
In one exemplary embodiment, the encoding module 1104 may determine a target region of interest composite value indicating the importance of the target region by: determining a second calculated value of the target region of interest composite value, wherein the second calculated value comprises at least one of: a first level corresponding to a target area of the target region of interest, a second level corresponding to a target duration of the target region of interest in the video, and a third level corresponding to an importance level of the target region of interest, wherein the importance level of the target region of interest is determined based on the objects included in the target region of interest and a correspondence between objects and importance levels; determining, for each first target calculation value included in the second calculated value, a first product of the first target calculation value and a first weight of the first target calculation value; and mapping the target region of interest composite value based on the first product.
In one exemplary embodiment, the encoding module 1104 may implement determining a target region of interest composite value indicative of the importance of the target region by: determining a first grade corresponding to a target area of the target region of interest; determining a second level corresponding to a target duration of the target region of interest in the video; determining a third level corresponding to the importance degree of the target region of interest, wherein the importance degree of the target region of interest is determined based on the objects included in the target region of interest and the correspondence between the objects and the importance degree; determining a first weight corresponding to the first level, determining a second weight corresponding to the second level, and determining a third weight corresponding to the third level; determining a first product of the first level and the first weight, determining a second product of the second level and the second weight, and determining a third product of the third level and the third weight; and determining the sum value of the first product, the second product and the third product as the target region of interest integrated value.
In an exemplary embodiment, the encoding module 1104 may determine the initial offset value of the target region based on the target region of interest integrated value by: mapping the initial offset value based on the target region of interest integrated value if the target region is the target region of interest; and, if the target region is the extended region of interest, mapping the initial offset value based on the target region of interest integrated value and a first preset constant.
In one exemplary embodiment, the encoding module 1104 may determine the initial offset value of the target region based on the target region of interest integrated value by: when the target region is the target region of interest, determining a fourth product of the target motion detection value and a fourth weight corresponding to the target motion detection value, determining a fifth product of the target texture calculation value and a fifth weight corresponding to the target texture calculation value, and determining the sum of the fourth product, the fifth product, and the target region of interest integrated value as the initial offset value; when the target region is the extended region of interest, determining a sixth product of the target motion detection value and a sixth weight corresponding to the target motion detection value, determining a seventh product of the target texture calculation value and a seventh weight corresponding to the target texture calculation value, and determining the sum of the sixth product, the seventh product, the target region of interest integrated value, and a preset constant as the initial offset value; and when the target region is the other region, determining an eighth product of the target motion detection value and an eighth weight corresponding to the target motion detection value, determining a ninth product of the target texture calculation value and a ninth weight corresponding to the target texture calculation value, and determining the sum of the eighth product and the ninth product as the initial offset value.
In one exemplary embodiment, the encoding module 1104 may determine the initial offset value of the target region based on the target region of interest integrated value in the following manner: determining a third calculated value if the target region is not the other region, wherein the third calculated value includes at least one of: a second motion detection value for indicating a target motion type of a target motion present in the target region, and a second texture calculation value for indicating a target texture complexity of the target region; if the target region is the target region of interest, determining, for each second target calculation value included in the third calculated value, a second product of the second target calculation value and a second weight of the second target calculation value, and mapping the initial offset value based on the target region of interest integrated value and the second product; and if the target region is the extended region of interest, determining, for each second target calculation value included in the third calculated value, a third product of the second target calculation value and a third weight of the second target calculation value, and mapping the initial offset value based on the target region of interest integrated value, the third product and a second preset constant.
In an exemplary embodiment, the encoding module 1104 may determine the initial offset value of the target region based on the first calculated value in the following manner: determining, for each third target calculation value included in the first calculated value, a fourth product of the third target calculation value and a fourth weight; and mapping the initial offset value based on the fourth product.
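As an illustration only, the per-region cases described in the embodiments above can be sketched in Python as plain weighted sums. The weights, the constant, the region labels ("roi", "extended", "other") and the function name are hypothetical; the embodiments give no numeric values, and the "mapping" they describe need not be a simple sum.

```python
# Hypothetical (motion weight, texture weight) pairs; the source gives no values.
ROI_WEIGHTS = (0.5, 0.5)       # target region of interest
EXT_WEIGHTS = (0.25, 0.25)     # extended region of interest
OTHER_WEIGHTS = (1.0, 1.0)     # other regions
EXT_CONSTANT = 2.0             # hypothetical preset constant for the extended ROI


def initial_offset(kind, motion_value, texture_value, roi_value=0.0):
    """Return the initial offset value for one region.

    kind          -- "roi", "extended" or "other" (labels are assumptions)
    motion_value  -- motion detection value of the region
    texture_value -- texture calculation value of the region
    roi_value     -- region of interest integrated value (unused for "other")
    """
    if kind == "roi":
        w_motion, w_texture = ROI_WEIGHTS
        return w_motion * motion_value + w_texture * texture_value + roi_value
    if kind == "extended":
        w_motion, w_texture = EXT_WEIGHTS
        return (w_motion * motion_value + w_texture * texture_value
                + roi_value + EXT_CONSTANT)
    if kind == "other":
        w_motion, w_texture = OTHER_WEIGHTS
        return w_motion * motion_value + w_texture * texture_value
    raise ValueError(f"unknown region kind: {kind!r}")
```

In this sketch a negative integrated value pulls the offset of an important region down, while the extra constant keeps the extended region of interest slightly above the target region of interest; whether offsets are signed this way is an assumption.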
In an exemplary embodiment, the encoding module 1104 may adjust the initial offset value based on encoding information of an encoded image included in a video in which the target image is located, to obtain the target offset value, by at least one of:
determining a first peak signal-to-noise ratio of the target image, and determining a second peak signal-to-noise ratio of the encoded image, determining a first difference between the first peak signal-to-noise ratio and the second peak signal-to-noise ratio, determining a sum of the initial offset value and a first constant as the target offset value if the first difference is greater than a first set value, determining a difference between the initial offset value and a second constant as the target offset value if the first difference is less than a second set value, and determining the initial offset value as the target offset value if the first difference is greater than the second set value and less than the first set value, wherein the first set value is greater than the second set value;
determining a first complexity of the spatial texture of the target image, and determining a second complexity of the spatial texture of the encoded image, determining a second difference of the first complexity and the second complexity, determining a difference of the initial offset value and a third constant as the target offset value if the second difference is greater than a third set value, determining a sum of the initial offset value and a fourth constant as the target offset value if the second difference is less than a fourth set value, and determining the initial offset value as the target offset value if the second difference is greater than the fourth set value and less than the third set value, wherein the third set value is smaller than the fourth set value;
determining a first peak signal-to-noise ratio of the target image, and determining a second peak signal-to-noise ratio of the encoded image, determining a first difference between the first peak signal-to-noise ratio and the second peak signal-to-noise ratio, determining a sum of the initial offset value and a first constant as a first intermediate offset value if the first difference is greater than a first set value, determining a difference between the initial offset value and a second constant as the first intermediate offset value if the first difference is less than a second set value, and determining the initial offset value as the first intermediate offset value if the first difference is greater than the second set value and less than the first set value, wherein the first set value is greater than the second set value; determining a first complexity of the spatial texture of the target image, and determining a second complexity of the spatial texture of the encoded image, determining a second difference of the first complexity and the second complexity, determining a difference of the first intermediate offset value and a third constant as the target offset value if the second difference is greater than a third set value, determining a sum of the first intermediate offset value and a fourth constant as the target offset value if the second difference is less than a fourth set value, and determining the first intermediate offset value as the target offset value if the second difference is greater than the fourth set value and less than the third set value, wherein the third set value is smaller than the fourth set value;
determining a first complexity of the spatial texture of the target image, and determining a second complexity of the spatial texture of the encoded image, determining a second difference of the first complexity and the second complexity, determining a difference of the initial offset value and a third constant as a second intermediate offset value if the second difference is greater than a third set value, determining a sum of the initial offset value and a fourth constant as the second intermediate offset value if the second difference is less than a fourth set value, and determining the initial offset value as the second intermediate offset value if the second difference is greater than the fourth set value and less than the third set value, wherein the third set value is smaller than the fourth set value; determining a first peak signal-to-noise ratio of the target image, and determining a second peak signal-to-noise ratio of the encoded image, determining a first difference between the first peak signal-to-noise ratio and the second peak signal-to-noise ratio, determining a sum of the second intermediate offset value and a first constant as the target offset value if the first difference is greater than a first set value, determining a difference between the second intermediate offset value and a second constant as the target offset value if the first difference is less than a second set value, and determining the second intermediate offset value as the target offset value if the first difference is greater than the second set value and less than the first set value, wherein the first set value is greater than the second set value.
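The adjustment alternatives above can be sketched as follows. This is a minimal sketch under stated assumptions: each difference is taken as the target-image value minus the encoded-image value, each `upper` threshold exceeds the matching `lower` threshold so that the unchanged band is non-empty, a difference exactly equal to a threshold leaves the offset unchanged, and every default threshold and constant is hypothetical.

```python
def adjust_by_psnr(offset, psnr_target, psnr_encoded,
                   upper=1.0, lower=-1.0, c1=1.0, c2=1.0):
    """PSNR rule: raise the offset by c1 above `upper`, lower it by c2 below `lower`."""
    diff = psnr_target - psnr_encoded          # sign convention is an assumption
    if diff > upper:
        return offset + c1
    if diff < lower:
        return offset - c2
    return offset                              # inside the band: unchanged


def adjust_by_texture(offset, complexity_target, complexity_encoded,
                      upper=5.0, lower=-5.0, c3=1.0, c4=1.0):
    """Texture rule: lower the offset by c3 above `upper`, raise it by c4 below `lower`."""
    diff = complexity_target - complexity_encoded
    if diff > upper:
        return offset - c3
    if diff < lower:
        return offset + c4
    return offset


def adjust_offset(offset, psnr_target, psnr_encoded,
                  complexity_target, complexity_encoded, psnr_first=True):
    """Chain both rules; the first result serves as the intermediate offset value."""
    if psnr_first:
        mid = adjust_by_psnr(offset, psnr_target, psnr_encoded)
        return adjust_by_texture(mid, complexity_target, complexity_encoded)
    mid = adjust_by_texture(offset, complexity_target, complexity_encoded)
    return adjust_by_psnr(mid, psnr_target, psnr_encoded)
```

Calling only `adjust_by_psnr` or only `adjust_by_texture` corresponds to the first two alternatives; `adjust_offset` with `psnr_first=True` or `False` corresponds to the two chained alternatives.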
In an exemplary embodiment, the encoding module 1104 may adjust the initial quantization parameter based on the target offset value to obtain a target quantization parameter by at least one of: determining a first quantization parameter of the encoded image, determining a third difference between the first quantization parameter and a fifth constant, and determining a first sum of the first quantization parameter and a sixth constant, determining a target sum of the initial quantization parameter and the target offset value, the third difference being determined as the target quantization parameter if the target sum is less than the third difference, the first sum being determined as the target quantization parameter if the target sum is greater than the first sum, and the target sum being determined as the target quantization parameter if the target sum is greater than or equal to the third difference and less than or equal to the first sum; determining a second quantization parameter at a macroblock level of the encoded image, determining a fourth difference between the second quantization parameter and a seventh constant, and determining a second sum of the second quantization parameter and an eighth constant, determining a target sum of the initial quantization parameter and the target offset value, determining the fourth difference as the target quantization parameter if the target sum is less than the fourth difference, determining the second sum as the target quantization parameter if the target sum is greater than the second sum, and determining the target sum as the target quantization parameter if the target sum is greater than or equal to the fourth difference and less than or equal to the second sum.
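Both alternatives above amount to clamping the sum of the initial quantization parameter and the target offset value to a window around a reference quantization parameter (that of the encoded image, or its macroblock-level value). A minimal sketch, in which the function name and the default window widths are hypothetical:

```python
def target_quantization_parameter(initial_qp, target_offset, reference_qp,
                                  below=3, above=3):
    """Clamp initial_qp + target_offset to [reference_qp - below, reference_qp + above].

    `below` and `above` stand in for the fifth/sixth (or seventh/eighth)
    constants; their values here are assumptions.
    """
    lower_bound = reference_qp - below
    upper_bound = reference_qp + above
    target_sum = initial_qp + target_offset
    if target_sum < lower_bound:
        return lower_bound
    if target_sum > upper_bound:
        return upper_bound
    return target_sum
```

Bounding the result this way limits how far the quantization parameter of a region can drift from that of the previously encoded image, whatever offset the earlier steps produce.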
It should be noted that each of the above modules may be implemented in software or hardware. In the latter case, the implementation may be, but is not limited to, the following: all of the modules are located in the same processor; or the modules are located in different processors in any combination.
Embodiments of the present invention also provide a computer readable storage medium having a computer program stored therein, wherein the computer program is arranged to perform the steps of any of the method embodiments described above when run.
In one exemplary embodiment, the computer readable storage medium may include, but is not limited to: a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, an optical disk, or any other medium capable of storing a computer program.
An embodiment of the invention also provides an electronic device comprising a memory having stored therein a computer program and a processor arranged to run the computer program to perform the steps of any of the method embodiments described above.
In an exemplary embodiment, the electronic apparatus may further include a transmission device connected to the processor, and an input/output device connected to the processor.
Embodiments of the application also provide a computer program product comprising a computer program which, when executed by a processor, implements the steps of the method of the various embodiments of the application.
For specific examples of this embodiment, reference may be made to the examples described in the foregoing embodiments and exemplary implementations, which are not repeated here.
It will be appreciated by those skilled in the art that the modules or steps of the invention described above may be implemented on a general-purpose computing device, and may be concentrated on a single computing device or distributed across a network of multiple computing devices. They may be implemented as program code executable by a computing device, so that they may be stored in a storage device and executed by the computing device. In some cases, the steps shown or described may be performed in an order different from that described herein; alternatively, the modules or steps may each be fabricated as a separate integrated circuit module, or several of them may be fabricated as a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
The above description covers only the preferred embodiments of the present invention and is not intended to limit the present invention; those skilled in the art may make various modifications and variations to the present invention. Any modification, equivalent replacement, improvement, etc. made within the principle of the present invention should be included in the protection scope of the present invention.
Claims (17)
1. A method of encoding an image, comprising:
Dividing a target image based on image characteristics of the target image to obtain a target region of interest, an extended region of interest and other regions, wherein the other regions are regions except the target region of interest and the extended region of interest in the target image;
The following operations are performed for each target region included in the target region of interest, the extended region of interest, and the other regions to encode the target image:
determining, if the target region is not the other region, a target region of interest integrated value indicating a degree of importance of the target region, and determining an initial offset value of the target region based on the target region of interest integrated value; determining, if the target region is the other region, a first calculated value, and determining the initial offset value of the target region based on the first calculated value, wherein the first calculated value includes at least one of: a first motion detection value for indicating a target motion type of a target motion existing in the target region, and a first texture calculation value for indicating a target texture complexity of the target region; adjusting the initial offset value based on encoding information of an encoded image included in a video in which the target image is located, to obtain a target offset value; and adjusting an initial quantization parameter based on the target offset value to obtain a target quantization parameter, and encoding the target region based on the target quantization parameter.
2. The method of claim 1, wherein dividing the target image based on image features of the target image to obtain a target region of interest, an extended region of interest, and other regions comprises:
Determining the target region of interest based on region of interest information included in the image features;
determining the extended region of interest based on temporal texture information or spatial texture information included in the image features;
and determining the regions of the target image other than the target region of interest and the extended region of interest as the other regions.
3. The method of claim 2, wherein determining the extended region of interest based on temporal texture information included in the image features comprises:
Determining N frames of continuous images which are included in the video and are positioned in front of the target image, wherein the last frame of image which is included in the N frames of continuous images is adjacent to the target image, the N frames of continuous images all include target objects which are included in the target image, and N is an integer which is greater than or equal to 1;
Determining a motion trail of the target object based on the time domain texture information of the N frames of continuous images;
determining a sub-region of the motion trail in each frame of images included in the N frames of continuous images;
Combining the subareas included in each frame of image to obtain a combined area;
Determining an overlapping region of the combined region and the target region of interest;
and determining the part of the combined region other than the overlapping region as the extended region of interest.
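The region arithmetic in claim 3 can be sketched with sets of block coordinates. The sketch assumes the per-frame sub-regions of the motion trajectory have already been extracted (how they are obtained from temporal texture information is outside its scope) and that a region is represented as a set of `(row, column)` block indices; both the representation and the function name are hypothetical.

```python
def extended_roi_from_trajectory(sub_regions, target_roi):
    """Merge per-frame trajectory sub-regions and drop what overlaps the target ROI.

    sub_regions -- iterable of sets of (row, col) blocks, one set per prior frame
    target_roi  -- set of (row, col) blocks forming the target region of interest
    """
    combined = set().union(*sub_regions)   # combined region over the N frames
    overlap = combined & target_roi        # part already covered by the target ROI
    return combined - overlap              # remainder is the extended ROI
```

For example, with sub-regions `{(0, 0), (0, 1)}` and `{(0, 1), (0, 2)}` and a target region of interest `{(0, 2), (0, 3)}`, the combined region is three blocks wide and the block `(0, 2)` is removed as overlap.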
4. The method of claim 2, wherein determining the extended region of interest based on spatial texture information included in the image features comprises:
determining a first region, included in the target image, whose distance from the target region of interest is smaller than a first threshold;
determining texture complexity for indicating a complexity level of a texture included in the first region based on the spatial texture information of the first region;
and determining a second region, included in the first region, in which the texture complexity is greater than a second threshold, as the extended region of interest.
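A possible reading of claim 4, sketched per block: keep the blocks that lie close to the target region of interest and whose texture complexity is high. The Chebyshev block distance, the per-block complexity map, the exclusion of blocks that already belong to the target region of interest, and all names are assumptions made for illustration; the claim does not fix a distance metric or a complexity measure.

```python
def extended_roi_from_texture(block_complexity, target_roi,
                              distance_threshold, complexity_threshold):
    """Select nearby, highly textured blocks as the extended region of interest.

    block_complexity -- dict mapping (row, col) block -> texture complexity
    target_roi       -- set of (row, col) blocks of the target region of interest
    """
    if not target_roi:
        return set()

    def distance_to_roi(block):
        # Chebyshev distance to the nearest ROI block (an assumed metric).
        return min(max(abs(block[0] - r), abs(block[1] - c))
                   for r, c in target_roi)

    return {
        block
        for block, complexity in block_complexity.items()
        if block not in target_roi
        and distance_to_roi(block) < distance_threshold      # first region
        and complexity > complexity_threshold                # second region
    }
```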
5. The method of claim 1, wherein dividing the target image based on image features of the target image to obtain a target region of interest, an extended region of interest, and other regions comprises:
Determining the target region of interest based on region of interest information included in the image features;
determining a historical region of interest of an encoded image included in the video;
determining a union of the historical regions of interest to obtain a third region;
determining a fourth region which is included in the third region and coincides with the target region of interest;
determining a region other than the fourth region included in the third region as the extended region of interest;
and determining the regions of the target image other than the target region of interest and the extended region of interest as the other regions.
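The three-way division of claim 5 can be sketched as below, again treating a region as a set of `(row, column)` block indices. The representation, the function name, and the assumption that the historical regions of interest lie inside the target image are all hypothetical.

```python
def partition_regions(all_blocks, target_roi, historical_rois):
    """Split an image into target ROI, extended ROI and other regions.

    all_blocks      -- set of every (row, col) block in the target image
    target_roi      -- set of blocks forming the target region of interest
    historical_rois -- list of block sets, one per previously encoded image
    """
    third_region = set().union(*historical_rois)   # union of historical ROIs
    fourth_region = third_region & target_roi      # part coinciding with the target ROI
    extended_roi = third_region - fourth_region
    other_regions = all_blocks - target_roi - extended_roi
    return target_roi, extended_roi, other_regions
```

The three returned sets are disjoint and, under the stated assumption, together cover the whole image, so each block receives exactly one of the three offset rules.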
6. The method of claim 1, wherein determining a target motion detection value for indicating a type of motion of a target motion present in the target region comprises:
Determining a target motion type of the target motion existing in the target region based on time domain texture information of the target region;
Determining a first correspondence between the motion type and the motion detection value;
and determining, from the first correspondence, the target motion detection value corresponding to the target motion type.
7. The method of claim 1, wherein determining a target texture calculation value indicative of a target texture complexity of the target region comprises:
determining the target texture complexity of the target region based on spatial-domain texture information of the target region;
Determining a second correspondence between texture complexity and texture calculated values;
and determining, from the second correspondence, the target texture calculated value corresponding to the target texture complexity.
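The two correspondences in claims 6 and 7 behave like lookup tables. In the sketch below the motion types, the complexity bin boundaries and all mapped values are hypothetical, since the claims only require that such correspondences exist.

```python
from bisect import bisect_right

# Hypothetical first correspondence: motion type -> motion detection value.
MOTION_VALUES = {"static": 0, "slow": 1, "fast": 2}

# Hypothetical second correspondence: complexity bins -> texture calculated value.
TEXTURE_BOUNDS = [10.0, 30.0]   # bin edges: [<10), [10, 30), [>=30)
TEXTURE_VALUES = [0, 1, 2]      # one value per bin


def motion_detection_value(motion_type):
    """Look up the motion detection value for a motion type."""
    return MOTION_VALUES[motion_type]


def texture_calculated_value(texture_complexity):
    """Look up the texture calculated value for a numeric texture complexity."""
    return TEXTURE_VALUES[bisect_right(TEXTURE_BOUNDS, texture_complexity)]
```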
8. The method of claim 1, wherein determining a target region of interest composite value indicative of a degree of importance of the target region comprises:
determining a second calculated value for the target region of interest composite value, wherein the second calculated value comprises at least one of: a first level corresponding to a target area of the target region of interest, a second level corresponding to a target duration of the target region of interest in the video, and a third level corresponding to an importance level of the target region of interest, wherein the importance level of the target region of interest is determined based on objects included in the target region of interest and a correspondence between objects and importance levels;
determining, for each first target calculation value included in the second calculated value, a first product of the first target calculation value and a first weight of the first target calculation value;
and mapping the target region of interest composite value based on the first product.
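One simple way to realize the mapping in claim 8 is to sum the weighted levels. The sketch below assumes exactly that; the level names, the weights and the use of a plain sum are hypothetical, and any subset of the three levels may be supplied because the claim requires only at least one of them.

```python
# Hypothetical first weights, one per level.
LEVEL_WEIGHTS = {"area": 0.5, "duration": 0.25, "importance": 1.0}


def roi_composite_value(levels, weights=LEVEL_WEIGHTS):
    """Combine the available levels of a region of interest into one value.

    levels -- dict with any of the keys "area", "duration", "importance"
    """
    return sum(weights[name] * level for name, level in levels.items())
```

For instance, `roi_composite_value({"area": 2, "duration": 1, "importance": 3})` evaluates 0.5·2 + 0.25·1 + 1.0·3.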
9. The method of claim 1, wherein determining an initial offset value for the target region based on the target region of interest composite value comprises:
mapping the initial offset value based on the target region of interest composite value if the target region is the target region of interest;
and mapping the initial offset value based on the target region of interest composite value and a first preset constant if the target region is the extended region of interest.
10. The method of claim 1, wherein determining an initial offset value for the target region based on the target region of interest composite value comprises:
Determining a third calculated value if the target region is not the other region, wherein the third calculated value includes at least one of: a second motion detection value for indicating a target motion type of a target motion present in the target region, a second texture calculation value for indicating a target texture complexity of the target region;
determining, for each second target calculation value included in the third calculation value, a second product of the second target calculation value and a second weight of the second target calculation value, mapping the initial offset value based on the target region of interest integrated value and the second product, if the target region is the target region of interest;
and if the target region is the extended region of interest, determining a third product of the second target calculated value and a third weight of the second target calculated value for each second target calculated value included in the third calculated value, and mapping the initial offset value based on the target region of interest integrated value, the third product and a second preset constant.
11. The method of claim 1, wherein determining an initial offset value for the target region based on the first calculated value comprises:
determining, for each third target calculation value included in the first calculated value, a fourth product of the third target calculation value and a fourth weight;
and mapping the initial offset value based on the fourth product.
12. The method of claim 1, wherein adjusting the initial offset value based on encoding information of an encoded image included in a video in which the target image is located, to obtain the target offset value, comprises at least one of:
Determining a first peak signal-to-noise ratio of the target image, and determining a second peak signal-to-noise ratio of the encoded image, determining a first difference between the first peak signal-to-noise ratio and the second peak signal-to-noise ratio, determining a sum of the initial offset value and a first constant as the target offset value if the first difference is greater than a first set value, determining a difference between the initial offset value and a second constant as the target offset value if the first difference is less than a second set value, and determining the initial offset value as the target offset value if the first difference is greater than the second set value and less than the first set value, wherein the first set value is greater than the second set value;
Determining a first complexity of the spatial texture of the target image, and determining a second complexity of the spatial texture of the encoded image, determining a second difference of the first complexity and the second complexity, determining a difference of the initial offset value and a third constant as the target offset value if the second difference is greater than a third set value, determining a sum of the initial offset value and a fourth constant as the target offset value if the second difference is less than a fourth set value, and determining the initial offset value as the target offset value if the second difference is greater than the fourth set value and less than the third set value, wherein the third set value is less than the fourth set value;
determining a first peak signal-to-noise ratio of the target image, and determining a second peak signal-to-noise ratio of the encoded image, determining a first difference between the first peak signal-to-noise ratio and the second peak signal-to-noise ratio, determining a sum of the initial offset value and a first constant as a first intermediate offset value if the first difference is greater than a first set value, determining a difference between the initial offset value and a second constant as the first intermediate offset value if the first difference is less than a second set value, and determining the initial offset value as the first intermediate offset value if the first difference is greater than the second set value and less than the first set value, wherein the first set value is greater than the second set value; determining a first complexity of the spatial texture of the target image, and determining a second complexity of the spatial texture of the encoded image, determining a second difference of the first complexity and the second complexity, determining a difference of the first intermediate offset value and a third constant as the target offset value if the second difference is greater than a third set value, determining a sum of the first intermediate offset value and a fourth constant as the target offset value if the second difference is less than a fourth set value, and determining the first intermediate offset value as the target offset value if the second difference is greater than the fourth set value and less than the third set value, wherein the third set value is smaller than the fourth set value;
determining a first complexity of the spatial texture of the target image, and determining a second complexity of the spatial texture of the encoded image, determining a second difference of the first complexity and the second complexity, determining a difference of the initial offset value and a third constant as a second intermediate offset value if the second difference is greater than a third set value, determining a sum of the initial offset value and a fourth constant as the second intermediate offset value if the second difference is less than a fourth set value, and determining the initial offset value as the second intermediate offset value if the second difference is greater than the fourth set value and less than the third set value, wherein the third set value is less than the fourth set value; determining a first peak signal-to-noise ratio of the target image, and determining a second peak signal-to-noise ratio of the encoded image, determining a first difference between the first peak signal-to-noise ratio and the second peak signal-to-noise ratio, determining a sum of the second intermediate offset value and a first constant as the target offset value if the first difference is greater than a first set value, determining a difference between the second intermediate offset value and a second constant as the target offset value if the first difference is less than a second set value, and determining the second intermediate offset value as the target offset value if the first difference is greater than the second set value and less than the first set value, wherein the first set value is greater than the second set value.
13. The method of claim 1, wherein adjusting the initial quantization parameter based on the target offset value to obtain a target quantization parameter comprises at least one of:
Determining a first quantization parameter of the encoded image, determining a third difference between the first quantization parameter and a fifth constant, and determining a first sum of the first quantization parameter and a sixth constant, determining a target sum of the initial quantization parameter and the target offset value, the third difference being determined as the target quantization parameter if the target sum is less than the third difference, the first sum being determined as the target quantization parameter if the target sum is greater than the first sum, and the target sum being determined as the target quantization parameter if the target sum is greater than or equal to the third difference and less than or equal to the first sum;
Determining a second quantization parameter at a macroblock level of the encoded image, determining a fourth difference between the second quantization parameter and a seventh constant, and determining a second sum of the second quantization parameter and an eighth constant, determining a target sum of the initial quantization parameter and the target offset value, determining the fourth difference as the target quantization parameter if the target sum is less than the fourth difference, determining the second sum as the target quantization parameter if the target sum is greater than the second sum, and determining the target sum as the target quantization parameter if the target sum is greater than or equal to the fourth difference and less than or equal to the second sum.
14. An image encoding apparatus, comprising:
the dividing module is used for dividing the target image based on the image characteristics of the target image to obtain a target region of interest, an extended region of interest and other regions, wherein the other regions are regions except the target region of interest and the extended region of interest in the target image;
an encoding module, configured to perform the following operations for each target region included in the target region of interest, the extended region of interest, and the other regions, to encode the target image: determining, if the target region is not the other region, a target region of interest integrated value indicating a degree of importance of the target region, and determining an initial offset value of the target region based on the target region of interest integrated value; determining, if the target region is the other region, a first calculated value, and determining the initial offset value of the target region based on the first calculated value, wherein the first calculated value includes at least one of: a first motion detection value for indicating a target motion type of a target motion existing in the target region, and a first texture calculation value for indicating a target texture complexity of the target region; adjusting the initial offset value based on encoding information of an encoded image included in a video in which the target image is located, to obtain a target offset value; and adjusting an initial quantization parameter based on the target offset value to obtain a target quantization parameter, and encoding the target region based on the target quantization parameter.
15. A computer readable storage medium, characterized in that the computer readable storage medium has stored therein a computer program, wherein the computer program is arranged to execute the method of any of the claims 1 to 13 when run.
16. An electronic device comprising a memory and a processor, characterized in that the memory has stored therein a computer program, the processor being arranged to run the computer program to perform the method of any of the claims 1 to 13.
17. A computer program product comprising a computer program, characterized in that the computer program, when being executed by a processor, implements the steps of the method as claimed in any one of claims 1 to 13.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202410744396.4A (CN118317092B) | 2024-06-11 | 2024-06-11 | Image encoding method, image encoding device, storage medium and electronic device |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN118317092A | 2024-07-09 |
| CN118317092B | 2024-08-30 |
Family
ID=91731780
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202410744396.4A (CN118317092B), Active | 2024-06-11 | 2024-06-11 | Image encoding method, image encoding device, storage medium and electronic device |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN118317092B (en) |
Also Published As
| Publication number | Publication date |
|---|---|
| CN118317092B (en) | 2024-08-30 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| CN118317092B (en) | Image encoding method, image encoding device, storage medium and electronic device | |
| CN113766226B (en) | Image coding method, device, equipment and storage medium | |
| JP5969389B2 (en) | Object recognition video coding strategy | |
| KR101528895B1 (en) | Method and apparatus for adaptive feature of interest color model parameters estimation | |
| JP4153202B2 (en) | Video encoding device | |
| CN104539962A (en) | Layered video coding method fused with visual perception features | |
| CN105516720B (en) | A kind of self-adaptation control method of monitor camera code stream | |
| CN110365983B (en) | A macroblock-level code rate control method and device based on human visual system | |
| CN110312131B (en) | Content self-adaptive online video coding method based on deep learning | |
| CN101867799A (en) | A video frame processing method and video encoder | |
| CN114466189B (en) | Bit rate control method, electronic device and storage medium | |
| CN105898306A (en) | Code rate control method and device for sport video | |
| US20170374361A1 (en) | Method and System Of Controlling A Video Content System | |
| EP3545677A1 (en) | Methods and apparatuses for encoding and decoding video based on perceptual metric classification | |
| CN112165620B (en) | Video encoding method and device, storage medium and electronic equipment | |
| US10313693B2 (en) | Method and apparatus for controlling a degree of compression of a digital image | |
| US20180048897A1 (en) | Method and apparatus for coding a video into a bitstream | |
| CN106686383A (en) | Depth map intra-frame coding method capable of preserving edge of depth map | |
| CN115955564A (en) | A video coding method, device, device and medium | |
| CN114173131A (en) | Video compression method and system based on inter-frame correlation | |
| CN105141967B (en) | Based on the quick self-adapted loop circuit filtering method that can just perceive distortion model | |
| CN118509595A (en) | Video image code rate allocation method, system, equipment and storage medium | |
| CN117201792A (en) | Video encoding method, video encoding device, electronic equipment and computer readable storage medium | |
| CN112153381B (en) | Method, device and medium for rapidly dividing CU in dynamic 3D point cloud compression frame | |
| CN111050175A (en) | Method and apparatus for video coding |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |