CN117061825B - Method and device for detecting bad frames of streaming media video and computer equipment - Google Patents

Method and device for detecting bad frames of streaming media video and computer equipment

Info

Publication number
CN117061825B
Authority
CN
China
Prior art keywords
video
frames
local area
video frame
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311316549.7A
Other languages
Chinese (zh)
Other versions
CN117061825A (en)
Inventor
王曜
刘琦
许亦
贺国超
Name withheld at the inventor's request
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Yuntian Changxiang Information Technology Co ltd
Original Assignee
Shenzhen Yuntian Changxiang Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Yuntian Changxiang Information Technology Co ltd
Priority to CN202311316549.7A
Publication of CN117061825A
Application granted
Publication of CN117061825B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44008 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/4402 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N21/440245 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display the reformatting operation being performed only on part of the stream, e.g. a region of the image or a time segment
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/4402 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N21/440281 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display by altering the temporal resolution, e.g. by frame skipping
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/45 Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
    • H04N21/466 Learning process for intelligent management, e.g. learning user preferences for recommending movies
    • H04N21/4662 Learning process for intelligent management, e.g. learning user preferences for recommending movies characterized by learning algorithms
    • H04N21/4666 Learning process for intelligent management, e.g. learning user preferences for recommending movies characterized by learning algorithms using neural networks, e.g. processing the feedback provided by the user

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a method, a device and computer equipment for detecting bad frames of streaming media video. The method comprises the following steps: acquiring a streaming media video; obtaining first local areas of each video frame through a partition mapping relation; carrying out feature information quantity statistics on the first local areas, and determining target areas among the first local areas according to the statistics result; and determining bad frames among the plurality of video frames through a detection model according to the target areas of the video frames. By constructing the detection model, the method determines the bad frames before the video frames are rendered, which avoids invalid processing, saves hardware computing resources and shortens rendering time. The method also divides the video frames according to target features, so that detection is focused on the target areas.

Description

Method and device for detecting bad frames of streaming media video and computer equipment
Technical Field
The present invention relates to the field of video frame processing technologies, and in particular, to a method and apparatus for detecting bad frames of streaming media video, and a computer device.
Background
Streaming media is used to play video and cloud video on terminals such as televisions, mobile phones and notebooks, in all aspects of people's work and life. Accordingly, people's requirements on video playing quality, including definition, smoothness and real-time performance, are increasing. In many streaming media scenarios, such as cloud rendering of a cloud game, rendering is performed in the cloud, the encoded video stream obtained by rendering is transmitted to the end side, and the end side decodes the received stream. In this way, the end side can obtain high-quality rendered content for video playback.
In the prior art, when super-resolution preprocessing is performed to maintain the rendering effect of video frames, super-resolution processing is applied indiscriminately to all video frames, so that bad frames mixed in among them are also processed. This produces invalid processing, wastes hardware computing resources and prolongs rendering time.
Disclosure of Invention
The invention aims to provide a method, a device and computer equipment for detecting bad frames of streaming media video, so as to solve the technical problems in the prior art that invalid processing is generated, hardware computing resources are wasted and rendering time is prolonged.
In order to solve the technical problems, the invention specifically provides the following technical scheme:
In a first aspect, the invention provides a method for detecting bad frames of streaming media video, comprising the following steps:
acquiring a streaming media video, wherein the streaming media video comprises a plurality of video frames;
obtaining first local areas of the video frame through a partition mapping relation, wherein the first local areas correspond to an area division result of the video frame;
carrying out feature information quantity statistics on the first local area, and determining a target area in the first local area according to the feature information quantity statistics result, wherein the target area corresponds to a local image area containing shooting target object features in a video frame;
and determining bad frames in the plurality of video frames through a detection model according to the target area of the video frames, wherein the detection model is a neural network.
As a preferred embodiment of the present invention, the determining of the first local area includes:
determining the dividing number m of the first local areas through the partition mapping relation;
and carrying out equal-area division on the video frame according to the division number m to obtain m first local areas.
As a preferred embodiment of the present invention, the construction of the partition mapping relationship includes:
setting the dividing number m of the first local areas, dividing the video frame into m first local areas of equal area, and calculating the image discreteness among the m first local areas, wherein the image discreteness is measured by a variance formula. The formula itself is missing from this text; a variance form consistent with the symbol definitions that follow is:
$$\delta = \frac{1}{m}\sum_{k=1}^{m}\lVert S_k - S_E\rVert^2,\qquad \lVert S_k - S_E\rVert^2 = \frac{1}{N}\sum_{i=1}^{N}\left(x_{k,i}-x_{E,i}\right)^2$$
where $\delta$ is the image discreteness, $S_k$ is the image matrix of the $k$-th first local area, $S_E$ is the mean image matrix of the $m$ first local areas, $x_{E,i}$ is the pixel value of the $i$-th pixel point in the mean image matrix $S_E$, $x_{k,i}$ is the pixel value of the $i$-th pixel point in the image matrix $S_k$ of the $k$-th first local area, $N$ is the number of pixel points of the image matrix, and $i$ and $k$ are counting indices;
solving for the value of m that maximizes the image discreteness among the first local areas, and dividing the video frame according to that value of m, so that the effective pixel points representing the features of the shooting target object are concentrated in the same first local area, and the ineffective pixel points representing features other than those of the shooting target object are concentrated in the same first local area.
As a preferred embodiment of the present invention, the determining of the target area includes:
carrying out feature information quantity statistics on each first local area of the video frame by using the histogram to obtain feature information quantity of each first local area;
comparing the characteristic information amount of the first partial region with a preset threshold, wherein,
when the characteristic information quantity of the first local area is larger than or equal to a preset threshold value, the first local area is marked as a target area;
and when the characteristic information quantity of the first local area is smaller than a preset threshold value, the first local area is marked as a non-target area.
As a preferred aspect of the present invention, determining a bad frame of a plurality of video frames includes:
inputting all target areas of the video frame into a detection model, and outputting classification labels of the video frame by the detection model;
the classification labels include bad frame labels and non-bad frame labels.
As a preferred embodiment of the present invention, the construction of the detection model includes:
selecting a group of video frames as sample video frames from streaming media video with known shooting target objects, and acquiring all target areas in the sample video frames;
and comparing all target areas in the sample video frame with standard images of the shot target object characteristics, wherein,
if all target areas in the sample video frame are consistent with the standard images of the shooting target object characteristics, marking the sample video frame as a non-bad frame label;
if all the target areas in the sample video frame are inconsistent with the standard images of the shooting target object characteristics, marking the sample video frame as a bad frame label;
learning and training all target areas of the sample video frames and classification labels of the sample video frames by using a neural network to obtain the detection model;
the model expression of the detection model is as follows:
Label=CNN(g);
in the formula, Label is the classification label, g is all target areas of a sample video frame, and CNN is the neural network.
As a preferred embodiment of the present invention, the consistency is quantified using image similarity.
As a preferred embodiment of the present invention, all target areas in the sample video frame are consistent with the standard image of the feature of the photographed target.
In a second aspect of the present invention, the present invention provides a device for detecting bad frames of streaming video, including:
the data acquisition module is used for acquiring streaming media video, wherein the streaming media video comprises a plurality of video frames;
the data processing module is used for obtaining first local areas of the video frame through the partition mapping relation;
carrying out characteristic information quantity statistics on the first local area, and determining a target area in the first local area according to the characteristic information quantity statistics result;
according to the target area of the video frames, determining bad frames in a plurality of video frames through a detection model;
and the data storage module is used for storing the detection model.
In a third aspect of the invention, the invention provides a computer device, comprising at least one processor; and
a memory communicatively coupled to the at least one processor;
the memory stores instructions executable by the at least one processor to cause the computer device to perform a streaming video bad frame detection method.
In a fourth aspect of the present invention, a computer readable storage medium is provided, which stores computer executable instructions; when a processor executes the computer executable instructions, the method for detecting bad frames of streaming media video is implemented.
Compared with the prior art, the invention has the following beneficial effects:
By constructing the detection model, the method determines the bad frames among the video frames before the video frames are rendered, which avoids invalid processing, saves hardware computing resources and shortens rendering time. The method divides the video frames according to target features, so that the effective pixel points representing the features of the shooting target object are concentrated in the same local area. Not all features of a video frame need to be detected, so detection is targeted and its efficiency and accuracy are improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below. It will be apparent to those of ordinary skill in the art that the drawings in the following description are exemplary only and that other implementations can be obtained from the extensions of the drawings provided without inventive effort.
FIG. 1 is a flow chart of a method for detecting bad frames of streaming media video according to an embodiment of the present invention;
FIG. 2 is a block diagram of a bad frame detection device for streaming media video according to an embodiment of the present invention;
FIG. 3 is an internal structure diagram of a computer device according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
As shown in FIG. 1, in a first aspect, the invention provides a method for detecting bad frames of streaming media video, comprising the following steps:
acquiring a streaming media video, wherein the streaming media video comprises a plurality of video frames;
obtaining first local areas of the video frame through a partition mapping relation, wherein the first local areas correspond to an area division result of the video frame;
carrying out feature information quantity statistics on the first local area, and determining a target area in the first local area according to a feature information quantity statistics result, wherein the target area corresponds to a local image area containing shooting target object features in a video frame;
and determining bad frames in the plurality of video frames through a detection model according to the target area of the video frames, wherein the detection model is a neural network.
In order to obtain the optimal video rendering effect, the method and the device perform super-resolution processing on the video frames before rendering, improve the resolution of the video frames, and enable the rendered video frames to have high resolution.
Furthermore, before the super-resolution processing, the method detects the video frames and picks out the frames commonly called bad frames, which are not given super-resolution processing. Because rendering a bad frame is meaningless, this avoids invalid or meaningless super-resolution processing, thereby saving hardware computing resources and reducing rendering time.
Furthermore, when detecting bad frames, the invention marks the image area that represents the features of the shooting target object in the video frame, namely the target area, by means of area segmentation and information quantity statistics, and performs bad frame detection on the target area. This reduces the amount of pixel data to be processed during bad frame detection and improves detection efficiency.
The target area carries the effective picture features that show the shooting target object to the audience. Quality detection of the image features in the target area is therefore effective quality detection, whereas quality detection of the background and noise features contained in non-target areas is ineffective, redundant detection. Accordingly, the invention performs bad frame detection on the local areas (target areas) of the video frame that represent the features of the shooting target object. The detection is well targeted, video frames in which the features of the shooting target object are not displayed are filtered out, and the accuracy of bad frame detection is improved.
In the invention, the local image of the video frame that represents the features of the shooting target object is to be concentrated in one or a few local areas, so that these areas mainly contain the shooting target object while the remaining local areas mainly contain the background. To this end, the invention uses the degree of difference among the local areas obtained by the segmentation: the higher the image variance, the larger the difference in pixel content among the local areas, and the closer the division is to the expected result. The specific steps are as follows:
the determining of the first local area includes:
determining the dividing number m of the first local areas through the partition mapping relation;
and carrying out equal-area division on the video frame according to the dividing number m to obtain m first local areas.
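The equal-area division can be illustrated with a minimal sketch. The patent does not state the geometry of the division, so a rectangular grid of `rows * cols = m` tiles is assumed here, and the function name `split_equal_area` is hypothetical.

```python
def split_equal_area(frame, rows, cols):
    """Split a 2-D grayscale frame (a list of pixel rows) into rows*cols equal tiles.

    Assumption: the m first local areas form a regular grid; the source only
    requires that the areas are equal.
    """
    height, width = len(frame), len(frame[0])
    if height % rows or width % cols:
        raise ValueError("frame size must be divisible by the grid size")
    tile_h, tile_w = height // rows, width // cols
    regions = []
    for r in range(rows):
        for c in range(cols):
            # Cut the tile at grid position (r, c) out of the frame.
            tile = [row[c * tile_w:(c + 1) * tile_w]
                    for row in frame[r * tile_h:(r + 1) * tile_h]]
            regions.append(tile)
    return regions


# A 4x4 frame whose pixel values are 0..15, split into m = 4 first local areas.
frame = [[x + 4 * y for x in range(4)] for y in range(4)]
regions = split_equal_area(frame, 2, 2)
```

Each returned tile covers the same number of pixels, which is what allows the regions to be compared pixel by pixel in the discreteness calculation.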
The construction of the partition mapping relation comprises the following steps:
setting the dividing number m of the first local areas, dividing the video frame into m first local areas of equal area, and calculating the image discreteness among the m first local areas, wherein the image discreteness is measured by a variance formula. The formula itself is missing from this text; a variance form consistent with the symbol definitions that follow is:
$$\delta = \frac{1}{m}\sum_{k=1}^{m}\lVert S_k - S_E\rVert^2,\qquad \lVert S_k - S_E\rVert^2 = \frac{1}{N}\sum_{i=1}^{N}\left(x_{k,i}-x_{E,i}\right)^2$$
where $\delta$ is the image discreteness, $S_k$ is the image matrix of the $k$-th first local area, $S_E$ is the mean image matrix of the $m$ first local areas, $x_{E,i}$ is the pixel value of the $i$-th pixel point in the mean image matrix $S_E$, $x_{k,i}$ is the pixel value of the $i$-th pixel point in the image matrix $S_k$ of the $k$-th first local area, $N$ is the number of pixel points of the image matrix, and $i$ and $k$ are counting indices;
solving for the value of m that maximizes the image discreteness among the first local areas, and dividing the video frame according to that value of m, so that the effective pixel points representing the features of the shooting target object are concentrated in the same first local area, and the ineffective pixel points representing features other than those of the shooting target object are concentrated in the same first local area.
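As a rough illustration of choosing m, the sketch below scores a few candidate grids and keeps the one with the largest discreteness. It assumes the discreteness is the mean squared deviation of each region from the pixel-wise mean region, with normalisation by m and N; that normalisation, the grid-shaped candidates and the helper names `tiles`, `discreteness` and `best_grid` are all assumptions of this sketch.

```python
def tiles(frame, rows, cols):
    """Return rows*cols equal tiles of the frame, each flattened to a pixel list."""
    tile_h, tile_w = len(frame) // rows, len(frame[0]) // cols
    return [[p
             for row in frame[r * tile_h:(r + 1) * tile_h]
             for p in row[c * tile_w:(c + 1) * tile_w]]
            for r in range(rows) for c in range(cols)]


def discreteness(regions):
    """Mean over regions of the mean squared deviation from the mean region."""
    m, n = len(regions), len(regions[0])
    # Pixel-wise mean image S_E over the m regions.
    mean = [sum(region[i] for region in regions) / m for i in range(n)]
    return sum(sum((region[i] - mean[i]) ** 2 for i in range(n)) / n
               for region in regions) / m


def best_grid(frame, candidates):
    """Pick the candidate (rows, cols) grid that maximises the discreteness."""
    return max(candidates, key=lambda grid: discreteness(tiles(frame, *grid)))


# A bright object in the top-left quadrant on a dark background.
frame = [[9, 9, 0, 0],
         [9, 9, 0, 0],
         [0, 0, 0, 0],
         [0, 0, 0, 0]]
best = best_grid(frame, [(1, 1), (1, 2), (2, 2)])
```

In this toy frame the 2x2 grid isolates the object in a single region, so it scores higher than the coarser candidates.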
The determination of the target area comprises the following steps:
carrying out feature information quantity statistics on each first local area of the video frame by using the histogram to obtain feature information quantity of each first local area;
comparing the characteristic information amount of the first partial region with a preset threshold, wherein,
when the characteristic information quantity of the first local area is larger than or equal to a preset threshold value, the first local area is marked as a target area;
and when the characteristic information quantity of the first local area is smaller than a preset threshold value, the first local area is marked as a non-target area.
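The thresholding of the histogram statistic might look like the following sketch. The source says only that a histogram is used; taking the Shannon entropy of the gray-level histogram as the "feature information quantity" is an assumption, as are the function names and the threshold value.

```python
import math
from collections import Counter


def information_amount(region):
    """Shannon entropy (in bits) of the gray-level histogram of a region.

    Assumption: entropy stands in for the feature information quantity.
    """
    pixels = [p for row in region for p in row]
    histogram = Counter(pixels)
    total = len(pixels)
    return -sum((count / total) * math.log2(count / total)
                for count in histogram.values())


def mark_targets(regions, threshold):
    """True marks a target area (information >= threshold), False a non-target area."""
    return [information_amount(region) >= threshold for region in regions]


flat = [[0, 0], [0, 0]]            # uniform background, carries no information
textured = [[0, 64], [128, 255]]   # four distinct gray levels
flags = mark_targets([flat, textured], threshold=1.0)
```

A uniform background tile has zero entropy and falls below any positive threshold, while a textured tile is kept as a target area.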
By constructing the detection model, the invention determines the bad frames among the video frames before the video frames are rendered, which avoids invalid processing, saves hardware computing resources and reduces rendering time. The details are as follows:
determining a bad frame of a plurality of video frames, comprising:
inputting all target areas of the video frame into a detection model, and outputting classification labels of the video frame by the detection model;
the classification labels include bad frame labels and non-bad frame labels.
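The classification step can be sketched as a thin wrapper around a trained model. The trained neural network is not reproduced here: `placeholder_model` is a hypothetical stand-in (not a neural network) that returns a bad-frame score, and the 0.5 cutoff and label strings are likewise assumptions.

```python
BAD, NOT_BAD = "bad_frame", "non_bad_frame"


def classify_frame(target_areas, model, cutoff=0.5):
    """Feed all target areas of one frame to the model and map its score to a label."""
    score = model(target_areas)
    return BAD if score >= cutoff else NOT_BAD


def placeholder_model(target_areas):
    """Hypothetical stand-in for the trained detection model.

    It flags a frame whose target areas are missing or entirely black;
    a real detection model would be a trained neural network.
    """
    pixels = [p for area in target_areas for row in area for p in row]
    return 1.0 if not pixels or max(pixels) == 0 else 0.0


frames_target_areas = [
    [[[0, 0], [0, 0]]],        # one all-black target area
    [[[10, 200], [30, 90]]],   # one target area with visible content
]
labels = [classify_frame(areas, placeholder_model) for areas in frames_target_areas]
```

Passing the model as a callable keeps the labelling logic independent of how the network is implemented or loaded.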
The construction of the detection model comprises the following steps:
selecting a group of video frames as sample video frames from streaming media video with known shooting target objects, and acquiring all target areas in the sample video frames;
and comparing all target areas in the sample video frame with standard images of the shot target object characteristics, wherein,
if all target areas in the sample video frame are consistent with the standard images of the shooting target object characteristics, marking the sample video frame as a non-bad frame label;
if all the target areas in the sample video frame are inconsistent with the standard images of the shooting target object characteristics, marking the sample video frame as a bad frame label;
learning and training all target areas of the sample video frames and classification labels of the sample video frames by using a neural network to obtain a detection model;
the model expression of the detection model is:
Label=CNN(g);
in the formula, Label is the classification label, g is all target areas of a sample video frame, and CNN is the neural network.
Consistency is quantified using image similarity.
All target areas in the sample video frame are consistent with the standard image of the shot target object characteristic.
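Labelling of the training samples by image similarity could be sketched as follows. The source does not name a similarity measure, so one minus the normalised mean absolute difference is assumed; the 0.9 threshold and the function names are hypothetical. The sketch also assumes that a frame is labelled bad as soon as any target area is inconsistent, a case the text leaves open.

```python
def similarity(area, reference):
    """1.0 for identical 8-bit images, falling towards 0.0 as they diverge."""
    a = [p for row in area for p in row]
    b = [p for row in reference for p in row]
    if len(a) != len(b):
        raise ValueError("area and reference must have the same size")
    mean_abs_diff = sum(abs(x - y) for x, y in zip(a, b)) / len(a)
    return 1.0 - mean_abs_diff / 255.0


def label_sample(target_areas, references, min_similarity=0.9):
    """Label a sample frame from the consistency of its target areas.

    Assumption: any inconsistent target area makes the frame a bad frame.
    """
    consistent = all(similarity(area, ref) >= min_similarity
                     for area, ref in zip(target_areas, references))
    return "non_bad_frame" if consistent else "bad_frame"


reference = [[100, 100], [100, 100]]
good = [[102, 98], [100, 101]]      # small deviations from the standard image
corrupted = [[0, 255], [0, 255]]    # heavily damaged content
good_label = label_sample([good], [reference])
bad_label = label_sample([corrupted], [reference])
```

The resulting pairs of target areas and labels are the kind of data on which the neural network would then be trained.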
As shown in FIG. 2, in a second aspect of the present invention, the present invention provides a bad frame detection device for streaming video, including:
the data acquisition module is used for acquiring streaming media video, wherein the streaming media video comprises a plurality of video frames;
the data processing module is used for obtaining first local areas of the video frame through the partition mapping relation;
carrying out characteristic information quantity statistics on the first local area, and determining a target area in the first local area according to a characteristic information quantity statistics result;
according to the target area of the video frames, determining bad frames in the plurality of video frames through a detection model;
and the data storage module is used for storing the detection model.
In a third aspect of the invention, as shown in FIG. 3, the invention provides a computer device, comprising at least one processor; and
a memory communicatively coupled to the at least one processor;
the memory stores instructions executable by the at least one processor to cause the computer device to perform the streaming media video bad frame detection method.
In a fourth aspect of the present invention, a computer readable storage medium is provided, which stores computer executable instructions; when a processor executes the computer executable instructions, the method for detecting bad frames of streaming media video is implemented.
Compared with the prior art, the invention has the following beneficial effects:
By constructing the detection model, the method determines the bad frames among the video frames before the video frames are rendered, which avoids invalid processing, saves hardware computing resources and shortens rendering time. The method divides the video frames according to target features, so that the effective pixel points representing the features of the shooting target object are concentrated in the same local area. Not all features of a video frame need to be detected, so detection is targeted and its efficiency and accuracy are improved.
The bad frame detection is applied to a streaming media video frame rendering method, which performs super-resolution rendering in a multi-factor fusion mode and specifically comprises the following steps:
acquiring a streaming media video, wherein the streaming media video comprises a plurality of video frames;
evaluating the quality of the video frames to obtain high-quality video frames and low-quality video frames;
performing super-resolution processing on the high-quality video frames to obtain super-resolution high-quality video frames;
performing video frame quality compensation on the low-quality video frames according to the super-resolution high-quality video frames to obtain super-resolution low-quality video frames;
and rendering the super-resolution high-quality video frames and the super-resolution low-quality video frames to obtain a super-resolution rendering result of the video frames.
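The flow of these steps can be sketched end to end. Neither the super-resolution model nor the compensation method is specified at this point, so nearest-neighbour 2x upscaling stands in for super-resolution, and compensation is assumed to reuse the most recent super-resolved high-quality frame; both choices and all names are hypothetical.

```python
def upscale2x(frame):
    """Nearest-neighbour 2x upscaling, a stand-in for a super-resolution model."""
    out = []
    for row in frame:
        wide = [p for p in row for _ in range(2)]  # repeat each pixel horizontally
        out.append(wide)
        out.append(list(wide))                     # repeat the row vertically
    return out


def process_stream(frames, is_bad):
    """Super-resolve high-quality frames, then compensate the low-quality ones."""
    results = [None] * len(frames)
    # Only frames not flagged as bad go through super-resolution.
    for index, frame in enumerate(frames):
        if not is_bad[index]:
            results[index] = upscale2x(frame)
    # Assumed compensation: reuse the latest super-resolved high-quality frame.
    # A bad frame with no earlier good frame stays None.
    last_good = None
    for index in range(len(frames)):
        if results[index] is not None:
            last_good = results[index]
        elif last_good is not None:
            results[index] = last_good
    return results


frames = [[[1]], [[7]], [[3]]]   # three 1x1 frames; the middle one is a bad frame
output = process_stream(frames, is_bad=[False, True, False])
```

The bad frame never reaches the upscaling step, which is where the saving in computing resources comes from.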
In order to obtain the optimal video rendering effect, the method and the device perform super-resolution processing on the video frames before rendering, improve the resolution of the video frames, and enable the rendered video frames to have high resolution.
To improve the effect of super-resolution processing, the invention highlights the features of important areas in the video frames and suppresses noise so as to obtain the best resolution improvement. Using the idea of multi-factor fusion, it applies several attention models to the super-resolution processing, namely a channel attention model, a spatial attention model and a multi-head self-attention model, and fuses the complementary advantages of the three models to improve the resolution enhancement effect.
When fusing the advantages of the channel attention model, the spatial attention model and the multi-head self-attention model, the invention uses a neural network to determine the fusion weights objectively and automatically, so that the advantages of the three models are exploited to the greatest extent in the fusion and the resolution improvement effect is optimized.
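A weighted fusion of the three attention outputs can be sketched as below. The source states only that a neural network determines the fusion weights; normalising the network's raw scores with a softmax and fusing by weighted sum are assumptions, and the feature values are made up for illustration.

```python
import math


def softmax(logits):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(feature_maps, logits):
    """Weighted sum of equally sized feature vectors, one per attention model."""
    weights = softmax(logits)
    length = len(feature_maps[0])
    return [sum(w * fm[i] for w, fm in zip(weights, feature_maps))
            for i in range(length)]


# Hypothetical outputs of the channel, spatial and multi-head self-attention models.
channel, spatial, self_attention = [1.0, 2.0], [3.0, 4.0], [5.0, 6.0]
# Equal raw scores give each model the same weight; a network would learn these.
fused = fuse([channel, spatial, self_attention], logits=[0.0, 0.0, 0.0])
```

With equal scores the fusion reduces to a plain average; learned scores would shift the result towards whichever attention model is most useful.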
Furthermore, before the super-resolution processing of the video frames, the method detects the video frames, and selects the video frames with low video quality (commonly called bad frames) from the video frames, namely selects the video frames commonly called bad frames from the video frames, and does not perform the super-resolution processing, so that the bad frames have no rendering processing meaning, and the detection of the video frames can avoid invalid or nonsensical super-resolution processing.
Before the super-resolution processing, detecting the video frames and selecting the video frames with low video quality (commonly called bad frames) comprises the following steps:
obtaining first local areas of the video frame through the division mapping relation, wherein the first local areas correspond to the area division result of the video frame, that is, the plurality of divided areas of the video frame;
carrying out feature information quantity statistics on the first local areas, and determining target areas among the first local areas according to the statistics result, wherein a target area corresponds to a local image area of the video frame that contains features of the shooting target object;
determining, according to the target areas of the video frames, the bad frames (video frames with low video quality) among the plurality of video frames through a detection model, wherein the detection model is a neural network.
Further, determining a bad frame of the plurality of video frames includes:
inputting all target areas of the video frame into a detection model, and outputting classification labels of the video frame by the detection model;
the classification labels comprise bad frame labels and non-bad frame labels;
the video frames with bad frame labels are used as low video quality video frames, and the video frames with non-bad frame labels are used as high video quality video frames.
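As a rough illustration of how the classification labels separate the two groups of frames, the sketch below splits a sequence of frames using a caller-supplied classifier. The function `split_by_label`, the label strings and the `classify` callable are hypothetical stand-ins; the actual detection model is the neural network described in the text.

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

Frame = TypeVar("Frame")

BAD = "bad"          # hypothetical name for the bad frame label
NON_BAD = "non_bad"  # hypothetical name for the non-bad frame label

def split_by_label(
    frames: Iterable[Frame],
    classify: Callable[[Frame], str],
) -> Tuple[List[Frame], List[Frame]]:
    """Return (high_quality_frames, low_quality_frames).

    `classify` stands in for the detection model and must return BAD or NON_BAD.
    """
    high: List[Frame] = []
    low: List[Frame] = []
    for frame in frames:
        (low if classify(frame) == BAD else high).append(frame)
    return high, low
```

Only the first list would then be passed on to super-resolution processing.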
In video frame rendering based on multi-factor fusion, subsequent steps such as super-resolution processing, video frame quality compensation and rendering can thus be skipped for video frames with low video quality, which reduces invalid steps and safeguards the accuracy of the rendering effect.
The above embodiments are only exemplary embodiments of the present application and are not intended to limit the present application, the scope of which is defined by the claims. Various modifications and equivalent arrangements may be made to the present application by those skilled in the art, which modifications and equivalents are also considered to be within the scope of the present application.

Claims (9)

1. The method for detecting the bad frames of the streaming media video is characterized by comprising the following steps:
acquiring a streaming media video, wherein the streaming media video comprises a plurality of video frames;
obtaining a first local area of the video frame through a division mapping relation, wherein the first local area corresponds to an area division result of the video frame;
carrying out feature information quantity statistics on the first local area, and determining a target area in the first local area according to the feature information quantity statistics result, wherein the target area corresponds to a local image area containing shooting target object features in a video frame;
according to the target area of the video frames, determining bad frames in a plurality of video frames through a detection model, wherein the detection model is a neural network;
the construction of the division mapping relation comprises the following steps:
setting the division number $m$ of the first local areas, dividing the video frame into $m$ first local areas of equal area, and calculating the image discreteness among the $m$ first local areas, wherein the image discreteness is measured by a variance formula [the quantization formula is not reproduced in this text]; in the formula, $\delta$ is the image discreteness, $S_k$ is the image matrix of the $k$-th first local area, $S_E$ is the mean image matrix of the $m$ first local areas, $x_{E,i}$ is the pixel value of the $i$-th pixel point in the mean image matrix $S_E$, $x_{k,i}$ is the pixel value of the $i$-th pixel point in the image matrix $S_k$ of the $k$-th first local area, $N$ is the number of pixel points of an image matrix, and $i$ and $k$ are counting indices; the value of $m$ is solved by maximizing the image discreteness among the first local areas, and the video frame is divided according to the obtained value of $m$, so that effective pixel points representing features of the shooting target object are concentrated in the same first local area and ineffective pixel points representing features other than the shooting target object are concentrated in the same first local area.
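Because the quantization formula itself does not survive in this text, the following is only a sketch of one plausible reading of the symbol definitions: the mean squared deviation of every region pixel $x_{k,i}$ from the corresponding mean-image pixel $x_{E,i}$, averaged over all $m$ regions and $N$ pixels. That normalisation, the use of horizontal strips as the equal-area division, and the function names are all assumptions.

```python
import numpy as np

def split_equal_area(frame, m):
    """Split a 2-D grayscale frame into m equal-height horizontal strips.

    Horizontal strips are an assumed layout; the text only requires equal area.
    """
    h = frame.shape[0] - frame.shape[0] % m  # drop leftover rows, if any
    return frame[:h].reshape(m, h // m, frame.shape[1]).astype(np.float64)

def image_discreteness(regions):
    """Assumed variance-style measure: mean of (x_{k,i} - x_{E,i})^2 over k and i."""
    mean_region = regions.mean(axis=0)  # plays the role of S_E
    return float(np.mean((regions - mean_region) ** 2))

def choose_m(frame, candidates):
    """Pick the candidate division number that maximises the discreteness."""
    return max(
        candidates,
        key=lambda m: image_discreteness(split_equal_area(frame, m)),
    )
```

For a frame whose top half is dark and bottom half is bright, a single region has zero discreteness, while two strips separate the halves and score higher, so `choose_m` prefers the split that isolates the distinct content.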
2. The method for detecting bad frames of streaming media video according to claim 1, wherein the method comprises the following steps:
the determining of the first local area includes:
determining the dividing number m of the first local area through the dividing mapping relation;
and carrying out equal-area division on the video frame according to the division number m to obtain m first local areas.
3. The method for detecting bad frames of streaming media video according to claim 1, wherein the method comprises the following steps: the determining of the target area includes:
carrying out feature information quantity statistics on each first local area of the video frame using a histogram to obtain the feature information quantity of each first local area;
comparing the characteristic information amount of the first partial region with a preset threshold, wherein,
when the characteristic information quantity of the first local area is larger than or equal to a preset threshold value, the first local area is marked as a target area;
and when the characteristic information quantity of the first local area is smaller than a preset threshold value, the first local area is marked as a non-target area.
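The thresholding in claim 3 can be sketched as follows. The claim says only that a histogram is used to obtain the feature information quantity; taking the Shannon entropy of the gray-level histogram as that quantity is an assumption made here for illustration, as are the function names and the 8-bit pixel range.

```python
import numpy as np

def feature_information(region, bins=256):
    """Shannon entropy (in bits) of the gray-level histogram of a region.

    Entropy is an assumed stand-in for the feature information quantity.
    Pixel values are assumed to lie in the 8-bit range [0, 256).
    """
    hist, _ = np.histogram(region, bins=bins, range=(0, 256))
    p = hist[hist > 0] / hist.sum()
    return float(-(p * np.log2(p)).sum())

def select_target_regions(regions, threshold):
    """Indices of regions whose information quantity is >= the preset threshold."""
    return [
        i for i, region in enumerate(regions)
        if feature_information(region) >= threshold
    ]
```

A uniformly flat region carries no histogram entropy and falls below any positive threshold, whereas a region with varied pixel values is kept as a target area.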
4. The method for detecting bad frames of streaming media video according to claim 3, wherein the method comprises the following steps:
determining a bad frame of a plurality of video frames, comprising:
inputting all target areas of the video frame into a detection model, and outputting classification labels of the video frame by the detection model;
the classification labels include bad frame labels and non-bad frame labels.
5. The method for detecting bad frames of streaming media video according to claim 4, wherein the method comprises the following steps:
the construction of the detection model comprises the following steps:
selecting a group of video frames as sample video frames from streaming media video whose shooting target object is known, and acquiring all target areas in the sample video frames;
and comparing all target areas in the sample video frame with standard images of the shooting target object features, wherein,
if all target areas in the sample video frame are consistent with the standard images of the shooting target object features, marking the sample video frame with a non-bad frame label;
if all target areas in the sample video frame are inconsistent with the standard images of the shooting target object features, marking the sample video frame with a bad frame label;
learning and training all target areas of the sample video frames and classification labels of the sample video frames by using a neural network to obtain the detection model;
the model expression of the detection model is: Label = CNN(g), where Label is the classification label, g is all target areas of a sample video frame, and CNN is the neural network.
6. The method for detecting bad frames of streaming media video according to claim 5, wherein the method comprises the steps of:
the consistency is quantified using image similarity.
7. The method for detecting bad frames of streaming media video according to claim 5, wherein the method comprises the steps of:
all target areas in the sample video frame have the same image specification as the standard image of the shooting target object features.
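The sample-labelling rule of claims 5 to 7 can be sketched as below. The claims state that consistency is quantified by image similarity but do not name a metric; the mean-absolute-difference similarity, the 0.9 threshold and the function names are assumptions. The claims also leave the mixed case (some areas consistent, some not) unspecified, so the sketch returns `None` there rather than guessing.

```python
import numpy as np

def similarity(a, b):
    """Assumed similarity in [0, 1]: 1 minus the mean absolute difference / 255.

    Both images are assumed to be 8-bit and of identical shape (cf. claim 7).
    """
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return 1.0 - float(np.mean(np.abs(a - b))) / 255.0

def label_sample(target_regions, standard, threshold=0.9):
    """Label a sample frame from its target areas and a standard image."""
    consistent = [similarity(r, standard) >= threshold for r in target_regions]
    if all(consistent):
        return "non_bad"   # every target area matches the standard image
    if not any(consistent):
        return "bad"       # no target area matches the standard image
    return None            # mixed case: not specified by the claims
```

The labelled pairs (target areas, label) would then serve as training data for the neural network in Label = CNN(g).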
8. The bad frame detection device of the streaming media video is characterized by comprising the following components:
the data acquisition module is used for acquiring streaming media video, wherein the streaming media video comprises a plurality of video frames;
the data processing module is used for obtaining a first local area of the video frame through a division mapping relation, wherein the first local area corresponds to an area division result of the video frame;
carrying out characteristic information quantity statistics on the first local area, and determining a target area in the first local area according to the characteristic information quantity statistics result;
according to the target area of the video frames, determining bad frames in a plurality of video frames through a detection model;
the data storage module is used for storing the detection model;
the construction of the division mapping relation comprises the following steps:
setting the division number $m$ of the first local areas, dividing the video frame into $m$ first local areas of equal area, and calculating the image discreteness among the $m$ first local areas, wherein the image discreteness is measured by a variance formula [the quantization formula is not reproduced in this text]; in the formula, $\delta$ is the image discreteness, $S_k$ is the image matrix of the $k$-th first local area, $S_E$ is the mean image matrix of the $m$ first local areas, $x_{E,i}$ is the pixel value of the $i$-th pixel point in the mean image matrix $S_E$, $x_{k,i}$ is the pixel value of the $i$-th pixel point in the image matrix $S_k$ of the $k$-th first local area, $N$ is the number of pixel points of an image matrix, and $i$ and $k$ are counting indices; the value of $m$ is solved by maximizing the image discreteness among the first local areas, and the video frame is divided according to the obtained value of $m$, so that effective pixel points representing features of the shooting target object are concentrated in the same first local area and ineffective pixel points representing features other than the shooting target object are concentrated in the same first local area.
9. A computer device, characterized by comprising: at least one processor; and
a memory communicatively coupled to the at least one processor;
wherein the memory stores instructions executable by the at least one processor to cause a computer device to perform the method of any of claims 1-7.
CN202311316549.7A 2023-10-12 2023-10-12 Method and device for detecting bad frames of streaming media video and computer equipment Active CN117061825B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311316549.7A CN117061825B (en) 2023-10-12 2023-10-12 Method and device for detecting bad frames of streaming media video and computer equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311316549.7A CN117061825B (en) 2023-10-12 2023-10-12 Method and device for detecting bad frames of streaming media video and computer equipment

Publications (2)

Publication Number Publication Date
CN117061825A CN117061825A (en) 2023-11-14
CN117061825B true CN117061825B (en) 2024-01-26

Family

ID=88663049

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311316549.7A Active CN117061825B (en) 2023-10-12 2023-10-12 Method and device for detecting bad frames of streaming media video and computer equipment

Country Status (1)

Country Link
CN (1) CN117061825B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013141872A1 (en) * 2012-03-23 2013-09-26 Hewlett-Packard Development Company, L.P. Method and system to process a video frame using prior processing decisions
GB201909447D0 (en) * 2019-07-01 2019-08-14 Sony Interactive Entertainment Inc Method and device for generating video frames
CN111222487A (en) * 2020-01-15 2020-06-02 浙江大学 Video target behavior recognition method and electronic device
US11032511B1 (en) * 2020-04-15 2021-06-08 Novatek Microelectronics Corp. Frame interpolation method and related video processor
CN114742992A (en) * 2022-04-07 2022-07-12 展讯通信(天津)有限公司 Video abnormity detection method and device and electronic equipment
US11516538B1 (en) * 2020-09-29 2022-11-29 Amazon Technologies, Inc. Techniques for detecting low image quality
CN115908142A (en) * 2023-01-06 2023-04-04 诺比侃人工智能科技(成都)股份有限公司 Contact net tiny part damage testing method based on visual recognition
CN116708753A (en) * 2022-12-19 2023-09-05 荣耀终端有限公司 Method, device and storage medium for determining the cause of preview freezing

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11533427B2 (en) * 2021-03-22 2022-12-20 International Business Machines Corporation Multimedia quality evaluation

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
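Low-quality video face recognition method based on super-resolution reconstruction (基于超分辨率重建的低质量视频人脸识别方法); Lu Yaoyao et al.; Wanfang platform; full text *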

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20231114

Assignee: Lenovo (Beijing) Co.,Ltd.

Assignor: Shenzhen Yuntian Changxiang Information Technology Co.,Ltd.

Contract record no.: X2025980007188

Denomination of invention: Method, device, and computer equipment for detecting bad frames in streaming video

Granted publication date: 20240126

License type: Exclusive License

Record date: 20250416