WO2020224428A1 - Method for embedding information in a video, computer device, and storage medium - Google Patents
Method for embedding information in a video, computer device, and storage medium
- Publication number
- WO2020224428A1 PCT/CN2020/085939
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- frame
- implanted
- detected
- model
- pixel
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N5/00—Details of television systems
- H04N5/222—Studio circuitry; Studio devices; Studio equipment
- H04N5/262—Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects ; Cameras specially adapted for the electronic generation of special effects
- H04N5/272—Means for inserting a foreground image in a background image, i.e. inlay, outlay
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—Two-dimensional [2D] image generation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—Two-dimensional [2D] image generation
- G06T11/60—Creating or editing images; Combining images with text
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/28—Quantising the image, e.g. histogram thresholding for discrimination between background and foreground patterns
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/56—Extraction of image or video features relating to colour
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N23/00—Cameras or camera modules comprising electronic image sensors; Control thereof
- H04N23/60—Control of cameras or camera modules
- H04N23/63—Control of cameras or camera modules by using electronic viewfinders
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N23/00—Cameras or camera modules comprising electronic image sensors; Control thereof
- H04N23/60—Control of cameras or camera modules
- H04N23/66—Remote control of cameras or camera parts, e.g. by remote control devices
- H04N23/661—Transmitting camera control signals through networks, e.g. control via the Internet
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N9/00—Details of colour television systems
- H04N9/64—Circuits for processing colour signals
- H04N9/68—Circuits for processing colour signals for controlling the amplitude of colour signals, e.g. automatic chroma control circuits
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N23/00—Cameras or camera modules comprising electronic image sensors; Control thereof
- H04N23/70—Circuitry for compensating brightness variation in the scene
- H04N23/71—Circuitry for evaluating the brightness variation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N5/00—Details of television systems
- H04N5/222—Studio circuitry; Studio devices; Studio equipment
- H04N5/262—Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects ; Cameras specially adapted for the electronic generation of special effects
- H04N5/2628—Alteration of picture size, shape, position or orientation, e.g. zooming, rotation, rolling, perspective, translation
Definitions
- This application relates to graphics and image technology, and in particular to a method for embedding information in a video, a computer device and a storage medium.
- Video is the current mainstream information carrier. With the development of the Internet, especially the mobile Internet, the speed of video transmission has increased rapidly, making video an important channel for information transmission.
- Video information implantation refers to superimposing various information, such as promotional information, including images, text, or a combination of the two, in the background of the video without affecting the main content of the video (for example, foreground content).
- The main content of the video (such as the characters in the video, the special effects added in the post-production of the video, etc.) is presented in the form of foreground content.
- The information therefore needs to be integrated into the background content of the video.
- Related technologies lack effective solutions.
- the embodiments of the present application provide a method, computer equipment, and storage medium for embedding information in a video, which can efficiently integrate information into the background content of the video.
- the embodiment of the application provides a method for embedding information in a video, including:
- the information to be implanted after applying the template is overlaid on the implanted area in the frame to be detected, so that the foreground is highlighted relative to the information to be implanted.
- An embodiment of the present application provides a device for embedding information in a video, including:
- a model building module is used to build a model that conforms to the pixel distribution characteristics of the implanted area in the reference frame, and to control the update of the model based on the frame to be detected subsequent to the reference frame;
- a template generating module configured to identify the background and foreground of the implanted area in the frame to be detected based on the model, and generate a template for occluding the background and revealing the foreground;
- a template application module configured to apply the information to be implanted to the template to shield the content in the information to be implanted that would obscure the foreground;
- the information covering module is used for covering the information to be implanted after applying the template to the implantation area in the frame to be detected, so that the foreground is highlighted relative to the information to be implanted.
- the device further includes:
- a parameter initialization module configured to initialize, for each pixel of the implanted area in the reference frame, at least one sub-model corresponding to the pixel and the weight corresponding to the at least one sub-model;
- the weight mixing module is used to mix the sub-models constructed corresponding to each pixel based on the initialized weights to form a model corresponding to the pixel.
- the device further includes:
- a weight retention module configured to reduce the rate at which the model is fitted to the implanted area in the to-be-detected frame in response to the implanted area in the to-be-detected frame being blocked by the foreground;
- a fitting acceleration module configured to increase the rate at which the model is fitted to the implanted area in the to-be-detected frame, in response to the implanted area in the to-be-detected frame not being blocked by the foreground and the illumination of the implanted area in the to-be-detected frame having changed.
- the device further includes:
- a parameter update module configured to, in response to a pixel of the implanted area in the frame to be detected matching at least one sub-model in the corresponding model, update the parameters of the matched sub-model and keep the parameters of the unmatched sub-models in the corresponding model unchanged.
- the device further includes:
- the first matching module is configured to match the color value of each pixel in the implanted area in the frame to be detected with the sub-model in the model corresponding to the pixel;
- the recognition module is used for recognizing the pixels that are successfully matched as the pixels of the background, and the pixels that are not matched as the pixels of the foreground.
- the device further includes:
- a filling module configured to fill binary ones in the positions of an empty template that correspond to the pixels identified as background in the implanted area in the frame to be detected, and
- to fill binary zeros in the positions of the template filled with binary ones that correspond to the pixels identified as foreground.
- the device further includes:
- the arithmetic module is used to multiply the information to be implanted with the binary number filled in each position in the template.
- the device further includes:
- the second matching module is configured to match the features extracted from the implanted region in the reference frame of the video with the features extracted from the frame to be detected in response to the video being formed using a motion lens;
- the area determining module is configured to determine that the frame to be detected includes the implanted area corresponding to the implanted area in the reference frame in response to the successful matching.
- the device further includes:
- the area transformation module is used to respond to the video being formed with a motion lens
- a template inverse transformation module configured to apply the inverse of the transformation to the template before the template is applied to the information to be implanted, so that the position of each binary number in the transformed template matches the position of the corresponding pixel in the implanted area of the frame to be detected.
- the device further includes:
- a region positioning module configured to, in response to the video being formed using a static lens, locate the region at the corresponding position in the frame to be detected based on the position of the implanted region in the reference frame, so as to determine the implanted region in the frame to be detected.
- the device further includes:
- a first determining module configured to determine that the implanted area in the frame to be detected is blocked by the foreground, in response to the first color space distribution of the implanted area in the frame to be detected and the first color space distribution of the implanted area in the reference frame meeting a first difference condition;
- a second determining module configured to determine that the illumination of the implanted area in the frame to be detected has changed, in response to the second color space distribution of the implanted area in the frame to be detected and the second color space distribution of the implanted area in the reference frame meeting a second difference condition.
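- As an illustration of how the weight retention and fitting acceleration modules could use the two determinations above, the following minimal sketch selects a fitting (learning) rate from two boolean flags. The function name `choose_learning_rate`, the base rate, and both scaling factors are assumptions made for this example; the text does not specify them.

```python
def choose_learning_rate(occluded, illumination_changed,
                         base_rate=0.01, slow_factor=0.1, fast_factor=10.0):
    """Pick the rate at which the pixel model is fitted to the current frame.

    All numeric defaults are hypothetical illustration values.
    """
    if occluded:
        # Foreground blocks the implanted area: fit more slowly so the
        # model does not start learning the foreground as background.
        return base_rate * slow_factor
    if illumination_changed:
        # Unoccluded but lighting changed: fit faster to track the new look.
        return min(1.0, base_rate * fast_factor)
    return base_rate


slow = choose_learning_rate(occluded=True, illumination_changed=False)
fast = choose_learning_rate(occluded=False, illumination_changed=True)
normal = choose_learning_rate(occluded=False, illumination_changed=False)
```

Occlusion is checked first here, so an occluded frame never accelerates fitting even if its lighting also changed; that ordering is a design choice of the sketch.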
- An embodiment of the present application provides a computer device, including:
- a memory configured to store executable instructions; and
- the processor is configured to implement the method provided in the embodiment of the present application when executing the executable instructions stored in the memory.
- An embodiment of the present application provides a storage medium that stores executable instructions for causing a processor to execute, to implement the method provided in the embodiment of the present application.
- the embodiment of the present application provides a computer program product, and the computer program product stores a computer program, which is used to implement the method provided in the embodiment of the present application when it is loaded and executed by a processor.
- FIG. 1A is a schematic diagram of processing an image using a mask in an embodiment of the present application
- FIG. 1B is a schematic diagram of an application scenario provided by an embodiment of the application.
- FIG. 2 is a schematic diagram of an optional structure of a device provided by an embodiment of the present application.
- FIG. 3 is a schematic diagram of the implementation process of a method for embedding information in a video according to an embodiment of the present application
- FIG. 4 is a schematic diagram of the implementation process of constructing and updating a model in an embodiment of the present application.
- FIG. 5 is a schematic diagram of another implementation process of the method for embedding information in a video according to an embodiment of the application;
- FIG. 6 is a schematic diagram of another implementation process of the method for embedding information in a video according to an embodiment of the application;
- FIG. 7 is a schematic diagram of another implementation process of embedding information in a video according to an embodiment of the application.
- FIG. 8A is a schematic diagram of the effect of embedding information in a video formed by using a static lens according to an embodiment of the application;
- FIG. 8B is a schematic diagram of another effect of embedding information in a video formed by using a static lens according to an embodiment of the application.
- FIG. 8C is a schematic diagram of the effect of embedding information in a video formed by a dynamic lens according to an embodiment of the application.
- FIG. 8D is a schematic diagram of another effect of embedding information in a video formed by a dynamic lens according to an embodiment of the application.
- FIG. 9A is a schematic diagram of using morphology to improve the mask in an embodiment of the application.
- FIG. 9B is another schematic diagram of using morphology to improve the mask according to the embodiment of the application.
- The terms "first/second/third" are only used to distinguish similar objects and do not represent a specific order of objects. Understandably, where permitted, the specific order or sequence of "first/second/third" can be interchanged, so that the embodiments of the present application described herein can be implemented in a sequence other than those illustrated or described herein.
- A mask, also called a filter or a template, is an image used to mask some or all of the pixels in the image to be processed, so that a specific part of the image is highlighted.
- the mask can be a two-dimensional matrix array, and sometimes a multi-value image is also used.
- the image mask is mainly used to shield certain areas of the image.
- a 3*3 image shown in 101 in FIG. 1A is calculated with a 3*3 mask shown in 102 in FIG. 1A to obtain a result image shown in 103 in FIG. 1A.
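- The calculation illustrated in FIG. 1A can be sketched as an element-wise product of an image and a mask. The pixel and mask values below are made up for illustration (the figure's actual values are not reproduced here), and `apply_mask` is a hypothetical helper name.

```python
def apply_mask(image, mask):
    """Multiply each pixel by the mask value at the same position."""
    return [[pixel * m for pixel, m in zip(image_row, mask_row)]
            for image_row, mask_row in zip(image, mask)]


# A 3*3 single-channel image and a 3*3 binary mask (illustrative values).
image = [[10, 20, 30],
         [40, 50, 60],
         [70, 80, 90]]
mask = [[0, 1, 0],
        [1, 1, 1],
        [0, 1, 0]]

# Pixels under a 0 are shielded; pixels under a 1 are kept unchanged.
result = apply_mask(image, mask)
```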
- A static lens, that is, a fixed shot (Fixed Shot, FS), is a shot taken with a fixed camera position, lens optical axis, and focal length.
- The objects in a static-lens video can be static or dynamic (moving in and out of the picture), but the frame enclosing the picture does not move; that is, the picture range and the field-of-view area remain constant.
- A motion lens is a shot taken using various movements (such as changes in camera position, optical axis, and focal length).
- The frame enclosing the picture in a motion-lens video can change; that is, the picture range and the field-of-view area can change, for example in the distance, size, and angle of the image.
- Background: the scene behind the subject in the video picture, which can express the space-time environment where a character or event is located, such as the buildings, walls, and ground behind a character.
- Foreground: the content in the video picture that is closer to the lens than the background and is the main body of the video, such as a person standing in front of a building.
- Background subtraction: a fixed threshold is set manually, the new area containing the potential foreground in the video is subtracted from the original background area, and the difference is compared with the threshold to determine whether the background is occluded by the foreground, after which a mask is formed for the occluded part.
- This solution's judgment of foreground and background relies on a manually selected threshold, so the degree of automation is low and frequent adjustment is required; when the colors of the foreground and background are similar, the subtraction between the foreground and the background is incomplete and the accuracy is low.
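- The dependence of background subtraction on the manually chosen threshold can be seen in a minimal sketch. The helper name `subtraction_mask`, the single-channel pixel values, and the convention of 1 for background and 0 for occluded pixels are assumptions made for this example.

```python
def subtraction_mask(background, frame, threshold):
    """Return 1 where the pixel still looks like background, 0 where occluded."""
    return [[1 if abs(f - b) <= threshold else 0
             for b, f in zip(background_row, frame_row)]
            for background_row, frame_row in zip(background, frame)]


background = [[100, 100],
              [100, 100]]
frame = [[102, 180],   # 180: clearly different from the background
         [99, 110]]    # 110: a foreground pixel with a similar color

# The same frame yields different masks depending on the hand-picked threshold.
mask_loose = subtraction_mask(background, frame, threshold=20)
mask_strict = subtraction_mask(background, frame, threshold=5)
```

With the loose threshold, the pixel valued 110 is treated as background; with the strict one it is treated as occluded, which is why such a threshold needs frequent manual adjustment.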
- Gaussian mixture background modeling of static shots is to model the background of static shots with no occlusion, and use the model for subsequent image frames to determine whether the background is occluded by the foreground to form a mask for the occluded part.
- the solution can only be used for fixed-lens video. If it is a moving-lens video, it is easy to recognize the background as the foreground, and the accuracy is also low.
- Trajectory classification is to calibrate the target point of interest in the initial frame, use the motion tracking model to obtain the trajectory of the feature points in the implanted information, and to distinguish the foreground and background based on the trajectory.
- the solution is sensitive to the noise in the image frame, and the accuracy depends on the motion tracking model. If the selected motion tracking model is not suitable, the discrimination accuracy of foreground and background will be greatly affected.
- The embodiments of the present application provide a method for embedding information in a video that builds its model from the video sequence and full-pixel statistics. For static-lens videos, the method automatically selects the background for modeling, automatically updates the learning rate on subsequent frames to optimize the model, and uses statistical features to determine whether occlusion exists and to form a mask. For motion-lens videos, the method uses a transformation to map each frame to the standard picture of the reference frame for pixel statistical modeling, and then maps the result back to the picture of the subsequent frame to obtain the occlusion mask. No motion tracking model is required, and the method offers high real-time performance, a wide application range, strong robustness, and automatic, efficient use.
- The devices provided in the embodiments of the present application can be implemented as mobile terminals with wireless communication capabilities, such as mobile phones, tablet computers, and notebook computers, and can also be implemented as terminals that are inconvenient to move.
- the device provided in the implementation of this application can also be implemented as a server, and the server can refer to one server, or can be a server cluster composed of multiple servers, a cloud computing center, etc., which is not limited herein.
- FIG. 1B is a schematic diagram of an application scenario provided by an embodiment of the application.
- the terminal 400 is connected to the server 200 through a network 300.
- The network 300 may be a wide area network or a local area network, or a combination of the two, and uses wireless links to realize data transmission.
- the implanted information may be an advertisement, and the video may be a video recorded by the terminal.
- The terminal 400 may send the video and the information to be implanted to the server 200 and request the server 200 to implant the information in the video.
- The server 200 then uses the method of implanting information in a video provided in the embodiments of the present application to add the information to be implanted into the video, performs encapsulation to obtain the encapsulated video file, and finally sends the encapsulated video file to the terminal 400.
- the terminal 400 can publish the video embedded with the advertisement.
- In some embodiments, after the terminal 400 has recorded the video and determined the information to be implanted, the terminal 400 itself may use the method for implanting information in a video provided by the embodiments of this application to add the information to be implanted to each frame of the video, encapsulate the result to obtain the video file, and then publish the video embedded with the advertisement through the APP for watching videos. It should be noted that, to limit the amount of computation on the terminal and keep implantation efficient, the terminal generally performs information implantation itself only for relatively short videos.
- In other embodiments, the video is a video stored in the server 200.
- In this case, the terminal 400 may send the information to be implanted and the identification information of the video to the server 200, to request that the server 200 add the information to be implanted into the video corresponding to the identification information.
- the server 200 determines the corresponding video file based on the identification information, embeds the information to be implanted into the video file, and finally encapsulates to obtain the encapsulated video file, and then sends the encapsulated video file to the terminal 400.
- The device provided in the embodiment of the present application may be implemented in hardware or in a combination of software and hardware.
- the following describes various exemplary implementations of the device provided in the embodiment of the present application.
- FIG. 2 is a schematic diagram of an optional structure of a server 200 provided in an embodiment of the present application.
- the server 200 may be a desktop server, or a server cluster composed of multiple servers, a cloud computing center, etc.
- An exemplary structure of the device when implemented as a server can be foreseen, so the structure described here should not be regarded as a limitation. For example, some components described below may be omitted, or components not described below may be added to adapt to the special needs of some applications.
- The server 200 shown in FIG. 2 includes: at least one processor 210, a memory 240, at least one network interface 220, and a user interface 230. The components in the server 200 are coupled together through the bus system 250. It can be understood that the bus system 250 is used to implement connection and communication between these components. In addition to the data bus, the bus system 250 also includes a power bus, a control bus, and a status signal bus. However, for clarity of description, the various buses are all marked as the bus system 250 in FIG. 2.
- the user interface 230 may include a display, a keyboard, a mouse, a trackball, a click wheel, keys, buttons, a touch panel or a touch screen, etc.
- the memory 240 may be a volatile memory or a non-volatile memory, and may also include both volatile and non-volatile memory.
- the non-volatile memory may be a read only memory (ROM, Read Only Memory).
- the volatile memory may be a random access memory (RAM, Random Access Memory).
- the memory 240 in the embodiment of the present application can store data to support the operation of the server 200.
- Examples of these data include: any computer programs used to operate on the server 200, such as operating systems and application programs.
- the operating system contains various system programs, such as a framework layer, a core library layer, a driver layer, etc., for implementing various basic services and processing hardware-based tasks.
- Applications can include various applications.
- the method provided by the embodiments of the present application may be directly embodied as a combination of software modules executed by the processor 210.
- The software modules may be located in a storage medium, and the storage medium is located in the memory 240.
- the processor 210 reads the executable instructions included in the software module in the memory 240, and combines necessary hardware (for example, including the processor 210 and other components connected to the bus 250) to complete the method provided in the embodiment of the present application.
- The processor 210 may be an integrated circuit chip with signal processing capabilities, such as a general-purpose processor, a digital signal processor (DSP, Digital Signal Processor), or another programmable logic device, discrete gate or transistor logic device, or discrete hardware component, where the general-purpose processor may be a microprocessor or any conventional processor.
- the method for implementing the embodiment of the present application will be described in conjunction with the foregoing exemplary application and implementation of the apparatus for implementing the embodiment of the present application.
- the method provided in the embodiment of the present application is applied to an execution device, and the execution device may be a server or a terminal. That is to say, the method provided in the embodiment of the present application may be executed by the server, or may also be executed by the terminal.
- the server can be a desktop server, a server cluster composed of multiple servers, a cloud computing center, etc.
- The terminal can be a mobile terminal with wireless communication capabilities, such as a mobile phone, a tablet computer, or a notebook computer, and can also be implemented as a device with computing functions that is inconvenient to move, such as a desktop computer.
- FIG. 3 is a schematic diagram of the implementation flow of the method for embedding information in a video according to an embodiment of the present application, which will be described in conjunction with the steps shown in FIG. 3.
- Step S101: Construct a model that conforms to the pixel distribution characteristics of the implanted area in the reference frame, and control the update of the model based on the frame to be detected subsequent to the reference frame.
- the reference frame may be an image after the information has been implanted, and the area where the information is implanted is the implanted area.
- the reference frame and the implanted area can be artificially set, or can be automatically filtered using technologies such as machine learning and deep learning.
- the reference frame may be an image frame in the video that includes at least an implanted area, the implanted area is implanted with information to be implanted, and the information to be implanted is not blocked. For example, it may be the first time that the implantation area appears in the video, and the implantation area is implanted with the image frame with the information to be implanted and the information to be implanted is not blocked.
- the reference frame may be an image frame where a complete advertisement area (for example, a specific area on the wall or the ground, which is sufficient to display the advertisement completely) appears for the first time in the video.
- the reference frame may be an image frame in which a target object related to the information to be implanted appears, or an image frame in which keywords related to the information to be implanted appear in the displayed caption.
- For example, if the information to be implanted is an advertisement for a certain brand of air conditioner, an image frame in which an air conditioner appears in the video can be used as the reference frame, or an image frame in which keywords such as "cold" and "hot" appear can be used as the reference frame.
- The implantation area can be delineated manually, for example as an area in the upper right corner of the image frame or an area in the upper middle of the image frame. It can also be a specific area that is recognized automatically, such as the ground, a wall, the sky, or another related area. It should be noted that the implanted area in the reference frame is not obstructed by the foreground, so that when the model is initialized, the pixel distribution of the implanted area can be fully learned.
- Constructing a model that conforms to the pixel distribution characteristics of the implanted area in the reference frame means constructing a model for each pixel in the implanted area.
- it may be a Gaussian mixture model of each pixel in the implanted area.
- During initialization, the Gaussian mixture model predefined for each pixel is initialized according to the corresponding pixel of the implanted area in the reference frame. The Gaussian mixture model includes multiple Gaussian modes (in some embodiments, a Gaussian mode may also be called a mode, a Gaussian component, or a sub-model); the parameters of the Gaussian modes are initialized, and the parameters to be used later are obtained.
- Each pixel in the implanted area of each subsequent frame to be detected is then processed to determine whether the pixel matches a certain mode (that is, a Gaussian mode). If it matches, the pixel is classified into that mode, and the weight of the mode is updated according to the new pixel value. If it does not match, a new Gaussian mode is established from the pixel and its parameters are initialized, replacing the mode with the smallest weight in the original model.
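- The per-pixel initialization and update described above can be sketched for a single grayscale pixel as follows. This is a minimal illustration that borrows a commonly used Gaussian mixture update rule, not necessarily the exact rule of this application; the names `Mode`, `init_model`, `find_match`, and `update_model`, the 2.5-standard-deviation match test, and the initial variance and weight are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Mode:
    weight: float
    mean: float
    var: float


MATCH_SIGMAS = 2.5   # assumed match threshold, in standard deviations
INIT_VAR = 225.0     # assumed variance for a newly created mode
INIT_WEIGHT = 0.05   # assumed weight for a newly created mode


def init_model(pixel_value, n_modes=3):
    """Initialise the mixture for one pixel from its reference-frame value."""
    modes = [Mode(1.0, float(pixel_value), INIT_VAR)]
    modes += [Mode(0.0, 0.0, INIT_VAR) for _ in range(n_modes - 1)]
    return modes


def find_match(modes, value):
    """Return the first active mode that the value falls within, or None."""
    for mode in modes:
        if mode.weight > 0 and abs(value - mode.mean) <= MATCH_SIGMAS * mode.var ** 0.5:
            return mode
    return None


def update_model(modes, value, learning_rate):
    """Update the mixture with one new pixel value; return True on a match."""
    matched = find_match(modes, value)
    if matched is None:
        # No match: replace the mode with the smallest weight.
        weakest = min(modes, key=lambda m: m.weight)
        weakest.weight, weakest.mean, weakest.var = INIT_WEIGHT, float(value), INIT_VAR
    else:
        # Match: raise the matched mode's weight and let the others decay.
        for mode in modes:
            hit = 1.0 if mode is matched else 0.0
            mode.weight = (1 - learning_rate) * mode.weight + learning_rate * hit
        # Only the matched mode's mean and variance are changed.
        diff = value - matched.mean
        matched.mean += learning_rate * diff
        matched.var = (1 - learning_rate) * matched.var + learning_rate * diff * diff
    total = sum(m.weight for m in modes)
    for mode in modes:
        mode.weight /= total   # keep the weights summing to 1
    return matched is not None


modes = init_model(120)
matched_first = update_model(modes, 122, learning_rate=0.1)    # close to the mean
matched_second = update_model(modes, 250, learning_rate=0.1)   # far from every mode
```

The `learning_rate` argument is where a caller could slow down or speed up fitting from frame to frame.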
- Step S102: Recognize the background and the foreground of the implanted region in the frame to be detected based on the model, and generate a template for occluding the background and revealing the foreground.
- Each pixel in the implanted area in the frame to be detected may be matched in turn against each mode in the corresponding model. If a pixel has a matching mode, the pixel is considered to be a background pixel; if no mode matches the pixel, the pixel is considered to be a foreground pixel.
- the recognition result can be used to generate a template for occluding the background and revealing the foreground. Further, when a pixel is recognized as the background, the corresponding value of the pixel in the template can be set to 1, and if the pixel is recognized as the foreground, the corresponding value of the pixel in the template is set to 0. It should be noted that 0 and 1 here are binary numbers, that is, the template is a mask composed of binary 0 and 1.
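- A minimal sketch of this recognition and template generation is given below, assuming each pixel's model is reduced to a list of (mean, standard deviation) pairs and that a pixel matches a mode when it lies within 2.5 standard deviations of the mean. The helper names `is_background` and `build_template` and the sample values are hypothetical.

```python
def is_background(value, modes, k=2.5):
    """A pixel is background if it lies within k standard deviations of any mode."""
    return any(abs(value - mean) <= k * std for mean, std in modes)


def build_template(region, models, k=2.5):
    """Return a binary template: 1 for background pixels, 0 for foreground pixels."""
    return [[1 if is_background(value, modes, k) else 0
             for value, modes in zip(region_row, model_row)]
            for region_row, model_row in zip(region, models)]


# Every pixel of a 2*2 implanted area is modelled by one mode (illustrative values).
wall = [(120.0, 10.0)]
models = [[wall, wall],
          [wall, wall]]

# Implanted area of the frame to be detected: the right column is covered
# by something that does not look like the wall.
region = [[118, 30],
          [125, 200]]

template = build_template(region, models)
```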
- Step S103: Apply the template to the information to be implanted, so as to shield the content in the information to be implanted that would obscure the foreground.
- When step S103 is implemented, the information to be implanted and the template may be multiplied.
- multiplying the information to be implanted with the template may refer to multiplying the information to be implanted with a binary number filled in each position in the template. This can be achieved by multiplying the pixel to be implanted with the binary number at the corresponding position in the template.
- the value corresponding to the background pixel is 1 and the value corresponding to the foreground pixel is 0. Therefore, when the information to be implanted is multiplied with the template, the content that will obscure the foreground in the information to be implanted will be shielded. It will not affect the content that does not obscure the foreground in the embedded information.
- Step S104 Cover the implantation area in the frame to be detected with the information to be implanted after the template has been applied, so that the foreground is highlighted relative to the information to be implanted.
- Because the template has been applied before the information is overlaid on the implanted area in the frame to be detected, the information to be implanted does not obscure the foreground part of the frame to be detected, which gives a better viewing experience.
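- Steps S103 and S104 can be illustrated with a minimal sketch on a tiny grayscale region. The function names `apply_template` and `cover_region` are hypothetical, and keeping the original frame pixel wherever the template holds 0 is an assumption about how the covering is done, not something the description spells out.

```python
def apply_template(info, template):
    """Multiply each pixel of the information to be implanted by the mask bit."""
    return [[p * m for p, m in zip(info_row, mask_row)]
            for info_row, mask_row in zip(info, template)]


def cover_region(region, masked_info, template):
    """Overlay the masked information on the implanted area.

    Assumption: where the template is 1 (background) the implanted pixel is
    written; where it is 0 (foreground) the original frame pixel is kept.
    """
    return [[i if m == 1 else r for r, i, m in zip(reg_row, info_row, mask_row)]
            for reg_row, info_row, mask_row in zip(region, masked_info, template)]


region = [[10, 200], [10, 10]]    # 200 stands for a foreground object
template = [[1, 0], [1, 1]]       # 1 = background, 0 = foreground
info = [[99, 99], [99, 99]]       # information to be implanted

masked = apply_template(info, template)           # [[99, 0], [99, 99]]
result = cover_region(region, masked, template)   # [[99, 200], [99, 99]]
```

- The foreground pixel (200) survives in `result`, while every background position carries the implanted value.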
- When the method provided by the embodiments of this application is used to embed information in a video, a model is constructed for each pixel based on the pixel distribution characteristics of the implanted area in the reference frame, and the parameters of the model can be updated based on each pixel of the implanted area in the frame to be detected. A template that blocks the background but not the foreground is then generated from the foreground pixels and background pixels of the implanted area in the frame to be detected, the template is applied to the information to be implanted, and finally the information to be implanted after applying the template is overlaid on the implanted area in the frame to be detected.
- Because the generated template blocks the background but not the foreground, applying it to the information to be implanted shields the content that would block the foreground. Therefore, after the information is embedded in the frame to be detected, the foreground part of the frame is not blocked, which preserves the video viewing experience.
- Step S101 can be implemented through the steps shown in FIG. 4:
- Step S1011 For each pixel of the implanted area in the reference frame, initialize at least one sub-model corresponding to the pixel and a weight corresponding to each sub-model.
- the granularity is pixel points, that is, a model is constructed for each pixel point, and a pixel point model may correspond to at least one sub-model.
- a pixel model can correspond to one sub-model or multiple sub-models.
- the pixel model may be a Gaussian mixture model, and the model includes two or more sub-models, generally three to five.
- the sub-model may be a Gaussian probability distribution function
- Initializing a sub-model means at least initializing its parameters, which may include the mean, variance, and weight.
- The parameters of the sub-model may be set to preset values.
- The variance is generally set as large as possible and the weight as small as possible. The reason is that an initialized Gaussian model is inaccurate: its range must be narrowed continually and its parameter values updated to arrive at the most likely Gaussian model.
- The variance is set larger so that as many pixels as possible match the sub-models, which yields a model that accurately represents the distribution of the color values of the pixels across the frames of the video.
- the model may also be a single Gaussian model. At this time, only one sub-model is required, and the parameters of the sub-model may be the mean value, the variance, and the like. Since the single Gaussian model is suitable for scenes with a single constant background, a Gaussian mixture model is usually constructed for subsequent processing.
- Step S1012 The sub-models constructed for each pixel are mixed based on the initialized weights to form a model corresponding to the pixel.
- Step S1012 can be implemented by formula (1-1):
- $F_m = K_1 F_1 + K_2 F_2 + K_3 F_3$ (1-1);
- where $F_m$ is the model corresponding to the pixel, $F_1$, $F_2$, $F_3$ are the sub-models, and $K_1$, $K_2$, $K_3$ are their weights.
- a simple mathematical transformation can also be performed on the formula (1-1) to form a model corresponding to the pixel.
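- As a rough illustration of formula (1-1), the sketch below evaluates a weighted mixture of one-dimensional Gaussian sub-models for a single color value. The helper names `gaussian_pdf` and `mixture_pdf` and all numeric values are assumptions chosen for the example; the description does not fix them.

```python
import math


def gaussian_pdf(x, mean, variance):
    """Density of a one-dimensional Gaussian sub-model at x."""
    return math.exp(-((x - mean) ** 2) / (2.0 * variance)) / math.sqrt(2.0 * math.pi * variance)


def mixture_pdf(x, weights, means, variances):
    """F_m(x) = K_1*F_1(x) + K_2*F_2(x) + ... for one pixel."""
    return sum(k * gaussian_pdf(x, m, v) for k, m, v in zip(weights, means, variances))


weights = [0.5, 0.3, 0.2]            # K_1, K_2, K_3 (hypothetical)
means = [100.0, 120.0, 200.0]        # hypothetical sub-model means
variances = [225.0, 225.0, 225.0]    # hypothetical sub-model variances

density_near = mixture_pdf(101.0, weights, means, variances)  # close to a sub-model mean
density_far = mixture_pdf(20.0, weights, means, variances)    # far from every sub-model
```

- A color value near one of the sub-model means receives a much higher density than a value far from all of them, which is what later allows background and foreground to be separated.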
- Through step S1011 and step S1012, the construction of a model conforming to the pixel distribution characteristics of the implanted area in the reference frame is completed.
- Step S1013 Determine whether the implanted area in the frame to be detected is blocked by the foreground.
- The first color space distribution of the implanted area in the frame to be detected and that of the implanted area in the reference frame may be obtained first. The degree of difference between the two distributions is then determined, and whether the implanted area in the frame to be detected is occluded by the foreground is decided by checking whether that degree of difference satisfies the first difference condition.
- If the difference between the two first color space distributions satisfies the first difference condition, the implanted area in the frame to be detected is occluded by the foreground, and step S1014 is entered. If the difference does not satisfy the first difference condition, the difference between the two is small, which indicates that the implanted area in the frame to be detected is not blocked by the foreground, and step S1015 is entered.
- the first color spatial distribution may be Red Green Blue (RGB) spatial distribution.
- The first color space distribution of the implanted area can be obtained by computing the RGB histogram of the implanted area. For example, the 256 gray levels can be divided into 32 intervals, and the number of pixels of the implanted area falling in each of these 32 intervals is counted to obtain the RGB histogram.
- The first difference condition may be used to indicate the maximum degree of difference in the first color space distribution between the implanted area of the reference frame and the implanted area of the frame to be detected at which the implanted area of the frame to be detected is still judged to be free of occlusion. For example, assuming a total of M intervals, the first difference condition may be that the difference in the number of pixels falls outside the number threshold range in 30%*M intervals. With 32 intervals, for example, the first difference condition may be that the difference in the number of pixels exceeds 10 in at least 9 intervals.
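- The histogram comparison described above might be sketched as follows for a single color channel; for an RGB histogram the same check could be run on each channel. The function names and the flat-list pixel representation are assumptions, while the 32 intervals, the count threshold of 10, and the minimum of 9 intervals come from the example in the text.

```python
def histogram(values, bins=32, levels=256):
    """Count how many channel values fall into each of `bins` equal intervals."""
    width = levels // bins
    counts = [0] * bins
    for v in values:
        counts[min(v // width, bins - 1)] += 1
    return counts


def first_difference_condition(ref_values, cur_values, bins=32,
                               count_threshold=10, min_bins=9):
    """Return True when enough intervals differ by more than the count threshold."""
    ref_hist = histogram(ref_values, bins)
    cur_hist = histogram(cur_values, bins)
    differing = sum(1 for r, c in zip(ref_hist, cur_hist)
                    if abs(r - c) > count_threshold)
    return differing >= min_bins


# Hypothetical data: 20 pixels in each of the 32 intervals of the reference frame.
reference = [b * 8 for b in range(32) for _ in range(20)]
# In the frame to be detected, the ten darkest intervals have lost their pixels.
current = [v for v in reference if v >= 80]

occluded = first_difference_condition(reference, current)
```

- With ten intervals differing by 20 pixels each, `occluded` is `True` here; an unchanged region would give `False`.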
- Step S1014 In response to the implanted area in the frame to be detected being occluded by the foreground, decelerate the fitting of the model to the implanted area in the frame to be detected, with the weights of the sub-models in the model remaining unchanged.
- Decelerating the fitting of the model to the implanted area in the frame to be detected means that the rate of fitting the model to the implanted area in the frame to be detected is reduced.
- the learning rate related to the fitting speed in the model can be set to 0 to keep the weight of the sub-model in the model unchanged.
- Step S1015 It is judged whether the illumination condition of the implanted area in the frame to be detected has changed.
- The second color space distribution of the implanted area in the frame to be detected and that of the implanted area in the reference frame may be obtained first. The degree of difference between the two distributions is then determined, and whether the illumination condition of the implanted area in the frame to be detected has changed is decided by checking whether that degree of difference satisfies the second difference condition.
- The second difference condition may be used to indicate the maximum degree of difference in the second color space distribution between the implanted area of the reference frame and the implanted area of the frame to be detected when it is determined that the illumination condition of the implanted area in the frame to be detected changes.
- If the difference between the two second color space distributions satisfies the second difference condition, it is determined that the illumination condition of the implanted area in the frame to be detected has changed, and step S1016 is entered. If the difference does not satisfy the second difference condition, it is determined that the illumination condition of the implanted area in the frame to be detected has not changed; in this case the original learning rate is maintained and the weights are updated.
- the second color spatial distribution may be a Hue Saturation Value (HSV) spatial distribution.
- Step S1016 Accelerate the fitting of the model to the implanted area in the frame to be detected.
- the fitting of the model to the implanted area in the frame to be detected is accelerated, that is, the rate at which the model is fitted to the implanted area in the frame to be detected is increased.
- The prerequisite for performing step S1016 is that the implanted area in the frame to be detected is not occluded by the foreground and that its illumination has changed. To avoid recognizing the new illumination as foreground, the fitting must be sped up so that the model fits the implanted area of the frame to be detected as soon as possible and continues to represent the pixel distribution characteristics of the implanted area. For example, for the model of each pixel in the implanted area, the learning rate related to the fitting speed in the model can be set to -1.
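- The branching among steps S1014 to S1016 can be summarized by a small selection function. This is only a sketch: the name `select_learning_rate` is hypothetical, and the values 0 and -1 are simply the ones stated in the description; how the model update interprets -1 is not specified there and is not modeled here.

```python
OCCLUDED_RATE = 0.0       # from the text: the learning rate is set to 0 under occlusion
ILLUMINATION_RATE = -1    # from the text: the learning rate is set to -1 on an illumination change


def select_learning_rate(occluded, illumination_changed, original_rate):
    """Pick the learning rate for the current frame to be detected."""
    if occluded:
        # Step S1014: freeze the sub-model weights.
        return OCCLUDED_RATE
    if illumination_changed:
        # Step S1016: accelerate the fitting to the new illumination.
        return ILLUMINATION_RATE
    # Neither condition holds: keep the original learning rate.
    return original_rate


rate = select_learning_rate(occluded=False, illumination_changed=False, original_rate=0.01)
```

- The occlusion check comes first because, as described, the illumination check is only reached when the implanted area is not occluded.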
- Through step S1013 to step S1016, the weight of each sub-model in the model is updated. The parameters of the sub-models then need to be updated further.
- Step S1017 It is determined whether each pixel of the implanted area in the frame to be detected matches the sub-model in the corresponding model.
- If the deviation between the color value of a pixel and the mean of a sub-model is less than a certain threshold, the pixel is considered to match the sub-model.
- The threshold may be related to the standard deviation, for example 2.5 times the standard deviation of the sub-model. If a pixel matches at least one sub-model in the model, step S1018 is entered; if a pixel does not match any sub-model in the model, step S1019 is entered.
- Step S1018 In response to a pixel of the implanted area in the frame to be detected matching at least one sub-model in the corresponding model, update the parameters of the matched sub-model.
- Step S1019 In response to a pixel of the implanted area in the frame to be detected not matching any sub-model in the corresponding model, initialize a new sub-model based on the pixel and use it to replace the sub-model with the smallest weight.
- Through steps S1017 to S1019, the update of the sub-model parameters is completed.
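- Steps S1017 to S1019 can be sketched for a single pixel with scalar color values. The 2.5-standard-deviation match test and the replacement of the lowest-weight sub-model follow the description; the concrete update formulas (exponential moving averages driven by one learning rate), the initial variance and weight of a new sub-model, and the final weight normalization are assumptions borrowed from common Gaussian-mixture background models.

```python
def update_pixel_model(modes, x, learning_rate=0.05,
                       init_variance=900.0, init_weight=0.05):
    """Update one pixel's sub-models in place; return True if x matched one.

    `modes` is a list of dicts with keys 'weight', 'mean' and 'var'.
    """
    matched = None
    for mode in modes:
        if abs(x - mode["mean"]) <= 2.5 * mode["var"] ** 0.5:
            matched = mode
            break

    if matched is None:
        # Step S1019: replace the sub-model with the smallest weight.
        weakest = min(modes, key=lambda m: m["weight"])
        weakest.update(weight=init_weight, mean=float(x), var=init_variance)
    else:
        # Step S1018 (assumed formulas): moving-average update of the match.
        for mode in modes:
            hit = 1.0 if mode is matched else 0.0
            mode["weight"] = (1.0 - learning_rate) * mode["weight"] + learning_rate * hit
        diff = x - matched["mean"]
        matched["mean"] += learning_rate * diff
        matched["var"] = (1.0 - learning_rate) * matched["var"] + learning_rate * diff * diff

    # Assumption: keep the weights summing to one.
    total = sum(m["weight"] for m in modes)
    for mode in modes:
        mode["weight"] /= total
    return matched is not None


pixel_modes = [{"weight": 0.7, "mean": 100.0, "var": 100.0},
               {"weight": 0.3, "mean": 200.0, "var": 100.0}]
is_background = update_pixel_model(pixel_modes, 105)   # within 2.5 sigma of mean 100
is_background_2 = update_pixel_model(pixel_modes, 10)  # matches nothing, replaces a mode
```

- The return value doubles as the background/foreground decision used later when the template is generated.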
- With the updated model, a template for occluding the background and revealing the foreground can be generated, so that information implanted in the implanted area of the frame to be detected blends better with the background and does not block the foreground.
- step S102 can be implemented through the following steps:
- Step S1021 Match the color value of each pixel in the implanted area in the frame to be detected with each sub-model in the model corresponding to the pixel.
- When step S1021 is implemented, the color value of each pixel in the implanted area in the frame to be detected may be compared with each sub-model corresponding to the pixel. When the deviation between the color value of a pixel and the mean of at least one sub-model is within a certain threshold, that sub-model is considered to match the pixel.
- Step S1022 Identify the pixels that are successfully matched as the pixels of the background, and identify the pixels that have failed to match as the pixels of the foreground.
- Because the implanted area in the reference frame is an area that does not block the foreground, it can be treated as background, and the model is constructed from the pixel distribution characteristics of that area. Therefore, if a pixel in the implanted area of the frame to be detected matches a sub-model in the model corresponding to the pixel, the pixel is determined to be a background pixel; if it matches none of the sub-models, the pixel is determined to be a foreground pixel.
- Step S1023 For the pixels identified as background in the implanted area in the frame to be detected, fill a binary one at the corresponding positions in the empty template.
- Step S1024 For the pixels identified as foreground in the implanted area in the frame to be detected, fill a binary zero at the corresponding positions in the template filled with binary ones.
- a binary template is generated.
- For pixels identified as background, the corresponding template position is 1, and for pixels identified as foreground, the corresponding template position is 0. Therefore, multiplying this template with the information to be implanted yields the information to be implanted after the template is applied.
- In the information to be implanted after the template is applied, the pixel value of each pixel identified as foreground is 0, while the pixel value of each pixel identified as background remains unchanged. In this way, when the information to be implanted after the template is applied covers the implanted area in the frame to be detected, the foreground is not occluded and is highlighted relative to the implanted information.
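- Steps S1021 to S1024 could be sketched as below for a grayscale region. The name `build_template` and the representation of each pixel's model as a list of `(mean, std)` pairs are assumptions. The sketch starts from a template of ones and writes zeros for foreground pixels, which yields the same binary template as filling ones first and zeros second.

```python
def build_template(region, models, k=2.5):
    """Return a binary template: 1 for background pixels, 0 for foreground pixels.

    `region` is a 2D list of color values; `models[y][x]` is a list of
    (mean, std) sub-models for the pixel at row y, column x.
    """
    template = [[1] * len(row) for row in region]
    for y, row in enumerate(region):
        for x, value in enumerate(row):
            matches_any = any(abs(value - mean) <= k * std
                              for mean, std in models[y][x])
            if not matches_any:
                template[y][x] = 0   # no sub-model matched: foreground
    return template


# Hypothetical 2x2 implanted area; every pixel shares the same two sub-models.
sub_models = [(100.0, 4.0), (150.0, 4.0)]
models = [[sub_models, sub_models], [sub_models, sub_models]]
region = [[100, 30], [102, 98]]

template = build_template(region, models)
```

- Only the value 30 lies outside 2.5 standard deviations of every sub-model, so it is the single foreground position in `template`.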
- For step S101, it is also necessary to determine the implantation area in the frame to be detected. If the video is shot with a static lens, the frame range and field of view of the video do not change. In this case, the implantation area in the frame to be detected may be obtained by locating, based on the position of the implanted area in the reference frame, the area at the corresponding position in the frame to be detected.
- determining the implanted area in the frame to be detected can be achieved through the following steps:
- Step 21 Match the feature extracted from the implanted region in the reference frame of the video with the feature extracted from the frame to be detected.
- When step 21 is implemented, feature points may first be extracted from the implanted area in the reference frame and from the frame to be detected, and the feature points extracted from the implanted area in the reference frame are then matched with the feature points in the frame to be detected.
- The extracted feature points may be Oriented FAST and Rotated BRIEF (ORB) feature points, which combine oriented Features from Accelerated Segment Test (FAST) keypoints with rotated Binary Robust Independent Elementary Features (BRIEF) descriptors, or Scale-Invariant Feature Transform (SIFT) feature points.
- Step 22 In response to the successful matching, it is determined that the frame to be detected includes an implanted area corresponding to the implanted area in the reference frame.
- Successful matching of the feature points of the implanted area in the reference frame with the feature points in the frame to be detected can mean that all feature points are matched, or that a portion of them is matched, for example 80% of the feature points.
- Through step 21 to step 22, the implanted area in the frame to be detected is tracked by matching the feature points of the implanted area in the reference frame with the feature points in the frame to be detected. Compared with motion tracking, this approach has high real-time performance, a wide range of application, and strong robustness, and can be used automatically and efficiently.
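- The matching decision in steps 21 and 22 might look like the following sketch, which treats binary descriptors (of the kind ORB produces) as plain integers and matches them by Hamming distance. Feature extraction itself is not shown; the descriptor values, the distance limit of 2 bits, and the function names are hypothetical, and only the 80% fraction comes from the example in the text.

```python
def hamming(a, b):
    """Number of differing bits between two binary descriptors stored as ints."""
    return bin(a ^ b).count("1")


def match_fraction(ref_descriptors, frame_descriptors, max_distance=2):
    """Fraction of reference descriptors with a close enough match in the frame."""
    if not ref_descriptors or not frame_descriptors:
        return 0.0
    matched = 0
    for ref in ref_descriptors:
        best = min(hamming(ref, cand) for cand in frame_descriptors)
        if best <= max_distance:
            matched += 1
    return matched / len(ref_descriptors)


def frame_contains_region(ref_descriptors, frame_descriptors, min_fraction=0.8):
    """Step 22: treat the match as successful when enough feature points agree."""
    return match_fraction(ref_descriptors, frame_descriptors) >= min_fraction


# Hypothetical 8-bit descriptors for the implanted area and the frame to be detected.
ref_desc = [0b10110010, 0b01001101, 0b11110000, 0b00001111, 0b10101010]
frame_desc = [0b10110011, 0b01001101, 0b11110000, 0b00001111, 0b01010101]

found = frame_contains_region(ref_desc, frame_desc)
```

- Four of the five reference descriptors find a match within 2 bits, so the 80% criterion is met in this example.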
- For a video formed by using a moving lens, the position, optical axis, and focal length of the lens may change, so the position of the implanted area changes from one image frame of the video to another.
- In connection with step S1013, the following step needs to be performed:
- Step 31 Transform the implanted area in the frame to be detected so that the position of each pixel in the transformed implanted area is consistent with the position of the corresponding pixel in the implanted area in the reference frame.
- When step 31 is implemented, the implanted area (that is, the background area for implanting information) may first be tracked to generate a homography matrix H, and the implanted area in the frame to be detected is then transformed into the reference frame according to the homography matrix H, so that the position of each pixel in the transformed implanted area is consistent with the position of the corresponding pixel in the implanted area in the reference frame. Further, this can be implemented according to formula (2-1), in which $x_t$, $y_t$ represent a pixel in the current frame and $x_0$, $y_0$ represent the corresponding pixel in the reference frame.
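- A homography mapping of this kind is commonly written in homogeneous coordinates as $[x_0, y_0, 1]^T \sim H\,[x_t, y_t, 1]^T$; treating that as the form of formula (2-1) is an assumption. The sketch below applies such a 3x3 matrix to one pixel position. The function name and the example matrices are hypothetical.

```python
def apply_homography(h, x, y):
    """Map (x, y) through the 3x3 matrix h using homogeneous coordinates."""
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return u / w, v / w   # divide by w to return to pixel coordinates


# Hypothetical homography: a pure translation by (+5, -3).
H = [[1.0, 0.0, 5.0],
     [0.0, 1.0, -3.0],
     [0.0, 0.0, 1.0]]

x0, y0 = apply_homography(H, 2.0, 4.0)   # (7.0, 1.0)
```

- Applying the inverse matrix in the same way would carry a position in the reference frame back to the frame to be detected, which is the direction needed when a template is mapped back.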
- In this case, the implanted area that has undergone the homography transformation is what is actually used, so step S102 also uses the homography-transformed implanted area when identifying the background and foreground of the implanted area in the frame to be detected and when generating the template for blocking the background and revealing the foreground.
- Correspondingly, it is also necessary to apply the inverse of the transformation to the template, so that the position of each binary number in the transformed template is consistent with the position of the corresponding pixel of the implanted area in the frame to be detected.
- The pixel distribution characteristics of each pixel in the implanted area in the frame to be detected are used to fit the background pixel distribution of the implanted area in the reference frame. A Gaussian mixture model is constructed, learned, and updated automatically, and a template that shields the background and displays the foreground is determined according to the occlusion detection result, which prevents the embedded information from blocking the foreground.
- The transformation technique is used to map the positions of the pixels in the implanted area in the frame to be detected to positions consistent with the implanted area in the reference frame. Occlusion detection is likewise performed on the pixels of the implanted area in the frame to be detected to generate a template, and the template is then inversely transformed to form a template that shields the background and displays the foreground, so that the foreground is not blocked after the information is implanted.
- FIG. 5 is a schematic diagram of another implementation process of the method for embedding information in a video according to an embodiment of the present application. As shown in Figure 5, the method includes:
- Step S401 The terminal obtains the video to be processed and the information to be implanted.
- the video to be processed may be a video recorded by the terminal, or a video downloaded by the terminal from a server, of course, it may also be a video sent to the terminal by other terminals.
- the information to be implanted may be image information to be implanted, and the image information to be implanted may be advertisement image information, or publicity information.
- the video to be processed may be a video file that includes many image frames.
- The video to be processed may also refer to identification information of the video to be processed, which may include, for example, the title and the starring cast of the video.
- Step S402 The terminal sends an implantation request carrying at least the video and the information to be implanted to the server.
- the implantation request may also include the identification of the reference frame and the information of the implantation area in the reference frame.
- the implantation request may include the frame number of the reference frame and the coordinates of the four vertices of the implantation area in the reference frame.
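- The contents of such an implantation request could be modeled as a small record type. This is purely illustrative: the class name, field names, and the choice to reject anything other than four vertices are assumptions, and the description leaves the actual request format open.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ImplantationRequest:
    """Hypothetical payload sent from the terminal to the server."""
    video_id: str                            # the video, or its identification information
    info_to_implant: str                     # e.g. a path to the image to be implanted
    reference_frame_number: int              # identifies the reference frame
    region_vertices: List[Tuple[int, int]]   # four (x, y) corners of the implantation area

    def __post_init__(self):
        if len(self.region_vertices) != 4:
            raise ValueError("the implantation area is described by four vertices")


request = ImplantationRequest(
    video_id="example-video",
    info_to_implant="advert.png",
    reference_frame_number=12,
    region_vertices=[(40, 60), (200, 60), (200, 150), (40, 150)],
)
```

- Validating the vertex count at construction time keeps a malformed request from reaching the model-building step.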
- Step S403 the server determines the reference frame and the implantation area in the reference frame based on the received implantation request.
- the received implantation request may be parsed to obtain the set reference frame and the implantation area set in the reference frame.
- the image frame of the video file can be analyzed by means of image recognition, so as to determine the reference frame that meets the information implantation condition and the implantation area in the reference frame.
- The information implantation conditions may include at least one of the following: the type of the implantation area (for example, wall or floor), the size of the implantation area (for example, a width and height that fit the information to be implanted), the color of the implantation area (for example, forming a certain contrast with the information to be implanted), and the exposure time of the implantation area (that is, the cumulative duration of its appearance in the video).
- Step S404 The server constructs a model that conforms to the pixel distribution characteristics of the implanted area in the reference frame, and controls the update of the model based on the frame to be detected subsequent to the reference frame.
- Step S405 The server recognizes the background and the foreground of the implanted region in the frame to be detected based on the model, and generates a template for blocking the background and revealing the foreground.
- Step S406 The server applies the template to the information to be implanted, so as to shield the content in the information to be implanted that would obscure the foreground.
- Step S407 The server covers the implantation area in the frame to be detected with the information to be implanted after the template has been applied, so that the foreground is highlighted relative to the information to be implanted.
- Step S404 to step S407 can be understood with reference to the description of similar steps above.
- Step S408 The server encapsulates the video after the information is implanted, and sends the encapsulated video to the terminal.
- Before embedding information in the image frames of the video, the server first splits the video into individual image frames and then embeds the information in each image frame. After the information is implanted, the image frames, audio, subtitles, and so on need to be combined again so that they form a single whole and a normal video file is obtained.
- the server may also publish the video with the embedded information in the video-watching application.
- Step S409 The terminal publishes the video with the information implanted.
- it can be published in the application for watching the video, or sent to other terminals, for example, it can be published in a friend group of an instant messaging application.
- When the terminal wants to embed information in a video, it sends the video to be processed and the information to be implanted to the server, and the server builds a model according to the pixel distribution characteristics of the implanted area in the reference frame. Since the implanted area in the reference frame does not occlude the foreground of the video, the background and foreground pixels in the implanted area of each subsequent frame to be detected can be identified based on the constructed model, and a template that blocks the background but not the foreground can then be generated.
- After the template is applied to the information to be implanted, the content that would obscure the foreground is shielded, so that after the information is implanted in the frame to be detected, the foreground part of the frame is not occluded, which preserves the viewing experience of the video.
- the embodiment of the present application further provides a method for embedding information in a video.
- the method includes two stages in the implementation process: a background modeling learning stage and an occlusion prediction stage.
- FIG. 6 is a schematic diagram of another implementation process of the method for embedding information in a video according to an embodiment of the application. As shown in FIG. 6, the method includes:
- Step S501 Obtain a background picture.
- Step S502 Perform Gaussian mixture modeling according to the background image.
- The background modeling process is completed through step S501 and step S502.
- Step S503 framing the video.
- Step S504 Obtain a picture to be predicted.
- Step S505 Inversely transform the image to be predicted based on background modeling to obtain an inversely transformed picture.
- Step S506 Perform forward transformation on the inverse transformed picture to obtain an occlusion mask.
- The flow shown in FIG. 6 constructs an adaptive Gaussian mixture model for background modeling. Starting from the initial frame of the business opportunity for the advertisement implanted in the video, the background model and the learning rate are selected adaptively for subsequent frames, and the model is optimized through iterative updates.
- Fig. 7 is a schematic diagram of another implementation process of embedding information in a video according to an embodiment of the application. As shown in Fig. 7, in this embodiment, information can be embedded in a video through the following steps:
- Step S601 Deframe the video.
- the input video is divided into frames through image processing technology, and the video is split into each frame as the picture to be predicted.
- Step S602 Locate the initial frame of the business opportunity (that is, the frame where the advertisement is to be implanted), and the corresponding implantation area.
- the initial frame of the business opportunity and the corresponding implantation area can be manually set.
- Image recognition technology based on neural networks can be used to determine the initial frame of the business opportunity, the implantation area, and a specific location (for example, the middle area, consistent with the size of the advertisement); the specific location corresponds to the implantation area.
- Step S603 According to the image of the implanted area in the initial frame of the business opportunity, initialize the Gaussian mixture model corresponding to each pixel of the implanted area.
- Step S604 the subsequent frame (that is, the subsequent frame of the video including the implanted area) is processed as follows:
- Step S6041 Compare the distribution characteristics of the implanted area of the subsequent frame with the implanted area of the initial frame to determine whether occlusion occurs; when occlusion occurs, the learning rate is updated.
- Step S6042 Adjust the learning rate according to whether there is a change in illumination.
- Step S6043 The background/foreground pixels are recognized, the model is updated based on the recognition result and the updated learning rate, and the mask is then determined.
- the weight of the model is updated according to the updated learning rate; for the parameter, the mean and standard deviation of the unmatched pattern remain unchanged, and the mean and standard deviation of the matched pattern are updated according to the updated learning rate and weight. If no pattern matches, the pattern with the smallest weight is replaced.
- the models are arranged in descending order of ⁇ / ⁇ 2, with the highest weight and the lowest standard deviation.
- ⁇ is the weight and ⁇ is the learning rate.
- Step S6044 The information to be implanted after applying the mask is implanted into the implantation area of the subsequent frame.
- Step S6044 can be understood with reference to the description of similar steps above.
- Step S605, step S604 is repeated, and after all subsequent frame processing is completed, the image frame is encapsulated.
- the embedded advertisement will not occlude the foreground part of the image frame, thereby bringing a better viewing experience.
- steps S601 to S603 correspond to the background modeling learning part
- steps S604 to S605 correspond to the occlusion prediction part.
- The reference frame of the video advertisement placement (that is, the initial frame of the business opportunity that includes the implanted area) may be obtained for background modeling, and the prior implanted area (the specific area in the background of the reference frame of the video that is used to implant the advertisement, namely the implantation area) is initialized with a Gaussian Mixture Model (GMM).
- the implanted area in the initial frame of the business opportunity satisfies the condition: the implanted area in the initial frame of the business opportunity is not blocked by the foreground. Therefore, when the model is initialized, the pixel distribution of the implanted area can be fully learned.
- The Gaussian mixture model uses K modes (in some embodiments, a mode may also be called a Gaussian mode, Gaussian component, or sub-model) to represent the color values of the pixels; K is usually between 3 and 5.
- the Gaussian mixture model represents the color value X presented by the pixel as a random variable, and the color value of the pixel in each frame of the video is the sampling value of the random variable X.
- the color value of each pixel in the scene can be represented by a mixture distribution composed of K Gaussian components, that is, the probability that pixel j in the image takes the value $x_j$ at time t is:
- η represents the Gaussian probability density function
- d is the dimension of $x_j$.
- the covariance matrix is defined as:
- σ represents the standard deviation
- I represents the identity matrix
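- The mixture probability and the covariance definition referred to above can be written, in the usual Gaussian-mixture background notation, as follows. The symbols $\omega_{j,t}^{i}$ (weight), $\mu_{j,t}^{i}$ (mean), $\Sigma_{j,t}^{i}$ (covariance) and $\sigma_{j,t}^{i}$ (standard deviation) of the i-th component of pixel j at time t are assumptions chosen to agree with the surrounding definitions of the density function, d, the standard deviation, and I.

```latex
P(x_j) = \sum_{i=1}^{K} \omega_{j,t}^{i}\,
         \eta\!\left(x_j;\ \mu_{j,t}^{i},\ \Sigma_{j,t}^{i}\right)

\eta(x;\ \mu,\ \Sigma) = \frac{1}{(2\pi)^{d/2}\,\lvert\Sigma\rvert^{1/2}}
    \exp\!\left(-\tfrac{1}{2}\,(x-\mu)^{\top}\,\Sigma^{-1}\,(x-\mu)\right)

\Sigma_{j,t}^{i} = \left(\sigma_{j,t}^{i}\right)^{2} I
```

- Taking the covariance as a scaled identity matrix treats the color channels as independent with equal variance, which keeps the per-pixel computation inexpensive.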
- the initialization of the Gaussian mixture model may be the initialization of various parameters.
- One initialization method is as follows: in the initialization phase, if the requirement on the initialization speed of the mixed Gaussian parameters is not high, then, since the range of each color channel of the pixel is [0, 255], the K Gaussian components can be initialized directly.
- the mean value is the color value of the pixel
- the variance is a preset empirical value.
- Another initialization method is to initialize the first Gaussian component of each pixel from the first frame of the image: its mean is set to the color value of the current pixel and its weight is set to 1, while the mean and weight of all Gaussian components other than the first are initialized to zero.
- the variance is a preset empirical value.
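- The second initialization method can be sketched for one pixel as follows. The function name, the dictionary layout, and the preset variance of 900 are assumptions; the description only states that the variance is a preset empirical value.

```python
def init_pixel_modes(pixel_value, k=3, preset_variance=900.0):
    """Initialize K Gaussian components for one pixel from the first frame.

    The first component takes the pixel's color value as its mean and a
    weight of 1; every other component starts with mean 0 and weight 0.
    """
    modes = [{"weight": 1.0, "mean": float(pixel_value), "var": preset_variance}]
    for _ in range(k - 1):
        modes.append({"weight": 0.0, "mean": 0.0, "var": preset_variance})
    return modes


modes = init_pixel_modes(128, k=4)
```

- Because the remaining components start with zero weight, they are the first candidates to be replaced when a later color value matches none of the components.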
- For a video formed by using a still lens, step S6041 may be implemented as follows:
- the implanted area of each subsequent frame of the initial frame of the business opportunity For the implanted area of each subsequent frame of the initial frame of the business opportunity, compare the RGB color space distribution of the implanted area of the subsequent frame and the initial implanted area (ie the implanted area of the initial frame), and determine whether there is occlusion based on the difference in RGB distribution . That is, it is determined whether the advertisement implanted in the implantation area of the initial frame of the business opportunity will block the foreground that appears in the implantation area in the subsequent frames, for example, the "baby type" in FIG. 8B. If the difference of RGB distribution satisfies the difference condition, it is considered that the background of the implanted area is occluded by the foreground.
- Judging whether the difference in RGB distribution satisfies the difference condition can be achieved by comparing histogram distributions. For example, 0-255 gray levels can be divided into 16 intervals, and the distribution of pixels in each frame in the 16 intervals can be counted and compared. If the difference between the histogram distribution of the implanted area in the subsequent frame and the histogram distribution of the initial implanted area exceeds a certain threshold, it means that the difference in the RGB distribution meets the difference condition. At this time, the background of the implanted area in the subsequent frame is considered to be The foreground is blocked.
- the difference between the histogram distribution of the implanted area of the subsequent frame and the histogram distribution of the initial implanted area does not exceed the threshold, it means that the difference of the RGB distribution does not meet the difference condition.
- the background is not obscured by the foreground.
- the updated learning rate is set to 0 (that is, the weight of the model in the model is not updated with subsequent frames); if there is no occlusion, the original learning rate can be maintained.
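The histogram comparison and the resulting learning-rate choice can be illustrated as follows. This is a sketch under assumptions: the text does not say how the histogram difference is measured or what the threshold is, so the mean L1 distance between normalized per-channel histograms, the threshold 0.3, and the base learning rate 0.01 are all hypothetical choices, as are the function names.

```python
def channel_histogram(pixels, channel, bins=16):
    """Normalized histogram of one 8-bit channel over `bins` equal intervals."""
    width = 256 / bins
    counts = [0] * bins
    for px in pixels:
        counts[min(int(px[channel] / width), bins - 1)] += 1
    return [c / len(pixels) for c in counts]


def rgb_histogram_difference(region_a, region_b, bins=16):
    """Mean L1 distance between the R, G and B histograms of two regions.

    Both regions are non-empty flat lists of (R, G, B) tuples.
    """
    total = 0.0
    for channel in range(3):
        hist_a = channel_histogram(region_a, channel, bins)
        hist_b = channel_histogram(region_b, channel, bins)
        total += sum(abs(a - b) for a, b in zip(hist_a, hist_b))
    return total / 3


def learning_rate_for_frame(initial_region, current_region,
                            base_rate=0.01, threshold=0.3):
    """Return (learning_rate, occluded) for the current frame.

    An occluded implanted area freezes the model (rate 0); otherwise the
    original learning rate is kept.
    """
    diff = rgb_histogram_difference(initial_region, current_region)
    occluded = diff > threshold
    return (0.0 if occluded else base_rate), occluded
```

With 16 bins each interval spans 16 gray levels, which matches the division of 0-255 into 16 intervals described above.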
- Step S6042 may be implemented as follows:
- HSV can reflect changes in illumination. If the illumination of the background changes, the weight of the mode that conforms to the new illumination can be increased by adjusting the learning rate to -1, so that the new illumination is not recognized as foreground. If there is no illumination change, the original learning rate can be maintained.
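A possible way to detect an illumination change and combine it with the occlusion result is sketched below. Several points are assumptions rather than statements from the text: only the V (brightness) channel of HSV is compared, the difference measure is an L1 distance with a hypothetical threshold of 0.5, and the adjusted learning rate is represented by a generic `boosted_rate` parameter instead of interpreting the value -1 given above.

```python
import colorsys


def value_histogram(pixels, bins=16):
    """Normalized histogram of the HSV value (brightness) channel."""
    counts = [0] * bins
    for r, g, b in pixels:
        _, _, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        counts[min(int(v * bins), bins - 1)] += 1
    return [c / len(pixels) for c in counts]


def illumination_changed(initial_region, current_region, threshold=0.5):
    """Compare brightness histograms of two non-empty lists of RGB tuples."""
    hist_a = value_histogram(initial_region)
    hist_b = value_histogram(current_region)
    return sum(abs(a - b) for a, b in zip(hist_a, hist_b)) > threshold


def choose_learning_rate(occluded, lighting_changed,
                         base_rate=0.01, boosted_rate=0.1):
    """Freeze the model when occluded, fit faster when only lighting changed."""
    if occluded:
        return 0.0
    if lighting_changed:
        return boosted_rate
    return base_rate
```

The ordering of the checks reflects the description: the fitting rate is raised only when the implanted area is not occluded and its illumination has changed.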
- Step S6043 may be implemented by identifying the pixel type of each pixel in the implanted area of the subsequent frame, updating the model, and then determining the mask.
- The color value $X_t$ of a pixel in the subsequent frame is compared with the pixel's current K modes (i.e., K Gaussian components). If its deviation from the mean of at least one mode is within 2.5σ of that mode (that is, 2.5 times the standard deviation), the mode is considered to match the pixel, and the pixel belongs to the background of the video; if no mode matches, the pixel belongs to the foreground.
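The 2.5σ matching test can be written compactly. In this sketch the deviation is measured as the Euclidean distance in color space and modes with zero weight are skipped; both choices are assumptions made for illustration, and the function names are hypothetical.

```python
import math


def matches(color, mean, std, k_sigma=2.5):
    """True if the color lies within k_sigma standard deviations of the mean."""
    return math.dist(color, mean) <= k_sigma * std


def classify_pixel(color, modes):
    """Classify one pixel against its K modes.

    `modes` is a list of (weight, mean, std) tuples. Returns
    ("background", index_of_matched_mode) or ("foreground", None).
    """
    for index, (weight, mean, std) in enumerate(modes):
        if weight > 0 and matches(color, mean, std):
            return "background", index
    return "foreground", None
```

For example, with a mode centered on a light wall color and a standard deviation of 10, a slightly different wall pixel is labeled background, while a dark pixel from a person walking in front of the wall is labeled foreground.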
- After determining whether each pixel is foreground or background, the mask is determined and then refined by morphology.
- If a pixel belongs to the background of the video, the corresponding value of the pixel in the mask is 1; if the pixel belongs to the foreground of the video, the corresponding value of the pixel in the mask is 0.
- Morphological refinement of the mask is mainly used to repair errors made by the modes when judging foreground and occlusion, including eliminating holes in the mask and connecting breaks, so that noise does not appear in the video foreground that is revealed after occlusion processing.
- FIG. 9A and FIG. 9B are schematic diagrams of using morphology to improve the mask. As shown in FIG. 9A, the holes in the white area in 901 can be eliminated by morphology, forming a completely connected area as shown in 902. As shown in FIG. 9B, the breaks in 911 can be connected by morphology, likewise forming a complete connected area as shown in 912.
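Mask construction and hole filling can be sketched as below. The text does not name the morphological operator, so the use of a closing (dilation followed by erosion) with a 3x3 neighborhood is an assumption, and the helper names are hypothetical.

```python
def build_mask(labels):
    """Binary mask: 1 for background pixels, 0 for foreground pixels."""
    return [[1 if label == "background" else 0 for label in row] for row in labels]


def _neighborhood_filter(mask, pick):
    """Apply `pick` (max or min) over each pixel's 3x3 neighborhood.

    Neighborhoods are clipped at the image border.
    """
    height, width = len(mask), len(mask[0])
    return [[pick(mask[ny][nx]
                  for ny in range(max(0, y - 1), min(height, y + 2))
                  for nx in range(max(0, x - 1), min(width, x + 2)))
             for x in range(width)]
            for y in range(height)]


def dilate(mask):
    return _neighborhood_filter(mask, max)


def erode(mask):
    return _neighborhood_filter(mask, min)


def close_mask(mask):
    """Morphological closing: fills small holes and bridges narrow breaks."""
    return erode(dilate(mask))
```

A single misclassified pixel inside a background area is filled by the closing, whereas a large foreground area keeps its shape, so real occluders are still revealed.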
- Updating the model may mean updating the weights of the model according to the updated learning rate.
- The mean and standard deviation of a pixel's unmatched modes remain unchanged; only the mean and standard deviation of the matched mode are updated. If no mode matches the pixel, a new mode is initialized from the pixel and replaces the mode with the smallest weight. The modes are arranged in descending order of ω/σ², so that modes with a high weight and a small standard deviation come first.
- ω is the weight and α is the learning rate.
- If the i-th mode matches $x_j$, the i-th mode is updated by $x_j$, and the rest of the modes remain unchanged.
- the update method is as follows:
- α is the learning rate of the model
- ρ is the learning rate of the parameters, reflecting the convergence speed of the model parameters.
- After the update, the weights of the modes need to be normalized.
- After the parameter update, in order to determine which modes in the pixel's Gaussian mixture model are generated by the background, the modes are sorted in descending order of ω/σ² and the first B modes are selected as the distribution of the background; B satisfies the following formula, and the parameter Q represents the proportion of the background;
- A larger ω/σ² indicates that the pixel value has a smaller variance and a higher probability of occurrence, which reflects exactly the characteristics of the background pixel values of the scene.
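The update and background-selection steps can be illustrated with the widely used Gaussian mixture background-modeling update. This is an assumption about the form of the formulas, not a reproduction of them: the weight update ω ← (1 − α)ω + α for the matched mode, the simplification ρ = α, the per-channel averaging of the squared deviation, the variance floor `min_var`, and the rule that B is the smallest number of sorted modes whose cumulative weight exceeds Q are all choices made for this sketch.

```python
import math


def update_pixel_model(modes, color, alpha=0.05, k_sigma=2.5,
                       init_var=225.0, min_var=4.0):
    """Update one pixel's modes in place; return True if a mode matched.

    `modes` is a list of dicts with keys "weight", "mean" (list) and "var".
    """
    matched = None
    for i, m in enumerate(modes):
        if m["weight"] > 0 and math.dist(color, m["mean"]) <= k_sigma * math.sqrt(m["var"]):
            matched = i
            break
    if alpha == 0:  # learning rate 0 (occluded frame): leave the model frozen
        return matched is not None
    if matched is None:
        # No mode explains the pixel: replace the lowest-weight mode.
        weakest = min(range(len(modes)), key=lambda i: modes[i]["weight"])
        modes[weakest] = {"weight": alpha,
                          "mean": [float(c) for c in color],
                          "var": init_var}
    else:
        for i, m in enumerate(modes):
            m["weight"] = (1 - alpha) * m["weight"] + (alpha if i == matched else 0.0)
        m = modes[matched]
        rho = alpha  # assumed simplification of the parameter learning rate
        m["mean"] = [(1 - rho) * mu + rho * c for mu, c in zip(m["mean"], color)]
        sq = sum((c - mu) ** 2 for mu, c in zip(m["mean"], color)) / len(color)
        m["var"] = max((1 - rho) * m["var"] + rho * sq, min_var)
    total = sum(m["weight"] for m in modes)
    for m in modes:
        m["weight"] /= total  # normalize the weights
    modes.sort(key=lambda m: m["weight"] / m["var"], reverse=True)
    return matched is not None


def background_modes(modes, q=0.7):
    """Return the first B sorted modes whose cumulative weight exceeds q."""
    acc = 0.0
    for b, m in enumerate(modes, start=1):
        acc += m["weight"]
        if acc > q:
            return modes[:b]
    return modes
```

Calling `update_pixel_model` with `alpha=0` leaves the model untouched, which corresponds to the case where the implanted area is occluded and the learning rate is set to 0.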
- FIG. 8A and FIG. 8B are schematic diagrams of the effect of implanting information into a video shot with a static lens according to an embodiment of the application.
- the image shown in FIG. 8A may be a frame that precedes the image shown in FIG. 8B (that is, a frame from before the "baby style" text appears in the video).
- In the image frame shown in FIG. 8A, the wall area 801 does not show "baby style". If the wall area is used as the advertisement implantation area, then in a subsequent frame, namely the image frame shown in FIG. 8B, the foreground "baby style" appears.
- If the implanted advertisement were simply overlaid as a layer, the "baby style" part would be obscured.
- With the solution provided by this embodiment, the three characters of "baby style" float above the advertisement, that is, the implanted advertisement 811 does not block the foreground content of the video, thereby ensuring the integrity of the original video's foreground content at the advertisement implantation location.
- For a video shot with a moving lens, when step S604 is implemented, the following steps need to be performed before step S6041:
- Step 71: Track subsequent frames that include the implanted area.
- Template matching is performed using a feature tracking technique (a template of feature points, such as feature points found with the ORB method), or the SIFT method is used, to track subsequent frames that include the implanted area.
- The implanted area (that is, the background area into which information is implanted) of a subsequent frame is first transformed using the homography matrix H.
- This is needed because background modeling models each pixel, so the pixel positions of the implanted area in the initial frame of the business opportunity (the reference frame) and in the subsequent frames must correspond one to one. If the camera moves, the pixel positions of the implanted area in the initial frame of the business opportunity and in the current frame no longer correspond.
- $x_t$, $y_t$ represent a pixel in the current frame
- $x_0$, $y_0$ represent the pixel in the initial frame of the business opportunity that corresponds to that pixel.
- The implementation of steps S6041 and S6042 is similar to that of steps S6041 and S6042 for a video shot with a static lens, and can be understood with reference to the description of those steps above.
- When step S6043 is implemented, it is likewise necessary to identify the pixel type of the implanted area of the subsequent frame in order to update the model and determine the mask.
- The homography matrix H is then used to inversely transform the mask back to the position of the subsequent frame, as shown in the following formula (3-11):
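The forward and inverse position mappings can be sketched with a plain 3x3 homography. The direction of H is an assumption here: the sketch takes H to map a pixel (x_t, y_t) of the current frame to the corresponding pixel (x_0, y_0) of the initial frame, so that the inverse of H carries mask positions back to the current frame. The function names are hypothetical.

```python
def apply_homography(h, x, y):
    """Map (x, y) through the 3x3 homography `h` using homogeneous coordinates."""
    xh = h[0][0] * x + h[0][1] * y + h[0][2]
    yh = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return xh / w, yh / w


def invert_3x3(m):
    """Invert a 3x3 matrix via its adjugate; raises if it is singular."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-12:
        raise ValueError("homography is not invertible")
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]
```

A mask value computed at reference position (x_0, y_0) would then be written to the position `apply_homography(invert_3x3(H), x_0, y_0)` in the current frame, typically rounded to the nearest pixel.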
- The advertisement is implanted in the implanted area of the subsequent frame; for image frames judged to be occluded, the corresponding mask is applied to the implanted area, and the video is encapsulated.
- FIG. 8C and FIG. 8D are schematic diagrams of the effect of implanting information into a video shot with a moving lens according to an embodiment of the application.
- FIG. 8C is a frame in which the character does not appear. If the ground is used as the advertisement implantation area 821 at this time, the image frame after the advertisement is implanted is as shown in FIG. 8C. In subsequent frames, if the implanted advertisement "Hello Qin Pro" were directly overlaid as a layer, the legs of the character appearing in the area would be blocked. After applying the solution for implanting information into a video provided by this embodiment, as shown in FIG. 8D, the legs of the character are displayed on top of the implanted advertisement, so that the advertisement implantation area 831 does not block the foreground of the video.
- the software module in the apparatus 240 may include:
- the model construction module 241, configured to construct a model that conforms to the pixel distribution characteristics of the implanted area in the reference frame, and to control the update of the model based on the to-be-detected frames that follow the reference frame;
- the template generation module 242, configured to identify the background and foreground of the implanted area in the frame to be detected based on the model, and to generate a template for occluding the background and revealing the foreground;
- the template application module 243, configured to apply the template to the information to be implanted, so as to shield the content in the information to be implanted that would occlude the foreground;
- the information covering module 244, configured to overlay the information to be implanted, after the template has been applied, onto the implanted area in the frame to be detected, so that the foreground is highlighted relative to the information to be implanted.
- the device further includes:
- the parameter initialization module, configured to initialize, for each pixel of the implanted area in the reference frame, at least one sub-model corresponding to the pixel and the weight corresponding to the at least one sub-model;
- the weight mixing module, configured to mix the sub-models constructed for each pixel based on the initialized weights to form a model corresponding to the pixel.
- the device further includes:
- the weight holding module, configured to, in response to the implanted area in the frame to be detected being occluded by the foreground, reduce the rate at which the model is fitted to the implanted area in the frame to be detected, so that the weights of the sub-models in the model remain unchanged;
- the fitting acceleration module, configured to, in response to the implanted area in the frame to be detected not being occluded by the foreground and the illumination of the implanted area in the frame to be detected having changed, increase the rate at which the model is fitted to the implanted area in the frame to be detected.
- the device further includes:
- the parameter update module, configured to, in response to a pixel of the implanted area in the frame to be detected matching at least one sub-model in the corresponding model, update the parameters of the matched sub-model and keep the parameters of the unmatched sub-models in the corresponding model unchanged.
- the device further includes:
- the first matching module, configured to match the color value of each pixel of the implanted area in the frame to be detected against the sub-models in the model corresponding to the pixel;
- the recognition module, configured to identify the pixels that are successfully matched as pixels of the background, and the pixels that fail to match as pixels of the foreground.
- the device further includes:
- the filling module, configured to fill binary ones in the positions of an empty template that correspond to pixels identified as background in the implanted area in the frame to be detected, and to fill binary zeros in the positions of the template filled with binary ones that correspond to pixels identified as foreground in the implanted area in the frame to be detected.
- the device further includes:
- the arithmetic module, configured to multiply the information to be implanted by the binary number filled in each position in the template.
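The multiplication performed by the arithmetic module, followed by the overlay onto the frame, can be sketched as follows. The data layout (lists of rows of RGB tuples with a same-sized binary template) and the function names are assumptions made for this illustration.

```python
def apply_template(info, template):
    """Multiply every pixel of the information to be implanted by its binary value.

    Positions with 0 (foreground) are blanked, positions with 1 (background)
    keep the implanted content.
    """
    return [[tuple(channel * bit for channel in pixel)
             for pixel, bit in zip(info_row, template_row)]
            for info_row, template_row in zip(info, template)]


def overlay(frame_region, info, template):
    """Cover the implanted area with the masked information.

    Where the template is 1 the implanted content is shown; where it is 0
    the original frame pixel (the foreground) stays visible.
    """
    masked = apply_template(info, template)
    return [[masked_px if bit == 1 else frame_px
             for frame_px, masked_px, bit in zip(frame_row, masked_row, template_row)]
            for frame_row, masked_row, template_row in zip(frame_region, masked, template)]
```

In the example of FIG. 8B, the template would hold zeros where the "baby style" text appears, so those frame pixels remain visible on top of the advertisement.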
- the device further includes:
- the second matching module, configured to, in response to the video being shot with a moving lens, match the features extracted from the implanted area in the reference frame of the video with the features extracted from the frame to be detected;
- the area determining module, configured to determine, in response to the matching being successful, that the frame to be detected includes an implanted area corresponding to the implanted area in the reference frame.
- the device further includes:
- the area transformation module, configured to, in response to the video being shot with a moving lens, transform the implanted area in the frame to be detected before the update of the model is controlled based on the to-be-detected frames that follow the reference frame, so that the position of each pixel in the transformed implanted area is consistent with the position of the corresponding pixel in the implanted area in the reference frame;
- the template inverse transformation module, configured to perform the inverse of the transformation on the template before the template is applied to the information to be implanted, so that the position of each binary number in the transformed template is consistent with the position of the corresponding pixel in the implanted area in the frame to be detected.
- the device further includes:
- the region positioning module, configured to, in response to the video being shot with a static lens, locate the area at the corresponding position in the frame to be detected based on the position of the implanted area in the reference frame, so as to determine the implanted area in the frame to be detected.
- the device further includes:
- the first determining module, configured to determine that the implanted area in the frame to be detected is occluded by the foreground, in response to the first color space distribution of the implanted area in the frame to be detected and the first color space distribution of the implanted area in the reference frame satisfying a first difference condition;
- the second determining module, configured to determine that the illumination of the implanted area in the frame to be detected has changed, in response to the second color space distribution of the implanted area in the frame to be detected and the second color space distribution of the implanted area in the reference frame satisfying a second difference condition.
- the method provided by the embodiment of the application can be directly executed by the processor 410 in the form of a hardware decoding processor, for example, by one or more application-specific integrated circuits (ASIC), digital signal processors (DSP), programmable logic devices (PLD), complex programmable logic devices (CPLD), or field-programmable gate arrays (FPGA).
- the embodiment of the present application provides a storage medium storing executable instructions. When the executable instructions are executed by a processor, they cause the processor to execute the method provided in the embodiments of the present application, for example, the methods shown in FIG. 3 to FIG. 6.
- the storage medium may be FRAM, ROM, PROM, EPROM, EEPROM, flash memory, magnetic surface memory, optical disk, or CD-ROM, etc.; it may also be any of various devices that include one or any combination of the foregoing memories.
- executable instructions may take the form of programs, software, software modules, scripts or code, written in any form of programming language (including compiled or interpreted languages, or declarative or procedural languages), and they can be deployed in any form, including as an independent program or as a module, component, subroutine or other unit suitable for use in a computing environment.
- executable instructions may, but do not necessarily, correspond to files in a file system. They may be stored as part of a file that holds other programs or data, for example in one or more scripts in a HyperText Markup Language (HTML) document, in a single file dedicated to the program in question, or in multiple coordinated files (for example, files storing one or more modules, subroutines, or code parts).
- executable instructions can be deployed to be executed on one computing device, on multiple computing devices located at one site, or on multiple computing devices that are distributed across multiple sites and interconnected by a communication network.
- In summary, the embodiment of the application can construct a model based on the pixel distribution characteristics of the implanted area in the reference frame, perform occlusion detection on the implanted area in the frame to be detected, and update the model parameters based on the occlusion detection result.
- In this way, the implanted area of the frame to be detected is fitted to the background pixel distribution of the implanted area in the reference frame, so that the implanted information blends better into the background of the video without obstructing the foreground, thereby bringing a better viewing experience.
- Feature points are used to determine the implanted area, and the pixels of the implanted area in the frame to be detected are mapped by a transformation to positions consistent with the reference frame, without the need for motion tracking, which gives better real-time performance and robustness.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Image Processing (AREA)
- Image Analysis (AREA)
Claims (14)
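- A method for implanting information into a video, wherein the method is applied to an execution device and comprises: constructing a model that conforms to the pixel distribution characteristics of an implanted area in a reference frame, and controlling the update of the model based on a frame to be detected that follows the reference frame; identifying the background and foreground of the implanted area in the frame to be detected based on the model, and generating a template for occluding the background and revealing the foreground; applying the template to information to be implanted, so as to shield the content in the information to be implanted that would occlude the foreground; and overlaying the information to be implanted, after the template has been applied, onto the implanted area in the frame to be detected, so that the foreground is highlighted relative to the information to be implanted.
- The method according to claim 1, wherein constructing the model that conforms to the pixel distribution characteristics of the implanted area in the reference frame comprises: for each pixel of the implanted area in the reference frame, initializing at least one sub-model corresponding to the pixel and a weight corresponding to the at least one sub-model; and mixing the sub-models constructed for each pixel based on the initialized weights to form a model corresponding to the pixel.
- The method according to claim 1, wherein controlling the update of the model based on the frame to be detected that follows the reference frame comprises: in response to the implanted area in the frame to be detected being occluded by the foreground, reducing the rate at which the model is fitted to the implanted area in the frame to be detected; and in response to the implanted area in the frame to be detected not being occluded by the foreground and the illumination of the implanted area in the frame to be detected having changed, increasing the rate at which the model is fitted to the implanted area in the frame to be detected.
- The method according to claim 1, wherein controlling the update of the model based on the frame to be detected that follows the reference frame comprises: in response to a pixel of the implanted area in the frame to be detected matching at least one sub-model in the corresponding model, updating the parameters of the matched sub-model, and keeping the parameters of the unmatched sub-models in the corresponding model unchanged.
- The method according to claim 1, wherein identifying the background and foreground of the implanted area in the frame to be detected based on the model comprises: matching the color value of each pixel of the implanted area in the frame to be detected against the sub-models in the model corresponding to the pixel; and identifying the pixels that are successfully matched as pixels of the background, and the pixels that fail to match as pixels of the foreground.
- The method according to claim 1, wherein generating the template for occluding the background and revealing the foreground comprises: for the pixels identified as background in the implanted area in the frame to be detected, filling binary ones in the corresponding positions of an empty template; and for the pixels identified as foreground in the implanted area in the frame to be detected, filling binary zeros in the corresponding positions of the template filled with binary ones.
- The method according to claim 1, wherein applying the template to the information to be implanted comprises: multiplying the information to be implanted by the binary number filled in each position in the template.
- The method according to any one of claims 1 to 7, wherein the method further comprises: in response to the video being shot with a moving lens, matching the features extracted from the implanted area in the reference frame of the video with the features extracted from the frame to be detected; and in response to the matching being successful, determining that the frame to be detected includes an implanted area corresponding to the implanted area in the reference frame.
- The method according to any one of claims 1 to 7, wherein the method further comprises: in response to the video being shot with a moving lens, before controlling the update of the model based on the frame to be detected that follows the reference frame, transforming the implanted area in the frame to be detected, so that the position of each pixel in the transformed implanted area is consistent with the position of the corresponding pixel in the implanted area in the reference frame; and before applying the template to the information to be implanted, performing the inverse of the transformation on the template, so that the position of each binary number in the transformed template is consistent with the position of the corresponding pixel in the implanted area in the frame to be detected.
- The method according to any one of claims 1 to 7, wherein the method further comprises: in response to the video being shot with a static lens, locating the area at the corresponding position in the frame to be detected based on the position of the implanted area in the reference frame, so as to determine the implanted area in the frame to be detected.
- The method according to any one of claims 1 to 7, wherein the method further comprises: in response to a first color space distribution of the implanted area in the frame to be detected and a first color space distribution of the implanted area in the reference frame satisfying a first difference condition, determining that the implanted area in the frame to be detected is occluded by the foreground; and in response to a second color space distribution of the implanted area in the frame to be detected and a second color space distribution of the implanted area in the reference frame satisfying a second difference condition, determining that the illumination of the implanted area in the frame to be detected has changed.
- A computer device, comprising: a memory configured to store executable instructions; and a processor configured to implement the method according to any one of claims 1 to 11 when executing the executable instructions stored in the memory.
- A storage medium, wherein the storage medium stores executable instructions which, when executed by a processor, cause the processor to implement the method according to any one of claims 1 to 11.
- A computer program product, wherein the computer program product stores a computer program which, when loaded and executed by a processor, implements the method according to any one of claims 1 to 11.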
Priority Applications (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| EP20802358.0A EP3968627B1 (en) | 2019-05-09 | 2020-04-21 | Method for implanting information into video, computer device and storage medium |
| JP2021532214A JP7146091B2 (ja) | 2019-05-09 | 2020-04-21 | Method for embedding information into video, computer device and computer program |
| US17/394,579 US11785174B2 (en) | 2019-05-09 | 2021-08-05 | Method for implanting information into video, computer device and storage medium |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201910385878.4 | 2019-05-09 | | |
| CN201910385878.4A CN110121034B (zh) | 2019-05-09 | 2019-05-09 | Method, apparatus, device and storage medium for implanting information into video |
Related Child Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US17/394,579 Continuation US11785174B2 (en) | 2019-05-09 | 2021-08-05 | Method for implanting information into video, computer device and storage medium |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2020224428A1 (zh) | 2020-11-12 |
Family
ID=67522038
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2020/085939 Ceased WO2020224428A1 (zh) | 2019-05-09 | 2020-04-21 | Method for implanting information into video, computer device and storage medium |
Country Status (5)
| Country | Link |
|---|---|
| US (1) | US11785174B2 (zh) |
| EP (1) | EP3968627B1 (zh) |
| JP (1) | JP7146091B2 (zh) |
| CN (1) | CN110121034B (zh) |
| WO (1) | WO2020224428A1 (zh) |
Cited By (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN112672173A (zh) * | 2020-12-09 | 2021-04-16 | 上海东方传媒技术有限公司 | Method and system for occluding specific content in a live television signal |
| CN113486803A (zh) * | 2021-07-07 | 2021-10-08 | 北京沃东天骏信息技术有限公司 | Apparatus for embedding an image in a video |
| US11785174B2 (en) | 2019-05-09 | 2023-10-10 | Tencent Technology (Shenzhen) Company Limited | Method for implanting information into video, computer device and storage medium |
Families Citing this family (10)
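| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN111652207B (zh) * | 2019-09-21 | 2021-01-26 | 深圳久瀛信息技术有限公司 | Positioning-type data loading apparatus and method |
| CN113011227B (zh) * | 2019-12-19 | 2024-01-26 | 合肥君正科技有限公司 | Auxiliary detection method for avoiding false alarms during the background-update pre-judgment period in occlusion detection |
| CN111556336B (zh) * | 2020-05-12 | 2023-07-14 | 腾讯科技(深圳)有限公司 | Multimedia file processing method, apparatus, terminal device and medium |
| CN111556337B (zh) * | 2020-05-15 | 2021-09-21 | 腾讯科技(深圳)有限公司 | Media content implantation method, model training method and related apparatus |
| CN111556338B (zh) * | 2020-05-25 | 2023-10-31 | 腾讯科技(深圳)有限公司 | Method for detecting a region in a video, information fusion method, apparatus and storage medium |
| GB2599437A (en) * | 2020-10-02 | 2022-04-06 | Sony Europe Bv | Client devices, server, and methods |
| CN113989396B (zh) * | 2021-11-05 | 2025-12-16 | 北京字节跳动网络技术有限公司 | Picture rendering method, apparatus, device, storage medium and program product |
| CN115761598B (zh) * | 2022-12-20 | 2023-09-08 | 易事软件(厦门)股份有限公司 | Big data analysis method and system based on a cloud service platform |
| CN116939294B (zh) * | 2023-09-17 | 2024-03-05 | 世优(北京)科技有限公司 | Video implantation method and apparatus, storage medium and electronic device |
| CN116939293B (zh) * | 2023-09-17 | 2023-11-17 | 世优(北京)科技有限公司 | Implantation position detection method and apparatus, storage medium and electronic device |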
Citations (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN1144588A (zh) * | 1994-03-14 | 1997-03-05 | 美国赛特公司 | System for implanting an image into a video stream |
| US20100315510A1 (en) * | 2009-06-11 | 2010-12-16 | Motorola, Inc. | System and Method for Providing Depth Imaging |
| CN105191287A (zh) * | 2013-03-08 | 2015-12-23 | 吉恩-鲁克·埃法蒂卡迪 | Method of replacing objects in a video stream and computer program |
| CN107347166A (zh) * | 2016-08-19 | 2017-11-14 | 北京市商汤科技开发有限公司 | Video image processing method and apparatus, and terminal device |
| CN107493488A (zh) * | 2017-08-07 | 2017-12-19 | 上海交通大学 | Method for intelligent implantation of video content objects based on the Faster R-CNN model |
| CN108961304A (zh) * | 2017-05-23 | 2018-12-07 | 阿里巴巴集团控股有限公司 | Method for identifying moving foreground in a video and method for determining a target position in a video |
| CN110121034A (zh) * | 2019-05-09 | 2019-08-13 | 腾讯科技(深圳)有限公司 | Method, apparatus and storage medium for implanting information into video |
Family Cites Families (12)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2008008045A1 (en) * | 2006-07-11 | 2008-01-17 | Agency For Science, Technology And Research | Method and system for context-controlled background updating |
| US8477246B2 (en) * | 2008-07-11 | 2013-07-02 | The Board Of Trustees Of The Leland Stanford Junior University | Systems, methods and devices for augmenting video content |
| JP5994493B2 (ja) * | 2012-08-31 | 2016-09-21 | カシオ計算機株式会社 | Moving image foreground clipping device, method, and program |
| CN105654458A (zh) * | 2014-11-14 | 2016-06-08 | 华为技术有限公司 | Image processing method and apparatus |
| EP3433816A1 (en) * | 2016-03-22 | 2019-01-30 | URU, Inc. | Apparatus, systems, and methods for integrating digital media content into other digital media content |
| US20190130215A1 (en) * | 2016-04-21 | 2019-05-02 | Osram Gmbh | Training method and detection method for object recognition |
| US20180048894A1 (en) * | 2016-08-11 | 2018-02-15 | Qualcomm Incorporated | Methods and systems of performing lighting condition change compensation in video analytics |
| US10198621B2 (en) * | 2016-11-28 | 2019-02-05 | Sony Corporation | Image-Processing device and method for foreground mask correction for object segmentation |
| US11720745B2 (en) * | 2017-06-13 | 2023-08-08 | Microsoft Technology Licensing, Llc | Detecting occlusion of digital ink |
| US10646999B2 (en) * | 2017-07-20 | 2020-05-12 | Tata Consultancy Services Limited | Systems and methods for detecting grasp poses for handling target objects |
| CN108419115A (zh) * | 2018-02-13 | 2018-08-17 | 杭州炫映科技有限公司 | Advertisement implantation method |
| CN109461174B (zh) * | 2018-10-25 | 2021-01-29 | 北京陌上花科技有限公司 | Video target region tracking method, and video planar advertisement implantation method and system |
- 2019-05-09: CN, application CN201910385878.4A, publication CN110121034B (zh), status Active
- 2020-04-21: WO, application PCT/CN2020/085939, publication WO2020224428A1 (zh), status Ceased
- 2020-04-21: JP, application JP2021532214A, publication JP7146091B2 (ja), status Active
- 2020-04-21: EP, application EP20802358.0A, publication EP3968627B1 (en), status Active
- 2021-08-05: US, application US17/394,579, publication US11785174B2 (en), status Active
Also Published As
| Publication number | Publication date |
|---|---|
| US11785174B2 (en) | 2023-10-10 |
| JP2022531639A (ja) | 2022-07-08 |
| CN110121034B (zh) | 2021-09-07 |
| CN110121034A (zh) | 2019-08-13 |
| EP3968627A1 (en) | 2022-03-16 |
| US20210368112A1 (en) | 2021-11-25 |
| EP3968627A4 (en) | 2022-06-29 |
| EP3968627B1 (en) | 2025-10-29 |
| JP7146091B2 (ja) | 2022-10-03 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 20802358; Country of ref document: EP; Kind code of ref document: A1 |
| | ENP | Entry into the national phase | Ref document number: 2021532214; Country of ref document: JP; Kind code of ref document: A |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | ENP | Entry into the national phase | Ref document number: 2020802358; Country of ref document: EP; Effective date: 20211209 |
| | WWG | Wipo information: grant in national office | Ref document number: 2020802358; Country of ref document: EP |




