WO2022148319A1 - Video switching method, apparatus, storage medium and device - Google Patents
Video switching method, apparatus, storage medium and device
- Publication number
- WO2022148319A1 (PCT/CN2021/143821, CN2021143821W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image frame
- video
- switching
- target object
- image
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/234—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
- H04N21/23424—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving splicing one content stream with another content stream, e.g. for inserting or substituting an advertisement
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/44008—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/24—Aligning, centring, orientation detection or correction of the image
- G06V10/245—Aligning, centring, orientation detection or correction of the image by locating a pattern; Special marks for positioning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/74—Image or video pattern matching; Proximity measures in feature spaces
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/74—Image or video pattern matching; Proximity measures in feature spaces
- G06V10/761—Proximity, similarity or dissimilarity measures
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/168—Feature extraction; Face representation
- G06V40/171—Local features and components; Facial parts ; Occluding parts, e.g. glasses; Geometrical relationships
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/41—Structure of client; Structure of client peripherals
- H04N21/414—Specialised client platforms, e.g. receiver in car or embedded in a mobile appliance
- H04N21/41407—Specialised client platforms, e.g. receiver in car or embedded in a mobile appliance embedded in a portable device, e.g. video client on a mobile phone, PDA, laptop
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/47—End-user applications
- H04N21/472—End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
- H04N21/47205—End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content for manipulating displayed content, e.g. interacting with MPEG-4 objects, editing locally
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/85—Assembly of content; Generation of multimedia applications
- H04N21/854—Content authoring
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
Definitions
- the present application relates to the field of video, and in particular, to a video switching method, apparatus, storage medium and device.
- the present application provides a video switching method, which can perform video switching according to a target object without manual editing by post-production personnel.
- the technical solution is as follows:
- a first aspect provides a video switching method, the method including: determining a target object; calculating a similarity of the target object between a first image frame and a second image frame to obtain a similarity value, wherein the first image frame is from a first video and the second image frame is from a second video; acquiring a switching image frame, wherein the switching image frame includes a first image frame and a second image frame whose similarity value is greater than or equal to a preset threshold; and, according to the switching image frame, switching from the first image frame of the first video to the second image frame of the second video, or switching from the second image frame of the second video to the first image frame of the first video.
- the images are automatically aligned by calculating the similarity between the two image frames, thereby realizing video switching, and manual editing by post-production personnel is not required, which is convenient for users to use.
- calculating the similarity of the target object between the first image frame and the second image frame to obtain the similarity value includes: acquiring the features of the target object in the first image frame and in the second image frame; and calculating the distance between the features of the target object in the first image frame and in the second image frame to obtain the similarity value.
- the features of the target object include facial features of the target object and/or body posture features of the target object.
- the method further includes: providing an editing interface, where the editing interface includes the objects presented after recognition is performed on the first image frame and the second image frame; determining the target object then includes: determining the target object in response to the user's selection.
- the editing interface further includes one or more pairs of switching image frames for the user to select; switching, according to the switching image frame, from the first image frame of the first video to the second image frame of the second video, or from the second image frame of the second video to the first image frame of the first video, then includes: in response to the one or more pairs of switching image frames selected by the user, performing the switching according to the selected pair or pairs of switching image frames.
- a video switching device comprising: a determination module for determining a target object; a calculation module for calculating the similarity of the target object between a first image frame and a second image frame to obtain a similarity value, wherein the first image frame is from a first video and the second image frame is from a second video; an acquisition module for acquiring a switching image frame, wherein the switching image frame includes a first image frame and a second image frame whose similarity value is greater than or equal to a preset threshold; and a switching module configured to switch, according to the switching image frame, from the first image frame of the first video to the second image frame of the second video, or from the second image frame of the second video to the first image frame of the first video.
- the calculation module is specifically configured to: acquire the features of the target object in the first image frame and in the second image frame; and calculate the distance between the features of the target object in the first image frame and in the second image frame to obtain the similarity value.
- the features of the target object include facial features of the target object and/or body posture features of the target object.
- the device further includes: an editing module, configured to provide an editing interface, where the editing interface includes the objects presented after recognition is performed on the first image frame and the second image frame; the determination module is then specifically configured to: determine the target object in response to the user's selection.
- the editing interface further includes one or more pairs of switching image frames for the user to select; the switching module is then specifically configured to: in response to the one or more pairs of switching image frames selected by the user, switch, according to the selected pair or pairs of switching image frames, from the first image frame of the first video to the second image frame of the second video, or from the second image frame of the second video to the first image frame of the first video.
- the present application also provides an electronic device whose structure includes a processor and a memory; the memory is used to store a program that supports the electronic device in performing the video switching method provided by the first aspect and its optional implementations, and to store the data involved in implementing that method.
- the processor executes the program stored in the memory to execute the method provided by the foregoing first aspect and its optional implementation manners.
- the electronic device may also include a communication bus for establishing a connection between the processor and the memory.
- the present application further provides a computer-readable storage medium storing instructions that, when run on a computer, cause the computer to execute the video switching method described in the first aspect and its optional implementations.
- the effects of the technical solution include, but are not limited to, the following:
- the target object is determined, and the similarity of the target object between the first image frame and the second image frame is calculated to obtain a similarity value, wherein the first image frame is from the first video and the second image frame is from the second video; if the similarity value is greater than or equal to a preset threshold, a pair of switching image frames is obtained from the first image frame and the second image frame, and switching between the first video and the second video is performed according to the switching image frames, so that the image switching effect can be achieved very conveniently.
- FIG. 1 is a schematic structural diagram of a video switching device provided by an embodiment of the present application.
- FIG. 2 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
- FIG. 3 is a schematic diagram of an application of a video switching device provided by an embodiment of the present application.
- FIG. 4 is a schematic flowchart of a video switching method provided by an embodiment of the present application.
- FIG. 5a is a schematic diagram of an interactive interface provided by an embodiment of the present application.
- FIG. 5b is a schematic diagram of a video import interface provided by an embodiment of the present application.
- FIG. 5c is a schematic diagram of an object presentation interface provided by an embodiment of the present application.
- FIG. 6 is a flowchart of another video switching method provided by an embodiment of the present application.
- FIG. 7a is a schematic diagram of an editing interface provided by an embodiment of the present application.
- FIG. 7b is a schematic diagram of a first image frame image provided by an embodiment of the present application.
- FIG. 7c is a schematic diagram of a second image frame image provided by an embodiment of the present application.
- FIG. 8 is a schematic flowchart of another video switching method provided by an embodiment of the present application.
- 200-electronic device; 110-processor; 120-external memory interface; 121-internal memory; 130-USB interface; 140-charging management module; 141-power management module; 142-battery; 1-antenna; 2-antenna; 150-mobile communication module; 160-wireless communication module; 170-audio module; 170A-speaker; 170B-receiver; 170C-microphone; 170D-headphone jack; 180-sensor module; 193-camera; 194-display; 195-video codec; 100-video switching device; 10-determination module; 20-calculation module; 30-acquisition module; 40-switching module; 50-editing module; 501-Dock bar; 510-main interface; 511-status bar; 512-video import interface; 513-object presentation interface; 711-switch image frame interface; 712-first image frame display interface; 713-second image frame display interface.
- the video switching method provided by the embodiments of the present application can be used to perform video switching automatically, thereby reducing the technical skill a user needs in order to produce switched videos.
- the embodiment of the present application provides a video switching method, and the method is executed by a video switching apparatus.
- the function of the video switching apparatus can be realized by a software system, can also be realized by a hardware device, and can also be realized by a combination of a software system and a hardware device.
- the video switching device 100 can be logically divided into multiple modules, each of which can have a different function; the function of each module is implemented by a processor in the electronic device reading and executing computer instructions in a memory, and the structure of the electronic device can be that of the electronic device shown in FIG. 2 below.
- the video switching apparatus 100 may include a determination module 10 , a calculation module 20 , an acquisition module 30 and a switching module 40 .
- the video switching apparatus 100 may perform the contents described in steps S40-S44, steps S61-S62 and steps S81-S85 described below. It should be noted that the embodiments of the present application only exemplarily divide the structure and functional modules of the video switching apparatus 100 , but do not make any limitations on the specific division.
- the determining module 10 is used for determining the target object.
- the determined target object is used for subsequent video switching, and video switching is realized by aligning the target object.
- a video can be understood as a series of image frames displayed at a given frame rate; stopping at a particular frame in the sequence yields a single image frame, i.e., an image.
- the video may include objects.
- the video may be a video file recorded for a specific object, and the object may be a living body, such as a person or an animal, or a static item such as a book or a TV.
- the video may be a video recorded for a moving human body.
- Image recognition is performed on the image frame in the video, and the object included in the image frame is recognized.
- image frames in the video may be acquired frame by frame, and image recognition performed on each acquired frame to obtain the objects included in the video. It is also possible to acquire only some of the image frames in a video. For example, a video including a specific object can be acquired, and multiple image frames then captured from it, such as the frames at the 1st, 20th, and 34th seconds of the video; each image frame corresponds to specific time information. As another example, image frames can be captured from the video at fixed time intervals, for example every 10 seconds, yielding the frames at the 1st, 11th, 21st, etc. seconds of the video.
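The fixed-interval capture described above can be sketched as follows. This is only an illustrative sketch: the function name `sample_timestamps` and its parameters are hypothetical, and the starting point of 1 second and the 10-second interval are simply the values from the example in the text.

```python
def sample_timestamps(duration_s: float,
                      interval_s: float = 10.0,
                      start_s: float = 1.0) -> list[float]:
    """Return the timestamps (in seconds) at which frames would be captured.

    Hypothetical helper: starts at `start_s` and steps by `interval_s`
    until the end of the video is passed.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    stamps = []
    t = start_s
    while t <= duration_s:
        stamps.append(t)
        t += interval_s
    return stamps


# A 35-second video sampled every 10 seconds, starting at second 1.
print(sample_timestamps(35.0))  # [1.0, 11.0, 21.0, 31.0]
```

Each returned timestamp is the "specific time information" attached to one captured image frame; decoding the actual frame at that time is left to whatever video library is in use.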
- the recognized objects may include character A, character B, cat C, TV D, and the like.
- the determination of the target object may be determined according to the user's selection, for example, by presenting the object identified on the editing interface to the user, and the user determines the target object.
- alternatively, an object in the image frame that meets certain conditions can be taken as the target object by default; for example, the object located in the middle of the picture in the image frame is the target object by default.
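One way to read the default rule above is "pick the recognized object whose bounding box is closest to the centre of the frame". The sketch below assumes that interpretation; the `DetectedObject` type, the bounding-box representation, and the function `default_target` are all hypothetical and are not specified by the application.

```python
from dataclasses import dataclass


@dataclass
class DetectedObject:
    label: str
    # Assumed box format: (x_min, y_min, x_max, y_max) in pixels.
    box: tuple[float, float, float, float]


def default_target(objects: list[DetectedObject],
                   frame_width: float,
                   frame_height: float) -> DetectedObject:
    """Return the object whose box centre is nearest the frame centre."""
    if not objects:
        raise ValueError("no objects were recognized in the frame")
    cx, cy = frame_width / 2, frame_height / 2

    def squared_distance(obj: DetectedObject) -> float:
        x0, y0, x1, y1 = obj.box
        ox, oy = (x0 + x1) / 2, (y0 + y1) / 2
        return (ox - cx) ** 2 + (oy - cy) ** 2

    return min(objects, key=squared_distance)


objects = [
    DetectedObject("character A", (100, 100, 300, 500)),
    DetectedObject("cat C", (800, 400, 1100, 700)),
    DetectedObject("TV D", (1500, 100, 1900, 400)),
]
print(default_target(objects, 1920, 1080).label)  # cat C
```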
- the calculation module 20 is configured to calculate the similarity of the target object between the first image frame and the second image frame to obtain a similarity value.
- the first image frame is from a first video
- the second image frame is from a second video.
- the calculation module 20 is used to calculate the similarity of the target object between each first image frame in the first video and each second image frame in the second video; if the first video has 3 first image frames and the second video has 3 second image frames, 9 similarity values are obtained.
- Video switching can be performed on one or more videos. For example, when performing video switching on a single video, the video is split into two videos according to different scenes to obtain a first video and a second video.
- the first video includes multiple first image frames, and the second video includes multiple second image frames. To calculate the similarity of the target object between the first image frames and the second image frames, a first image frame of the first video can be obtained first, and the similarity of the target object between that first image frame and every second image frame in the second video calculated; the next first image frame of the first video is then obtained and the similarity of the target object between it and every second image frame in the second video calculated, and so on, until all the first image frames in the first video have been processed.
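The frame-by-frame comparison described above amounts to filling a table of similarity values, one per (first image frame, second image frame) pair. The sketch below is a minimal illustration under stated assumptions: the target object's features are represented as plain numeric tuples, the distance is Euclidean, and the mapping from distance to similarity, `1 / (1 + d)`, is an arbitrary choice made here for illustration. The application only says that a distance between features is calculated to obtain the similarity value.

```python
import math

Feature = tuple[float, ...]


def similarity(feat_a: Feature, feat_b: Feature) -> float:
    """Map the Euclidean feature distance to a similarity in (0, 1].

    The 1 / (1 + d) mapping is an assumption, not taken from the source.
    """
    return 1.0 / (1.0 + math.dist(feat_a, feat_b))


def similarity_matrix(first_feats: list[Feature],
                      second_feats: list[Feature]) -> list[list[float]]:
    """Entry [i][j] compares first image frame i with second image frame j."""
    return [[similarity(a, b) for b in second_feats] for a in first_feats]


# Toy features of the target object in 3 first frames and 3 second frames.
first_feats = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
second_feats = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]

matrix = similarity_matrix(first_feats, second_feats)
print(sum(len(row) for row in matrix))  # 9 similarity values
```

With 3 first image frames and 3 second image frames, the table holds 9 values, matching the count given in the text.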
- the acquisition module 30 is configured to acquire a switching image frame, wherein the switching image frame includes a first image frame and a second image frame whose similarity value is greater than or equal to a preset threshold. That is, if the similarity value of the target object between a first image frame and a second image frame is greater than or equal to the preset threshold, a pair of switching image frames is obtained, consisting of that first image frame and that second image frame.
- the pair of switching image frames can be understood as the switching position between the first video and the second video, or the position where the first video and the second video are connected, that is, after the first image frame of the first video is displayed, it switches to the second video.
- if the similarity value of the target object between the first image frame and the second image frame is greater than or equal to the preset threshold, the target object in the first image frame and the second image frame can be considered highly similar. When video switching is performed, aligning the target object in the first image frame and the second image frame lets the viewer focus on the target object and ignore changes in other objects, and because the target object is highly similar across the two frames, the video transition is smooth and natural.
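The threshold test above can be sketched as a filter over a table of similarity values. The function name `switching_pairs`, the tuple layout, and the example values and threshold are hypothetical; the only rule taken from the text is "greater than or equal to a preset threshold".

```python
def switching_pairs(sim: list[list[float]],
                    threshold: float) -> list[tuple[int, int, float]]:
    """Return (first_index, second_index, similarity) for every frame pair
    whose similarity value is greater than or equal to the threshold."""
    return [
        (i, j, value)
        for i, row in enumerate(sim)
        for j, value in enumerate(row)
        if value >= threshold
    ]


# sim[i][j]: similarity of the target object between first image frame i
# and second image frame j (illustrative numbers only).
sim = [
    [0.2, 0.9],
    [0.8, 0.5],
]
print(switching_pairs(sim, threshold=0.8))  # [(0, 1, 0.9), (1, 0, 0.8)]
```

Each returned entry is one candidate pair of switching image frames, that is, one candidate position at which the two videos could be joined.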
- the switching module 40 is configured to switch, according to the switching image frame, from the first image frame of the first video to the second image frame of the second video, or from the second image frame of the second video to the first image frame of the first video.
- the switching of the first video and the second video is realized according to the switching image frame.
- the image frames to be switched are located in the first video and the second video; that is, the image frame to be switched in the first video is the first image frame of the switching image frame, and the image frame to be switched in the second video is the second image frame of the switching image frame.
- the first image frame and the second image frame can be combined to realize switching between the first video and the second video. Alternatively, the first image frame and the second image frame can be joined, so that during playback the frame following the first image frame is the second image frame, or the frame following the second image frame is the first image frame.
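Joining the two videos at a pair of switching image frames can be sketched as list slicing. This is a simplified illustration: videos are modelled as Python lists of frames (strings stand in for image data), and the function `splice` and its index convention are hypothetical.

```python
def splice(outgoing: list, incoming: list, pair: tuple[int, int]) -> list:
    """Play `outgoing` up to and including frame pair[0], then continue
    with `incoming` from frame pair[1] onward."""
    out_idx, in_idx = pair
    return outgoing[: out_idx + 1] + incoming[in_idx:]


first_video = ["a0", "a1", "a2", "a3"]
second_video = ["b0", "b1", "b2", "b3"]

# Switching pair: first image frame a1 and second image frame b2.
# First video -> second video: the frame after a1 is b2.
print(splice(first_video, second_video, (1, 2)))
# ['a0', 'a1', 'b2', 'b3']

# Second video -> first video at the same pair: the frame after b2 is a1.
print(splice(second_video, first_video, (2, 1)))
# ['b0', 'b1', 'b2', 'a1', 'a2', 'a3']
```

The same pair supports both directions; only the order of the arguments and of the indices changes.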
- the video switching device 100 may further include an editing module 50, and the editing module 50 is configured to provide a user with an editing interface, where the editing interface includes objects for the user to select, and the objects include the objects recognized after image recognition is performed on each image frame in the video. That is, the video switching apparatus 100 recognizes the objects in the videos to be switched and presents the recognized objects through the editing interface, so that the target object can be selected from the objects presented in the editing interface.
- the editing interface also includes one or more pairs of switching image frames for the user to select. After the video switching apparatus 100 calculates the similarity of the target object between the first image frames of the first video and the second image frames of the second video, multiple pairs of switching image frames, that is, multiple switching positions between the first video and the second video, may be obtained.
- Video switching can be performed according to the selected switching image frame. For example, if two pairs of switching image frames are selected, after the first video is switched to the second video, the second video can also be switched to the first video.
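The two-pair case can be sketched in the same list-based style. Everything here is an assumption made for illustration: videos are lists of frames, each pair is written as (index in first video, index in second video), and the function name `splice_round_trip` is hypothetical.

```python
def splice_round_trip(first: list, second: list,
                      pair_out: tuple[int, int],
                      pair_back: tuple[int, int]) -> list:
    """Switch first -> second at pair_out, then second -> first at pair_back.

    Each pair is (index in first video, index in second video).
    """
    i_out, j_in = pair_out      # leave the first video at i_out, enter the second at j_in
    i_back, j_out = pair_back   # leave the second video at j_out, re-enter the first at i_back
    if j_out < j_in:
        raise ValueError("the second pair must not precede the first in the second video")
    return first[: i_out + 1] + second[j_in : j_out + 1] + first[i_back:]


first_video = ["a0", "a1", "a2", "a3", "a4", "a5"]
second_video = ["b0", "b1", "b2", "b3", "b4", "b5"]

print(splice_round_trip(first_video, second_video, (1, 2), (4, 3)))
# ['a0', 'a1', 'b2', 'b3', 'a4', 'a5']
```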
- some of the modules included in the video switching apparatus 100 may also be combined into one module.
- the acquisition module 30 and the switching module 40 may be combined into a video switching module.
- the video switching apparatus 100 described above can be flexibly deployed.
- the video switching apparatus 100 may be deployed on an electronic device, or may be a software apparatus deployed on a server or a virtual machine in a cloud data center, and the software apparatus may be used for video switching.
- the electronic device may include a mobile phone, smart watch, tablet computer, laptop computer, in-vehicle computer, desktop computer, wearable device, and the like.
- Please refer to FIG. 2, which takes a mobile phone as an example of the electronic device.
- the mobile phone shown in FIG. 2 is only an example and does not constitute a limitation on the mobile phone; a mobile phone may have more or fewer parts than shown in FIG. 2.
- FIG. 2 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
- the electronic device 200 may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, Antenna 2, mobile communication module 150, wireless communication module 160, audio module 170, speaker 170A, receiver 170B, microphone 170C, headphone jack 170D, sensor module 180, camera 193, display screen 194, etc.
- the structures illustrated in the embodiments of the present invention do not constitute a specific limitation on the electronic device 200 .
- the electronic device 200 may include more or fewer components than shown, or combine some components, or separate some components, or arrange the components differently.
- the illustrated components may be implemented in hardware, software, or a combination of software and hardware.
- the processor 110 may include one or more processing units; for example, the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a video codec 195, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU), etc. Different processing units may be independent devices, or may be integrated in one or more processors.
- Memory may also be provided in the processor 110 for storing computer instructions and data.
- the memory in processor 110 is cache memory.
- the memory may hold computer instructions or data that the processor 110 has just used or uses cyclically. If the processor 110 needs to use the computer instructions or data again, it can fetch them directly from this memory. This avoids repeated accesses and reduces the waiting time of the processor 110, thereby increasing the efficiency of the system.
- the video switching apparatus 100 runs in the processor 110, and the function of each module in the video switching apparatus 100 is implemented by the processor 110 reading and executing the relevant computer instructions, thereby realizing video switching.
- the video switching apparatus 100 may be deployed in a memory, and the processor 110 reads and executes computer instructions from the memory to implement the video switching.
- the processor 110 may include one or more interfaces.
- the interface may include an inter-integrated circuit (I2C) interface, an inter-integrated circuit sound (I2S) interface, a pulse code modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input/output (GPIO) interface, a subscriber identity module (SIM) interface, and/or a universal serial bus (USB) interface, etc.
- the charging management module 140 is used to receive charging input from the charger.
- the charger may be a wireless charger or a wired charger.
- the charging management module 140 may receive charging input from the wired charger through the USB interface 130 .
- the charging management module 140 may receive wireless charging input through a wireless charging coil of the electronic device 200 . While the charging management module 140 charges the battery 142 , the electronic device 200 can also be powered by the power management module 141 .
- the power management module 141 is used for connecting the battery 142 , the charging management module 140 and the processor 110 .
- the power management module 141 receives input from the battery 142 and/or the charging management module 140, and supplies power to the processor 110, the internal memory 121, the display screen 194, the camera 193, and the wireless communication module 160.
- the power management module 141 can also be used to monitor parameters such as battery capacity, battery cycle times, battery health status (leakage, impedance).
- the power management module 141 may also be provided in the processor 110 .
- the power management module 141 and the charging management module 140 may also be provided in the same device.
- the wireless communication function of the electronic device 200 may be implemented by the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, the modem processor, the baseband processor, and the like.
- Antenna 1 and Antenna 2 are used to transmit and receive electromagnetic wave signals.
- Each antenna in the electronic device 200 may be used to cover a single communication frequency band or multiple communication frequency bands. Different antennas can also be multiplexed to improve antenna utilization.
- the antenna 1 can be multiplexed as a diversity antenna of the wireless local area network. In other embodiments, the antenna may be used in conjunction with a tuning switch.
- the mobile communication module 150 can provide a wireless communication solution including 2G/3G/4G/5G, etc. applied on the electronic device 200 .
- the mobile communication module 150 may include one or more filters, switches, power amplifiers, low noise amplifiers (LNAs), and the like.
- the mobile communication module 150 can receive electromagnetic waves from the antenna 1, filter and amplify the received electromagnetic waves, and transmit them to the modem processor for demodulation.
- the mobile communication module 150 can also amplify the signal modulated by the modem processor and convert it into electromagnetic waves for radiation through the antenna 1.
- at least part of the functional modules of the mobile communication module 150 may be provided in the processor 110 .
- at least some of the functional modules of the mobile communication module 150 may be provided in the same device as at least some of the modules of the processor 110.
- the modem processor may include a modulator and a demodulator.
- the modulator is used to modulate the low frequency baseband signal to be sent into a medium and high frequency signal.
- the demodulator is used to demodulate the received electromagnetic wave signal into a low frequency baseband signal. Then the demodulator transmits the demodulated low-frequency baseband signal to the baseband processor for processing.
- the low frequency baseband signal is processed by the baseband processor and passed to the application processor.
- the application processor outputs sound signals through audio devices (not limited to the speaker 170A, the receiver 170B, etc.), or displays images or videos through the display screen 194 .
- the modem processor may be a stand-alone device.
- the modem processor may be independent of the processor 110, and may be provided in the same device as the mobile communication module 150 or other functional modules.
- the wireless communication module 160 can provide wireless communication solutions applied on the electronic device 200, including wireless local area networks (WLAN) (such as wireless fidelity (Wi-Fi) networks), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), infrared (IR), and the like.
- WLAN wireless local area networks
- BT Bluetooth
- GNSS global navigation satellite system
- FM frequency modulation
- NFC near field communication
- IR infrared technology
- the wireless communication module 160 may be one or more devices integrating one or more communication processing modules.
- the wireless communication module 160 receives electromagnetic waves via the antenna 2 , frequency modulates and filters the electromagnetic wave signals, and sends the processed signals to the processor 110 .
- the wireless communication module 160 can also receive the signal to be sent from the processor 110, perform frequency modulation on it, amplify it, and convert it into electromagnetic waves for radiation through the antenna 2.
- the antenna 1 of the electronic device 200 is coupled with the mobile communication module 150, and the antenna 2 is coupled with the wireless communication module 160, so that the electronic device 200 can communicate with the network and other devices through wireless communication technology.
- the wireless communication technology may include global system for mobile communications (GSM), general packet radio service (GPRS), code division multiple access (CDMA), wideband code division multiple access (WCDMA), time-division synchronous code division multiple access (TD-SCDMA), long term evolution (LTE), BT, GNSS, WLAN, NFC, FM, and/or IR technology, etc.
- the GNSS may include a global positioning system (GPS), a global navigation satellite system (GLONASS), a Beidou navigation satellite system (BDS), a quasi-zenith satellite system (QZSS), and/or satellite based augmentation systems (SBAS).
- GPS global positioning system
- GLONASS global navigation satellite system
- BDS Beidou navigation satellite system
- QZSS quasi-zenith satellite system
- SBAS satellite based augmentation systems
- the electronic device 200 implements a display function through a GPU, a display screen 194, an application processor, and the like.
- the GPU is a microprocessor for image processing, and is connected to the display screen 194 and the application processor.
- the GPU is used to perform mathematical and geometric calculations for graphics rendering.
- Processor 110 may include one or more GPUs that execute program instructions to generate or alter display information.
- Display screen 194 is used to display images, videos, and the like.
- Display screen 194 includes a display panel.
- the display panel can be a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), Mini-LED, Micro-LED, Micro-OLED, quantum dot light-emitting diodes (QLED), and so on.
- OLED organic light-emitting diode
- AMOLED active-matrix organic light-emitting diode
- FLED flexible light-emitting diode
- the electronic device 200 may include one or N display screens 194 , where N is a positive integer greater than one.
- the electronic device 200 can realize the shooting function through the ISP, the camera 193, the video codec 195, the GPU, the display screen 194 and the application processor.
- the ISP is used to process the data fed back by the camera 193 .
- when a photo is taken, the shutter opens and light is transmitted through the lens to the photosensitive element of the camera, where the light signal is converted into an electrical signal; the photosensitive element then transmits the electrical signal to the ISP, which processes it and converts it into an image visible to the naked eye.
- ISP can also perform algorithm optimization on image noise, brightness, and skin tone. ISP can also optimize the exposure, color temperature and other parameters of the shooting scene.
- the ISP may be provided in the camera 193 .
- Camera 193 is used to capture still images or video.
- an object generates an optical image through the lens, and the optical image is projected onto the photosensitive element.
- the photosensitive element may be a charge coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor.
- CMOS complementary metal-oxide-semiconductor
- the photosensitive element converts the optical signal into an electrical signal, and then transmits the electrical signal to the ISP to convert it into a digital image signal.
- the ISP outputs the digital image signal to the DSP for processing.
- DSP converts digital image signals into standard RGB, YUV and other formats of image signals.
- the electronic device 200 may include 1 or N cameras 193 , where N is a positive integer greater than 1.
- a digital signal processor is used to process digital signals; in addition to digital image signals, it can also process other digital signals. For example, when the electronic device 200 selects a frequency point, the digital signal processor is used to perform a Fourier transform on the frequency point energy, and the like.
- Video codec 195 is used to compress or decompress digital video.
- the electronic device 200 may support one or more video codecs 195 .
- the electronic device 200 can play or record videos in various encoding formats, such as moving picture experts group (MPEG) 1, MPEG2, MPEG3, MPEG4, and so on.
- MPEG moving picture experts group
- the external memory interface 120 can be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the electronic device 200 .
- the external memory card communicates with the processor 110 through the external memory interface 120 to realize the data storage function, for example to save files such as music and videos in the external memory card.
- Internal memory 121 may be used to store one or more computer programs including instructions.
- the processor 110 may execute the above-mentioned instructions stored in the internal memory 121, thereby causing the electronic device 200 to execute the video switching method provided in some embodiments of the present application, as well as various functional applications and data processing.
- the internal memory 121 may include a storage program area and a storage data area.
- the stored program area may store the operating system; the stored program area may also store one or more application programs (such as gallery, contacts, etc.) and the like.
- the storage data area may store data (such as photos, contacts, etc.) created during the use of the electronic device 200 and the like.
- the internal memory 121 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic disk storage devices, flash memory devices, universal flash storage (UFS), and the like.
- the processor 110 causes the electronic device 200 to perform the video switching method provided in the embodiments of the present application, as well as various functional applications and data processing, by executing the instructions stored in the internal memory 121 and/or the instructions stored in the memory provided in the processor.
- the electronic device 200 may implement audio functions, such as music playback and recording, through the audio module 170, the speaker 170A, the receiver 170B, the microphone 170C, the headphone jack 170D, and the application processor.
- the audio module 170 is used for converting digital audio information into analog audio signal output, and also for converting analog audio input into digital audio signal. Audio module 170 may also be used to encode and decode audio signals. In some embodiments, the audio module 170 may be provided in the processor 110 , or some functional modules of the audio module 170 may be provided in the processor 110 .
- Speaker 170A, also referred to as a "loudspeaker", is used to convert audio electrical signals into sound signals.
- the user can listen to music or to a hands-free call through the speaker 170A of the electronic device 200.
- the receiver 170B, also referred to as an "earpiece", is used to convert audio electrical signals into sound signals.
- a call or voice message can be listened to by placing the receiver 170B close to the human ear.
- the microphone 170C, also called a "mic", is used to convert sound signals into electrical signals.
- the user can speak with the mouth close to the microphone 170C to input the sound signal into the microphone 170C.
- the electronic device 200 may be provided with one or more microphones 170C.
- the electronic device 200 may be provided with two microphones 170C, which may implement a noise reduction function in addition to collecting sound signals. In other embodiments, the electronic device 200 may further be provided with three, four or more microphones 170C to collect sound signals, reduce noise, identify sound sources, and implement directional recording functions.
- the earphone jack 170D is used to connect wired earphones.
- the earphone jack 170D can be the USB interface 130, a 3.5mm open mobile terminal platform (OMTP) standard interface, or a cellular telecommunications industry association of the USA (CTIA) standard interface.
- OMTP open mobile terminal platform
- CTIA cellular telecommunications industry association of the USA
- the sensor module 180 may include a pressure sensor, a gyro sensor, an air pressure sensor, a magnetic sensor, an acceleration sensor, a distance sensor, a proximity light sensor, a fingerprint sensor, a temperature sensor, a touch sensor, an ambient light sensor, a bone conduction sensor, and the like.
- the touch sensor can be arranged on the display screen; the touch sensor and the display screen together form a touchscreen, also called a "touch screen".
- the above electronic device 200 may also include one or more components such as buttons, motors, indicators, and SIM card interfaces, which are not limited in this embodiment of the present application.
- when the video switching device is a hardware device, it may be the electronic device 200 described above, including a display screen 194, a processor 110 and an internal memory 121. The internal memory 121 may exist independently and be connected to the processor 110 through a communication bus.
- the internal memory 121 may also be integrated with the processor 110 .
- the internal memory 121 may store computer instructions, and when the computer instructions stored in the internal memory 121 are executed by the processor 110, the video switching method of the present application may be implemented.
- the internal memory 121 may also store data required by the processor in the process of executing the video switching method of the embodiment of the present application and the generated intermediate data and/or result data.
- FIG. 3 is a schematic diagram of an application of the video switching apparatus in this application.
- the function provided by the video switching apparatus 100 may be abstracted into an application, such as a video switching application, by an electronic equipment supplier or an application supplier.
- the electronic device supplier installs the video switching application on the electronic device 200, or the application supplier allows the user to purchase the video switching application.
- the user can use the video switching application installed on the electronic device 200, or download the video switching application from online, and the user can use the video switching application to perform video switching.
- FIG. 4 is a flowchart of a video switching method provided by an embodiment of the present application.
- the video switching method can be performed by the aforementioned video switching device. Referring to FIG. 4, the method includes the following steps:
- Step S40 Provide an editing interface to the user.
- the function of the video switching device is abstracted into a video switching application.
- the interface of the mobile phone includes a status bar 511 , a main interface 510 and a Dock bar 501 .
- the status bar 511 may include the operator's name (e.g., China Mobile), the time, the signal strength, the current remaining power, and the like. The content of the status bar 511 in the following figures is similar and will not be repeated here.
- the main interface 510 includes applications, including embedded applications and downloadable applications. As shown in FIG. 5a, the main interface 510 includes calendar, alarm clock, and video switching applications.
- the Dock bar 501 includes commonly used applications such as phone, information and camera.
- through the video import interface 512, the user can import the to-be-processed video that is to be edited for video switching.
- the user can obtain the to-be-processed video by reading a video from the gallery on the mobile phone, by shooting with the camera, or by downloading the corresponding video from a web page, which is not specifically limited in this application.
- videos for user selection are presented on the video import interface 512, including video 1, video 2 and video 3.
- the content of each video is different, and the duration can also be different, such as the duration of video 1.
- the durations of the videos may also be the same.
- the user can select one video or multiple videos. When the user selects one video, the video can be clipped into two or more videos according to recognition of the scenes in the video, or the user can determine the clipping positions and the number of clips. It can be understood that if the user selects one video for video switching, the target object in the clipped videos is the same but the scenes are different. A scene includes objects other than the target object, such as other characters and the background environment, and the background environment can include grassland, indoor settings, sky, stationary objects, and the like. For example, if a user shoots a video in which the scene changes from indoor to outdoor, the video can be clipped into an indoor video and an outdoor video.
- Video 1 and video 3 can be videos of the same dance performed by the same dancer shot at the same angle.
- the object in video 1 and video 3 is the same dancer, but the dancer's clothing, makeup or hairstyle differs, and the scenes in video 1 and video 3 are also different.
- Video 1 is the first video
- video 2 is the second video. It can be understood that video 2 may also be selected as a video to be processed; that is, the number of videos to be processed is not limited to two and is not specifically limited in this application.
- after determining the video to be processed, the video switching device performs image processing on the image frames in the to-be-processed video to identify the objects in the image frames.
- the image frames in the video can be acquired frame by frame, and image recognition is performed on the acquired image frames to obtain the objects in the video. It is also possible to acquire only some of the image frames in a video. For example, a video including specific video objects can be acquired, and multiple image frames can then be intercepted from it, such as the image frames at the 1st, 11th, 20th, and 34th seconds of the video, where each intercepted image frame corresponds to specific time information.
- for another example, multiple image frames can be intercepted from the video at a certain time interval: if the video is intercepted every 10 seconds, the image frames at the 1st, 11th, 21st, etc. seconds of the video are obtained.
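The fixed-interval interception described above can be sketched as follows. This is a minimal illustration, not the method of the application: the function names, the default start second, and the frame rate are assumptions introduced only for the example.

```python
def sample_timestamps(duration_s, start_s=1, interval_s=10):
    """Return the second marks at which image frames would be intercepted.

    start_s and interval_s are illustrative defaults (1st second, every 10 s).
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return list(range(start_s, int(duration_s) + 1, interval_s))


def frame_index(timestamp_s, fps):
    """Map a timestamp in seconds to a zero-based frame index (assumed constant fps)."""
    return int(round(timestamp_s * fps))


# A 35-second video sampled every 10 seconds starting at the 1st second.
marks = sample_timestamps(35)
indices = [frame_index(t, fps=30) for t in marks]
print(marks, indices)
```

Each sampled mark keeps its time information, so a later step can map a matched image frame back to its position in the video.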
- the editing interface may further include an object presentation interface 513, on which the recognition results of the objects in the video to be processed are presented, including the faces of object A, object B, and object C. It can be understood that object A, object B and object C are the objects recognized after image recognition is performed on the image frames in video 1 and video 2.
- the object presentation interface 513 may present the face or the whole of the character.
- the image frame of video 1 includes object A and/or object B and/or object C
- the image frame of video 2 includes object A and/or object B and/or object C
- the image frames of video 1 and video 2 include object A and/or object B and/or object C.
- the recognized objects include object A, object B and object C.
- the number of recognized objects is not limited, and the number of recognized objects is determined by the actual number of objects in the video.
- when the video to be processed is a single segment of video, the video switching device clips this segment into multiple segments, and the video to be processed is then the multiple clipped segments. The video switching device then screens out image frames in which the same person has the same facial expression and body posture and which are separated by a certain time interval. Because a person's expression and posture change very little within a short period of time, the screened image frames need to be separated by a certain time interval to achieve the best effect.
- Step S41 Determine the target object.
- according to the target person, the video switching device can screen out the first image frames and second image frames in which the same target person has the same facial expression and the same body posture.
- the image frames in the first video can be processed frame by frame for face recognition, and whether a face belongs to the target object is determined by combining the RGB data of the face in the image frame with a face recognition algorithm.
- the processing of the image frames in the second video is the same, and details are not repeated here.
- a rectangular frame of the human face can be obtained, and the face in the rectangular frame can then be identified by using face recognition technology. For example, Face ID technology can be used to label the face to determine which person in the video the face belongs to, and then to determine the target object in the first image frame.
- the processing of the second image frame is the same, and details are not repeated here.
- object A, object B, and object C are presented in the object presentation interface 513, and the user can click on object A to select and determine object A as the target object.
- the video switching device may automatically determine the object located in the center of the image frame as the target object.
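One way to read "the object located in the center of the image frame" is the object whose bounding-box center is nearest to the frame center. The sketch below assumes that reading; the function name, the `(x, y, w, h)` box layout, and the sample coordinates are hypothetical and not taken from the application.

```python
import math


def pick_center_object(boxes, frame_w, frame_h):
    """Return the label of the object whose box center is closest to the frame center.

    boxes maps a label to an (x, y, w, h) rectangle in pixels (assumed layout).
    Returns None when no object was recognized.
    """
    if not boxes:
        return None
    cx, cy = frame_w / 2.0, frame_h / 2.0

    def distance_to_center(item):
        _, (x, y, w, h) = item
        return math.hypot(x + w / 2.0 - cx, y + h / 2.0 - cy)

    return min(boxes.items(), key=distance_to_center)[0]


# Hypothetical face rectangles for object A, object B and object C in a 1920x1080 frame.
boxes = {
    "A": (900, 400, 120, 120),
    "B": (100, 100, 100, 100),
    "C": (1600, 800, 100, 100),
}
target = pick_center_object(boxes, 1920, 1080)
print(target)
```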
- Step S42 Calculate the similarity of the target object between the first image frame and the second image frame to obtain a similarity value.
- a certain first image frame A1 in the first video may be acquired first and then compared with all the second image frames in the second video. For example, select a second image frame B1 in the second video and calculate the similarity of the target object between the first image frame A1 and the second image frame B1; then obtain the next second image frame B2 in the second video and calculate the similarity of the target object between the first image frame A1 and the second image frame B2; and so on, until the similarity of the target object between the first image frame A1 and all the second image frames in the second video has been calculated.
- then obtain the next first image frame A2 in the first video and calculate the similarity of the target object between the first image frame A2 and the second image frame B1, and so on, to calculate the similarity of the target object between the first image frames and the second image frames.
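The exhaustive comparison described above amounts to a nested loop over the first and second image frames. The following sketch assumes each frame has already been reduced to a feature vector of the target object; `similarity_table` and `toy_similarity` are hypothetical names, and the placeholder similarity function merely stands in for whatever measure is actually used.

```python
def similarity_table(first_frames, second_frames, similarity):
    """Compare every first image frame with every second image frame.

    first_frames / second_frames map a frame id to the target object's feature vector.
    Returns a dict keyed by (first_id, second_id).
    """
    table = {}
    for a_id, a_feat in first_frames.items():
        for b_id, b_feat in second_frames.items():
            table[(a_id, b_id)] = similarity(a_feat, b_feat)
    return table


def toy_similarity(u, v):
    """Placeholder measure: 1 / (1 + sum of squared differences)."""
    return 1.0 / (1.0 + sum((x - y) ** 2 for x, y in zip(u, v)))


first = {"A1": [0.0, 1.0], "A2": [1.0, 1.0]}
second = {"B1": [0.0, 1.0], "B2": [1.0, 0.0]}
table = similarity_table(first, second, toy_similarity)
print(table)
```

The cost grows with the product of the two frame counts, which is one reason sampling frames at a time interval, as described earlier, can be useful.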
- the similarity of the target object can be calculated in the following manner to obtain the similarity value.
- Step S61 Acquire the characteristics of the target object in the first image frame and the second image frame.
- Step S62 Calculate the distance of the feature of the target object between the first image frame and the second image frame to obtain a similarity value.
- the features of the target object, such as facial features and/or body posture features of the target object, may be acquired.
- one or more of two-dimensional features, three-dimensional features, and face grids of the human face can be obtained. For example, the distance between the two-dimensional features of the human face in the first image frame and the two-dimensional features of the human face in the second image frame is calculated to obtain a distance measure, and the similarity value is then obtained according to the distance measure.
- the distance measures of the above features can also be combined, and the final similarity value can be obtained after processing.
- the distance may be Euclidean distance, cosine distance, etc., which is not specifically limited in this application. It can be understood that the distance metric is used to measure the distance between individuals in space, and the farther the distance is, the greater the difference between individuals.
- the similarity measure is to calculate the degree of similarity between individuals. Contrary to the distance measure, the smaller the value of the similarity measure, the smaller the similarity between individuals and the greater the difference.
- the distance between the facial features and/or the body posture features of the target object may be calculated to ensure that the target object's face is similar in the first image frame and the second image frame, or that the target object's body posture is similar in the first image frame and the second image frame, or that both the face and the body posture of the target object are similar in the two image frames.
- the similarity error value of the target object between the first image frame and the second image frame may be calculated, that is, the distance between the facial features and/or the body posture features of the target object may be calculated, and the similarity value is obtained according to the similarity error value. It can be understood that the larger the similarity error value, the smaller the similarity between individuals and the greater the difference.
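Steps S61 and S62 can be illustrated with plain feature vectors. The text names Euclidean and cosine distance but does not specify how a distance becomes a similarity value or how face and posture distances are combined, so the mapping `1 / (1 + d)` and the weighted sum below are assumptions chosen only so that a larger distance yields a smaller similarity.

```python
import math


def euclidean_distance(u, v):
    """Straight-line distance between two feature vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine_distance(u, v):
    """One minus the cosine of the angle between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    if norm == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / norm


def similarity_from_distance(d):
    """Assumed mapping: distance 0 gives similarity 1, larger distances give less."""
    return 1.0 / (1.0 + d)


def combined_similarity(face_a, face_b, pose_a, pose_b, face_weight=0.5):
    """Assumed combination: weighted sum of face and posture distances."""
    d = (face_weight * euclidean_distance(face_a, face_b)
         + (1.0 - face_weight) * euclidean_distance(pose_a, pose_b))
    return similarity_from_distance(d)


print(euclidean_distance([0, 0], [3, 4]))
print(combined_similarity([0, 0], [3, 4], [1, 1], [1, 1]))
```

Here the distance plays the role of the similarity error value: as it grows, the similarity value returned by `similarity_from_distance` shrinks.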
- Step S43 Acquire a switching image frame, wherein the switching image frame includes a first image frame and a second image frame whose similarity value is greater than or equal to a preset threshold.
- when the similarity value is greater than or equal to the preset threshold, the features of the target object in the first image frame and the second image frame are similar, for example the facial features and/or body posture are similar, while the scenes of the first image frame and the second image frame, or the clothing, hairstyle, etc. of the target object, may not be similar.
- a pair of switching image frames can be obtained, including a first image frame and a second image frame.
- multiple pairs of switching image frames can be obtained.
- for example, one pair of switching image frames includes the first image frame A1 and the second image frame B1.
- if the similarity value between the target object in the first image frame A2 of the first video and the target object in the second image frame B2 of the second video is greater than the preset threshold, another pair of switching image frames is obtained, including the first image frame A2 and the second image frame B2.
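Step S43 can be sketched as a filter over precomputed similarity values. The table contents and the threshold below are made-up numbers for illustration, and `select_switching_pairs` is a hypothetical helper; sorting the result is a convenience for presenting candidates, not something the text requires.

```python
def select_switching_pairs(table, threshold):
    """Keep the (first frame, second frame) pairs whose similarity reaches the threshold.

    table maps (first_id, second_id) to a similarity value.
    Returns (first_id, second_id, similarity) tuples, most similar first.
    """
    pairs = [(a, b, s) for (a, b), s in table.items() if s >= threshold]
    pairs.sort(key=lambda p: p[2], reverse=True)
    return pairs


# Hypothetical similarity values between first image frames A1, A2 and second image frames B1, B2.
table = {
    ("A1", "B1"): 0.93,
    ("A1", "B2"): 0.41,
    ("A2", "B1"): 0.38,
    ("A2", "B2"): 0.90,
}
pairs = select_switching_pairs(table, threshold=0.9)
print(pairs)
```

With these numbers the result contains the pairs (A1, B1) and (A2, B2), matching the example in the text.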
- the editing interface may further include a switching image frame interface 711 (as shown in FIG. 7 a ), and the user may select an image frame to be switched through the switching image frame interface 711 .
- the switching image frame 1 and the switching image frame 2 are presented on the switching image frame interface 711.
- the switching image frame interface 711 may include multiple pairs of switching image frames.
- the switching image frame 1 includes the first image frame A100 and the second image frame B200; the image frame connected after the first image frame A100 is the second image frame B200, or the image frame connected after the second image frame B200 is the first image frame A100.
- Step S44 According to the switching image frame, switch from the first image frame of the first video to the second image frame of the second video, or switch from the second image frame of the second video to the first image frame of the first video.
- the switching of the first video to the second video according to the switching image frame may be implemented as switching the first video and the second video according to the obtained switching image frame, or as switching the first video and the second video according to the switching image frame selected by the user.
- the switching between the first video and the second video according to the switching image frame may specifically include: determining the switching positions of the first video and the second video according to the switching image frame, and performing the video switching according to the switching positions of the first video and the second video.
- a pair of switching image frames includes a first image frame A10 and a second image frame B10, and the switching between the first video and the second video is implemented according to them: the first image frame A10 is connected to the second image frame B10, that is, the image frame after the first image frame A10 is the second image frame B10, or the image frame after the second image frame B10 is the first image frame A10. Playback thus switches to the second image frame B10 when the first image frame A10 is played, or switches to the first image frame A10 when the second image frame B10 is played.
- the first image frame display interface 712 in FIG. 7b presents the image of the first image frame
- the second image frame display interface 713 in FIG. 7c presents the image of the second image frame
- the image of the first image frame includes the target object, the grass and the cloud
- the image of the second image frame in FIG. 7c includes the target object
- the face and/or body posture of the target object in the image of the first image frame is similar to that in the image of the second image frame, but the scenes are different, for example the target object's clothing and the background are different.
- the image frames can be combined into one video according to the switching, so as to realize the switching of the first video and the second video. If all the obtained switching image frames are merged into one video, the first image frame and the second image frame are adjacent in the merged video, and the merged video is different from the first video and the second video. different.
- some image frames may be appropriately added, for example, two pairs of switching image frames are obtained, wherein a pair of switching image frames includes a first image frame A10 and a second image frame B20, and the other The pair of switching image frames includes a first image frame A31 and a second image frame B41. Then, according to the combination of switching image frames into one video, the first image frames A1 to A9 before the first image frame A10 can be obtained, the second image frames B21 to B40 after the second image frame B20 can be obtained, and after the first image frame A31 can be obtained. of the first image frame.
- the first image frames A1 to A10, the second image frames B20 to B41, the first image frame A31 and the subsequent first image frames may be combined into one video.
- When the merged video is played, display starts from the first image frame A1 and proceeds sequentially to the first image frame A10. After the first image frame A10 is displayed, playback switches to the second image frame B20 instead of continuing with the first image frame A11, and then plays the second image frames B21 to B40 that follow B20. After the second image frame B41 is displayed, playback switches to the first image frame A31, and then the image frames after the first image frame A31 are played and displayed.
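The A10/B20 and A31/B41 example can be sketched in a few lines of Python. This is only an illustration of the playback order described here, not the applicant's implementation; the function name `merge_frames`, the string frame labels, and the `last_a` parameter (the index of the last frame of the first video) are assumptions introduced for the sketch.

```python
from typing import List, Tuple


def merge_frames(pairs: List[Tuple[int, int]], last_a: int) -> List[str]:
    """Build the playback order of a merged video from two switching pairs.

    pairs[0] = (a_out, b_in): leave the first video after A<a_out>, enter at B<b_in>.
    pairs[1] = (a_in, b_out): leave the second video after B<b_out>, re-enter at A<a_in>.
    last_a is a hypothetical index of the final frame of the first video.
    """
    (a_out, b_in), (a_in, b_out) = pairs
    order = [f"A{i}" for i in range(1, a_out + 1)]       # A1 .. A10
    order += [f"B{i}" for i in range(b_in, b_out + 1)]   # B20 .. B41
    order += [f"A{i}" for i in range(a_in, last_a + 1)]  # A31 .. end
    return order


order = merge_frames([(10, 20), (31, 41)], last_a=35)
print(order[8:12])   # frames around the first switch
print(order[30:34])  # frames around the second switch
```

Frames A11 to A30 never appear in the merged order, which matches the statement that playback does not continue to A11 after A10.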
- The target objects in a pair of switching image frames may be aligned. For example, the target object in the first image frame and the target object in the second image frame are placed at the same position within the frame, so that when playback switches from the first image frame to the second image frame, the user sees little visible change in the target object.
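One simple way to picture the alignment is as a translation that moves the target object's bounding box in the second frame onto the position it has in the first frame. The sketch below assumes boxes in `(x, y, width, height)` form and aligns box centers only; the text does not say which alignment method is used, so both the box format and the center-based approach are assumptions.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x, y, width, height), assumed format


def box_center(box: Box) -> Tuple[float, float]:
    """Return the center point of a bounding box."""
    x, y, w, h = box
    return (x + w / 2.0, y + h / 2.0)


def alignment_offset(box_first: Box, box_second: Box) -> Tuple[float, float]:
    """Translation (dx, dy) to apply to the second frame so that the
    target object's center coincides with its center in the first frame."""
    cx1, cy1 = box_center(box_first)
    cx2, cy2 = box_center(box_second)
    return (cx1 - cx2, cy1 - cy2)


# Hypothetical boxes: the target sits 20 px right and 10 px lower in the second frame.
print(alignment_offset((100, 50, 40, 40), (120, 60, 40, 40)))
```

A fuller alignment would probably also account for scale and rotation; translation alone is the smallest case that shows the idea.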
- If the video to be processed includes three video segments, a pair of switching image frames can be obtained for video 1 and video 2, then a pair for video 2 and video 3, then a pair for video 3 and video 1, and so on.
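The order in which video segments are paired can be written as a cyclic pairing. The helper below is a hypothetical sketch (the name `video_pairs` is not from the source) that lists which two segments to search for switching image frames at each step.

```python
from typing import List, Tuple


def video_pairs(names: List[str]) -> List[Tuple[str, str]]:
    """Pair each video segment with the next one, wrapping around at the end."""
    n = len(names)
    if n < 2:
        return []  # nothing to switch between
    return [(names[i], names[(i + 1) % n]) for i in range(n)]


print(video_pairs(["video 1", "video 2", "video 3"]))
```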
- the video switching method can automatically realize video switching.
- The user only needs to input the video to be processed and determine the target object; video switching is then performed automatically according to the features of the target object, which reduces manual labor and wasted time.
- FIG. 8 is a schematic flowchart of video switching provided by an embodiment of the present application.
- The description below takes a person's face as the target object.
- Step S81: Obtain the face rectangle from the RGB image.
- The RGB images of the first image frame and the second image frame are acquired, and image processing is performed on both RGB images to obtain the face rectangle; that is, the region where the face is located in each RGB image is identified and enclosed in a face rectangle.
- Step S82: Label the face using recognition technology to determine the target object.
- The face inside the face rectangle is identified by face recognition technology; for example, the identified face is labeled using Face ID technology, so as to determine which person in the video the face belongs to.
- The video switching device can take the face located in the middle of the screen as the target object according to the position of the face, or the user can specify which person is the target object.
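Choosing the face in the middle of the screen can be sketched as picking the labeled face rectangle whose center is closest to the frame center. The function name, the `(label, box)` input format, and the use of Euclidean distance are assumptions made for illustration; the source only states that the face located in the middle of the screen may be chosen.

```python
import math
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x, y, width, height), assumed format


def pick_center_face(faces: List[Tuple[str, Box]], frame_w: int, frame_h: int) -> str:
    """Return the label of the face whose rectangle center is nearest the frame center."""
    cx, cy = frame_w / 2.0, frame_h / 2.0

    def distance_to_center(item: Tuple[str, Box]) -> float:
        _, (x, y, w, h) = item
        return math.hypot(x + w / 2.0 - cx, y + h / 2.0 - cy)

    return min(faces, key=distance_to_center)[0]


# Hypothetical labeled face rectangles in a 1920x1080 frame.
faces = [("person_a", (0, 0, 100, 100)), ("person_b", (900, 500, 120, 120))]
print(pick_center_face(faces, 1920, 1080))
```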
- Step S83: Calculate the 2D facial feature points and/or the 3D facial feature points and/or the face mesh.
- After step S82 determines that the target object in the first image frame and the second image frame is the same person, the similarity of the target object between the first image frame and the second image frame is calculated.
- Obtain the two-dimensional feature points of the face of the target object in the first image frame and in the second image frame, then calculate the distance between the two-dimensional feature points of the target object in the two image frames to obtain a similarity error value, which determines the difference value of the target object's face between the two image frames.
- Alternatively, obtain the three-dimensional feature points of the face of the target object in the first image frame and in the second image frame, then calculate the distance between the three-dimensional feature points of the target object in the two image frames to obtain a similarity error value, which determines the difference value of the target object's face between the two image frames.
- Alternatively, obtain the face mesh points of the target object in the first image frame and in the second image frame, then calculate the distance between the mesh points of the target object in the two image frames to obtain a similarity error value, which determines the difference value of the target object's face between the two image frames.
- the similarity error value may be obtained according to the difference value between the two-dimensional feature points and/or the difference value between the three-dimensional feature points and/or the difference value between the grid points.
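The distance-based calculation in step S83 can be illustrated as follows. The sketch assumes that corresponding feature points are given in the same order in both frames, uses the mean Euclidean distance as the difference value, and combines the 2D, 3D and mesh difference values with a weighted average. The source does not specify the distance metric or how the difference values are combined, so these choices, and the function names, are assumptions.

```python
import math
from typing import List, Optional, Sequence

Point = Sequence[float]  # a 2D or 3D point


def mean_point_distance(points_a: List[Point], points_b: List[Point]) -> float:
    """Mean Euclidean distance between corresponding points of two frames."""
    if not points_a or len(points_a) != len(points_b):
        raise ValueError("point lists must be non-empty and of equal length")
    total = sum(math.dist(p, q) for p, q in zip(points_a, points_b))
    return total / len(points_a)


def similarity_error(diffs: List[float], weights: Optional[List[float]] = None) -> float:
    """Combine the available difference values (2D, 3D, mesh) into one error value.

    A weighted average is an assumed combination rule, not taken from the source.
    """
    if weights is None:
        weights = [1.0] * len(diffs)
    return sum(d * w for d, w in zip(diffs, weights)) / sum(weights)


# Hypothetical 2D and 3D feature points of the same face in two frames.
diff_2d = mean_point_distance([(0, 0), (3, 4)], [(0, 0), (0, 0)])
diff_3d = mean_point_distance([(1, 2, 2)], [(0, 0, 0)])
print(diff_2d, diff_3d, similarity_error([diff_2d, diff_3d]))
```

Because `math.dist` accepts points of any matching dimension, the same helper covers 2D feature points, 3D feature points, and mesh vertices.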
- Step S84: Determine whether the similarity error value is less than or equal to the error threshold.
- Step S85: Obtain the switching image frames.
- If the similarity error value is less than or equal to the error threshold, the two image frames are selected as a pair of switching image frames. Video switching is then performed according to the switching image frames.
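Steps S84 and S85 amount to a threshold filter over candidate frame pairs. In the sketch below, `errors` maps a hypothetical `(first_frame_index, second_frame_index)` pair to its similarity error value; the data layout, the function name, and the example threshold are illustrative assumptions.

```python
from typing import Dict, List, Tuple

FramePair = Tuple[int, int]  # (index in first video, index in second video)


def select_switching_pairs(errors: Dict[FramePair, float], threshold: float) -> List[FramePair]:
    """Keep the frame pairs whose similarity error is at or below the threshold."""
    return [pair for pair, err in sorted(errors.items()) if err <= threshold]


# Hypothetical error values for three candidate pairs.
errors = {(10, 20): 0.8, (11, 20): 3.2, (31, 41): 1.0}
print(select_switching_pairs(errors, threshold=1.0))
```

Note that the claims phrase the same test in terms of a similarity value that is greater than or equal to a preset threshold, whereas this step uses an error value that must not exceed an error threshold.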
- The video switching device automatically and fully aligns the faces in the two image frames to realize the switching effect, so that the two videos are joined seamlessly: the faces in the two frames being switched are highly similar while the environment changes, which produces a striking visual effect.
- the computer program product for realizing video switching includes one or more computer instructions for performing video switching.
- When the computer instructions are loaded and executed on a computer, the processes or functions described in FIG. 4 and FIG. 6 according to the embodiments of the present application are generated in whole or in part.
- the computer may be a general purpose computer, special purpose computer, computer network, or other programmable device.
- The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless means (e.g., infrared, radio, microwave).
- The computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device such as a server or a data center that integrates one or more available media.
- The available media may be magnetic media (e.g., floppy disk, hard disk, magnetic tape), optical media (e.g., digital versatile disc (DVD)), or semiconductor media (e.g., solid state disk (SSD)), etc.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Health & Medical Sciences (AREA)
- Signal Processing (AREA)
- General Health & Medical Sciences (AREA)
- Human Computer Interaction (AREA)
- Oral & Maxillofacial Surgery (AREA)
- Databases & Information Systems (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Software Systems (AREA)
- Medical Informatics (AREA)
- Computing Systems (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- Marketing (AREA)
- Business, Economics & Management (AREA)
- Life Sciences & Earth Sciences (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Computer Security & Cryptography (AREA)
- Television Signal Processing For Recording (AREA)
- Studio Devices (AREA)
Claims (12)
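- A video switching method, characterized in that the method comprises: determining a target object; calculating the similarity of the target object between a first image frame and a second image frame to obtain a similarity value, wherein the first image frame is from a first video and the second image frame is from a second video; obtaining switching image frames, wherein the switching image frames comprise a first image frame and a second image frame whose similarity value is greater than or equal to a preset threshold; and, according to the switching image frames, switching from the first image frame of the first video to the second image frame of the second video, or switching from the second image frame of the second video to the first image frame of the first video.
- The method according to claim 1, characterized in that calculating the similarity of the target object between the first image frame and the second image frame to obtain the similarity value comprises: obtaining features of the target object in the first image frame and the second image frame; and calculating the distance between the features of the target object in the first image frame and the second image frame to obtain the similarity value.
- The method according to claim 2, characterized in that the features of the target object comprise facial features of the target object and/or body posture features of the target object.
- The method according to any one of claims 1 to 3, characterized in that the method further comprises: providing an editing interface, wherein the editing interface comprises objects presented after recognition is performed on the first image frame and the second image frame; and determining the target object comprises: determining the target object in response to a selection by the user.
- The method according to claim 4, characterized in that the editing interface further comprises one or more pairs of switching image frames for the user to select; and switching, according to the switching image frames, from the first image frame of the first video to the second image frame of the second video, or from the second image frame of the second video to the first image frame of the first video, comprises: in response to the one or more pairs of switching image frames selected by the user, switching, according to the one or more pairs of switching image frames, from the first image frame of the first video to the second image frame of the second video, or from the second image frame of the second video to the first image frame of the first video.
- A video switching apparatus, characterized in that the apparatus comprises: a determining module, configured to determine a target object; a calculating module, configured to calculate the similarity of the target object between a first image frame and a second image frame to obtain a similarity value, wherein the first image frame is from a first video and the second image frame is from a second video; an obtaining module, configured to obtain switching image frames, wherein the switching image frames comprise a first image frame and a second image frame whose similarity value is greater than or equal to a preset threshold; and a switching module, configured to switch, according to the switching image frames, from the first image frame of the first video to the second image frame of the second video, or from the second image frame of the second video to the first image frame of the first video.
- The apparatus according to claim 6, characterized in that the calculating module is specifically configured to: obtain features of the target object in the first image frame and the second image frame; and calculate the distance between the features of the target object in the first image frame and the second image frame to obtain the similarity value.
- The apparatus according to claim 7, characterized in that the features of the target object comprise facial features of the target object and/or body posture features of the target object.
- The apparatus according to any one of claims 6 to 8, characterized in that the apparatus further comprises: an editing module, configured to provide an editing interface, wherein the editing interface comprises objects presented after recognition is performed on the first image frame and the second image frame; and the determining module is specifically configured to: determine the target object in response to a selection by the user.
- The apparatus according to claim 9, characterized in that the editing interface further comprises one or more pairs of switching image frames for the user to select; and the switching module is specifically configured to: in response to the one or more pairs of switching image frames selected by the user, switch, according to the one or more pairs of switching image frames, from the first image frame of the first video to the second image frame of the second video, or from the second image frame of the second video to the first image frame of the first video.
- A computer-readable storage medium, characterized in that the computer-readable storage medium stores computer program code, and when the computer program code is executed by an electronic device, the electronic device performs the method according to any one of claims 1 to 5.
- An electronic device, characterized in that the electronic device comprises a processor and a memory, the memory is configured to store a set of computer instructions, and when the processor executes the set of computer instructions, the electronic device performs the method according to any one of claims 1 to 5.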
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US18/260,192 US12401836B2 (en) | 2021-01-05 | 2021-12-31 | Video switching method and apparatus, storage medium, and device |
| EP21917358.0A EP4266208A4 (en) | 2021-01-05 | 2021-12-31 | VIDEO SWITCHING METHOD AND APPARATUS, STORAGE MEDIUM AND APPARATUS |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202110008033.0 | 2021-01-05 | ||
| CN202110008033.0A CN114724055B (zh) | 2021-01-05 | 2021-01-05 | Video switching method and apparatus, storage medium, and device |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2022148319A1 true WO2022148319A1 (zh) | 2022-07-14 |
Family
ID=82234015
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2021/143821 Ceased WO2022148319A1 (zh) | 2021-01-05 | 2021-12-31 | Video switching method and apparatus, storage medium, and device |
Country Status (4)
| Country | Link |
|---|---|
| US (1) | US12401836B2 (zh) |
| EP (1) | EP4266208A4 (zh) |
| CN (1) | CN114724055B (zh) |
| WO (1) | WO2022148319A1 (zh) |
Families Citing this family (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US12549784B2 (en) * | 2022-04-29 | 2026-02-10 | Rajiv Trehan | Method and system of generating on-demand video of interactive activities |
| CN115243023B (zh) * | 2022-07-20 | 2025-02-07 | 展讯通信(上海)有限公司 | Image processing method and apparatus, electronic device, and storage medium |
| US12423976B2 (en) * | 2022-07-25 | 2025-09-23 | Motorola Solutions, Inc. | Device, system, and method for altering video streams to identify objects of interest |
| CN116095221B (zh) * | 2022-08-10 | 2023-11-21 | 荣耀终端有限公司 | Method for adjusting frame rate in a game and related apparatus |
| CN118118734A (zh) * | 2022-11-30 | 2024-05-31 | 华为技术有限公司 | Video processing method and electronic device |
| WO2025213475A1 (zh) * | 2024-04-12 | 2025-10-16 | 北京字跳网络技术有限公司 | Indication of and interaction with video matching |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US6636220B1 (en) * | 2000-01-05 | 2003-10-21 | Microsoft Corporation | Video-based rendering |
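| CN110675433A (zh) * | 2019-10-31 | 2020-01-10 | 北京达佳互联信息技术有限公司 | Video processing method and apparatus, electronic device, and storage medium |
| CN111294644A (zh) * | 2018-12-07 | 2020-06-16 | 腾讯科技(深圳)有限公司 | Video splicing method and apparatus, electronic device, and computer storage medium |
| CN111460219A (zh) * | 2020-04-01 | 2020-07-28 | 百度在线网络技术(北京)有限公司 | Video processing method and apparatus, and short video platform |
| CN111970562A (zh) * | 2020-08-17 | 2020-11-20 | Oppo广东移动通信有限公司 | Video processing method, video processing apparatus, storage medium, and electronic device |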
Family Cites Families (17)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN102254336B (zh) | 2011-07-14 | 2013-01-16 | 清华大学 | Face video synthesis method and apparatus |
| CN102306290B (zh) | 2011-10-14 | 2013-10-30 | 刘伟华 | Video-based face tracking and recognition method |
| US10044944B2 (en) | 2015-09-28 | 2018-08-07 | Gopro, Inc. | Automatic composition of video with dynamic background and composite frames selected based on foreground object criteria |
| US10559062B2 (en) * | 2015-10-22 | 2020-02-11 | Korea Institute Of Science And Technology | Method for automatic facial impression transformation, recording medium and device for performing the method |
| US10108861B2 (en) * | 2016-09-20 | 2018-10-23 | Motorola Solutions, Inc. | Systems and methods of providing content differentiation between thumbnails |
| US11240567B2 (en) * | 2016-10-25 | 2022-02-01 | Aether Media, Inc. | Video content switching and synchronization system and method for switching between multiple video formats |
| CN106534967B (zh) | 2016-10-25 | 2019-08-02 | 司马大大(北京)智能系统有限公司 | Video clipping method and apparatus |
| US10055880B2 (en) * | 2016-12-06 | 2018-08-21 | Activision Publishing, Inc. | Methods and systems to modify a two dimensional facial image to increase dimensional depth and generate a facial image that appears three dimensional |
| US10734027B2 (en) * | 2017-02-16 | 2020-08-04 | Fusit, Inc. | System and methods for concatenating video sequences using face detection |
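| CN108197555B (zh) | 2017-12-28 | 2020-10-16 | 杭州相芯科技有限公司 | Real-time face fusion method based on face tracking |
| CN111091529A (zh) * | 2018-10-24 | 2020-05-01 | 株式会社理光 | People counting method and people counting system |
| CN110390263A (zh) | 2019-06-17 | 2019-10-29 | 宁波江丰智能科技有限公司 | Video image processing method and system |
| CN111061914B (zh) | 2019-12-10 | 2024-01-02 | 懂频智能科技(上海)有限公司 | Method for selecting video clips of a specific face based on face recognition technology |
| CN111062289A (zh) | 2019-12-10 | 2020-04-24 | 懂频智能科技(上海)有限公司 | Method for selecting video clips of a specific face to replace template windows and form a short video |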
| US11354883B2 (en) * | 2019-12-30 | 2022-06-07 | Sensetime International Pte. Ltd. | Image processing method and apparatus, and electronic device |
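| CN111491124B (zh) * | 2020-04-17 | 2023-02-17 | 维沃移动通信有限公司 | Video processing method and apparatus, and electronic device |
| CN112085097A (zh) * | 2020-09-09 | 2020-12-15 | 北京市商汤科技开发有限公司 | Image processing method and apparatus, electronic device, and storage medium |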
2021
- 2021-01-05 CN CN202110008033.0A patent/CN114724055B/zh active Active
- 2021-12-31 WO PCT/CN2021/143821 patent/WO2022148319A1/zh not_active Ceased
- 2021-12-31 EP EP21917358.0A patent/EP4266208A4/en active Pending
- 2021-12-31 US US18/260,192 patent/US12401836B2/en active Active
Non-Patent Citations (1)
| Title |
|---|
| See also references of EP4266208A4 |
Also Published As
| Publication number | Publication date |
|---|---|
| CN114724055A (zh) | 2022-07-08 |
| US20240064346A1 (en) | 2024-02-22 |
| EP4266208A4 (en) | 2024-06-12 |
| EP4266208A1 (en) | 2023-10-25 |
| US12401836B2 (en) | 2025-08-26 |
| CN114724055B (zh) | 2026-03-20 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 21917358; Country of ref document: EP; Kind code of ref document: A1 |
| | WWE | Wipo information: entry into national phase | Ref document number: 18260192; Country of ref document: US |
| | ENP | Entry into the national phase | Ref document number: 2021917358; Country of ref document: EP; Effective date: 20230721 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | WWG | Wipo information: grant in national office | Ref document number: 18260192; Country of ref document: US |