WO2024114569A1 - Video processing method and electronic device - Google Patents
Video processing method and electronic device
- Publication number
- WO2024114569A1 (PCT/CN2023/134290)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- video
- frame
- target object
- video frame
- electronic device
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/44016—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving splicing one content stream with another content stream, e.g. for substituting a video clip
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/048—Interaction techniques based on graphical user interfaces [GUI]
- G06F3/0481—Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/048—Interaction techniques based on graphical user interfaces [GUI]
- G06F3/0484—Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
- G06F3/04842—Selection of displayed objects or displayed text elements
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/431—Generation of visual interfaces for content selection or interaction; Content or additional data rendering
- H04N21/4312—Generation of visual interfaces for content selection or interaction; Content or additional data rendering involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/431—Generation of visual interfaces for content selection or interaction; Content or additional data rendering
- H04N21/4312—Generation of visual interfaces for content selection or interaction; Content or additional data rendering involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations
- H04N21/4316—Generation of visual interfaces for content selection or interaction; Content or additional data rendering involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations for displaying supplemental content in a region of the screen, e.g. an advertisement in a separate window
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/44008—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/4402—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/4402—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
- H04N21/440245—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display the reformatting operation being performed only on part of the stream, e.g. a region of the image or a time segment
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/47—End-user applications
- H04N21/472—End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
- H04N21/4728—End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content for selecting a Region Of Interest [ROI], e.g. for requesting a higher resolution version of a selected region
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/01—Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level
- H04N7/0117—Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level involving conversion of the spatial resolution of the incoming video signal
- H04N7/0122—Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level involving conversion of the spatial resolution of the incoming video signal the input and the output signals having different aspect ratios
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2203/00—Indexing scheme relating to G06F3/00 - G06F3/048
- G06F2203/048—Indexing scheme relating to G06F3/048
- G06F2203/04803—Split screen, i.e. subdividing the display area or the window area into separate subareas
Definitions
- the present application relates to the field of electronic devices, and more specifically, to a video processing method and an electronic device.
- the present application provides a video processing method and electronic device, which can play a new video that highlights the target line in the original video according to the user's operation, thereby improving the user experience.
- a method for video processing comprising: acquiring a first video, the first video comprising N video frames, N ≥ 2 and N being an integer, wherein the N video frames comprise a first video frame and a second video frame, the first video frame and the second video frame comprise at least one object, and the at least one object comprises a first target object; responding to a first operation of a user, the first operation being an operation of selecting the first target object; acquiring a second video, wherein the second video comprises a third video frame and a fourth video frame, the third video frame and the fourth video frame comprise the first target object, the third video frame is obtained by cropping the first target object in the first video frame, and the fourth video frame is obtained by cropping the first target object in the second video frame; and playing the second video.
- the electronic device can determine the first target object in the original video according to the user's operation and play a new video.
- the new video is centered on the first target object, which can better highlight the first target object in the original video and enhance the user experience.
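- as a rough illustration of the cropping step described above, the following Python sketch builds the frames of a second video by cropping each frame of a first video around the target object. It is only a sketch under stated assumptions: the frame representation (nested lists of pixel values), the bounding-box format, and the names `crop_around` and `make_second_video` are hypothetical and do not come from the application, and the bounding boxes are assumed to be supplied by some detector that is not shown.

```python
from typing import List, Tuple

Frame = List[List[int]]          # rows of pixel values (hypothetical representation)
Box = Tuple[int, int, int, int]  # (left, top, width, height) of the target object


def crop_around(frame: Frame, box: Box, out_w: int, out_h: int) -> Frame:
    """Return an out_w x out_h crop centred on the box, clamped to the frame edges."""
    frame_h, frame_w = len(frame), len(frame[0])
    if out_w > frame_w or out_h > frame_h:
        raise ValueError("crop window is larger than the frame")
    left, top, width, height = box
    centre_x, centre_y = left + width // 2, top + height // 2
    # Shift the window back inside the frame when the object is near an edge.
    x0 = min(max(centre_x - out_w // 2, 0), frame_w - out_w)
    y0 = min(max(centre_y - out_h // 2, 0), frame_h - out_h)
    return [row[x0:x0 + out_w] for row in frame[y0:y0 + out_h]]


def make_second_video(first_video: List[Frame], boxes: List[Box],
                      out_w: int, out_h: int) -> List[Frame]:
    """Crop every frame of the first video around the target object's box."""
    return [crop_around(f, b, out_w, out_h) for f, b in zip(first_video, boxes)]
```

- for example, with a 6 x 4 frame whose pixel at row r and column c holds r * 10 + c, a 2 x 2 crop around a box at (4, 1) of size 2 x 2 is clamped to the right edge and returns the pixels at rows 1 to 2 and columns 4 to 5.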
- the at least one object also includes a second target object
- the method also includes: in response to a second operation of the user, acquiring a third video, wherein the third video includes a fifth video frame and a sixth video frame, the fifth video frame includes the first target object and/or the second target object, the sixth video frame includes the first target object and/or the second target object, the fifth video frame is obtained by cropping according to the first target object and/or the second target object in the first video frame, and the sixth video frame is obtained by cropping according to the first target object and/or the second target object in the second video frame.
- the at least one object further includes a second target object, and the third video frame and the fourth video frame do not include the second target object.
- a size of the first target object in the third video frame is different from a size of the first target object in the first video frame.
- the first target object in the third video frame and the first target object in the fourth video frame have different sizes.
- the size of the first target object in the first video frame and the first target object in the second video frame are the same, and the inter-frame rate of the first target object in the first video frame and the inter-frame rate of the first target object in the second video frame are different.
- the electronic device can determine the target object in the original video and generate a new video.
- the new video is centered on the target object, and the size of the cropping frame of the new video is different from that of the original video, thereby bringing about a change in the field of view, which can better highlight the target object in the original video and enhance the user experience.
- the first target object in the third video frame and the first target object in the fourth video frame are the same size.
- the size of the first target object in the first video frame is different from the size of the first target object in the second video frame, and the inter-frame rate of the first target object in the first video frame is different from the inter-frame rate of the first target object in the second video frame.
- the electronic device can determine the target object in the original video and generate a new video.
- the new video is centered on the target object, and the size of the cropping frame of the new video is different from that of the original video, thereby bringing about a change in the field of view, which can better highlight the target object in the original video and enhance the user experience.
- the method further includes: displaying a first interface, the first interface displaying a first window and a second window, wherein the first window displays the first video frame, and the second window displays the third video frame.
- the method also includes: playing the first video; the playing of the second video includes: when the first target object is detected, playing the second video in full screen; after playing the second video in full screen and the first target object is not detected, the method also includes: continuing to play the first video.
- the method also includes: displaying a first interface, the first interface including a first window, and playing the first video in the first window; playing the second video includes: when the first target object is detected, displaying a second interface, the second interface including a first window and a second window, wherein the first window plays the first video and the second window plays the second video; after displaying the second interface, the method also includes: when the first target object is not detected, displaying a third interface, the third interface including the first window, continuing to play the first video in the first window, and the third interface does not include the second window.
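- the two playback behaviours described above (full-screen switching and a second window that appears only while the target object is detected) can be sketched as a per-frame layout decision. This is a minimal sketch: the function name `windows_per_frame`, the mode names, and the idea of a per-frame detection flag are illustrative assumptions rather than details given in the application.

```python
from typing import List


def windows_per_frame(detected: List[bool], mode: str = "split") -> List[List[str]]:
    """For each frame, list which videos are on screen.

    mode="fullscreen": the second video replaces the first while the target is detected.
    mode="split": the first window keeps playing the first video and a second
    window showing the second video is added while the target is detected.
    """
    if mode not in ("fullscreen", "split"):
        raise ValueError("mode must be 'fullscreen' or 'split'")
    layouts = []
    for target_present in detected:
        if not target_present:
            layouts.append(["first"])            # only the first video is shown
        elif mode == "fullscreen":
            layouts.append(["second"])           # second video takes the full screen
        else:
            layouts.append(["first", "second"])  # first window plus second window
    return layouts
```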
- before responding to the user's first operation, the method also includes: displaying a third interface, the third interface including a fifth window, and the fifth window including the at least one object in the first video.
- the first operation is an operation of selecting the first target object in the fifth window.
- the N video frames include the first target object
- the second video includes M video frames
- the M video frames include the first target object
- M ≤ N and M is an integer
- the method further includes: determining a frame extraction interval according to an inter-frame speed of the first target object between two adjacent video frames in the N video frames.
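- one way to picture frame extraction driven by inter-frame speed is the Python sketch below. The text above does not specify how the speed maps to an interval, so the mapping here (slower motion gives a larger interval, faster motion keeps every frame), the thresholds, and the function names are all hypothetical assumptions chosen only for illustration. The speed is taken as the displacement of the target object's centre between two adjacent frames.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def inter_frame_speed(c0: Point, c1: Point) -> float:
    """Displacement (in pixels) of the target centre between two adjacent frames."""
    return math.hypot(c1[0] - c0[0], c1[1] - c0[1])


def extraction_interval(speed: float, slow: float = 2.0, fast: float = 10.0) -> int:
    """Hypothetical mapping: keep every frame when fast, skip more when slow."""
    if speed >= fast:
        return 1
    if speed >= slow:
        return 2
    return 3


def extract_frames(centres: List[Point]) -> List[int]:
    """Return the indices of the M frames kept out of the N input frames (M <= N)."""
    if not centres:
        return []
    kept = [0]
    i = 0
    last = len(centres) - 1
    while i < last:
        step = extraction_interval(inter_frame_speed(centres[i], centres[i + 1]))
        i = min(i + step, last)  # never step past the final frame
        kept.append(i)
    return kept
```

- with centres `[(0, 0), (1, 0), (2, 0), (3, 0), (20, 0), (40, 0)]`, the slow opening segment is skipped with an interval of 3 and the fast segment is kept frame by frame, so 4 of the 6 frames are retained.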
- the first video is a horizontal video
- the second video is a vertical video
- the first video frame and the second video frame are horizontal video frames
- the third video frame and the fourth video frame are vertical video frames.
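- the size of a vertical cropping window inside a horizontal frame follows from simple arithmetic, sketched below. The 9:16 target aspect ratio and the function name `vertical_crop_size` are assumptions made for illustration; the text above only states that the source is horizontal and the result is vertical.

```python
from typing import Tuple


def vertical_crop_size(frame_w: int, frame_h: int,
                       aspect_w: int = 9, aspect_h: int = 16) -> Tuple[int, int]:
    """Largest aspect_w:aspect_h window (width, height) that fits in the frame."""
    # Start from the full frame height and derive the matching width.
    crop_h = frame_h
    crop_w = crop_h * aspect_w // aspect_h
    if crop_w > frame_w:
        # Frame is too narrow: use the full width and shrink the height instead.
        crop_w = frame_w
        crop_h = crop_w * aspect_h // aspect_w
    return crop_w, crop_h
```

- for a 1920 x 1080 horizontal frame this gives a 607 x 1080 window, since 1080 * 9 / 16 = 607.5 and the width is rounded down.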
- the first video and the second video have different durations.
- the third video frame and the fourth video frame have different heights.
- the second aspect is an electronic device of an embodiment of the present application, which includes modules/units for executing the method of the above aspect or any possible design of the above aspect; these modules/units can be implemented by hardware, or by hardware executing corresponding software.
- the third aspect is a chip of an embodiment of the present application, which is coupled to a memory in an electronic device and is used to call a computer program stored in the memory and execute the above-mentioned aspects of the embodiment of the present application and any possible design of the above-mentioned aspects of the embodiment of the present application; "coupling" in the embodiment of the present application refers to the direct or indirect combination of two components with each other.
- the fourth aspect is a computer-readable storage medium according to an embodiment of the present application, wherein the computer-readable storage medium includes a computer program.
- when the computer program runs on an electronic device, the electronic device executes the technical solution of the above aspect and any possible design of the above aspect.
- the fifth aspect is a computer program according to an embodiment of the present application, wherein the computer program includes instructions.
- when the instructions are executed on a computer, the computer executes the technical solution of the above aspect and any possible design of the above aspect.
- the sixth aspect is a graphical user interface on an electronic device of an embodiment of the present application, wherein the electronic device has a display screen, one or more memories, and one or more processors, wherein the one or more processors are used to execute one or more computer programs stored in the one or more memories, and the graphical user interface includes a graphical user interface displayed when the electronic device executes the above aspect and any possible technical solution of the above aspect.
- the seventh aspect is an electronic device of an embodiment of the present application, which includes one or more processors; one or more memories; the one or more memories store one or more computer programs, and the one or more computer programs include instructions.
- when the instructions are executed by the one or more processors, the method of the above aspects or any possible implementation of the above aspects is executed.
- FIG. 1 is a schematic diagram of the structure of an electronic device provided in an embodiment of the present application.
- FIG. 2 is a software structure block diagram of an electronic device provided in an embodiment of the present application.
- FIG. 3 is a set of GUIs provided in an embodiment of the present application.
- FIG. 4 is a set of GUIs provided in an embodiment of the present application.
- FIG. 5 is a set of GUIs provided in an embodiment of the present application.
- FIG. 6 is a set of GUIs provided in an embodiment of the present application.
- FIG. 7 is a schematic diagram of a cropped video frame provided in an embodiment of the present application.
- FIG. 8 is a schematic diagram of a cropped video frame provided in an embodiment of the present application.
- FIG. 9 is a schematic diagram of a cropped video frame provided in an embodiment of the present application.
- FIG. 10 is a schematic flowchart of a video processing method provided in an embodiment of the present application.
- FIG. 11 is a schematic diagram of determining the inter-frame rate provided in an embodiment of the present application.
- FIG. 12 is a schematic diagram showing a comparison of the aspect ratios of an original video frame and a cropped video frame provided in an embodiment of the present application.
- FIG. 13 is a schematic diagram of determining a cropping frame provided in an embodiment of the present application.
- FIG. 14 is a schematic flowchart of a video processing method provided in an embodiment of the present application.
- FIG. 15 is a schematic diagram of determining a cropping frame provided in an embodiment of the present application.
- FIG. 16 is a schematic flowchart of a video processing method provided in an embodiment of the present application.
- FIG. 17 is a schematic diagram of the composition of an electronic device provided in an embodiment of the present application.
- FIG. 18 is a schematic diagram of the composition of a server provided in an embodiment of the present application.
- "A and/or B" can represent: A exists alone, A and B exist at the same time, or B exists alone, where A and B can be singular or plural.
- the character "/" generally indicates that the objects associated before and after are in an "or" relationship.
- references to "one embodiment" or "some embodiments" etc. described in this specification mean that a particular feature, structure or characteristic described in conjunction with the embodiment is included in one or more embodiments of the present application.
- the phrases "in one embodiment", "in some embodiments", "in some other embodiments", etc. appearing in different places in this specification do not necessarily all refer to the same embodiment, but mean "one or more but not all embodiments", unless otherwise specifically emphasized.
- the terms "including", "comprising", "having" and their variations all mean "including but not limited to", unless otherwise specifically emphasized.
- the electronic device may be a portable electronic device that also includes other functions such as a personal digital assistant and/or a music player function, such as a mobile phone, a tablet computer, a wearable electronic device with wireless communication function (such as a smart watch), etc.
- portable electronic devices include but are not limited to devices equipped with Or a portable electronic device with other operating systems.
- the portable electronic device may also be other portable electronic devices, such as a laptop computer, etc. It should also be understood that in some other embodiments, the electronic device may not be a portable electronic device, but a desktop computer.
- FIG. 1 shows a schematic diagram of the structure of an electronic device 100.
- the electronic device 100 may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone interface 170D, a sensor module 180, a compass 190, a motor 191, an indicator 192, a camera 193, a display screen 194, and a subscriber identification module (SIM) card interface 195, etc.
- the structure illustrated in the embodiment of the present application does not constitute a specific limitation on the electronic device 100.
- the electronic device 100 may include more or fewer components than shown in the figure, or combine some components, or split some components, or arrange the components differently.
- the components shown in the figure may be implemented in hardware, software, or a combination of software and hardware.
- the processor 110 may include one or more processing units.
- the processor 110 may include an application processor.
- the electronic device 100 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU), etc.
- different processing units may be independent components or integrated in one or more processors.
- the electronic device 100 may also include one or more processors 110.
- the controller may generate an operation control signal according to the instruction opcode and the timing signal to complete the control of fetching and executing instructions.
- a memory may also be provided in the processor 110 for storing instructions and data.
- the memory in the processor 110 may be a cache memory.
- the memory may store instructions or data that the processor 110 has just used or uses repeatedly. If the processor 110 needs the instructions or data again, it can fetch them directly from this memory. This avoids repeated access, reduces the waiting time of the processor 110, and improves the efficiency of the electronic device 100 in processing data or executing instructions.
- the processor 110 may include one or more interfaces.
- the interface may include an inter-integrated circuit (I2C) interface, an inter-integrated circuit sound (I2S) interface, a pulse code modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input/output (GPIO) interface, a SIM card interface, and/or a USB interface.
- the USB interface 130 is an interface that complies with the USB standard specification, and specifically can be a Mini USB interface, a Micro USB interface, a USB Type C interface, etc.
- the USB interface 130 can be used to connect a charger to charge the electronic device 100, and can also be used to transmit data between the electronic device 100 and a peripheral device.
- the USB interface 130 can also be used to connect headphones to play audio through the headphones.
- the interface connection relationship between the modules illustrated in the embodiment of the present application is only a schematic illustration and does not constitute a structural limitation on the electronic device 100.
- the electronic device 100 may also adopt an interface connection method different from those in the above embodiments, or a combination of multiple interface connection methods.
- the charging management module 140 is used to receive charging input from a charger.
- the charger may be a wireless charger or a wired charger.
- the charging management module 140 may receive charging input from a wired charger through the USB interface 130.
- the charging management module 140 may receive wireless charging input through a wireless charging coil of the electronic device 100. While the charging management module 140 is charging the battery 142, it may also power the electronic device through the power management module 141.
- the power management module 141 is used to connect the battery 142, the charging management module 140 and the processor 110.
- the power management module 141 receives input from the battery 142 and/or the charging management module 140, and supplies power to the processor 110, the internal memory 121, the external memory, the display screen 194, the camera 193, and the wireless communication module 160.
- the power management module 141 can also be used to monitor parameters such as battery capacity, battery cycle number, battery health status (leakage, impedance), etc.
- the power management module 141 can also be set in the processor 110.
- the power management module 141 and the charging management module 140 can also be set in the same device.
- the wireless communication function of the electronic device 100 can be implemented through the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, the modem processor and the baseband processor.
- Antenna 1 and antenna 2 are used to transmit and receive electromagnetic wave signals.
- Each antenna in electronic device 100 can be used to cover a single or multiple communication frequency bands. Different antennas can also be reused to improve the utilization of antennas.
- antenna 1 can be reused as a diversity antenna for a wireless local area network.
- the antenna can be used in combination with a tuning switch.
- the mobile communication module 150 can provide solutions for wireless communications including 2G/3G/4G/5G, etc., applied to the electronic device 100.
- the mobile communication module 150 may include at least one filter, a switch, a power amplifier, a low noise amplifier (LNA), etc.
- the mobile communication module 150 may receive electromagnetic waves from the antenna 1, perform filtering, amplification, and other processing on the received electromagnetic waves, and transmit them to the modem processor for demodulation.
- the mobile communication module 150 may also amplify the signal modulated by the modem processor, and convert it into electromagnetic waves for radiation through the antenna 1.
- at least some of the functional modules of the mobile communication module 150 may be arranged in the processor 110.
- at least some of the functional modules of the mobile communication module 150 may be arranged in the same device as at least some of the modules of the processor 110.
- the wireless communication module 160 can provide wireless communication solutions applied to the electronic device 100, including wireless local area networks (WLAN) (such as wireless fidelity (Wi-Fi) networks), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), infrared (IR), etc.
- the wireless communication module 160 can be one or more devices integrating at least one communication processing module.
- the wireless communication module 160 receives electromagnetic waves via the antenna 2, performs frequency modulation and filtering on the electromagnetic wave signal, and sends the processed signal to the processor 110.
- the wireless communication module 160 can also receive the signal to be sent from the processor 110, perform frequency modulation and amplification on it, and convert it into electromagnetic waves for radiation through the antenna 2.
- the electronic device 100 implements the display function through a GPU, a display screen 194, and an application processor.
- the GPU is a microprocessor for image processing, which connects the display screen 194 and the application processor.
- the GPU is used to perform mathematical and geometric calculations for graphics rendering.
- the processor 110 may include one or more GPUs that execute program instructions to generate or change display information.
- the display screen 194 is used to display images, videos, etc.
- the display screen 194 includes a display panel.
- the display panel can be a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a Mini-LED, a Micro-LED, a Micro-OLED, a quantum dot light-emitting diode (QLED), etc.
- the electronic device 100 may include one or more display screens 194.
- the display screen 194 in FIG. 1 can be bent.
- the display screen 194 can be bent, which means that the display screen can be bent to any angle at any position and can be maintained at the angle.
- the display screen 194 can be folded in half from the middle to the left or right. It can also be folded in half from the middle to the top or bottom.
- the display screen 194 of the electronic device 100 can be a flexible screen.
- the flexible screen has attracted much attention for its unique characteristics and huge potential.
- flexible screens have the characteristics of strong flexibility and bendability, which can provide users with a new interaction method based on the bendable characteristics, and can meet users' more needs for electronic devices.
- the foldable display screen on the electronic device can be switched between a small screen in a folded form and a large screen in an unfolded form at any time. Therefore, users use the split-screen function on electronic devices equipped with a foldable display screen more and more frequently.
- the electronic device 100 can realize the shooting function through the ISP, the camera 193, the video codec, the GPU, the display screen 194 and the application processor.
- the ISP is used to process the data fed back by the camera 193. For example, when taking a photo, the shutter is opened, and the light is transmitted to the camera photosensitive element through the lens. The light signal is converted into an electrical signal, and the camera photosensitive element transmits the electrical signal to the ISP for processing and converts it into an image visible to the naked eye.
- the ISP can also perform algorithm optimization on the noise, brightness, and skin color of the image. The ISP can also optimize the exposure, color temperature and other parameters of the shooting scene. In some embodiments, the ISP can be set in the camera 193.
- the camera 193 is used to capture still images or videos.
- the object generates an optical image through the lens and projects it onto the photosensitive element.
- the photosensitive element can be a charge coupled device (CCD) or a complementary metal oxide semiconductor (CMOS) phototransistor.
- the photosensitive element converts the optical signal into an electrical signal, and then passes the electrical signal to the ISP to be converted into a digital image signal.
- the ISP outputs the digital image signal to the DSP for processing.
- the DSP converts the digital image signal into an image signal in a standard RGB, YUV or other format.
- the electronic device 100 may include one or more cameras 193.
- the digital signal processor is used to process digital signals, and can process not only digital image signals but also other digital signals. For example, when the electronic device 100 is selecting a frequency point, the digital signal processor is used to perform Fourier transform on the frequency point energy.
- Video codecs are used to compress or decompress digital videos.
- the electronic device 100 may support one or more video codecs. In this way, the electronic device 100 may play or record videos in a variety of coding formats, such as Moving Picture Experts Group (MPEG) 1, MPEG2, MPEG3, MPEG4, etc.
- NPU is a neural network (NN) computing processor.
- applications such as intelligent cognition of electronic device 100 can be realized, such as image recognition, face recognition, voice recognition, text understanding, etc.
- the external memory interface 120 can be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the electronic device 100.
- the external memory card communicates with the processor 110 through the external memory interface 120 to implement a data storage function. For example, files such as music and videos can be stored in the external memory card.
- the internal memory 121 can be used to store one or more computer programs, which include instructions.
- the processor 110 can enable the electronic device 100 to perform the methods provided in some embodiments of the present application, as well as various applications and data processing, etc. by running the above instructions stored in the internal memory 121.
- the internal memory 121 may include a program storage area and a data storage area.
- the program storage area can store an operating system; the program storage area can also store one or more applications (such as a gallery, contacts, etc.).
- the data storage area can store data (such as photos, contacts, etc.) created during the use of the electronic device 100.
- the internal memory 121 may include a high-speed random access memory, and may also include a non-volatile memory, such as one or more disk storage components, a flash memory component, a universal flash storage (UFS), etc.
- the processor 110 can enable the electronic device 100 to perform the methods provided in the embodiments of the present application, as well as other applications and data processing by running instructions stored in the internal memory 121, and/or instructions stored in a memory provided in the processor 110.
- the electronic device 100 can implement audio functions, such as music playback and recording, through the audio module 170, the speaker 170A, the receiver 170B, the microphone 170C, the headphone interface 170D, and the application processor.
- the sensor module 180 may include a pressure sensor 180A, a gyroscope sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, and the like.
- the pressure sensor 180A is used to sense the pressure signal and can convert the pressure signal into an electrical signal.
- the pressure sensor 180A can be set on the display screen 194.
- the capacitive pressure sensor can be a parallel plate including at least two conductive materials.
- the electronic device 100 determines the intensity of the pressure according to the change in capacitance.
- the electronic device 100 detects the touch operation intensity according to the pressure sensor 180A.
- the electronic device 100 can also calculate the touch position according to the detection signal of the pressure sensor 180A.
- touch operations acting on the same touch position but with different touch operation intensities can correspond to different operation instructions. For example: when a touch operation with a touch operation intensity less than the first pressure threshold acts on the short message application icon, an instruction to view the short message is executed. When a touch operation with a touch operation intensity greater than or equal to the first pressure threshold acts on the short message application icon, an instruction to create a new short message is executed.
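- the threshold-based dispatch described above can be sketched as follows. This is only an illustrative sketch: the function name, the instruction strings, and the numeric value of the first pressure threshold are hypothetical and are not specified in the present application.

```python
# Hypothetical placeholder value; the application does not give a number.
FIRST_PRESSURE_THRESHOLD = 0.5


def short_message_icon_instruction(intensity: float,
                                   threshold: float = FIRST_PRESSURE_THRESHOLD) -> str:
    """Map the touch intensity on the short message icon to an instruction.

    Intensity below the threshold: view the short message.
    Intensity greater than or equal to the threshold: create a new one.
    """
    if intensity < threshold:
        return "view_short_message"
    return "create_short_message"
```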
- the gyro sensor 180B can be used to determine the motion posture of the electronic device 100.
- the angular velocity of the electronic device 100 around three axes (i.e., the X, Y, and Z axes) can be determined by the gyro sensor 180B.
- the gyro sensor 180B can be used for anti-shake shooting. For example, when the shutter is pressed, the gyro sensor 180B detects the angle of the electronic device 100 shaking, calculates the distance that the lens module needs to compensate based on the angle, and allows the lens to offset the shaking of the electronic device 100 through reverse movement to achieve anti-shake.
- the gyro sensor 180B can also be used for navigation and somatosensory game scenes.
- the acceleration sensor 180E can detect the magnitude of the acceleration of the electronic device 100 in all directions (generally three axes). When the electronic device 100 is stationary, the magnitude and direction of gravity can be detected. It can also be used to identify the posture of the electronic device and is applied to applications such as horizontal and vertical screen switching and pedometers.
- the ambient light sensor 180L is used to sense the brightness of the ambient light.
- the electronic device 100 can adaptively adjust the brightness of the display screen 194 according to the perceived ambient light brightness.
- the ambient light sensor 180L can also be used to automatically adjust the white balance when taking pictures.
- the ambient light sensor 180L can also cooperate with the proximity light sensor 180G to detect whether the electronic device 100 is in a pocket to prevent accidental touches.
- the fingerprint sensor 180H is used to collect fingerprints.
- the electronic device 100 can use the collected fingerprint characteristics to implement fingerprint unlocking, access application locks, fingerprint photography, fingerprint call answering, etc.
- the temperature sensor 180J is used to detect temperature.
- the electronic device 100 uses the temperature detected by the temperature sensor 180J to execute a temperature processing strategy. For example, when the temperature reported by the temperature sensor 180J exceeds a threshold, the electronic device 100 reduces the performance of a processor located near the temperature sensor 180J to reduce power consumption and implement thermal protection. In other embodiments, when the temperature is lower than another threshold, the electronic device 100 heats the battery 142 to avoid abnormal shutdown of the electronic device 100 due to low temperature. In other embodiments, when the temperature is lower than another threshold, the electronic device 100 boosts the output voltage of the battery 142 to avoid abnormal shutdown caused by low temperature.
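- a minimal sketch of such a temperature processing strategy is given below. The threshold values and action names are hypothetical, and combining the three behaviors in one function is an assumption made for illustration; the text above describes them as separate embodiments.

```python
def temperature_strategy(temp_c: float,
                         high: float = 45.0,       # hypothetical upper threshold
                         low_heat: float = 0.0,    # hypothetical battery-heating threshold
                         low_boost: float = -10.0  # hypothetical voltage-boost threshold
                         ) -> list:
    """Return the actions taken for a temperature reported by the sensor."""
    actions = []
    if temp_c > high:
        # Thermal protection: throttle the processor near the sensor.
        actions.append("reduce_processor_performance")
    if temp_c < low_heat:
        # Avoid abnormal shutdown at low temperature by heating the battery.
        actions.append("heat_battery")
    if temp_c < low_boost:
        # Avoid abnormal shutdown by boosting the battery output voltage.
        actions.append("boost_battery_output_voltage")
    return actions
```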
- the touch sensor 180K is also called a "touch panel".
- the touch sensor 180K can be set on the display screen 194, and the touch sensor 180K and the display screen 194 form a touchscreen, also called a "touch screen".
- the touch sensor 180K is used to detect touch operations acting on or near it.
- the touch sensor can pass the detected touch operation to the application processor to determine the type of touch event.
- Visual output related to the touch operation can be provided through the display screen 194.
- the touch sensor 180K can also be set on the surface of the electronic device 100, which is different from the position of the display screen 194.
- FIG. 2 is a software structure diagram of the electronic device 100 of an embodiment of the present application.
- the layered architecture divides the software into several layers, each layer has a clear role and division of labor.
- the layers communicate with each other through software interfaces.
- the Android system is divided into four layers, from top to bottom: the application layer, the application framework layer, the Android runtime and system library, and the kernel layer.
- the application layer can include a series of application packages.
- the application package may include applications such as camera, gallery, calendar, call, map, navigation, WLAN, Bluetooth, music, video, and short message.
- the application framework layer provides application programming interface (API) and programming framework for applications in the application layer.
- the application framework layer includes some predefined functions.
- the application framework layer may include a window manager, a content provider, a view system, a phone manager, a resource manager, a notification manager, and the like.
- the window manager is used to manage window programs.
- the window manager can obtain the size of the display screen, determine whether there is a status bar, lock the screen, capture the screen, etc.
- Content providers are used to store and retrieve data and make it accessible to applications.
- This data can include videos, images, audio, calls made and received, browsing history and bookmarks, phone books, etc.
- the view system includes visual controls, such as controls for displaying text, controls for displaying images, etc.
- the view system can be used to build applications.
- a display interface can be composed of one or more views.
- a display interface including a text notification icon can include a view for displaying text and a view for displaying images.
- the phone manager is used to provide communication functions of the electronic device 100, such as management of call status (including connecting, hanging up, etc.).
- the resource manager provides various resources for applications, such as localized strings, icons, images, layout files, video files, and so on.
- the notification manager enables applications to display notification information in the status bar. It can be used to convey notification-type messages and can disappear automatically after a short stay without user interaction. For example, the notification manager is used to notify download completion, message reminders, etc.
- the notification manager can also present notifications in the top status bar of the system in the form of a chart or scroll-bar text, such as notifications of applications running in the background, or present notifications on the screen in the form of a dialog window. For example, a text message is prompted in the status bar, a prompt sound is emitted, the electronic device vibrates, an indicator light flashes, etc.
- the system library can include multiple functional modules, such as surface manager, media library, 3D graphics processing library (such as OpenGL ES), 2D graphics engine (such as SGL), etc.
- the surface manager is used to manage the display subsystem and provide the fusion of 2D and 3D layers for multiple applications.
- the media library supports playback and recording of a variety of commonly used audio and video formats, as well as static image files, etc.
- the media library can support a variety of audio and video encoding formats, such as: MPEG4, H.264, MP3, AAC, AMR, JPG and PNG, etc.
- the 3D graphics processing library is used to implement 3D graphics drawing, image rendering, synthesis and layer processing.
- a 2D graphics engine is a drawing engine for 2D drawings.
- the kernel layer is the layer between hardware and software.
- the kernel layer includes at least display driver, camera driver, audio driver, and sensor driver.
- the present application provides a method for video processing, which can lock onto the target object in the original video, adjust the size of the cropping frame according to the target object when editing the video, and output a new video that highlights the target object in the original video, thereby improving the user experience.
- the target object of a video can be understood as the main content presented by the video.
- the target object of a video can be a person, animal, object, action, etc. in the video.
- the target object of the football match video can be the football, the player with the ball, the star player, the player's passing action, the player's foul action, etc.
- the electronic device can determine the target object in the video according to the user's operation and generate a new video.
- FIG. 3 shows a set of graphical user interfaces (GUI) provided by an embodiment of the present application.
- the electronic device displays an interface 301, which is a playback interface of a video application.
- the electronic device can play video #1 on the interface 301, and the content of the video #1 is a person jumping from position #1 to position #2.
- the bold solid line in the interface 301 is the boundary of video #1, and the electronic device can display other information outside the boundary, such as time, signal strength, etc.
- the electronic device detects the operation of the user clicking on the interface 301, and in response to the operation, it can display the GUI shown in (b) of FIG. 3.
- the electronic device may display one or more controls on the interface 301, and the one or more controls may correspond to different functions.
- control 302 corresponds to the sharing function
- control 303 corresponds to the exit function
- control 304 corresponds to the function of generating a target object video.
- the function of generating a target object video can be understood as the electronic device selecting an object in the original video as the target object and generating a new video based on the target object.
- the video #1 can be cropped with the dotted box in (b) of Figure 3 as the cropping box, so that the GUI shown in (c) of Figure 3 can be displayed.
- in response to detecting that the user clicks on the control 304, the electronic device can crop video #1 using the dotted frame in (b) in FIG. 3 as the cropping frame.
- the cropping frame selects the jumping action of the character in video #1 as the target object, and expands the cropped video frame to the same size as video #1 to generate video #2 and play it on the interface 301.
- the content of video #2 is centered on the jumping action of the character, and can highlight the jumping action of the character.
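- the crop-then-expand step described above can be sketched as follows. This is a sketch under stated assumptions: a frame is modeled as a list of pixel rows, the cropping frame is given as (left, top, width, height), and nearest-neighbor scaling is an arbitrary choice, since the present application does not name an interpolation method. All function names are hypothetical.

```python
def crop(frame, left, top, width, height):
    """Cut the cropping frame out of a frame stored as a list of rows."""
    return [row[left:left + width] for row in frame[top:top + height]]


def resize_nearest(frame, out_w, out_h):
    """Scale a frame to out_w x out_h with nearest-neighbor sampling."""
    in_h, in_w = len(frame), len(frame[0])
    return [
        [frame[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


def crop_and_expand(frame, box):
    """Crop a frame, then expand the result to the original frame size."""
    left, top, width, height = box
    out_h, out_w = len(frame), len(frame[0])
    return resize_nearest(crop(frame, left, top, width, height), out_w, out_h)


# A 4x4 frame whose pixel values encode their position.
frame_1 = [[r * 4 + c for c in range(4)] for r in range(4)]
frame_2 = crop_and_expand(frame_1, (1, 1, 2, 2))
```

- applying the same function to every video frame of video #1, each with its own cropping frame, would yield the frames of video #2 at the size of video #1.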
- in response to detecting that the user clicks on control 304, the electronic device may select the jumping action of the character in video #1 as the target object to generate video #2, display interface 305 on interface 301, and preview and play video #2 on interface 305.
- video #2 may be played in full screen.
- the electronic device can automatically generate a new video based on the original video.
- the new video retains the main content of the original video, which can facilitate users to watch the main content presented by the video, thereby improving the user experience.
- the electronic device can automatically identify the human action in the video as the target object to generate a new video. In other examples, the electronic device can also determine the target object and generate a new video based on the user's selection.
- FIG. 4 shows another set of GUIs provided by an embodiment of the present application.
- the electronic device displays an interface 401, which is a playback interface of a video application.
- the electronic device can play video #1 on the interface 401, and the video #1 is a football match video.
- the objects in the video #1 include player #1, player #2, and football.
- the electronic device can also display a control 402 on the interface 401, and the control 402 corresponds to the function of generating a target object video.
- the electronic device detects the operation of the user clicking on the control 402, and in response to the operation, the object of the video can be identified and the interface 403 can be displayed, and the interface 403 includes information about the object in the video #1.
- a GUI as shown in (e) or (f) of Figure 4 can be displayed.
- the object selected by the user can be referred to as the target object.
- the user can also set the duration of the generated new video in interface 403. For example, as shown in (b) in Figure 4, the user sets the video time to 00:00-05:30.
- when the electronic device generates a new video, it can determine the target object from the video content of the 00:00-05:30 time period of video #1 and generate a new video.
- the duration of the generated new video is 5:30.
- the electronic device detects the user's operation of clicking on control 402, and in response to the operation, the electronic device can select the identified object in interface 401.
- the electronic device detects the user's operation of selecting an object in interface 401 (e.g., clicking on player #2 and football), in response to the operation, a GUI as shown in (e) or (f) of FIG. 4 can be displayed.
- the object selected by the user can be referred to as a target object.
- the electronic device can automatically identify the target object in the video. Taking the detected target object as a shot as an example, when the electronic device detects that player #2 in video #1 shoots, it can display a GUI as shown in (d) or (e) in Figure 4.
- the electronic device may detect the target object in video #1 in the process of playing video #1 according to a preset rule, which may be set by the system or by the user.
- the preset rule is to automatically detect a shot in a football video when playing a football video.
- an option box 404 may be displayed, and the option box 404 includes prompt information for prompting the user that a shot action is detected.
- the electronic device may play video #2 centered on the shot of player #2 on interface 401, that is, the electronic device displays a GUI as shown in (e) or (f) of FIG. 4.
- in response to detecting that the user selects the football and player #2 as target objects, or to automatically identifying the football and player #2 as target objects, the electronic device can select the football and player #2 in video #1 as target objects to generate video #2 and play it on interface 401.
- in response to detecting that the user selects the football and player #2 as the target objects, or to automatically identifying the football and player #2 as the target objects, the electronic device can select the football and player #2 in video #1 as the target objects to generate video #2, display interface 406 on interface 401, and preview and play video #2 on the interface 406.
- video #2 can be played in full screen.
- the electronic device can determine the target object or automatically identify the target object based on the user's selection, and then automatically generate a new video based on the original video.
- the new video focuses on the target object selected by the user or the automatically identified target object, which can facilitate the user to watch the main content presented in the video and improve the user experience.
- the video played by the electronic device may be an online video of a video application, or the video may be a local video.
- the electronic device determines the target object based on the user's selection, or automatically detects the target object, and plays a new video centered on the target object based on the target object.
- the electronic device plays the video in horizontal mode. In other examples, when the electronic device changes from horizontal mode to vertical mode, the electronic device can also play a new video centered on the target object.
- the new video centered on the target object described in the embodiment of the present application means that the overall video content of the new video is centered on the target object.
- this does not mean that the target object is at the center of every video frame of the new video.
- the cropping frame can be smoothed between frames, so that the target object in some video frames of the new video may be slightly offset from the center of the video.
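- one possible way to smooth the cropping frame between frames is a moving average over the per-frame cropping frame centers, sketched below. The moving average, the window length, and the function name are assumptions for illustration; the present application does not specify the smoothing method here.

```python
def smooth_centers(centers, window=3):
    """Moving-average smoothing of per-frame cropping frame centers.

    centers: list of (x, y) tuples, one per video frame.
    window:  odd number of neighboring frames to average over (assumed).
    """
    half = window // 2
    smoothed = []
    for i in range(len(centers)):
        lo = max(0, i - half)
        hi = min(len(centers), i + half + 1)
        xs = [c[0] for c in centers[lo:hi]]
        ys = [c[1] for c in centers[lo:hi]]
        smoothed.append((sum(xs) / len(xs), sum(ys) / len(ys)))
    return smoothed


# The target jumps 30 pixels to the right for a single frame.
raw = [(0, 0), (30, 0), (0, 0)]
smoothed = smooth_centers(raw)
```

- after smoothing, the cropping frame center no longer coincides with the target object in the middle frame, which is why the target object may be slightly offset from the center of some video frames.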
- FIG. 5 shows a set of GUIs provided by an embodiment of the present application.
- the electronic device plays video #1 on the interface 501 in a horizontal screen.
- a GUI as shown in (b) of FIG. 5 may be displayed.
- the electronic device detects that the screen has changed from horizontal to vertical, and can determine the target object in video #1 and play video #2 centered on the target object.
- the vertical length of the video frame can change, and the blank part can be filled with black borders or masks. Please see below for specific instructions.
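- filling the blank part with black borders can be sketched as follows. The sketch assumes a frame is a list of pixel rows, that black is represented by the value 0, and that the borders are split evenly above and below the frame; none of these details are fixed by the text above, and the function name is hypothetical.

```python
BLACK = 0  # assumed pixel value for a black border


def pad_vertically(frame, target_height, fill=BLACK):
    """Pad a frame with border rows so that it reaches target_height."""
    height, width = len(frame), len(frame[0])
    if target_height < height:
        raise ValueError("target_height must not be smaller than the frame height")
    top = (target_height - height) // 2
    bottom = target_height - height - top
    border_row = [fill] * width
    return (
        [border_row[:] for _ in range(top)]
        + [row[:] for row in frame]
        + [border_row[:] for _ in range(bottom)]
    )


padded = pad_vertically([[1, 1], [1, 1]], target_height=5)
```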
- when the electronic device switches from playing a video in horizontal screen to playing the video in vertical screen, it can determine the target object in the original video, and then automatically generate a new video based on the original video.
- the new video focuses on the target object, making it easier for users to watch the main content presented in the video, thereby improving the user experience.
- the electronic device determines the target object based on the user's selection, or automatically detects the target object, and plays a new video centered on the target object based on the target object.
- the electronic device can crop the original video according to default parameters, but is not limited to this. In other examples of the present application, the electronic device can generate a new video centered on the target object based on user configuration.
- FIG. 6 shows another set of GUIs provided in an embodiment of the present application.
- the electronic device displays a window 601, and the electronic device can display a video to be cropped in the window 601, wherein the video to be cropped can be uploaded by a user, or can also be an online video.
- the electronic device can generate a new video centered on the target object in response to the user's configuration operation.
- the user may perform one or more of the following configuration operations:
- Video generation type: in the embodiment of the present application, the video generation type can be divided into two types:
- One is to generate a video highlighting the target object based on the original video, that is, as shown in the examples of Figures 3 and 4, when the electronic device plays video #1, it can generate and play video #2 based on video #1, and video #2 highlights the target object in video #1.
- in response to the user selecting the video generation type that highlights the target object, the electronic device can generate video #2 based on video #1 and play video #2 in window 601.
- The other is to generate a vertical video that highlights the target object based on a horizontal video, that is, as shown in the example of Figure 5, the electronic device plays video #1 in horizontal mode.
- when the electronic device detects that the screen has changed from horizontal to vertical mode, it can generate and play video #2 based on video #1.
- Video #2 is suitable for vertical playback on the electronic device and highlights the target object in video #1.
- in response to the user selecting the video generation type for converting a horizontal video to a vertical video, the electronic device can generate video #2 based on video #1 and play video #2 in window 602.
- the electronic device can still play video #1 in window 601.
- Target object: the electronic device can identify the object in the video, so that the user can select the object in the video as the target object, or the electronic device can identify the target object in the video according to preset rules.
- the target object can be a person, animal, object, action, etc. in the video.
- the electronic device can determine the football as the target object according to the user's selection, or determine the football as the target object according to a preset rule.
- Cropping frame size limit: when an electronic device generates a new video based on the original video, it needs to crop the original video.
- the cropping frame in the embodiment of the present application is determined based on the target object.
- the size of the cropping frame for different video frames may be different.
- the user can define the upper limit and/or lower limit of the size of the cropping frame, so that when the electronic device crops the original video, the size of the cropping frame will not be less than the lower limit defined by the user, and will not be greater than the upper limit defined by the user.
- the electronic device determines that the video generation type is to highlight the target object, and the user can set the upper limit and/or lower limit of the cropping frame size, where l1 is the upper limit value of the vertical length of the cropping frame, w1 is the upper limit value of the horizontal length of the cropping frame, l2 is the lower limit value of the vertical length of the cropping frame, and w2 is the lower limit value of the horizontal length of the cropping frame.
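- the effect of the user-defined limits can be sketched as a clamp on the cropping frame size. The parameter names follow the symbols above (w1 and l1 are the upper limits of the horizontal and vertical lengths, w2 and l2 the lower limits); the function name is hypothetical, and a limit left as None is treated as not set by the user.

```python
def clamp_crop_size(width, height, w1=None, l1=None, w2=None, l2=None):
    """Clamp a cropping frame size to the user-defined limits.

    w1 / l1: upper limits of the horizontal / vertical length.
    w2 / l2: lower limits of the horizontal / vertical length.
    """
    if w2 is not None:
        width = max(width, w2)
    if w1 is not None:
        width = min(width, w1)
    if l2 is not None:
        height = max(height, l2)
    if l1 is not None:
        height = min(height, l1)
    return width, height
```

- for example, with w1 = 80 and l2 = 60, a cropping frame of 100 x 50 would become 80 x 60, while with no limits set the size is left unchanged.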
- the electronic device determines that the video generation type is horizontal video to vertical video, and the user can set the upper limit and/or lower limit of the cropping box size, where w1 is the upper limit of the horizontal length of the cropping box, and w2 is the lower limit of the horizontal length of the cropping box.
- when the electronic device determines that the video generation type is horizontal screen video to vertical screen video, the user can only set the upper limit value and/or lower limit value of the horizontal length of the cropping frame, and the vertical length of the cropping frame can be equal to the vertical length of the video frame of the original video.
- the longitudinal length in the embodiment of the present application may also be referred to as the height.
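- for the horizontal-to-vertical case, a cropping frame whose height equals the height of the original video frame can be sketched as follows. Centering the cropping frame on the horizontal position of the target object and shifting it back inside the frame at the edges are assumptions for illustration; the function and parameter names other than w1 and w2 are hypothetical.

```python
def vertical_crop_box(frame_w, frame_h, target_x, desired_w, w1=None, w2=None):
    """Return (left, top, width, height) of a full-height cropping frame.

    Only the horizontal length is limited (w2 <= width <= w1); the vertical
    length is the full height of the original video frame.
    """
    width = desired_w
    if w2 is not None:
        width = max(width, w2)
    if w1 is not None:
        width = min(width, w1)
    width = min(width, frame_w)  # never wider than the original frame
    left = int(round(target_x - width / 2))
    left = max(0, min(left, frame_w - width))  # keep the box inside the frame
    return left, 0, width, frame_h
```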
- Target object inter-frame speed threshold: in the embodiment of the present application, the size of the cropping frame can be determined by the target object inter-frame speed threshold defined by the user. See below for details.
- the target object inter-frame speed threshold may be a single speed value, for example, the target object inter-frame speed threshold is 70 pixels/s.
- the target object inter-frame speed threshold may be a speed range, for example, the target object inter-frame speed threshold is 70 pixels/s-90 pixels/s.
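- as a rough sketch, the inter-frame speed of the target object in pixels/s can be estimated from its positions in two consecutive video frames and compared against either form of threshold. Measuring speed as the displacement of the target center multiplied by the frame rate is an assumption, as are all names below; how the comparison result drives the cropping frame size is not modeled here.

```python
import math


def inter_frame_speed(prev_center, curr_center, fps):
    """Speed in pixels/s from the target centers in two consecutive frames."""
    dx = curr_center[0] - prev_center[0]
    dy = curr_center[1] - prev_center[1]
    return math.hypot(dx, dy) * fps


def compare_to_threshold(speed, threshold):
    """Compare a speed with a single value or a (low, high) speed range."""
    if isinstance(threshold, tuple):
        low, high = threshold
    else:
        low = high = threshold
    if speed < low:
        return "below"
    if speed > high:
        return "above"
    return "within"
```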
- Frame extraction interval: a frame extraction interval can be defined so that the electronic device can extract video frames from the original video according to the frame extraction interval and crop the extracted video frames to generate a new video.
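- frame extraction can be sketched as follows, assuming the interval is expressed as a number of frames, so that an interval of n keeps every n-th video frame starting from the first one. The unit of the interval and the function name are assumptions; the text above does not state whether the interval is counted in frames or in time.

```python
def extract_frames(frames, interval):
    """Keep every interval-th frame of the original video, starting at frame 0."""
    if interval < 1:
        raise ValueError("interval must be a positive number of frames")
    return frames[::interval]
```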
- Video size: in the embodiment of the present application, a video size can be defined, and the video size is the size of the generated video #2.
- the electronic device determines that the video generation type is to highlight the target object, and the generated video is a horizontal video.
- the user can set the video size, and the video size is the horizontal video size, where l3 is the horizontal length of the horizontal video and w3 is the vertical length of the horizontal video.
- the electronic device determines that the video generation type is horizontal video to vertical video, and the generated video is a vertical video.
- the user can set the video size, and the video size is the vertical video size, where l4 is the horizontal length of the vertical video, and w4 is the vertical length of the vertical video.
- the electronic device can generate a new video based on the user's configuration and the original video.
- the new video focuses on the target object, which makes it easier for the user to watch the main content presented in the video, thereby improving the user experience.
- the interfaces in the GUIs shown in FIGS. 3 to 6 described above may also be understood as windows.
- for example, interface 301 may also be referred to as window 301, and interface 305 may also be referred to as window 305.
- the electronic device can determine the target object in video #1, and crop the video frame of video #1 according to the determined target object, and then generate video #2 according to the cropped video frame.
- the size of the cropping frame of each video frame can be determined according to the target object.
- the size of the cropping frame of each video frame may be different.
- the field of view of the cropped video frame may also be different.
- the field of view of the cropped video frame can be understood as a ratio used to characterize the size of the cropped video frame and the size of the original video frame.
- the size of video frame #1 and video frame #2 is a, wherein the size of the cropping box of video frame #1 is b, and the size of the cropping box of video frame #2 is c, b>c, video frame #1 is cropped to obtain video frame #3, and video frame #2 is cropped to obtain video frame #4. Since b>c and the sizes of video frame #1 and video frame #2 are both a, the ratio of video frame #3 to video frame #1 is greater than the ratio of video frame #4 to video frame #2, that is, the field of view of video frame #3 is greater than the field of view of video frame #4.
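- the field-of-view comparison above can be sketched as follows. This is a minimal illustration only: it assumes that "size" means pixel area, and the function name `field_of_view` and the example dimensions are hypothetical, not taken from the application.

```python
def field_of_view(crop_w, crop_h, frame_w, frame_h):
    """Ratio of the cropping-frame area to the original frame area.

    Assumption: "size" is interpreted as area in pixels.
    """
    if min(crop_w, crop_h, frame_w, frame_h) <= 0:
        raise ValueError("all dimensions must be positive")
    return (crop_w * crop_h) / (frame_w * frame_h)


# Two original frames of the same size a, cropped with boxes b > c.
fov_3 = field_of_view(1280, 720, 1920, 1080)  # cropping box b
fov_4 = field_of_view(960, 540, 1920, 1080)   # cropping box c
```

- because both original frames share the same size, the larger cropping box always yields the larger field of view.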
- FIG. 7 shows a schematic diagram of cropping a video frame provided in an embodiment of the present application.
- the electronic device determines cropping frame #1 in video frame #1, determines cropping frame #2 in video frame #2, and determines cropping frame #3 in video frame #3.
- the size of cropping frame #1 is smaller than the size of cropping frame #2, and the size of cropping frame #2 is smaller than the size of cropping frame #3, and the aspect ratio of the above cropping frame and the video frame can be equal.
- Video frame #4 can be obtained by cropping video frame #1
- video frame #5 can be obtained by cropping video frame #2
- video frame #6 can be obtained by cropping video frame #3. Since the size of crop frame #1 is smaller than that of crop frame #2, and the size of crop frame #2 is smaller than that of crop frame #3, the size of video frame #4 is smaller than that of video frame #5, and the size of video frame #5 is smaller than that of video frame #6, that is, the field of view of video frame #4 is smaller than that of video frame #5, and the field of view of video frame #5 is smaller than that of video frame #6.
- the size of video frame #4 can be expanded to the same size as video frame #1 to obtain video frame #7
- the size of video frame #5 can be expanded to the same size as video frame #2 to obtain video frame #8
- the size of video frame #6 can be expanded to the same size as video frame #3 to obtain video frame #9.
- although video frame #4, video frame #5 and video frame #6 are enlarged to the same size to obtain video frame #7, video frame #8 and video frame #9 respectively, it can still be considered that the field of view of video frame #7 is smaller than that of video frame #8, and the field of view of video frame #8 is smaller than that of video frame #9.
- FIG. 8 shows a schematic diagram of cropping a video frame provided in an embodiment of the present application.
- the sizes of video frame #1, video frame #2 and video frame #3 are the same, video frame #1 is before video frame #2, and video frame #2 is before video frame #3.
- the electronic device determines cropping frame #1 in video frame #1, determines cropping frame #2 in video frame #2, and determines cropping frame #3 in video frame #3.
- the size of cropping frame #1 is larger than the size of cropping frame #2, and the size of cropping frame #2 is larger than the size of cropping frame #3.
- the aspect ratios of the above cropping frames and the video frames may be equal.
- Video frame #4 can be obtained by cropping video frame #1
- video frame #5 can be obtained by cropping video frame #2
- video frame #6 can be obtained by cropping video frame #3. Since the size of crop frame #1 is larger than that of crop frame #2, and the size of crop frame #2 is larger than that of crop frame #3, the size of video frame #4 is larger than that of video frame #5, and the size of video frame #5 is larger than that of video frame #6, that is, the field of view of video frame #4 is larger than that of video frame #5, and the field of view of video frame #5 is larger than that of video frame #6.
- the size of video frame #4 can be expanded to the same size as video frame #1 to obtain video frame #7
- the size of video frame #5 can be expanded to the same size as video frame #2 to obtain video frame #8
- the size of video frame #6 can be expanded to the same size as video frame #3 to obtain video frame #9.
- although video frame #4, video frame #5 and video frame #6 are enlarged to the same size to obtain video frame #7, video frame #8 and video frame #9 respectively, it can still be considered that the field of view of video frame #7 is larger than that of video frame #8, and the field of view of video frame #8 is larger than that of video frame #9.
- the cropping box may gradually increase or decrease with the video frame, but the embodiments of the present application are not limited to this. In other examples, the cropping box may gradually increase and then decrease with the video frame, or gradually decrease and then increase with the video frame.
- video #1 is a horizontal video
- video #2 obtained after cropping is still a horizontal video
- the embodiments of the present application are not limited to this.
- video #1 is a horizontal video
- video #2 obtained after cropping can be a vertical video.
- FIG. 9 shows a schematic diagram of cropping a video frame provided in an embodiment of the present application.
- the sizes of video frame #1, video frame #2 and video frame #3 are the same, video frame #1 is before video frame #2, and video frame #2 is before video frame #3.
- the electronic device determines cropping frame #1 in video frame #1, determines cropping frame #2 in video frame #2, and determines cropping frame #3 in video frame #3.
- the longitudinal lengths of the above cropping frames are the same, the lateral length of cropping frame #1 is greater than the lateral length of cropping frame #2, and the lateral length of cropping frame #2 is greater than the lateral length of cropping frame #3. Therefore, the size of cropping frame #1 is greater than the size of cropping frame #2, and the size of cropping frame #2 is greater than the size of cropping frame #3.
- Video frame #4 can be obtained by cropping video frame #1
- video frame #5 can be obtained by cropping video frame #2
- video frame #6 can be obtained by cropping video frame #3. Since the size of crop frame #1 is larger than that of crop frame #2, and the size of crop frame #2 is larger than that of crop frame #3, the size of video frame #4 is larger than that of video frame #5, and the size of video frame #5 is larger than that of video frame #6, that is, the field of view of video frame #4 is larger than that of video frame #5, and the field of view of video frame #5 is larger than that of video frame #6.
- the horizontal lengths of video frames #4, #5 and #6 can be adjusted to the horizontal length of the vertical video, which can be user-defined or determined according to the size of the screen of the electronic device.
- FIG. 10 shows a schematic flow chart of a video processing method provided in an embodiment of the present application. As shown in FIG. 10, the method includes:
- the electronic device may obtain a first video when playing a video.
- the first video may be an online video or a local video.
- the first video is a video played in horizontal screen.
- the user can upload the first video to edit the first video so that the electronic device can obtain the first video.
- the first video includes N video frames, the sizes of the N video frames may be the same, N>1 and is an integer.
- the N video frames include M objects.
- the M objects may be people, animals, objects, actions, etc.
- the N video frames including the M objects can be understood as the N video frames including the M people, animals, or objects.
- the N video frames including the M objects can be understood as the content presented by the video composed of the N video frames is the M actions.
- the electronic device determines a first video parameter in response to an operation in which the user confirms that a video is to be generated.
- the first video parameter may be preset.
- the first video parameter includes a target object and one or more of the following: a video generation type, a cropping frame size limit, a target object inter-frame speed threshold, a frame extraction interval, a video size, and a video time.
- the electronic device may determine a first video parameter, which is used to generate video #2.
- the electronic device detects that the first video includes a preset target object, and determines a first video parameter, which may be preset.
- the electronic device detects that video #1 includes a shooting action, and determines a first video parameter, which is used to generate video #2.
- the electronic device detects an operation of a user configuring a video parameter and determines a first video parameter.
- the user may set video parameters in interface 601 , wherein the video generation type configured by the user is to highlight the target object, so that the electronic device may determine the first video parameter in response to the user's operation of configuring the video parameter.
- the electronic device may track the target object in each video frame of the first video.
- after the electronic device determines the first video parameter, it can send the first video parameter and the first video to a server, and the server tracks the target object in each video frame of the first video.
- before S1003 (tracking the target object), the method further includes:
- determining L video frames of the first video.
- L video frames of the first video can be determined from the N video frames of the first video, where the L video frames of the first video include the target object determined by the electronic device.
- the electronic device or the server may determine the L video frames of the first video from the N video frames of the first video by the following two possible implementations:
- the electronic device determines L video frames of the first video from N video frames of the first video according to a frame extraction interval, where N>L.
- the frame extraction interval may be user-configured, or may be system-preset or automatically configured.
- the system can determine the frame extraction interval according to the inter-frame speed of the target object when configuring the frame extraction interval. For example, if the inter-frame speed is 70 pixels/s, the frame extraction interval is 2, that is, one video frame is extracted every two video frames; if the inter-frame speed is 50 pixels/s, the frame extraction interval is 3; if the inter-frame speed is 90 pixels/s, the frame extraction interval is 1. In other words, the higher the inter-frame speed, the smaller the frame extraction interval.
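- frame extraction with an interval chosen from the value quoted in pixels/s can be sketched as follows. This is a minimal illustration: the function names are hypothetical, and treating the three example values (50, 70, 90) as band boundaries is an assumption, since the text gives only those three data points.

```python
def extraction_interval(speed_px_per_s):
    """Hypothetical mapping built from the three example values in the text.

    Assumption: values between the examples fall into the nearest lower band.
    """
    if speed_px_per_s >= 90:
        return 1
    if speed_px_per_s >= 70:
        return 2
    return 3


def extract_frames(frames, interval):
    """Keep one frame out of every `interval` frames, starting with the first."""
    if interval < 1:
        raise ValueError("interval must be at least 1")
    return frames[::interval]
```

- with an interval of 2, `extract_frames` keeps frames 0, 2, 4, and so on, so L is roughly N divided by the interval.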
- after the electronic device determines the target object of each video frame, it can determine a cropping frame of each video frame.
- the server may determine a cropping frame for each video frame.
- the electronic device or server may determine the cropping frame of the video frame in the following possible implementations:
- the electronic device or server determines the inter-frame speed of the target object, and determines the cropping frame of each video frame according to the inter-frame speed and the target object.
- the inter-frame speed of the target object can be understood as the ratio of the displacement of the target object between two adjacent video frames to the time between them. For example, as shown in FIG. 11, the center coordinates of the target object in video frame #1 are (x1, y1), and the center coordinates of the target object in video frame #2 are (x2, y2). Video frame #1 and video frame #2 are adjacent video frames, and the time interval between them is t1. The inter-frame speed of the target object between video frame #1 and video frame #2 is then the displacement of its center between the two frames divided by t1.
- the inter-frame speed can be calculated using formula (1).
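- the ratio of displacement to time described above can be sketched as follows. Formula (1) itself is not reproduced in this text, so the use of the straight-line (Euclidean) distance between the two center coordinates is an assumption, and the function name `inter_frame_speed` is hypothetical.

```python
import math


def inter_frame_speed(center_1, center_2, dt):
    """Displacement of the target's center between two adjacent frames over time.

    Assumption: displacement is the Euclidean distance between the centers.
    center_1, center_2: (x, y) in pixels; dt: time interval in seconds.
    """
    if dt <= 0:
        raise ValueError("dt must be positive")
    (x1, y1), (x2, y2) = center_1, center_2
    return math.hypot(x2 - x1, y2 - y1) / dt
```

- the result is in pixels per second, which matches the unit used for the inter-frame speed threshold.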
- to determine the cropping frame, the electronic device needs to determine its position and size: the position can be determined according to the position of the target object in the video frame, and the size can be determined according to the inter-frame speed of the target object.
- the electronic device or server determines 3 video frames, and the order of the 3 video frames is video frame #1, video frame #2 and video frame #3.
- the electronic device or server can identify the target object in the above 3 video frames and determine the inter-frame speed #1 of the target object between video frame #1 and video frame #2, and the inter-frame speed #2 of the target object between video frame #2 and video frame #3.
- the electronic device or server can first determine the cropping frame #1 of video frame #1.
- the electronic device or server can determine the position of cropping frame #1 according to the position of the target object in video frame #1, and can expand outward a certain distance to determine the size of cropping frame #1 while ensuring that the target object in video frame #1 is intact.
- the distance of the outward expansion can be preset by the system, or can be set by the user.
- the electronic device or server can determine cropping frame #2.
- the electronic device or server may determine the position of cropping frame #2 according to the position of the target object in video frame #2, and determine the size of cropping frame #2 according to the size of cropping frame #1 and inter-frame speed #1.
- the electronic device or server may make the size of cropping frame #2 smaller than the size of cropping frame #1.
- the electronic device or server may determine cropping frame #3.
- the electronic device or server may determine the position of cropping frame #3 according to the position of the target object in video frame #3, and determine the size of cropping frame #3 according to the size of cropping frame #2 and inter-frame speed #2.
- the electronic device or server may make the size of cropping frame #3 smaller than the size of cropping frame #2, so that the electronic device or server determines the sizes of three cropping frames, wherein the size of cropping frame #1 is larger than the size of cropping frame #2, and the size of cropping frame #2 is larger than the size of cropping frame #3.
- the threshold of the inter-frame speed may be a system threshold or a user-configured threshold, that is, in the GUI shown in FIG. 6 , the user may configure the threshold of the inter-frame speed in the interface 601 .
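- the way the cropping frame is positioned on the target object and sized from the previous cropping frame and the inter-frame speed can be sketched as follows. This is a minimal illustration: the rule "shrink below the threshold, expand above it", the fixed `step` ratio, and both function names are assumptions and are not specified by the application.

```python
def next_crop_size(prev_w, prev_h, speed, threshold, step=0.1):
    """Derive the next cropping-frame size from the previous one.

    Assumption: shrink by `step` when the target is slower than the threshold,
    expand by `step` when it is faster. Uniform scaling keeps the aspect ratio.
    """
    if speed < threshold:
        scale = 1.0 - step
    elif speed > threshold:
        scale = 1.0 + step
    else:
        scale = 1.0
    return prev_w * scale, prev_h * scale


def crop_position(center_x, center_y, crop_w, crop_h, frame_w, frame_h):
    """Top-left corner of a cropping frame centered on the target object,
    shifted as needed so the cropping frame stays inside the video frame."""
    left = min(max(center_x - crop_w / 2, 0), frame_w - crop_w)
    top = min(max(center_y - crop_h / 2, 0), frame_h - crop_h)
    return left, top
```

- applying `next_crop_size` frame by frame to a slow-moving target yields a sequence such as cropping frame #1 larger than cropping frame #2 larger than cropping frame #3.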
- in a possible implementation, the electronic device or the server determines a cropping frame of a video frame according to a video understanding algorithm.
- the electronic device or server can identify the high-level semantics of the first video frame based on the video understanding algorithm, and the electronic device or server can determine the position and size of the cropping frame of each video frame while ensuring that the high-level semantics remain unchanged.
- because the size of the target object in each video frame of the first video may change, the size of the cropping frame of each video frame determined by the electronic device or the server may be different in order to ensure that the high-level semantics remain unchanged.
- the embodiments of the present application do not limit the video understanding algorithm.
- the video understanding algorithm can be an improved dense trajectory feature (IDT) algorithm, a slow feature analysis algorithm, etc.
- the cropping frame size limit may include an upper limit and/or a lower limit, and the electronic device or server needs to make the size of the cropping frame larger than the lower limit and/or smaller than the upper limit when determining the size of the cropping frame.
- the cropping frame size limit may be preset by the system, or may be configured by the user, i.e., in the GUI shown in FIG. 6, the user may configure the cropping frame size limit in interface 601.
- the cropping frame size limit may be a cropping frame area limit, and the area of the cropping frame must be greater than a lower limit and/or less than an upper limit.
- the cropping frame size limit may be a horizontal length limit and a vertical length limit of the cropping frame, and the horizontal length and the vertical length of the cropping frame must be greater than a lower limit and/or less than an upper limit.
- the cropping frame size limit may be a cropping frame perimeter limit, and the perimeter of the cropping frame must be greater than a lower limit and/or less than an upper limit.
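- applying an area limit to a cropping frame can be sketched as follows. This is a minimal illustration for the area variant only: the function name is hypothetical, and the limits are treated as inclusive bounds for simplicity, which is an assumption.

```python
import math


def clamp_crop_area(crop_w, crop_h, min_area=None, max_area=None):
    """Scale a cropping frame uniformly so its area falls within the limits.

    Assumption: limits are inclusive. Either limit may be None (no limit).
    Uniform scaling keeps the aspect ratio of the cropping frame unchanged.
    """
    area = crop_w * crop_h
    target = area
    if min_area is not None and area < min_area:
        target = min_area
    if max_area is not None and area > max_area:
        target = max_area
    scale = math.sqrt(target / area)
    return crop_w * scale, crop_h * scale
```

- a perimeter or side-length limit could be handled the same way by computing the scale from the perimeter or the side lengths instead of the area.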
- the aspect ratio of the cropping frame determined by the electronic device or the server is the same as the aspect ratio of the video frame of the first video.
- for example, if the horizontal length of the video frame is a and the vertical length is b, the aspect ratio of the video frame is a/b; if the horizontal length of the cropping frame is c and the vertical length is d, the aspect ratio of the cropping frame is c/d, and c/d = a/b.
- the vertical length of the cropping frame determined by the electronic device or the server is the same as the vertical length of the video frame of the first video.
- video frame #1 may be the first video frame of the first video determined by the electronic device or server, and video frame #2 is the video frame after video frame #1.
- the electronic device or server may determine cropping frame #1 based on the target object in video frame #1. After the electronic device determines cropping frame #1, it may determine the position of cropping frame #2 based on the position of the target object in video frame #2, and may determine that the size of cropping frame #2 is smaller than the size of cropping frame #1 based on the inter-frame speed or the video understanding algorithm.
- the video frame #1, video frame #2, cropping frame #1, and cropping frame #2 shown in (a) of FIG. 13 have the same aspect ratio.
- video frame #1 may be the first video frame of the first video determined by the electronic device or the server, and video frame #2 is the video frame after video frame #1.
- the electronic device or the server may determine cropping frame #1 based on the target object in video frame #1.
- after the electronic device determines cropping frame #1, it may determine the position of cropping frame #2 based on the position of the target object in video frame #2, and may determine that the size of cropping frame #2 is larger than the size of cropping frame #1 based on the inter-frame speed or the video understanding algorithm.
- the video frame #1, video frame #2, cropping frame #1, and cropping frame #2 shown in (b) of FIG. 13 have the same aspect ratio.
- video frame #1 may be the first video frame of the first video determined by the electronic device or server
- video frame #2 is the video frame after video frame #1.
- the electronic device or server may determine cropping frame #1 based on target object #1 and target object #2 in video frame #1, and determine cropping frame #2 based on target object #1 and target object #2 in video frame #2.
- the distance between target object #1 and target object #2 in video frame #2 increases, and the size of cropping frame #2 is larger than the size of cropping frame #1.
- the video frame #1, video frame #2, cropping frame #1, and cropping frame #2 shown in (c) of FIG. 13 have the same aspect ratio.
- video frame #1 may be the first video frame of the first video determined by the electronic device or server
- video frame #2 is the video frame after video frame #1.
- the electronic device or server may determine cropping frame #1 based on target object #1 and target object #2 in video frame #1, and determine cropping frame #2 based on target object #1 and target object #2 in video frame #2.
- the distance between target object #1 and target object #2 in video frame #2 is reduced, and the size of cropping frame #2 is smaller than the size of cropping frame #1.
- the video frame #1, video frame #2, cropping frame #1, and cropping frame #2 shown in (d) of FIG. 13 have the same aspect ratio.
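- the dependence of the cropping frame on the distance between two target objects can be sketched as follows. This is a minimal illustration: the box format (left, top, right, bottom), the function names, and the choice to grow the box symmetrically about its center are assumptions; clamping to the video frame is omitted for brevity.

```python
def union_box(boxes):
    """Smallest box (left, top, right, bottom) containing every target box."""
    lefts, tops, rights, bottoms = zip(*boxes)
    return min(lefts), min(tops), max(rights), max(bottoms)


def fit_aspect(box, aspect):
    """Grow a box about its center until width / height equals `aspect`.

    The box is only ever enlarged, so the targets stay fully inside it.
    """
    left, top, right, bottom = box
    w, h = right - left, bottom - top
    if w / h < aspect:
        w = h * aspect   # too narrow: widen
    else:
        h = w / aspect   # too flat: heighten
    cx, cy = (left + right) / 2, (top + bottom) / 2
    return cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2
```

- when the two target objects move apart, `union_box` grows and so does the fitted cropping frame; when they move closer together, both shrink.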
- the electronic device may determine a cropping frame for each video frame based on a video understanding algorithm.
- video frame #1 may be the first video frame of the first video determined by the electronic device or server, and video frame #2 is the video frame after video frame #1.
- the content of video frame #1 is the player kicking the ball
- the content of video frame #2 is the football entering the goal.
- cropping frame #1 includes the player and the football, that is, target object #1 and target object #2.
- cropping frame #2 may only include the football, that is, only include target object #2.
- the electronic device or server may determine the priority of each target object, and determine L cropping frames according to the priority of each target object and the inter-frame speed of each target object.
- video frame #1 may be the first video frame determined by the electronic device or server, and video frame #2 is the video frame after video frame #1.
- the electronic device or server may determine cropping frame #1 based on target object #1 and target object #2 in video frame #1, wherein the priority of target object #1 is higher than the priority of target object #2. If the inter-frame speeds of target object #1 and target object #2 are both less than a threshold, then, because the priority of target object #1 is higher, the cropping frame may be shrunk toward target object #1 relative to cropping frame #1 to obtain cropping frame #2, so as to highlight target object #1 in the new video.
- Video frame #1, video frame #2, cropping frame #1, and cropping frame #2 shown in (f) of FIG. 13 have the same aspect ratio.
- video frame #1 may be the first video frame determined by the electronic device or server, and video frame #2 is a video frame after video frame #1.
- the electronic device or server may determine cropping frame #1 based on target object #1 and target object #2 in video frame #1, wherein the priority of target object #1 is higher than the priority of target object #2. If any of the inter-frame speeds of target object #1 and target object #2 is greater than a threshold, the size of the cropping frame may be expanded to obtain cropping frame #2 compared to cropping frame #1.
- the video frame #1, video frame #2, cropping frame #1, and cropping frame #2 shown in (g) of FIG. 13 have the same aspect ratio.
- S1005 Crop the first video according to the cropping frame.
- after the electronic device or server determines the cropping frame, it can crop the first video to obtain cropped video frames. Since the aspect ratio of the cropping frame is the same as the aspect ratio of the video frame of the first video, the aspect ratio of the cropped video frame is the same as the aspect ratio of the video frame of the first video.
- the electronic device or server obtains the cropped video frame and may perform resampling processing to obtain a second video.
- the sizes of the cropped video frames obtained by the electronic device or server may also be different, but the aspect ratio is the same. After resampling, the size of the video frame of the second video is the same.
- the electronic device or server obtains video frame #4, video frame #5 and video frame #6 after cropping, and obtains video frame #7, video frame #8 and video frame #9 of the same size by resampling them, although video frame #4, video frame #5 and video frame #6 themselves are not the same size.
- the fields of view of video frame #7, video frame #8, and video frame #9 are different.
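- cropping followed by resampling to a common size can be sketched as follows. This is a minimal illustration in plain Python: frames are represented as lists of rows of pixel values, nearest-neighbor sampling is an assumption (the application does not name a resampling method), and both function names are hypothetical.

```python
def crop(frame, left, top, width, height):
    """Cut the cropping-frame region out of a frame (a list of pixel rows)."""
    return [row[left:left + width] for row in frame[top:top + height]]


def resample_nearest(frame, out_w, out_h):
    """Resize a frame to out_w x out_h using nearest-neighbor sampling."""
    in_h, in_w = len(frame), len(frame[0])
    return [
        [frame[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


# Each pixel stores its own (row, column) so the mapping is easy to inspect.
original = [[(r, c) for c in range(8)] for r in range(4)]
cropped = crop(original, left=2, top=1, width=4, height=2)
resized = resample_nearest(cropped, out_w=8, out_h=4)
```

- every output frame has the same size regardless of the cropping frame used, which is why the field of view, rather than the frame size, differs between frames of the second video.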
- the electronic device can determine the target object in the original video and generate a new video.
- the new video is centered on the target object, and the size of the cropping frame of the new video is different from that of the original video, thereby bringing about a change in the field of view, which can better highlight the target object in the original video and enhance the user experience.
- the new video centered on the target object described in the embodiments of the present application means that the overall video content of the new video is centered on the target object, but this does not mean that every video frame of the new video is centered on the target object.
- the cropping frame can be smoothed between frames, so that the target object in some video frames of the new video may be slightly offset relative to the center position of the video.
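- inter-frame smoothing of the cropping frame can be sketched as follows. The application does not specify a smoothing method, so the exponential moving average used here, the `alpha` parameter, and the function name are all assumptions chosen for illustration.

```python
def smooth_boxes(boxes, alpha=0.5):
    """Exponentially smooth a sequence of cropping frames.

    boxes: iterable of (left, top, width, height) tuples, one per frame.
    alpha: weight of the current frame; smaller values smooth more strongly.
    """
    smoothed = []
    prev = None
    for box in boxes:
        if prev is None:
            prev = tuple(float(v) for v in box)
        else:
            prev = tuple(alpha * v + (1 - alpha) * p for v, p in zip(box, prev))
        smoothed.append(prev)
    return smoothed
```

- because the smoothed cropping frame lags behind the raw one, the target object can sit slightly off-center in some frames, which is the offset described above.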
- FIG. 14 shows a schematic flow chart of a video processing method provided in an embodiment of the present application. As shown in the figure, the method includes:
- S1402 Determine a first video parameter.
- the first video is a horizontal screen video.
- the electronic device detects that the screen changes from horizontal to vertical and determines a first video parameter, which may be preset.
- the electronic device detects that the screen has changed from horizontal to vertical, and determines a first video parameter, which is used to generate video #2.
- the electronic device detects an operation of a user configuring a video parameter and determines a first video parameter.
- the user may set video parameters in interface 601, wherein the video generation type configured by the user is horizontal video to vertical video, and thus the electronic device may determine the first video parameter in response to the user's operation of configuring the video parameter.
- before S1403 (tracking the target object), the method further includes:
- determining L video frames of the first video.
- the electronic device or server determines the position and size of the cropping frame.
- the method by which the electronic device or server determines the position and size of the cropping frame is similar to that described above and will not be described in detail here.
- the difference from the method shown in FIG. 10 is that the vertical lengths of the cropping frames determined in this method are the same, but the horizontal lengths are different. In other words, the aspect ratios of the cropping frames are different.
- the horizontal length of the video frame is a, and the vertical length is b; the horizontal length of the cropping frame is c, and the vertical length is b.
- video frame #1 can be the first video frame of the first video determined by the electronic device or the server, and video frame #2 is the video frame after video frame #1.
- the electronic device or the server can determine cropping frame #1 based on the target object in video frame #1.
- after the electronic device determines cropping frame #1, if it determines, based on the inter-frame speed or the video understanding algorithm, that the size of cropping frame #2 is larger than the size of cropping frame #1, cropping frame #2 can be expanded in the direction in which the target object moves.
- the video frame #1 and cropping frame #1 shown in (a) in Figure 15 have the same vertical length but different horizontal lengths, that is, the aspect ratios of video frame #1 and cropping frame #1 are different. Similarly, the aspect ratios of video frame #2 and cropping frame #2 are different.
- video frame #1 may be the first video frame of the first video determined by the electronic device or the server, and video frame #2 is the video frame after video frame #1.
- the electronic device or the server may determine cropping frame #1 based on the target object in video frame #1. After the electronic device determines cropping frame #1, if it determines, based on the inter-frame speed or the video understanding algorithm, that the size of cropping frame #2 is smaller than the size of cropping frame #1, cropping frame #2 may be shrunk in the direction in which the target object moves.
- the video frame #1 and cropping frame #1 shown in (b) of FIG. 15 have different aspect ratios, and the aspect ratios of video frame #2 and cropping frame #2 are also different.
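- resizing a full-height cropping frame in the direction of motion can be sketched as follows. This is a minimal illustration under one reading of the text: the cropping frame always spans the full vertical length of the video frame, so only its left edge and horizontal length change; expanding moves the leading edge forward, and shrinking moves the trailing edge forward. That reading, the function name, and the parameters are assumptions.

```python
def resize_full_height_crop(left, width, new_width, direction, frame_w):
    """Return (new_left, new_width) for a cropping frame that spans the full height.

    direction: +1 if the target object moves right, -1 if it moves left.
    Expanding: the edge ahead of the target moves further in `direction`.
    Shrinking: the edge behind the target moves in `direction`.
    """
    if direction not in (1, -1):
        raise ValueError("direction must be +1 or -1")
    right = left + width
    growing = new_width >= width
    if (direction > 0) == growing:
        new_left = left               # left edge fixed, right edge moves
    else:
        new_left = right - new_width  # right edge fixed, left edge moves
    # Keep the cropping frame inside the video frame horizontally.
    new_left = min(max(new_left, 0), frame_w - new_width)
    return new_left, new_width
```

- for a target moving right, expanding from 600 to 800 pixels keeps the left edge and pushes the right edge outward, while shrinking from 600 to 400 pixels keeps the right edge and pulls the left edge toward it.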
- S1405 Crop the first video according to the cropping frame.
- the electronic device or the server crops the first video according to the cropping frame to obtain cropped video frames.
- the cropped video frames have the same vertical length but different horizontal lengths.
- the cropped video frame can be resampled.
- the horizontal lengths of the resampled video frames are the same, so that the electronic device generates a second video from the resampled video frames, and the second video is a vertical-screen video.
- the electronic device or server obtains video frame #4, video frame #5, and video frame #6 after cropping, and the electronic device or server resamples the above video frames to obtain video frame #7, video frame #8, and video frame #9 with the same horizontal length.
- the electronic device can convert a horizontal video into a vertical video.
- the vertical video is centered on the target object, and the size of the cropping frame of the vertical video is different from that of the original video, thereby bringing about a change in the field of view, which can better highlight the target object in the original video and enhance the user experience.
- FIG. 16 shows a schematic flow chart of a video processing method provided in an embodiment of the present application. As shown in FIG. 16 , the method includes:
- the electronic device may obtain a first video when playing a video.
- the first video may be an online video or a local video.
- the user can upload the first video to edit the first video so that the electronic device can obtain the first video.
- the first video includes N video frames, where N ≥ 2 and is an integer.
- the N video frames include a first video frame and a second video frame.
- the first video frame and the second video frame include at least one object, and the at least one object includes a first target object, and the first target object can be a person, an animal, an object, an action, etc.
- the first video frame and the second video frame including the first target object can be understood as the first video frame and the second video frame including the person, the animal, or the object.
- the first video frame and the second video frame including the first target object can be understood as the content presented by the video composed of the first video frame and the second video frame is the action.
- S1602 Responding to a first operation of the user, where the first operation is an operation of selecting a first target object.
- the electronic device may acquire the second video in response to a first operation of the user, where the first operation is an operation of selecting a first target object.
- the electronic device can mark an object in video #1 in response to a user clicking on control 402 , and can then obtain and play video #2 in response to a user selecting a target object.
- the electronic device detects that the user changes the electronic device from a landscape orientation to a portrait orientation, determines the target object in video #1, and then obtains video #2 and plays video #2.
- the second video acquired by the electronic device includes a third video frame and a fourth video frame, and the third video frame and the fourth video frame include a first target object, wherein the third video frame is obtained by cropping according to the first target object in the first video frame, and the fourth video frame is obtained by cropping according to the first target object in the second video frame, and the size of the third video frame is the same as that of the fourth video frame.
- the at least one object further includes a second target object, and the third video frame and the fourth video frame do not include the second target object.
- the first video frame and the second video frame include the first target object and the second target object. Since the user only selects the first target object, the third video frame and the fourth video frame may not include the second target object.
- the size of the first target object in the first video frame is different from the size of the first target object in the third video frame.
- the sizes of the first target object in the third video frame and the fourth video frame are different.
- the size of the first target object in the first video frame is the same as its size in the second video frame, but the inter-frame speed of the first target object in the first video frame differs from its inter-frame speed in the second video frame. The size of the cropping frame of the first video frame may then differ from the size of the cropping frame of the second video frame. The third video frame and the fourth video frame are generated by resampling the cropped first video frame and the cropped second video frame, respectively, to the same size, so the sizes of the first target object in the third video frame and the fourth video frame are different.
- the sizes of characters in video frame #1, video frame #2 and video frame #3 are the same, and the electronic device or server obtains video frame #4, video frame #5 and video frame #6 after cropping.
- the electronic device or server resamples the above video frames to obtain video frame #7, video frame #8 and video frame #9 of the same size, but the sizes of characters in video frame #7, video frame #8 and video frame #9 are different.
- the size of the first target object in the first video frame is the same as the size of the first target object in the second video frame
- the electronic device determines according to the video understanding algorithm that the size of the cropping frame of the first video frame is different from the size of the cropping frame of the second video frame.
- the third video frame and the fourth video frame are video frames of the same size, and the sizes of the first target object in the third video frame and the fourth video frame are different.
- the sizes of the first target object in the third video frame and the first target object in the fourth video frame are the same.
- the sizes of the first target object in the first video frame and the first target object in the second video frame are different, the inter-frame speed of the first target object in the first video frame and the inter-frame speed of the first target object in the second video frame are different, then the size of the cropping box of the first video frame is different from the size of the cropping box of the second video frame, and the third video frame and the fourth video frame are video frames of the same size, then the sizes of the first target objects in the third video frame and the fourth video frame may be the same.
- the size of the first target object in the first video frame is different from the size of the first target object in the second video frame.
- the electronic device determines based on a video understanding algorithm that the size of the cropping box of the first video frame is different from the size of the cropping box of the second video frame, and the third video frame and the fourth video frame are video frames of the same size. Then the size of the first target object in the third video frame and the fourth video frame may be the same.
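The two cases above (same object size with different cropping frames, and different object sizes with different cropping frames) come down to simple scaling arithmetic. The following is a minimal sketch, not the method of the application: the function name `resampled_object_height`, the pixel values, and the common output height `OUTPUT_H` are assumptions chosen for illustration, and the cropping frame sizes are taken as given rather than derived from a video understanding algorithm.

```python
def resampled_object_height(object_h: float, crop_h: float, output_h: float) -> float:
    """Height of the target object after a crop of height crop_h is resampled to output_h."""
    if crop_h <= 0 or output_h <= 0:
        raise ValueError("crop_h and output_h must be positive")
    return object_h * output_h / crop_h


OUTPUT_H = 1920  # assumed common height of the third and fourth video frames

# Case 1: the object has the same size in both source frames,
# but the cropping frames differ, so the resampled sizes differ.
third = resampled_object_height(object_h=300, crop_h=600, output_h=OUTPUT_H)
fourth = resampled_object_height(object_h=300, crop_h=900, output_h=OUTPUT_H)

# Case 2: the object sizes differ and the cropping frames differ in the same
# proportion, so the resampled sizes may come out the same.
third_b = resampled_object_height(object_h=200, crop_h=600, output_h=OUTPUT_H)
fourth_b = resampled_object_height(object_h=300, crop_h=900, output_h=OUTPUT_H)
```

In this sketch the object's on-screen size depends only on the ratio between the object and its cropping frame, which is why equal output frame sizes do not imply equal object sizes.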
- the number of video frames of the second video is the same as the number of video frames of the first video.
- the number of video frames of the second video is determined according to the number of video frames of the first video and the frame extraction interval, that is, the number of video frames of the second video is less than the number of video frames of the first video.
- the N video frames include a first target object
- the second video includes M video frames
- the M video frames include the first target object
- M ≤ N and M is an integer
- the frame extraction interval in the embodiment of the present application can be set by the user.
- the frame extraction interval in the embodiment of the present application can be system-defined.
- the frame extraction interval in the embodiment of the present application can be determined according to the inter-frame speed between two adjacent video frames of the first target object in N video frames.
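Frame extraction with an interval derived from inter-frame speed can be sketched as follows. This is an illustration under stated assumptions: the text does not say how speed maps to an interval, so the threshold rule in `extraction_interval` (sample more sparsely when the target moves slowly) is hypothetical, as are all function names and default values. Inter-frame speed is taken here to be the pixel distance the target's center moves between two adjacent frames.

```python
import math


def inter_frame_speeds(centers):
    """Pixel distance moved by the target's center between adjacent frames."""
    return [math.dist(a, b) for a, b in zip(centers, centers[1:])]


def extraction_interval(speeds, slow_threshold=5.0, slow_interval=3, fast_interval=1):
    """Hypothetical rule: use a larger interval when the target moves slowly."""
    mean_speed = sum(speeds) / len(speeds) if speeds else 0.0
    return slow_interval if mean_speed < slow_threshold else fast_interval


def extract_frames(frames, interval):
    """Keep every interval-th frame, so the result has M <= N frames."""
    if interval < 1:
        raise ValueError("interval must be at least 1")
    return frames[::interval]


# N = 6 frames in which the target moves 1 pixel per frame (slow motion).
centers = [(x, 0) for x in range(6)]
frames = [f"frame{i}" for i in range(6)]

speeds = inter_frame_speeds(centers)
interval = extraction_interval(speeds)
second_video = extract_frames(frames, interval)
```

With an interval of 1 the second video keeps all N frames (M = N); any larger interval gives M < N, which matches the two alternatives described above.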
- the first video and the second video have different durations.
- the first video has a duration of 10 minutes
- the second video has a duration of 2 minutes.
- the first video is a horizontal video
- the second video is a vertical video
- the first video frame and the second video frame are horizontal video frames
- the third video frame and the fourth video frame are vertical video frames
- the third video frame and the fourth video frame have different heights (or vertical lengths).
- video frame # 7 , video frame # 8 , and video frame # 9 are vertical video frames and the vertical lengths of video frame # 7 , video frame # 8 , and video frame # 9 are different.
- after the electronic device obtains the second video, it can play the second video.
- the electronic device can determine the first target object in the original video according to the user's operation and play a new video.
- the new video is centered on the first target object, which can better highlight the first target object in the original video and enhance the user experience.
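One way to picture a vertical crop that stays centered on the target inside a horizontal frame is sketched below. The function `centered_crop`, the 1920x1080 source size, and the 608x1080 cropping frame are assumptions for illustration only; the application does not fix these values, and it also allows the cropping frame size to vary from frame to frame.

```python
def centered_crop(frame_w, frame_h, center_x, center_y, crop_w, crop_h):
    """Return (left, top, right, bottom) of a crop centered on the target.

    The box is shifted as needed so that it stays inside the frame.
    """
    if crop_w > frame_w or crop_h > frame_h:
        raise ValueError("cropping frame must fit inside the video frame")
    left = min(max(center_x - crop_w // 2, 0), frame_w - crop_w)
    top = min(max(center_y - crop_h // 2, 0), frame_h - crop_h)
    return left, top, left + crop_w, top + crop_h


# Horizontal 1920x1080 frame, vertical cropping frame of roughly 9:16.
middle = centered_crop(1920, 1080, center_x=960, center_y=540, crop_w=608, crop_h=1080)
near_left = centered_crop(1920, 1080, center_x=100, center_y=540, crop_w=608, crop_h=1080)
near_right = centered_crop(1920, 1080, center_x=1900, center_y=540, crop_w=608, crop_h=1080)
```

When the target is close to an edge, the box is clamped rather than padded, so the target is then off-center in the cropped frame; padding would be an equally valid design choice.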
- the method further comprises:
- S1604 playing the second video, including:
- the second video is played in full screen
- the method further includes:
- after the electronic device obtains the first video, it can play the first video on interface 401.
- the electronic device can identify the object of the first video and determine the target object in response to the user's selection, and can play video #2 on interface 401.
- after video #2 is played on interface 401, if the target object is not detected, video #1 continues to be played.
- the method further comprises:
- a first interface is displayed, wherein the first interface displays a first window and a second window, wherein the first window displays a first video frame, and the second window displays a third video frame.
- the electronic device displays a window 601 and a window 602, wherein the window 601 may display the first video frame, and the window 602 may display the third video frame.
- the method further comprises:
- S1604 playing the second video, including:
- a second interface is displayed, the second interface includes a first window and a second window, wherein the first window plays the first video and the second window plays the second video;
- the method further includes:
- a third interface is displayed, the third interface includes the first window, the first video continues to be played in the first window, and the third interface does not include the second window.
- the first window is the window for playing video #1
- the second window is the window for playing video #2.
- the first window can be a full-screen window, then the area of the first window is the same as the area of the interface, and the second window can be a small window, or a floating window, which can be displayed above the first window.
- the electronic device first displays window 301 in the interface to play video #1.
- window 301 and window 305 are displayed in the interface, wherein window 301 plays video #1 and window 305 plays video #2.
- window 301 plays video #1 and window 305 plays video #2.
- after the electronic device finishes playing video #2 in window 305, if the target object is not detected, only window 301 is displayed in the interface to play video #1.
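The window behaviour described above (the second window appears while the target object is detected and disappears once it is not) can be sketched as a scan over per-frame detection results. The function `second_window_segments` and the boolean-list input are assumptions for illustration; the application does not specify how detection results are represented.

```python
def second_window_segments(detected):
    """Return half-open (start, end) frame ranges during which the second window is shown.

    detected[i] is True when the target object is detected in frame i of video #1.
    Outside the returned ranges only the first window is displayed.
    """
    segments = []
    start = None
    for index, flag in enumerate(detected):
        if flag and start is None:
            start = index  # target appears: show the second window
        elif not flag and start is not None:
            segments.append((start, index))  # target lost: hide the second window
            start = None
    if start is not None:
        segments.append((start, len(detected)))
    return segments


flags = [False, True, True, False, False, True]
segments = second_window_segments(flags)
```

The same ranges could drive the full-screen variant, in which video #2 replaces video #1 during each range instead of being shown in a floating window.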
- before responding to the first operation of the user at S1602, the method further includes:
- a third interface is displayed, the third interface including a fifth window including at least one object in the first video.
- after the electronic device obtains the first video, it can play the first video in interface (or window) 401.
- the electronic device can identify the object of the first video and display the identified object in interface (or window) 403.
- the first operation is an operation of selecting the first target object in the fifth window.
- after the electronic device obtains the first video, it can play the first video in interface (or window) 401.
- the electronic device can identify the object of the first video and display the identified object in interface (or window) 403. The user can select the target object in interface (or window) 403.
- the at least one object further includes a second target object
- the method further includes:
- a third video is acquired, wherein the third video includes a fifth video frame and a sixth video frame, the fifth video frame includes the first target object and/or the second target object, the sixth video frame includes the first target object and/or the second target object, the fifth video frame is obtained by cropping according to the first target object and/or the second target object in the first video frame, and the sixth video frame is obtained by cropping according to the first target object and/or the second target object in the second video frame.
- the second operation is an operation for generating a highlighted target object. For example, as shown in FIG. 3 , the user clicks on the control 304 .
- the electronic device may determine the cropped video frame based on a video understanding algorithm, or based on the inter-frame speeds of the first target object and the second target object together with the priorities of the first target object and the second target object.
- the cropped video frame may include both the first target object and the second target object or only include any one of the first target object and the second target object.
- the video frame after the video frame #1 is cropped includes the target object #1 and the target object #2
- the video frame after the video frame #2 is cropped includes the target object #1 but does not include the target object #2.
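The choice between keeping both target objects and keeping only one can be sketched as a union-box test with a priority fallback. This is a hypothetical rule, not the one used in the application: the names `union_box` and `crop_region`, the `max_width` limit, and the numeric priorities are all assumptions, and inter-frame speed is left out for brevity. Boxes are `(left, top, right, bottom)` tuples in pixels.

```python
def union_box(a, b):
    """Smallest box that contains both boxes a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def crop_region(box1, box2, priority1, priority2, max_width):
    """Keep both targets if they fit within max_width, else keep the higher-priority one."""
    both = union_box(box1, box2)
    if both[2] - both[0] <= max_width:
        return both
    return box1 if priority1 >= priority2 else box2


target1_frame1, target2_frame1 = (100, 200, 300, 600), (350, 220, 500, 580)
target1_frame2, target2_frame2 = (100, 200, 300, 600), (1500, 220, 1650, 580)

# Video frame #1: the targets are close together, so the crop covers both.
region1 = crop_region(target1_frame1, target2_frame1, priority1=2, priority2=1, max_width=608)
# Video frame #2: the targets are far apart, so only target object #1 is kept.
region2 = crop_region(target1_frame2, target2_frame2, priority1=2, priority2=1, max_width=608)
```

The returned region would still need to be expanded or clamped to the desired output aspect ratio before cropping.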
- the above mainly introduces a method of video processing provided by an embodiment of the present application from the perspective of an electronic device and a server. It is understandable that, in order to realize the above functions, the electronic device and the server include hardware structures and/or software modules corresponding to the execution of each function.
- the present application can be implemented in the form of hardware or a combination of hardware and computer software. Whether a function is executed in the form of hardware or computer software driving hardware depends on the specific application and design constraints of the technical solution. Professional and technical personnel can use different methods to implement the described functions for each specific application, but such implementation should not be considered to be beyond the scope of the present application.
- the embodiment of the present application can divide the functional modules of the processors in the electronic device and the server according to the above method example.
- each functional module can be divided corresponding to each function, or two or more functions can be integrated into one processing module.
- the above integrated modules can be implemented in the form of hardware or in the form of software functional modules. It should be noted that the division of modules in the embodiment of the present application is schematic and is only a logical function division. There may be other division methods in actual implementation.
- FIG. 17 shows a schematic diagram of the composition of an electronic device provided in an embodiment of the present application.
- the electronic device 1700 includes: an acquisition module 1710 and a video processing module 1720.
- the acquisition module 1710 is used to obtain a first video.
- the video processing module 1720 is used to obtain a second video in response to a user operation.
- the video processing module 1720 is also used to play the second video.
- the video processing module 1720 is further configured to obtain a third video in response to a second operation of the user.
- the second video includes a third video frame and a fourth video frame, and the sizes of the first target object in the third video frame and the first target object in the fourth video frame are different.
- the first video includes a first video frame and a second video frame
- the first target object in the first video frame and the first target object in the second video frame are of the same size
- the inter-frame speed of the first target object in the first video frame and the inter-frame speed of the first target object in the second video frame are different.
- the second video includes a third video frame and a fourth video frame, and the first target object in the third video frame and the first target object in the fourth video frame have the same size.
- the first video includes a first video frame and a second video frame
- the sizes of the first target object in the first video frame and the first target object in the second video frame are different
- the inter-frame speed of the first target object in the first video frame and the inter-frame speed of the first target object in the second video frame are different.
- the video processing module 1720 is further used to display a first interface, where the first interface displays a first window and a second window, wherein the first window displays a first video frame, and the second window displays a third video frame.
- the video processing module 1720 is further configured to:
- the video processing module 1720 is specifically configured to play the second video in full screen when the first target object is detected;
- the video processing module 1720 is further configured to continue playing the first video after the second video is played in full screen and when the first target object is not detected.
- the number of video frames of the second video is determined based on the number of video frames of the first video and the frame extraction interval.
- the video processing module 1720 is further configured to determine a frame extraction interval according to an inter-frame speed of the first target object.
- the first video and the second video have different durations.
- the second video includes a third video frame and a fourth video frame
- the third video frame and the fourth video frame are vertical screen video frames
- the vertical length of the third video frame is different from the vertical length of the fourth video frame.
- FIG. 18 shows a schematic diagram of the composition of a server provided in an embodiment of the present application.
- the server 1800 includes: a transceiver module 1810 and a video processing module 1820.
- the transceiver module 1810 is used to obtain a first video.
- the video processing module 1820 is used to generate a second video in response to a first operation of the user.
- the transceiver module 1810 is further configured to send the second video to the electronic device.
- the video processing module 1820 is specifically configured to: determine a cropping frame of the first video in response to a first operation of the user;
- a second video is generated according to the first video and a cropping frame group of the first video.
- the video processing module 1820 is further configured to generate a third video in response to a second operation of the user.
- the second video includes a third video frame and a fourth video frame, and the sizes of the first target object in the third video frame and the first target object in the fourth video frame are different.
- the first video includes a first video frame and a second video frame
- the first target object in the first video frame and the first target object in the second video frame are of the same size
- the inter-frame speed of the first target object in the first video frame and the inter-frame speed of the first target object in the second video frame are different.
- the second video includes a third video frame and a fourth video frame, and the first target object in the third video frame and the first target object in the fourth video frame have the same size.
- the first video includes a first video frame and a second video frame
- the sizes of the first target object in the first video frame and the first target object in the second video frame are different
- the inter-frame speed of the first target object in the first video frame and the inter-frame speed of the first target object in the second video frame are different.
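Generating the second video from the first video and a group of cropping frames, as described for the video processing module 1820, can be sketched on plain nested lists. This is a toy illustration under stated assumptions: frames are 2D lists of pixel values, nearest-neighbor resampling stands in for whatever resampling the server actually uses, and the names `crop`, `resample_nearest`, and `generate_second_video` are hypothetical.

```python
def crop(frame, box):
    """Cut the (left, top, right, bottom) region out of a 2D frame."""
    left, top, right, bottom = box
    return [row[left:right] for row in frame[top:bottom]]


def resample_nearest(frame, out_w, out_h):
    """Nearest-neighbor resampling of a 2D frame to out_w x out_h."""
    in_h, in_w = len(frame), len(frame[0])
    return [
        [frame[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


def generate_second_video(first_video, crop_boxes, out_w, out_h):
    """Crop each frame with its own cropping frame, then resample to one common size."""
    if len(first_video) != len(crop_boxes):
        raise ValueError("one cropping frame is required per video frame")
    return [
        resample_nearest(crop(frame, box), out_w, out_h)
        for frame, box in zip(first_video, crop_boxes)
    ]


frame = [[row * 4 + col for col in range(4)] for row in range(4)]
first_video = [frame, frame]
crop_boxes = [(0, 0, 2, 2), (0, 0, 4, 4)]  # cropping frames of different sizes
second_video = generate_second_video(first_video, crop_boxes, out_w=2, out_h=2)
```

Both output frames are 2x2 even though the cropping frames differ in size, which mirrors the statement that the third and fourth video frames have the same size while the target object in them may not.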
- the present application also provides an electronic device, including: one or more processors, a memory, application programs, and one or more computer programs.
- the above-mentioned components can be connected through one or more communication buses.
- the one or more computer programs are stored in the above-mentioned memory and configured to be executed by the one or more processors.
- the one or more computer programs include instructions, which can be used to enable the electronic device to execute each step of the electronic device in the above embodiments.
- the processor may specifically be the processor 110 shown in FIG. 1
- the memory may specifically be the internal memory 120 shown in FIG. 1 and/or an external memory connected to the electronic device.
- An embodiment of the present application also provides a chip, which includes a processor and a communication interface, wherein the communication interface is used to receive a signal and transmit the signal to the processor, and the processor processes the signal so that the video processing method described in any possible implementation method in the foregoing text is executed.
- This embodiment further provides a computer-readable storage medium, in which computer instructions are stored.
- when the computer instructions are executed on an electronic device, the electronic device executes the above-mentioned related method steps to implement the video processing method in the above-mentioned embodiment.
- This embodiment further provides a computer program product.
- when the computer program product is run on a computer, the computer is enabled to execute the above-mentioned related steps to implement the video processing method in the above-mentioned embodiment.
- the term “when” or “after” may be interpreted to mean “if” or “after” or “in response to determining” or “in response to detecting”, depending on the context.
- the phrase “if it is determined” or “if (the stated condition or event) is detected” may be interpreted to mean “upon determining” or “in response to determining” or “upon detecting (the stated condition or event)” or “in response to detecting (the stated condition or event)”, depending on the context.
- the disclosed systems, devices and methods can be implemented in other ways.
- the device embodiments described above are only schematic.
- the division of the units is only a logical function division. There may be other division methods in actual implementation, such as multiple units or components can be combined or integrated into another system, or some features can be ignored or not executed.
- in addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection between devices or units through some interfaces, and may be electrical, mechanical, or in other forms.
- the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place or distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
- each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
- if the functions are implemented in the form of software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium.
- based on this understanding, the technical solution of the present application, in essence, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product.
- the computer software product is stored in a storage medium and includes several instructions for a computer device (which can be a personal computer, server, or network device, etc.) to perform all or part of the steps of the methods described in each embodiment of the present application.
- the aforementioned storage media include: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc, and other media that can store program code.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computer Graphics (AREA)
- Databases & Information Systems (AREA)
- Business, Economics & Management (AREA)
- Marketing (AREA)
- User Interface Of Digital Computer (AREA)
Claims (19)
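- A video processing method, characterized in that the method comprises: acquiring a first video, the first video comprising N video frames, N ≥ 2 and N being an integer, wherein the N video frames comprise a first video frame and a second video frame, the first video frame and the second video frame comprise at least one object, and the at least one object comprises a first target object; responding to a first operation of a user, the first operation being an operation of selecting the first target object; acquiring a second video, wherein the second video comprises a third video frame and a fourth video frame, the third video frame and the fourth video frame comprise the first target object, the third video frame is obtained by cropping according to the first target object in the first video frame, and the fourth video frame is obtained by cropping according to the first target object in the second video frame; and playing the second video.
- The method according to claim 1, characterized in that the at least one object further comprises a second target object, and the method further comprises: responding to a second operation of the user, the second operation being used to select the first target object and the second target object; and acquiring a third video, wherein the third video comprises a fifth video frame and a sixth video frame, the fifth video frame comprises the first target object and/or the second target object, the sixth video frame comprises the first target object and/or the second target object, the fifth video frame is obtained by cropping according to the first target object and/or the second target object in the first video frame, and the sixth video frame is obtained by cropping according to the first target object and/or the second target object in the second video frame.
- The method according to claim 1, characterized in that the at least one object further comprises a second target object, and the third video frame and the fourth video frame do not comprise the second target object.
- The method according to any one of claims 1 to 3, characterized in that the size of the first target object in the third video frame is different from the size of the first target object in the first video frame.
- The method according to any one of claims 1 to 4, characterized in that the sizes of the first target object in the third video frame and the first target object in the fourth video frame are different.
- The method according to claim 5, characterized in that the sizes of the first target object in the first video frame and the first target object in the second video frame are the same, and the inter-frame speed of the first target object in the first video frame and the inter-frame speed of the first target object in the second video frame are different.
- The method according to any one of claims 1 to 6, characterized in that the method further comprises: displaying a first interface, the first interface displaying a first window and a second window, wherein the first window displays the first video frame and the second window displays the third video frame.
- The method according to any one of claims 1 to 6, characterized in that, before the playing of the second video, the method further comprises: playing the first video in full screen; the playing of the second video comprises: when the first target object is detected, playing the second video in full screen; and after the second video is played in full screen, the method further comprises: when the first target object is not detected, continuing to play the first video in full screen.
- The method according to any one of claims 1 to 6, characterized in that the method further comprises: displaying a first interface, the first interface comprising a first window, and playing the first video in the first window; the playing of the second video comprises: when the first target object is detected, displaying a second interface, the second interface comprising the first window and a second window, wherein the first window plays the first video and the second window plays the second video; and after the second interface is displayed, the method further comprises: when the first target object is not detected, displaying a third interface, the third interface comprising the first window, wherein the first video continues to be played in the first window and the third interface does not comprise the second window.
- The method according to any one of claims 1 to 9, characterized in that, before the responding to the first operation of the user, the method further comprises: displaying a third interface, the third interface comprising a fifth window, the fifth window comprising the at least one object in the first video.
- The method according to claim 10, characterized in that the first operation is an operation of selecting the first target object in the fifth window.
- The method according to any one of claims 1 to 11, characterized in that the N video frames comprise the first target object, the second video comprises M video frames, the M video frames comprise the first target object, M ≤ N and M is an integer.
- The method according to claim 12, characterized in that the method further comprises: determining a frame extraction interval according to the inter-frame speed of the first target object between two adjacent video frames among the N video frames.
- The method according to any one of claims 1 to 11, characterized in that the first video is a horizontal video, the second video is a vertical video, the first video frame and the second video frame are horizontal video frames, and the third video frame and the fourth video frame are vertical video frames.
- The method according to claim 14, characterized in that the heights of the third video frame and the fourth video frame are different.
- An electronic device, characterized by comprising one or more processors and one or more memories, wherein the one or more memories store one or more computer programs, the one or more computer programs comprise instructions, and when the instructions are executed by the one or more processors, the method according to any one of claims 1 to 15 is performed.
- A chip, characterized in that the chip comprises a processor and a communication interface, the communication interface is used to receive a signal and transmit the signal to the processor, and the processor processes the signal so that the method according to any one of claims 1 to 15 is performed.
- A computer-readable storage medium, characterized in that computer instructions are stored in the computer-readable storage medium, and when the computer instructions are run on a computer, the method according to any one of claims 1 to 15 is performed.
- A computer program product containing instructions, characterized in that, when the computer program product is run on a computer, the computer is caused to perform the method according to any one of claims 1 to 15.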
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| EP23896720.2A EP4618561A4 (en) | 2022-11-30 | 2023-11-27 | VIDEO PROCESSING METHOD AND ELECTRONIC DEVICE |
| US19/223,863 US20250291467A1 (en) | 2022-11-30 | 2025-05-30 | Video Processing Method and Electronic Device |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202211519957.8A CN118118734A (zh) | 2022-11-30 | 2022-11-30 | Video processing method and electronic device |
| CN202211519957.8 | 2022-11-30 | | |
Related Child Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US19/223,863 Continuation US20250291467A1 (en) | 2022-11-30 | 2025-05-30 | Video Processing Method and Electronic Device |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2024114569A1 (zh) | 2024-06-06 |
Family
ID=91220007
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2023/134290 Ceased WO2024114569A1 (zh) | 2022-11-30 | 2023-11-27 | Video processing method and electronic device |
Country Status (4)
| Country | Link |
|---|---|
| US (1) | US20250291467A1 (zh) |
| EP (1) | EP4618561A4 (zh) |
| CN (1) | CN118118734A (zh) |
| WO (1) | WO2024114569A1 (zh) |
Families Citing this family (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN121217980A (zh) * | 2024-06-26 | 2025-12-26 | 北京字跳网络技术有限公司 | Video processing method, apparatus, device, and storage medium |
Citations (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
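| WO2019127868A1 (zh) * | 2017-12-29 | 2019-07-04 | 广州优视网络科技有限公司 | Method, apparatus, and terminal for switching between landscape and portrait screens |
| CN112135188A (zh) * | 2020-09-16 | 2020-12-25 | 咪咕文化科技有限公司 | Video cropping method, electronic device, and computer-readable storage medium |
| CN113438436A (zh) * | 2020-03-23 | 2021-09-24 | 阿里巴巴集团控股有限公司 | Video playback method, video conference method, live streaming method, and related devices |
| CN114724055A (zh) * | 2021-01-05 | 2022-07-08 | 华为技术有限公司 | Video switching method, apparatus, storage medium, and device |
| CN114816210A (zh) * | 2019-06-25 | 2022-07-29 | 华为技术有限公司 | Full-screen display method and device for a mobile terminal |
| CN115174994A (zh) * | 2021-04-01 | 2022-10-11 | 腾讯科技(深圳)有限公司 | Video processing method, apparatus, computer device, and storage medium |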
Family Cites Families (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US10623662B2 (en) * | 2016-07-01 | 2020-04-14 | Snap Inc. | Processing and formatting video for interactive presentation |
| US10084970B2 (en) * | 2016-12-05 | 2018-09-25 | International Institute Of Information Technology, Hyderabad | System and method for automatically generating split screen for a video of a dynamic scene |
| CN113014793A (zh) * | 2019-12-19 | 2021-06-22 | 华为技术有限公司 | 一种视频处理方法及电子设备 |
- 2022
  - 2022-11-30 CN CN202211519957.8A patent/CN118118734A/zh active Pending
- 2023
  - 2023-11-27 WO PCT/CN2023/134290 patent/WO2024114569A1/zh not_active Ceased
  - 2023-11-27 EP EP23896720.2A patent/EP4618561A4/en active Pending
- 2025
  - 2025-05-30 US US19/223,863 patent/US20250291467A1/en active Pending
Non-Patent Citations (1)
| Title |
|---|
| See also references of EP4618561A1 |
Also Published As
| Publication number | Publication date |
|---|---|
| CN118118734A (zh) | 2024-05-31 |
| EP4618561A4 (en) | 2026-01-28 |
| EP4618561A1 (en) | 2025-09-17 |
| US20250291467A1 (en) | 2025-09-18 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 23896720; Country of ref document: EP; Kind code of ref document: A1 |
| | WWE | Wipo information: entry into national phase | Ref document number: 2023896720; Country of ref document: EP |
| | ENP | Entry into the national phase | Ref document number: 2023896720; Country of ref document: EP; Effective date: 20250613 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | WWP | Wipo information: published in national office | Ref document number: 2023896720; Country of ref document: EP |