WO2024114569A1 - Video processing method and electronic device - Google Patents

Video processing method and electronic device

Info

Publication number
WO2024114569A1
Authority
WO
WIPO (PCT)
Prior art keywords
video
frame
target object
video frame
electronic device
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2023/134290
Other languages
English (en)
French (fr)
Inventor
王悦
钟伟才
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to EP23896720.2A (EP4618561A4)
Publication of WO2024114569A1
Priority to US19/223,863 (US20250291467A1)

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44016Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving splicing one content stream with another content stream, e.g. for substituting a video clip
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0481Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0484Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
    • G06F3/04842Selection of displayed objects or displayed text elements
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/431Generation of visual interfaces for content selection or interaction; Content or additional data rendering
    • H04N21/4312Generation of visual interfaces for content selection or interaction; Content or additional data rendering involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/431Generation of visual interfaces for content selection or interaction; Content or additional data rendering
    • H04N21/4312Generation of visual interfaces for content selection or interaction; Content or additional data rendering involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations
    • H04N21/4316Generation of visual interfaces for content selection or interaction; Content or additional data rendering involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations for displaying supplemental content in a region of the screen, e.g. an advertisement in a separate window
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44008Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/4402Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/4402Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N21/440245Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display the reformatting operation being performed only on part of the stream, e.g. a region of the image or a time segment
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/472End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • H04N21/4728End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content for selecting a Region Of Interest [ROI], e.g. for requesting a higher resolution version of a selected region
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00Television systems
    • H04N7/01Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level
    • H04N7/0117Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level involving conversion of the spatial resolution of the incoming video signal
    • H04N7/0122Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level involving conversion of the spatial resolution of the incoming video signal the input and the output signals having different aspect ratios
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2203/00Indexing scheme relating to G06F3/00 - G06F3/048
    • G06F2203/048Indexing scheme relating to G06F3/048
    • G06F2203/04803Split screen, i.e. subdividing the display area or the window area into separate subareas

Definitions

  • the present application relates to the field of electronic devices, and more specifically, to a video processing method and an electronic device.
  • the present application provides a video processing method and an electronic device, which can play a new video that highlights the target object in the original video according to the user's operation, thereby improving the user experience.
  • a method for video processing comprising: acquiring a first video, the first video comprising N video frames, N ≥ 2 and N being an integer, wherein the N video frames comprise a first video frame and a second video frame, the first video frame and the second video frame comprise at least one object, and the at least one object comprises a first target object; responding to a first operation of a user, the first operation being an operation of selecting the first target object; acquiring a second video, wherein the second video comprises a third video frame and a fourth video frame, the third video frame and the fourth video frame comprise the first target object, the third video frame is obtained by cropping the first target object in the first video frame, and the fourth video frame is obtained by cropping the first target object in the second video frame; and playing the second video.
  • the electronic device can determine the first target object in the original video according to the user's operation and play a new video.
  • the new video is centered on the first target object, which can better highlight the first target object in the original video and enhance the user experience.
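The cropping step described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each frame is a list of pixel rows and that a per-frame bounding box of the first target object is already available; the names `crop_around_target` and `make_second_video`, the box layout, and the fixed output size are hypothetical and are not taken from the application.

```python
from typing import List, Sequence, Tuple

Frame = List[List[int]]          # hypothetical frame representation: rows of pixel values
Box = Tuple[int, int, int, int]  # hypothetical target box: (left, top, width, height)


def crop_around_target(frame: Frame, box: Box, out_w: int, out_h: int) -> Frame:
    """Return an out_w x out_h crop centered on the target box, clamped to the frame."""
    frame_h, frame_w = len(frame), len(frame[0])
    if out_w > frame_w or out_h > frame_h:
        raise ValueError("crop size exceeds frame size")
    left, top, w, h = box
    cx, cy = left + w // 2, top + h // 2
    # Center the crop on the target, then shift it back inside the frame if needed.
    x0 = min(max(cx - out_w // 2, 0), frame_w - out_w)
    y0 = min(max(cy - out_h // 2, 0), frame_h - out_h)
    return [row[x0:x0 + out_w] for row in frame[y0:y0 + out_h]]


def make_second_video(first_video: Sequence[Frame], boxes: Sequence[Box],
                      out_w: int, out_h: int) -> List[Frame]:
    """Crop every frame of the first video around its target box."""
    return [crop_around_target(f, b, out_w, out_h) for f, b in zip(first_video, boxes)]
```

In this sketch the first and second frames of the input yield the third and fourth frames of the output. How the target is detected and how the crop size is chosen are left open here.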
  • the at least one object also includes a second target object
  • the method also includes: in response to a second operation of the user, acquiring a third video, wherein the third video includes a fifth video frame and a sixth video frame, the fifth video frame includes the first target object and/or the second target object, the sixth video frame includes the first target object and/or the second target object, the fifth video frame is obtained by cropping according to the first target object and/or the second target object in the first video frame, and the sixth video frame is obtained by cropping according to the first target object and/or the second target object in the second video frame.
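One way to picture cropping "according to the first target object and/or the second target object" is to crop around a region that encloses whichever targets are present. The sketch below assumes hypothetical `(left, top, width, height)` boxes and a helper named `union_box`; the application does not state that a union region is used, so this is only one possible reading.

```python
from typing import Optional, Tuple

Box = Tuple[int, int, int, int]  # hypothetical layout: (left, top, width, height)


def union_box(a: Optional[Box], b: Optional[Box]) -> Optional[Box]:
    """Smallest box enclosing both targets; falls back to whichever one is present."""
    if a is None:
        return b
    if b is None:
        return a
    left = min(a[0], b[0])
    top = min(a[1], b[1])
    right = max(a[0] + a[2], b[0] + b[2])
    bottom = max(a[1] + a[3], b[1] + b[3])
    return (left, top, right - left, bottom - top)
```

The resulting box could then serve as the region around which the fifth and sixth video frames are cropped.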
  • the at least one object further includes a second target object, and the third video frame and the fourth video frame do not include the second target object.
  • a size of the first target object in the third video frame is different from a size of the first target object in the first video frame.
  • the first target object in the third video frame and the first target object in the fourth video frame have different sizes.
  • the size of the first target object in the first video frame is the same as the size of the first target object in the second video frame, and the inter-frame rate of the first target object in the first video frame and the inter-frame rate of the first target object in the second video frame are different.
  • the electronic device can determine the target object in the original video and generate a new video.
  • the new video is centered on the target object, and the size of the cropping frame of the new video is different from that of the original video, thereby bringing about a change in the field of view, which can better highlight the target object in the original video and enhance the user experience.
  • the first target object in the third video frame and the first target object in the fourth video frame are the same size.
  • the size of the first target object in the first video frame is different from the size of the first target object in the second video frame, and the inter-frame rate of the first target object in the first video frame is different from the inter-frame rate of the first target object in the second video frame.
  • the electronic device can determine the target object in the original video and generate a new video.
  • the new video is centered on the target object, and the size of the cropping frame of the new video is different from that of the original video, thereby bringing about a change in the field of view, which can better highlight the target object in the original video and enhance the user experience.
  • the method further includes: displaying a first interface, the first interface displaying a first window and a second window, wherein the first window displays the first video frame, and the second window displays the third video frame.
  • the method also includes: playing the first video; the playing of the second video includes: when the first target object is detected, playing the second video in full screen; after playing the second video in full screen and the first target object is not detected, the method also includes: continuing to play the first video.
  • the method also includes: displaying a first interface, the first interface including a first window, and playing the first video in the first window; playing the second video includes: when the first target object is detected, displaying a second interface, the second interface including a first window and a second window, wherein the first window plays the first video and the second window plays the second video; after displaying the second interface, the method also includes: when the first target object is not detected, displaying a third interface, the third interface including the first window, continuing to play the first video in the first window, and the third interface does not include the second window.
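The two playback behaviours above (switching to full screen, or adding a second window, while the first target object is detected) can be summarised as a per-frame decision. The following sketch assumes a hypothetical list of per-frame detection flags and a hypothetical `split_screen` switch; neither name comes from the application.

```python
from typing import List, Sequence, Tuple


def playback_plan(target_detected: Sequence[bool], split_screen: bool) -> List[Tuple[str, ...]]:
    """For each frame, list which videos are shown.

    split_screen=False models the full-screen variant; split_screen=True models
    the variant with a first window and a second window.
    """
    plan = []
    for detected in target_detected:
        if not detected:
            plan.append(("first_video",))                  # keep playing the original
        elif split_screen:
            plan.append(("first_video", "second_video"))   # first and second window
        else:
            plan.append(("second_video",))                 # cropped video in full screen
    return plan
```

When the target is no longer detected, both variants fall back to showing only the first video, which matches the "continuing to play the first video" step.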
  • before responding to the user's first operation, the method also includes: displaying a third interface, the third interface including a fifth window, and the fifth window including the at least one object in the first video.
  • the first operation is an operation of selecting the first target object in the fifth window.
  • the N video frames include the first target object
  • the second video includes M video frames
  • the M video frames include the first target object
  • M ≤ N and M is an integer
  • the method further includes: determining a frame extraction interval according to an inter-frame speed of the first target object between two adjacent video frames in the N video frames.
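The text does not say how the inter-frame speed maps to a frame extraction interval. The sketch below is therefore an assumption throughout: it measures speed as the displacement of the target's center between adjacent frames, and it uses hypothetical thresholds (`slow`, `fast`) with the hypothetical rule that slow motion allows a larger interval while fast motion keeps every frame.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def inter_frame_speeds(centers: Sequence[Point]) -> List[float]:
    """Displacement of the target center between each pair of adjacent frames."""
    return [math.dist(p, q) for p, q in zip(centers, centers[1:])]


def extraction_interval(speed: float, slow: float = 2.0, fast: float = 10.0) -> int:
    """Hypothetical mapping: the faster the target moves, the smaller the interval."""
    if speed >= fast:
        return 1
    if speed >= slow:
        return 2
    return 3


def extract_frames(centers: Sequence[Point]) -> List[int]:
    """Indices of the frames kept for the second video (M kept out of N)."""
    speeds = inter_frame_speeds(centers)
    kept = [0]
    i = 0
    while i < len(speeds):
        i += extraction_interval(speeds[i])
        if i < len(centers):
            kept.append(i)
    return kept
```

For seven input frames in which the target is slow at first and then fast, this keeps four frames, so M ≤ N holds. The opposite mapping (a larger interval for fast motion) would be equally easy to express if that is what an implementation requires.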
  • the first video is a horizontal video
  • the second video is a vertical video
  • the first video frame and the second video frame are horizontal video frames
  • the third video frame and the fourth video frame are vertical video frames.
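Turning a horizontal frame into a vertical one amounts to choosing a tall crop window inside the wide frame. The sketch below assumes a 9:16 output ratio, a full-height crop, and a known horizontal center of the target; these values and the name `vertical_crop_window` are illustrative assumptions rather than details given in the application.

```python
from typing import Tuple


def vertical_crop_window(frame_w: int, frame_h: int, target_cx: int,
                         aspect_w: int = 9, aspect_h: int = 16) -> Tuple[int, int, int, int]:
    """Return (left, top, width, height) of a full-height vertical crop around the target."""
    crop_h = frame_h
    crop_w = (frame_h * aspect_w) // aspect_h   # width that gives the assumed aspect ratio
    if crop_w > frame_w:
        raise ValueError("frame is too narrow for the requested aspect ratio")
    # Center on the target horizontally, then clamp so the window stays inside the frame.
    left = min(max(target_cx - crop_w // 2, 0), frame_w - crop_w)
    return (left, 0, crop_w, crop_h)
```

For a 1920 x 1080 frame this gives a window 607 pixels wide. A target near either edge pins the window to that edge instead of letting it leave the frame.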
  • the first video and the second video have different durations.
  • the third video frame and the fourth video frame have different heights.
  • the second aspect is an electronic device of an embodiment of the present application, which includes modules/units for executing the above aspect or any possible design of the above aspect; these modules/units can be implemented by hardware, or by hardware executing corresponding software.
  • the third aspect is a chip of an embodiment of the present application, which is coupled to a memory in an electronic device and is used to call a computer program stored in the memory and execute the above-mentioned aspects of the embodiment of the present application and any possible design of the above-mentioned aspects of the embodiment of the present application; "coupling" in the embodiment of the present application refers to the direct or indirect combination of two components with each other.
  • the fourth aspect is a computer-readable storage medium according to an embodiment of the present application, wherein the computer-readable storage medium includes a computer program.
  • when the computer program runs on an electronic device, the electronic device executes the technical solution of the above aspect and any possible design of the above aspect.
  • the fifth aspect is a computer program according to an embodiment of the present application, wherein the computer program includes instructions.
  • when the instructions are executed on a computer, the computer executes the technical solution of the above aspect and any possible design of the above aspect.
  • the sixth aspect is a graphical user interface on an electronic device of an embodiment of the present application, wherein the electronic device has a display screen, one or more memories, and one or more processors, wherein the one or more processors are used to execute one or more computer programs stored in the one or more memories, and the graphical user interface includes a graphical user interface displayed when the electronic device executes the above aspect and any possible technical solution of the above aspect.
  • the seventh aspect is an electronic device of an embodiment of the present application, which includes one or more processors; one or more memories; the one or more memories store one or more computer programs, and the one or more computer programs include instructions.
  • when the instructions are executed by the one or more processors, the above aspect or any possible implementation of the above aspect is executed.
  • FIG. 1 is a schematic diagram of the structure of an electronic device provided in an embodiment of the present application.
  • FIG. 2 is a software structure block diagram of an electronic device provided in an embodiment of the present application.
  • FIG. 3 is a set of GUIs provided in an embodiment of the present application.
  • FIG. 4 is a set of GUIs provided in an embodiment of the present application.
  • FIG. 5 is a set of GUIs provided in an embodiment of the present application.
  • FIG. 6 is a set of GUIs provided in an embodiment of the present application.
  • FIG. 7 is a schematic diagram of a cropped video frame provided in an embodiment of the present application.
  • FIG. 8 is a schematic diagram of a cropped video frame provided in an embodiment of the present application.
  • FIG. 9 is a schematic diagram of a cropped video frame provided in an embodiment of the present application.
  • FIG. 10 is a schematic flowchart of a video processing method provided in an embodiment of the present application.
  • FIG. 11 is a schematic diagram of determining the inter-frame rate provided in an embodiment of the present application.
  • FIG. 12 is a schematic diagram showing a comparison of the aspect ratios of an original video frame and a cropped video frame provided in an embodiment of the present application.
  • FIG. 13 is a schematic diagram of determining a cropping frame provided in an embodiment of the present application.
  • FIG. 14 is a schematic flowchart of a video processing method provided in an embodiment of the present application.
  • FIG. 15 is a schematic diagram of determining a cropping frame provided in an embodiment of the present application.
  • FIG. 16 is a schematic flowchart of a video processing method provided in an embodiment of the present application.
  • FIG. 17 is a schematic diagram of the composition of an electronic device provided in an embodiment of the present application.
  • FIG. 18 is a schematic diagram of the composition of a server provided in an embodiment of the present application.
  • A and/or B can represent: A exists alone, A and B exist at the same time, or B exists alone, where A and B can be singular or plural.
  • the character "/" generally indicates that the objects associated before and after are in an "or" relationship.
  • references to "one embodiment" or "some embodiments" etc. described in this specification mean that a particular feature, structure or characteristic described in conjunction with the embodiment is included in one or more embodiments of the present application.
  • the phrases "in one embodiment", "in some embodiments", "in some other embodiments", etc. appearing in different places in this specification do not necessarily all refer to the same embodiment, but mean "one or more but not all embodiments", unless otherwise specifically emphasized.
  • the terms "including", "comprising", "having" and their variations all mean "including but not limited to", unless otherwise specifically emphasized.
  • the electronic device may be a portable electronic device that also includes other functions such as a personal digital assistant and/or a music player function, such as a mobile phone, a tablet computer, a wearable electronic device with wireless communication function (such as a smart watch), etc.
  • portable electronic devices include, but are not limited to, portable electronic devices equipped with … or other operating systems.
  • the portable electronic device may also be other portable electronic devices, such as a laptop computer, etc. It should also be understood that in some other embodiments, the electronic device may not be a portable electronic device, but a desktop computer.
  • FIG. 1 shows a schematic diagram of the structure of an electronic device 100.
  • the electronic device 100 may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone interface 170D, a sensor module 180, a compass 190, a motor 191, an indicator 192, a camera 193, a display screen 194, and a subscriber identification module (SIM) card interface 195, etc.
  • the structure illustrated in the embodiment of the present application does not constitute a specific limitation on the electronic device 100.
  • the electronic device 100 may include more or fewer components than shown in the figure, or combine some components, or split some components, or arrange the components differently.
  • the components shown in the figure may be implemented in hardware, software, or a combination of software and hardware.
  • the processor 110 may include one or more processing units.
  • for example, the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU), etc.
  • different processing units may be independent components or integrated in one or more processors.
  • the electronic device 100 may also include one or more processors 110.
  • the controller may generate an operation control signal according to the instruction opcode and the timing signal to complete the control of fetching and executing instructions.
  • a memory may also be provided in the processor 110 for storing instructions and data.
  • the memory in the processor 110 may be a cache memory.
  • the memory may store instructions or data that the processor 110 has just used or uses repeatedly. If the processor 110 needs the instructions or data again, it can read them directly from this memory. This avoids repeated access, reduces the waiting time of the processor 110, and improves the efficiency of the electronic device 100 in processing data or executing instructions.
  • the processor 110 may include one or more interfaces.
  • the interface may include an inter-integrated circuit (I2C) interface, an inter-integrated circuit sound (I2S) interface, a pulse code modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input/output (GPIO) interface, a SIM card interface, and/or a USB interface.
  • the USB interface 130 is an interface that complies with the USB standard specification, and specifically can be a Mini USB interface, a Micro USB interface, a USB Type C interface, etc.
  • the USB interface 130 can be used to connect a charger to charge the electronic device 100, and can also be used to transmit data between the electronic device 100 and a peripheral device.
  • the USB interface 130 can also be used to connect headphones to play audio through the headphones.
  • the interface connection relationship between the modules illustrated in the embodiment of the present application is only a schematic illustration and does not constitute a structural limitation on the electronic device 100.
  • the electronic device 100 may also adopt interface connection methods different from those in the above embodiments, or a combination of multiple interface connection methods.
  • the charging management module 140 is used to receive charging input from a charger.
  • the charger may be a wireless charger or a wired charger.
  • the charging management module 140 may receive charging input from a wired charger through the USB interface 130.
  • the charging management module 140 may receive wireless charging input through a wireless charging coil of the electronic device 100. While the charging management module 140 is charging the battery 142, it may also power the electronic device through the power management module 141.
  • the power management module 141 is used to connect the battery 142, the charging management module 140 and the processor 110.
  • the power management module 141 receives input from the battery 142 and/or the charging management module 140, and supplies power to the processor 110, the internal memory 121, the external memory, the display screen 194, the camera 193, and the wireless communication module 160.
  • the power management module 141 can also be used to monitor parameters such as battery capacity, battery cycle number, battery health status (leakage, impedance), etc.
  • the power management module 141 can also be set in the processor 110.
  • the power management module 141 and the charging management module 140 can also be set in the same device.
  • the wireless communication function of the electronic device 100 can be implemented through the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, the modem processor and the baseband processor.
  • Antenna 1 and antenna 2 are used to transmit and receive electromagnetic wave signals.
  • Each antenna in the electronic device 100 can be used to cover one or more communication frequency bands. Different antennas can also be multiplexed to improve antenna utilization.
  • for example, antenna 1 can be multiplexed as a diversity antenna for a wireless local area network.
  • the antenna can be used in combination with a tuning switch.
  • the mobile communication module 150 can provide solutions for wireless communications including 2G/3G/4G/5G, etc., applied to the electronic device 100.
  • the mobile communication module 150 may include at least one filter, a switch, a power amplifier, a low noise amplifier (LNA), etc.
  • the mobile communication module 150 may receive electromagnetic waves from the antenna 1, perform filtering, amplification, and other processing on the received electromagnetic waves, and transmit them to the modem processor for demodulation.
  • the mobile communication module 150 may also amplify the signal modulated by the modem processor and convert it into electromagnetic waves for radiation through the antenna 1.
  • at least some of the functional modules of the mobile communication module 150 may be arranged in the processor 110.
  • at least some of the functional modules of the mobile communication module 150 may be arranged in the same device as at least some of the modules of the processor 110.
  • the wireless communication module 160 can provide wireless communication solutions including wireless local area networks (WLAN) (such as wireless fidelity (WiFi) networks), bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), infrared (IR), etc., which are applied to the electronic device 100.
  • the wireless communication module 160 can be one or more devices integrating at least one communication processing module.
  • the wireless communication module 160 receives electromagnetic waves via the antenna 2, modulates the frequency of the electromagnetic wave signal and performs filtering, and sends the processed signal to the processor 110.
  • the wireless communication module 160 can also receive the signal to be sent from the processor 110, modulate the frequency of it, amplify it, and convert it into electromagnetic waves for radiation through the antenna 2.
  • the electronic device 100 implements the display function through a GPU, a display screen 194, and an application processor.
  • the GPU is a microprocessor for image processing, which connects the display screen 194 and the application processor.
  • the GPU is used to perform mathematical and geometric calculations for graphics rendering.
  • the processor 110 may include one or more GPUs that execute program instructions to generate or change display information.
  • the display screen 194 is used to display images, videos, etc.
  • the display screen 194 includes a display panel.
  • the display panel can be a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a Mini LED, a Micro LED, a Micro OLED, a quantum dot light-emitting diode (QLED), etc.
  • the electronic device 100 may include one or more display screens 194.
  • the display screen 194 in FIG. 1 can be bent.
  • the display screen 194 can be bent, which means that the display screen can be bent to any angle at any position and can be maintained at the angle.
  • the display screen 194 can be folded in half from the middle to the left or right. It can also be folded in half from the middle to the top or bottom.
  • the display screen 194 of the electronic device 100 can be a flexible screen.
  • the flexible screen has attracted much attention for its unique characteristics and huge potential.
  • flexible screens have the characteristics of strong flexibility and bendability, which can provide users with a new interaction method based on the bendable characteristics, and can meet users' more needs for electronic devices.
  • the foldable display screen on the electronic device can be switched between a small screen in a folded form and a large screen in an unfolded form at any time. Therefore, users use the split-screen function on electronic devices equipped with a foldable display screen more and more frequently.
  • the electronic device 100 can realize the shooting function through the ISP, the camera 193, the video codec, the GPU, the display screen 194 and the application processor.
  • the ISP is used to process the data fed back by the camera 193. For example, when taking a photo, the shutter is opened, and the light is transmitted to the camera photosensitive element through the lens. The light signal is converted into an electrical signal, and the camera photosensitive element transmits the electrical signal to the ISP for processing and converts it into an image visible to the naked eye.
  • the ISP can also perform algorithm optimization on the noise, brightness, and skin color of the image. The ISP can also optimize the exposure, color temperature and other parameters of the shooting scene. In some embodiments, the ISP can be set in the camera 193.
  • the camera 193 is used to capture still images or videos.
  • the object generates an optical image through the lens and projects it onto the photosensitive element.
  • the photosensitive element can be a charge coupled device (CCD) or a complementary metal oxide semiconductor (CMOS) phototransistor.
  • the photosensitive element converts the optical signal into an electrical signal, and then passes the electrical signal to the ISP to be converted into a digital image signal.
  • the ISP outputs the digital image signal to the DSP for processing.
  • the DSP converts the digital image signal into an image signal in a standard RGB, YUV or other format.
  • the electronic device 100 may include one or more cameras 193.
  • the digital signal processor is used to process digital signals, and can process not only digital image signals but also other digital signals. For example, when the electronic device 100 is selecting a frequency point, the digital signal processor is used to perform Fourier transform on the frequency point energy.
  • Video codecs are used to compress or decompress digital videos.
  • the electronic device 100 may support one or more video codecs. In this way, the electronic device 100 may play or record videos in a variety of coding formats, such as Moving Picture Experts Group (MPEG) 1, MPEG2, MPEG3, MPEG4, etc.
  • NPU is a neural network (NN) computing processor.
  • through the NPU, intelligent cognition applications of the electronic device 100 can be realized, such as image recognition, face recognition, voice recognition, text understanding, etc.
  • the external memory interface 120 can be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the electronic device 100.
  • the external memory card communicates with the processor 110 through the external memory interface 120 to implement a data storage function. For example, files such as music and videos can be stored in the external memory card.
  • the internal memory 121 can be used to store one or more computer programs, which include instructions.
  • the processor 110 can enable the electronic device 100 to perform the methods provided in some embodiments of the present application, as well as various applications and data processing, etc. by running the above instructions stored in the internal memory 121.
  • the internal memory 121 may include a program storage area and a data storage area.
  • the program storage area can store an operating system; the program storage area can also store one or more applications (such as a gallery, contacts, etc.).
  • the data storage area can store data (such as photos, contacts, etc.) created during the use of the electronic device 100.
  • the internal memory 121 may include a high-speed random access memory, and may also include a non-volatile memory, such as one or more disk storage components, a flash memory component, a universal flash storage (UFS), etc.
  • the processor 110 can enable the electronic device 100 to perform the methods provided in the embodiments of the present application, as well as other applications and data processing by running instructions stored in the internal memory 121, and/or instructions stored in a memory provided in the processor 110.
  • the electronic device 100 can implement audio functions through the audio module 170, the speaker 170A, the receiver 170B, the microphone 170C, the headphone interface 170D, and the application processor. For example, music playback, recording, etc.
  • the sensor module 180 may include a pressure sensor 180A, a gyroscope sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, and the like.
  • the pressure sensor 180A is used to sense the pressure signal and can convert the pressure signal into an electrical signal.
  • the pressure sensor 180A can be set on the display screen 194.
  • the capacitive pressure sensor can be a parallel plate including at least two conductive materials.
  • the electronic device 100 determines the intensity of the pressure according to the change in capacitance.
  • the electronic device 100 detects the touch operation intensity according to the pressure sensor 180A.
  • the electronic device 100 can also calculate the touch position according to the detection signal of the pressure sensor 180A.
  • touch operations acting on the same touch position but with different touch operation intensities can correspond to different operation instructions. For example: when a touch operation with a touch operation intensity less than the first pressure threshold acts on the short message application icon, an instruction to view the short message is executed. When a touch operation with a touch operation intensity greater than or equal to the first pressure threshold acts on the short message application icon, an instruction to create a new short message is executed.
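  • The threshold rule in the short message example can be sketched as follows. This is a minimal illustration only: the function name, the instruction labels, and the numeric values are hypothetical and do not come from the application.

```python
def instruction_for_touch(intensity: float, first_pressure_threshold: float) -> str:
    """Pick the instruction for a touch on the short message application icon.

    Hypothetical helper: the returned labels are illustrative names only.
    """
    if intensity < first_pressure_threshold:
        # Intensity below the first pressure threshold: view the short message.
        return "view_short_message"
    # Intensity greater than or equal to the threshold: create a new short message.
    return "create_short_message"


if __name__ == "__main__":
    print(instruction_for_touch(0.2, first_pressure_threshold=0.5))
    print(instruction_for_touch(0.8, first_pressure_threshold=0.5))
```

  • The boundary case (intensity equal to the threshold) falls on the "create" side, matching the "greater than or equal to" wording of the example.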
  • the gyro sensor 180B can be used to determine the motion posture of the electronic device 100.
  • the angular velocity of the electronic device 100 around three axes (i.e., the X, Y, and Z axes) can be determined through the gyro sensor 180B.
  • the gyro sensor 180B can be used for anti-shake shooting. For example, when the shutter is pressed, the gyro sensor 180B detects the angle of the electronic device 100 shaking, calculates the distance that the lens module needs to compensate based on the angle, and allows the lens to offset the shaking of the electronic device 100 through reverse movement to achieve anti-shake.
  • the gyro sensor 180B can also be used for navigation and somatosensory game scenes.
  • the acceleration sensor 180E can detect the magnitude of the acceleration of the electronic device 100 in all directions (generally three axes). When the electronic device 100 is stationary, the magnitude and direction of gravity can be detected. It can also be used to identify the posture of the electronic device and is applied to applications such as horizontal and vertical screen switching and pedometers.
  • the ambient light sensor 180L is used to sense the brightness of the ambient light.
  • the electronic device 100 can adaptively adjust the brightness of the display screen 194 according to the perceived ambient light brightness.
  • the ambient light sensor 180L can also be used to automatically adjust the white balance when taking pictures.
  • the ambient light sensor 180L can also cooperate with the proximity light sensor 180G to detect whether the electronic device 100 is in a pocket to prevent accidental touches.
  • the fingerprint sensor 180H is used to collect fingerprints.
  • the electronic device 100 can use the collected fingerprint characteristics to implement fingerprint unlocking, access application locks, fingerprint photography, fingerprint call answering, etc.
  • the temperature sensor 180J is used to detect temperature.
  • the electronic device 100 uses the temperature detected by the temperature sensor 180J to execute a temperature processing strategy. For example, when the temperature reported by the temperature sensor 180J exceeds a threshold, the electronic device 100 reduces the performance of a processor located near the temperature sensor 180J to reduce power consumption and implement thermal protection. In other embodiments, when the temperature is lower than another threshold, the electronic device 100 heats the battery 142 to avoid abnormal shutdown of the electronic device 100 due to low temperature. In other embodiments, when the temperature is lower than another threshold, the electronic device 100 boosts the output voltage of the battery 142 to avoid abnormal shutdown caused by low temperature.
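  • A temperature processing strategy of this kind might be sketched as below. The threshold values, the action labels, and the choice of a single configurable low-temperature action are assumptions made for illustration; the application describes battery heating and output-voltage boosting as separate embodiments and gives no numeric thresholds.

```python
LOW_TEMP_ACTIONS = ("heat_battery", "boost_battery_output_voltage")


def thermal_action(temp_c: float,
                   high_threshold_c: float = 45.0,   # hypothetical value
                   low_threshold_c: float = 0.0,     # hypothetical value
                   low_temp_action: str = "heat_battery") -> str:
    """Return the action for a temperature reported by the temperature sensor."""
    if low_temp_action not in LOW_TEMP_ACTIONS:
        raise ValueError(f"unknown low-temperature action: {low_temp_action}")
    if temp_c > high_threshold_c:
        # Too hot: lower the performance of the processor near the sensor.
        return "reduce_nearby_processor_performance"
    if temp_c < low_threshold_c:
        # Too cold: apply whichever low-temperature embodiment is configured.
        return low_temp_action
    return "no_action"


if __name__ == "__main__":
    for reading in (50.0, 25.0, -5.0):
        print(reading, thermal_action(reading))
```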
  • the touch sensor 180K is also called a "touch panel".
  • the touch sensor 180K can be set on the display screen 194, and the touch sensor 180K and the display screen 194 form a touchscreen, also called a "touch screen".
  • the touch sensor 180K is used to detect touch operations acting on or near it.
  • the touch sensor can pass the detected touch operation to the application processor to determine the type of touch event.
  • Visual output related to the touch operation can be provided through the display screen 194.
  • the touch sensor 180K can also be set on the surface of the electronic device 100, which is different from the position of the display screen 194.
  • FIG. 2 is a software structure diagram of the electronic device 100 of an embodiment of the present application.
  • the layered architecture divides the software into several layers, each layer has a clear role and division of labor.
  • the layers communicate with each other through software interfaces.
  • the Android system is divided into four layers, from top to bottom, namely, the application layer, the application framework layer, the Android runtime and the system library, and the kernel layer.
  • the application layer can include a series of application packages.
  • the application package may include applications such as camera, gallery, calendar, call, map, navigation, WLAN, Bluetooth, music, video, and short message.
  • the application framework layer provides application programming interface (API) and programming framework for applications in the application layer.
  • the application framework layer includes some predefined functions.
  • the application framework layer may include a window manager, a content provider, a view system, a phone manager, a resource manager, a notification manager, and the like.
  • the window manager is used to manage window programs.
  • the window manager can obtain the size of the display screen, determine whether there is a status bar, lock the screen, capture the screen, etc.
  • Content providers are used to store and retrieve data and make it accessible to applications.
  • This data can include videos, images, audio, calls made and received, browsing history and bookmarks, phone books, etc.
  • the view system includes visual controls, such as controls for displaying text, controls for displaying images, etc.
  • the view system can be used to build applications.
  • a display interface can be composed of one or more views.
  • a display interface including a text notification icon can include a view for displaying text and a view for displaying images.
  • the phone manager is used to provide communication functions of the electronic device 100, such as management of call status (including connecting, hanging up, etc.).
  • the resource manager provides various resources for applications, such as localized strings, icons, images, layout files, video files, and so on.
  • the notification manager enables applications to display notification information in the status bar. It can be used to convey notification-type messages and can disappear automatically after a short stay without user interaction. For example, the notification manager is used to notify download completion, message reminders, etc.
  • a notification can also appear in the status bar at the top of the system in the form of a chart or scroll-bar text, such as a notification from an application running in the background, or appear on the screen in the form of a dialog window. For example, text information is shown in the status bar, a prompt sound is emitted, the electronic device vibrates, or an indicator light flashes.
  • the system library can include multiple functional modules, such as surface manager, media library, 3D graphics processing library (such as OpenGL ES), 2D graphics engine (such as SGL), etc.
  • the surface manager is used to manage the display subsystem and provide the fusion of 2D and 3D layers for multiple applications.
  • the media library supports playback and recording of a variety of commonly used audio and video formats, as well as static image files, etc.
  • the media library can support a variety of audio and video encoding formats, such as: MPEG4, H.264, MP3, AAC, AMR, JPG and PNG, etc.
  • the 3D graphics processing library is used to implement 3D graphics drawing, image rendering, synthesis and layer processing.
  • a 2D graphics engine is a drawing engine for 2D drawings.
  • the kernel layer is the layer between hardware and software.
  • the kernel layer includes at least display driver, camera driver, audio driver, and sensor driver.
  • the present application provides a method for video processing, which can lock the target object in the original video, and can adjust the size of the cropping frame when editing the video according to the target object, and can output a new video that highlights the target object in the original video, which can improve the user experience.
  • the target object of a video can be understood as the main content presented by the video.
  • the target object of a video can be a person, animal, object, action, etc. in the video.
  • the target object of the football match video can be the football, the player with the ball, the star player, the player's passing action, the player's foul action, etc.
  • the electronic device can determine the target object in the video according to the user's operation and generate a new video.
  • FIG. 3 shows a set of graphical user interfaces (GUI) provided by an embodiment of the present application.
  • the electronic device displays an interface 301, which is a playback interface of a video application.
  • the electronic device can play video #1 on the interface 301, and the content of the video #1 is a person jumping from position #1 to position #2.
  • the bold solid line in the interface 301 is the boundary of video #1, and the electronic device can display other information outside the boundary, such as time, signal strength, etc.
  • the electronic device detects the operation of the user clicking on the interface 301, and in response to the operation, it can display the GUI shown in (b) of FIG. 3.
  • the electronic device may display one or more controls on the interface 301, and the one or more controls may correspond to different functions.
  • control 302 corresponds to the sharing function
  • control 303 corresponds to the exit function
  • control 304 corresponds to the function of generating a target object video.
  • the function of generating a target object video can be understood as the electronic device selecting an object in the original video as the target object and generating a new video based on the target object.
  • the video #1 can be cropped with the dotted box in (b) of Figure 3 as the cropping box, so that the GUI shown in (c) of Figure 3 can be displayed.
  • the electronic device in response to detecting that the user clicks on the control 304, can crop video #1 using the dotted frame in (b) in FIG. 3 as the cropping frame.
  • the cropping frame selects the jumping action of the character in video #1 as the target object, and expands the cropped video frame to the same size as video #1 to generate video #2 and play it on the interface 301.
  • the content of video #2 is centered on the jumping action of the character, and can highlight the jumping action of the character.
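  • The crop-then-expand step described above can be sketched on a single frame as follows. The frame is modeled as a list of pixel rows, and nearest-neighbor scaling is used purely as an assumption for illustration; the application does not state which interpolation method is used, and the function name is hypothetical.

```python
def crop_and_expand(frame, box, out_w, out_h):
    """Crop `frame` to `box` and scale the result to out_w x out_h.

    frame: list of rows, each row a list of pixel values.
    box:   (left, top, width, height) of the cropping frame, in pixels.
    Nearest-neighbor scaling is an illustrative choice, not the patent's method.
    """
    left, top, w, h = box
    cropped = [row[left:left + w] for row in frame[top:top + h]]
    # Map every output pixel back to its nearest source pixel in the crop.
    return [
        [cropped[y * h // out_h][x * w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


if __name__ == "__main__":
    # 4x4 test frame whose pixel value encodes its position (row * 4 + column).
    frame = [[r * 4 + c for c in range(4)] for r in range(4)]
    for row in crop_and_expand(frame, box=(1, 1, 2, 2), out_w=4, out_h=4):
        print(row)
```

  • Applying this to every frame of video #1, with the cropping frame placed around the target object, yields frames of the same size as video #1 in which the target object occupies a larger share of the picture.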
  • the electronic device in response to detecting the user clicking on control 304, may select the jumping action of the character in video #1 as the target object to generate video #2, and display interface 305 on interface 301, and preview and play video #2 on interface 305.
  • video #2 may be played in full screen.
  • the electronic device can automatically generate a new video based on the original video.
  • the new video retains the main content of the original video, which can facilitate users to watch the main content presented by the video, thereby improving the user experience.
  • the electronic device can automatically identify the human action in the video as the target object to generate a new video. In other examples, the electronic device can also determine the target object and generate a new video based on the user's selection.
  • FIG. 4 shows another set of GUIs provided by an embodiment of the present application.
  • the electronic device displays an interface 401, which is a playback interface of a video application.
  • the electronic device can play video #1 on the interface 401, and the video #1 is a football match video.
  • the objects in the video #1 include player #1, player #2, and football.
  • the electronic device can also display a control 402 on the interface 401, and the control 402 corresponds to the function of generating a target object video.
  • the electronic device detects the operation of the user clicking on the control 402, and in response to the operation, the object of the video can be identified and the interface 403 can be displayed, and the interface 403 includes information about the object in the video #1.
  • a GUI as shown in (e) or (f) of Figure 4 can be displayed.
  • the object selected by the user can be referred to as the target object.
  • the user can also set the duration of the generated new video in interface 403. For example, as shown in (b) in Figure 4, the user sets the video time to 00:00-05:30.
  • when the electronic device generates a new video, it can determine the target object from the video content of the 00:00-05:30 time period of video #1 and generate a new video.
  • the duration of the generated new video is 5:30.
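  • As a small worked example of the time setting, the sketch below parses a range such as 00:00-05:30 into start and end offsets in seconds. The "MM:SS-MM:SS" text format and the function name are assumptions for illustration; the application only shows the range as displayed in the interface.

```python
def parse_time_range(text: str) -> tuple:
    """Parse 'MM:SS-MM:SS' into (start_seconds, end_seconds)."""
    def to_seconds(mmss: str) -> int:
        minutes, seconds = mmss.strip().split(":")
        return int(minutes) * 60 + int(seconds)

    start_text, end_text = text.split("-")
    start, end = to_seconds(start_text), to_seconds(end_text)
    if end < start:
        raise ValueError("end of range precedes start")
    return start, end


if __name__ == "__main__":
    start, end = parse_time_range("00:00-05:30")
    # The new video covers end - start seconds of the original video.
    print(start, end, end - start)
```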
  • the electronic device detects the user's operation of clicking on control 402, and in response to the operation, the electronic device can select the identified object in interface 401.
  • when the electronic device detects the user's operation of selecting an object in interface 401 (e.g., clicking on player #2 and the football), it can, in response to the operation, display a GUI as shown in (e) or (f) of FIG. 4.
  • the object selected by the user can be referred to as a target object.
  • the electronic device can automatically identify the target object in the video. Taking a shot as the detected target object, for example, when the electronic device detects that player #2 in video #1 shoots, it can display a GUI as shown in (d) or (e) in Figure 4.
  • the electronic device may detect the target object in video #1 in the process of playing video #1 according to a preset rule, which may be set by the system or by the user.
  • the preset rule is to automatically detect a shot in a football video when playing a football video.
  • an option box 404 may be displayed, and the option box 404 includes prompt information for prompting the user that a shot action is detected.
  • the electronic device may play video #2 centered on the shot of player #2 on interface 401, that is, the electronic device displays a GUI as shown in (e) or (f) of FIG. 4.
  • in response to detecting the user's selection of the football and player #2 as target objects, or to automatically identifying the football and player #2 as target objects, the electronic device can select the football and player #2 in video #1 as target objects to generate video #2 and play it on interface 401.
  • in response to detecting that the user selects the football and player #2 as the target objects, or to automatically identifying the football and player #2 as the target objects, the electronic device can select the football and player #2 in video #1 as the target objects to generate video #2, display interface 406 on interface 401, and preview and play video #2 on the interface 406.
  • video #2 can be played in full screen.
  • the electronic device can determine the target object or automatically identify the target object based on the user's selection, and then automatically generate a new video based on the original video.
  • the new video focuses on the target object selected by the user or the automatically identified target object, which can facilitate the user to watch the main content presented in the video and improve the user experience.
  • the video played by the electronic device may be an online video of a video application, or the video may be a local video.
  • the electronic device determines the target object based on the user's selection, or automatically detects the target object, and plays a new video centered on the target object based on the target object.
  • the electronic device plays the video in horizontal mode. In other examples, when the electronic device changes from horizontal mode to vertical mode, the electronic device can also play a new video centered on the target object.
  • the "new video centered on the target object" described in the embodiments of the present application means that the video content of the new video as a whole is centered on the target object; it does not mean that the target object is at the center of every video frame of the new video.
  • the cropping frame can be smoothed between frames, so that the target object in some video frames of the new video may be slightly offset from the center of the video.
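  • One way to picture inter-frame smoothing is an exponential moving average over the cropping frame centers, sketched below. The smoothing method, the `alpha` parameter, and the function name are assumptions chosen for illustration; this section of the application does not specify how the smoothing is performed.

```python
def smooth_centers(centers, alpha=0.5):
    """Smooth a sequence of cropping-frame centers with an exponential moving average.

    centers: list of (x, y) target-object centers, one per video frame.
    alpha:   weight of the current frame; smaller values give a steadier crop.
    """
    smoothed = []
    prev = None
    for cx, cy in centers:
        if prev is None:
            prev = (float(cx), float(cy))
        else:
            prev = (alpha * cx + (1 - alpha) * prev[0],
                    alpha * cy + (1 - alpha) * prev[1])
        smoothed.append(prev)
    return smoothed


if __name__ == "__main__":
    # The target jumps 10 pixels to the right between two frames.
    print(smooth_centers([(0, 0), (10, 0)], alpha=0.5))
```

  • In the second frame of this example the target sits at x = 10 while the smoothed cropping frame is centered at x = 5, which is the kind of slight offset from the center that the text describes.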
  • FIG. 5 shows a set of GUIs provided by an embodiment of the present application.
  • the electronic device plays video #1 on the interface 501 in a horizontal screen.
  • a GUI as shown in (b) of FIG. 5 may be displayed.
  • the electronic device detects that the screen has changed from horizontal to vertical, and can determine the target object in video #1 and play video #2 centered on the target object.
  • the vertical length of the video frame can change, and the blank part can be filled with black borders or masks. Please see below for specific instructions.
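  • Filling the blank part with black borders can be sketched as a letterboxing calculation: scale the cropped frame to fit the vertical display area, then split the remaining space into borders. The function name, the centering of the picture, and the example sizes are assumptions for illustration; the specific procedure is described later in the application.

```python
def letterbox(src_w, src_h, dst_w, dst_h):
    """Fit a src_w x src_h frame inside dst_w x dst_h, preserving aspect ratio.

    Returns (new_w, new_h, left, top, right, bottom), where the last four
    values are the border widths (in pixels) to fill with black or a mask.
    """
    scale = min(dst_w / src_w, dst_h / src_h)
    new_w, new_h = round(src_w * scale), round(src_h * scale)
    pad_x, pad_y = dst_w - new_w, dst_h - new_h
    left, top = pad_x // 2, pad_y // 2
    return new_w, new_h, left, top, pad_x - left, pad_y - top


if __name__ == "__main__":
    # A 1080x1080 crop shown on a 1080x1920 vertical screen.
    print(letterbox(1080, 1080, 1080, 1920))
```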
  • when an electronic device switches from playing a video in horizontal screen mode to playing it in vertical screen mode, it can determine the target object in the original video and then automatically generate a new video based on the original video.
  • the new video focuses on the target object, making it easier for users to watch the main content presented in the video, thereby improving the user experience.
  • the electronic device determines the target object based on the user's selection, or automatically detects the target object, and plays a new video centered on the target object based on the target object.
  • the electronic device can crop the original video according to default parameters, but is not limited to this. In other examples of the present application, the electronic device can generate a new video centered on the target object based on user configuration.
  • FIG. 6 shows another set of GUIs provided in an embodiment of the present application.
  • the electronic device displays a window 601, and the electronic device can display a video to be cropped in the window 601, wherein the video to be cropped can be uploaded by a user, or can also be an online video.
  • the electronic device can generate a new video centered on the target object in response to the user's configuration operation.
  • the user may perform one or more of the following configuration operations:
  • Video generation type: in the embodiments of the present application, there are two video generation types:
  • One is to generate a video highlighting the target object based on the original video, that is, as shown in the examples of Figures 3 and 4, when the electronic device plays video #1, it can generate and play video #2 based on video #1, and video #2 highlights the target object in video #1.
  • the electronic device in response to the user selecting a video generation type that highlights the target object, can generate video #2 based on video #1 and play video #2 in window 601.
  • The other is to generate a vertical video that highlights the target object based on a horizontal video; that is, as shown in the example of Figure 5, the electronic device plays video #1 in horizontal mode, and when it detects that the screen has changed from horizontal to vertical mode, it can generate and play video #2 based on video #1.
  • Video #2 is suitable for vertical playback of the electronic device and highlights the target object in video #1.
  • in response to the user selecting the video generation type for converting a horizontal video to a vertical video, the electronic device can generate video #2 based on video #1 and play video #2 in window 602.
  • the electronic device can still play video #1 in window 601.
  • Target object: the electronic device can identify the objects in the video so that the user can select one of them as the target object, or the electronic device can identify the target object in the video according to preset rules.
  • the target object can be a person, animal, object, action, etc. in the video.
  • the electronic device can determine the football as the target object according to the user's selection, or determine the football as the target object according to a preset rule.
  • Cropping frame size limit: when an electronic device generates a new video based on the original video, it needs to crop the original video.
  • the cropping frame in the embodiment of the present application is determined based on the target object.
  • the size of the cropping frame for different video frames may be different.
  • the user can define the upper limit and/or lower limit of the size of the cropping frame, so that when the electronic device crops the original video, the size of the cropping frame will not be less than the lower limit defined by the user, and will not be greater than the upper limit defined by the user.
  • the electronic device determines that the video generation type is to highlight the target object, and the user can set the upper limit and/or lower limit of the cropping frame size, where l1 is the upper limit value of the vertical length of the cropping frame, w1 is the upper limit value of the horizontal length of the cropping frame, l2 is the lower limit value of the vertical length of the cropping frame, and w2 is the lower limit value of the horizontal length of the cropping frame.
  • the electronic device determines that the video generation type is horizontal video to vertical video, and the user can set the upper limit and/or lower limit of the cropping box size, where w1 is the upper limit of the horizontal length of the cropping box, and w2 is the lower limit of the horizontal length of the cropping box.
  • when the electronic device determines that the video generation type is horizontal screen video to vertical screen video, the user can set only the upper limit value and/or lower limit value of the horizontal length of the cropping frame; the vertical length of the cropping frame can be equal to the vertical length of the video frame of the original video.
  • the longitudinal length in the embodiment of the present application may also be referred to as the height.
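  • The size limits above amount to clamping the cropping frame between the user-defined bounds. The sketch below uses the symbols from the text (w1 and w2 for the upper and lower limits of the horizontal length, l1 and l2 for the upper and lower limits of the vertical length); the function names and the numeric values in the usage example are hypothetical.

```python
def clamp(value, lower=None, upper=None):
    """Clamp value to [lower, upper]; a bound of None means 'not set by the user'."""
    if lower is not None:
        value = max(value, lower)
    if upper is not None:
        value = min(value, upper)
    return value


def limit_crop_size(width, height, *, w1=None, w2=None, l1=None, l2=None,
                    horizontal_to_vertical=False, frame_height=None):
    """Apply the user-defined limits to a proposed cropping frame size.

    In the horizontal-to-vertical type only the horizontal length is limited,
    and the vertical length equals the height of the original video frame.
    """
    width = clamp(width, lower=w2, upper=w1)
    if horizontal_to_vertical:
        if frame_height is None:
            raise ValueError("frame_height is required for horizontal-to-vertical")
        height = frame_height
    else:
        height = clamp(height, lower=l2, upper=l1)
    return width, height


if __name__ == "__main__":
    print(limit_crop_size(2000, 100, w1=1600, w2=400, l1=900, l2=200))
    print(limit_crop_size(300, 500, w1=1600, w2=400,
                          horizontal_to_vertical=True, frame_height=1080))
```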
  • Target object inter-frame speed threshold: in the embodiments of the present application, the size of the cropping frame can be determined according to the target object inter-frame speed threshold defined by the user. Please see below for details.
  • the target object inter-frame speed threshold may be a single speed value, for example, the target object inter-frame speed threshold is 70 pixels/s.
  • the target object inter-frame speed threshold may be a speed range, for example, the target object inter-frame speed threshold is 70 pixels/s-90 pixels/s.
  • a frame extraction interval can be defined so that the electronic device can extract video frames from the original video according to the frame extraction interval and crop the extracted video frames to generate a new video.
  • Video size: in the embodiment of the present application, a video size can be defined, and the video size is the size of the generated video #2.
  • the electronic device determines that the video generation type is to highlight the target object, and the generated video is a horizontal video.
  • the user can set the video size, and the video size is the horizontal video size, where l3 is the horizontal length of the horizontal video and w3 is the vertical length of the horizontal video.
  • the electronic device determines that the video generation type is horizontal video to vertical video, and the generated video is a vertical video.
  • the user can set the video size, and the video size is the vertical video size, where l4 is the horizontal length of the vertical video, and w4 is the vertical length of the vertical video.
  • the electronic device can generate a new video based on the user's configuration and the original video.
  • the new video focuses on the target object, which makes it easier for the user to watch the main content presented in the video, thereby improving the user experience.
  • the interfaces described above in the GUIs shown in FIGS. 3 to 6 may also be understood as windows.
  • for example, interface 301 may also be referred to as window 301, and interface 305 may also be referred to as window 305.
  • the electronic device can determine the target object in video #1, and crop the video frame of video #1 according to the determined target object, and then generate video #2 according to the cropped video frame.
  • the size of the cropping frame of each video frame can be determined according to the target object.
  • the size of the cropping frame of each video frame may be different.
  • the field of view of the cropped video frame may also be different.
  • the field of view of the cropped video frame can be understood as a ratio used to characterize the size of the cropped video frame and the size of the original video frame.
  • the size of video frame #1 and video frame #2 is a, wherein the size of the cropping box of video frame #1 is b, and the size of the cropping box of video frame #2 is c, b>c, video frame #1 is cropped to obtain video frame #3, and video frame #2 is cropped to obtain video frame #4. Since b>c and the sizes of video frame #1 and video frame #2 are both a, the ratio of video frame #3 to video frame #1 is greater than the ratio of video frame #4 to video frame #2, that is, the field of view of video frame #3 is greater than the field of view of video frame #4.
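  • as a rough illustration of the ratio described above, the following sketch computes the field of view as the cropping-frame area divided by the original frame area. Using area as the measure of "size" and the function name `field_of_view` are assumptions; the text does not fix a particular measure.

```python
def field_of_view(crop_w: float, crop_h: float, frame_w: float, frame_h: float) -> float:
    """Ratio of cropped-frame size to original-frame size (area is an assumed measure)."""
    if frame_w <= 0 or frame_h <= 0:
        raise ValueError("frame dimensions must be positive")
    return (crop_w * crop_h) / (frame_w * frame_h)


# Two video frames of the same size a, with cropping frames of size b > c.
fov_3 = field_of_view(1280, 720, 1920, 1080)  # larger cropping frame (size b)
fov_4 = field_of_view(960, 540, 1920, 1080)   # smaller cropping frame (size c)
```

  • with these illustrative numbers, `fov_3` is larger than `fov_4`, matching the statement that the frame cropped with the larger cropping frame has the larger field of view.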
  • FIG. 7 shows a schematic diagram of cropping a video frame provided in an embodiment of the present application.
  • the electronic device determines cropping frame #1 in video frame #1, determines cropping frame #2 in video frame #2, and determines cropping frame #3 in video frame #3.
  • the size of cropping frame #1 is smaller than the size of cropping frame #2, and the size of cropping frame #2 is smaller than the size of cropping frame #3, and the aspect ratio of the above cropping frame and the video frame can be equal.
  • Video frame #4 can be obtained by cropping video frame #1
  • video frame #5 can be obtained by cropping video frame #2
  • video frame #6 can be obtained by cropping video frame #3. Since the size of crop frame #1 is smaller than that of crop frame #2, and the size of crop frame #2 is smaller than that of crop frame #3, the size of video frame #4 is smaller than that of video frame #5, and the size of video frame #5 is smaller than that of video frame #6, that is, the field of view of video frame #4 is smaller than that of video frame #5, and the field of view of video frame #5 is smaller than that of video frame #6.
  • the size of video frame #4 can be expanded to the same size as video frame #1 to obtain video frame #7
  • the size of video frame #5 can be expanded to the same size as video frame #2 to obtain video frame #8
  • the size of video frame #6 can be expanded to the same size as video frame #3 to obtain video frame #9.
  • although video frame #4, video frame #5 and video frame #6 are enlarged to the same size to obtain video frame #7, video frame #8 and video frame #9 respectively, it can still be considered that the field of view of video frame #7 is smaller than that of video frame #8, and the field of view of video frame #8 is smaller than that of video frame #9.
  • FIG. 8 shows a schematic diagram of cropping a video frame provided in an embodiment of the present application.
  • the sizes of video frame #1, video frame #2 and video frame #3 are the same, video frame #1 is before video frame #2, and video frame #2 is before video frame #3.
  • the electronic device determines cropping frame #1 in video frame #1, determines cropping frame #2 in video frame #2, and determines cropping frame #3 in video frame #3.
  • the size of cropping frame #1 is larger than the size of cropping frame #2, and the size of cropping frame #2 is larger than the size of cropping frame #3.
  • the aspect ratios of the above cropping frames and the video frames may be equal.
  • Video frame #4 can be obtained by cropping video frame #1
  • video frame #5 can be obtained by cropping video frame #2
  • video frame #6 can be obtained by cropping video frame #3. Since the size of crop frame #1 is larger than that of crop frame #2, and the size of crop frame #2 is larger than that of crop frame #3, the size of video frame #4 is larger than that of video frame #5, and the size of video frame #5 is larger than that of video frame #6, that is, the field of view of video frame #4 is larger than that of video frame #5, and the field of view of video frame #5 is larger than that of video frame #6.
  • the size of video frame #4 can be expanded to the same size as video frame #1 to obtain video frame #7
  • the size of video frame #5 can be expanded to the same size as video frame #2 to obtain video frame #8
  • the size of video frame #6 can be expanded to the same size as video frame #3 to obtain video frame #9.
  • although video frame #4, video frame #5 and video frame #6 are enlarged to the same size to obtain video frame #7, video frame #8 and video frame #9 respectively, it can still be considered that the field of view of video frame #7 is larger than that of video frame #8, and the field of view of video frame #8 is larger than that of video frame #9.
  • the cropping box may gradually increase or decrease with the video frame, but the embodiments of the present application are not limited to this. In other examples, the cropping box may gradually increase and then decrease with the video frame, or gradually decrease and then increase with the video frame.
  • video #1 is a horizontal video
  • video #2 obtained after cropping is still a horizontal video
  • the embodiments of the present application are not limited to this.
  • video #1 is a horizontal video
  • video #2 obtained after cropping can be a vertical video.
  • FIG. 9 shows a schematic diagram of cropping a video frame provided in an embodiment of the present application.
  • the sizes of video frame #1, video frame #2 and video frame #3 are the same, video frame #1 is before video frame #2, and video frame #2 is before video frame #3.
  • the electronic device determines cropping frame #1 in video frame #1, determines cropping frame #2 in video frame #2, and determines cropping frame #3 in video frame #3.
  • the longitudinal lengths of the above cropping frames are the same, the lateral length of cropping frame #1 is greater than the lateral length of cropping frame #2, and the lateral length of cropping frame #2 is greater than the lateral length of cropping frame #3. Therefore, the size of cropping frame #1 is greater than the size of cropping frame #2, and the size of cropping frame #2 is greater than the size of cropping frame #3.
  • Video frame #4 can be obtained by cropping video frame #1
  • video frame #5 can be obtained by cropping video frame #2
  • video frame #6 can be obtained by cropping video frame #3. Since the size of crop frame #1 is larger than that of crop frame #2, and the size of crop frame #2 is larger than that of crop frame #3, the size of video frame #4 is larger than that of video frame #5, and the size of video frame #5 is larger than that of video frame #6, that is, the field of view of video frame #4 is larger than that of video frame #5, and the field of view of video frame #5 is larger than that of video frame #6.
  • the horizontal lengths of video frames #4, #5 and #6 can be adjusted to the horizontal length of the vertical video, which can be user-defined or determined according to the size of the screen of the electronic device.
  • FIG. 10 shows a schematic flow chart of a video processing method provided in an embodiment of the present application. As shown in FIG. 10, the method includes:
  • the electronic device may obtain a first video when playing a video.
  • the first video may be an online video or a local video.
  • the first video is a video played in horizontal screen.
  • the user can upload the first video to edit the first video so that the electronic device can obtain the first video.
  • the first video includes N video frames, the sizes of the N video frames may be the same, N>1 and is an integer.
  • the N video frames include M objects.
  • the M objects may be people, animals, objects, actions, etc.
  • the N video frames including the M objects can be understood as the N video frames including the M people, animals, or objects.
  • the N video frames including the M objects can be understood as the content presented by the video composed of the N video frames is the M actions.
  • the electronic device determines a first video parameter in response to a user determining an operation of generating a video
  • the first video parameter may be preset.
  • the first video parameter includes a target object and one or more of the following: a video generation type, a cropping frame size limit, a target object inter-frame speed threshold, a frame extraction interval, a video size, and a video time.
  • the electronic device may determine a first video parameter, which is used to generate video #2.
  • the electronic device detects that the first video includes a preset target object, and determines a first video parameter, which may be preset.
  • the electronic device detects that video #1 includes a shooting action, and determines a first video parameter, which is used to generate video #2.
  • the electronic device detects an operation of a user configuring a video parameter and determines a first video parameter.
  • the user may set video parameters in interface 601, wherein the video generation type configured by the user is to highlight the target object, so that the electronic device may determine the first video parameter in response to the user's operation of configuring the video parameter.
  • the electronic device may track the target object in each video frame of the first video.
  • after the electronic device determines the first video parameter, it can send the first video parameter and the first video to a server, and the server tracks the target object in each video frame of the first video.
  • before S1003, tracking the target object, the method further includes:
  • L video frames of a first video are determined.
  • L video frames of the first video can be determined from the N video frames of the first video, where the L video frames of the first video include the target object determined by the electronic device.
  • the electronic device or the server may determine the L video frames of the first video from the N video frames of the first video by the following two possible implementations:
  • the electronic device determines L video frames of the first video from N video frames of the first video according to a frame extraction interval, where N>L.
  • the frame extraction interval may be user-configured, or may be system-preset or automatically configured.
  • the system can determine the frame extraction interval according to the inter-frame speed of the target object. For example, if the inter-frame speed is 50 pixels/s, the frame extraction interval is 3; if the inter-frame speed is 70 pixels/s, the frame extraction interval is 2, that is, one video frame is extracted from every two video frames; if the inter-frame speed is 90 pixels/s, the frame extraction interval is 1. In other words, the higher the inter-frame speed, the smaller the frame extraction interval.
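  • a minimal sketch of frame extraction at a fixed interval follows. It assumes that an interval of k means keeping one video frame out of every k; `extract_frames` is a hypothetical helper, and how the interval itself is chosen is left out.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def extract_frames(frames: Sequence[T], interval: int) -> List[T]:
    """Keep one frame out of every `interval` frames, starting with the first."""
    if interval < 1:
        raise ValueError("interval must be at least 1")
    return list(frames[::interval])


frames = list(range(10))              # stand-in for N = 10 decoded video frames
selected = extract_frames(frames, 2)  # the L extracted video frames
```

  • with an interval of 2, the sketch keeps frames 0, 2, 4, 6 and 8; with an interval of 1, every frame is kept.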
  • after the electronic device determines the target object of each video frame, it can determine a cropping frame of each video frame.
  • the server may determine a cropping frame for each video frame.
  • the electronic device or server may determine the cropping frame of the video frame in the following possible implementations:
  • the electronic device or server determines the inter-frame speed of the target object, and determines the cropping frame of each video frame according to the inter-frame speed and the target object.
  • the inter-frame speed of the target object can be understood as the ratio of the displacement of the target object between two adjacent video frames to the time between them. For example, as shown in FIG. 11, the center coordinates of the target object in video frame #1 are (x1, y1), and the center coordinates of the target object in video frame #2 are (x2, y2). Video frame #1 and video frame #2 are adjacent video frames, and the time interval between them is t1.
  • the inter-frame speed of the target object between video frame #1 and video frame #2 can then be calculated using formula (1).
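  • formula (1) is not reproduced in this text. As a sketch of the definition above (displacement between adjacent video frames divided by time), the following assumes that the displacement is the Euclidean distance between the two center coordinates; that choice and the name `inter_frame_speed` are assumptions.

```python
import math


def inter_frame_speed(c1, c2, dt):
    """Speed in pixels/s of a target whose center moves from c1 to c2 in dt seconds."""
    if dt <= 0:
        raise ValueError("dt must be positive")
    (x1, y1), (x2, y2) = c1, c2
    # Assumed displacement: straight-line distance between the two centers.
    return math.hypot(x2 - x1, y2 - y1) / dt


# Center moves by (3, 4) pixels between adjacent frames that are 0.05 s apart.
speed = inter_frame_speed((100, 200), (103, 204), 0.05)
```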
  • to determine the cropping frame, the electronic device needs to determine its position and size: the electronic device can determine the position of the cropping frame according to the position of the target object in the video frame, and can determine the size of the cropping frame according to the inter-frame speed of the target object.
  • the electronic device or server determines 3 video frames, and the order of the 3 video frames is video frame #1, video frame #2 and video frame #3.
  • the electronic device or server can identify the target object in the above 3 video frames and determine the inter-frame speed #1 of the target object between video frame #1 and video frame #2, and the inter-frame speed #2 of the target object between video frame #2 and video frame #3.
  • the electronic device or server can first determine the cropping frame #1 of video frame #1.
  • the electronic device or server can determine the position of cropping frame #1 according to the position of the target object in video frame #1, and can expand outward a certain distance to determine the size of cropping frame #1 while ensuring that the target object in video frame #1 is intact.
  • the distance of the outward expansion can be preset by the system, or can be set by the user.
  • the electronic device or server can determine cropping frame #2.
  • the electronic device or server may determine the position of cropping frame #2 according to the position of the target object in video frame #2, and determine the size of cropping frame #2 according to the size of cropping frame #1 and inter-frame speed #1.
  • the electronic device or server may make the size of cropping frame #2 smaller than the size of cropping frame #1.
  • the electronic device or server may determine cropping frame #3.
  • the electronic device or server may determine the position of cropping frame #3 according to the position of the target object in video frame #3, and determine the size of cropping frame #3 according to the size of cropping frame #2 and inter-frame speed #2.
  • the electronic device or server may make the size of cropping frame #3 smaller than the size of cropping frame #2, so that the electronic device or server determines the sizes of three cropping frames, wherein the size of cropping frame #1 is larger than the size of cropping frame #2, and the size of cropping frame #2 is larger than the size of cropping frame #3.
  • the threshold of the inter-frame speed may be a system threshold or a user-configured threshold, that is, in the GUI shown in FIG. 6, the user may configure the threshold of the inter-frame speed in interface 601.
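  • one way to read the steps above, sketched under assumptions: the cropping frame shrinks while the inter-frame speed of the target object stays below the threshold and grows when the speed exceeds it. The scaling factor `step`, the treatment of a threshold range and the function name are illustrative choices, not taken from the application.

```python
def next_crop_size(prev_size, speed, low, high=None, step=0.9):
    """Return (width, height) of the next cropping frame, keeping the aspect ratio.

    speed < low   -> shrink by `step`
    speed > high  -> enlarge by 1 / `step`
    otherwise     -> keep the previous size
    A single-value threshold is modelled as low == high.
    """
    high = low if high is None else high
    w, h = prev_size
    if speed < low:
        scale = step
    elif speed > high:
        scale = 1.0 / step
    else:
        scale = 1.0
    return (w * scale, h * scale)


size_1 = (1600.0, 900.0)                               # cropping frame #1
size_2 = next_crop_size(size_1, speed=40.0, low=70.0)  # slow target: zoom in
size_3 = next_crop_size(size_2, speed=30.0, low=70.0)  # still slow: zoom in again
```

  • in this run the three sizes decrease from cropping frame #1 to cropping frame #3, as in the three-frame example above.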
  • in a possible implementation, the electronic device or the server determines a cropping frame of a video frame according to a video understanding algorithm.
  • the electronic device or server can identify the high-level semantics of the first video frame based on the video understanding algorithm, and the electronic device or server can determine the position and size of the cropping frame of each video frame while ensuring that the high-level semantics remain unchanged.
  • since the size of the target object in each video frame of the first video may change, the sizes of the cropping frames determined by the electronic device or the server for different video frames may differ, in order to ensure that the high-level semantics remain unchanged.
  • the embodiments of the present application do not limit the video understanding algorithm.
  • the video understanding algorithm can be an improved dense trajectories (iDT) algorithm, a slow feature analysis algorithm, etc.
  • the cropping frame size limit may include an upper limit and/or a lower limit, and the electronic device or server needs to make the size of the cropping frame larger than the lower limit and/or smaller than the upper limit when determining the size of the cropping frame.
  • the cropping frame size limit may be preset by the system, or may be configured by the user, that is, in the GUI shown in FIG. 6, the user may configure the cropping frame size limit in interface 601.
  • the cropping frame size limit may be a cropping frame area limit, and the area of the cropping frame must be greater than a lower limit and/or less than an upper limit.
  • the cropping frame size limit may be a horizontal length limit and a vertical length limit of the cropping frame, and the horizontal length and the vertical length of the cropping frame must be greater than a lower limit and/or less than an upper limit.
  • the cropping frame size limit may be a cropping frame perimeter limit, and the perimeter of the cropping frame must be greater than a lower limit and/or less than an upper limit.
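  • as an illustration of applying such a limit, the sketch below assumes that the limit is expressed on the area and rescales the cropping frame uniformly so that the area falls within the limits. `clamp_crop_area` and the uniform rescaling strategy are assumptions; length or perimeter limits could be handled analogously.

```python
import math


def clamp_crop_area(w, h, min_area=None, max_area=None):
    """Uniformly rescale (w, h) so that its area lies within [min_area, max_area]."""
    area = w * h
    target = area
    if min_area is not None and area < min_area:
        target = min_area
    if max_area is not None and area > max_area:
        target = max_area
    scale = math.sqrt(target / area)  # same factor on both sides keeps the aspect ratio
    return (w * scale, h * scale)


# A 1600 x 900 cropping frame exceeds an upper area limit of 1280 * 720 pixels.
w, h = clamp_crop_area(1600, 900, max_area=1280 * 720)
```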
  • the aspect ratio of the cropping frame determined by the electronic device or the server is the same as the aspect ratio of the video frame of the first video.
  • for example, if the horizontal length of the video frame is a and its vertical length is b, the aspect ratio of the video frame is a/b; if the horizontal length of the cropping frame is c and its vertical length is d, the aspect ratio of the cropping frame is c/d, and c/d = a/b.
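  • a sketch of how a cropping frame with the video frame's aspect ratio could be fitted around a target object follows. The bounding-box input, the `margin` parameter (standing in for the outward expansion described for cropping frame #1) and the function name are assumptions.

```python
def fit_crop_to_aspect(bbox, frame_w, frame_h, margin=0.0):
    """Box with aspect ratio frame_w / frame_h that covers bbox plus margin.

    bbox is (left, top, right, bottom) in pixels; returns (left, top, width, height).
    The box is centered on bbox, capped at the frame size and shifted to stay inside.
    """
    left, top, right, bottom = bbox
    need_w = (right - left) + 2 * margin
    need_h = (bottom - top) + 2 * margin
    aspect = frame_w / frame_h
    # Grow whichever side is too short so that c / d == a / b.
    crop_w = max(need_w, need_h * aspect)
    crop_h = crop_w / aspect
    if crop_w > frame_w:  # never exceed the video frame itself
        crop_w, crop_h = float(frame_w), float(frame_h)
    cx, cy = (left + right) / 2, (top + bottom) / 2
    x = min(max(cx - crop_w / 2, 0), frame_w - crop_w)
    y = min(max(cy - crop_h / 2, 0), frame_h - crop_h)
    return (x, y, crop_w, crop_h)


box = fit_crop_to_aspect((800, 400, 1000, 700), 1920, 1080, margin=30)
```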
  • the vertical length of the cropping frame determined by the electronic device or the server is the same as the vertical length of the video frame of the first video.
  • video frame #1 may be the first video frame of the first video determined by the electronic device or server, and video frame #2 is the video frame after video frame #1.
  • the electronic device or server may determine cropping frame #1 based on the target object in video frame #1. After the electronic device determines cropping frame #1, it may determine the position of cropping frame #2 based on the position of the target object in video frame #2, and may determine that the size of cropping frame #2 is smaller than the size of cropping frame #1 based on the inter-frame speed or the video understanding algorithm.
  • the video frame #1, video frame #2, cropping frame #1, and cropping frame #2 shown in (a) of FIG. 13 have the same aspect ratio.
  • video frame #1 may be the first video frame of the first video determined by the electronic device or the server, and video frame #2 is the video frame after video frame #1.
  • the electronic device or the server may determine cropping frame #1 based on the target object in video frame #1.
  • after the electronic device determines cropping frame #1, it may determine the position of cropping frame #2 based on the position of the target object in video frame #2, and may determine that the size of cropping frame #2 is larger than the size of cropping frame #1 based on the inter-frame speed or the video understanding algorithm.
  • the video frame #1, video frame #2, cropping frame #1, and cropping frame #2 shown in (b) of FIG. 13 have the same aspect ratio.
  • video frame #1 may be the first video frame of the first video determined by the electronic device or server
  • video frame #2 is the video frame after video frame #1.
  • the electronic device or server may determine cropping frame #1 based on target object #1 and target object #2 in video frame #1, and determine cropping frame #2 based on target object #1 and target object #2 in video frame #2.
  • the distance between target object #1 and target object #2 in video frame #2 increases, and the size of cropping frame #2 is larger than the size of cropping frame #1.
  • the video frame #1, video frame #2, cropping frame #1, and cropping frame #2 shown in (c) of FIG. 13 have the same aspect ratio.
  • video frame #1 may be the first video frame of the first video determined by the electronic device or server
  • video frame #2 is the video frame after video frame #1.
  • the electronic device or server may determine cropping frame #1 based on target object #1 and target object #2 in video frame #1, and determine cropping frame #2 based on target object #1 and target object #2 in video frame #2.
  • the distance between target object #1 and target object #2 in video frame #2 is reduced, and the size of cropping frame #2 is smaller than the size of cropping frame #1.
  • the video frame #1, video frame #2, cropping frame #1, and cropping frame #2 shown in (d) of FIG. 13 have the same aspect ratio.
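  • when the cropping frame is determined from two target objects, as in (c) and (d) of FIG. 13, one simple reading is that it encloses both objects. The sketch below takes the union of the objects' bounding boxes; representing the objects as boxes and the names `union_box` and `box_area` are assumptions.

```python
def union_box(boxes):
    """Smallest (left, top, right, bottom) box enclosing every box in `boxes`."""
    if not boxes:
        raise ValueError("at least one box is required")
    lefts, tops, rights, bottoms = zip(*boxes)
    return (min(lefts), min(tops), max(rights), max(bottoms))


def box_area(box):
    """Area of a (left, top, right, bottom) box."""
    left, top, right, bottom = box
    return (right - left) * (bottom - top)


# Video frame #1: the two objects are close; video frame #2: they have moved apart.
crop_1 = union_box([(100, 100, 200, 300), (250, 120, 330, 280)])
crop_2 = union_box([(100, 100, 200, 300), (600, 120, 680, 280)])
```

  • as the objects move apart, the enclosing box grows, and it shrinks again when they approach each other. The enclosing box would still need to be padded to the aspect ratio of the video frame.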
  • the electronic device may determine a cropping frame for each video frame based on a video understanding algorithm.
  • video frame #1 may be the first video frame of the first video determined by the electronic device or server, and video frame #2 is the video frame after video frame #1.
  • the content of video frame #1 is the player kicking the ball
  • the content of video frame #2 is the football entering the goal.
  • cropping frame #1 includes the player and the football, that is, target object #1 and target object #2.
  • cropping frame #2 may only include the football, that is, only include target object #2.
  • the electronic device or server may determine the priority of each target object, and determine L cropping frames according to the priority of each target object and the inter-frame speed of each target object.
  • video frame #1 may be the first video frame determined by the electronic device or server, and video frame #2 is the video frame after video frame #1.
  • the electronic device or server may determine cropping frame #1 based on target object #1 and target object #2 in video frame #1, wherein the priority of target object #1 is higher than the priority of target object #2. If the inter-frame speeds of target object #1 and target object #2 are both less than a threshold, then, because target object #1 has the higher priority and is to be highlighted in the new video, cropping frame #2 may be obtained by shrinking cropping frame #1 toward target object #1.
  • Video frame #1, video frame #2, cropping frame #1, and cropping frame #2 shown in (f) of FIG. 13 have the same aspect ratio.
  • video frame #1 may be the first video frame determined by the electronic device or server, and video frame #2 is a video frame after video frame #1.
  • the electronic device or server may determine cropping frame #1 based on target object #1 and target object #2 in video frame #1, wherein the priority of target object #1 is higher than the priority of target object #2. If the inter-frame speed of either target object #1 or target object #2 is greater than a threshold, cropping frame #2 may be obtained by enlarging cropping frame #1.
  • the video frame #1, video frame #2, cropping frame #1, and cropping frame #2 shown in (g) of FIG. 13 have the same aspect ratio.
  • S1005 Crop the first video according to the cropping frame.
  • after the electronic device or server determines the cropping frame, it can crop the first video to obtain cropped video frames. Since the aspect ratio of the cropping frame is the same as the aspect ratio of the video frame of the first video, the aspect ratio of the cropped video frame is the same as the aspect ratio of the video frame of the first video.
  • the electronic device or server obtains the cropped video frame and may perform resampling processing to obtain a second video.
  • the sizes of the cropped video frames obtained by the electronic device or server may also be different, but the aspect ratio is the same. After resampling, the size of the video frame of the second video is the same.
  • for example, the electronic device or server obtains video frame #4, video frame #5 and video frame #6 after cropping, and these video frames differ in size; by resampling them, the electronic device or server obtains video frame #7, video frame #8 and video frame #9, which are of the same size.
  • the fields of view of video frame #7, video frame #8, and video frame #9 are different.
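  • to make the resampling step concrete, here is a dependency-free sketch that resizes cropped frames of different sizes to one output size with nearest-neighbour sampling. Frames are modelled as nested lists of pixel values, and `resample_nearest` is a hypothetical name; a real implementation would more likely use an image library and a higher-quality interpolation filter.

```python
def resample_nearest(frame, out_w, out_h):
    """Resize a frame (list of rows) to out_w x out_h using nearest-neighbour sampling."""
    in_h, in_w = len(frame), len(frame[0])
    return [
        [frame[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


# Two cropped frames with the same aspect ratio but different sizes.
frame_4 = [[1, 2], [3, 4]]                                   # 2 x 2
frame_5 = [[r * 4 + c for c in range(4)] for r in range(4)]  # 4 x 4
frame_7 = resample_nearest(frame_4, 4, 4)
frame_8 = resample_nearest(frame_5, 4, 4)
```

  • both outputs are 4 x 4, but `frame_7` shows a smaller region of the original frame at a larger magnification, which is why the fields of view still differ after resampling.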
  • the electronic device can determine the target object in the original video and generate a new video.
  • the new video is centered on the target object, and the size of the cropping frame of the new video is different from that of the original video, thereby bringing about a change in the field of view, which can better highlight the target object in the original video and enhance the user experience.
  • the new video centered on the target object described in the embodiments of the present application means that the overall video content of the new video is centered on the target object, but this does not mean that every video frame of the new video is centered on the target object.
  • the cropping frame can be smoothed between frames, so that the target object in some video frames of the new video may be slightly offset relative to the center position of the video.
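  • the inter-frame smoothing mentioned above is not specified further. As one possible illustration, the sketch below applies an exponential moving average to the cropping-frame centers; the choice of filter, the `alpha` parameter and the function name are all assumptions.

```python
def smooth_centers(centers, alpha=0.5):
    """Exponentially smooth a sequence of (x, y) cropping-frame centers."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = []
    for x, y in centers:
        if not smoothed:
            smoothed.append((float(x), float(y)))  # first frame: no history yet
        else:
            px, py = smoothed[-1]
            smoothed.append((px + alpha * (x - px), py + alpha * (y - py)))
    return smoothed


raw = [(100, 50), (140, 50), (100, 50), (140, 50)]  # jittery target centers
path = smooth_centers(raw, alpha=0.5)
```

  • the smoothed center lags behind the raw target center, which is consistent with the target object being slightly offset from the center in some video frames.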
  • FIG. 14 shows a schematic flow chart of a video processing method provided in an embodiment of the present application. As shown in the figure, the method includes:
  • S1402 Determine a first video parameter.
  • the first video is a horizontal screen video.
  • the electronic device detects that the screen changes from horizontal to vertical and determines a first video parameter, which may be preset.
  • the electronic device detects that the screen has changed from horizontal to vertical, and determines a first video parameter, which is used to generate video #2.
  • the electronic device detects an operation of a user configuring a video parameter and determines a first video parameter.
  • the user may set video parameters in interface 601, wherein the video generation type configured by the user is horizontal video to vertical video, and thus the electronic device may determine the first video parameter in response to the user's operation of configuring the video parameter.
  • before S1403, tracking the target object, the method further includes:
  • L video frames of a first video are determined.
  • the electronic device or server determines the position and size of the cropping frame.
  • the method by which the electronic device or server determines the position and size of the cropping frame is similar to that described above and will not be described in detail here.
  • the difference from the method shown in FIG. 10 is that the vertical lengths of the cropping frames determined in this method are the same, but the horizontal lengths are different. In other words, the aspect ratios of the cropping frames are different.
  • the horizontal length of the video frame is a, and the vertical length is b; the horizontal length of the cropping frame is c, and the vertical length is b.
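  • for the horizontal-to-vertical case, the cropping frame keeps the full frame height b and only its horizontal length c varies. The sketch below places such a cropping frame around the horizontal center of the target object and keeps it inside the video frame; `vertical_crop` and its parameters are assumed names.

```python
def vertical_crop(target_cx, crop_w, frame_w, frame_h):
    """Full-height cropping frame of width crop_w, as (left, top, width, height)."""
    crop_w = min(crop_w, frame_w)
    left = target_cx - crop_w / 2
    left = min(max(left, 0), frame_w - crop_w)  # keep the cropping frame inside the video frame
    return (left, 0, crop_w, frame_h)


centered = vertical_crop(target_cx=960, crop_w=608, frame_w=1920, frame_h=1080)
at_edge = vertical_crop(target_cx=100, crop_w=608, frame_w=1920, frame_h=1080)
```

  • when the target object is near the left edge, the cropping frame is pinned to the edge instead of being centered on the target object.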
  • video frame #1 can be the first video frame of the first video determined by the electronic device or the server, and video frame #2 is the video frame after video frame #1.
  • the electronic device or the server can determine cropping frame #1 based on the target object in video frame #1.
  • after the electronic device determines cropping frame #1, if it is determined based on the inter-frame speed or the video understanding algorithm that the size of cropping frame #2 is larger than the size of cropping frame #1, cropping frame #2 can be expanded in the direction in which the target object moves.
  • the video frame #1 and cropping frame #1 shown in (a) of FIG. 15 have the same vertical length but different horizontal lengths, that is, the aspect ratios of video frame #1 and cropping frame #1 are different. Similarly, the aspect ratios of video frame #2 and cropping frame #2 are different.
  • video frame #1 may be the first video frame of the first video determined by the electronic device or the server, and video frame #2 is the video frame after video frame #1.
  • the electronic device or the server may determine cropping frame #1 based on the target object in video frame #1. After the electronic device determines cropping frame #1, if it is determined based on the inter-frame speed or the video understanding algorithm that the size of cropping frame #2 is smaller than the size of cropping frame #1, cropping frame #2 may be shrunk in the direction in which the target object moves.
  • the video frame #1 and cropping frame #1 shown in (b) of FIG. 15 have different aspect ratios, and the aspect ratios of video frame #2 and cropping frame #2 are also different.
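  • the text does not say which edge of the cropping frame moves when it is expanded or shrunk in the direction of movement. The sketch below shows one interpretation, stated as an assumption: when expanding, the leading edge moves with the target object; when shrinking, the trailing edge follows it. `resize_along_motion` is a hypothetical helper working on the horizontal extent only.

```python
def resize_along_motion(left, width, new_width, dx, frame_w):
    """Change a cropping frame's width by moving one edge in the direction of motion.

    dx > 0 means the target moves right, dx < 0 means left. Returns (left, width).
    """
    delta = new_width - width
    if dx >= 0:
        # Expanding: move the right edge right. Shrinking: move the left edge right.
        new_left = left if delta >= 0 else left - delta
    else:
        # Expanding: move the left edge left. Shrinking: move the right edge left.
        new_left = left - delta if delta >= 0 else left
    new_left = min(max(new_left, 0), frame_w - new_width)  # stay inside the video frame
    return (new_left, new_width)


grown = resize_along_motion(left=600, width=600, new_width=700, dx=5, frame_w=1920)
shrunk = resize_along_motion(left=600, width=600, new_width=500, dx=5, frame_w=1920)
```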
  • S1405 Crop the first video according to the cropping frame.
  • the electronic device or the server crops the first video according to the cropping frame to obtain cropped video frames.
  • the cropped video frames have the same vertical length but different horizontal lengths.
  • the cropped video frame can be resampled.
  • the horizontal lengths of the resampled video frames are the same, so that the electronic device generates a second video according to the resampled video frames, and the second video is a vertical video.
  • the electronic device or server obtains video frame #4, video frame #5, and video frame #6 after cropping, and the electronic device or server resamples the above video frames to obtain video frame #7, video frame #8, and video frame #9 with the same horizontal length.
  • the electronic device can convert a horizontal video into a vertical video.
  • the vertical video is centered on the target object, and the size of the cropping frame of the vertical video is different from that of the original video, thereby bringing about a change in the field of view, which can better highlight the target object in the original video and enhance the user experience.
  • FIG. 16 shows a schematic flow chart of a video processing method provided in an embodiment of the present application. As shown in FIG. 16 , the method includes:
  • the electronic device may obtain a first video when playing a video.
  • the first video may be an online video or a local video.
  • the user can upload the first video to edit the first video so that the electronic device can obtain the first video.
  • the first video includes N video frames, where N ≥ 2 and N is an integer.
  • the N video frames include a first video frame and a second video frame.
  • the first video frame and the second video frame include at least one object, and the at least one object includes a first target object, and the first target object can be a person, an animal, an object, an action, etc.
  • the first video frame and the second video frame including the first target object can be understood as the first video frame and the second video frame including the person, the animal, or the object.
  • when the first target object is an action, the first video frame and the second video frame including the first target object can be understood as meaning that the content presented by the video composed of the first video frame and the second video frame is that action.
  • S1602 Responding to a first operation of the user, where the first operation is an operation of selecting a first target object.
  • the electronic device may acquire the second video in response to a first operation of the user, where the first operation is an operation of selecting a first target object.
  • the electronic device can mark an object in video #1 in response to a user clicking on control 402 , and can then obtain and play video #2 in response to a user selecting a target object.
  • the electronic device detects that the user changes the electronic device from a landscape orientation to a portrait orientation, determines the target object in video #1, and then obtains video #2 and plays video #2.
  • the second video acquired by the electronic device includes a third video frame and a fourth video frame, and the third video frame and the fourth video frame include a first target object, wherein the third video frame is obtained by cropping according to the first target object in the first video frame, and the fourth video frame is obtained by cropping according to the first target object in the second video frame, and the size of the third video frame is the same as that of the fourth video frame.
  • the at least one object further includes a second target object, and the third video frame and the fourth video frame do not include the second target object.
  • the first video frame and the second video frame include the first target object and the second target object. Since the user only selects the first target object, the third video frame and the fourth video frame may not include the second target object.
  • the size of the first target object in the first video frame is different from the size of the first target object in the third video frame.
  • the sizes of the first target object in the third video frame and the fourth video frame are different.
  • the size of the first target object in the first video frame is the same as in the second video frame, but the inter-frame speed of the first target object in the first video frame is different from that in the second video frame. In this case, the size of the cropping frame of the first video frame may be different from the size of the cropping frame of the second video frame. The third video frame and the fourth video frame are generated by resampling the cropped first video frame and the cropped second video frame, respectively, and are video frames of the same size, so the sizes of the first target object in the third video frame and the fourth video frame are different.
  • the sizes of characters in video frame #1, video frame #2 and video frame #3 are the same, and the electronic device or server obtains video frame #4, video frame #5 and video frame #6 after cropping.
  • the electronic device or server resamples the above video frames to obtain video frame #7, video frame #8 and video frame #9 of the same size, but the sizes of characters in video frame #7, video frame #8 and video frame #9 are different.
  • the size of the first target object in the first video frame is the same as the size of the first target object in the second video frame
  • the electronic device determines according to the video understanding algorithm that the size of the cropping frame of the first video frame is different from the size of the cropping frame of the second video frame, and the third video frame and the fourth video frame are video frames of the same size, so the sizes of the first target object in the third video frame and the fourth video frame are different.
  • the sizes of the first target object in the third video frame and the first target object in the fourth video frame are the same.
  • the sizes of the first target object in the first video frame and in the second video frame are different, and the inter-frame speed of the first target object in the first video frame is different from that in the second video frame. In this case, the size of the cropping frame of the first video frame is different from the size of the cropping frame of the second video frame, and the third video frame and the fourth video frame are video frames of the same size, so the sizes of the first target object in the third video frame and the fourth video frame may be the same.
  • the size of the first target object in the first video frame is different from the size of the first target object in the second video frame.
  • the electronic device determines based on a video understanding algorithm that the size of the cropping frame of the first video frame is different from the size of the cropping frame of the second video frame, and the third video frame and the fourth video frame are video frames of the same size. The size of the first target object in the third video frame and the fourth video frame may then be the same.
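The size relationships above follow from simple scaling arithmetic, sketched here in Python. The sketch assumes uniform scaling from the cropping-frame width to a common output width; `displayed_size` and the pixel values are illustrative assumptions, not figures from the application.

```python
def displayed_size(object_size_px, crop_width_px, output_width_px):
    """Size of the target after the cropped frame is resampled to the output width."""
    return object_size_px * output_width_px / crop_width_px


# Same object size in both source frames, different cropping frames:
# the target ends up with different sizes in the resampled frames.
a = displayed_size(100, crop_width_px=400, output_width_px=400)  # 100.0
b = displayed_size(100, crop_width_px=800, output_width_px=400)  # 50.0

# Different object sizes and different cropping frames can cancel out:
# the target ends up the same size in both resampled frames.
c = displayed_size(200, crop_width_px=800, output_width_px=400)  # 100.0
```

A wider cropping frame shows more of the scene, so the target occupies a smaller share of the resampled frame; a target that is proportionally larger in the source frame can offset this exactly.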
  • the number of video frames of the second video is the same as the number of video frames of the first video.
  • the number of video frames of the second video is determined according to the number of video frames of the first video and the frame extraction interval, that is, the number of video frames of the second video is less than the number of video frames of the first video.
  • the N video frames include a first target object
  • the second video includes M video frames
  • the M video frames include the first target object
  • M ≤ N and M is an integer
  • the frame extraction interval in the embodiment of the present application can be set by the user.
  • the frame extraction interval in the embodiment of the present application can be system-defined.
  • the frame extraction interval in the embodiment of the present application can be determined according to the inter-frame speed between two adjacent video frames of the first target object in N video frames.
  • the first video and the second video have different durations.
  • the first video has a duration of 10 minutes
  • the second video has a duration of 2 minutes.
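How a frame extraction interval reduces the frame count and the duration can be sketched as follows. This assumes, as an illustration only, that every `interval`-th frame is kept and that the frame rate is unchanged; the application does not state that the 10-minute and 2-minute figures arise this way, and `extract_frame_indices` is a hypothetical helper.

```python
def extract_frame_indices(n_frames, interval):
    """Indices of the frames kept when every `interval`-th frame is extracted."""
    if interval < 1:
        raise ValueError("interval must be at least 1")
    return list(range(0, n_frames, interval))


# N = 10 source frames with an interval of 3 leave M = 4 frames (M <= N).
kept = extract_frame_indices(10, 3)  # [0, 3, 6, 9]

# A 10-minute video at an assumed 30 frames per second has 18000 frames.
# With an interval of 5, 3600 frames remain: 2 minutes at the same frame rate.
m = len(extract_frame_indices(10 * 60 * 30, 5))
duration_minutes = m / 30 / 60
```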
  • the first video is a horizontal video
  • the second video is a vertical video
  • the first video frame and the second video frame are horizontal video frames
  • the third video frame and the fourth video frame are vertical video frames
  • the third video frame and the fourth video frame have different heights (or vertical lengths).
  • video frame # 7 , video frame # 8 , and video frame # 9 are vertical video frames and the vertical lengths of video frame # 7 , video frame # 8 , and video frame # 9 are different.
  • after the electronic device obtains the second video, it can play the second video.
  • the electronic device can determine the first target object in the original video according to the user's operation and play a new video.
  • the new video is centered on the first target object, which can better highlight the first target object in the original video and enhance the user experience.
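Centering a vertical cropping frame on the target can be sketched as below. This is a minimal sketch assuming a fixed 9:16 output aspect ratio and a cropping frame that spans the full frame height; both are assumptions, and `centred_crop` is a hypothetical name. The application itself allows the cropping frame size to vary from frame to frame.

```python
def centred_crop(frame_w, frame_h, target_cx, aspect_w=9, aspect_h=16):
    """Return (left, right) columns of a full-height crop centred on target_cx.

    The crop is clamped so that it never extends past the frame edges.
    """
    crop_w = min(frame_w, frame_h * aspect_w // aspect_h)
    left = target_cx - crop_w // 2
    left = max(0, min(left, frame_w - crop_w))
    return left, left + crop_w


# 1920x1080 landscape frame: the full-height 9:16 crop is 607 columns wide.
print(centred_crop(1920, 1080, 960))   # (657, 1264): target in the middle
print(centred_crop(1920, 1080, 100))   # (0, 607): clamped at the left edge
print(centred_crop(1920, 1080, 1900))  # (1313, 1920): clamped at the right edge
```

Clamping keeps the whole cropping frame inside the source frame, at the cost of the target being off-centre when it is close to an edge.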
  • the method further comprises:
  • S1604 playing the second video, including:
  • the second video is played in full screen
  • the method further includes:
  • after the electronic device obtains the first video, it can play the first video on interface 401.
  • the electronic device can identify the object of the first video and determine the target object in response to the user's selection, and can play video #2 on interface 401.
  • after video #2 is played on interface 401, if the target object is not detected, video #1 continues to be played.
  • the method further comprises:
  • a first interface is displayed, wherein the first interface displays a first window and a second window, wherein the first window displays a first video frame, and the second window displays a third video frame.
  • the electronic device displays a window 601 and a window 602 , wherein the window 601 may display the first video frame, and the window 603 may display the third video frame.
  • the method further comprises:
  • S1604 playing the second video, including:
  • a second interface is displayed, the second interface includes a first window and a second window, wherein the first window plays the first video and the second window plays the second video;
  • the method further includes:
  • a third interface is displayed, the third interface includes the first window, the first video continues to be played in the first window, and the third interface does not include the second window.
  • the first window is the window for playing video #1
  • the second window is the window for playing video #2.
  • the first window can be a full-screen window, then the area of the first window is the same as the area of the interface, and the second window can be a small window, or a floating window, which can be displayed above the first window.
  • the electronic device first displays window 301 in the interface to play video #1.
  • window 301 and window 305 are displayed in the interface, wherein window 301 plays video #1 and window 305 plays video #2.
  • window 301 plays video #1 and window 305 plays video #2.
  • after the electronic device finishes playing video #2 in window 305, if the target object is not detected, only window 301 may be displayed in the interface to play video #1.
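The window switching described above can be sketched as a small decision function. This is an illustrative sketch only: the per-segment detection flags and the names `interface_for` and `interface_sequence` are assumptions, and the window labels simply reuse reference numerals 301 and 305 from the text.

```python
def interface_for(target_detected: bool) -> list[str]:
    """Return the windows shown for the current stretch of video #1."""
    if target_detected:
        # Second interface: video #1 in the first window, video #2 in the second.
        return ["window_301", "window_305"]
    # First or third interface: only the first window, playing video #1.
    return ["window_301"]


def interface_sequence(detections: list[bool]) -> list[list[str]]:
    """Map a sequence of detection results to the interfaces displayed in turn."""
    return [interface_for(detected) for detected in detections]


# Target absent, then present, then absent again.
layouts = interface_sequence([False, True, False])
```

The second window appears only while the target object is detected and disappears again once it is no longer detected, while the first window keeps playing video #1 throughout.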
  • the method before responding to the first user operation at S1602, the method further includes:
  • a third interface is displayed, the third interface including a fifth window including at least one object in the first video.
  • after the electronic device obtains the first video, it can play the first video in interface (or window) 401.
  • the electronic device can identify the object of the first video and display the identified object in interface (or window) 403.
  • the first operation is an operation of selecting the first target object in the fifth window.
  • after the electronic device obtains the first video, it can play the first video in interface (or window) 401.
  • the electronic device can identify the objects in the first video and display the identified objects in interface (or window) 403. The user can select the target object in interface (or window) 403.
  • the at least one object further includes a second target object
  • the method further includes:
  • a third video is acquired, wherein the third video includes a fifth video frame and a sixth video frame, the fifth video frame includes the first target object and/or the second target object, the sixth video frame includes the first target object and/or the second target object, the fifth video frame is obtained by cropping according to the first target object and/or the second target object in the first video frame, and the sixth video frame is obtained by cropping according to the first target object and/or the second target object in the second video frame.
  • the second operation is an operation for generating a highlighted target object. For example, as shown in FIG. 3 , the user clicks on the control 304 .
  • the electronic device may determine the cropped video frame based on a video understanding algorithm, or based on the inter-frame speeds of the first target object and the second target object together with the priorities of the first target object and the second target object.
  • the cropped video frame may include both the first target object and the second target object or only include any one of the first target object and the second target object.
  • the video frame after the video frame #1 is cropped includes the target object #1 and the target object #2
  • the video frame after the video frame #2 is cropped includes the target object #1 but does not include the target object #2.
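One way a cropping frame could end up containing both targets in one frame and only one target in another is sketched below. The rule used here (include every target if they fit within a maximum width, otherwise keep only the highest-priority target) is a hypothetical simplification; the application only says the decision may depend on a video understanding algorithm, or on inter-frame speed and priority. `choose_crop_span` and the tuple layout are assumed names.

```python
def choose_crop_span(targets, max_width):
    """Pick the horizontal span of the cropping frame.

    targets: list of (priority, left, right); a lower priority number
    means a more important target (an assumed convention).
    """
    left = min(t[1] for t in targets)
    right = max(t[2] for t in targets)
    if right - left <= max_width:
        # All targets fit: the cropped frame includes every target.
        return (left, right)
    # Otherwise keep only the highest-priority target.
    _, top_left, top_right = min(targets, key=lambda t: t[0])
    return (top_left, top_right)


# Frame A: targets #1 and #2 are close together, so both are included.
span_a = choose_crop_span([(1, 100, 200), (2, 250, 350)], max_width=300)

# Frame B: the targets are far apart, so only target #1 is included.
span_b = choose_crop_span([(1, 100, 200), (2, 900, 1000)], max_width=300)
```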
  • the above mainly introduces a method of video processing provided by an embodiment of the present application from the perspective of an electronic device and a server. It is understandable that, in order to realize the above functions, the electronic device and the server include hardware structures and/or software modules corresponding to the execution of each function.
  • the present application can be implemented in the form of hardware or a combination of hardware and computer software. Whether a function is executed by hardware or by computer software driving hardware depends on the specific application and design constraints of the technical solution. A person skilled in the art may use different methods to implement the described functions for each specific application, but such implementation should not be considered to be beyond the scope of the present application.
  • the embodiment of the present application can divide the functional modules of the processors in the electronic device and the server according to the above method example.
  • each functional module can be divided corresponding to each function, or two or more functions can be integrated into one processing module.
  • the above integrated modules can be implemented in the form of hardware or in the form of software functional modules. It should be noted that the division of modules in the embodiment of the present application is schematic and is only a logical function division. There may be other division methods in actual implementation.
  • FIG. 17 shows a schematic diagram of the composition of an electronic device provided in an embodiment of the present application.
  • the electronic device 1700 includes: an acquisition module 1710 and a video processing module 1720.
  • the acquisition module 1710 is used to obtain a first video.
  • the video processing module 1720 is used to obtain a second video in response to a user operation.
  • the video processing module 1720 is also used to play the second video.
  • the video processing module 1720 is further configured to obtain a third video in response to a second operation of the user.
  • the second video includes a third video frame and a fourth video frame, and the sizes of the first target object in the third video frame and the first target object in the fourth video frame are different.
  • the first video includes a first video frame and a second video frame
  • the first target object in the first video frame and the first target object in the second video frame are of the same size
  • the inter-frame speed of the first target object in the first video frame and the inter-frame speed of the first target object in the second video frame are different.
  • the second video includes a third video frame and a fourth video frame, and the first target object in the third video frame and the first target object in the fourth video frame have the same size.
  • the first video includes a first video frame and a second video frame
  • the sizes of the first target object in the first video frame and the first target object in the second video frame are different
  • the inter-frame speed of the first target object in the first video frame and the inter-frame speed of the first target object in the second video frame are different.
  • the video processing module 1720 is further used to display a first interface, where the first interface displays a first window and a second window, wherein the first window displays a first video frame, and the second window displays a third video frame.
  • the video processing module 1720 is further configured to:
  • the video processing module 1720 is specifically configured to play the second video in full screen when the first target object is detected;
  • the video processing module 1720 is further configured to continue playing the first video after the second video is played in full screen and when the first target object is not detected.
  • the number of video frames of the second video is determined based on the number of video frames of the first video and the frame extraction interval.
  • the video processing module 1720 is further configured to determine a frame extraction interval according to the inter-frame speed of the first target object.
  • the first video and the second video have different durations.
  • the second video includes a third video frame and a fourth video frame
  • the third video frame and the fourth video frame are vertical screen video frames
  • the vertical length of the third video frame is different from the vertical length of the fourth video frame.
  • FIG. 18 shows a schematic diagram of the composition of a server provided in an embodiment of the present application.
  • the server 1800 includes: a transceiver module 1810 and a video processing module 1820.
  • the transceiver module 1810 is used to obtain a first video.
  • the video processing module 1820 is used to generate a second video in response to a first operation of the user.
  • the transceiver module 1810 is further configured to send the second video to the electronic device.
  • the video processing module 1820 is specifically configured to: determine a cropping frame of the first video in response to a first operation of the user;
  • a second video is generated according to the first video and a cropping frame group of the first video.
  • the video processing module 1820 is further configured to generate a third video in response to a second operation of the user.
  • the second video includes a third video frame and a fourth video frame, and the sizes of the first target object in the third video frame and the first target object in the fourth video frame are different.
  • the first video includes a first video frame and a second video frame
  • the first target object in the first video frame and the first target object in the second video frame are of the same size
  • the inter-frame speed of the first target object in the first video frame and the inter-frame speed of the first target object in the second video frame are different.
  • the second video includes a third video frame and a fourth video frame, and the first target object in the third video frame and the first target object in the fourth video frame have the same size.
  • the first video includes a first video frame and a second video frame
  • the sizes of the first target object in the first video frame and the first target object in the second video frame are different
  • the inter-frame speed of the first target object in the first video frame and the inter-frame speed of the first target object in the second video frame are different.
  • the present application also provides an electronic device, including: one or more processors, a memory, an application program, and one or more computer programs.
  • the above-mentioned components can be connected through one or more communication buses.
  • the one or more computer programs are stored in the above-mentioned memory and configured to be executed by the one or more processors.
  • the one or more computer programs include instructions, which can be used to enable the electronic device to execute each step performed by the electronic device in the above embodiments.
  • the processor may specifically be the processor 110 shown in FIG. 1
  • the memory may specifically be the internal memory 121 shown in FIG. 1 and/or an external memory connected to the electronic device.
  • An embodiment of the present application also provides a chip, which includes a processor and a communication interface, wherein the communication interface is used to receive a signal and transmit the signal to the processor, and the processor processes the signal so that the video processing method described in any possible implementation method in the foregoing text is executed.
  • This embodiment further provides a computer-readable storage medium, in which computer instructions are stored.
  • when the computer instructions are executed on an electronic device, the electronic device executes the above-mentioned related method steps to implement the video processing method in the above-mentioned embodiment.
  • This embodiment further provides a computer program product.
  • when the computer program product is run on a computer, the computer is enabled to execute the above-mentioned related steps to implement the video processing method in the above-mentioned embodiment.
  • the term "when" or "after" may be interpreted to mean "if" or "after" or "in response to determining" or "in response to detecting", depending on the context.
  • the phrase "upon determining" or "if (the stated condition or event) is detected" may be interpreted to mean "if determining" or "in response to determining" or "upon detecting (the stated condition or event)" or "in response to detecting (the stated condition or event)", depending on the context.
  • the disclosed systems, devices and methods can be implemented in other ways.
  • the device embodiments described above are only schematic.
  • the division of the units is only a logical function division. There may be other division methods in actual implementation, such as multiple units or components can be combined or integrated into another system, or some features can be ignored or not executed.
  • In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
  • the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place or distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • if the functions are implemented in the form of software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium.
  • based on this understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product.
  • the computer software product is stored in a storage medium and includes several instructions for a computer device (which can be a personal computer, server, or network device, etc.) to perform all or part of the steps of the methods described in each embodiment of the present application.
  • the aforementioned storage media include: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc, and other media that can store program code.

Abstract

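The present application provides a video processing method and an electronic device. The method includes: acquiring a first video, where the first video includes a first video frame and a second video frame, the first video frame and the second video frame include at least one object, and the at least one object includes a first target object; in response to a first operation in which a user selects the first target object, acquiring a second video, where the second video includes a third video frame and a fourth video frame, the third video frame and the fourth video frame include the first target object, the third video frame is obtained by cropping according to the first target object in the first video frame, and the fourth video frame is obtained by cropping according to the first target object in the second video frame; and playing the second video. In the embodiments of the present application, the first target object in the original video can be determined according to the user's operation and a new video can be played. The new video is centered on the first target object, which better highlights the first target object in the original video and improves the user experience.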

Description

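Video processing method and electronic device
This application claims priority to Chinese Patent Application No. 202211519957.8, filed with the Chinese Patent Office on November 30, 2022 and entitled "Video processing method and electronic device", which is incorporated herein by reference in its entirety.
Technical Field
The present application relates to the field of electronic devices, and more specifically, to a video processing method and an electronic device.
Background
With the rise of mobile Internet technology, watching videos has become a daily activity for users. At present, video applications offer limited functionality when playing videos. For example, a user who wants to watch a target object in a video on its own has to edit the video manually, and when the user switches playback from landscape to portrait, the target object in the video may be lost or shown incompletely, which degrades the user experience.
Summary
The present application provides a video processing method and an electronic device that can, according to a user's operation, play a new video that highlights a target object in the original video, thereby improving the user experience.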
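According to a first aspect, a video processing method is provided. The method includes: acquiring a first video, where the first video includes N video frames, N ≥ 2 and N is an integer, the N video frames include a first video frame and a second video frame, the first video frame and the second video frame include at least one object, and the at least one object includes a first target object; responding to a first operation of a user, where the first operation is an operation of selecting the first target object; acquiring a second video, where the second video includes a third video frame and a fourth video frame, the third video frame and the fourth video frame include the first target object, the third video frame is obtained by cropping according to the first target object in the first video frame, and the fourth video frame is obtained by cropping according to the first target object in the second video frame; and playing the second video.
In the embodiments of the present application, the electronic device can determine the first target object in the original video according to the user's operation and play a new video. The new video is centered on the first target object, which better highlights the first target object in the original video and improves the user experience.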
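With reference to the first aspect, in some implementations of the first aspect, the at least one object further includes a second target object, and the method further includes: in response to a second operation of the user, acquiring a third video, where the third video includes a fifth video frame and a sixth video frame, the fifth video frame includes the first target object and/or the second target object, the sixth video frame includes the first target object and/or the second target object, the fifth video frame is obtained by cropping according to the first target object and/or the second target object in the first video frame, and the sixth video frame is obtained by cropping according to the first target object and/or the second target object in the second video frame.
With reference to the first aspect, in some implementations of the first aspect, the at least one object further includes a second target object, and the third video frame and the fourth video frame do not include the second target object.
With reference to the first aspect, in some implementations of the first aspect, the size of the first target object in the third video frame is different from the size of the first target object in the first video frame.
With reference to the first aspect, in some implementations of the first aspect, the size of the first target object in the third video frame is different from the size of the first target object in the fourth video frame.
With reference to the first aspect, in some implementations of the first aspect, the size of the first target object in the first video frame is the same as the size of the first target object in the second video frame, and the inter-frame speed of the first target object in the first video frame is different from the inter-frame speed of the first target object in the second video frame.
In the embodiments of the present application, the electronic device can determine the target object in the original video and generate a new video. The new video is centered on the target object, and the cropping frames applied to the original video to produce the new video differ in size, which changes the field of view, better highlights the target object in the original video, and improves the user experience.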
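With reference to the first aspect, in some implementations of the first aspect, the size of the first target object in the third video frame is the same as the size of the first target object in the fourth video frame.
With reference to the first aspect, in some implementations of the first aspect, the size of the first target object in the first video frame is different from the size of the first target object in the second video frame, and the inter-frame speed of the first target object in the first video frame is different from the inter-frame speed of the first target object in the second video frame.
In the embodiments of the present application, the electronic device can determine the target object in the original video and generate a new video. The new video is centered on the target object, and the cropping frames applied to the original video to produce the new video differ in size, which changes the field of view, better highlights the target object in the original video, and improves the user experience.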
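With reference to the first aspect, in some implementations of the first aspect, the method further includes: displaying a first interface, where the first interface displays a first window and a second window, the first window displays the first video frame, and the second window displays the third video frame.
With reference to the first aspect, in some implementations of the first aspect, the method further includes: playing the first video. The playing the second video includes: when the first target object is detected, playing the second video in full screen. After the second video is played in full screen and when the first target object is not detected, the method further includes: continuing to play the first video.
With reference to the first aspect, in some implementations of the first aspect, the method further includes: displaying a first interface, where the first interface includes a first window and the first video is played in the first window. The playing the second video includes: when the first target object is detected, displaying a second interface, where the second interface includes the first window and a second window, the first window plays the first video, and the second window plays the second video. After the second interface is displayed, the method further includes: when the first target object is not detected, displaying a third interface, where the third interface includes the first window, the first video continues to be played in the first window, and the third interface does not include the second window.
With reference to the first aspect, in some implementations of the first aspect, before the responding to the first operation of the user, the method further includes: displaying a third interface, where the third interface includes a fifth window, and the fifth window includes the at least one object in the first video.
With reference to the first aspect, in some implementations of the first aspect, the first operation is an operation of selecting the first target object in the fifth window.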
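With reference to the first aspect, in some implementations of the first aspect, the N video frames include the first target object, the second video includes M video frames, the M video frames include the first target object, M ≤ N, and M is an integer.
With reference to the first aspect, in some implementations of the first aspect, the method further includes: determining a frame extraction interval according to the inter-frame speed of the first target object between two adjacent video frames among the N video frames.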
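Deriving a frame extraction interval from the inter-frame speed could look like the sketch below. Two things are assumptions here and not taken from the application: the inter-frame speed is modelled as the displacement of the target's centre between two adjacent frames, and the threshold rule in `extraction_interval` (sample sparsely while the target barely moves) is a hypothetical mapping. The function names and default values are likewise illustrative.

```python
import math


def inter_frame_speed(centre_a, centre_b):
    """Displacement (in pixels per frame) of the target centre between adjacent frames."""
    return math.dist(centre_a, centre_b)


def extraction_interval(speed, slow_threshold=2.0, slow_interval=5):
    """Hypothetical rule: keep every frame when the target moves quickly,
    and only every `slow_interval`-th frame when it barely moves."""
    return slow_interval if speed < slow_threshold else 1


speed = inter_frame_speed((0, 0), (3, 4))  # 5.0 pixels per frame
interval = extraction_interval(speed)      # 1: the target moves quickly
```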
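With reference to the first aspect, in some implementations of the first aspect, the first video is a landscape video, the second video is a portrait video, the first video frame and the second video frame are landscape video frames, and the third video frame and the fourth video frame are portrait video frames.
With reference to the first aspect, in some implementations of the first aspect, the first video and the second video have different durations.
With reference to the first aspect, in some implementations of the first aspect, the third video frame and the fourth video frame have different heights.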
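According to a second aspect, an embodiment of the present application provides an electronic device. The electronic device includes modules/units that execute the method of the foregoing aspect or any possible design of the foregoing aspect. These modules/units may be implemented by hardware, or by hardware executing corresponding software.
According to a third aspect, an embodiment of the present application provides a chip. The chip is coupled to a memory in an electronic device and is configured to call a computer program stored in the memory and execute the technical solution of the foregoing aspect or any possible design of the foregoing aspect. In the embodiments of the present application, "coupled" means that two components are joined to each other directly or indirectly.
According to a fourth aspect, an embodiment of the present application provides a computer-readable storage medium. The computer-readable storage medium includes a computer program, and when the computer program runs on an electronic device, the electronic device is caused to execute the technical solution of the foregoing aspect or any possible design of the foregoing aspect.
According to a fifth aspect, an embodiment of the present application provides a computer program. The computer program includes instructions, and when the instructions run on a computer, the computer is caused to execute the technical solution of the foregoing aspect or any possible design of the foregoing aspect.
According to a sixth aspect, an embodiment of the present application provides a graphical user interface on an electronic device. The electronic device has a display screen, one or more memories, and one or more processors, the one or more processors are configured to execute one or more computer programs stored in the one or more memories, and the graphical user interface includes the graphical user interface displayed when the electronic device executes the technical solution of the foregoing aspect or any possible design of the foregoing aspect.
According to a seventh aspect, an embodiment of the present application provides an electronic device. The electronic device includes one or more processors and one or more memories. The one or more memories store one or more computer programs, the one or more computer programs include instructions, and when the instructions are executed by the one or more processors, the foregoing aspect or any possible implementation of the foregoing aspect is executed.
For the beneficial effects of the second aspect to the seventh aspect, refer to the beneficial effects of the first aspect; they are not repeated here.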
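Brief Description of the Drawings
FIG. 1 is a schematic structural diagram of an electronic device provided in an embodiment of the present application.
FIG. 2 is a block diagram of an example software structure of an electronic device provided in an embodiment of the present application.
FIG. 3 is a set of GUIs provided in an embodiment of the present application.
FIG. 4 is a set of GUIs provided in an embodiment of the present application.
FIG. 5 is a set of GUIs provided in an embodiment of the present application.
FIG. 6 is a set of GUIs provided in an embodiment of the present application.
FIG. 7 is a schematic diagram of cropping a video frame provided in an embodiment of the present application.
FIG. 8 is a schematic diagram of cropping a video frame provided in an embodiment of the present application.
FIG. 9 is a schematic diagram of cropping a video frame provided in an embodiment of the present application.
FIG. 10 is a schematic flow chart of a video processing method provided in an embodiment of the present application.
FIG. 11 is a schematic diagram of determining an inter-frame speed provided in an embodiment of the present application.
FIG. 12 is a schematic diagram comparing the aspect ratios of an original video frame and a cropped video frame provided in an embodiment of the present application.
FIG. 13 is a schematic diagram of determining a cropping frame provided in an embodiment of the present application.
FIG. 14 is a schematic flow chart of a video processing method provided in an embodiment of the present application.
FIG. 15 is a schematic diagram of determining a cropping frame provided in an embodiment of the present application.
FIG. 16 is a schematic flow chart of a video processing method provided in an embodiment of the present application.
FIG. 17 is a schematic diagram of the composition of an electronic device provided in an embodiment of the present application.
FIG. 18 is a schematic diagram of the composition of a server provided in an embodiment of the present application.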
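Detailed Description
The terms used in the following embodiments are only for the purpose of describing particular embodiments and are not intended to limit the present application. As used in the specification and the appended claims of the present application, the singular forms "a", "an", "the", "the above", and "this" are intended to also include expressions such as "one or more", unless the context clearly indicates otherwise. It should also be understood that, in the following embodiments of the present application, "at least one" and "one or more" mean one, two, or more than two. The term "and/or" describes an association between associated objects and indicates that three relationships may exist. For example, "A and/or B" may indicate that A exists alone, that both A and B exist, or that B exists alone, where A and B may be singular or plural. The character "/" generally indicates an "or" relationship between the associated objects.
Reference in this specification to "one embodiment", "some embodiments", or the like means that a particular feature, structure, or characteristic described in connection with that embodiment is included in one or more embodiments of the present application. Therefore, the phrases "in one embodiment", "in some embodiments", "in some other embodiments", "in still other embodiments", and the like appearing in different places in this specification do not necessarily all refer to the same embodiment, but mean "one or more but not all embodiments", unless specifically emphasized otherwise. The terms "include", "comprise", "have", and their variants all mean "including but not limited to", unless specifically emphasized otherwise.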
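The following describes an electronic device, a user interface for such an electronic device, and embodiments for using such an electronic device. In some embodiments, the electronic device may be a portable electronic device that also includes other functions such as personal digital assistant and/or music player functions, for example a mobile phone, a tablet computer, or a wearable electronic device with wireless communication capability (such as a smart watch). Exemplary embodiments of the portable electronic device include, but are not limited to, portable electronic devices running … or other operating systems. The portable electronic device may also be another portable electronic device, such as a laptop computer. It should further be understood that, in some other embodiments, the electronic device may be a desktop computer rather than a portable electronic device.
For example, FIG. 1 shows a schematic structural diagram of an electronic device 100. The electronic device 100 may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, a headset jack 170D, a sensor module 180, a compass 190, a motor 191, an indicator 192, a camera 193, a display screen 194, a subscriber identification module (SIM) card interface 195, and the like.
It can be understood that the structure illustrated in the embodiments of the present application does not constitute a specific limitation on the electronic device 100. In other embodiments of the present application, the electronic device 100 may include more or fewer components than shown, combine some components, split some components, or arrange the components differently. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.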
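The processor 110 may include one or more processing units. For example, the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU). Different processing units may be independent components or may be integrated into one or more processors. In some embodiments, the electronic device 100 may also include one or more processors 110. The controller may generate operation control signals according to instruction operation codes and timing signals to control instruction fetching and instruction execution. In some other embodiments, a memory may also be provided in the processor 110 to store instructions and data. For example, the memory in the processor 110 may be a cache. The memory may hold instructions or data that the processor 110 has just used or uses repeatedly. If the processor 110 needs the instructions or data again, it can fetch them directly from this memory. This avoids repeated accesses and reduces the waiting time of the processor 110, thereby improving the efficiency with which the electronic device 100 processes data or executes instructions.
In some embodiments, the processor 110 may include one or more interfaces. The interfaces may include an inter-integrated circuit (I2C) interface, an inter-integrated circuit sound (I2S) interface, a pulse code modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input/output (GPIO) interface, a SIM card interface, and/or a USB interface. The USB interface 130 is an interface that complies with the USB standard specification, and may specifically be a Mini USB interface, a Micro USB interface, a USB Type C interface, or the like. The USB interface 130 may be used to connect a charger to charge the electronic device 100, or to transfer data between the electronic device 100 and peripheral devices. The USB interface 130 may also be used to connect a headset and play audio through the headset.
It can be understood that the interface connection relationships between the modules illustrated in the embodiments of the present application are only schematic and do not constitute a structural limitation on the electronic device 100. In other embodiments of the present application, the electronic device 100 may also use an interface connection manner different from those in the foregoing embodiments, or a combination of multiple interface connection manners.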
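The charging management module 140 is configured to receive charging input from a charger. The charger may be a wireless charger or a wired charger. In some wired charging embodiments, the charging management module 140 may receive the charging input of a wired charger through the USB interface 130. In some wireless charging embodiments, the charging management module 140 may receive wireless charging input through a wireless charging coil of the electronic device 100. While charging the battery 142, the charging management module 140 may also supply power to the electronic device through the power management module 141.
The power management module 141 is configured to connect the battery 142, the charging management module 140, and the processor 110. The power management module 141 receives input from the battery 142 and/or the charging management module 140 and supplies power to the processor 110, the internal memory 121, the external memory, the display screen 194, the camera 193, the wireless communication module 160, and the like. The power management module 141 may also be used to monitor parameters such as battery capacity, battery cycle count, and battery health status (leakage, impedance). In some other embodiments, the power management module 141 may also be provided in the processor 110. In still other embodiments, the power management module 141 and the charging management module 140 may be provided in the same device.
The wireless communication function of the electronic device 100 may be implemented by the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, the modem processor, the baseband processor, and the like.
The antenna 1 and the antenna 2 are used to transmit and receive electromagnetic wave signals. Each antenna in the electronic device 100 may be used to cover a single communication band or multiple communication bands. Different antennas may also be multiplexed to improve antenna utilization. For example, the antenna 1 may be multiplexed as a diversity antenna of a wireless local area network. In some other embodiments, the antenna may be used in combination with a tuning switch.
The mobile communication module 150 may provide wireless communication solutions applied to the electronic device 100, including 2G/3G/4G/5G. The mobile communication module 150 may include at least one filter, a switch, a power amplifier, a low noise amplifier (LNA), and the like. The mobile communication module 150 may receive electromagnetic waves through the antenna 1, perform processing such as filtering and amplification on the received electromagnetic waves, and transmit them to the modem processor for demodulation. The mobile communication module 150 may also amplify the signal modulated by the modem processor and convert it into electromagnetic waves that are radiated through the antenna 1. In some embodiments, at least some functional modules of the mobile communication module 150 may be provided in the processor 110. In some embodiments, at least some functional modules of the mobile communication module 150 may be provided in the same device as at least some modules of the processor 110.
The wireless communication module 160 may provide wireless communication solutions applied to the electronic device 100, including wireless local area networks (WLAN) (such as a wireless fidelity (WiFi) network), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), and infrared (IR) technology. The wireless communication module 160 may be one or more devices integrating at least one communication processing module. The wireless communication module 160 receives electromagnetic waves via the antenna 2, performs frequency modulation and filtering on the electromagnetic wave signals, and sends the processed signals to the processor 110. The wireless communication module 160 may also receive signals to be sent from the processor 110, perform frequency modulation and amplification on them, and convert them into electromagnetic waves that are radiated through the antenna 2.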
电子设备100通过GPU,显示屏194,以及应用处理器等实现显示功能。GPU为图像处理的微处理器,连接显示屏194和应用处理器。GPU用于执行数学和几何计算,用于图形渲染。处理器110可包括一个或多个GPU,其执行程序指令以生成或改变显示信息。
显示屏194用于显示图像、视频等。显示屏194包括显示面板。显示面板可以采用液晶显示屏(liquid crystal display,LCD)、有机发光二极管(organic light-emitting diode,OLED)、有源矩阵有机发光二极体或主动矩阵有机发光二极体(active-matrix organic light emitting diode,AMOLED)、柔性发光二极管(flex light-emitting diode,FLED)、Miniled、MicroLed、Micro-oLed、量子点发光二极管(quantum dot light emitting diodes,QLED)等。在一些实施例中,电子设备100可以包括1个或多个显示屏194。
在本申请的一些实施例中,当显示面板采用OLED、AMOLED、FLED等材料时,上述图1中的显示屏194可以被弯折。这里,上述显示屏194可以被弯折是指显示屏可以在任意部位被弯折到任意角度,并可以在该角度保持,例如,显示屏194可以从中部左右对折。也可以从中部上下对折。
电子设备100的显示屏194可以是一种柔性屏,目前,柔性屏以其独特的特性和巨大的潜力而备受关注。柔性屏相对于传统屏幕而言,具有柔韧性强和可弯曲的特点,可以给用户提供基于可弯折特性的新交互方式,可以满足用户对于电子设备的更多需求。对于配置有可折叠显示屏的电子设备而言,电子设备上的可折叠显示屏可以随时在折叠形态下的小屏和展开形态下大屏之间切换。因此,用户在配置有可折叠显示屏的电子设备上使用分屏功能,也越来越频繁。
电子设备100可以通过ISP、摄像头193、视频编解码器、GPU、显示屏194以及应用处理器等实现拍摄功能。
ISP用于处理摄像头193反馈的数据。例如,拍照时,打开快门,光线通过镜头被传递到摄像头感光元件上,光信号转换为电信号,摄像头感光元件将该电信号传递给ISP处理,转化为肉眼可见的图像。ISP还可以对图像的噪点、亮度、肤色进行算法优化。ISP还可以对拍摄场景的曝光、色温等参数优化。在一些实施例中,ISP可以设置在摄像头193中。
摄像头193用于捕获静态图像或视频。物体通过镜头生成光学图像投射到感光元件。感光元件可以是电荷耦合器件(charge coupled device,CCD)或互补金属氧化物半导体(complementary metal-oxide-semiconductor,CMOS)光电晶体管。感光元件把光信号转换成电信号,之后将电信号传递给ISP转换成数字图像信号。ISP将数字图像信号输出到DSP加工处理。DSP将数字图像信号转换成标准的RGB,YUV等格式的图像信号。在一些实施例中,电子设备100可以包括1个或多个摄像头193。
数字信号处理器用于处理数字信号,除了可以处理数字图像信号,还可以处理其他数字信号。例如,当电子设备100在频点选择时,数字信号处理器用于对频点能量进行傅里叶变换等。
视频编解码器用于对数字视频压缩或解压缩。电子设备100可以支持一种或多种视频编解码器。这样,电子设备100可以播放或录制多种编码格式的视频,例如:动态图像专家组(moving picture experts group,MPEG)1、MPEG2、MPEG3、MPEG4等。
NPU为神经网络(neural-network,NN)计算处理器,通过借鉴生物神经网络结构,例如借鉴人脑神经元之间传递模式,对输入信息快速处理,还可以不断的自学习。通过NPU可以实现电子设备100的智能认知等应用,例如:图像识别、人脸识别、语音识别、文本理解等。
外部存储器接口120可以用于连接外部存储卡,例如Micro SD卡,实现扩展电子设备100的存储能力。外部存储卡通过外部存储器接口120与处理器110通信,实现数据存储功能。例如将音乐,视频等文件保存在外部存储卡中。
内部存储器121可以用于存储一个或多个计算机程序,该一个或多个计算机程序包括指令。处理器110可以通过运行存储在内部存储器121的上述指令,从而使得电子设备100执行本申请一些实施例中所提供的方法,以及各种应用以及数据处理等。内部存储器121可以包括存储程序区和存储数据区。其中,存储程序区可存储操作系统;该存储程序区还可以存储一个或多个应用(比如图库、联系人等)等。存储数据区可存储电子设备100使用过程中所创建的数据(比如照片,联系人等)等。此外,内部存储器121可以包括高速随机存取存储器,还可以包括非易失性存储器,例如一个或多个磁盘存储部件,闪存部件,通用闪存存储器(universal flash storage,UFS)等。在一些实施例中,处理器110可以通过运行存储在内部存储器121的指令,和/或存储在设置于处理器110中的存储器的指令,来使得电子设备100执行本申请实施例中所提供的方法,以及其他应用及数据处理。电子设备100可以通过音频模块170、扬声器170A、受话器170B、麦克风170C、耳机接口170D、以及应用处理器等实现音频功能。例如音乐播放、 录音等。
传感器模块180可以包括压力传感器180A、陀螺仪传感器180B、气压传感器180C、磁传感器180D、加速度传感器180E、距离传感器180F、接近光传感器180G、指纹传感器180H、温度传感器180J、触摸传感器180K、环境光传感器180L、骨传导传感器180M等。
其中,压力传感器180A用于感受压力信号,可以将压力信号转换成电信号。在一些实施例中,压力传感器180A可以设置于显示屏194。压力传感器180A的种类很多,如电阻式压力传感器,电感式压力传感器,电容式压力传感器等。电容式压力传感器可以是包括至少两个具有导电材料的平行板。当有力作用于压力传感器180A,电极之间的电容改变。电子设备100根据电容的变化确定压力的强度。当有触摸操作作用于显示屏194,电子设备100根据压力传感器180A检测该触摸操作强度。电子设备100也可以根据压力传感器180A的检测信号计算触摸的位置。在一些实施例中,作用于相同触摸位置,但不同触摸操作强度的触摸操作,可以对应不同的操作指令。例如:当有触摸操作强度小于第一压力阈值的触摸操作作用于短消息应用图标时,执行查看短消息的指令。当有触摸操作强度大于或等于第一压力阈值的触摸操作作用于短消息应用图标时,执行新建短消息的指令。
陀螺仪传感器180B可以用于确定电子设备100的运动姿态。在一些实施例中,可以通过陀螺仪传感器180B确定电子设备100围绕三个轴(即X、Y和Z轴)的角速度。陀螺仪传感器180B可以用于拍摄防抖。示例性的,当按下快门,陀螺仪传感器180B检测电子设备100抖动的角度,根据角度计算出镜头模组需要补偿的距离,让镜头通过反向运动抵消电子设备100的抖动,实现防抖。陀螺仪传感器180B还可以用于导航,体感游戏场景。
加速度传感器180E可检测电子设备100在各个方向上(一般为三轴)加速度的大小。当电子设备100静止时可检测出重力的大小及方向。还可以用于识别电子设备姿态,应用于横竖屏切换,计步器等应用。
环境光传感器180L用于感知环境光亮度。电子设备100可以根据感知的环境光亮度自适应调节显示屏194亮度。环境光传感器180L也可用于拍照时自动调节白平衡。环境光传感器180L还可以与接近光传感器180G配合,检测电子设备100是否在口袋里,以防误触。
指纹传感器180H用于采集指纹。电子设备100可以利用采集的指纹特性实现指纹解锁,访问应用锁,指纹拍照,指纹接听来电等。
温度传感器180J用于检测温度。在一些实施例中,电子设备100利用温度传感器180J检测的温度,执行温度处理策略。例如,当温度传感器180J上报的温度超过阈值,电子设备100执行降低位于温度传感器180J附近的处理器的性能,以便降低功耗实施热保护。在另一些实施例中,当温度低于另一阈值时,电子设备100对电池142加热,以避免低温导致电子设备100异常关机。在其他一些实施例中,当温度低于又一阈值时,电子设备100对电池142的输出电压执行升压,以避免低温导致的异常关机。
触摸传感器180K,也称“触控面板”。触摸传感器180K可以设置于显示屏194,由触摸传感器180K与显示屏194组成触摸屏,也称“触控屏”。触摸传感器180K用于检测作用于其上或附近的触摸操作。触摸传感器可以将检测到的触摸操作传递给应用处理器,以确定触摸事件类型。可以通过显示屏194提供与触摸操作相关的视觉输出。在另一些实施例中,触摸传感器180K也可以设置于电子设备100的表面,与显示屏194所处的位置不同。
图2是本申请实施例的电子设备100的软件结构框图。分层架构将软件分成若干个层,每一层都有清晰的角色和分工。层与层之间通过软件接口通信。在一些实施例中,将Android系统分为四层,从上至下分别为应用程序层,应用程序框架层,安卓运行时(Android runtime)和系统库,以及内核层。应用程序层可以包括一系列应用程序包。
如图2所示,应用程序包可以包括相机、图库、日历、通话、地图、导航、WLAN、蓝牙、音乐、视频、短信息等应用程序。
应用程序框架层为应用程序层的应用程序提供应用编程接口(application programming interface,API)和编程框架,应用程序框架层包括一些预先定义的函数。
如图2所示,应用程序框架层可以包括窗口管理器、内容提供器、视图系统、电话管理器、资源管理器、通知管理器等。
窗口管理器用于管理窗口程序,窗口管理器可以获取显示屏大小,判断是否有状态栏、锁定屏幕、截取屏幕等。
内容提供器用来存放和获取数据,并使这些数据可以被应用程序访问。该数据可以包括视频、图像、音频、拨打和接听的电话、浏览历史和书签、电话簿等。
视图系统包括可视控件,例如显示文字的控件,显示图片的控件等。视图系统可用于构建应用程序。显示界面可以由一个或多个视图组成的。例如,包括短信通知图标的显示界面,可以包括显示文字的视图以及显示图片的视图。
电话管理器用于提供电子设备100的通信功能。例如通话状态的管理(包括接通,挂断等)。
资源管理器为应用程序提供各种资源,比如本地化字符串、图标、图片、布局文件、视频文件等等。
通知管理器使应用程序可以在状态栏中显示通知信息,可以用于传达告知类型的消息,可以短暂停留后自动消失,无需用户交互。比如通知管理器被用于告知下载完成,消息提醒等。通知管理器还可以是以图表或者滚动条文本形式出现在系统顶部状态栏的通知,例如后台运行的应用程序的通知,还可以是以对话窗口形式出现在屏幕上的通知。例如在状态栏提示文本信息、发出提示音、电子设备振动、指示灯闪烁等。
系统库可以包括多个功能模块。例如:表面管理器(surface manager)、媒体库(media libraries)、三维图形处理库(例如:OpenGL ES)、2D图形引擎(例如:SGL)等。
表面管理器用于对显示子系统进行管理,并且为多个应用程序提供了2D和3D图层的融合。
媒体库支持多种常用的音频、视频格式回放和录制以及静态图像文件等。媒体库可以支持多种音视频编码格式,例如:MPEG4、H.264、MP3、AAC、AMR、JPG和PNG等。
三维图形处理库用于实现三维图形绘图、图像渲染、合成和图层处理等。
2D图形引擎是2D绘图的绘图引擎。
内核层是硬件和软件之间的层。内核层至少包含显示驱动、摄像头驱动、音频驱动、传感器驱动。
随着移动互联网技术的兴起,观看视频已经成为了用户的日常活动。目前,视频应用程序播放视频时,功能单一,例如,当用户需要单独观看视频中的目标对象时需要手动剪辑视频,以及当用户将视频由横屏播放切换为竖屏播放时,可能会造成视频中的目标对象丢失或目标对象不完整等问题,造成用户体验的下降,基于此,本申请提供了一种视频处理的方法,可以锁定原视频中的目标对象,并可以根据目标对象调整剪辑视频时的裁剪框的尺寸,能够输出凸显原视频中的目标对象的新视频,能够提升用户的体验。
一般来说,用户在观看视频时,目光主要聚焦在视频的目标对象。视频的目标对象可以理解为视频呈现的主要内容,视频的目标对象可以是视频中的人物、动物、物体、动作等。例如,以视频是足球比赛视频为例,足球比赛视频的目标对象可以是足球、带球的球员、明星球员、球员的过人动作、球员的犯规动作等。
为了便于用户更好的观看视频中的目标对象,电子设备可以根据用户的操作确定视频中的目标对象并生成新的视频。
图3示出了本申请实施例提供的一组图形用户界面(graphical user interface,GUI)。
如图3中的(a)所示,电子设备显示有界面301,该界面301是视频应用程序的播放界面,电子设备可以在该界面301播放视频#1,该视频#1的内容是人物由位置#1跳跃至位置#2。该界面301中的加粗实线为视频#1的边界,电子设备可以在边界之外显示其他信息,例如,时间、信号强度等信息。电子设备检测到用户点击界面301的操作,响应于该操作,可以显示如图3中的(b)所示的GUI。
电子设备响应于用户点击界面301的操作,可以在界面301显示一个或多个控件,该一个或多个控件可以对应不同的功能。例如,控件302对应分享功能,控件303对应退出功能,控件304对应生成目标对象视频的功能。生成目标对象视频功能可以理解为电子设备选取原视频中的对象作为目标对象,并根据目标对象生成新的视频。当电子设备检测到用户点击控件304的操作,响应于该操作,可以以图3中的(b)中的虚线框为裁剪框裁剪视频#1,从而可以显示如图3中的(c)所示的GUI。
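As shown in (c) of FIG. 3, in response to detecting the operation of the user tapping the control 304, the electronic device may crop video #1 using the dashed box in (b) of FIG. 3 as the crop box. The crop box selects the jumping action of the person in video #1 as the target object. The cropped video frames are enlarged to the same size as video #1 to generate video #2, which is played on the interface 301. The content of video #2 is centered on the person's jumping action and can highlight that action. Comparing (b) and (c) of FIG. 3, the person in video #1 shown in (b) of FIG. 3 jumps from the left side of the display to the right side, whereas the person in video #2 shown in (c) of FIG. 3 stays in the middle of the display. Because video #2 highlights the jumping action, the person in video #2 is larger than the person in video #1.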
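In other examples, as shown in (d) of FIG. 3, in response to detecting the operation of the user tapping the control 304, the electronic device may select the jumping action of the person in video #1 as the target object to generate video #2, display an interface 305 on the interface 301, and preview video #2 on the interface 305. When the electronic device detects an operation of the user tapping the interface 305, it may play video #2 in full screen in response to the operation.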
需要说明的是,本申请实施例中电子设备确定视频中的目标对象并生成新视频的方法将在下文进行详细叙述。
本申请实施例中,电子设备可以根据原视频自动生成新的视频,新的视频保留了原视频的主要内容,能够便于用户观看视频呈现的主要内容,提升了用户体验。
图3示出的示例中,电子设备可以自动识别视频中的人物动作作为目标对象以生成新的视频,在另一些示例中,电子设备还可以根据用户的选择确定目标对象并生成新的视频。
图4示出了本申请实施例提供的另一组GUI。
如图4中的(a)和(b)所示,电子设备显示有界面401,该界面401是视频应用程序的播放界面,电子设备可以在该界面401播放视频#1,该视频#1是足球比赛视频。该视频#1中的对象包括球员#1、球员#2和足球。电子设备还可以在界面401显示控件402,该控件402对应生成目标对象视频的功能。电子设备检测到用户点击控件402的操作,响应于该操作,可以识别视频的对象并显示界面403,该界面403包括视频#1中的对象的信息。当电子设备检测到用户在界面403的选择对象的操作,响应于该操作,可以显示如图4中的(e)或(f)所示的GUI。在该示例中,可以将用户选择的对象称为目标对象。
可选的,在一些示例中,用户还可以在界面403设置生成的新的视频的时长,例如,如图4中的(b)所示,用户设置视频时间为00:00-5:30,则电子设备在生成新的视频时,可以从视频#1的00:00-05:30时间段的视频内容中确定目标对象并生成新的视频,生成的新的视频的时长为5:30。
在另一些示例中,如图4中的(a)和(c)所示,电子设备检测到用户点击控件402的操作,响应于该操作,电子设备可以在界面401框选识别到的对象。当电子设备检测到用户在界面401的选择对象的操作(例如点击球员#2和足球),响应于该操作,可以显示如图4中的(e)或(f)所示的GUI。在该示例中,可以将用户选择的对象称为目标对象。
在另一些示例中,电子设备可以自动识别视频中的目标对象,以检测到的目标对象为射门为例,当电子设备检测到视频#1中球员#2射门时,可以显示如图4中的(d)或(e)所示的GUI。
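It should be noted that, while playing video #1, the electronic device may detect the target object in video #1 according to a preset rule. The preset rule may be set by the system or by the user. For example, the preset rule is to automatically detect shots on goal when a football video is played.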
在另一些示例中,如图4中的(d)所示,电子设备检测到视频#1中射门时,可以显示选项框404,该选项框404包括用于提示用户检测到射门动作的提示信息。当电子设备检测到用户点击控件405的操作,电子设备可以在界面401播放以球员#2射门为中心的视频#2,即电子设备显示如图4中的(e)或(f)所示的GUI。
如图4中的(e)所示,电子设备响应于检测到用户选择足球和球员#2为目标对象的操作或自动识别足球和球员#2为目标对象,可以选取视频#1中的足球和球员#2为目标对象生成视频#2并在界面401播放。
如图4中的(f)所示,电子设备响应于检测到用户选择足球和球员#2为目标对象的操作或自动识别足球和球员#2为目标对象,可以选取视频#1中的足球和球员#2为目标对象生成视频#2,并在界面401上显示界面406,在该界面406上预览播放视频#2。当电子设备检测到用户点击界面406的操作,响应于该操作,可以全屏播放视频#2。
本申请实施例中,电子设备可以根据用户的选择确定目标对象或自动识别目标对象,然后根据原视频自动生成新的视频,新的视频重点凸显了用户选择的目标对象或自动识别的目标对象,能够便于用户观看视频呈现的主要内容,提升了用户体验。
需要说明的是,在上文所示的示例中,电子设备播放的视频可以是视频应用程序的在线视频,或视频也可以是本地视频。
图3和图4所示的示例中,电子设备根据用户的选择确定目标对象,或自动检测目标对象,并根据目标对象播放以目标对象为中心的新视频,在上述示例中,电子设备横屏播放视频,在另一些示例中,当电子设备由横屏变为竖屏时,电子设备也可以播放以目标对象为中心的新视频。
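It should be noted that a new video centered on the target object, as described in the embodiments of this application, means that the overall content of the new video is centered on the target object. This does not mean that every video frame of the new video is centered on the target object. In some embodiments, when the electronic device or the server generates the new video, it may apply inter-frame smoothing to the crop boxes to keep the picture fluid, so the target object in some video frames of the new video may be slightly offset from the center of the video.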
图5示出了本申请实施例提供的一组GUI。
如图5中的(a)所示,电子设备横屏在界面501播放视频#1,当电子设备检测到自身由横屏变为竖屏时,可以显示如图5中的(b)所示的GUI。
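As shown in (b) of FIG. 5, when the electronic device detects that it has changed from landscape orientation to portrait orientation, it may determine the target object in video #1 and play video #2, which is centered on the target object.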
可以看出的是,电子设备在播放视频#2时,视频帧的纵向长度可以发生变化,空白的部分可以使用黑边或者蒙版填充,具体说明请参见下文。
应理解,针对图5所示的示例中的视频的描述类似于针对图3的描述,为了简洁,在此不再赘述。
本申请实施例中,当电子设备由横屏播放视频变为竖屏播放视频时,可以确定原视频中的目标对象,然后根据原视频自动生成新的视频,新的视频重点凸显了目标对象,能够便于用户观看视频呈现的主要内容,提升了用户体验。
图3至图5所示的示例中,电子设备根据用户的选择确定目标对象,或自动检测目标对象,并根据目标对象播放以目标对象为中心的新视频,在上述示例中,电子设备可以根据默认参数裁剪原视频,但并不限定于此,在本申请的另一些示例中,电子设备可以根据用户配置生成以目标对象为中心的新视频。
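FIG. 6 shows another group of GUIs provided by an embodiment of this application.
As shown in FIG. 6, the electronic device displays a window 601 and may display the video to be cropped in the window 601. The video to be cropped may be uploaded by the user, or may be an online video.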
电子设备可以响应于用户的配置操作,生成以目标对象为中心的新视频。
示例性的,用户可以进行以下中的一项或多项配置操作:
1、视频生成类型,本申请实施例中,可以将视频生成类型分为两种:
一种是根据原视频生成凸显目标对象的视频,即如图3和图4所示的示例,电子设备播放视频#1时,可以根据视频#1生成并播放视频#2,该视频#2凸显了视频#1中的目标对象。
如图6中的(a)所示,电子设备响应于用户选择凸显目标对象的视频生成类型,可以根据视频#1生成视频#2并在窗口601播放视频#2。
一种是根据横屏视频生成凸显目标对象的竖屏视频,即如图5所示的示例,电子设备横屏播放视频#1,当检测到自身由横屏变为竖屏时,可以根据视频#1生成并播放视频#2,该视频#2适用于电子设备竖屏播放且凸显了视频#1中的目标对象。
如图6中的(b)所示,电子设备响应于用户选择横屏视频转为竖屏视频的视频生成类型,可以根据视频#1生成视频#2并在窗口602播放视频#2。
可选的,电子设备仍可以在窗口601播放视频#1。
2、目标对象,电子设备可以识别视频中的对象,从而用户可以选择视频中的对象作为目标对象,或电子设备根据预设的规则识别视频中的目标对象。目标对象可以是视频中的人物、动物、物体、动作等。
如图6中的(a)和(b)所示,假设视频#1为足球比赛视频,电子设备可以根据用户的选择确定足球为目标对象,或根据预设的规则确定足球为目标对象。
3、裁剪框尺寸限值,电子设备根据原视频生成新视频时需要对原视频进行裁剪,本申请实施例中的裁剪框是根据目标对象确定的,针对不同视频帧的裁剪框的尺寸可能不同,本申请实施例中,用户可以定义裁剪框的尺寸的上限值和/或下限值,从而电子设备在裁剪原视频时,裁剪框的尺寸不会小于用户定义的下限值,不会大于用户定义的上限值。
示例性的,如图6中的(a)所示,电子设备确定视频生成类型为凸显目标对象,用户可以设置裁剪框尺寸的上限值和/或下限值,其中l1为裁剪框的纵向长度上限值,w1为裁剪框的横向长度上限值,l2为裁剪框的纵向长度下限值,w2为裁剪框的横向长度下限值。
示例性的,如图6中的(b)所示,电子设备确定视频生成类型为横屏视频转竖屏视频,用户可以设置裁剪框尺寸的上限值和/或下限值,其中w1为裁剪框的横向长度上限值,w2为裁剪框的横向长度下限值。
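It can be seen that, when the electronic device determines that the video generation type is landscape-to-portrait conversion, the user may set only the upper limit and/or the lower limit of the horizontal length of the crop box, and the vertical length of the crop box may be equal to the vertical length of the video frames of the original video.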
本申请实施例中的纵向长度也可以称为高度。
4、目标对象帧间速度阈值,本申请实施例中可以通过用户定义的目标对象帧间速度阈值确定裁剪框的尺寸,具体说明请参见下文。
在一些示例中,目标对象帧间速度阈值可以是一个单一的速度值。例如,目标对象帧间速度阈值为70像素/s。
在一些示例中,目标对象帧间速度阈值可以是一段速度范围。例如,目标对象帧间速度阈值为70像素/s-90像素/s。
5、抽帧间隔,本申请实施例中可以定义抽帧间隔,从而电子设备可以根据该抽帧间隔从原视频中抽取视频帧,并裁剪抽取出的视频帧以生成新视频。
6、视频尺寸,本申请实施例中可以定义视频尺寸,该视频尺寸为生成的视频#2的视频的尺寸。
如图6中的(a)所示,电子设备确定视频生成类型为凸显目标对象,生成的视频是横屏视频,用户可以设置视频尺寸,则该视频尺寸为横屏视频尺寸,其中l3为横屏视频的横向长度,w3为横屏视频的纵向长度。
如图6中的(b)所示,电子设备确定视频生成类型为横屏视频转竖屏视频,生成的视频是竖屏视频,用户可以设置视频尺寸,则该视频尺寸为竖屏视频尺寸,其中l4为竖屏视频的横向长度,w4为竖屏视频的纵向长度。
需要说明的是,图6所示的设置选项还可以显示在图4的界面403中。
本申请实施例中,电子设备可以根据用户的配置和原视频生成新的视频,新的视频重点凸显了目标对象,能够便于用户观看视频呈现的主要内容,提升了用户体验。
需要说明的是,上文在介绍图3至图6示出的GUI时的界面也可以理解为窗口,例如,以图3所示的GUI为例,界面301也可以称为窗口301,界面305也可以称为窗口305。
在图3至图6示出的示例中,电子设备可以确定视频#1中的目标对象,并根据确定的目标对象裁剪视频#1的视频帧,然后根据裁剪后的视频帧生成视频#2,在裁剪视频#1的视频帧时,可以根据目标对象确定每一个视频帧的裁剪框的尺寸。换句话说,每一个视频帧的裁剪框的尺寸可能不同,由于每一个视频帧的裁剪框的尺寸可能不同,则裁剪后的视频帧的视野可能也不相同。裁剪后的视频帧的视野可以理解为用于表征裁剪后的视频帧的尺寸与原视频帧的尺寸比值关系。例如,视频帧#1和视频帧#2的尺寸为a,其中视频帧#1的裁剪框的尺寸为b,视频帧#2的裁剪框的尺寸为c,b>c,将视频帧#1裁剪得到视频帧#3,将视频帧#2裁剪得到视频帧#4,由于b>c且视频帧#1和视频帧#2的尺寸均为a,则视频帧#3与视频帧#1的比值大于视频帧#4与视频帧#2的比值,即视频帧#3的视野大于视频帧#4的视野。
图7示出了本申请实施例提供的一种裁剪视频帧的示意图。
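As shown in FIG. 7, take the cropping of video frame #1, video frame #2, and video frame #3 of video #1 as an example. Video frame #1, video frame #2, and video frame #3 have the same size; video frame #1 precedes video frame #2, and video frame #2 precedes video frame #3. The electronic device determines crop box #1 in video frame #1, crop box #2 in video frame #2, and crop box #3 in video frame #3. As can be seen from FIG. 7, crop box #1 is smaller than crop box #2, and crop box #2 is smaller than crop box #3. The aspect ratio of each crop box may be equal to that of the video frame. How the crop box size is determined from the target object is described below.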
通过裁剪视频帧#1可以得到视频帧#4,裁剪视频帧#2可以得到视频帧#5,裁剪视频帧#3可以得到视频帧#6。由于裁剪框#1的尺寸小于裁剪框#2的尺寸,裁剪框#2的尺寸小于裁剪框#3的尺寸,则视频帧#4的尺寸小于视频帧#5的尺寸,视频帧#5的尺寸小于视频帧#6的尺寸,即视频帧#4的视野小于视频帧#5的视野,视频帧#5的视野小于视频帧#6的视野。
在生成视频#2时,可以将视频帧#4的尺寸扩大至与视频帧#1的相同的尺寸以得到视频帧#7,将视频帧#5的尺寸扩大至与视频帧#2的相同的尺寸以得到视频帧#8,将视频帧#6的尺寸扩大至与视频帧#3的相同的尺寸以得到视频帧#9。
需要说明的是,虽然将视频帧#4、视频帧#5和视频帧#6的尺寸扩大为同一个尺寸以分别得到视频帧#7、视频帧#8和视频帧#9,但仍可以认为,视频帧#7的视野小于视频帧#8的视野,视频帧#8的视野小于视频帧#9的视野。
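Comparing video frame #1, video frame #2, and video frame #3 in video #1 with video frame #7, video frame #8, and video frame #9 in video #2, it can be seen that the size of the person in video #1 does not change. Because video frame #4, video frame #5, and video frame #6 differ in size, the size of the person changes once these frames are adjusted to the same size, so the person in video #2 gradually becomes smaller.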
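The relationship between crop box size, field of view, and the apparent size of the person can be checked with a little arithmetic. The following is a minimal sketch, not the method of this application: it assumes the field of view is measured as an area ratio, and the frame size, crop sizes, and person height are made-up illustration values.

```python
def field_of_view(crop_w, crop_h, frame_w, frame_h):
    """Ratio of the cropped area to the original frame area (assumed measure)."""
    return (crop_w * crop_h) / (frame_w * frame_h)


def apparent_height(obj_h, crop_h, out_h):
    """Height of an object after a crop of height crop_h is resampled to out_h."""
    return obj_h * out_h / crop_h


# Hypothetical values: a 1920x1080 frame and three growing crop boxes (FIG. 7 case).
frame_w, frame_h = 1920, 1080
crops = [(960, 540), (1280, 720), (1600, 900)]
person_h = 300  # the person has the same height in every original frame

fovs = [field_of_view(w, h, frame_w, frame_h) for w, h in crops]
heights = [apparent_height(person_h, h, frame_h) for _, h in crops]
```

With these values the field of view grows from frame to frame while the person's height after resampling falls, which matches the FIG. 7 description. Reversing the order of the crop boxes gives the FIG. 8 case, in which the person gradually becomes larger.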
图8示出了本申请实施例提供的一种裁剪视频帧的示意图。
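As shown in FIG. 8, take the cropping of video frame #1, video frame #2, and video frame #3 of video #1 as an example. Video frame #1, video frame #2, and video frame #3 have the same size; video frame #1 precedes video frame #2, and video frame #2 precedes video frame #3. The electronic device determines crop box #1 in video frame #1, crop box #2 in video frame #2, and crop box #3 in video frame #3. As can be seen from FIG. 8, crop box #1 is larger than crop box #2, and crop box #2 is larger than crop box #3. The aspect ratio of each crop box may be equal to that of the video frame.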
通过裁剪视频帧#1可以得到视频帧#4,裁剪视频帧#2可以得到视频帧#5,裁剪视频帧#3可以得到视频帧#6。由于裁剪框#1的尺寸大于裁剪框#2的尺寸,裁剪框#2的尺寸大于裁剪框#3的尺寸,则视频帧#4的尺寸大于视频帧#5的尺寸,视频帧#5的尺寸大于视频帧#6的尺寸,即视频帧#4的视野大于视频帧#5的视野,视频帧#5的视野大于视频帧#6的视野。
在生成视频#2时,可以将视频帧#4的尺寸扩大至与视频帧#1的相同的尺寸以得到视频帧#7,将视频帧#5的尺寸扩大至与视频帧#2的相同的尺寸以得到视频帧#8,将视频帧#6的尺寸扩大至与视频帧#3的相同的尺寸以得到视频帧#9。
对比视频#1中的视频帧#1、视频帧#2和视频帧#3和视频#2中的视频帧#7、视频帧#8和视频帧#9可以看出,视频#1中的人物在视频中的尺寸没有变化,而由于视频帧#4、视频帧#5和视频帧#6的尺寸不同,在将视频帧#4、视频帧#5和视频帧#6调整为同一尺寸后,视频帧中的人物尺寸会发生变化,因此视频#2中的人物在视频中的尺寸逐渐变大。
需要说明的是,虽然将视频帧#4、视频帧#5和视频帧#6的尺寸扩大为同一个尺寸以分别得到视频帧#7、视频帧#8和视频帧#9,但仍可以认为,视频帧#7的视野大于视频帧#8的视野,视频帧#8的视野大于视频帧#9的视野。
在图7和图8示出的裁剪视频帧的示例中,裁剪框可以随着视频帧逐渐增大或减小,但本申请实施例并不限定于此,在另一些示例中,裁剪框可以随着视频帧逐渐增大再减小,或随着视频帧逐渐减小再增大。
在图7和图8示出的裁剪视频帧的示例中,视频#1为横屏视频,经过裁剪得到的视频#2仍为横屏视频,但本申请实施例并不限定于此,在另一些示例中,视频#1为横屏视频,经过裁剪得到的视频#2可以是竖屏视频。
图9示出了本申请实施例提供的一种裁剪视频帧的示意图。
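As shown in FIG. 9, take the cropping of video frame #1, video frame #2, and video frame #3 of video #1 as an example. Video frame #1, video frame #2, and video frame #3 have the same size; video frame #1 precedes video frame #2, and video frame #2 precedes video frame #3. The electronic device determines crop box #1 in video frame #1, crop box #2 in video frame #2, and crop box #3 in video frame #3. As can be seen from FIG. 9, the crop boxes have the same vertical length; the horizontal length of crop box #1 is greater than that of crop box #2, and the horizontal length of crop box #2 is greater than that of crop box #3. Therefore, crop box #1 is larger than crop box #2, and crop box #2 is larger than crop box #3.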
通过裁剪视频帧#1可以得到视频帧#4,裁剪视频帧#2可以得到视频帧#5,裁剪视频帧#3可以得到视频帧#6。由于裁剪框#1的尺寸大于裁剪框#2的尺寸,裁剪框#2的尺寸大于裁剪框#3的尺寸,则视频帧#4的尺寸大于视频帧#5的尺寸,视频帧#5的尺寸大于视频帧#6的尺寸,即视频帧#4的视野大于视频帧#5的视野,视频帧#5的视野大于视频帧#6的视野。
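When video #2 is generated, unlike FIG. 7 and FIG. 8, in which video frame #4, video frame #5, and video frame #6 are enlarged to the same size as video frame #1, video frame #2, and video frame #3, the horizontal lengths of video frame #4, video frame #5, and video frame #6 may be adjusted to the horizontal length of the portrait video when a portrait video is generated. The horizontal length of the portrait video may be defined by the user, or may be determined according to the screen size of the electronic device.
Video frame #4, video frame #5, and video frame #6 have the same vertical length, but the horizontal length of video frame #4 is greater than that of video frame #5, and the horizontal length of video frame #5 is greater than that of video frame #6. Therefore, when the horizontal lengths of video frame #4, video frame #5, and video frame #6 are adjusted to the horizontal length of the portrait video to obtain video frame #7, video frame #8, and video frame #9 respectively, the vertical length of video frame #7 is less than that of video frame #8, and the vertical length of video frame #8 is less than that of video frame #9.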
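The effect of scaling equal-height crops to a common width can be illustrated as follows. This is a sketch under assumed numbers (a 1080-pixel-high source, a 1080-pixel-wide portrait target, and three illustrative crop widths); `scale_to_width` is a hypothetical helper, not a function named in this application.

```python
def scale_to_width(crop_w, crop_h, target_w):
    """Scale a crop uniformly so that its width becomes target_w."""
    return target_w, round(crop_h * target_w / crop_w)


# Hypothetical crops with the same height and decreasing width (FIG. 9 case).
crops = [(810, 1080), (720, 1080), (608, 1080)]
target_w = 1080  # assumed horizontal length of the portrait video

portrait_frames = [scale_to_width(w, h, target_w) for w, h in crops]
```

The narrower the crop, the larger the scale factor, so the resulting frames share one width but grow in height. Any remaining screen area could be filled with black bars or a mask, as noted for FIG. 5.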
上文结合GUI和裁剪视频示意图介绍了本申请实施例提供的视频处理的方法,下文将结合图10介绍本申请实施例提供的视频处理的方法的示意性流程图。
图10示出了本申请实施例提供的视频处理方法的示意性流程图,如图10所示,该方法包括:
S1001,获取第一视频。
在一些实施例中,电子设备在播放视频时可以获取第一视频,该第一视频可以是在线视频,或可以是本地视频,该第一视频是横屏播放的视频。
在一些实施例中,用户可以上传第一视频以编辑该第一视频,从而电子设备可以获取第一视频。
第一视频包括N个视频帧,该N个视频帧的尺寸可以相同,N>1且为整数。该N个视频帧包括M个对象。该M个对象可以是人物、动物、物体、动作等。
当该M个对象是人物、动物、物体时,该N个视频帧包括M个对象可以理解为该N个视频帧的包括该M个人物、动物、物体。
当该M个对象是动作时,该N个视频帧包括该M个对象可以理解为该N个视频帧组成的视频呈现的内容是该M个动作。
S1002,确定第一视频参数。
在一些实施例中,电子设备响应于用户确定生成视频的操作,确定第一视频参数,该第一视频参数可以是预设的。该第一视频参数包括目标对象以及以下中的一项或多项:视频生成类型、裁剪框尺寸限值、目标对象帧间速度阈值、抽帧间隔、视频尺寸、视频时间。
例如,如图3所示,电子设备响应于用户点击控件304的操作,可以确定第一视频参数,该第一视频参数用于生成视频#2。
在一些实施例中,电子设备检测到第一视频中包括预设的目标对象,确定第一视频参数,该第一视频参数可以是预设的。
例如,如图4所示,电子设备检测到视频#1中包括射门动作,确定第一视频参数,该第一视频参数用于生成视频#2。
在一些实施例中,电子设备检测到用户配置视频参数的操作,确定第一视频参数。
例如,如图6中的(a)所示,用户可以在界面601设置视频参数,其中用户配置的视频生成类型为凸显目标对象,从而电子设备可以响应于用户配置视频参数的操作,确定第一视频参数。
S1003,追踪目标对象。
在一些实施例中,电子设备确定第一视频参数后,可以追踪第一视频中的每一个视频帧的目标对象。
在一些实施例中,电子设备确定第一视频参数后,可以将第一视频参数和第一视频发送给服务器,由服务器追踪第一视频中的每一个视频帧的目标对象。
可选的,在一些实施例中,在S1003,追踪目标对象之前,该方法还包括:
确定第一视频的L个视频帧。
电子设备或服务器确定第一视频参数后,可以从第一视频的N个视频帧中确定第一视频的L个视频帧,该第一视频的L个视频帧包括电子设备确定的目标对象。
电子设备或服务器可以通过以下两种可能的实现方式从第一视频的N个视频帧中确定第一视频的L个视频帧:
一种可能的实现方式,电子设备根据抽帧间隔从第一视频的N个视频帧中确定第一视频的L个视频帧,则N>L,该抽帧间隔可以是用户配置的,或也可以是系统预置的或自动配置的。
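It should be noted that, if the frame-sampling interval is configured automatically by the system, the system may determine the interval according to the inter-frame speed. For example, when the inter-frame speed is 70 pixels/s, the frame-sampling interval is 2, that is, one video frame is extracted at intervals of two video frames; when the inter-frame speed is 50 pixels/s, the interval is 3; when the inter-frame speed is 90 pixels/s, the interval is 1. In other words, the frame-sampling interval decreases as the inter-frame speed increases.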
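A minimal sketch of speed-dependent frame sampling is shown below. The three example points (90, 70, and 50 pixels/s) come from the text; treating them as breakpoints of a step function is an assumption, as is the reading that an interval of k means k frames are skipped between two extracted frames.

```python
def sampling_interval(speed_px_per_s):
    """Map an inter-frame speed to a sampling interval (assumed breakpoints)."""
    if speed_px_per_s >= 90:
        return 1
    if speed_px_per_s >= 70:
        return 2
    return 3


def sample_frames(frames, interval):
    """Keep one frame, skip `interval` frames, and repeat (assumed reading)."""
    return frames[:: interval + 1]


frames = list(range(10))          # stand-in for N = 10 video frames
fast = sample_frames(frames, sampling_interval(90))
slow = sample_frames(frames, sampling_interval(50))
```

Under this reading a fast-moving target keeps more frames than a slow one, and the number of sampled frames L is always smaller than N.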
一种可能的实现方式,电子设备不进行抽帧处理,则N=L。
S1004,确定裁剪框,裁剪框的横纵比与第一视频的横纵比相同。
在一些实施例中,电子设备确定每一个视频帧的目标对象后,可以确定每一个视频帧的裁剪框。
In some embodiments, after determining the target object of each video frame, the server may determine a crop box for each video frame.
电子设备或服务器可以通过以下几种可能的实现方式确定视频帧的裁剪框:
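One possible implementation: the electronic device or the server determines the inter-frame speed of the target object, and determines the crop box of each video frame according to the inter-frame speed and the target object. The inter-frame speed of the target object can be understood as the ratio of the displacement of the target object between two adjacent video frames to the time between them. For example, as shown in FIG. 11, the center coordinates of the target object are (x1, y1) in video frame #1 and (x2, y2) in video frame #2, video frame #1 and video frame #2 are adjacent video frames, and the time interval is t1. The inter-frame speed v of the target object between video frame #1 and video frame #2 can then be calculated using formula (1):
$$v = \frac{\sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}}{t_1} \qquad (1)$$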
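The inter-frame speed defined above, displacement of the target's center divided by the time between two adjacent frames, can be computed as in the following sketch. The function name and the sample coordinates are illustrative assumptions.

```python
import math


def inter_frame_speed(center_1, center_2, dt):
    """Speed in pixels/s of a target whose center moves from center_1 to center_2.

    center_1, center_2: (x, y) center coordinates in two adjacent frames.
    dt: time between the two frames in seconds (t1 in the text).
    """
    if dt <= 0:
        raise ValueError("dt must be positive")
    dx = center_2[0] - center_1[0]
    dy = center_2[1] - center_1[1]
    return math.hypot(dx, dy) / dt


speed = inter_frame_speed((0, 0), (3, 4), 0.5)
```

Here the center moves 5 pixels in 0.5 s, giving a speed of 10 pixels/s.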
电子设备确定裁剪框需要确定裁剪框的位置和尺寸,其中,电子设备可以根据目标对象在视频帧的位置确定裁剪框的位置,以及可以根据目标对象的帧间速度确定裁剪框的尺寸。
例如,以图8所示的裁剪示意图为例,L=3,电子设备或服务器确定3个视频帧,该3个视频帧的顺序为视频帧#1、视频帧#2和视频帧#3。电子设备或服务器可以识别上述3个视频帧中的目标对象并确定该目标对象在视频帧#1和视频帧#2之间的帧间速度#1,以及该目标对象在视频帧#2和视频帧#3之间的帧间速度#2。电子设备或服务器在确定3个裁剪框时,可以先确定视频帧#1的裁剪框#1。电子设备或服务器可以根据目标对象在视频帧#1的位置确定裁剪框#1的位置,并可以在保证视频帧#1中的目标对象完整的情况下向外扩张一定的距离以确定裁剪框#1的尺寸,向外扩展的距离可以是系统预设的,或可以是用户设置的。电子设备或服务器在确定裁剪框#1的位置和尺寸后可以确定裁剪框#2。电子设备或服务器可以根据目标对象在视频帧#2的位置确定裁剪框#2的位置,并根据裁剪框#1的尺寸和帧间速度#1确定裁剪框#2的尺寸。假设帧间速度#1小于阈值,则电子设备或服务器可以使裁剪框#2的尺寸小于裁剪框#1的尺寸。电子设备或服务器在确定裁剪框#2的位置和尺寸后,可以确定裁剪框#3。电子设备或服务器可以根据目标对象在视频帧#3的位置确定裁剪框#3的位置,并根据裁剪框#2的尺寸和帧间速度#2确定裁剪框#3的尺寸。假设帧间速度#2小于阈值,则电子设备或服务器可以使裁剪框#3的尺寸小于裁剪框#2的尺寸,从而电子设备或服务器确定了3个裁剪框的尺寸,其中裁剪框#1的尺寸大于裁剪框#2的尺寸,裁剪框#2的尺寸大于裁剪框#3的尺寸。
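The sequential sizing of crop boxes described for FIG. 8 can be sketched as follows. The text only states that a speed below the threshold leads to a smaller crop box; the 10% step, the growth branch for speeds above the range, and the unchanged size inside the range are assumptions made for illustration.

```python
def next_crop_size(prev_size, speed, low, high, step=0.1):
    """Return the (width, height) of the next crop box.

    prev_size: (width, height) of the previous crop box in pixels.
    speed: inter-frame speed of the target object in pixels/s.
    low, high: bounds of the speed threshold range (equal for a single value).
    step: hypothetical fraction by which the box shrinks or grows.
    """
    width, height = prev_size
    if speed < low:
        factor = 1 - step      # slow target: tighten the crop
    elif speed > high:
        factor = 1 + step      # fast target: widen the crop
    else:
        factor = 1.0           # inside the range: keep the size
    # Both sides use the same factor, so the aspect ratio is kept up to rounding.
    return round(width * factor), round(height * factor)


def crop_sizes(first_size, speeds, low=70, high=90):
    """Chain next_crop_size over the speeds between consecutive frames."""
    sizes = [first_size]
    for speed in speeds:
        sizes.append(next_crop_size(sizes[-1], speed, low, high))
    return sizes


sizes = crop_sizes((1600, 900), speeds=[40, 50])
```

With two speeds below the threshold the three crop boxes shrink in turn, which corresponds to crop box #1 being larger than crop box #2 and crop box #2 being larger than crop box #3.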
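It should be noted that the inter-frame speed threshold may be preset by the system or configured by the user. That is, in the GUI shown in FIG. 6, the user may configure the inter-frame speed threshold on the interface 601.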
一种可能的实现方式:电子设备或服务器根据视频理解算法确定视频帧的裁剪框。
电子设备或服务器可以根据视频理解算法识别第一个视频帧的高层语义,电子设备或服务器可以在保证高层语义不变的情况下确定每一个视频帧的裁剪框的位置和尺寸。
可以理解的是,由于目标对象在第一视频的每一个视频帧中的尺寸可能会发生变化,为了保证高层语义不变,电子设备或服务器确定的每一个视频帧的裁剪框的尺寸可以不同。
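The video understanding algorithm is not limited in the embodiments of this application. For example, the video understanding algorithm may be the improved dense trajectories (iDT) algorithm, a slow feature analysis algorithm, or the like.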
在一些实施例中,若定义有裁剪框尺寸限值,该裁剪框的尺寸限值可以包括上限值和/或下限值,则电子设备或服务器在确定裁剪框的尺寸时需使得裁剪框的尺寸大于下限值和/或小于上限值。裁剪框尺寸限值可以是系统预设的,或可以是用户配置的,即如图6所示的GUI,用户可以在界面601配置裁剪框尺寸限值。
示例性的,裁剪框尺寸限值可以是裁剪框面积限值,则裁剪框的面积需大于下限值和/或小于上限值。
示例性的,裁剪框尺寸限值可以是裁剪框横向长度、纵向长度限值,则裁剪框的横向长度、纵向长度需大于下限值和/或小于上限值。
示例性的,裁剪框尺寸限值可以是裁剪框周长限值,则裁剪框的周长需大于下限值和/或小于上限值。
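Applying side-length limits to a crop box could look like the sketch below. It assumes the limits are given as optional minimum and maximum widths and heights (w2, w1, l2, and l1 in the FIG. 6 description) and that the box is rescaled uniformly so that its aspect ratio is kept; area or perimeter limits would need a different scale computation.

```python
def clamp_crop(width, height, min_w=None, max_w=None, min_h=None, max_h=None):
    """Rescale a crop box uniformly so that it respects the given limits."""
    scale = 1.0
    # Shrink if any upper limit is exceeded.
    if max_w is not None:
        scale = min(scale, max_w / width)
    if max_h is not None:
        scale = min(scale, max_h / height)
    if scale == 1.0:
        # Otherwise grow if any lower limit is not met.
        if min_w is not None:
            scale = max(scale, min_w / width)
        if min_h is not None:
            scale = max(scale, min_h / height)
    return round(width * scale), round(height * scale)


too_big = clamp_crop(2000, 1125, max_w=1600, max_h=900)
too_small = clamp_crop(320, 180, min_w=640, min_h=360)
in_range = clamp_crop(1000, 562, min_w=640, max_w=1600)
```

A box that already lies between the limits is returned unchanged, so the clamp can be applied after every size update without side effects.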
在一些实施例中,电子设备或服务器确定的裁剪框的横纵比与第一视频的视频帧的横纵比相同。
例如,如图12中的(a)所示,视频帧的横向长度为a,纵向长度为b,视频帧的横纵比为a/b,裁剪框的横向长度为c,纵向长度为d,裁剪框的横纵比为c/d,其中a/b=c/d。
在另一些实施例中,电子设备或服务器确定的裁剪框的纵向长度与第一视频的视频帧的纵向长度相同,具体说明请参见下文针对图14的说明。
为了更加清楚的说明电子设备确定裁剪框的过程,下面将结合图13进行介绍。
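As shown in (a) of FIG. 13, video frame #1 may be the first video frame of the first video determined by the electronic device or the server, and video frame #2 is a video frame after video frame #1. The electronic device or the server may determine crop box #1 according to the target object in video frame #1. After determining crop box #1, the electronic device may determine the position of crop box #2 according to the position of the target object in video frame #2, and may determine, according to the inter-frame speed or the video understanding algorithm, that crop box #2 is smaller than crop box #1. In (a) of FIG. 13, video frame #1, video frame #2, crop box #1, and crop box #2 have the same aspect ratio.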
如图13中的(b)所示,视频帧#1可以是电子设备或服务器确定的第一视频的第一个视频帧,视频帧#2是视频帧#1后的视频帧,电子设备或服务器可以根据视频帧#1中的目标对象确定裁剪框#1。电子设备确定裁剪框#1后,可以根据目标对象在视频帧#2的位置确定裁剪框#2的位置,并可以根据帧间速度或视频理解算法确定裁剪框#2的尺寸大于裁剪框#1的尺寸。在图13中的(b)示出的视频帧#1、视频帧#2、裁剪框#1和裁剪框#2的横纵比相同。
如图13中的(c)所示,视频帧#1可以是电子设备或服务器确定的第一视频的第一个视频帧,视频帧#2是视频帧#1后的视频帧,电子设备或服务器可以根据视频帧#1中的目标对象#1和目标对象#2确定裁剪框#1,以及根据视频帧#2中的目标对象#1和目标对象#2确定裁剪框#2,由图可知,相较于视频帧#1,视频帧#2中的目标对象#1和目标对象#2的距离增大,则裁剪框#2的尺寸大于裁剪框#1的尺寸。在图13中的(c)示出的视频帧#1、视频帧#2、裁剪框#1和裁剪框#2的横纵比相同。
如图13中的(d)所示,视频帧#1可以是电子设备或服务器确定的第一视频的第一个视频帧,视频帧#2是视频帧#1后的视频帧,电子设备或服务器可以根据视频帧#1中的目标对象#1和目标对象#2确定裁剪框#1,以及根据视频帧#2中的目标对象#1和目标对象#2确定裁剪框#2,由图可知,相较于视频帧#1,视频帧#2中的目标对象#1和目标对象#2的距离减小,则裁剪框#2的尺寸小于裁剪框#1的尺寸。在图13中的(d)示出的视频帧#1、视频帧#2、裁剪框#1和裁剪框#2的横纵比相同。
在一些实施例中,当电子设备或服务器确定多个目标对象,电子设备可以根据视频理解算法确定每一个视频帧的裁剪框。
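For example, as shown in (e) of FIG. 13, take target object #1 being a football and target object #2 being a player. Video frame #1 may be the first video frame of the first video determined by the electronic device or the server, and video frame #2 is a video frame after video frame #1. The content of video frame #1 is the player kicking the ball, and the content of video frame #2 is the ball entering the goal. When determining the crop box of video frame #1, to highlight the whole shot, the electronic device or the server includes both the player and the football in crop box #1, that is, target object #1 and target object #2. When determining the crop box of video frame #2, to highlight the ball entering the goal, crop box #2 may include only the football, that is, only target object #1.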
在一些实施例中,当电子设备或服务器确定多个目标对象,电子设备或服务器可以确定每一个目标对象的优先级,并根据每一个目标对象的优先级和每一个目标对象的帧间速度确定L个裁剪框。
例如,如图13中的(f)所示,视频帧#1可以是电子设备或服务器确定的第一个视频帧,视频帧#2是视频帧#1后的视频帧,电子设备或服务器可以根据视频帧#1中的目标对象#1和目标对象#2确定裁剪框#1,其中目标对象#1的优先级高于目标对象#2的优先级,若目标对象#1和目标对象#2的帧间速度均小于阈值,且由于目标对象#1的优先级高于目标对象#2的优先级,为了在新视频中凸显目标对象#1,相较于裁剪框#1,可以向着目标对象#1的方向缩小裁剪框的尺寸以得到裁剪框#2。在图13中的(f)示出的视频帧#1、视频帧#2、裁剪框#1和裁剪框#2的横纵比相同。
再例如,如图13中的(g)所示,视频帧#1可以是电子设备或服务器确定的第一个视频帧,视频帧#2是视频帧#1后的视频帧,电子设备或服务器可以根据视频帧#1中的目标对象#1和目标对象#2确定裁剪框#1,其中目标对象#1的优先级高于目标对象#2的优先级,若目标对象#1和目标对象#2的帧间速度中的任意一个大于阈值,相较于裁剪框#1,可以扩张裁剪框的尺寸以得到裁剪框#2。在图13中的(g)示出的视频帧#1、视频帧#2、裁剪框#1和裁剪框#2的横纵比相同。
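Comparing (f) and (g) of FIG. 13, when the inter-frame speeds of the multiple target objects are all less than the threshold, the electronic device or the server may shrink the crop box toward the target object with the higher priority when determining the crop box; when any one of the inter-frame speeds of the multiple target objects is greater than the threshold, the crop box may be expanded.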
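The priority rule for multiple target objects can be sketched as below. The data layout (dictionaries with `box`, `speed`, and `priority` keys), the 20% step, and the way the box edges are moved are all assumptions for illustration; aspect-ratio correction and clamping to the frame bounds are left out.

```python
def crop_for_targets(prev_crop, targets, threshold, step=0.2):
    """Return the next crop box (x0, y0, x1, y1) for several target objects.

    prev_crop: crop box of the previous frame.
    targets: list of dicts with 'box', 'speed' (pixels/s) and 'priority'.
    threshold: inter-frame speed threshold in pixels/s.
    """
    if all(t["speed"] < threshold for t in targets):
        # All targets are slow: move every edge toward the top-priority target.
        top = max(targets, key=lambda t: t["priority"])
        return tuple(p + step * (b - p) for p, b in zip(prev_crop, top["box"]))
    # At least one target is fast: expand the box around its center.
    x0, y0, x1, y1 = prev_crop
    dx = step * (x1 - x0) / 2
    dy = step * (y1 - y0) / 2
    return (x0 - dx, y0 - dy, x1 + dx, y1 + dy)


prev = (200, 100, 1200, 600)
slow_targets = [
    {"box": (300, 200, 500, 500), "speed": 10, "priority": 2},
    {"box": (900, 200, 1100, 500), "speed": 20, "priority": 1},
]
fast_targets = [dict(slow_targets[0]), dict(slow_targets[1], speed=80)]

shrunk = crop_for_targets(prev, slow_targets, threshold=50)
expanded = crop_for_targets(prev, fast_targets, threshold=50)
```

In the slow case the right edge moves furthest, because the higher-priority target sits on the left of the previous crop box; in the fast case the box grows evenly on all sides.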
S1005,根据裁剪框裁剪第一视频。
电子设备或服务器确定裁剪框后,可以裁剪第一视频以得到裁剪后的视频帧,由于裁剪框的横纵比与第一视频的视频帧的横纵比相同,则裁剪后的视频帧的横纵比与第一视频的视频帧的横纵比相同。
S1006,生成第二视频。
电子设备或服务器得到裁剪后的视频帧,可以进行重采样处理以得到第二视频。
需要说明的是,由于电子设备或服务器确定的裁剪框的尺寸不同,电子设备或服务器得到的裁剪后的视频帧的尺寸也可以不同,但横纵比相同,经过重采样处理后,第二视频的视频帧的尺寸相同。
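For example, as shown in FIG. 7 or FIG. 8, the electronic device or the server obtains video frame #4, video frame #5, and video frame #6 through cropping, and resamples these video frames to obtain video frame #7, video frame #8, and video frame #9 of the same size. However, video frame #7, video frame #8, and video frame #9 have different fields of view.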
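Cropping followed by resampling to a fixed output size can be illustrated with a pure-Python nearest-neighbor sketch. The resampling filter is not specified in the text, so nearest-neighbor is an assumption chosen for brevity; a frame is modeled here as a list of rows of pixel values.

```python
def crop_and_resize(frame, box, out_w, out_h):
    """Crop `frame` to `box` and resample the crop to out_w x out_h.

    frame: list of rows, each row a list of pixel values.
    box: (x0, y0, width, height) of the crop box in pixels.
    """
    x0, y0, w, h = box
    return [
        # Map every output pixel back to its nearest source pixel in the crop.
        [frame[y0 + (j * h) // out_h][x0 + (i * w) // out_w] for i in range(out_w)]
        for j in range(out_h)
    ]


# A 4x4 test frame whose pixel value encodes its position (row * 4 + column).
frame = [[row * 4 + col for col in range(4)] for row in range(4)]
resized = crop_and_resize(frame, (1, 1, 2, 2), out_w=4, out_h=4)
```

The 2x2 crop is stretched back to 4x4, so every cropped pixel covers four output pixels. Crops of different sizes resampled this way end up with identical dimensions but different fields of view.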
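In the embodiments of this application, the electronic device may determine the target object in the original video and generate a new video centered on the target object. Because the crop boxes applied to the original video differ in size, the field of view of the new video changes, which better highlights the target object in the original video and improves user experience.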
需要说明的是,本申请实施例中所述的以目标对象为中心的新视频是指新的视频的整体视频内容以目标对象为中心,但这并不代表新的视频的每一个视频帧均是以目标对象为中心,在一些实施例中,电子设备或服务器生成新视频时,为了保证视频的画面流畅,可以对裁剪框做帧间平滑,从而新视频的部分视频帧中的目标对象可能相对于视频的中心位置稍有偏移。
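One simple way to smooth crop boxes between frames is an exponential moving average over the box coordinates, sketched below. The text does not say which smoothing method is used, so the filter and the `alpha` weight are assumptions.

```python
def smooth_boxes(boxes, alpha=0.5):
    """Exponentially smooth a sequence of crop boxes (x0, y0, x1, y1).

    alpha: weight of the current frame's box; lower values smooth more.
    """
    if not boxes:
        return []
    smoothed = [tuple(float(v) for v in boxes[0])]
    for box in boxes[1:]:
        prev = smoothed[-1]
        smoothed.append(
            tuple(alpha * cur + (1 - alpha) * old for cur, old in zip(box, prev))
        )
    return smoothed


# The target jumps 40 pixels to the right after the first frame.
raw = [(0, 0, 100, 100), (40, 0, 140, 100), (40, 0, 140, 100)]
smoothed = smooth_boxes(raw)
```

The smoothed box lags behind the raw box (20 and then 30 pixels instead of 40), which is why the target object can sit slightly off center in some frames of the new video.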
上文结合图10介绍了电子设备或服务器根据原视频生成与原视频横纵比相同的新视频的方法,下文将介绍电子设备或服务器根据原横屏视频生成新的竖屏视频的方法,新的竖屏视频的横纵比与原横屏视频的横纵比不同。
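FIG. 14 shows a schematic flowchart of a video processing method provided by an embodiment of this application. As shown in FIG. 14, the method includes: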
S1401,获取第一视频。
S1402,确定第一视频参数。
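In some embodiments, the first video is a landscape video. When the electronic device detects that it has changed from landscape orientation to portrait orientation, it determines the first video parameter, which may be preset.
For example, as shown in FIG. 5, when the electronic device detects that it has changed from landscape orientation to portrait orientation, it determines the first video parameter, which is used to generate video #2.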
在一些实施例中,电子设备检测到用户配置视频参数的操作,确定第一视频参数。
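For example, as shown in (b) of FIG. 6, the user may set video parameters on the interface 601, where the video generation type configured by the user is landscape-to-portrait conversion, so that the electronic device may determine the first video parameter in response to the user's operation of configuring the video parameters.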
S1403,追踪目标对象。
可选的,在一些实施例中,在S1403,追踪目标对象之前,该方法还包括:
确定第一视频的L个视频帧。
应理解,针对S1401-S1403的详细描述,可以参见针对S1001-S1003的描述,为了简洁,在此不再赘述。
S1404,确定裁剪框,裁剪框的纵向长度相同。
电子设备或服务器确定裁剪框的位置和尺寸,电子设备或服务器确定裁剪框的位置和尺寸的方法与上文类似,在此不再赘述。与图10所示的方法不同的是,在该方法中确定的裁剪框的纵向长度相同,横向长度不同,换句话说,裁剪框的横纵比不同。
例如,如图12中的(b)所示,视频帧的横向长度为a,纵向长度为b,裁剪框的横向长度为c,纵向长度为b。
为了更加清楚的说明电子设备确定裁剪框的过程,下面将结合图15进行介绍。
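As shown in (a) of FIG. 15, video frame #1 may be the first video frame of the first video determined by the electronic device or the server, and video frame #2 is a video frame after video frame #1. The electronic device or the server may determine crop box #1 according to the target object in video frame #1. After determining crop box #1, if the electronic device determines, according to the inter-frame speed or the video understanding algorithm, that crop box #2 is larger than crop box #1, it may expand crop box #2 in the direction in which the target object moves. In (a) of FIG. 15, video frame #1 and crop box #1 have the same vertical length but different horizontal lengths, that is, their aspect ratios differ. Similarly, video frame #2 and crop box #2 have different aspect ratios.
As shown in (b) of FIG. 15, video frame #1 may be the first video frame of the first video determined by the electronic device or the server, and video frame #2 is a video frame after video frame #1. The electronic device or the server may determine crop box #1 according to the target object in video frame #1. After determining crop box #1, if the electronic device determines, according to the inter-frame speed or the video understanding algorithm, that crop box #2 is smaller than crop box #1, it may shrink crop box #2 in the direction in which the target object moves. In (b) of FIG. 15, video frame #1 and crop box #1 have different aspect ratios, and video frame #2 and crop box #2 have different aspect ratios.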
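Because the crop box spans the full frame height in the landscape-to-portrait case, only its left and right edges change. The sketch below shows one way to resize the box in the direction of motion; the function name, the sign convention for `delta`, and the clamping to the frame width are assumptions.

```python
def resize_portrait_crop(x0, x1, dx, delta, frame_w):
    """Resize a full-height crop box [x0, x1] in the direction of motion.

    dx: horizontal displacement of the target (positive means moving right).
    delta: change in width in pixels; positive expands, negative shrinks.
    """
    if dx >= 0:
        if delta >= 0:
            x1 += delta    # moving right, expand: push the right edge out
        else:
            x0 -= delta    # moving right, shrink: pull the left edge in
    else:
        if delta >= 0:
            x0 -= delta    # moving left, expand: push the left edge out
        else:
            x1 += delta    # moving left, shrink: pull the right edge in
    # Keep the box inside the frame.
    return max(0, x0), min(frame_w, x1)


grow_right = resize_portrait_crop(400, 1000, dx=5, delta=100, frame_w=1920)
shrink_right = resize_portrait_crop(400, 1000, dx=5, delta=-100, frame_w=1920)
grow_left = resize_portrait_crop(400, 1000, dx=-5, delta=100, frame_w=1920)
```

In each case the edge on the trailing side of the motion is the one that gives way when shrinking, so the box follows the target object rather than cutting it off.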
S1405,根据裁剪框裁剪第一视频。
电子设备或服务器根据裁剪框裁剪第一视频以得到裁剪后的视频帧,裁剪后的视频帧的纵向长度相同,横向长度可以不同。
S1406,生成第二视频。
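After obtaining the cropped video frames, the electronic device or the server may resample them so that the resampled video frames have the same horizontal length. The electronic device then generates the second video from the resampled video frames of equal horizontal length, and the second video is a portrait video.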
例如,如图9所示,电子设备或服务器经过裁剪得到视频帧#4、视频帧#5和视频帧#6,电子设备或服务器通过对上述视频帧进行重采样处理以得到横向长度相同的视频帧#7、视频帧#8和视频帧#9。
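In the embodiments of this application, the electronic device may convert a landscape video into a portrait video. The resulting portrait video is centered on the target object, and because the crop boxes applied to the original video differ in size, the field of view changes, which better highlights the target object in the original video and improves user experience.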
图16示出了本申请实施例提供的视频处理方法的示意性流程图,如图16所示,该方法包括:
S1601,获取第一视频。
在一些实施例中,电子设备在播放视频时可以获取第一视频,该第一视频可以是在线视频,或可以是本地视频。
在一些实施例中,用户可以上传第一视频以编辑该第一视频,从而电子设备可以获取第一视频。
第一视频包括N个视频帧,N≥2且为整数。该N个视频帧包括第一视频帧和第二视频帧。第一视频帧和第二视频帧包括至少一个对象,该至少一个对象包括第一目标对象,该第一目标对象可以是人物、动物、物体、动作等。
当第一目标对象是人物、动物、物体时,该第一视频帧和该第二视频帧包括第一目标对象可以理解为该第一视频帧和第二视频帧包括该人物、动物、物体。
当第一目标对象是动作时,该第一视频帧和该第二视频帧包括该第一目标对象可以理解为该第一视频帧和该第二视频帧组成的视频呈现的内容是该动作。
S1602,响应于用户第一操作,第一操作为选择第一目标对象的操作。
电子设备可以响应于用户的第一操作,获取第二视频,该第一操作为选择第一目标对象的操作。
例如,如图4所示的GUI,电子设备响应于用户点击控件402的操作,标记视频#1中的对象,然后响应于用户选择目标对象的操作,可以获取视频#2并播放视频#2。
再例如,如图5所示的GUI,电子设备检测到用户将电子设备由横屏变为竖屏,确定视频#1中的目标对象,然后获取视频#2并播放视频#2。
S1603,获取第二视频,第二视频包括第三视频帧和第四视频帧,第三视频帧和第四视频帧包括第一目标对象。
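The second video obtained by the electronic device includes a third video frame and a fourth video frame, both of which include the first target object. The third video frame is obtained by cropping according to the first target object in the first video frame, and the fourth video frame is obtained by cropping according to the first target object in the second video frame. The third video frame and the fourth video frame have the same size.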
可选的,在一些实施例中,该至少一个对象还包括第二目标对象,第三视频帧和第四视频帧不包括第二目标对象。
第一视频帧和第二视频帧包括第一目标对象和第二目标对象,由于用户仅选择了第一目标对象,则第三视频帧和第四视频帧中可以不包括第二目标对象。
可选的,在一些实施例中,第一视频帧的第一目标对象与第三视频帧的第一目标对象的尺寸不同。
可选的,在一些实施例中,第三视频帧和第四视频帧中的第一目标对象的尺寸不同。
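For example, the first target object has the same size in the first video frame and in the second video frame, but its inter-frame speed in the first video frame differs from its inter-frame speed in the second video frame, so the crop box of the first video frame and the crop box of the second video frame may differ in size. The third video frame and the fourth video frame are generated by resampling the cropped first video frame and the cropped second video frame respectively, and have the same size. The first target object therefore has different sizes in the third video frame and the fourth video frame.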
例如,如图7或8所示,视频帧#1、视频帧#2和视频帧#3中的人物的尺寸相同,电子设备或服务器经过裁剪得到视频帧#4、视频帧#5和视频帧#6,电子设备或服务器通过对上述视频帧进行重采样处理以得到尺寸相同的视频帧#7、视频帧#8和视频帧#9,但视频帧#7、视频帧#8和视频帧#9中的人物的尺寸不同。
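For example, the first target object has the same size in the first video frame and in the second video frame, the electronic device determines according to the video understanding algorithm that the crop box of the first video frame and the crop box of the second video frame differ in size, and the third video frame and the fourth video frame have the same size. The first target object therefore has different sizes in the third video frame and the fourth video frame.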
可选的,在一些实施例中,第三视频帧和第四视频帧中的第一目标对象的尺寸相同。
示例性的,第一视频帧中的第一目标对象和第二视频帧中的第一目标对象的尺寸不同,第一视频帧中的第一目标对象的帧间速度和第二视频帧中的第一目标对象的帧间速度不同,则第一视频帧的裁剪框的尺寸与第二视频帧的裁剪框的尺寸不同,且第三视频帧和第四视频帧是相同尺寸的视频帧,则第三视频帧和第四视频帧中的第一目标对象的尺寸可能相同。
示例性的,第一视频帧中的第一目标对象和第二视频帧中的第一目标对象的尺寸不同,电子设备根据视频理解算法确定第一视频帧的裁剪框的尺寸与第二视频帧的裁剪框的尺寸不同,且第三视频帧和第四视频帧是相同尺寸的视频帧,则第三视频帧和第四视频帧中的第一目标对象的尺寸可能相同。
可选的,在一些实施例中,第二视频的视频帧的数量和第一视频的视频帧的数量相同。
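Optionally, in some embodiments, the number of video frames of the second video is determined according to the number of video frames of the first video and the frame-sampling interval, that is, the second video has fewer video frames than the first video.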
可选的,在一些实施例中,该N个视频帧包括第一目标对象,该第二视频包括M个视频帧,该M个视频帧包括第一目标对象,M≤N且M为整数。
一种可能的实现方式:本申请实施例中的抽帧间隔可以是用户设置的。
一种可能的实现方式:本申请实施例中的抽帧间隔可以是系统自定义的。
一种可能的实现方式:本申请实施例中的抽帧间隔可以根据第一目标对象在N个视频帧中的相邻两个视频帧之间的帧间速度确定的。
可选的,在一些实施例中,第一视频和第二视频的时长不同。
例如,第一视频的时长为10分钟,第二视频的时长为2分钟。
可选的,在一些实施例中,第一视频为横屏视频,第二视频为竖屏视频,第一视频帧和第二视频帧为横屏视频帧,第三视频帧和第四视频帧为竖屏视频帧,该第三视频帧和第四视频帧的高度(或纵向长度)不同。
例如,如图9所示,视频帧#7、视频帧#8和视频帧#9为竖屏视频帧且视频帧#7、视频帧#8和视频帧#9的纵向长度不同。
S1604,播放第二视频。
电子设备获取第二视频后,可以播放该第二视频。
本申请实施例中,电子设备可以根据用户的操作确定原视频中的第一目标对象,并播放新的视频,该新的视频以第一目标对象为中心,可以更好的凸显原视频中的第一目标对象,能够提升用户体验。
可选的,在一些实施例中,该方法还包括:
播放第一视频;
S1604,播放第二视频,包括:
当检测到第一目标对象,全屏播放第二视频;
在全屏播放所述第二视频之后且未检测到所述第一目标对象,该方法还包括:
继续播放第一视频。
以图4所示的GUI为例,电子设备获取第一视频后,可以在界面401播放第一视频,当用户点击控件402后,电子设备可以识别第一视频的对象并响应于用户的选择确定目标对象,可以在界面401播放视频#2,当在界面401播放完视频#2后且未检测到目标对象,则继续播放视频#1。
可选的,在一些实施例中,该方法还包括:
显示第一界面,该第一界面显示第一窗口和第二窗口,其中第一窗口显示第一视频帧,第二窗口显示第三视频帧。
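For example, as shown in (b) of FIG. 6, the electronic device displays a window 601 and a window 602. The window 601 may display the first video frame, and the window 602 may display the third video frame.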
可选的,在一些实施例中,该方法还包括:
显示第一界面,该第一界面包括第一窗口,在第一窗口播放第一视频;
S1604,播放第二视频,包括:
当检测到第一目标对象,显示第二界面,第二界面包括第一窗口和第二窗口,其中第一窗口播放第一视频,第二窗口播放第二视频;
在显示第二界面之后,该方法还包括:
当未检测到第一目标对象时,显示第三界面,第三界面包括第一窗口,在第一窗口中继续播放第一视频,且第三界面不包括第二窗口。
可以理解的是,第一窗口为播放视频#1的窗口,第二窗口为播放视频#2的窗口,第一窗口可以是全屏窗口,则第一窗口的面积与界面的面积相同,第二窗口可以是小窗口,或悬浮窗,可以显示在第一窗口之上。
例如,以图3中的(a)和(d)所示,电子设备首先在界面中显示窗口301播放视频#1,当检测到目标对象,在界面中显示窗口301和窗口305,其中窗口301播放视频#1,窗口305播放视频#2。当电子设备在窗口305播放完视频#2后,若未检测到目标对象,可以在界面中仅显示窗口301以播放视频#1。
可选的,在一些实施例中,在S1602,响应于用户第一操作之前,该方法还包括:
显示第三界面,该第三界面包括第五窗口,该第五窗口包括第一视频中的至少一个对象。
例如,以图4所示的GUI为例,电子设备获取第一视频后,可以在界面(或窗口)401播放第一视频,当用户点击控件402后,电子设备可以识别第一视频的对象并在界面(或窗口)403显示识别到的对象。
可选的,在一些实施例中,第一操作为在第五窗口选择第一目标对象的操作。
例如,以图4所示的GUI为例,电子设备获取第一视频后,可以在界面(或窗口)401播放第一视频,当用户点击控件402后,电子设备可以识别第一视频的对象并在界面(或窗口)403显示识别到的对象,用户可以在界面(或窗口)403选择目标对象。
可选的,在一些实施例中,该至少一个对象还包括第二目标对象,该方法还包括:
响应于用户第二操作,获取第三视频,其中第三视频包括第五视频帧和第六视频帧,第五视频帧包括第一目标对象和/或第二目标对象,第六视频帧包括第一目标对象和/或第二目标对象,第五视频帧为根据第一视频帧中的第一目标对象和/或第二目标对象裁剪获得,第六视频帧为根据第二视频帧中的第一目标对象和/或第二目标对象裁剪获得。
该第二操作为用于生成凸显目标对象的操作。例如,如图3所示,用户点击控件304的操作。
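When the first video frame and the second video frame include the first target object and the second target object, the electronic device may determine the cropped video frames according to the video understanding algorithm, or according to the inter-frame speeds and the priorities of the first target object and the second target object. A cropped video frame may include both the first target object and the second target object, or may include only one of them.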
例如,如图13中的(e)所示,视频帧#1裁剪后的视频帧包括目标对象#1和目标对象#2,视频帧#2裁剪后的视频帧包括目标对象#1但不包括目标对象#2。
再例如,如图13中的(f)和(g)所示,视频帧#1和视频帧#2裁剪后得到的视频帧均包括目标对象#1和目标对象#2。
上述主要从电子设备以及服务器的角度对本申请实施例提供的一种视频处理的方法进行了介绍。可以理解的是,电子设备以及服务器为了实现上述功能,其包含了执行各个功能相应的硬件结构和/或软件模块。本领域技术人员应该很容易意识到,结合本文中所公开的实施例描述的各示例的算法步骤,本申请能够以硬件或硬件和计算机软件的结合形式来实现。某个功能究竟以硬件还是计算机软件驱动硬件的方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。
本申请实施例可以根据上述方法示例对电子设备以及服务器中的处理器进行功能模块的划分,例如,可以对应各个功能划分各个功能模块,也可以将两个或两个以上的功能集成在一个处理模块中。上述集成的模块既可以采用硬件的形式实现,也可以采用软件功能模块的形式实现。需要说明的是,本申请实施例中对模块的划分是示意性的,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式。
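In the case where each functional module is divided according to each function, FIG. 17 shows a schematic composition diagram of an electronic device provided by an embodiment of this application. As shown in FIG. 17, the electronic device 1700 includes an obtaining module 1710 and a video processing module 1720.
The obtaining module 1710 is configured to obtain the first video.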
视频处理模块1720,用于响应于用户的操作,获取第二视频。
视频处理模块1720,还用于播放第二视频。
视频处理模块1720,还用于响应于用户第二操作,获取第三视频。
可选的,在一些实施例中,第二视频包括第三视频帧和第四视频帧,第三视频帧中的第一目标对象和第四视频帧中的第一目标对象的尺寸不同。
可选的,在一些实施例中,第一视频包括第一视频帧和第二视频帧,第一视频帧中的第一目标对象和第二视频帧中的第一目标对象的尺寸相同,第一视频帧中的第一目标对象的帧间速度和第二视频帧中的第一目标对象的帧间速度不同。
可选的,在一些实施例中,第二视频包括第三视频帧和第四视频帧,第三视频帧中的第一目标对象和第四视频帧中的第一目标对象的尺寸相同。
可选的,在一些实施例中,第一视频包括第一视频帧和第二视频帧,第一视频帧中的第一目标对象和第二视频帧中的第一目标对象的尺寸不同,第一视频帧中的第一目标对象的帧间速度和第二视频帧中的第一目标对象的帧间速度不同。
视频处理模块1720,还用于显示第一界面,该第一界面显示第一窗口和第二窗口,其中第一窗口显示第一视频帧,第二窗口显示第三视频帧。
视频处理模块1720,还用于:
播放第一视频;
视频处理模块1720,具体用于当检测到所述第一目标对象,全屏播放所述第二视频;
视频处理模块1720,还用于在全屏播放所述第二视频之后且未检测到所述第一目标对象时继续播放所述第一视频。
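Optionally, in some embodiments, the number of video frames of the second video is determined according to the number of video frames of the first video and the frame-sampling interval.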
视频处理模块1720,还用于根据第一目标对象的帧间速度确定抽帧间隔。
可选的,在一些实施例中,第一视频和第二视频的时长不同。
可选的,在一些实施例中,第二视频包括第三视频帧和第四视频帧,第三视频帧和第四视频帧为竖屏视频帧,第三视频帧的纵向长度与第四视频帧的纵向长度不同。
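In the case where each functional module is divided according to each function, FIG. 18 shows a schematic composition diagram of a server provided by an embodiment of this application. As shown in FIG. 18, the server 1800 includes a transceiver module 1810 and a video processing module 1820.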
收发模块1810,用于获取第一视频。
视频处理模块1820,用于响应于用户的第一操作,生成第二视频。
收发模块1810,还用于向电子设备发送第二视频。
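In some embodiments, the video processing module 1820 is specifically configured to: determine a group of crop boxes of the first video in response to the first operation of the user;
generate the second video according to the first video and the group of crop boxes of the first video.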
可选的,在一些实施例中,视频处理模块1820,还用于响应于用户的第二操作,生成第三视频。
可选的,在一些实施例中,第二视频包括第三视频帧和第四视频帧,第三视频帧中的第一目标对象和第四视频帧中的第一目标对象的尺寸不同。
可选的,在一些实施例中,第一视频包括第一视频帧和第二视频帧,第一视频帧中的第一目标对象和第二视频帧中的第一目标对象的尺寸相同,第一视频帧中的第一目标对象的帧间速度和第二视频帧中的第一目标对象的帧间速度不同。
可选的,在一些实施例中,第二视频包括第三视频帧和第四视频帧,第三视频帧中的第一目标对象和第四视频帧中的第一目标对象的尺寸相同。
可选的,在一些实施例中,第一视频包括第一视频帧和第二视频帧,第一视频帧中的第一目标对象和第二视频帧中的第一目标对象的尺寸不同,第一视频帧中的第一目标对象的帧间速度和第二视频帧中的第一目标对象的帧间速度不同。
需要说明的是,上述方法实施例涉及的各步骤的所有相关内容均可以援引到对应功能模块的功能描述,在此不再赘述。本申请实施例提供的电子设备,用于执行上述视频处理的方法,因此可以达到与上述相同的效果。
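An embodiment of this application further provides an electronic device, including one or more processors, a memory, an application, and one or more computer programs. The foregoing components may be connected through one or more communication buses. The one or more computer programs are stored in the memory and configured to be executed by the one or more processors. The one or more computer programs include instructions, and the instructions may be used to cause the electronic device to perform the steps of the electronic device in the foregoing embodiments.
For example, the processor may specifically be the processor 110 shown in FIG. 1, and the memory may specifically be the internal memory 121 shown in FIG. 1 and/or an external memory connected to the electronic device.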
本申请实施例还提供一种芯片,所述芯片包括处理器和通信接口,所述通信接口用于接收信号,并将所述信号传输至所述处理器,所述处理器处理所述信号,使得如前文中任一种可能的实现方式中所述的视频处理的方法被执行。
本实施例还提供一种计算机可读存储介质,该计算机可读存储介质中存储有计算机指令,当该计算机指令在电子设备上运行时,使得电子设备执行上述相关方法步骤实现上述实施例中的视频处理的方法。
本实施例还提供了一种计算机程序产品,当该计算机程序产品在计算机上运行时,使得计算机执行上述相关步骤,以实现上述实施例中的视频处理的方法。
以上实施例中所用,根据上下文,术语“当…时”或“当…后”可以被解释为意思是“如果…”或“在…后”或“响应于确定…”或“响应于检测到…”。类似地,根据上下文,短语“在确定…时”或“如果检测到(所陈述的条件或事件)”可以被解释为意思是“如果确定…”或“响应于确定…”或“在检测到(所陈述的条件或事件)时”或“响应于检测到(所陈述的条件或事件)”。
本领域普通技术人员可以意识到,结合本文中所公开的实施例描述的各示例的单元及算法步骤,能够以电子硬件、或者计算机软件和电子硬件的结合来实现。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。
所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的系统、装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。
在本申请所提供的几个实施例中,应该理解到,所揭露的系统、装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。
所述功能如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本申请各个实施例所述方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(Read-Only Memory,ROM)、随机存取存储器(Random Access Memory,RAM)、磁碟或者光盘等各种可以存储程序代码的介质。
以上所述,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到变化或替换,都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以所述权利要求的保护范围为准。

Claims (19)

  1. 一种视频处理的方法,其特征在于,所述方法包括:
    获取第一视频,所述第一视频包括N个视频帧,N≥2且为整数,其中,所述N个视频帧包括第一视频帧和第二视频帧,所述第一视频帧和所述第二视频帧包括至少一个对象,所述至少一个对象包括第一目标对象;
    响应于用户第一操作,所述第一操作为选择所述第一目标对象的操作;
    获取第二视频,其中,所述第二视频包括第三视频帧和第四视频帧,所述第三视频帧和所述第四视频帧包括所述第一目标对象,所述第三视频帧为根据第一视频帧中所述第一目标对象裁剪获得的,所述第四视频帧为根据所述第二视频帧中所述第一目标对象裁剪获得的;
    播放所述第二视频。
  2. 根据权利要求1所述的方法,其特征在于,所述至少一个对象还包括第二目标对象,所述方法还包括:
    响应于用户第二操作,所述第二操作用于选择所述第一目标对象和所述第二目标对象;
    获取第三视频,其中所述第三视频包括第五视频帧和第六视频帧,所述第五视频帧包括所述第一目标对象和/或所述第二目标对象,所述第六视频帧包括所述第一目标对象和/或所述第二目标对象,所述第五视频帧为根据所述第一视频帧中的所述第一目标对象和/或所述第二目标对象裁剪获得,所述第六视频帧为根据所述第二视频帧中的所述第一目标对象和/或所述第二目标对象裁剪获得。
  3. 根据权利要求1所述的方法,其特征在于,所述至少一个对象还包括第二目标对象,所述第三视频帧和所述第四视频帧不包括所述第二目标对象。
  4. 根据权利要求1至3中任一项所述的方法,其特征在于,所述第三视频帧中的所述第一目标对象与所述第一视频帧中的所述第一目标对象的尺寸不同。
  5. 根据权利要求1至4中任一项所述的方法,其特征在于,所述第三视频帧中的所述第一目标对象和所述第四视频帧中的所述第一目标对象的尺寸不同。
  6. 根据权利要求5所述的方法,其特征在于,所述第一视频帧中的所述第一目标对象和所述第二视频帧中的所述第一目标对象的尺寸相同,所述第一视频帧中的所述第一目标对象的帧间速度和所述第二视频帧中的所述第一目标对象的帧间速度不同。
  7. 根据权利要求1至6中任一项所述的方法,其特征在于,所述方法还包括:
    显示第一界面,所述第一界面显示第一窗口和第二窗口,其中所述第一窗口显示所述第一视频帧,所述第二窗口显示所述第三视频帧。
  8. 根据权利要求1至6中任一项所述的方法,其特征在于,在播放所述第二视频之前,所述方法还包括:
    全屏播放所述第一视频;
    所述播放所述第二视频,包括:
    当检测到所述第一目标对象,全屏播放所述第二视频;
    在全屏播放所述第二视频之后,所述方法还包括:
    当未检测到所述第一目标对象时,继续全屏播放所述第一视频。
  9. 根据权利要求1至6中任一项所述的方法,其特征在于,所述方法还包括:
    显示第一界面,所述第一界面包括第一窗口,在所述第一窗口中播放所述第一视频;
    所述播放所述第二视频,包括:
    当检测到所述第一目标对象,显示第二界面,所述第二界面包括第一窗口和第二窗口,其中所述第一窗口播放所述第一视频,所述第二窗口播放所述第二视频;
    在显示第二界面之后,所述方法还包括:
    当未检测到所述第一目标对象时,显示第三界面,所述第三界面包括第一窗口,在所述第一窗口中继续播放所述第一视频,且所述第三界面不包括所述第二窗口。
  10. 根据权利要求1至9任一所述的方法,其特征在于,在所述响应于用户第一操作之前,所述方法还包括:
    显示第三界面,所述第三界面包括第五窗口,所述第五窗口包括所述第一视频中的所述至少一个对象。
  11. 根据权利要求10所述的方法,其特征在于,所述第一操作为在所述第五窗口选择所述第一目标对象的操作。
  12. 根据权利要求1至11中任一项所述的方法,其特征在于,所述N个视频帧包括所述第一目标对象,所述第二视频包括M个视频帧,所述M个视频帧包括所述第一目标对象,M≤N且M为整数。
  13. 根据权利要求12所述的方法,其特征在于,所述方法还包括:
    根据所述第一目标对象在所述N个视频帧中的相邻两个视频帧之间的帧间速度确定抽帧间隔。
  14. 根据权利要求1至11中任一项所述的方法,其特征在于,第一视频为横屏视频,第二视频为竖屏视频,所述第一视频帧和所述第二视频帧为横屏视频帧,所述第三视频帧和所述第四视频帧为竖屏视频帧。
  15. 根据权利要求14所述的方法,其特征在于,所述第三视频帧和所述第四视频帧的高度不同。
  16. 一种电子设备,其特征在于,包括一个或多个处理器;一个或多个存储器;所述一个或多个存储器存储有一个或多个计算机程序,所述一个或多个计算机程序包括指令,当所述指令被所述一个或多个处理器执行时,使得如权利要求1至15中任一项所述的方法被执行。
  17. 一种芯片,其特征在于,所述芯片包括处理器和通信接口,所述通信接口用于接收信号,并将所述信号传输至所述处理器,所述处理器处理所述信号,使得如权利要求1至15中任一项所述的方法被执行。
  18. 一种计算机可读存储介质,其特征在于,所述计算机可读存储介质中存储有计算机指令,当所述计算机指令在计算机上运行时,使得如权利要求1至15中任一项所述的方法被执行。
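  19. A computer program product containing instructions, wherein when the computer program product runs on a computer, the computer is caused to perform the method according to any one of claims 1 to 15.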
PCT/CN2023/134290 2022-11-30 2023-11-27 一种视频处理的方法以及电子设备 Ceased WO2024114569A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP23896720.2A EP4618561A4 (en) 2022-11-30 2023-11-27 VIDEO PROCESSING METHOD AND ELECTRONIC DEVICE
US19/223,863 US20250291467A1 (en) 2022-11-30 2025-05-30 Video Processing Method and Electronic Device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211519957.8A CN118118734A (zh) 2022-11-30 2022-11-30 一种视频处理的方法以及电子设备
CN202211519957.8 2022-11-30

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US19/223,863 Continuation US20250291467A1 (en) 2022-11-30 2025-05-30 Video Processing Method and Electronic Device

Publications (1)

Publication Number Publication Date
WO2024114569A1 true WO2024114569A1 (zh) 2024-06-06

Family

ID=91220007

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/134290 Ceased WO2024114569A1 (zh) 2022-11-30 2023-11-27 一种视频处理的方法以及电子设备

Country Status (4)

Country Link
US (1) US20250291467A1 (zh)
EP (1) EP4618561A4 (zh)
CN (1) CN118118734A (zh)
WO (1) WO2024114569A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN121217980A (zh) * 2024-06-26 2025-12-26 Beijing Zitiao Network Technology Co Ltd Video processing method, apparatus, device, and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
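WO2019127868A1 (zh) * 2017-12-29 2019-07-04 Guangzhou UC Network Technology Co Ltd Method, apparatus, and terminal for switching between landscape and portrait screen modes
CN112135188A (zh) * 2020-09-16 2020-12-25 MIGU Culture Technology Co Ltd Video cropping method, electronic device, and computer-readable storage medium
CN113438436A (zh) * 2020-03-23 2021-09-24 Alibaba Group Holding Ltd Video playback method, video conferencing method, live streaming method, and related devices
CN114724055A (zh) * 2021-01-05 2022-07-08 Huawei Technologies Co Ltd Video switching method, apparatus, storage medium, and device
CN114816210A (zh) * 2019-06-25 2022-07-29 Huawei Technologies Co Ltd Full-screen display method and device for a mobile terminal
CN115174994A (zh) * 2021-04-01 2022-10-11 Tencent Technology (Shenzhen) Co Ltd Video processing method, apparatus, computer device, and storage medium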

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10623662B2 (en) * 2016-07-01 2020-04-14 Snap Inc. Processing and formatting video for interactive presentation
US10084970B2 (en) * 2016-12-05 2018-09-25 International Institute Of Information Technology, Hyderabad System and method for automatically generating split screen for a video of a dynamic scene
CN113014793A (zh) * 2019-12-19 2021-06-22 Huawei Technologies Co Ltd Video processing method and electronic device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP4618561A1

Also Published As

Publication number Publication date
CN118118734A (zh) 2024-05-31
EP4618561A4 (en) 2026-01-28
EP4618561A1 (en) 2025-09-17
US20250291467A1 (en) 2025-09-18

Similar Documents

Publication Publication Date Title
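CN109496423B (zh) Image display method in a shooting scenario and electronic device
CN112717370B (zh) Control method and electronic device
CN113760427B (zh) Method for displaying page elements and electronic device
CN111768416B (zh) Photo cropping method and apparatus
WO2020259452A1 (zh) Full-screen display method and device for a mobile terminal
WO2021000881A1 (zh) Split-screen method and electronic device
WO2023280021A1 (zh) Method for generating a theme wallpaper and electronic device
CN113099146B (zh) Video generation method, apparatus, and related device
CN111526314A (zh) Video shooting method and electronic device
WO2021104485A1 (zh) Shooting method and electronic device
CN111768352B (zh) Image processing method and apparatus
CN110830645B (zh) Operation method, electronic device, and computer storage medium
WO2020113534A1 (zh) Method for capturing a long-exposure image and electronic device
WO2022156473A1 (zh) Video playback method and electronic device
WO2022228010A1 (zh) Method for generating a cover and electronic device
WO2021204103A1 (zh) Photo preview method, electronic device, and storage medium
CN114079725A (zh) Video stabilization method, terminal device, and computer-readable storage medium
WO2023036084A1 (zh) Image processing method and related apparatus
CN115115679A (zh) Image registration method and related device
CN114257775B (zh) Method and apparatus for adding video special effects, and terminal device
CN110704145A (zh) Hot zone adjustment method and apparatus, electronic device, and storage medium
US20250291467A1 (en) Video Processing Method and Electronic Device
US20250350829A1 (en) Video Recording Method and Electronic Device
WO2024152676A1 (zh) Window management method and electronic device
WO2024109198A1 (zh) Window adjustment method and related apparatus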

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 23896720; Country of ref document: EP; Kind code of ref document: A1
WWE WIPO information: entry into national phase. Ref document number: 2023896720; Country of ref document: EP
ENP Entry into the national phase. Ref document number: 2023896720; Country of ref document: EP; Effective date: 20250613
NENP Non-entry into the national phase. Ref country code: DE
WWP WIPO information: published in national office. Ref document number: 2023896720; Country of ref document: EP