WO2022214101A1 - Video generation method and apparatus, electronic device, and storage medium - Google Patents

Video generation method and apparatus, electronic device, and storage medium

Info

Publication number
WO2022214101A1
Authority
WO
WIPO (PCT)
Prior art keywords
video
interest
original
preset template
segment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2022/086090
Other languages
English (en)
French (fr)
Inventor
欧桐桐
邱博恒
何超
李世楠
宋月嵘
汪志成
任士博
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Zitiao Network Technology Co Ltd
Original Assignee
Beijing Zitiao Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Zitiao Network Technology Co Ltd
Priority to EP22784167.3A (patent EP4322521A4)
Publication of WO2022214101A1
Priority to US18/483,289 (patent US12592260B2)


Classifications

    • GPHYSICS
    • G11INFORMATION STORAGE
    • G11BINFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/02Editing, e.g. varying the order of information signals recorded on, or reproduced from, record carriers
    • G11B27/031Electronic editing of digitised analogue information signals, e.g. audio or video signals
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00Details of television systems
    • H04N5/222Studio circuitry; Studio devices; Studio equipment
    • H04N5/262Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects ; Cameras specially adapted for the electronic generation of special effects
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/41Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/46Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • GPHYSICS
    • G11INFORMATION STORAGE
    • G11BINFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/10Indexing; Addressing; Timing or synchronising; Measuring tape travel
    • G11B27/34Indicating arrangements 
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/41Structure of client; Structure of client peripherals
    • H04N21/422Input-only peripherals, i.e. input devices connected to specially adapted client devices, e.g. global positioning system [GPS]
    • H04N21/4223Cameras
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44008Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/472End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • H04N21/47205End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content for manipulating displayed content, e.g. interacting with MPEG-4 objects, editing locally
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85Assembly of content; Generation of multimedia applications
    • H04N21/854Content authoring
    • H04N21/8549Creating video summaries, e.g. movie trailer
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60Control of cameras or camera modules
    • H04N23/64Computer-aided capture of images, e.g. transfer from script file into camera, check of taken image quality, advice or proposal for image composition or decision on when to take image
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00Details of television systems
    • H04N5/222Studio circuitry; Studio devices; Studio equipment
    • H04N5/262Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects ; Cameras specially adapted for the electronic generation of special effects
    • H04N5/2624Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects ; Cameras specially adapted for the electronic generation of special effects for obtaining an image which is composed of whole input images, e.g. splitscreen
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00Details of television systems
    • H04N5/222Studio circuitry; Studio devices; Studio equipment
    • H04N5/262Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects ; Cameras specially adapted for the electronic generation of special effects
    • H04N5/265Mixing
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60Control of cameras or camera modules
    • H04N23/62Control of parameters via user interfaces

Definitions

  • the embodiments of the present disclosure relate to the field of computer technologies, for example, to a video generation method, apparatus, electronic device, and storage medium.
  • Embodiments of the present disclosure provide a video generation method, device, electronic device, and storage medium, which realize automatic editing and synthesis of videos, and improve the processing effect of videos.
  • an embodiment of the present disclosure provides a video generation method, the method comprising:
  • an embodiment of the present disclosure further provides a video generation device, the device comprising:
  • a shooting module configured to receive a trigger operation acting on the video shooting page, and to shoot the original video in response to the trigger operation;
  • a determining module configured to determine the video segment of interest in the original video
  • the processing module is configured to perform video synthesis processing based on the video segment of interest and the original video to obtain a target video.
  • an embodiment of the present disclosure further provides a device, the device comprising:
  • a storage apparatus configured to store at least one program;
  • when the at least one program is executed by the at least one processor, the at least one processor implements the video generation method according to any one of the embodiments of the present disclosure.
  • an embodiment of the present disclosure further provides a storage medium containing computer-executable instructions which, when executed by a computer processor, are used to perform the video generation method according to any of the embodiments of the present disclosure.
  • FIG. 1 is a schematic flowchart of a video generation method according to Embodiment 1 of the present disclosure
  • FIG. 2 is a schematic interface diagram of video shooting when a user performs a task setting according to Embodiment 1 of the present disclosure
  • FIG. 3 is a schematic flowchart of a method for generating a video according to Embodiment 2 of the present disclosure
  • FIG. 4 is a schematic flowchart of a video generation method provided in Embodiment 3 of the present disclosure.
  • FIG. 5 is a schematic diagram of a title animation image interface provided by Embodiment 3 of the present disclosure.
  • FIGS. 6a-6e are schematic diagrams of image interfaces of an in-film animation according to Embodiment 3 of the present disclosure.
  • FIG. 7 is a schematic diagram of an end animation image interface provided by Embodiment 3 of the present disclosure.
  • FIG. 8 is a schematic structural diagram of a video generating apparatus according to Embodiment 4 of the present disclosure.
  • FIG. 9 is a schematic structural diagram of an electronic device according to Embodiment 5 of the present disclosure.
  • the term “including” and variations thereof are open-ended inclusions, i.e., “including but not limited to”.
  • the term “based on” is “based at least in part on.”
  • the term “one embodiment” means “at least one embodiment”; the term “another embodiment” means “at least one additional embodiment”; the term “some embodiments” means “at least some embodiments”. Relevant definitions of other terms will be given in the description below.
  • FIG. 1 is a schematic flowchart of a video generation method provided in Embodiment 1 of the present disclosure. The method can be applied to automatically edit and synthesize an original video shot by a user, so as to obtain a target video that carries richer information, is more complete, and is more engaging. The entire video generation process is completed automatically, without manual operation by the user, which improves the video processing effect and efficiency, helps improve the user experience, and enhances user stickiness of the application product.
  • the video generation method may be performed by a video generation apparatus, which may be implemented in the form of software and/or hardware.
  • the video generation method provided by this embodiment includes the following steps:
  • Step 110 Receive a trigger operation acting on the video shooting page, and shoot an original video in response to the trigger operation.
  • for example, the trigger operation may be received on a target shooting control on the video shooting page, and the original video is shot in response to that trigger operation. When the user clicks the target shooting control, the camera is started and the video within the shooting range of the camera is captured; when the user clicks the target shooting control again, the shooting ends.
  • the original video may be a video obtained by shooting a user, or a video obtained by shooting a scene or an object.
  • the original video includes a video obtained by photographing a picture of a user performing a set task.
  • the set task may be any form of task, for example, the user himself or his friend imitates a funny video, the user sings a song, or the user dances a hot dance.
  • the set task may also include a tongue twister challenge game, and/or a quiz game, and/or a video imitation game, etc.
  • This embodiment does not limit the content and execution method of the set task.
  • taking the tongue twister challenge game as an example, the user is required to repeat a certain tongue twister clearly and fluently within a limited time, and video is recorded while the user repeats it. While recording the user's real-time performance, it is also possible to analyze, based on the original video, whether the user's speech is clear and accurate and whether the time taken is shorter than that of other users, which enhances the fun and entertainment of the game.
  • the set task may include at least one sub-task.
  • the prompt information of the set task may be displayed on the video shooting page to guide the user to perform the set task.
  • if the set task includes multiple subtasks, prompt information of the multiple subtasks may be displayed sequentially in the non-shooting area of the current interface according to the difficulty level of the multiple subtasks.
  • for example, the set task is a tongue twister that includes two subtasks, a first tongue twister and a second tongue twister. The difficulty of the second tongue twister is greater than that of the first, so the prompt information of the first tongue twister is displayed on the video shooting page before that of the second, thereby increasing user stickiness.
  • when the prompt information of the first tongue twister is displayed on the video shooting page, the user is guided to perform the first tongue twister task and the picture of the user performing it is captured at the same time; the prompt information of the second tongue twister is then displayed on the video shooting page.
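
The difficulty-ordered display of subtask prompts described above can be sketched as follows. This is a minimal illustration only: the `difficulty` field, its numeric scale, and the function name `prompt_order` are assumptions, since the source does not say how difficulty is represented.

```python
def prompt_order(subtasks):
    """Return subtask prompts from easiest to hardest.

    Each subtask is a dict with a "prompt" string and a numeric
    "difficulty" (hypothetical field; larger means harder).
    """
    ordered = sorted(subtasks, key=lambda s: s["difficulty"])
    return [s["prompt"] for s in ordered]


subtasks = [
    {"prompt": "second tongue twister", "difficulty": 2},
    {"prompt": "first tongue twister", "difficulty": 1},
]
print(prompt_order(subtasks))
```

Python's `sorted` is stable, so subtasks with equal difficulty keep their original relative order.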
  • the prompt information may include the name, introduction and/or countdown identifier of the setting task.
  • for example, the content details of the tongue twister, "Red Phoenix Pink Phoenix, Red Pink Phoenix Flower Phoenix" 210, and the countdown logo "2s" 220 are displayed in the non-shooting area of the current interface, where reference number 230 denotes the shooting area.
  • the content details and countdown logo of the next, more difficult tongue twister are then displayed automatically, for example, "Niuniu pulls Niuniu, Niuniu pulls Niuniu".
  • portrait (vertical screen) mode is used to shoot the video of the user performing the set task, so as to obtain an original portrait video.
  • Step 120 Determine the video segment of interest in the original video.
  • the video segment of interest in the original video may refer to a video segment that includes a preset expression.
  • the preset expression may be a laughing or crying expression.
  • correspondingly, the video segment of interest may include a video segment with a laughing or crying expression.
  • the preset expression may also be an exaggerated expression, such as laughing out loud or crying in frustration, and correspondingly, the video segment of interest may also be a video segment with such an exaggerated expression.
  • the expression recognition model can be used to perform expression recognition on each frame of the original video, and mark the image frames including the set expressions, so as to obtain the video clips of interest based on the marked image frames. For example, a video segment composed of 20 image frames before and after a marked image frame is intercepted as a video segment of interest.
  • the set expression is, for example, an expression when laughing, an expression on a crying face, or the like.
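
The interception of frames around a marked image frame can be sketched as below. This is a hedged illustration, not the disclosed implementation: the function name, the use of frame indices, and the merging of overlapping ranges are assumptions; only the margin of 20 frames before and after a marked frame comes from the text.

```python
def segments_from_marked_frames(marked, total_frames, margin=20):
    """Return (start, end) frame-index ranges around marked frames.

    Each marked frame contributes the `margin` frames before and after it,
    clipped to the video bounds. Overlapping or adjacent ranges are merged
    (an assumption; the source does not discuss overlaps).
    """
    ranges = []
    for idx in sorted(marked):
        start = max(0, idx - margin)
        end = min(total_frames - 1, idx + margin)
        if ranges and start <= ranges[-1][1] + 1:
            # Extend the previous range instead of emitting a duplicate clip.
            ranges[-1] = (ranges[-1][0], max(ranges[-1][1], end))
        else:
            ranges.append((start, end))
    return ranges


print(segments_from_marked_frames([100, 400], total_frames=1000))
```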
  • Step 130 Perform video synthesis processing based on the video segment of interest and the original video to obtain a target video.
  • the video clips of interest may be used to create images of some highlights, and then the images of the highlights may be used as the opening or ending, and the original video may be used to generate an in-film video with some animation special effects.
  • the original video may be played in the middle of a set template, and animation effects may be added at other positions of the template. For example, if the user's recitation of the current tongue twister is clear and fluent, the animation special effect "You are amazing" can be displayed; if it is not very clear and fluent, the animation special effect "Keep it up" can be displayed, and an animated "microphone" can also be displayed. Finally, the opening, in-film, and ending segments obtained through processing are synthesized and spliced to obtain the target video.
  • the target video can be generated as a landscape video.
  • in the technical solution of this embodiment, a trigger operation acting on the video shooting page is received, and an original video is shot in response to the trigger operation; a video segment of interest in the original video is determined; and video synthesis processing is performed based on the video segment of interest and the original video to obtain the target video. These technical means realize automatic editing and synthesis of the video and improve the processing effect of the video.
  • FIG. 3 is a schematic flowchart of a video generation method according to Embodiment 2 of the present disclosure.
  • this embodiment refines the foregoing step 120 of "determining the video segment of interest in the original video", and provides an optional implementation manner for determining the video segment of interest.
  • the content that is the same as or similar to the above-mentioned embodiment will not be repeated in this embodiment, and reference may be made to the explanation of the above-mentioned embodiment.
  • the method includes:
  • Step 310 Receive a trigger operation acting on the video shooting page, and shoot an original video in response to the trigger operation.
  • Step 320 Determine the video segment of interest in the original video based on image recognition.
  • for example, facial expression recognition is performed on the image frames of the original video based on a facial expression recognition model, and the timestamp of each first image frame that includes the set expression and the expression score corresponding to at least one of the first image frames are recorded;
  • the first image frame whose expression score reaches the set threshold is determined as the second image frame, and the video segment of interest is acquired according to the timestamp of the second image frame.
  • the expression recognition model may be an algorithm constructed based on a neural network and implemented through the principle of image recognition for recognizing expressions in images.
  • each image frame of the original video is sequentially input to the expression recognition model, and the expression recognition model outputs a recognition result indicating whether the set expression is included, together with the corresponding expression score. For example, if the recognition result is "1", it means that the current image frame includes the set expression.
  • the expression score is a quantity used to characterize the degree of expression change, e.g., a smiling expression has a lower score than a laughing expression.
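
The selection of second image frames from the model output can be sketched as follows. This is an illustrative sketch only: the `FrameResult` record, its field names, and the example threshold are assumptions, and the expression recognition model itself is not shown.

```python
from dataclasses import dataclass


@dataclass
class FrameResult:
    """Per-frame model output (hypothetical record layout)."""
    timestamp: float      # seconds from the start of the original video
    has_expression: bool  # recognition result, True for "1"
    score: float          # expression score, higher for stronger expressions


def select_second_frames(results, threshold):
    """Return timestamps of frames whose expression score reaches the threshold."""
    # First image frames: frames that include the set expression.
    first_frames = [r for r in results if r.has_expression]
    # Second image frames: first image frames whose score reaches the threshold.
    return [r.timestamp for r in first_frames if r.score >= threshold]


results = [
    FrameResult(1.0, False, 0.0),
    FrameResult(2.0, True, 0.4),   # e.g. a smile
    FrameResult(3.0, True, 0.9),   # e.g. a laugh
]
print(select_second_frames(results, threshold=0.8))
```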
  • the acquiring the video segment of interest according to the timestamp of the second image frame includes:
  • a video of a set duration is intercepted within the duration interval of the task corresponding to the second image frame as the video segment of interest.
  • for example, suppose the sub-task is the tongue twister "Red Phoenix Pink Phoenix, Red Phoenix Flower Phoenix", the default time for the user to repeat the tongue twister is 5 s, and the user starts to repeat the tongue twister at the 1st second, so that the duration interval of the subtask runs from the 1st second to the 5th second. If the timestamp of the second image frame is the 3rd second and the duration of the video segment of interest is 1 s, the 3rd second is taken as the reference point, and the image frames within 0.5 s before and after the reference point constitute the video segment of interest; that is, the image frames whose timestamps fall within 2.5 s to 3.5 s are determined as the image frames of the video segment of interest.
  • if the timestamp of the second image frame is 4.7 s, extending 0.5 s past it (to 5.2 s) would exceed the duration interval of the subtask (1 s to 5 s). In this case, the image frames whose timestamps fall within the 4th to 5th second are taken as the image frames of the video segment of interest. That is, the timestamp of the second image frame is used as a reference time point, and a set number of image frames close to the second image frame within the duration interval of the task corresponding to the second image frame are determined as the video segment of interest.
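
The windowing rule in the example above (center the clip on the reference time point, then shift it back inside the subtask's duration interval if it overruns) can be sketched as follows. The function name and the handling of a clip longer than the interval are assumptions; the numeric cases are the ones given in the text.

```python
def clip_window(ref_time, clip_len, task_start, task_end):
    """Return (start, end) in seconds for the video segment of interest.

    The window is centered on ref_time and shifted, without changing its
    length, so that it stays inside [task_start, task_end].
    """
    if clip_len >= task_end - task_start:
        # Assumption: a clip longer than the subtask falls back to the whole interval.
        return (task_start, task_end)
    start = ref_time - clip_len / 2
    end = ref_time + clip_len / 2
    if start < task_start:
        start, end = task_start, task_start + clip_len
    elif end > task_end:
        start, end = task_end - clip_len, task_end
    return (start, end)


print(clip_window(3.0, 1.0, 1.0, 5.0))  # reference point at the 3rd second
print(clip_window(4.7, 1.0, 1.0, 5.0))  # window would overrun the 5th second
```

Shifting rather than truncating keeps every segment at the set duration, which matches the 4 s to 5 s result in the second case of the example.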
  • the video segments of interest can also be determined for each subtask and then synthesized into the final video segment of interest for the set task; the method of determining the video segment of interest for each subtask is similar to the above and is not repeated here.
  • Step 330 Perform video synthesis processing based on the video segment of interest and the original video to obtain a target video.
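  • in the technical solution of this embodiment, expression recognition is performed on the image frames of the original video based on the expression recognition model, and the timestamp of each first image frame that includes the set expression and the expression score corresponding to at least one of the first image frames are recorded; the first image frame whose expression score reaches the set threshold is determined as the second image frame; and, taking the timestamp of the second image frame as a reference time point, a set number of image frames close to the second image frame within the duration interval of the corresponding task are determined as the video segment of interest. This realizes accurate determination of the video segment of interest and provides a data basis for obtaining the target video.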
  • FIG. 4 is a schematic flowchart of a video generation method according to Embodiment 3 of the present disclosure.
  • this embodiment refines the above step 130 of “performing video synthesis processing based on the video segment of interest and the original video to obtain the target video”, and provides an optional implementation of video editing and synthesis processing.
  • content that is the same as or similar to the above-mentioned embodiments is not repeated in this embodiment; reference may be made to the explanation of the above-mentioned embodiments.
  • the method includes the following steps:
  • Step 410 Receive a trigger operation acting on the video shooting page, and shoot an original video in response to the trigger operation.
  • Step 420 Determine the video segment of interest in the original video.
  • Step 430 Generate title video data and/or end-credit video data based on the video segment of interest, and generate in-film video data based on the original video.
  • the generating the title video data based on the video segment of interest includes:
  • the title video data is generated based on the video segment of interest and the first preset template.
  • the identification information of the set task (e.g., serial number, name, introduction) and/or the user's identification (e.g., nickname) are displayed at the second set position of the first preset template to obtain the title video data.
  • for example, the video segment of interest (a short video of about 1 s) is added to the first preset template, and the introduction information of the set task (such as a tongue twister challenge) and the user's nickname (as shown in FIG. 5: "challenger: forest") are output at the second set position 520 of the first preset template.
  • generating the in-film video data based on the original video includes:
  • the in-film video data is generated based on the original video and the second preset template.
  • the original video is added to the third set position of the second preset template, so as to play the original video at the third set position;
  • a matching animation is displayed at the fourth set position of the second preset template, and/or the associated information of the set task is displayed at the fifth set position of the second preset template according to the content of the set task, thereby generating the in-film video data.
  • if the original video includes multiple partial videos, each of which corresponds to a subtask, the partial video of the user performing a single set task (i.e., a subtask) may be determined based on the original video.
  • the original video is a video when the user performs a tongue twister challenge.
  • the user performs four tongue twister challenges. Based on the difficulty of each tongue twister, the user first challenges the simpler tongue twisters and then the more difficult ones.
  • the user first challenges the tongue twister "red phoenix pink phoenix, red phoenix phoenix flower phoenix"; when the user finishes repeating the current tongue twister, the next, more difficult tongue twister, such as "niu niu niu niu, niu niu pull niu niu", is automatically displayed; the user then proceeds to the third tongue twister challenge, such as "Li Xiaoli's family raised red carp, green carp and a donkey"; and finally to the fourth tongue twister challenge, such as "Blue coach is a female coach, Lu coach is a male coach".
  • the video of the user performing each tongue twister challenge is determined as the partial video of the corresponding subtask.
  • for example, the video of the user repeating the tongue twister "red phoenix pink phoenix, red phoenix phoenix flower phoenix" is one partial video, and the video of the user repeating the tongue twister "niu niu niu niu, niu niu pull niu niu" is another partial video.
  • the multiple partial videos are respectively added to the third set position of the corresponding second preset template, so as to play each partial video at the third set position of its second preset template, where each partial video corresponds to an independent second preset template; a matching animation is displayed at the fourth set position of the second preset template according to how the user performs the set task; and the associated information of the corresponding subtask is displayed at the fifth set position of each second preset template, thereby obtaining the in-film video data.
  • the multiple partial videos are respectively added to the third set position 610 of the corresponding second preset template (for example, the middle position of the second preset template) to play the partial video at the third set position 610 of the second preset template.
  • the associated information of the corresponding subtask is displayed, and the associated information includes at least one of the following: content detail information of the subtask (for example, the "red" in Fig.
  • the content information, the microphone, the countdown reminder logo and the game category can all be added in the form of information stickers in the setting position 620 of the second preset template, for example, the positions on the left and right sides of the second preset template.
  • special effects can also be added according to the content of the information. For example, when the content of the information is "Grandma Liu likes to drink durian milk", stickers with rendering effects can be added, such as the stickers of the "milk” picture.
  • Each tongue twister has a system-set completion time, and a countdown stopwatch can be displayed accordingly.
  • a matching animation is displayed at the fourth setting position of the second preset template according to the situation that the user performs the setting task, including at least one of the following:
  • an animation matching the preset word is displayed at the fourth set position.
  • the animation effect of "black face" is displayed at the fourth set position of the second preset template. The fourth set position may be the position where the user's face image is displayed, that is, the face turns into a black face, to enhance the animation effect and make the video more entertaining.
  • an animation matching the set action is displayed at the fourth set position.
  • the special effect of the big head is displayed in the fourth setting position to realize the effect of magnifying the expression.
  • the fourth set position may be a position where the face image of the user is displayed, that is, adding a special effect of a big head to the face and amplifying the expression of the user, so as to enhance the animation effect and improve the interest.
  • an animation matching the accuracy is displayed at the fourth set position.
  • the accuracy and completeness of the user's recitation are determined through speech recognition, and an evaluation such as "perfect", "excellent", "average", or "Come on" is given in the form of animation according to the accuracy and completeness.
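
Mapping a recognized accuracy to one of the evaluation labels can be sketched as below. The labels come from the text; the cut-off values, the 0-to-1 accuracy scale, and the names `RATING_BANDS` and `rating_for` are purely hypothetical, since the source does not state how accuracy is scored or thresholded.

```python
# Hypothetical cut-offs, highest band first; the source names the labels only.
RATING_BANDS = [(0.95, "perfect"), (0.80, "excellent"), (0.60, "average")]


def rating_for(accuracy: float) -> str:
    """Return the evaluation label for an accuracy in [0, 1]."""
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be in [0, 1]")
    for minimum, label in RATING_BANDS:
        if accuracy >= minimum:
            return label
    # Below every band: show the encouragement label.
    return "Come on"


print(rating_for(0.85))
```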
  • the generation of end-credit video data based on the video clip of interest includes:
  • the end-credit video data is generated based on the video segment of interest and the third preset template.
  • the generating of the end-credit video data based on the video clip of interest and the third preset template includes:
  • the matching content is displayed in the seventh setting position of the third preset template according to the situation that the user performs the setting task.
  • the matching content includes at least one of the following: title information and compliment information that match the situation that the user performs the set task.
  • FIG. 7 is a schematic diagram of an end-credit video data image: a face image is displayed at the sixth set position of the third preset template, and the title information "Little Achievement" and praise information such as "Like", "Come on", and "Aoli give!!!" are displayed at the seventh set position.
  • Step 440 Generate a target video by splicing at least one of the title video data and the end-credit video data with the in-film video data.
  • the title video data is generated based on the video clip of interest, and then the title video data is spliced and synthesized with the original video to obtain the target video; the ending video data can also be generated based on the video clip of interest, and then The title video data and the original video are spliced and synthesized to obtain the target video; the title video data and the title video data can also be generated respectively based on the video clips of interest, and then the title video data, the original video and the title video data are spliced, Synthesis processing to obtain the target video.
  • the introductory video data and the introductory video data can be generated based on the video clips of interest respectively, the in-credits video data can be generated based on the original video, and then the introductory video data, the in-credits video data, and the in-credits video data can be generated. Perform splicing and synthesis processing to obtain the target video.
  • the title video data is generated based on the video clip of interest and the first preset template, and it is exemplified that funny expressions such as laughing, exaggerated, etc. are added to the first setting of the first preset template. position, and display relevant game prop introduction information and challenge user nicknames and other information in the second set position, generate in-film video data based on the original video and the second preset template, and generate video data based on the video clip of interest and the third preset template.
  • FIG. 8 shows a video generation apparatus provided by Embodiment 4 of the present disclosure.
  • the apparatus includes a shooting module 810, a determination module 820, and a processing module 830.
  • the shooting module 810 is configured to receive a trigger operation acting on the video shooting page, and shoot the original video in response to the trigger operation;
  • the determination module 820 is configured to determine the video segment of interest in the original video;
  • the processing module 830 is configured to perform video synthesis processing based on the video segment of interest and the original video to obtain a target video.
  • the original video includes a video obtained by shooting a picture of a user performing a set task; correspondingly, the apparatus further includes:
  • a display module configured to display prompt information on the video shooting page in response to the trigger operation, so as to guide the user to perform the set task.
  • the display module is configured to:
  • display the prompt information of each set task on the video shooting page in sequence, in order of the difficulty of the set tasks.
  • the determination module 820 is configured to:
  • determine the video segment of interest in the original video based on image recognition.
  • the determination module 820 includes:
  • a recognition recording unit configured to perform facial expression recognition on the image frames of the original video based on a facial expression recognition model, and to record the timestamps of the first image frames that include the set expression and the expression score corresponding to each of the first image frames;
  • an obtaining unit configured to determine a first image frame whose expression score reaches the set threshold as a second image frame, and to obtain the video segment of interest according to the timestamp of the second image frame.
  • the obtaining unit is configured to:
  • take the timestamp of the second image frame as a reference time point, and intercept a video of a set duration within the time interval of the current set task as the video segment of interest.
  • the processing module 830 includes:
  • a first generating unit configured to generate title video data and/or end-credit video data based on the video segment of interest;
  • a second generating unit configured to generate in-film video data based on the original video;
  • a splicing unit configured to splice at least one of the title video data and the end-credit video data with the in-film video data to generate a target video.
  • the first generating unit includes:
  • a first generating subunit configured to generate title video data based on the video segment of interest and the first preset template;
  • and/or a second generating subunit configured to generate in-film video data based on the original video and the second preset template;
  • and/or a third generating subunit configured to generate end-credit video data based on the video segment of interest and a third preset template.
  • the first generating subunit is configured to:
  • display the introduction information of the set task and/or the user's identification at the second set position of the first preset template, so as to obtain the title video data.
  • the second generating subunit is configured to:
  • display a matching animation at the fourth set position of the second preset template according to how the user performs the set task, and/or display the associated information of the set task at the fifth set position of the second preset template according to the content of the set task, thereby generating the in-film video data.
  • the associated information includes at least one of the following: content detail information of a single set task, a microphone, a countdown reminder identifier, and a game category to which the set task belongs.
  • displaying a matching animation at the fourth set position of the second preset template according to how the user performs the set task includes at least one of the following:
  • an animation matching the accuracy is displayed at the fourth set position of the second preset template.
  • the third generating subunit is configured to:
  • display matching content at the seventh set position of the third preset template according to how the user performed the set task.
  • the matching content includes at least one of the following: honorary-title information and praise information that match how the user performed the set task.
  • the set task includes a tongue twister challenge game and/or a quiz game.
  • the original video includes a portrait video
  • the target video includes a landscape video
  • an original video is shot in response to the trigger operation; a video segment of interest in the original video is determined; and video synthesis processing is performed based on the video segment of interest and the original video to obtain the target video, which realizes automatic editing and synthesis of the video and improves the video processing effect.
  • the video generation apparatus provided by the embodiment of the present disclosure can execute the video generation method provided by any embodiment of the present disclosure, and has functional modules corresponding to the execution method.
  • Referring to FIG. 9, it shows a schematic structural diagram of an electronic device (e.g., the terminal device or server in FIG. 9) 400 suitable for implementing an embodiment of the present disclosure.
  • Terminal devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, personal digital assistants (PDA), tablet computers (PAD), portable multimedia players (PMP), and in-vehicle terminals (e.g., in-vehicle navigation terminals), as well as fixed terminals such as digital televisions (TV) and desktop computers.
  • the electronic device 400 may include a processing device (such as a central processing unit, a graphics processor, etc.) 401, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 402 or a program loaded from a storage device 406 into a random access memory (RAM) 403.
  • the processing device 401, the ROM 402, and the RAM 403 are connected to each other through a bus 404.
  • An Input/Output (I/O) interface 405 is also connected to the bus 404 .
  • Generally, the following devices can be connected to the I/O interface 405: an input device 406 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, and gyroscope; an output device 407 including, for example, a liquid crystal display (LCD), speaker, and vibrator; a storage device 406 including, for example, a magnetic tape and hard disk; and a communication device 409.
  • The communication device 409 may allow the electronic device 400 to communicate wirelessly or by wire with other devices to exchange data.
  • Although FIG. 9 shows the electronic device 400 having various devices, it should be understood that not all of the illustrated devices are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided.
  • embodiments of the present disclosure include a computer program product comprising a computer program carried on a non-transitory computer readable medium, the computer program containing program code for performing the method illustrated in the flowchart.
  • the computer program may be downloaded and installed from the network via the communication device 409, or from the storage device 406, or from the ROM 402.
  • When the computer program is executed by the processing device 401, the above-mentioned functions defined in the methods of the embodiments of the present disclosure are executed.
  • the terminal provided by the embodiments of the present disclosure and the video generation method provided by the above embodiments belong to the same inventive concept, and the technical details not described in detail in the embodiments of the present disclosure may refer to the above embodiments.
  • Embodiments of the present disclosure provide a computer storage medium on which a computer program is stored, and when the program is executed by a processor, implements the video generation method provided by the foregoing embodiments.
  • the computer-readable medium mentioned above in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the above two.
  • the computer-readable storage medium can be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus or device, or a combination of any of the above.
  • More specific examples of computer-readable storage media may include, but are not limited to, an electrical connection having at least one wire, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave with computer-readable program code embodied thereon. Such propagated data signals may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • a computer-readable signal medium can also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transport a program for use by or in connection with the instruction execution system, apparatus, or device.
  • the program code embodied on the computer-readable medium may be transmitted by any suitable medium, including but not limited to: electric wire, optical fiber cable, radio frequency (RF), etc., or any suitable combination of the above.
  • the client and server can communicate using any currently known or future developed network protocol such as HTTP (HyperText Transfer Protocol), and can be interconnected with digital data communication (e.g., a communication network) in any form or medium.
  • Examples of communication networks include local area networks (LANs), wide area networks (WANs), internetworks (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed networks.
  • the above-mentioned computer-readable medium may be included in the above-mentioned electronic device; or may exist alone without being assembled into the electronic device.
  • the above-mentioned computer-readable medium carries at least one program, and when the above-mentioned at least one program is executed by the electronic device, causes the electronic device to:
  • Computer program code for performing the operations of the present disclosure may be written in at least one programming language or a combination thereof, including but not limited to object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server.
  • the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (e.g., through the Internet using an Internet service provider).
  • each block in the flowchart or block diagrams may represent a module, segment, or portion of code that contains at least one executable instruction for implementing the specified logical function.
  • the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
  • each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented in dedicated hardware-based systems that perform the specified functions or operations, or can be implemented in a combination of dedicated hardware and computer instructions.
  • the units involved in the embodiments of the present disclosure may be implemented in a software manner, and may also be implemented in a hardware manner. Wherein, the name of the unit does not constitute a limitation of the unit itself under certain circumstances, for example, the editable content display unit may also be described as an "editing unit".
  • exemplary types of hardware logic components include: field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), and so on.
  • a machine-readable medium may be a tangible medium that may contain or store a program for use by or in connection with the instruction execution system, apparatus or device.
  • the machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium.
  • Machine-readable media may include, but are not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses, or devices, or any suitable combination of the foregoing.
  • More specific examples of machine-readable storage media would include an electrical connection based on at least one wire, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
  • Example 1 provides a video generation method, the method includes:
  • Example 2 provides a method for generating a video.
  • the original video includes a video obtained by shooting a picture of a user performing a set task;
  • the method also includes:
  • prompt information is displayed on the video shooting page to guide the user to perform the setting task.
  • Example 3 provides a video generation method.
  • the determining the video segment of interest in the original video includes:
  • Video segments of interest in the original video are determined based on image recognition.
  • Example 4 provides a video generation method.
  • the determining an interesting video segment in the original video based on image recognition includes:
  • the video segment of interest is acquired according to the timestamp of the second image frame.
  • Example 5 provides a video generation method.
  • the acquiring the video segment of interest according to the timestamp of the second image frame includes:
  • a video of a set duration is intercepted within the duration time interval of the task corresponding to the second image frame as the video segment of interest.
  • Example 6 provides a video generation method.
  • performing video synthesis processing based on the video segment of interest and the original video to obtain a target video includes:
  • a target video is generated by splicing at least one of the title video data and the end-credit video data with the in-film video data.
  • Example 7 provides a video generation method, optionally,
  • the generating of the title video data based on the video segment of interest includes:
  • the generating of the in-film video data based on the original video includes:
  • the generating the end-credit video data based on the video clip of interest includes:
  • End-credit video data is generated based on the video segment of interest and the third preset template.
  • Example 8 provides a video generation method, optionally,
  • the generating of title video data based on the video segment of interest and the first preset template includes:
  • the title video data is thereby generated.
  • Example 9 provides a video generation method, optionally, the generating of in-film video data based on the original video and a second preset template includes:
  • displaying a matching animation at the fourth set position of the second preset template according to how the user performs the set task, and/or displaying the associated information of the set task at the fifth set position of the second preset template according to the content of the set task;
  • Example 10 provides a video generation method.
  • displaying a matching animation at the fourth set position of the second preset template according to how the user performs the set task includes at least one of the following:
  • an animation matching the accuracy is displayed at the fourth set position.
  • Example 11 provides a video generation method.
  • the generation of end-credit video data based on the video segment of interest and a third preset template includes:
  • the end-credit video data is thereby generated.
  • Example 12 provides a video generation method.
  • the original video includes a portrait video
  • the target video includes a landscape video.
  • Example 13 provides a video generation apparatus, the apparatus comprising:
  • a shooting module configured to receive a trigger operation acting on the video shooting page, and to shoot the original video in response to the trigger operation;
  • a determining module configured to determine the video segment of interest in the original video
  • the processing module is configured to perform video synthesis processing based on the video segment of interest and the original video to obtain a target video.
  • Example 14 provides an electronic device, the electronic device comprising:
  • storage means arranged to store at least one program
  • When the at least one program is executed by the at least one processor, the at least one processor implements the following video generation method:
  • Example 15 provides a storage medium containing computer-executable instructions, when executed by a computer processor, the computer-executable instructions are used to perform the following video generation method:

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Human Computer Interaction (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

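The embodiments of the present disclosure disclose a video generation method and apparatus, an electronic device, and a storage medium. The method includes: receiving a trigger operation acting on a video shooting page, and shooting an original video in response to the trigger operation; determining a video segment of interest in the original video; and performing video synthesis processing based on the video segment of interest and the original video to obtain a target video.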

Description

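Video generation method and apparatus, electronic device, and storage medium
This application claims priority to Chinese Patent Application No. 202110384712.8, filed with the China Patent Office on April 9, 2021, the entire contents of which are incorporated herein by reference.
Technical Field
The embodiments of the present disclosure relate to the field of computer technology, for example, to a video generation method and apparatus, an electronic device, and a storage medium.
Background
With the popularity of smart terminals, various types of applications installed on smart terminals emerge one after another. Take the various forms of video applications as an example: a current user can watch videos that other users have shared to the platform, and can also record his or her own video and share it with other users of the platform.
At present, when shooting video with various short-video applications, users can use shooting props or special effects provided by the application to obtain better-looking videos. However, as users' requirements rise, the videos generated by video applications in the related art are rather monotonous and cannot meet user needs. In addition, video applications in the related art either process the shot video poorly, or process it in a complicated way that requires many manual operations by the user, which affects the user experience.
Summary
The embodiments of the present disclosure provide a video generation method and apparatus, an electronic device, and a storage medium, which realize automatic editing and synthesis of video and improve the video processing effect.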
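In a first aspect, an embodiment of the present disclosure provides a video generation method, the method including:
receiving a trigger operation acting on a video shooting page, and shooting an original video in response to the trigger operation;
determining a video segment of interest in the original video;
performing video synthesis processing based on the video segment of interest and the original video to obtain a target video.
In a second aspect, an embodiment of the present disclosure further provides a video generation apparatus, the apparatus including:
a shooting module configured to receive a trigger operation acting on a video shooting page, and shoot an original video in response to the trigger operation;
a determination module configured to determine a video segment of interest in the original video;
a processing module configured to perform video synthesis processing based on the video segment of interest and the original video to obtain a target video.
In a third aspect, an embodiment of the present disclosure further provides a device, the device including:
at least one processor;
a storage device configured to store at least one program,
where, when the at least one program is executed by the at least one processor, the at least one processor implements the video generation method according to any embodiment of the present disclosure.
In a fourth aspect, an embodiment of the present disclosure further provides a storage medium containing computer-executable instructions which, when executed by a computer processor, are used to perform the video generation method according to any embodiment of the present disclosure.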
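Brief Description of the Drawings
Throughout the drawings, the same or similar reference numerals denote the same or similar elements. It should be understood that the drawings are schematic and that components and elements are not necessarily drawn to scale.
FIG. 1 is a schematic flowchart of a video generation method provided by Embodiment 1 of the present disclosure;
FIG. 2 is a schematic diagram of an interface for shooting video while a user performs a set task, provided by Embodiment 1 of the present disclosure;
FIG. 3 is a schematic flowchart of a video generation method provided by Embodiment 2 of the present disclosure;
FIG. 4 is a schematic flowchart of a video generation method provided by Embodiment 3 of the present disclosure;
FIG. 5 is a schematic diagram of a title animation image interface provided by Embodiment 3 of the present disclosure;
FIG. 6a to FIG. 6e are schematic diagrams of in-film animation image interfaces provided by Embodiment 3 of the present disclosure;
FIG. 7 is a schematic diagram of an end-credit animation image interface provided by Embodiment 3 of the present disclosure;
FIG. 8 is a schematic structural diagram of a video generation apparatus provided by Embodiment 4 of the present disclosure;
FIG. 9 is a schematic structural diagram of an electronic device provided by Embodiment 5 of the present disclosure.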
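Detailed Description
Embodiments of the present disclosure are described in more detail below with reference to the drawings. Although some embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure can be implemented in various forms and should not be construed as limited to the embodiments set forth here; rather, these embodiments are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the protection scope of the present disclosure.
It should be understood that the steps described in the method embodiments of the present disclosure may be performed in a different order and/or in parallel. In addition, the method embodiments may include additional steps and/or omit some of the illustrated steps. The scope of the present disclosure is not limited in this respect.
The term "include" and its variants as used herein are open-ended, that is, "including but not limited to". The term "based on" means "based at least in part on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one other embodiment"; the term "some embodiments" means "at least some embodiments". Relevant definitions of other terms are given in the description below.
It should be noted that concepts such as "first" and "second" mentioned in the present disclosure are only used to distinguish different devices, modules, or units, and are not used to limit the order of, or the interdependence between, the functions performed by these devices, modules, or units.
It should be noted that the modifiers "a" and "a plurality of" mentioned in the present disclosure are illustrative rather than restrictive, and those skilled in the art should understand that they are to be read as "at least one" unless the context clearly indicates otherwise.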
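Embodiment 1
FIG. 1 is a schematic flowchart of a video generation method provided by Embodiment 1 of the present disclosure. The method is applicable to automatically editing and synthesizing an original video shot by a user to obtain a target video that, compared with the original video, is richer in information, more complete, and more engaging. The entire video generation process is automated and requires no manual operation by the user, which improves the effect and efficiency of video processing, helps improve the user experience, and enhances the user stickiness of the application product. The video generation method may be performed by a video generation apparatus, which may be implemented in software and/or hardware.
As shown in FIG. 1, the video generation method provided by this embodiment includes the following steps:
Step 110: Receive a trigger operation acting on a video shooting page, and shoot an original video in response to the trigger operation.
The trigger operation may be one that acts on a target shooting control of the video shooting page, and the original video is shot in response to it. For example, when the user taps the target shooting control, the camera is started and the picture within the shooting range of the camera is recorded; when the user taps the target shooting control again, the shooting ends. The original video may be a video obtained by shooting the user, or a video obtained by shooting scenery or objects.
Illustratively, the original video includes a video obtained by shooting a picture of a user performing a set task. The set task may be a task of any form, for example the user alone, or together with friends, imitating a funny video, the user singing a song, or the user performing a dance.
Optionally, the set task may also include a tongue twister challenge game, and/or a quiz game, and/or a video imitation game; this embodiment does not limit the content of the set task or the manner in which it is performed. For example, the user is required to recite a tongue twister clearly and fluently within a limited time, and video is shot while the user recites it. While the user's real-time performance is recorded, the shot original video can also be analyzed to determine whether the user spoke clearly and accurately and whether the time taken was shorter than that of other users, which makes the game more fun and entertaining.
The set task may include at least one subtask. Correspondingly, when the original video is shot, that is, when the picture of the user performing the set task is shot, prompt information of the set task may be displayed on the video shooting page to guide the user to perform the set task. Where the set task includes a plurality of subtasks, the prompt information of the subtasks may be displayed one after another in the non-shooting area of the current interface in order of their difficulty. Illustratively, the set task is tongue twisters and includes two subtasks, a first tongue twister and a second tongue twister; the second tongue twister is more difficult than the first, so the prompt information of the first tongue twister is displayed on the video shooting page before that of the second, which increases user stickiness.
Correspondingly, when the prompt information of the first tongue twister is displayed on the video shooting page, the user is guided to perform the first tongue twister task while the picture of the user performing it is shot; afterwards, when the prompt information of the second tongue twister is displayed on the video shooting page, the user is guided to perform the second tongue twister task while the picture of the user performing it is shot.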
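By way of illustration only, the difficulty-ordered display of prompt information described above can be sketched as follows. The `SubTask` type, its `difficulty` field, and the function name are hypothetical and do not appear in the disclosure, which does not specify how difficulty is quantified.

```python
from dataclasses import dataclass


@dataclass
class SubTask:
    prompt: str      # prompt information shown in the non-shooting area
    difficulty: int  # hypothetical score: a larger value means a harder subtask


def prompt_display_order(subtasks):
    """Return the prompts in display order, easiest subtask first."""
    # sorted() is stable, so subtasks of equal difficulty keep their input order.
    return [t.prompt for t in sorted(subtasks, key=lambda t: t.difficulty)]


tasks = [
    SubTask("妞妞牵牛牛,牛牛拉妞妞", difficulty=2),
    SubTask("红凤凰粉凤凰,红粉凤凰花凤凰", difficulty=1),
]
print(prompt_display_order(tasks))
```

In this sketch the easier tongue twister is listed first regardless of the order in which the subtasks were registered.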
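The prompt information may include the name and introduction of the set task and/or a countdown indicator. Correspondingly, referring to FIG. 2, a schematic diagram of an interface for shooting video while a user performs a set task, the non-shooting area of the current interface displays the content details of the tongue twister "红凤凰粉凤凰,红粉凤凰花凤凰" 210 and the countdown indicator "2s" 220, and reference numeral 230 denotes the shooting area. When the user finishes reciting the current tongue twister, the content details and countdown indicator of the next, slightly more difficult tongue twister are displayed automatically, for example "妞妞牵牛牛,牛牛拉妞妞".
Generally, to reduce the development difficulty of the application and the system performance overhead, the picture of the user performing the set task is shot in portrait mode, yielding a portrait original video.
Step 120: Determine a video segment of interest in the original video.
Illustratively, the video segment of interest in the original video may be a video segment that includes a preset expression. The preset expression may be a laughing or crying expression, in which case the video segment of interest may be a video segment that includes a laughing or crying expression. The preset expression may also be an exaggerated expression such as a hearty laugh or a dejected crying face, in which case the video segment of interest may be a video segment with such an exaggerated expression. An expression recognition model may be used to perform expression recognition on each image frame of the original video, and image frames that include the set expression are marked, so that the video segment of interest is obtained based on the marked image frames. For example, the video segment formed by the 20 image frames before and the 20 image frames after a marked image frame is intercepted as the video segment of interest. The set expression is, for example, a laughing expression or a crying-face expression.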
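As a minimal sketch of the frame-window example above, the indices of the frames surrounding a marked frame could be computed as shown below. The function name is hypothetical, and clamping the window to the bounds of the video is an assumption of this sketch; the disclosure does not say what happens near the start or end of the video.

```python
def frames_around(marked_index, total_frames, radius=20):
    """Indices of the marked frame plus up to `radius` frames on each side."""
    if not 0 <= marked_index < total_frames:
        raise ValueError("marked_index is outside the video")
    # Assumption: the window is clamped to the video instead of being shifted.
    start = max(0, marked_index - radius)
    end = min(total_frames - 1, marked_index + radius)
    return list(range(start, end + 1))


window = frames_around(marked_index=100, total_frames=300)
print(window[0], window[-1], len(window))  # 80 120 41
```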
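Step 130: Perform video synthesis processing based on the video segment of interest and the original video to obtain a target video.
Optionally, the video segment of interest can be used to produce images of highlight moments, which then serve as the title or the end credits, and the original video can be used to generate an in-film video with animation effects. Illustratively, in combination with a set template, the original video is played in the middle of the template and animation effects are added at other positions of the template. For example, if the user recites the current tongue twister clearly and fluently, the animation effect "You are great" may be displayed; if the user's recitation is not very clear and fluent, the animation effect "Keep going" may be displayed, and an animated "microphone" may also be displayed. Finally, the title, in-film part, and end credits obtained through processing are synthesized and spliced to obtain the target video.
To improve the video processing effect, the target video may be generated as a landscape video.
In the technical solution of the embodiment of the present disclosure, a trigger operation acting on the video shooting page is received and an original video is shot in response to the trigger operation; a video segment of interest in the original video is determined; and video synthesis processing is performed based on the video segment of interest and the original video to obtain a target video. These technical means realize automatic editing and synthesis of the video and improve the video processing effect.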
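Embodiment 2
FIG. 3 is a schematic flowchart of a video generation method provided by Embodiment 2 of the present disclosure. On the basis of the above embodiment, this embodiment refines step 120, "determine a video segment of interest in the original video", and gives an optional implementation for determining the video segment of interest. Content that is the same as or similar to the above embodiment is not repeated here; reference may be made to the explanation of the above embodiment.
As shown in FIG. 3, the method includes:
Step 310: Receive a trigger operation acting on a video shooting page, and shoot an original video in response to the trigger operation.
Step 320: Determine a video segment of interest in the original video based on image recognition.
Illustratively, expression recognition is performed on the image frames of the original video based on an expression recognition model, and the timestamps of the first image frames that include the set expression and the expression score corresponding to each of the at least one first image frame are recorded; a first image frame whose expression score reaches the set threshold is determined as a second image frame; and the video segment of interest is obtained according to the timestamp of the second image frame.
The expression recognition model may be an algorithm that is built on a neural network and recognizes expressions in images on the principle of image recognition. Illustratively, the image frames of the original video are input into the expression recognition model one by one, and the model outputs a recognition result indicating whether the set expression is included, together with the corresponding expression score. For example, a recognition result of "1" indicates that the current image frame includes the set expression. The expression score is a quantity that characterizes the degree of expression change; for example, the expression score of a smile is lower than that of a laugh. Recording the timestamps of the first image frames that include the set expression provides a reference for obtaining the video segment of interest. Recording the expression score of each first image frame makes it possible, when the video segment of interest is obtained, to select the image frames with higher expression scores as the reference, which helps capture highlight video segments.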
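The recording and thresholding steps above can be sketched as follows. This is only an illustration under stated assumptions: `recognize` stands in for the expression recognition model and is assumed to return a pair `(has_expression, score)`, and `stub_recognize`, the score values, and the threshold are invented for the demonstration.

```python
def select_second_frames(frames, recognize, threshold):
    """Return timestamps of frames whose expression score reaches the threshold.

    frames:    iterable of (timestamp, image) pairs
    recognize: callable mapping an image to (has_expression, score); a
               placeholder for the expression recognition model
    """
    first_frames = []  # (timestamp, score) of frames containing the set expression
    for timestamp, image in frames:
        has_expression, score = recognize(image)
        if has_expression:
            first_frames.append((timestamp, score))
    # Second image frames: first image frames whose score reaches the threshold.
    return [ts for ts, score in first_frames if score >= threshold]


def stub_recognize(image):
    # Hypothetical stand-in for a trained model; images are plain labels here.
    scores = {"neutral": (False, 0.0), "smile": (True, 0.4), "laugh": (True, 0.9)}
    return scores[image]


frames = [(0.0, "neutral"), (0.5, "smile"), (1.0, "laugh")]
print(select_second_frames(frames, stub_recognize, threshold=0.8))  # [1.0]
```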
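Optionally, obtaining the video segment of interest according to the timestamp of the second image frame includes:
taking the timestamp of the second image frame as a reference time point, and intercepting a video of a set duration within the time interval of the task corresponding to the second image frame as the video segment of interest. For example, suppose the set task includes one subtask, the tongue twister "红凤凰粉凤凰,红粉凤凰花凤凰", and the time allotted by default for the user to recite it is 5 s. Assuming the user starts reciting at the 1st second, the time interval of the subtask runs from the 1st second to the 5th second. If the timestamp of the second image frame is the 3rd second and the duration of the video segment of interest is 1 s, the 3rd second is taken as the reference point and the image frames within 0.5 s before and after it form the video segment of interest, that is, the image frames whose timestamps fall between 2.5 s and 3.5 s. If instead the timestamp of the second image frame is 4.7 s, extending 0.5 s forward (to 5.2 s) would exceed the time interval of the subtask (the 1st second to the 5th second), so the image frames whose timestamps fall between the 4th second and the 5th second are taken as the image frames of the video segment of interest. In other words, with the timestamp of the second image frame as the reference time point, a set number of image frames close to the second image frame within the time interval of the task corresponding to the second image frame are determined as the video segment of interest. In addition, there may be a plurality of second image frames for each subtask; in that case, a video segment of interest can be determined for each second image frame and these can then be combined into the final video segment of interest of the subtask.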
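The worked example above can be expressed as a small function. The following is a sketch only: the function and parameter names are hypothetical, and two behaviors are assumptions not stated in the disclosure, namely shifting the window forward when it would start before the task interval, and returning the whole task interval when the set duration is longer than the interval.

```python
def clip_interval(ref_ts, clip_len, task_start, task_end):
    """Return (start, end) of a clip of length clip_len around ref_ts,
    kept inside the task interval [task_start, task_end]."""
    if clip_len >= task_end - task_start:
        # Assumption: a clip longer than the task interval is cut to the interval.
        return task_start, task_end
    start = ref_ts - clip_len / 2
    end = ref_ts + clip_len / 2
    if end > task_end:
        # Matches the 4.7 s example: slide the window back to end at task_end.
        start, end = task_end - clip_len, task_end
    elif start < task_start:
        # Assumption: the symmetric case slides the window forward.
        start, end = task_start, task_start + clip_len
    return start, end


print(clip_interval(3.0, 1.0, 1.0, 5.0))  # (2.5, 3.5)
print(clip_interval(4.7, 1.0, 1.0, 5.0))  # (4.0, 5.0)
```

Both printed intervals agree with the two cases described in the text.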
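When the set task includes a plurality of subtasks, a video segment of interest can also be determined for each subtask and these can then be combined into the final video segment of interest of the set task. The video segment of interest for each subtask is determined in a manner similar to that described above, which is not repeated here.
Step 330: Perform video synthesis processing based on the video segment of interest and the original video to obtain a target video.
In the technical solution of the embodiment of the present disclosure, expression recognition is performed on the image frames of the original video based on an expression recognition model, and the timestamps of the first image frames that include the set expression and the expression score corresponding to each of the at least one first image frame are recorded; a first image frame whose expression score reaches the set threshold is determined as a second image frame; and, with the timestamp of the second image frame as the reference time point, a set number of image frames close to the second image frame within the time interval of the current set task are determined as the video segment of interest. This realizes precise determination of the video segment of interest and provides the data basis for obtaining the target video.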
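Embodiment 3
FIG. 4 is a schematic flowchart of a video generation method provided by Embodiment 3 of the present disclosure. On the basis of the above embodiments, this embodiment refines step 130, "perform video synthesis processing based on the video segment of interest and the original video to obtain a target video", and gives an optional implementation of the video editing and synthesis processing. Content that is the same as or similar to the above embodiments is not repeated here; reference may be made to the explanation of the above embodiments.
As shown in FIG. 4, the method includes the following steps:
Step 410: Receive a trigger operation acting on a video shooting page, and shoot an original video in response to the trigger operation.
Step 420: Determine a video segment of interest in the original video.
Step 430: Generate title video data and/or end-credit video data based on the video segment of interest, and generate in-film video data based on the original video.
Illustratively, generating the title video data based on the video segment of interest includes:
generating the title video data based on the video segment of interest and a first preset template.
For example, the video segment of interest is added to a first set position of the first preset template, so that the video segment of interest is played at the first set position;
identification information of the set task (for example, its serial number, name, or introduction) and/or an identification of the user (for example, a nickname) is displayed at a second set position of the first preset template, so as to obtain the title video data. Referring to FIG. 5, a schematic diagram of a title animation image interface, the video segment of interest (a short video of about 1 s) is added to the first set position 510 of the first preset template (the three circles), and the introduction information of the set task (such as "Tongue Twister Challenge") and/or the nickname of the user (as shown in FIG. 5, "Challenger: 林输出") is displayed at the second set position 520 of the first preset template.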
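And/or, generating the in-film video data based on the original video includes:
generating the in-film video data based on the original video and a second preset template.
Illustratively, the original video is added to a third set position of the second preset template, so that the original video is played at the third set position; a matching animation is displayed at a fourth set position of the second preset template according to how the user performs the set task, and/or associated information of the set task is displayed at a fifth set position of the second preset template according to the content of the set task; the in-film video data is thereby generated.
Illustratively, when the set task includes a plurality of subtasks, the original video correspondingly includes a plurality of partial videos, each corresponding to one subtask, and the partial video of the user performing each single set task (that is, each subtask) can be determined from the original video. For example, the original video is a video of the user taking a tongue twister challenge in which the user attempts four tongue twisters in total; based on the difficulty of each tongue twister, the user attempts the easier ones first and the harder ones later. For example, the user first attempts the tongue twister "红凤凰粉凤凰,红粉凤凰花凤凰"; when the user finishes reciting it, the next, slightly more difficult tongue twister is displayed automatically, for example "妞妞牵牛牛,牛牛拉妞妞"; the user then attempts the third tongue twister, for example "李小莉家养了红鲤鱼与绿鲤鱼与驴"; and finally the fourth, for example "蓝教练是女教练,吕教练是男教练". The video of the user attempting each tongue twister is determined as the partial video of the corresponding subtask; for example, the video of the user reciting "红凤凰粉凤凰,红粉凤凰花凤凰" is one partial video, and the video of the user reciting "妞妞牵牛牛,牛牛拉妞妞" is another.
The plurality of partial videos are respectively added to the third set position of the corresponding second preset template, so that each partial video is played at the third set position of its second preset template, where each partial video corresponds to an independent second preset template; a matching animation is displayed at the fourth set position of the second preset template according to how the user performs the set task; and the associated information of the corresponding subtask is displayed at the fifth set position of each second preset template, so as to obtain the in-film video data.
Illustratively, referring to FIG. 6a to FIG. 6e, schematic diagrams of in-film animation image interfaces, the plurality of partial videos are respectively added to the third set position 610 of the corresponding second preset template (the middle of the second template), so that each partial video is played at the third set position 610 of its second preset template. The associated information of the corresponding subtask is displayed at the fifth set position 620 of each second preset template. The associated information includes at least one of the following: content detail information of the subtask (for example "红凤凰粉凤凰,红粉凤凰花凤凰" in FIG. 6a, "妞妞牵牛牛,牛牛拉妞妞" in FIG. 6b, "李小莉家养了红鲤鱼与绿鲤鱼与驴" in FIG. 6c, "蓝教练是女教练,吕教练是男教练" in FIG. 6d, and "发废话会花话费,回发废话费话费,发废话花费话费会后悔" in FIG. 6e), a microphone, a countdown reminder indicator, and the game category to which the set task belongs, such as "Tongue Twister Challenge" shown in FIG. 6a to FIG. 6e. The content information, microphone, countdown reminder indicator, and game category can all be added as informational stickers at the set position 620 of the second preset template, for example on the left and right sides of the second preset template. Special effects can also be added according to the content of the information; for example, when the content is "刘奶奶爱喝榴莲牛奶", a sticker with a rendering effect, such as a sticker with a picture of milk, can be added. Each tongue twister has a completion time set by the system, and a countdown stopwatch can be displayed accordingly.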
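One possible way to represent the per-subtask template instances described above is sketched below. The `InFilmSegment` type and `build_in_film` function are hypothetical names introduced for illustration; the disclosure does not define a data structure for templates, and videos and stickers are represented here as plain strings.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class InFilmSegment:
    """One independent instance of the second preset template."""
    partial_video: str                # played at the third set position
    animation: Optional[str] = None   # shown at the fourth set position
    stickers: List[str] = field(default_factory=list)  # fifth set position


def build_in_film(partial_videos, details, category):
    """Pair each partial video with its own template instance and stickers."""
    if len(partial_videos) != len(details):
        raise ValueError("each partial video needs one content-detail string")
    return [
        InFilmSegment(partial_video=video, stickers=[detail, category])
        for video, detail in zip(partial_videos, details)
    ]


segments = build_in_film(
    ["task1.mp4", "task2.mp4"],
    ["红凤凰粉凤凰,红粉凤凰花凤凰", "妞妞牵牛牛,牛牛拉妞妞"],
    category="Tongue Twister Challenge",
)
print(len(segments), segments[1].stickers[1])
```

Keeping one template instance per partial video mirrors the statement that each partial video corresponds to an independent second preset template.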
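Optionally, displaying a matching animation at the fourth set position of the second preset template according to how the user performs the set task includes at least one of the following:
when the user speaks a preset word, displaying an animation matching the preset word at the fourth set position. For example, when the user says "发黑" (turns black) in "化肥挥发会发黑", the animation effect of a "black face" is displayed at the fourth set position of the second preset template; the fourth set position may be the position where the user's face image is displayed, that is, the face turns black, which enhances the animation effect and adds fun.
when the user makes a set action, displaying an animation matching the set action at the fourth set position. For example, when the user laughs heartily, a big-head special effect is displayed at the fourth set position to magnify the expression. The fourth set position may be the position where the user's face image is displayed, that is, a big-head special effect is added to the face to magnify the user's expression, which enhances the animation effect and adds fun.
displaying, according to the accuracy with which the user performs the set task, an animation matching the accuracy at the fourth set position. For example, each time the user finishes reciting a tongue twister, the accuracy and completeness of the recitation are determined through speech recognition and an evaluation is given accordingly, for example by displaying words such as "perfect", "excellent", "average", or "come on" in the form of an animation.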
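The accuracy-based evaluation in the last item can be sketched as a simple mapping. The four labels come from the text; the numeric thresholds and the choice to combine accuracy and completeness by taking the lower of the two are assumptions made only for this illustration.

```python
def evaluation_label(accuracy, completeness):
    """Map recitation accuracy and completeness (both in [0, 1]) to a label."""
    # Assumption: the weaker of the two measures decides the evaluation.
    score = min(accuracy, completeness)
    # Hypothetical thresholds; the disclosure does not give numeric values.
    if score >= 0.95:
        return "perfect"
    if score >= 0.80:
        return "excellent"
    if score >= 0.60:
        return "average"
    return "come on"


print(evaluation_label(0.98, 1.0))  # perfect
print(evaluation_label(0.90, 0.5))  # come on
```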
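And/or, generating the end-credit video data based on the video segment of interest includes:
generating the end-credit video data based on the video segment of interest and a third preset template.
Illustratively, generating the end-credit video data based on the video segment of interest and the third preset template includes:
extracting a face avatar from the video segment of interest;
adding the face avatar to a sixth set position of the third preset template, so that the face avatar is displayed at the sixth set position;
displaying matching content at a seventh set position of the third preset template according to how the user performed the set task. The matching content includes at least one of the following: honorary-title information and praise information that match how the user performed the set task. As shown in FIG. 7, a schematic diagram of an end-credit video data image, a face image is displayed at the sixth set position of the third preset template, and the honorary-title information "Little Achievement" and praise information such as "Like", "Come on", and "Aoli give!!!" are displayed at the seventh set position.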
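Step 440: Splice at least one of the title video data and the end-credit video data with the in-film video data to generate a target video.
Optionally, the title video data is generated based on the video segment of interest and then spliced and synthesized with the original video to obtain the target video; alternatively, the end-credit video data is generated based on the video segment of interest and then spliced and synthesized with the original video to obtain the target video; alternatively, the title video data and the end-credit video data are each generated based on the video segment of interest, and the title video data, the original video, and the end-credit video data are then spliced and synthesized to obtain the target video. To improve the degree and effect of video processing, the title video data and the end-credit video data can each be generated based on the video segment of interest, the in-film video data can be generated based on the original video, and the title video data, the in-film video data, and the end-credit video data can then be spliced and synthesized to obtain the target video.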
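The splicing order of step 440 can be sketched as follows, under the assumption that each part is represented by an opaque handle (a file name in this sketch) and that actual encoding and concatenation are left to a video library. The function name is hypothetical.

```python
def splice_target_video(in_film, title=None, ending=None):
    """Return the ordered list of parts that make up the target video.

    At least one of `title` and `ending` must be given, matching step 440.
    """
    if title is None and ending is None:
        raise ValueError("at least one of title and ending is required")
    # The title precedes the in-film part, and the end credits follow it.
    return [part for part in (title, in_film, ending) if part is not None]


print(splice_target_video("film.mp4", title="title.mp4", ending="ending.mp4"))
print(splice_target_video("film.mp4", ending="ending.mp4"))
```

The same function covers all three combinations listed in the text: title only, end credits only, or both.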
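In the technical solution of this embodiment, the title video data is generated based on the video segment of interest and the first preset template; illustratively, funny expressions such as laughing or exaggerated expressions are added to the first set position of the first preset template, and information such as the introduction of the relevant game prop and the nickname of the challenging user is displayed at the second set position. The in-film video data is generated based on the original video and the second preset template, and the end-credit video data is generated based on the video segment of interest and the third preset template. The title video data, the in-film video data, and the end-credit video data are spliced and synthesized to obtain the target video, which realizes mixed editing and synthesis of the video, improves the video processing effect, yields a more complete and more engaging target video, adds fun, and improves the user experience.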
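Embodiment 4
FIG. 8 shows a video generation apparatus provided by Embodiment 4 of the present disclosure. The apparatus includes a shooting module 810, a determination module 820, and a processing module 830.
The shooting module 810 is configured to receive a trigger operation acting on a video shooting page, and shoot an original video in response to the trigger operation; the determination module 820 is configured to determine a video segment of interest in the original video; and the processing module 830 is configured to perform video synthesis processing based on the video segment of interest and the original video to obtain a target video.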
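A minimal sketch of how the three modules could be wired together is given below. The class and method names are hypothetical, and each module is modeled as a plain callable so that the data flow between modules 810, 820, and 830 is visible; this is an illustration, not the implementation of the apparatus.

```python
class VideoGenerationApparatus:
    """Chains a shooting module, a determination module, and a processing module."""

    def __init__(self, shooting_module, determination_module, processing_module):
        self.shooting_module = shooting_module            # corresponds to 810
        self.determination_module = determination_module  # corresponds to 820
        self.processing_module = processing_module        # corresponds to 830

    def on_trigger(self):
        """Run the whole pipeline in response to one trigger operation."""
        original = self.shooting_module()
        segment = self.determination_module(original)
        return self.processing_module(segment, original)


# Stand-in callables used only to show the data flow between the modules.
apparatus = VideoGenerationApparatus(
    shooting_module=lambda: "original",
    determination_module=lambda video: f"segment-of-{video}",
    processing_module=lambda segment, video: [segment, video],
)
print(apparatus.on_trigger())  # ['segment-of-original', 'original']
```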
在上述各技术方案的基础上,所述原始视频包括对用户执行设定任务的画面进行拍摄得到的视频;对应的,所述装置还包括:
显示模块,设置为响应于所述触发操作,在所述视频拍摄页面显示提示信息,以引导用户执行所述设定任务。
在上述各技术方案的基础上,所述显示模块设置为:
按照各设定任务的难易程度,依次将各设定任务的提示信息显示于所述视频拍摄页面。
在上述各技术方案的基础上,确定模块820设置为:
基于图像识别确定所述原始视频中的感兴趣视频片段。
在上述各技术方案的基础上,确定模块820包括:
识别记录单元,设置为基于表情识别模型对所述原始视频的图像帧进行表情识别,并记录包括设定表情的第一图像帧的时间戳以及各所述第一图像帧分别对应的表情得分;
获取单元,设置为将表情得分达到设定阈值的第一图像帧确定为第二图像帧;根据所述第二图像帧的时间戳获取感兴趣视频片段。
在上述各技术方案的基础上,所述获取单元设置为:
以所述第二图像帧的时间戳为参考时间点,在当前设定任务的历时时间区间内截取设定时长的视频作为所述感兴趣视频片段。
在上述各技术方案的基础上,处理模块830包括:
第一生成单元,设置为基于所述感兴趣视频片段生成片头视频数据和/或片尾视频数据;
第二生成单元,设置为基于所述原始视频生成片中视频数据;
拼接单元,设置为将所述片头视频数据和片尾视频数据两者的至少其中之一与所述片中视频数据拼接,生成目标视频。
在上述各技术方案的基础上,所述第一生成单元包括:
第一生成子单元,设置为基于所述感兴趣视频片段以及第一预设模板生成片头视频数据;
和/或,第二生成子单元,设置为基于所述原始视频以及第二预设模板生成片中视频数据;
和/或,第三生成子单元,设置为基于所述感兴趣视频片段以及第三预设模 板生成片尾视频数据。
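On the basis of the above technical solutions, the first generation subunit is configured to:
add the video segment of interest to a first set position of the first preset template, so that the video segment of interest is played at the first set position of the first preset template;
display introduction information of the set task and/or an identifier of the user at a second set position of the first preset template, so as to obtain the opening video data.
On the basis of the above technical solutions, the second generation subunit is configured to:
add the original video to a third set position of the second preset template, so that the original video is played at the third set position;
display a matching animation at a fourth set position of the second preset template according to how the user performs the set task, and/or display association information of the set task at a fifth set position of the second preset template according to the content of the set task, thereby generating the middle video data.
On the basis of the above technical solutions, the association information includes at least one of the following: content details of a single set task, a microphone, a countdown reminder indicator, and the game category to which the set task belongs.
On the basis of the above technical solutions, displaying a matching animation at the fourth set position of the second preset template according to how the user performs the set task includes at least one of the following:
when the user speaks a preset word, displaying an animation matching the preset word at the fourth set position;
when the user makes a set action, displaying an animation matching the set action at the fourth set position;
displaying, at the fourth set position of the second preset template, an animation matching the accuracy with which the user performs the set task.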
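The three triggers for the fourth-position animation could be dispatched roughly as in the sketch below. Everything concrete in it is hypothetical: the lookup tables, the animation names, the accuracy thresholds and the priority order (word, then action, then accuracy) are illustrative assumptions, since the text only lists the three cases as alternatives.

```python
from typing import Optional

# Hypothetical lookup tables; the real preset words, actions and animations are not specified.
WORD_ANIMATIONS = {"awesome": "fireworks"}
ACTION_ANIMATIONS = {"thumbs_up": "big_thumb"}


def accuracy_animation(accuracy: float) -> str:
    """Map task accuracy in [0, 1] to an animation name (thresholds are assumed)."""
    if accuracy >= 0.9:
        return "perfect"
    if accuracy >= 0.6:
        return "good"
    return "keep_trying"


def choose_animation(spoken_word: Optional[str] = None,
                     action: Optional[str] = None,
                     accuracy: Optional[float] = None) -> Optional[str]:
    """Pick the animation for the fourth set position, or None if nothing matches."""
    if spoken_word in WORD_ANIMATIONS:
        return WORD_ANIMATIONS[spoken_word]
    if action in ACTION_ANIMATIONS:
        return ACTION_ANIMATIONS[action]
    if accuracy is not None:
        return accuracy_animation(accuracy)
    return None
```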
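On the basis of the above technical solutions, the third generation subunit is configured to:
crop a face image from the video segment of interest;
add the face image to a sixth set position of the third preset template, so that the face image is displayed at the sixth set position;
display matching content at a seventh set position of the third preset template according to how the user performed the set task.
On the basis of the above technical solutions, the matching content includes at least one of the following: title information and praise information that match how the user performed the set task.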
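One way to picture the selection of matching content is a mapping from the share of completed tasks to a title and praise lines. The sketch below is purely illustrative: the thresholds, the titles other than "Small Achievement", and the function name `ending_content` are assumptions and are not taken from the disclosure.

```python
from typing import Dict, List, Union


def ending_content(completed: int, total: int) -> Dict[str, Union[str, List[str]]]:
    """Choose title and praise text for the seventh set position (thresholds assumed)."""
    if total <= 0 or not 0 <= completed <= total:
        raise ValueError("total must be positive and completed within 0..total")
    ratio = completed / total
    if ratio >= 0.8:
        title, praise = "Grand Master", ["Awesome!!!"]
    elif ratio >= 0.4:
        title, praise = "Small Achievement", ["Like", "Keep it up"]
    else:
        title, praise = "Newcomer", ["Keep it up"]
    return {"title": title, "praise": praise}
```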
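On the basis of the above technical solutions, the set task includes a tongue-twister challenge game and/or a question-and-answer game.
On the basis of the above technical solutions, the original video includes a portrait video, and the target video includes a landscape video.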
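Placing a portrait original inside a landscape target amounts to scaling the source to fit a rectangular slot of the template while keeping its aspect ratio. The sketch below shows that arithmetic under stated assumptions: the slot coordinates, the centering policy and the function name `fit_into_slot` are hypothetical, and the disclosure does not prescribe how the set position is sized.

```python
from typing import Tuple


def fit_into_slot(src_w: int, src_h: int,
                  slot_x: int, slot_y: int,
                  slot_w: int, slot_h: int) -> Tuple[int, int, int, int]:
    """Scale a source frame to fit a slot without distortion and center it.

    Returns (x, y, width, height) of the placed video in template coordinates.
    """
    scale = min(slot_w / src_w, slot_h / src_h)
    out_w = round(src_w * scale)
    out_h = round(src_h * scale)
    x = slot_x + (slot_w - out_w) // 2
    y = slot_y + (slot_h - out_h) // 2
    return x, y, out_w, out_h
```

For a 720x1280 portrait source and a slot covering a whole 1280x720 landscape frame, the video is scaled to 405x720 and centered horizontally, leaving the side areas free for template elements such as animations or task information.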
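In the technical solution of the embodiments of the present disclosure, a trigger operation acting on a video shooting page is received and an original video is shot in response to the trigger operation; a video segment of interest in the original video is determined; and video compositing is performed based on the video segment of interest and the original video to obtain a target video. This achieves automatic editing and compositing of the video and improves the video processing effect.
The video generation apparatus provided in the embodiments of the present disclosure can perform the video generation method provided in any embodiment of the present disclosure, and has the functional modules corresponding to the method.
It is worth noting that the units and modules included in the above apparatus are divided only according to functional logic; the division is not limited to the above, as long as the corresponding functions can be implemented. In addition, the names of the functional units are only for ease of distinguishing them from one another and are not intended to limit the scope of protection of the embodiments of the present disclosure.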
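Embodiment 5
Referring now to Figure 9, it shows a schematic structural diagram of an electronic device 400 (for example, the terminal device or server in Figure 9) suitable for implementing the embodiments of the present disclosure. The terminal device in the embodiments of the present disclosure may include, but is not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, personal digital assistants (PDA), tablet computers (PAD), portable media players (PMP) and vehicle-mounted terminals (for example, vehicle navigation terminals), as well as fixed terminals such as digital televisions (TV) and desktop computers. The electronic device shown in Figure 9 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present disclosure.
As shown in Figure 9, the electronic device 400 may include a processing device 401 (for example, a central processing unit or a graphics processing unit), which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 402 or a program loaded from a storage device 406 into a random access memory (RAM) 403. The RAM 403 also stores various programs and data required for the operation of the electronic device 400. The processing device 401, the ROM 402 and the RAM 403 are connected to one another through a bus 404. An input/output (I/O) interface 405 is also connected to the bus 404.
Generally, the following devices may be connected to the I/O interface 405: an input device 406 including, for example, a touch screen, a touchpad, a keyboard, a mouse, a camera, a microphone, an accelerometer and a gyroscope; an output device 407 including, for example, a liquid crystal display (LCD), a speaker and a vibrator; a storage device 406 including, for example, a magnetic tape and a hard disk; and a communication device 409. The communication device 409 may allow the electronic device 400 to communicate wirelessly or by wire with other devices to exchange data. Although Figure 9 shows an electronic device 400 having various devices, it should be understood that it is not required to implement or have all of the devices shown. More or fewer devices may alternatively be implemented or provided.
In particular, according to the embodiments of the present disclosure, the process described above with reference to the flowcharts may be implemented as a computer software program. For example, the embodiments of the present disclosure include a computer program product comprising a computer program carried on a non-transitory computer-readable medium, the computer program containing program code for performing the method shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network through the communication device 409, installed from the storage device 406, or installed from the ROM 402. When the computer program is executed by the processing device 401, the above functions defined in the method of the embodiments of the present disclosure are performed.
The terminal provided in this embodiment of the present disclosure and the video generation method provided in the above embodiments belong to the same inventive concept; for technical details not described exhaustively in this embodiment, refer to the above embodiments.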
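Embodiment 6
An embodiment of the present disclosure provides a computer storage medium on which a computer program is stored; when the program is executed by a processor, the video generation method provided in the above embodiments is implemented.
It should be noted that the computer-readable medium described above in the present disclosure may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection having at least one wire, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM) or flash memory, an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program, and the program may be used by or in combination with an instruction execution system, apparatus or device. In the present disclosure, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium; the computer-readable signal medium can send, propagate or transmit a program for use by or in combination with an instruction execution system, apparatus or device. The program code contained on the computer-readable medium may be transmitted by any appropriate medium, including but not limited to: electric wire, optical cable, radio frequency (RF), or any suitable combination of the above.
In some implementations, the client and the server may communicate using any currently known or future-developed network protocol, such as HTTP (HyperText Transfer Protocol), and may be interconnected with digital data communication in any form or medium (for example, a communication network). Examples of communication networks include a local area network (LAN), a wide area network (WAN), an internetwork (for example, the Internet) and a peer-to-peer network (for example, an ad hoc peer-to-peer network), as well as any currently known or future-developed network.
The above computer-readable medium may be included in the above electronic device, or it may exist separately without being assembled into the electronic device.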
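The above computer-readable medium carries at least one program, and when the at least one program is executed by the electronic device, the electronic device is caused to:
receive a trigger operation acting on a video shooting page, and shoot an original video in response to the trigger operation;
determine a video segment of interest in the original video;
perform video compositing based on the video segment of interest and the original video to obtain a target video.
Computer program code for performing the operations of the present disclosure may be written in at least one programming language or a combination thereof. The programming languages include, but are not limited to, object-oriented programming languages such as Java, Smalltalk and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).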
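The flowcharts and block diagrams in the figures illustrate the possible architecture, functions and operations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code that contains at least one executable instruction for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the figures. For example, two blocks shown in succession may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should further be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented in software or in hardware. In some cases, the name of a unit does not constitute a limitation on the unit itself; for example, an editable content display unit may also be described as an "editing unit".
The functions described above herein may be performed at least in part by at least one hardware logic component. For example, and without limitation, exemplary types of hardware logic components that may be used include: a field programmable gate array (FPGA), an application-specific integrated circuit (ASIC), an application-specific standard product (ASSP), a system on chip (SOC), a complex programmable logic device (CPLD), and so on.
In the context of the present disclosure, a machine-readable medium may be a tangible medium that can contain or store a program for use by or in combination with an instruction execution system, apparatus or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any suitable combination of the above. More specific examples of machine-readable storage media include an electrical connection based on at least one wire, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.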
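According to at least one embodiment of the present disclosure, [Example 1] provides a video generation method, the method including:
receiving a trigger operation acting on a video shooting page, and shooting an original video in response to the trigger operation;
determining a video segment of interest in the original video;
performing video compositing based on the video segment of interest and the original video to obtain a target video.
According to at least one embodiment of the present disclosure, [Example 2] provides a video generation method in which, optionally, the original video includes a video obtained by shooting the user performing a set task;
the method further includes:
in response to the trigger operation, displaying prompt information on the video shooting page to guide the user to perform the set task.
According to at least one embodiment of the present disclosure, [Example 3] provides a video generation method in which, optionally, determining the video segment of interest in the original video includes:
determining the video segment of interest in the original video based on image recognition.
According to at least one embodiment of the present disclosure, [Example 4] provides a video generation method in which, optionally, determining the video segment of interest in the original video based on image recognition includes:
performing expression recognition on image frames of the original video based on an expression recognition model, and recording the timestamps of first image frames that include a set expression and the expression score corresponding to each first image frame;
determining a first image frame whose expression score reaches a set threshold as a second image frame;
obtaining the video segment of interest according to the timestamp of the second image frame.
According to at least one embodiment of the present disclosure, [Example 5] provides a video generation method in which, optionally, obtaining the video segment of interest according to the timestamp of the second image frame includes:
taking the timestamp of the second image frame as a reference time point and, within the time interval spanned by the task corresponding to the second image frame, cutting a video of a set duration as the video segment of interest.
According to at least one embodiment of the present disclosure, [Example 6] provides a video generation method in which, optionally, performing video compositing based on the video segment of interest and the original video to obtain the target video includes:
generating opening video data and/or ending video data based on the video segment of interest;
generating middle video data based on the original video;
splicing at least one of the opening video data and the ending video data with the middle video data to generate the target video.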
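According to at least one embodiment of the present disclosure, [Example 7] provides a video generation method in which, optionally,
generating the opening video data based on the video segment of interest includes:
generating the opening video data based on the video segment of interest and a first preset template; and/or
generating the middle video data based on the original video includes:
generating the middle video data based on the original video and a second preset template; and/or
generating the ending video data based on the video segment of interest includes:
generating the ending video data based on the video segment of interest and a third preset template.
According to at least one embodiment of the present disclosure, [Example 8] provides a video generation method in which, optionally,
generating the opening video data based on the video segment of interest and the first preset template includes:
adding the video segment of interest to a first set position of the first preset template, so that the video segment of interest is played at the first set position of the first preset template;
displaying introduction information of the set task and/or identification information of the user at a second set position of the first preset template;
thereby generating the opening video data.
According to at least one embodiment of the present disclosure, [Example 9] provides a video generation method in which, optionally, generating the middle video data based on the original video and the second preset template includes:
adding the original video to a third set position of the second preset template, so that the original video is played at the third set position;
displaying a matching animation at a fourth set position of the second preset template according to how the user performs the set task, and/or displaying association information of the set task at a fifth set position of the second preset template according to the content of the set task;
thereby generating the middle video data.
According to at least one embodiment of the present disclosure, [Example 10] provides a video generation method in which, optionally, displaying a matching animation at the fourth set position of the second preset template according to how the user performs the set task includes at least one of the following:
when the user speaks a preset word, displaying an animation matching the preset word at the fourth set position;
when the user makes a set action, displaying an animation matching the set action at the fourth set position;
displaying, at the fourth set position, an animation matching the accuracy with which the user performs the set task.
According to at least one embodiment of the present disclosure, [Example 11] provides a video generation method in which, optionally, generating the ending video data based on the video segment of interest and the third preset template includes:
cropping a face image from the video segment of interest;
adding the face image to a sixth set position of the third preset template, so that the face image is displayed at the sixth set position;
displaying, at a seventh set position of the third preset template, matching content corresponding to the user's degree of task completion;
thereby generating the ending video data.
According to at least one embodiment of the present disclosure, [Example 12] provides a video generation method in which, optionally, the original video includes a portrait video and the target video includes a landscape video.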
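According to at least one embodiment of the present disclosure, [Example 13] provides a video generation apparatus, the apparatus including:
a shooting module, configured to receive a trigger operation acting on a video shooting page and to shoot an original video in response to the trigger operation;
a determination module, configured to determine a video segment of interest in the original video;
a processing module, configured to perform video compositing based on the video segment of interest and the original video to obtain a target video.
According to at least one embodiment of the present disclosure, [Example 14] provides an electronic device, the electronic device including:
at least one processor;
a storage device, configured to store at least one program,
wherein, when the at least one program is executed by the at least one processor, the at least one processor is caused to implement the following video generation method:
receiving a trigger operation acting on a video shooting page, and shooting an original video in response to the trigger operation;
determining a video segment of interest in the original video;
performing video compositing based on the video segment of interest and the original video to obtain a target video.
According to at least one embodiment of the present disclosure, [Example 15] provides a storage medium containing computer-executable instructions which, when executed by a computer processor, are used to perform the following video generation method:
receiving a trigger operation acting on a video shooting page, and shooting an original video in response to the trigger operation;
determining a video segment of interest in the original video;
performing video compositing based on the video segment of interest and the original video to obtain a target video.
Furthermore, although the operations are depicted in a particular order, this should not be understood as requiring that the operations be performed in the particular order shown or in sequential order. In certain circumstances, multitasking and parallel processing may be advantageous. Likewise, although several specific implementation details are included in the above discussion, they should not be construed as limiting the scope of the present disclosure. Certain features described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable sub-combination.
Although the subject matter has been described in language specific to structural features and/or methodological logical acts, it should be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are merely example forms of implementing the claims.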

Claims (15)

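  1. A video generation method, comprising:
    receiving a trigger operation acting on a video shooting page, and shooting an original video in response to the trigger operation;
    determining a video segment of interest in the original video;
    performing video compositing based on the video segment of interest and the original video to obtain a target video.
  2. The method according to claim 1, wherein the original video comprises a video obtained by shooting the user performing a set task;
    the method further comprises:
    in response to shooting the original video, displaying prompt information on the video shooting page to guide the user to perform the set task.
  3. The method according to claim 2, wherein determining the video segment of interest in the original video comprises:
    determining the video segment of interest in the original video based on image recognition.
  4. The method according to claim 3, wherein determining the video segment of interest in the original video based on image recognition comprises:
    performing expression recognition on image frames of the original video based on an expression recognition model, and recording the timestamps and corresponding expression scores of first image frames that include a set expression;
    determining a first image frame whose expression score reaches a set threshold as a second image frame;
    obtaining the video segment of interest according to the timestamp of the second image frame.
  5. The method according to claim 4, wherein obtaining the video segment of interest according to the timestamp of the second image frame comprises:
    taking the timestamp of the second image frame as a reference time point and, within the time interval spanned by the task corresponding to the second image frame, cutting a video of a set duration as the video segment of interest.
  6. The method according to claim 2, wherein performing video compositing based on the video segment of interest and the original video to obtain the target video comprises:
    generating at least one of opening video data and ending video data based on the video segment of interest;
    generating middle video data based on the original video;
    splicing at least one of the opening video data and the ending video data with the middle video data to generate the target video.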
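  7. The method according to claim 6, wherein the video generation method satisfies at least one of the following:
    generating the opening video data based on the video segment of interest comprises:
    generating the opening video data based on the video segment of interest and a first preset template;
    generating the middle video data based on the original video comprises:
    generating the middle video data based on the original video and a second preset template;
    generating the ending video data based on the video segment of interest comprises:
    generating the ending video data based on the video segment of interest and a third preset template.
  8. The method according to claim 7, wherein generating the opening video data based on the video segment of interest and the first preset template comprises:
    adding the video segment of interest to a first set position of the first preset template, so that the video segment of interest is played at the first set position;
    displaying at least one of identification information of the task and identification information of the user at a second set position of the first preset template;
    thereby generating the opening video data.
  9. The method according to claim 7, wherein generating the middle video data based on the original video and the second preset template comprises:
    adding the original video to a third set position of the second preset template, so that the original video is played at the third set position;
    performing at least one of: displaying a matching animation at a fourth set position of the second preset template according to how the user performs the set task; and displaying association information of the set task at a fifth set position of the second preset template according to the content of the set task;
    thereby generating the middle video data.
  10. The method according to claim 9, wherein displaying a matching animation at the fourth set position of the second preset template according to how the user performs the set task comprises at least one of the following:
    in a case where the user speaks a preset word, displaying an animation matching the preset word at the fourth set position;
    in a case where the user makes a set action, displaying an animation matching the set action at the fourth set position;
    displaying, at the fourth set position, an animation matching the accuracy with which the user performs the set task.
  11. The method according to claim 7, wherein generating the ending video data based on the video segment of interest and the third preset template comprises:
    cropping a target image from the video segment of interest;
    adding the target image to a sixth set position of the third preset template, so that the target image is displayed at the sixth set position;
    displaying, at a seventh set position of the third preset template, matching content corresponding to the user's degree of task completion;
    thereby generating the ending video data.
  12. The method according to any one of claims 1-11, wherein the original video comprises a portrait video and the target video comprises a landscape video.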
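  13. A video generation apparatus, comprising:
    a shooting module, configured to receive a trigger operation acting on a video shooting page, and to shoot an original video in response to the trigger operation;
    a determination module, configured to determine a video segment of interest in the original video;
    a processing module, configured to perform video compositing based on the video segment of interest and the original video to obtain a target video.
  14. An electronic device, comprising:
    at least one processor;
    a storage device, configured to store at least one program,
    wherein, when the at least one program is executed by the at least one processor, the at least one processor is caused to implement the video generation method according to any one of claims 1-12.
  15. A storage medium containing computer-executable instructions which, when executed by a computer processor, are used to perform the video generation method according to any one of claims 1-12.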
PCT/CN2022/086090 2021-01-27 2022-04-11 Video generation method and apparatus, electronic device, and storage medium Ceased WO2022214101A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP22784167.3A EP4322521A4 (en) 2021-01-27 2022-04-11 VIDEO GENERATING METHOD AND APPARATUS, ELECTRONIC DEVICE AND STORAGE MEDIUM
US18/483,289 US12592260B2 (en) 2021-01-27 2023-10-09 Video generation method and apparatus, electronic device, and storage medium

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
CN202110112638 2021-01-27
CN202110384712.8 2021-04-09
CN202110384712.8A CN113099129A (zh) 2021-01-27 2021-04-09 Video generation method and apparatus, electronic device, and storage medium

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/483,289 Continuation-In-Part US12592260B2 (en) 2021-01-27 2023-10-09 Video generation method and apparatus, electronic device, and storage medium

Publications (1)

Publication Number Publication Date
WO2022214101A1 (zh) 2022-10-13

Family

ID=76675987

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/086090 Ceased WO2022214101A1 (zh) 2021-01-27 2022-04-11 Video generation method and apparatus, electronic device, and storage medium

Country Status (4)

Country Link
US (1) US12592260B2 (zh)
EP (1) EP4322521A4 (zh)
CN (1) CN113099129A (zh)
WO (1) WO2022214101A1 (zh)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113099129A (zh) * 2021-01-27 2021-07-09 北京字跳网络技术有限公司 一种视频生成方法、装置、电子设备及存储介质
CN113870133B (zh) * 2021-09-27 2024-03-12 抖音视界有限公司 多媒体显示及匹配方法、装置、设备及介质
CN115550550B (zh) * 2022-09-20 2026-04-17 成都光合信号科技有限公司 拍摄与生成视频的方法及相关设备
CN116112743B (zh) * 2023-02-01 2025-09-19 北京有竹居网络技术有限公司 视频处理的方法、装置、设备和存储介质
CN120186287A (zh) * 2023-12-14 2025-06-20 荣耀终端股份有限公司 视频处理方法、电子设备、芯片系统及存储介质

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180122422A1 (en) * 2016-11-02 2018-05-03 Lr Acquisition, Llc Multimedia creation, production, and presentation based on sensor-driven events
CN108833934A (zh) * 2018-06-21 2018-11-16 广州酷狗计算机科技有限公司 获取视频数据的方法、服务器和系统
CN109714644A (zh) * 2019-01-22 2019-05-03 广州虎牙信息科技有限公司 一种视频数据的处理方法、装置、计算机设备和存储介质
CN111654619A (zh) * 2020-05-18 2020-09-11 成都市喜爱科技有限公司 智能拍摄方法、装置、服务器及存储介质
CN111988638A (zh) * 2020-08-19 2020-11-24 北京字节跳动网络技术有限公司 一种拼接视频的获取方法、装置、电子设备和存储介质
CN113099129A (zh) * 2021-01-27 2021-07-09 北京字跳网络技术有限公司 一种视频生成方法、装置、电子设备及存储介质

Family Cites Families (35)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8364633B2 (en) * 2005-01-12 2013-01-29 Wandisco, Inc. Distributed computing systems and system components thereof
US8121618B2 (en) * 2009-10-28 2012-02-21 Digimarc Corporation Intuitive computing methods and systems
WO2011146711A1 (en) * 2010-05-21 2011-11-24 Hsbc Technologies Inc. Account opening computer system architecture and process for implementing same
US20120052930A1 (en) * 2010-06-24 2012-03-01 Dr. Elliot McGucken System and method for the heros journey mythology code of honor video game engine and heros journey code of honor spy games wherein one must fake the enemy's ideology en route to winning
US9760123B2 (en) * 2010-08-06 2017-09-12 Dynavox Systems Llc Speech generation device with a projected display and optical inputs
CN103915106B (zh) * 2014-03-31 2017-01-11 宇龙计算机通信科技(深圳)有限公司 片头生成方法及生成系统
US20150356093A1 (en) * 2014-06-06 2015-12-10 Mohamad Abbas Methods and systems relating to ratings
CN105519091A (zh) * 2014-08-29 2016-04-20 深圳市大疆创新科技有限公司 用于摄影机的片头片尾自动生成方法和系统
US9632664B2 (en) * 2015-03-08 2017-04-25 Apple Inc. Devices, methods, and graphical user interfaces for manipulating user interface objects with visual and/or haptic feedback
US11956516B2 (en) * 2015-04-16 2024-04-09 W.S.C. Sports Technologies Ltd. System and method for creating and distributing multimedia content
US20160365124A1 (en) * 2015-06-11 2016-12-15 Yaron Galant Video editing method using participant sharing
KR101777242B1 (ko) * 2015-09-08 2017-09-11 네이버 주식회사 동영상 컨텐츠의 하이라이트 영상을 추출하여 제공하는 방법과 시스템 및 기록 매체
US20180132006A1 (en) * 2015-11-02 2018-05-10 Yaron Galant Highlight-based movie navigation, editing and sharing
US9609230B1 (en) * 2015-12-30 2017-03-28 Google Inc. Using a display as a light source
US20210019982A1 (en) * 2016-10-13 2021-01-21 Skreens Entertainment Technologies, Inc. Systems and methods for gesture recognition and interactive video assisted gambling
US10412139B2 (en) * 2017-05-26 2019-09-10 Streamsure Solutions Limited Communication event
US10740620B2 (en) * 2017-10-12 2020-08-11 Google Llc Generating a video segment of an action from a video
US10567707B2 (en) * 2017-10-13 2020-02-18 Blue Jeans Network, Inc. Methods and systems for management of continuous group presence using video conferencing
KR102045347B1 (ko) * 2018-03-09 2019-11-15 에스케이브로드밴드주식회사 영상제작지원장치 및 그 동작 방법
US11594028B2 (en) * 2018-05-18 2023-02-28 Stats Llc Video processing for enabling sports highlights generation
US20190373322A1 (en) * 2018-05-29 2019-12-05 Sony Interactive Entertainment LLC Interactive Video Content Delivery
US10650861B2 (en) * 2018-06-22 2020-05-12 Tildawatch, Inc. Video summarization and collaboration systems and methods
CN109168015B (zh) * 2018-09-30 2021-04-09 北京亿幕信息技术有限公司 一种云剪直播剪辑方法和系统
US11080532B2 (en) * 2019-01-16 2021-08-03 Mediatek Inc. Highlight processing method using human pose based triggering scheme and associated system
CN109819179B (zh) * 2019-03-21 2022-02-01 腾讯科技(深圳)有限公司 一种视频剪辑方法和装置
US11025964B2 (en) * 2019-04-02 2021-06-01 Wangsu Science & Technology Co., Ltd. Method, apparatus, server, and storage medium for generating live broadcast video of highlight collection
CN110191357A (zh) * 2019-06-28 2019-08-30 北京奇艺世纪科技有限公司 视频片段精彩度评估、动态封面生成方法及装置
CN110347872B (zh) * 2019-07-04 2023-10-24 腾讯科技(深圳)有限公司 视频封面图像提取方法及装置、存储介质及电子设备
US11343474B2 (en) * 2019-10-02 2022-05-24 Qualcomm Incorporated Image capture based on action recognition
US11154773B2 (en) * 2019-10-31 2021-10-26 Nvidia Corpration Game event recognition
US11170471B2 (en) * 2020-01-20 2021-11-09 Nvidia Corporation Resolution upscaling for event detection
CN111432290B (zh) * 2020-04-10 2022-04-19 深圳市乔安科技有限公司 基于音频调节的视频生成方法
CN111556363B (zh) * 2020-05-21 2021-09-28 腾讯科技(深圳)有限公司 视频特效处理方法、装置、设备及计算机可读存储介质
US11468915B2 (en) * 2020-10-01 2022-10-11 Nvidia Corporation Automatic video montage generation
WO2021077141A2 (en) * 2021-02-05 2021-04-22 Innopeak Technology, Inc. Highlight moment detection for slow-motion videos

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP4322521A4

Also Published As

Publication number Publication date
US20240038273A1 (en) 2024-02-01
EP4322521A1 (en) 2024-02-14
EP4322521A4 (en) 2024-08-14
US12592260B2 (en) 2026-03-31
CN113099129A (zh) 2021-07-09

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22784167

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2022784167

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2022784167

Country of ref document: EP

Effective date: 20231109

ENP Entry into the national phase

Ref document number: 2022784167

Country of ref document: EP

Effective date: 20231108