WO2023040743A1 - Video processing method, device, equipment, and storage medium - Google Patents
Video processing method, device, equipment, and storage medium
- Publication number
- WO2023040743A1 (PCT/CN2022/117803)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- target
- script
- multimedia
- sub
- video
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/81—Monomedia components thereof
- H04N21/816—Monomedia components thereof involving special video data, e.g 3D video
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/166—Editing, e.g. inserting or deleting
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B27/00—Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
- G11B27/10—Indexing; Addressing; Timing or synchronising; Measuring tape travel
- G11B27/34—Indicating arrangements
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B27/00—Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
- G11B27/02—Editing, e.g. varying the order of information signals recorded on, or reproduced from, record carriers
- G11B27/031—Electronic editing of digitised analogue information signals, e.g. audio or video signals
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B27/00—Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
- G11B27/10—Indexing; Addressing; Timing or synchronising; Measuring tape travel
- G11B27/11—Indexing; Addressing; Timing or synchronising; Measuring tape travel by using information not detectable on the record carrier
Definitions
- the present disclosure relates to the field of computer technology, and in particular to a video processing method, device, equipment and storage medium.
- embodiments of the present disclosure provide a video processing method, device, equipment, and storage medium.
- the present disclosure provides a video processing method, the method comprising:
- the material editing area of the video clip is displayed according to a first script structure; wherein the material editing area is divided into a plurality of sub-areas, one of the sub-areas corresponds to a script node in the first script structure, the first script structure is used to indicate the content paragraph structure of the target video, and one of the script nodes is used to indicate a content paragraph of the target video;
- in a target sub-area among the plurality of sub-areas, the target multimedia material is displayed according to the time axis track; wherein the target multimedia material is a multimedia material selected for a target script node, and the target script node is the script node in the first script structure corresponding to the target sub-area;
- the interface layout of the multiple sub-areas in the material editing area is vertically aligned.
- the multimedia segment in the multimedia material is edited.
- a multimedia segment corresponding to the text content is added at the position of the time axis in the multimedia material.
- in response to an order adjustment operation, the order of the multimedia materials in the sub-areas respectively corresponding to the second script node and the third script node in the material editing area is adjusted.
- the target multimedia material has an alternative multimedia material, and before generating the target video according to the multimedia material displayed in the material editing area, further includes:
- the target multimedia material displayed in the target sub-area is switched to the alternative multimedia material.
- the present disclosure also provides a video processing device, the device comprising:
- the first display module is used to display the material editing area of the video clip according to the first script structure; wherein the material editing area is divided into a plurality of sub-areas, one of the sub-areas corresponds to one script node in the first script structure, the first script structure is used to indicate the content paragraph structure of the target video, and one of the script nodes is used to indicate a content paragraph of the target video;
- the second display module is used to display the target multimedia material according to the time axis track in the target sub-area among the plurality of sub-areas; wherein the target multimedia material is a multimedia material selected for a target script node, and the target script node is the script node corresponding to the target sub-area in the first script structure;
- a generation module configured to generate the target video according to the multimedia materials displayed in the material editing area; wherein the target content paragraph of the target video is filled with the target multimedia material, and the target content paragraph corresponds to the target script node.
- the present disclosure provides a computer-readable storage medium, where instructions are stored in the computer-readable storage medium, and when the instructions are run on a terminal device, the terminal device is made to implement the above method.
- the present disclosure provides a device, including: a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the above method when executing the computer program.
- the present disclosure provides a computer program product, where the computer program product includes a computer program/instruction, and when the computer program/instruction is executed by a processor, the above method is implemented.
- An embodiment of the present disclosure provides a video processing method, which displays a material editing area of a video clip according to a first script structure, so that sub-areas in the material editing area correspond to script nodes in the first script structure.
- the multimedia material selected for the target script node corresponding to the target sub-area is displayed according to the time axis track, and then the target video is generated according to the multimedia material displayed in the material editing area.
- the embodiments of the present disclosure can realize video editing based on a material editing area including multiple sub-areas corresponding to script nodes, enrich video processing methods, and further meet people's diverse video editing needs.
- FIG. 1 is a schematic flowchart of a video processing method provided by an embodiment of the present disclosure.
- FIG. 2 is a schematic diagram of the relationship between a script node, a content paragraph, and a sub-area provided by an embodiment of the present disclosure.
- FIG. 3a is a schematic diagram of an alignment method of a material editing area provided by an embodiment of the present disclosure.
- FIG. 3b is a schematic diagram of another alignment method of a material editing area provided by an embodiment of the present disclosure.
- FIG. 4a is a schematic diagram of a material editing area and a first script structure provided by an embodiment of the present disclosure.
- FIG. 4b is a schematic diagram of another material editing area and a first script structure provided by an embodiment of the present disclosure.
- FIG. 4c is a schematic diagram of another material editing area and a first script structure provided by an embodiment of the present disclosure.
- FIG. 5 is a schematic diagram of the relationship between a target script node, a target sub-area, a target content paragraph, and a target multimedia material provided by an embodiment of the present disclosure.
- FIG. 6a is a schematic diagram showing a target multimedia material provided by an embodiment of the present disclosure.
- FIG. 6b is a schematic diagram showing another target multimedia material provided by an embodiment of the present disclosure.
- FIG. 7 is a schematic diagram of generating a target video provided by an embodiment of the present disclosure.
- FIG. 8 is a schematic diagram of switching between a target multimedia material and an alternative multimedia material according to an embodiment of the present disclosure.
- FIG. 9 is a schematic structural diagram of a video processing device provided by an embodiment of the present disclosure.
- FIG. 10 is a schematic structural diagram of a video processing device provided by an embodiment of the present disclosure.
- an embodiment of the present disclosure proposes a video processing method.
- according to the first script structure, the material editing area of the video clip is displayed; wherein the material editing area is divided into a plurality of sub-areas, a sub-area corresponds to a script node in the first script structure, the first script structure indicates the content paragraph structure of the target video, and a script node indicates a content paragraph of the target video;
- in a target sub-area among the plurality of sub-areas, the target multimedia material is displayed according to the time axis track; wherein the target multimedia material is a multimedia material selected for a target script node, and the target script node is the script node corresponding to the target sub-area in the first script structure;
- the target video is then generated according to the multimedia material displayed in the material editing area; wherein the target content paragraph of the target video is filled with the target multimedia material, and the target content paragraph corresponds to the target script node.
- the embodiment of the present disclosure displays the material editing area of the video clip according to the first script structure, so that the sub-areas in the material editing area correspond to the script nodes in the first script structure.
- the multimedia material selected for the target script node corresponding to the target sub-area is displayed according to the time axis track, and then the target video is generated according to the multimedia material displayed in the material editing area.
- the embodiments of the present disclosure can realize video editing based on a material editing area including multiple sub-areas corresponding to script nodes, enrich video processing methods, and further meet people's diverse video editing needs.
- FIG. 1 is a schematic flowchart of a video processing method provided by an embodiment of the present disclosure.
- the method can be executed by a video processing device, wherein the device can be implemented in software and/or hardware and can generally be integrated in electronic equipment.
- the method may include:
- Step 101: display the material editing area of the video clip according to the first script structure.
- the material editing area is divided into a plurality of sub-areas, one sub-area corresponds to a script node in the first script structure, the first script structure is used to indicate the content paragraph structure of the target video, and one script node is used to indicate a content paragraph of the target video.
- the script is the draft used in film and television creation. A script usually includes descriptions of multiple pictures, which guide the photographer in shooting and producing the corresponding film or television work.
- for example, the script includes a description of picture a, which indicates how the first shot is to be filmed, and a description of picture b, which indicates how the second shot is to be filmed, and so on.
- the photographer can shoot according to the description of picture a to obtain the first shot, which contains video clip A, and shoot according to the description of picture b to obtain the second shot, which contains video clip B. After the second shot is spliced after the first shot, the film or television work corresponding to the script is obtained.
- the first script structure may refer to the structure of the script above, such as the paragraph structure of the description content.
- for example, the description content of the first shot corresponds to the first script node in the first script structure, and the description content of the second shot corresponds to the second script node in the first script structure.
- the material editing area of the video clip is displayed according to the first script structure, and the multimedia material to be edited can be displayed in the material editing area.
- the material editing area may be divided according to the script nodes in the first script structure to obtain sub-areas corresponding to each script node. Wherein, each sub-area corresponds to a script node in the first script structure.
- the description may be made in conjunction with the content shown in FIG. 2 .
- the first script structure in FIG. 2 includes Q script nodes, where Q is a positive integer. Based on the first script structure including Q script nodes, the material editing area can be divided into Q sub-areas, and each script node corresponds to one sub-area within the material editing area.
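The one-to-one relationship between script nodes and sub-areas can be sketched in Python as follows. This is only an illustrative sketch: the names `ScriptNode`, `SubArea`, and `build_editing_area` are hypothetical and do not come from the disclosure, which does not prescribe any data structure.

```python
from dataclasses import dataclass, field


@dataclass
class ScriptNode:
    # Hypothetical representation: a node may carry a comment, a paragraph, or both.
    comment: str = ""    # e.g. "//opening remarks"
    paragraph: str = ""  # detailed text, e.g. obtained through speech recognition


@dataclass
class SubArea:
    node: ScriptNode
    # Materials shown on this sub-area's time axis track.
    materials: list = field(default_factory=list)


def build_editing_area(script_structure):
    """Create one sub-area per script node, preserving the script order."""
    return [SubArea(node=n) for n in script_structure]


script = [ScriptNode("//opening remarks"), ScriptNode("//environment introduction")]
area = build_editing_area(script)  # Q = 2 script nodes give Q = 2 sub-areas
```

Keeping the sub-areas in the same order as the script nodes lets later steps treat position in the list as position in the target video.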
- the vertical alignment as shown in FIG. 3a means that different rows are vertically aligned.
- each sub-area is aligned to the left, or each sub-area may also be aligned to the right.
- the horizontal alignment as shown in FIG. 3b means that different columns are aligned in the horizontal direction.
- each sub-region is aligned upward, or each sub-region can also be aligned downward.
- the script nodes in the first script structure may include: script comments and/or script paragraphs in the script, that is, the script node and the script comments and/or script paragraphs in the script have a corresponding relationship.
- the script annotation is used to generally represent the content of the multimedia material corresponding to the script node
- the script paragraph includes the detailed text content corresponding to the script node.
- the detailed text content included in the script paragraph may be text information obtained through speech recognition of the video.
- an example of a material editing area displaying a video clip according to the structure of the first script is as follows:
- Example 1: the script nodes in the first script structure include script comments. As shown in FIG. 4a, assume that the first script structure includes a first script comment and a second script comment, where the first script comment is "//opening remarks" and the second script comment is "//environment introduction".
- the first sub-area of the material editing area corresponds horizontally to "//opening remarks", and the second sub-area of the material editing area corresponds horizontally to "//environment introduction".
- Example 2: the script nodes in the first script structure include script paragraphs. As shown in FIG. 4b, assume that the first script structure includes a first script paragraph and a second script paragraph, where the first script paragraph is the opening text obtained through speech recognition and the second script paragraph is the environment introduction text obtained through speech recognition.
- in the material editing area displayed according to the first script structure, the first sub-area corresponds horizontally to the opening text script paragraph, and the second sub-area corresponds horizontally to the environment introduction text script paragraph.
- Example 3: the script nodes in the first script structure include script comments and script paragraphs. As shown in FIG. 4c, assume that the first script structure includes a first script node and a second script node. The first script node includes the first script comment, "//opening remarks", and the first script paragraph, which is the opening text obtained by speech recognition. The second script node includes the second script comment, "//environment introduction", and the second script paragraph, which is the environment introduction text obtained by speech recognition.
- in the material editing area displayed according to the first script structure, the first sub-area corresponds horizontally to the opening text script paragraph and the first script comment, and the second sub-area corresponds horizontally to the environment introduction text script paragraph and the second script comment.
- after the material editing area is displayed according to the first script structure, step 102 is performed.
- Step 102: in a target sub-area among the multiple sub-areas, display the target multimedia material according to the time axis track.
- the target multimedia material is a multimedia material selected for the target script node
- the target script node is a script node corresponding to the target sub-area in the first script structure.
- the target sub-area may be any one of the multiple sub-areas in the material editing area. The target sub-area has a corresponding target script node in the first script structure, and the corresponding target multimedia material can be selected based on the target script node.
- alternatively, speech recognition may be performed on the target multimedia material, and the speech recognition result is text-matched against each script node in the first script structure to determine the target script node corresponding to the target multimedia material. The target multimedia material is then displayed according to the time axis track in the target sub-area corresponding to the target script node.
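The text-matching step can be sketched as below, assuming the speech recognition result is already available as a string. The disclosure does not name a matching algorithm; this sketch swaps in the similarity ratio from Python's standard `difflib.SequenceMatcher` purely for illustration, and the function name `match_script_node` and the sample strings are hypothetical.

```python
from difflib import SequenceMatcher


def match_script_node(transcript, node_texts):
    """Return the index of the script node whose text is most similar to the transcript."""
    scores = [
        SequenceMatcher(None, transcript.lower(), text.lower()).ratio()
        for text in node_texts
    ]
    # Pick the node with the highest similarity score.
    return max(range(len(scores)), key=scores.__getitem__)


nodes = ["hello everyone and welcome to the show", "this is the room where we work"]
opening_idx = match_script_node("hello everyone welcome to our show", nodes)
environment_idx = match_script_node("the room where we work", nodes)
```

The returned index identifies the target script node, and therefore the target sub-area in which the material would be displayed.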
- the target multimedia material in the embodiments of the present disclosure may be the entire video obtained by shooting, or may be a segment of the entire video obtained by shooting, which is not limited in this embodiment.
- the target multimedia material is selected according to the target script node, and the target script node also corresponds to the target sub-area, so the target multimedia material to be displayed in the target sub-area can be determined.
- for example, assume the first sub-area in the material editing area is the target sub-area and its target script node is "//opening remarks". The method for selecting the multimedia material according to the target script node may include performing image recognition and/or speech recognition on the multimedia materials to be selected, determining the multimedia material with the highest matching degree with the "//opening remarks" script node as the target multimedia material, and displaying the target multimedia material in the first sub-area.
- as another example, the multimedia materials include a prologue video material. Speech recognition is performed on the prologue material to obtain the corresponding prologue text, and the prologue text is used as the target script node in the first script structure. The multimedia material with the highest matching degree is then selected according to the prologue text, for example the prologue video material, and the prologue video material is displayed according to the time axis track in the first sub-area.
- Step 103: generate a target video according to the multimedia material displayed in the material editing area.
- the target content paragraph of the target video is filled with the target multimedia material, and the target content paragraph corresponds to the target script node.
- the target video can be generated according to the multimedia materials displayed in the material editing area.
- the target video includes Q content paragraphs, where Q is a positive integer. Each content paragraph corresponds to a script node in the first script structure, and each content paragraph can be filled with the multimedia material selected for the script node corresponding to that paragraph. The multimedia material includes but is not limited to any one or more of video and audio.
- the first script structure is used to indicate the content paragraph structure of the target video
- a script node in the first script structure is used to indicate a content paragraph of the target video, that is, the content paragraph corresponding to the script node conforms to
- content paragraphs can be adjusted according to the script nodes in the first script structure, so as to generate a target video conforming to the first script structure.
- the first script structure includes Q script nodes, where Q is a positive integer, and each script node has a corresponding content paragraph. The correspondence among the target sub-area, the target content paragraph, and the target multimedia material can be determined according to the target script node. Each content paragraph is then filled with its corresponding target multimedia material, and the content paragraphs are spliced according to the first script structure to obtain the target video.
- for example, the first sub-area of the material editing area displays the opening video material, which includes n frames, and the second sub-area displays the environment introduction video material, which includes m frames, where n and m are positive integers. According to the first script structure, the first content paragraph of the target video is determined for the first sub-area and the second content paragraph of the target video is determined for the second sub-area. The n frames of opening video material fill the first content paragraph, the m frames of environment introduction video material fill the second content paragraph, and the target video is then generated.
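A minimal sketch of this fill-and-splice step is shown below, assuming each sub-area's material is simply a list of frames and that the sub-areas are already ordered like the script nodes. The function name `generate_target_video` and the frame labels are hypothetical; a real implementation would operate on encoded video rather than Python lists.

```python
def generate_target_video(sub_area_frames):
    """Splice the frames of every sub-area in script-node order.

    sub_area_frames: list of frame lists, one per sub-area, ordered
    like the script nodes of the first script structure.
    """
    target = []
    for frames in sub_area_frames:
        # Fill the content paragraph for this script node with its material.
        target.extend(frames)
    return target


opening = [f"opening_{i}" for i in range(3)]          # n = 3 frames
environment = [f"environment_{i}" for i in range(2)]  # m = 2 frames
video = generate_target_video([opening, environment])
```

With n = 3 and m = 2 the resulting target video holds n + m = 5 frames, with the opening paragraph first.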
- the video processing method of the embodiment of the present disclosure displays the material editing area of the video clip according to the first script structure, so that the sub-areas in the material editing area correspond to the script nodes in the first script structure.
- the multimedia material selected for the target script node corresponding to the target sub-area is displayed according to the time axis track, and then the target video is generated according to the multimedia material displayed in the material editing area.
- the embodiments of the present disclosure can realize video editing based on a material editing area including multiple sub-areas corresponding to script nodes, enrich video processing methods, and further meet people's diverse video editing needs.
- usually, a video work is generated from multiple sub-video clips.
- each sub-video needs to be edited according to its own time axis, and the sub-videos are spliced according to the time axis of the overall video.
- this timeline-based editing method is complex when the editing relates to language content, because the content of each picture frame on the sub-video time axis has to be compared repeatedly, so it cannot realize fast and convenient video editing.
- the clipping operation of the video can instead be realized based on the above-mentioned embodiments. Specifically, before the target video is generated according to the multimedia material displayed in the material editing area, corresponding operation steps can be added as required.
- the examples are as follows:
- the steps that need to be added before step 103 of the above-mentioned embodiment include:
- the first script node in the first script structure has a corresponding multimedia material
- the first script node is the text content corresponding to the multimedia material.
- there are many ways to obtain the text content, including text information obtained through speech recognition, manually configured subtitles, etc. The user can adjust the target text content in the first script node as required. In response to the adjustment, the multimedia material corresponding to the first script node is determined in the material editing area, and, in order to determine the content that needs to be adjusted, the multimedia segment corresponding to the target text content in the multimedia material is also determined.
- the multimedia segment in the multimedia material is edited.
- the editing includes but is not limited to: deletion, shifting, etc.
- for example, the target text content is "Good morning and noon" and the multimedia material is a greeting video. The text is aligned with the video character by character: the two characters of "morning" (literally "early" and "up") correspond to the first and second frames of the greeting video, the two characters of "noon" (literally "middle" and "noon") correspond to the third and fourth frames, and "good" corresponds to the fifth frame.
- if "morning" is a slip of the tongue and its segment needs to be removed from the target video, the target text content can be operated on directly: deleting "morning" from "Good morning and noon" also deletes the first and second frames of the corresponding greeting video.
- the target text content can be associated with the multimedia material through the time stamp, and the time stamp can establish a relationship between the text content and the time axis of the multimedia material.
- for example, the target text content is "morning" and the multimedia material is a greeting video. The correspondence between the text content and the multimedia material is: "morning" corresponds to seconds 0 to 1.5 of the multimedia material, "noon" corresponds to seconds 1.5 to 3, and "good" corresponds to seconds 3 to 4. Operating on the target text content to delete "morning" from "Good morning and noon" also deletes seconds 0 to 1.5 of the greeting video.
- the troublesome operation of manually positioning the text to be processed on the time axis of the multimedia material is avoided, and the efficiency and accuracy of video processing are improved.
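The time-stamp association can be sketched as follows, assuming each piece of text carries the start and end time (in seconds) of its segment in the multimedia material. The function name `delete_text` and the tuple layout are hypothetical, chosen only to mirror the greeting example.

```python
def delete_text(words, target):
    """Delete a piece of text and report which time ranges must be cut.

    words: list of (text, start_seconds, end_seconds) tuples.
    Returns (remaining words, list of removed (start, end) ranges).
    """
    kept = [w for w in words if w[0] != target]
    removed = [(start, end) for (text, start, end) in words if text == target]
    return kept, removed


words = [("morning", 0.0, 1.5), ("noon", 1.5, 3.0), ("good", 3.0, 4.0)]
kept, removed = delete_text(words, "morning")
remaining_text = [text for text, _, _ in kept]
```

Because the removed ranges come straight from the time stamps, the editor never has to locate the segment on the time axis by hand.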
- the steps that need to be added before step 103 of the above-mentioned embodiment include:
- the second script node is the text content corresponding to the multimedia material.
- there are many ways to obtain the text content, including text information obtained through speech recognition, manually configured subtitles, etc.
- the user can add text content at the target text position of the second script node as required. In response to the addition, the multimedia material corresponding to the second script node is determined in the material editing area, and, in order to determine where the multimedia segment needs to be added, the time axis position corresponding to the target text position in the multimedia material is also determined.
- a multimedia segment corresponding to the text content is added at the position of the time axis in the multimedia material.
- the multimedia material before the time axis position can be determined as the front multimedia material, and the multimedia material after the time axis position can be determined as the rear multimedia material. The adding operation can then connect the multimedia segment after the front multimedia material, and connect the rear multimedia material after the multimedia segment.
- for example, the second script node is "Hello everyone" and the multimedia material is a greeting video. The text is aligned with the video character by character: the two characters of "everyone" (literally "big" and "home") correspond to the first and second frames of the greeting video, and "good" corresponds to the third frame.
- "noon" needs to be added between "home" and "good".
- the video segment corresponding to "noon" includes the first frame and the second frame of the noon video, so the first and second frames of the noon video are connected after the second frame of the greeting video, and the third frame of the greeting video is connected after the second frame of the noon video.
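The front/segment/rear splicing can be sketched as below, again treating a material as a list of frames. The function name `insert_segment` and the frame labels are hypothetical; the position is counted as the number of frames that precede the insertion point.

```python
def insert_segment(material, position, segment):
    """Insert a multimedia segment at a time axis position.

    material: list of frames; position: number of frames before the
    insertion point; segment: list of frames to add.
    """
    front = material[:position]  # front multimedia material
    rear = material[position:]   # rear multimedia material
    return front + segment + rear


greeting = ["greeting_1", "greeting_2", "greeting_3"]
noon = ["noon_1", "noon_2"]
# "noon" is added after the second frame of the greeting video.
result = insert_segment(greeting, 2, noon)
```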
- when the multimedia material is edited, the script node corresponding to the multimedia material also changes accordingly.
- the steps that need to be added before step 103 of the above embodiment include:
- when the user performs an editing operation on the target multimedia segment in the first multimedia material in the material editing area, corresponding operations need to be performed on the first script structure in response. The script node corresponding to the first multimedia material is therefore determined, and the text content corresponding to the target multimedia segment in that script node is determined. The text content in the script node is then adjusted according to the clipping operation on the first multimedia material.
- for example, the correspondence between the text content in the script node and the greeting video is as follows: "early" corresponds to the first frame of the greeting video, "up" corresponds to the second frame, "middle" corresponds to the third frame, "noon" corresponds to the fourth frame, and "good" corresponds to the fifth frame.
- when the third and fourth frames of the greeting video are deleted, "middle" and "noon" are correspondingly deleted from the text content of the script node, and the script node after processing is "Good morning".
- the changes of the multimedia material and the corresponding script node are unified, and the consistency of the multimedia material and the script node is maintained.
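The reverse direction, where an edit to the material updates the script text, can be sketched as follows. The alignment list and the function name `delete_frames` are hypothetical; the text units are the literal per-character glosses used in the example.

```python
def delete_frames(aligned, frames_to_delete):
    """Remove frames and their aligned text units together.

    aligned: list of (text_unit, frame_number) pairs.
    """
    doomed = set(frames_to_delete)
    return [(text, frame) for (text, frame) in aligned if frame not in doomed]


aligned = [("early", 1), ("up", 2), ("middle", 3), ("noon", 4), ("good", 5)]
remaining = delete_frames(aligned, [3, 4])
text = [unit for unit, _ in remaining]  # the units that together read "Good morning"
```

Storing text and frames in one aligned structure is one simple way to keep the two consistent, since a single deletion updates both.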
- the order of the multimedia materials can be adjusted based on the first script structure. In this case, the steps that need to be added before step 103 of the above-mentioned embodiment include:
- when the user needs to adjust the order of the multimedia materials, the user can adjust the order of the second script node and the third script node in the first script structure. The second script node has a corresponding second sub-area, and the third script node has a corresponding third sub-area.
- the second sub-area and the third sub-area are determined in the material editing area, and the order of the second sub-area and the third sub-area is adjusted according to the user's adjustment to the script structure.
- the sequence of multimedia materials can be adjusted by adjusting the structure of the first script, which improves the efficiency of video processing, and also saves the step of manually viewing multimedia materials to determine the content of multimedia materials, making video processing more intuitive.
- the second script node in the first script structure is "//Introduction"
- the third script node is "//Environment Introduction Video"
- "//Introduction" is located after "//Environment Introduction Video"
- the prologue material in the corresponding material editing area is located after the environment introduction material.
- the user needs to move the prologue video to before the environment introduction video.
- "//Introduction" can be moved to before "//Environment Introduction Video".
- the prologue material in the material editing area is moved to before the environment introduction material.
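The reordering example above can be sketched as follows. The node labels come from the example; the function names `move_node_before` and `reorder_sub_areas` and the material file names are hypothetical, and the sketch assumes one sub-area per script node.

```python
def move_node_before(order, node, anchor):
    """Return a new script node order with `node` placed just before `anchor`."""
    if node == anchor:
        return list(order)
    reordered = [n for n in order if n != node]
    reordered.insert(reordered.index(anchor), node)
    return reordered


def reorder_sub_areas(sub_areas, node_order):
    """Lay out the sub-areas in the same order as their script nodes."""
    return [sub_areas[node] for node in node_order]


# Script structure before the adjustment: the introduction comes last.
script_order = ["//Environment Introduction Video", "//Introduction"]
# Each script node owns one sub-area holding its materials (hypothetical names).
sub_areas = {
    "//Environment Introduction Video": ["environment.mp4"],
    "//Introduction": ["prologue.mp4"],
}

new_order = move_node_before(
    script_order, "//Introduction", "//Environment Introduction Video"
)
layout = reorder_sub_areas(sub_areas, new_order)
# layout == [["prologue.mp4"], ["environment.mp4"]]
```

Because the sub-areas are looked up by script node, the user only moves text in the script structure and the material order follows from it.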
- to improve the quality of the generated target video, multiple videos of a similar type are often shot, so the target multimedia material has alternative multimedia materials, from which the one with the best effect can be selected. In this case, the steps that need to be added before step 103 of the above-mentioned embodiment include:
- the target multimedia material displayed in the target sub-area is switched to the alternative multimedia material.
- the alternative multimedia material can be set by the user, or it can be obtained by using image recognition or speech recognition technology to compare similarity with the target multimedia material.
- the user can switch the target multimedia material to the alternative multimedia material.
- the switching operation switches the target multimedia material displayed in the material editing area to the alternative multimedia material.
- the target script node corresponding to the target sub-area in the first script structure may also be adjusted to the text information corresponding to the alternative multimedia material.
- the switching operation makes it quick and convenient to select, from a plurality of multimedia materials, the one that best meets the user's needs, thereby improving the efficiency of video processing.
- the target sub-area is the first sub-area
- the target multimedia material in the first sub-area is the target prologue material
- the alternative multimedia materials are the first alternative prologue material and the second alternative prologue material.
- the material alternative area also includes an alternative display control, which displays the alternative multimedia materials of the first sub-area in response to the user's touch operation. In this example, the user touches the alternative display control and taps the second alternative prologue material to switch it with the target multimedia material, after which the second alternative prologue material is displayed in the first sub-area.
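The switching operation in this example can be sketched as follows. The `SubArea` class and its `switch_to` method are hypothetical names, and the sketch assumes that the previously displayed material becomes an alternative after the switch, which the text suggests but does not state outright.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SubArea:
    target: str                                      # material currently displayed
    alternatives: list[str] = field(default_factory=list)

    def switch_to(self, alternative: str) -> None:
        """Display `alternative` and keep the old target as an alternative."""
        if alternative not in self.alternatives:
            raise ValueError(f"{alternative!r} is not an alternative material")
        slot = self.alternatives.index(alternative)
        # Swap in place so the alternative list keeps its length and order.
        self.alternatives[slot], self.target = self.target, alternative


first_sub_area = SubArea(
    target="target prologue material",
    alternatives=[
        "first alternative prologue material",
        "second alternative prologue material",
    ],
)
first_sub_area.switch_to("second alternative prologue material")
# The first sub-area now displays the second alternative prologue material.
```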
- the video processing method of the embodiment of the present disclosure can intuitively and conveniently adjust the target video and/or the first script structure based on the correspondence between sub-areas, content paragraphs, and multimedia materials established by the first script structure, and at the same time reduce the complexity of editing and processing videos with language content or plot as the core, and improve video processing efficiency.
- the present disclosure also provides a video processing device.
- FIG. 9 is a schematic structural diagram of a video processing device provided by an embodiment of the present disclosure, and the device includes:
- the first display module 901 is configured to display the material editing area of the video clip according to the first script structure; wherein, the material editing area is divided into a plurality of sub-areas, and one of the sub-areas corresponds to one script node in the first script structure.
- the first script structure is used to indicate the content paragraph structure of the target video
- one script node is used to indicate a content paragraph of the target video;
- the second display module 902 is configured to display the target multimedia material according to the time axis track in the target sub-area of the plurality of sub-areas; wherein, the target multimedia material is a multimedia material selected for the target script node, and the target The script node is a script node corresponding to the target sub-area in the first script structure;
- the generating module 903 is configured to generate the target video according to the multimedia material displayed in the material editing area; wherein, the target content paragraph of the target video is filled with the target multimedia material, and the target content paragraph corresponds to the target script node.
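The three modules above can be read as a small pipeline. The following is a minimal sketch under stated assumptions, not the device's implementation: the function names, the use of a dictionary for the material editing area, and the representation of the target video as an ordered list of (script node, material) pairs are all hypothetical.

```python
def display_editing_area(script_structure):
    """Role of module 901: create one empty sub-area per script node."""
    return {node: [] for node in script_structure}


def display_material(editing_area, target_node, material):
    """Role of module 902: place a material on the track of the target sub-area."""
    editing_area[target_node].append(material)


def generate_target_video(script_structure, editing_area):
    """Role of module 903: fill each content paragraph with its sub-area's materials."""
    return [
        (node, material)
        for node in script_structure          # paragraph order follows the script
        for material in editing_area[node]    # track order within the sub-area
    ]


script_structure = ["//Introduction", "//Environment Introduction Video"]
area = display_editing_area(script_structure)
display_material(area, "//Environment Introduction Video", "environment.mp4")
display_material(area, "//Introduction", "prologue.mp4")
video = generate_target_video(script_structure, area)
```

Note that the order in which materials are added does not matter; the generated sequence follows the script structure.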
- the interface layout of the multiple sub-areas in the material editing area is vertically aligned.
- the device further includes:
- a first determining module configured to determine the multimedia material corresponding to the first script node in the material editing area in response to an adjustment operation on the target text content of the first script node in the first script structure, and determining a multimedia segment corresponding to the target text content in the multimedia material;
- the clipping module is configured to clip the multimedia segment in the multimedia material according to the adjustment operation.
- the device further includes:
- the second determining module is configured to determine, in the material editing area, the multimedia material corresponding to the second script node in response to the operation of adding text content at the target text position of the second script node in the first script structure, and to determine a time axis position corresponding to the target text position in the multimedia material;
- An adding module configured to add a multimedia segment corresponding to the text content at the position of the time axis in the multimedia material according to the operation of adding text content.
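The mapping from a text position to a time axis position, followed by the insertion of a new segment, can be sketched as follows. This is an illustration only: the `(text, start_ms, end_ms)` token layout, the function names, and the choice to snap a mid-token position to the next token boundary are assumptions, as are the sample words and durations.

```python
def timeline_position(tokens, char_offset):
    """Map a character offset in the script text to a time on the track (ms).

    `tokens` is a list of (text, start_ms, end_ms) tuples in playback order.
    An offset inside a token snaps to the start of the following token.
    """
    consumed = 0
    for text, start, _end in tokens:
        if char_offset <= consumed:
            return start
        consumed += len(text)
    return tokens[-1][2] if tokens else 0


def insert_segment(tokens, char_offset, new_text, duration_ms):
    """Insert a segment for `new_text` and shift every later token by its duration."""
    at = timeline_position(tokens, char_offset)
    before = [t for t in tokens if t[1] < at]
    after = [
        (text, start + duration_ms, end + duration_ms)
        for text, start, end in tokens
        if start >= at
    ]
    return before + [(new_text, at, at + duration_ms)] + after


tokens = [("Hello ", 0, 500), ("world", 500, 1200)]
# Adding "big " at character offset 6 lands at 500 ms on the time axis.
updated = insert_segment(tokens, 6, "big ", 300)
```

Times are kept as integer milliseconds so that shifted boundaries stay exact.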
- the device further includes:
- a third determining module configured to determine a script node corresponding to the first multimedia material, and to determine the text content corresponding to the target multimedia segment in the script node, in response to an editing operation on a target multimedia segment of the first multimedia material in the material editing area;
- the first adjustment module is configured to adjust the text content in the script node according to the editing operation.
- the device further includes:
- a fourth determining module configured to determine, in the material clipping area, the sub-areas respectively corresponding to the second script node and the third script node, in response to the order adjustment operation between the second script node and the third script node in the first script structure;
- the second adjustment module is configured to adjust the order of the multimedia materials in the sub-areas respectively corresponding to the second script node and the third script node in the material editing area according to the order adjustment operation.
- the target multimedia material has an alternative multimedia material
- the device further includes:
- a switching module configured to switch the target multimedia material displayed in the target sub-area to the alternative multimedia material in response to a switching operation on the target multimedia material and the alternative multimedia material in the target sub-area.
- the material editing area of the video clip is displayed according to the first script structure, so that the sub-areas in the material editing area correspond to the script nodes in the first script structure.
- the multimedia material selected for the target script node corresponding to the target sub-area is displayed according to the time axis track, and then the target video is generated according to the multimedia material displayed in the material editing area.
- an embodiment of the present disclosure also provides a computer-readable storage medium, where instructions are stored in the computer-readable storage medium, and when the instructions are run on a terminal device, the terminal device implements the video processing method described in the embodiments of the present disclosure.
- the embodiment of the present disclosure also provides a computer program product, the computer program product includes a computer program/instruction, and when the computer program/instruction is executed by a processor, the video processing method described in the embodiment of the present disclosure is implemented.
- an embodiment of the present disclosure also provides a video processing device, as shown in FIG. 10 , which may include:
- a processor 1001, a memory 1002, an input device 1003, and an output device 1004. The number of processors 1001 in the video processing device may be one or more, and one processor is taken as an example in FIG. 10.
- the processor 1001, the memory 1002, the input device 1003, and the output device 1004 may be connected through a bus or in other ways; connection through a bus is taken as an example in FIG. 10.
- the memory 1002 can be used to store software programs and modules, and the processor 1001 executes various functional applications and data processing of the video processing device by running the software programs and modules stored in the memory 1002.
- the memory 1002 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required by at least one function, and the like.
- the memory 1002 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid-state storage devices.
- the input device 1003 can be used to receive input numbers or character information, and generate signal input related to user settings and function control of the video processing device.
- the processor 1001 loads the executable files corresponding to the processes of one or more application programs into the memory 1002 according to the following instructions, and the processor 1001 runs the application programs stored in the memory 1002, so as to realize various functions of the above-mentioned video processing device.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Signal Processing (AREA)
- Computational Linguistics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Artificial Intelligence (AREA)
- General Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Television Signal Processing For Recording (AREA)
- Management Or Editing Of Information On Record Carriers (AREA)
- Studio Circuits (AREA)
Claims (11)
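- A video processing method, characterized in that the method comprises: displaying a material editing area of a video clip according to a first script structure; wherein the material editing area is divided into a plurality of sub-areas, one sub-area corresponds to one script node in the first script structure, the first script structure is used to indicate a content paragraph structure of a target video, and one script node is used to indicate one content paragraph of the target video; displaying, in a target sub-area of the plurality of sub-areas, a target multimedia material according to a time axis track; wherein the target multimedia material is a multimedia material selected for a target script node, and the target script node is the script node in the first script structure corresponding to the target sub-area; and generating the target video according to the multimedia material displayed in the material editing area; wherein a target content paragraph of the target video is filled with the target multimedia material, and the target content paragraph corresponds to the target script node.
- The method according to claim 1, characterized in that the interface layout of the plurality of sub-areas in the material editing area is a vertically aligned arrangement.
- The method according to claim 1, characterized in that, before generating the target video according to the multimedia material displayed in the material editing area, the method further comprises: in response to an adjustment operation on target text content of a first script node in the first script structure, determining, in the material editing area, the multimedia material corresponding to the first script node, and determining the multimedia segment in the multimedia material corresponding to the target text content; and clipping the multimedia segment in the multimedia material according to the adjustment operation.
- The method according to claim 1, characterized in that, before generating the target video according to the multimedia material displayed in the material editing area, the method further comprises: in response to an operation of adding text content at a target text position of a second script node in the first script structure, determining, in the material editing area, the multimedia material corresponding to the second script node, and determining the time axis position in the multimedia material corresponding to the target text position; and adding, according to the operation of adding text content, a multimedia segment corresponding to the text content at the time axis position in the multimedia material.
- The method according to claim 1, characterized in that, before generating the target video according to the multimedia material displayed in the material editing area, the method further comprises: in response to a clipping operation on a target multimedia segment of a first multimedia material in the material editing area, determining the script node corresponding to the first multimedia material, and determining the text content in the script node corresponding to the target multimedia segment; and adjusting the text content in the script node according to the clipping operation.
- The method according to claim 1, characterized in that, before generating the target video according to the multimedia material displayed in the material editing area, the method further comprises: in response to an order adjustment operation between a second script node and a third script node in the first script structure, determining, in the material clipping area, the sub-areas respectively corresponding to the second script node and the third script node; and adjusting, according to the order adjustment operation, the order of the multimedia materials in the sub-areas of the material clipping area respectively corresponding to the second script node and the third script node.
- The method according to claim 1, characterized in that the target multimedia material has an alternative multimedia material, and before generating the target video according to the multimedia material displayed in the material editing area, the method further comprises: in response to a switching operation on the target multimedia material and the alternative multimedia material in the target sub-area, switching the target multimedia material displayed in the target sub-area to the alternative multimedia material.
- A video processing apparatus, characterized in that the apparatus comprises: a first display module, configured to display a material editing area of a video clip according to a first script structure; wherein the material editing area is divided into a plurality of sub-areas, one sub-area corresponds to one script node in the first script structure, the first script structure is used to indicate a content paragraph structure of a target video, and one script node is used to indicate one content paragraph of the target video; a second display module, configured to display, in a target sub-area of the plurality of sub-areas, a target multimedia material according to a time axis track; wherein the target multimedia material is a multimedia material selected for a target script node, and the target script node is the script node in the first script structure corresponding to the target sub-area; and a generating module, configured to generate the target video according to the multimedia material displayed in the material editing area; wherein a target content paragraph of the target video is filled with the target multimedia material, and the target content paragraph corresponds to the target script node.
- A computer-readable storage medium, characterized in that instructions are stored in the computer-readable storage medium, and when the instructions are run on a terminal device, the terminal device is caused to implement the method according to any one of claims 1-7.
- A device, characterized by comprising: a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the method according to any one of claims 1-7.
- A computer program product, characterized in that the computer program product comprises a computer program/instructions, and the computer program/instructions, when executed by a processor, implement the method according to any one of claims 1-7.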
Priority Applications (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| EP22869112.7A EP4340372A4 (en) | 2021-09-15 | 2022-09-08 | VIDEO PROCESSING METHOD, APPARATUS, AND DEVICE, AND STORAGE MEDIUM |
| JP2023577720A JP7822405B2 (ja) | 2021-09-15 | 2022-09-08 | Video processing method, video processing apparatus, device, storage medium, and computer program |
| US18/536,092 US12192594B2 (en) | 2021-09-15 | 2023-12-11 | Method, apparatus, device, and storage medium of video processing |
| US18/970,775 US20250097546A1 (en) | 2021-09-15 | 2024-12-05 | Method, apparatus, device, and storage medium of video processing |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202111081785.6A CN115811632B (zh) | 2021-09-15 | 2021-09-15 | Video processing method, apparatus, device, and storage medium |
| CN202111081785.6 | 2021-09-15 | | |
Related Child Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US18/536,092 Continuation US12192594B2 (en) | 2021-09-15 | 2023-12-11 | Method, apparatus, device, and storage medium of video processing |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2023040743A1 (zh) | 2023-03-23 |
Family
ID=85481875
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2022/117803 Ceased WO2023040743A1 (zh) | 2021-09-15 | 2022-09-08 | Video processing method, apparatus, device, and storage medium |
Country Status (5)
| Country | Link |
|---|---|
| US (2) | US12192594B2 (zh) |
| EP (1) | EP4340372A4 (zh) |
| JP (1) | JP7822405B2 (zh) |
| CN (1) | CN115811632B (zh) |
| WO (1) | WO2023040743A1 (zh) |
Families Citing this family (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN116647714A (zh) * | 2023-05-31 | 2023-08-25 | 北京达佳互联信息技术有限公司 | Video generation method and apparatus, electronic device, and storage medium |
| CN117009574B (zh) * | 2023-07-20 | 2024-05-28 | 天翼爱音乐文化科技有限公司 | Method, system, device, and storage medium for generating hot-topic video templates |
| EP4525459A4 (en) * | 2023-07-26 | 2025-07-09 | Beijing Zitiao Network Technology Co Ltd | VIDEO EDITING METHOD AND APPARATUS, AND DEVICE AND MEDIUM |
| CN120881334A (zh) * | 2024-04-29 | 2025-10-31 | 北京字跳网络技术有限公司 | Video editing method, apparatus, device, and storage medium |
Citations (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20100322589A1 (en) * | 2007-06-29 | 2010-12-23 | Russell Henderson | Non sequential automated production by self-interview kit of a video based on user generated multimedia content |
| CN108259965A (zh) * | 2018-03-31 | 2018-07-06 | 湖南广播电视台广播传媒中心 | Video editing method and editing system |
| CN109756751A (zh) * | 2017-11-07 | 2019-05-14 | 腾讯科技(深圳)有限公司 | Multimedia data processing method and apparatus, electronic device, and storage medium |
| CN109889882A (zh) * | 2019-01-24 | 2019-06-14 | 北京亿幕信息技术有限公司 | Video clip synthesis method and system |
| CN111711855A (zh) * | 2020-05-27 | 2020-09-25 | 北京奇艺世纪科技有限公司 | Video generation method and apparatus |
| CN112040142A (zh) * | 2020-07-08 | 2020-12-04 | 智者四海(北京)技术有限公司 | Method for video creation on a mobile terminal |
| CN112579826A (zh) * | 2020-12-07 | 2021-03-30 | 北京字节跳动网络技术有限公司 | Video display and processing method, apparatus, system, device, and medium |
Family Cites Families (15)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20050235198A1 (en) | 2004-04-16 | 2005-10-20 | Howard Johnathon E | Editing system for audiovisual works and corresponding text for television news |
| JP2006054517A (ja) | 2004-08-09 | 2006-02-23 | Bank Of Tokyo-Mitsubishi Ltd | Information presentation apparatus, method, and program |
| US7512537B2 (en) * | 2005-03-22 | 2009-03-31 | Microsoft Corporation | NLP tool to dynamically create movies/animated scenes |
| JP2007052626A (ja) | 2005-08-18 | 2007-03-01 | Matsushita Electric Ind Co Ltd | Metadata input device and content processing device |
| JP2009507453A (ja) | 2005-09-07 | 2009-02-19 | ポータルビデオ・インコーポレーテッド | Time estimation of text position in video editing method and apparatus |
| JP2007336283A (ja) | 2006-06-15 | 2007-12-27 | Toshiba Corp | Information processing apparatus, information processing method, and information processing program |
| US20100153520A1 (en) * | 2008-12-16 | 2010-06-17 | Michael Daun | Methods, systems, and media for creating, producing, and distributing video templates and video clips |
| CA2787380C (en) | 2010-01-26 | 2017-05-09 | Francois Beaumier | Digital jukebox device with improved user interfaces, and associated methods |
| US10140259B2 (en) * | 2016-04-28 | 2018-11-27 | Wipro Limited | Method and system for dynamically generating multimedia content file |
| JP7086331B2 (ja) * | 2018-04-16 | 2022-06-20 | 株式会社Nhkテクノロジーズ | Digest video generation device and digest video generation program |
| US20200126583A1 (en) * | 2018-10-19 | 2020-04-23 | Reduct, Inc. | Discovering highlights in transcribed source material for rapid multimedia production |
| KR101994592B1 (ko) | 2018-10-19 | 2019-06-28 | 인하대학교 산학협력단 | Method and system for automatically generating metadata of video content |
| US11049525B2 (en) * | 2019-02-21 | 2021-06-29 | Adobe Inc. | Transcript-based insertion of secondary video content into primary video content |
| US11126856B2 (en) * | 2019-10-11 | 2021-09-21 | Adobe Inc. | Contextualized video segment selection for video-filled text |
| CN113364999B (zh) * | 2021-05-31 | 2022-12-27 | 北京达佳互联信息技术有限公司 | Video generation method and apparatus, electronic device, and storage medium |
- 2021-09-15: CN application CN202111081785.6A, patent CN115811632B (zh), status Active
- 2022-09-08: JP application JP2023577720A, patent JP7822405B2 (ja), status Active
- 2022-09-08: EP application EP22869112.7A, publication EP4340372A4 (en), status Pending
- 2022-09-08: WO application PCT/CN2022/117803, publication WO2023040743A1 (zh), status Ceased
- 2023-12-11: US application US18/536,092, patent US12192594B2 (en), status Active
- 2024-12-05: US application US18/970,775, publication US20250097546A1 (en), status Pending
Non-Patent Citations (1)
| Title |
|---|
| See also references of EP4340372A4 * |
Also Published As
| Publication number | Publication date |
|---|---|
| JP2024521502A (ja) | 2024-05-31 |
| US20240114216A1 (en) | 2024-04-04 |
| EP4340372A1 (en) | 2024-03-20 |
| JP7822405B2 (ja) | 2026-03-02 |
| US12192594B2 (en) | 2025-01-07 |
| CN115811632B (zh) | 2025-07-15 |
| CN115811632A (zh) | 2023-03-17 |
| EP4340372A4 (en) | 2024-10-16 |
| US20250097546A1 (en) | 2025-03-20 |
Similar Documents
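| Publication | Title | Publication Date |
|---|---|---|
| WO2023040743A1 (zh) | Video processing method, apparatus, device, and storage medium | |
| CN110928468B (zh) | Page display method, apparatus, device, and storage medium for an intelligent interactive tablet | |
| CN101453567B (zh) | Apparatus and method for capturing and editing moving images | |
| CN110928460B (zh) | Operation method, apparatus, terminal device, and storage medium for an intelligent interactive tablet | |
| US8205159B2 (en) | System, method and medium organizing templates for generating moving images | |
| KR102590100B1 (ko) | Video processing method and apparatus, device, and storage medium | |
| CN108920057B (zh) | Connection node control method, apparatus, device, and storage medium for an electronic whiteboard | |
| US11941728B2 (en) | Previewing method and apparatus for effect application, and device, and storage medium | |
| US12154596B2 (en) | Video editing method and apparatus | |
| WO2023104078A1 (zh) | Method, apparatus, device, and storage medium for generating a video editing template | |
| CN112584208B (zh) | Artificial-intelligence-based video browsing and editing method and system | |
| JP2024502754A (ja) | Method, apparatus, device, and medium for generating simulated shooting special effects | |
| CN116916092A (zh) | Video processing method, apparatus, electronic device, and storage medium | |
| CN110880197B (zh) | Information processing apparatus, storage medium, and information processing method | |
| JP4129162B2 (ja) | Content creation demonstration system and content creation demonstration method | |
| WO2022194070A1 (zh) | Video processing method for an application and electronic device | |
| CN115202543B (zh) | Method, apparatus, device, and storage medium for generating and switching a book-style navigation bar | |
| US12155926B2 (en) | Video generation method and apparatus for guiding users to take high-quality videos | |
| EP4525459A1 (en) | Video editing method and apparatus, and device and medium | |
| CN121334419A (zh) | Video generation method, apparatus, device, medium, and program product | |
| WO2025056071A1 (zh) | Multimedia resource processing method, apparatus, device, and storage medium | |
| CN121126086A (zh) | Video generation method, apparatus, device, and storage medium | |
| CN119893160A (zh) | Multimedia display method, electronic device, and program product | |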
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 22869112; Country of ref document: EP; Kind code of ref document: A1 |
| | WWE | Wipo information: entry into national phase | Ref document number: 2022869112; Country of ref document: EP |
| | ENP | Entry into the national phase | Ref document number: 2023577720; Country of ref document: JP; Kind code of ref document: A |
| | ENP | Entry into the national phase | Ref document number: 2022869112; Country of ref document: EP; Effective date: 20231213 |
| | NENP | Non-entry into the national phase | Ref country code: DE |