WO2020020241A1 - Video processing method and device - Google Patents

Video processing method and device

Info

Publication number
WO2020020241A1
Authority
WO
WIPO (PCT)
Prior art keywords
video
image
weight
user
shortened
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2019/097527
Other languages
English (en)
French (fr)
Inventor
王君富
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd, Beijing Jingdong Shangke Information Technology Co Ltd
Priority to US17/263,425 (patent US11445272B2)
Priority to EP19841881.6A (patent EP3826312A4)
Publication of WO2020020241A1
Anticipated expiration
Ceased

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85Assembly of content; Generation of multimedia applications
    • H04N21/854Content authoring
    • H04N21/8549Creating video summaries, e.g. movie trailer
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/238Interfacing the downstream path of the transmission network, e.g. adapting the transmission rate of a video stream to network bandwidth; Processing of multiplex streams
    • H04N21/2387Stream processing in response to a playback request from an end-user, e.g. for trick-play
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/73Querying
    • G06F16/735Filtering based on additional data, e.g. user or group profiles
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/7867Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using information manually generated, e.g. tags, keywords, comments, title and artist information, manually generated time, location and usage information, user ratings
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/235Processing of additional data, e.g. scrambling of additional data or processing content descriptors
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/4302Content synchronisation processes, e.g. decoder synchronisation
    • H04N21/4307Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/4302Content synchronisation processes, e.g. decoder synchronisation
    • H04N21/4307Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen
    • H04N21/43074Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen of additional data with content streams on the same device, e.g. of EPG data or interactive icon with a TV programme
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/44008Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/442Monitoring of processes or resources, e.g. detecting the failure of a recording device, monitoring the downstream bandwidth, the number of times a movie has been viewed, the storage space available from the internal hard disk
    • H04N21/44213Monitoring of end-user related data
    • H04N21/44222Analytics of user selections, e.g. selection of programmes or purchase activity
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/45Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
    • H04N21/4508Management of client data or end-user data
    • H04N21/4532Management of client data or end-user data involving end-user characteristics, e.g. viewer profile, preferences
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/45Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
    • H04N21/466Learning process for intelligent management, e.g. learning user preferences for recommending movies
    • H04N21/4661Deriving a combined profile for a plurality of end-users of the same client, e.g. for family members within a home
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/45Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
    • H04N21/466Learning process for intelligent management, e.g. learning user preferences for recommending movies
    • H04N21/4667Processing of monitored end-user data, e.g. trend analysis based on the log file of viewer selections
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/488Data services, e.g. news ticker
    • H04N21/4884Data services, e.g. news ticker for displaying subtitles
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client 
    • H04N21/65Transmission of management data between client and server
    • H04N21/658Transmission by the client directed to the server
    • H04N21/6587Control parameters, e.g. trick play commands, viewpoint selection

Definitions

  • the present disclosure relates to the field of multimedia technology, and in particular, to a video processing method and device.
  • Variable-speed playback scheme: the user can select the playback speed on the display interface of the video player to achieve fast playback.
  • a video processing method, which includes: semantically analyzing the lines of the video to obtain condensed lines; determining the weight of each frame image in the video based on a predetermined image processing strategy; extracting image frames in descending order of the weight of each frame image to obtain a shortened version of the video image; and obtaining a shortened video based on the shortened version of the video image and the condensed lines.
  • obtaining the shortened video includes: determining the position of each frame of the shortened version of the video image in the original video timeline; determining the position, in the original video timeline, of the original line corresponding to each line in the condensed lines; and matching the playback progress of the video image with the playback progress of the condensed lines according to the timeline to generate the shortened video.
  • the predetermined image processing strategy includes: determining a weight allocation policy according to the type tag of the video; and performing one or more of the following operations according to the weight allocation policy: increasing the weight of close-up image frames; increasing the weight of image frames rich in facial emotion; or increasing the weight of wide-angle image frames.
  • the video processing method further includes: adjusting the weight of each frame of the image according to the video playback behavior of the user who has watched the video; and generating a shortened version of the video image according to the weighted image frame to update the shortened video.
  • the video processing method further includes, for a single user: obtaining the user's playback behavior for videos with the same type tag, and updating the user's weight allocation policy for videos with that type tag according to the playback behavior; adjusting the weight of each image frame in videos with the same type tag according to the updated weight allocation policy; and generating a user-personalized shortened version of the video image according to the re-weighted image frames, so as to generate a user-personalized shortened video.
  • the video processing method further includes: updating the weight allocation policy for a type of user according to the playback behavior of users of the same type on the same video and / or videos with the same type tag; and adjusting the weight of each image frame in videos with the same type tag according to the updated weight allocation policy.
  • the video processing method further includes: determining a user's viewing preference by collecting a user's playing behavior; determining a user's similarity degree according to the user's viewing preference; and determining users whose similarity degree exceeds a predetermined threshold as the same type of user.
  • the video processing method further includes: adjusting the weight of each image frame through an item-based collaborative filtering algorithm and / or a machine learning algorithm, and generating a shortened version of the video image according to the re-weighted image frames, so as to update the shortened video.
  • a video processing apparatus, including: a line processing unit configured to perform semantic analysis on the lines of a video to obtain condensed lines; a weight determining unit configured to determine the weight of each frame image in the video based on a predetermined image processing strategy; an image shortening unit configured to extract image frames according to a predetermined video shortening ratio, in descending order of the weight of each frame image, to obtain a shortened version of the video image; and a shortened video acquisition unit configured to obtain a shortened video based on the shortened version of the video image and the condensed lines.
  • the shortened video acquisition unit is configured to: determine the position of each frame of the shortened version of the video image in the original video timeline; determine the position, in the original video timeline, of the original line corresponding to each line in the condensed lines; and match the playback progress of the video image with the playback progress of the condensed lines according to the timeline to generate the shortened video.
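  • the predetermined image processing strategy includes: determining a weight allocation policy according to a type tag of the video; and performing one or more of the following operations according to the weight allocation policy: increasing the weight of close-up image frames; increasing the weight of image frames rich in facial emotion; or increasing the weight of wide-angle image frames.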
  • the weight determining unit is further configured to adjust the weight of each frame image according to the video playback behavior of users who have watched the video; the image shortening unit is further configured to generate a shortened version of the video image according to the re-weighted image frames, so that the shortened video acquisition unit updates the shortened video.
  • the video processing device further includes: a user behavior obtaining unit configured to obtain, for a single user, the user's playback behavior for videos with the same type tag, and to update the user's weight allocation policy for videos with that type tag according to the playback behavior; the weight determining unit is further configured to adjust the weight of each image frame in videos with the same type tag according to the updated weight allocation policy; the image shortening unit is further configured to generate a user-personalized shortened version of the video image according to the re-weighted image frames, so that the shortened video acquisition unit generates a user-personalized shortened video.
  • the video processing apparatus further includes: a policy adjustment unit configured to update the weight allocation policy for a type of user according to the playback behavior of users of the same type on the same video and / or videos with the same type tag; the weight determining unit is further configured to adjust the weight of each image frame in videos with the same type tag according to the updated weight allocation policy; the image shortening unit is further configured to generate a user-type-personalized shortened version of the video image based on the re-weighted image frames, so that the shortened video acquisition unit generates a user-type-personalized shortened video.
  • the video processing apparatus further includes: a user type determining unit configured to: determine users' viewing preferences by collecting the users' playback behavior; determine the degree of similarity between users according to their viewing preferences; and identify users whose degree of similarity exceeds a predetermined threshold as the same type of user.
  • the video processing device further includes: a shortened video optimization unit configured to adjust the weight of each image frame through an item-based collaborative filtering algorithm and / or a machine learning algorithm, and to generate a shortened version of the video image according to the re-weighted image frames, so as to update the shortened video.
  • a video processing apparatus including: a memory; and a processor coupled to the memory, the processor being configured to execute any one of the video processing methods above based on instructions stored in the memory.
  • a computer-readable storage medium on which computer program instructions are stored, and the instructions, when executed by a processor, implement the steps of any one of the video processing methods above.
  • FIG. 1 is a flowchart of some embodiments of an image processing method of the present disclosure.
  • FIG. 2 is a flowchart of some embodiments of audio and image matching in the image processing method of the present disclosure.
  • FIG. 3A is a flowchart of some embodiments of image frame weight adjustment in an image processing method of the present disclosure.
  • FIG. 3B is a flowchart of another embodiment of image frame weight adjustment in the image processing method of the present disclosure.
  • FIG. 3C is a flowchart of still another embodiment of image frame weight adjustment in the image processing method of the present disclosure.
  • FIG. 4 is a flowchart of still another embodiment of an image processing method of the present disclosure.
  • FIG. 5 is a schematic diagram of some embodiments of an image processing apparatus of the present disclosure.
  • FIG. 6 is a schematic diagram of another embodiment of an image processing apparatus of the present disclosure.
  • FIG. 7 is a schematic diagram of still another embodiment of an image processing apparatus of the present disclosure.
  • The maximum playback speed is usually 2x, and playback below 2x still takes up a lot of the user's time. For example, a 40-minute TV episode still takes about 27 minutes even at 1.5x. If the video is played at 2x, high demands are placed on the user's hearing and comprehension, and the user must remain highly concentrated and tense throughout, so the fun of watching the video is lost.
  • A flowchart of some embodiments of the image processing method of the present disclosure is shown in FIG. 1.
  • In step 101, semantic analysis is performed on the lines of the video to obtain condensed lines.
  • In some embodiments, all the lines of the current video can be obtained and processed with NLU (Natural Language Understanding), so that the narrative structure and storyline of the video are condensed and summarized.
  • the length of the condensed lines can be controlled as required, such as making the condensed line length about one quarter of the original line length.
  • In step 102, the weight of each frame image in the video is determined based on a predetermined image processing strategy.
  • In some embodiments, each frame of the video can be analyzed, for example by marking a shot as wide-angle or close-up, marking which characters are included in these shots, and identifying their facial emotions and mood.
  • In some embodiments, the video content is divided into images frame by frame, and the images are then annotated with labels such as: wide-angle shot, close-up shot, facial expressions such as sadness and joy, and the supporting characters and passers-by in the video.
  • different weights can be set for different video frames according to the label, lines, and picture settings of the video.
  • In step 103, image frames are extracted according to a predetermined video shortening ratio, in descending order of the weight of each frame image, to obtain a shortened version of the video image.
  • In step 104, a shortened video is obtained according to the shortened version of the video image and the condensed lines. For example, the image and the audio are combined, and the two are controlled to play synchronously to form the shortened video.
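The weight-based extraction described above can be illustrated with a minimal Python sketch. The function name `select_frames`, the sample weights, and the choice to keep at least one frame are assumptions made for illustration; the source only states that frames are taken from high weight to low weight according to a predetermined shortening ratio.

```python
from typing import List


def select_frames(weights: List[float], ratio: float) -> List[int]:
    """Return the indices of the frames to keep, in playback order."""
    if not 0 < ratio <= 1:
        raise ValueError("ratio must be in (0, 1]")
    if not weights:
        return []
    # Number of frames to keep under the shortening ratio (at least one).
    keep = max(1, int(len(weights) * ratio))
    # Rank frames from highest to lowest weight; ties favour the earlier frame.
    ranked = sorted(range(len(weights)), key=lambda i: (-weights[i], i))
    # Restore chronological order so the shortened video still plays forward.
    return sorted(ranked[:keep])


# Hypothetical per-frame weights for an 8-frame clip.
frame_weights = [0.2, 0.9, 0.1, 0.7, 0.4, 0.8, 0.3, 0.6]
kept = select_frames(frame_weights, ratio=0.25)  # keeps frames 1 and 5
```

Sorting the selected indices back into chronological order is a design choice of this sketch, made so that the kept frames can later be aligned against the original timeline.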
  • A flowchart of some embodiments of audio and image matching in the image processing method of the present disclosure is shown in FIG. 2.
  • In step 201, the time point in the original video time axis corresponding to each frame image in the shortened version of the video image is determined.
  • In step 202, the position in the original video timeline of the original lines corresponding to each line in the condensed lines is determined. In some embodiments, since the lines have been condensed, a time period in the original video timeline is obtained for each condensed line.
  • In step 203, the playback progress of the video image and the playback progress of the condensed lines are matched according to the time axis to generate the shortened video.
  • In some embodiments, if the time point of an image frame on the time axis falls within the time period corresponding to a line in the condensed lines, that line is played in synchronization with the image frame.
  • If the time point of an image frame on the time axis does not fall within the time period corresponding to any line in the condensed lines, the audio playback speed can be reduced, a pause can be added, background sound can be inserted, and so on.
  • the other image frames are played synchronously with the corresponding lines.
  • In some embodiments, an appropriate number of image frames may be selected for playback from the time period corresponding to a line, according to the image frame weights.
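The timeline matching described above can be sketched as a lookup of each kept frame's time point against the time periods of the condensed lines. This is a minimal illustration; the function name `match_frames_to_lines` and the representation of a time period as a `(start, end)` pair in seconds are assumptions, not details from the source.

```python
from typing import List, Optional, Tuple

Span = Tuple[float, float]  # (start, end) of an original line, in seconds


def match_frames_to_lines(frame_times: List[float],
                          line_spans: List[Span]) -> List[Optional[int]]:
    """For each kept frame, return the index of the condensed line whose
    original time period contains the frame, or None if no line covers it."""
    matches: List[Optional[int]] = []
    for t in frame_times:
        hit: Optional[int] = None
        for idx, (start, end) in enumerate(line_spans):
            if start <= t <= end:
                hit = idx
                break
        matches.append(hit)
    return matches


# Hypothetical data: two condensed lines and four kept frames.
spans = [(0.0, 4.0), (10.0, 15.0)]
times = [1.0, 3.5, 7.0, 12.0]
matched = match_frames_to_lines(times, spans)
```

A frame that maps to `None` corresponds to the case in which no line covers the frame, where the text suggests slowing the audio, pausing, or inserting background sound.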
  • In some embodiments, word2vector modeling can be used, taking the context into account, to understand the semantics of each word, and lines with the same semantics are re-extracted and summarized. For semantic representation, supervised learning methods are used, because unsupervised learning cannot capture an understanding of the scene. Take the lines "Whose dream is to travel around the world?" and "Whose dream is to travel around the world?": to unsupervised learning these two lines are the same, but in fact they are not. For the order of contexts, the time-series problem must be considered.
  • the LSTM (Long Short-Term Memory) in the RNN (Recurrent Neural Network) model uses the past period of time.
  • TF_IDF Term Frequency-Inverse Document Frequency
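The source names TF-IDF without saying how it is applied. As one possible illustration only, the sketch below scores each line by the TF-IDF of its words and keeps roughly the top quarter of lines in their original order. This extractive selection is an assumption of the sketch and is simpler than the semantic condensation described in the text; the function name `condense` and the `keep_ratio` parameter are hypothetical.

```python
import math
import re
from collections import Counter
from typing import List


def condense(lines: List[str], keep_ratio: float = 0.25) -> List[str]:
    """Keep the highest-scoring lines by summed TF-IDF, in original order."""
    docs = [re.findall(r"\w+", line.lower()) for line in lines]
    n = len(docs)
    if n == 0:
        return []
    # Document frequency: in how many lines each word appears.
    df = Counter(word for doc in docs for word in set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        # Sum of (term frequency * inverse document frequency) over the line.
        score = sum((count / len(doc)) * math.log(n / df[word])
                    for word, count in tf.items()) if doc else 0.0
        scores.append(score)
    keep = max(1, int(n * keep_ratio))
    top = sorted(range(n), key=lambda i: (-scores[i], i))[:keep]
    return [lines[i] for i in sorted(top)]


# Hypothetical dialogue: three greetings and one plot-carrying line.
script = [
    "hello there",
    "hello again",
    "the treasure is buried under the old oak",
    "hello hello",
]
summary = condense(script)
```

With these sample lines, the plot-carrying line scores highest because its words are rare across the script, while the repeated greeting contributes little.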
  • different image weight setting policies may be set for videos of different regions, styles, and / or types.
  • In some embodiments, videos can be classified, for example, by region (Chinese, Hong Kong, European, American, Japanese, Korean), by type (comedy, tragedy, romance, action, shootout, thriller, suspense), and by other labels such as sweet, wild, worth a visit, artificial intelligence, robots, etc.
  • Different filtering criteria can be set for videos of different regions, styles, and / or types. For example: romance films pay more attention to storylines, so the weights of storyline scenes and character close-ups are increased; shootout films pay more attention to fighting scenes, so the weight of fighting scenes is increased in a targeted manner; suspense films place higher demands on scenes and sound effects, so the weight of slowly changing scenes is increased in a targeted manner.
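A type-dependent weight allocation policy of this kind could be represented as a table of multipliers, as in the following sketch. The tag names, frame labels, and multiplier values are all hypothetical; the source only says which kinds of scenes gain weight for which kinds of video.

```python
from typing import Dict, Set

# Hypothetical policy table: type tag -> frame label -> weight multiplier.
POLICIES: Dict[str, Dict[str, float]] = {
    "romance": {"storyline": 1.5, "close_up": 1.5},
    "shootout": {"fight": 2.0},
    "suspense": {"slow_scene": 1.5},
}


def frame_weight(labels: Set[str], type_tag: str, base: float = 1.0) -> float:
    """Scale a frame's base weight by every multiplier whose label it carries."""
    weight = base
    for label, factor in POLICIES.get(type_tag, {}).items():
        if label in labels:
            weight *= factor
    return weight


# A close-up frame counts for more in a romance film than in a shootout film.
romance_close_up = frame_weight({"close_up"}, "romance")
shootout_close_up = frame_weight({"close_up"}, "shootout")
```

Keeping the policy as data rather than code means a policy can be swapped per region, style, or type without touching the weighting logic.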
  • In some embodiments, the video classification information is added manually when the video data is entered and can be read directly. Tags can be selected by users according to the video content and their impressions when watching the video, and a user can add a new tag if no suitable one exists. A K-means clustering algorithm is then used to cluster the users' tags, the tag closest to the cluster center is selected as the tag of the video, and the tag is updated at a predetermined frequency to improve timeliness.
  • In some embodiments, the video images may be used for training with an image recognition system (for example, one built on the open-source framework TensorFlow), and the training result is finally combined with the video classification and tags to identify the corresponding plot segments.
  • In this way, the image weight setting strategy can be adjusted according to the characteristics of different videos, making the shortened video more consistent with the characteristics of the corresponding region, style, and / or type of video, increasing the amount of information from the original video that the shortened video retains, highlighting key information, and improving the user experience.
  • In some embodiments, when the video is first made available, the shortened video may be generated by the above method and then adjusted according to users' viewing behavior to form a shortened video that better meets user needs.
  • A flowchart of some embodiments of image frame weight adjustment in the image processing method of the present disclosure is shown in FIG. 3A.
  • In step 311, the weight of each frame image is adjusted according to the playback behavior of users who have watched the video.
  • In some embodiments, the user's playback behavior is recorded, for example: where the user fast-forwards the video; where the user plays at increased speed; where the user goes back to re-watch content; and where the user stays, and for how long.
  • In some embodiments, user behavior collection can be achieved by capturing mouse click events and recording the scroll position of the video timeline.
  • In some embodiments, for image frames that are fast-forwarded or played at increased speed, the amount of information is considered unimportant and their weight is reduced; for image frames that are re-watched, their weight is increased; for image frames on which the user stays longer (but within a predetermined range, so that a user who pauses the video and walks away does not distort the judgment of the importance of the image frame), the weight is increased (the weight can be set higher the longer the stay time, as long as it is within the predetermined time range); and so on.
  • In step 312, a shortened version of the video image is generated according to the re-weighted image frames so as to update the shortened video.
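The behavior-driven adjustment described above can be sketched as follows. The event names, the fixed `step` size, and the 60-second dwell cap are assumptions chosen for illustration; the source specifies only the direction of each adjustment and that overly long stays should be ignored.

```python
from typing import List, Tuple

Event = Tuple[str, int, float]  # (kind, frame index, dwell time in seconds)


def adjust_weights(weights: List[float], events: List[Event],
                   step: float = 0.1, max_dwell: float = 60.0) -> List[float]:
    """Return a copy of the weights adjusted by observed playback events."""
    adjusted = list(weights)
    for kind, frame, dwell in events:
        if kind in ("fast_forward", "fast_play"):
            # Skimmed content is treated as less important.
            adjusted[frame] = max(0.0, adjusted[frame] - step)
        elif kind == "rewatch":
            # Re-watched content is treated as more important.
            adjusted[frame] += step
        elif kind == "dwell" and 0 < dwell <= max_dwell:
            # Longer stays raise the weight more; stays beyond the cap are
            # ignored because the user may simply have left the video paused.
            adjusted[frame] += step * dwell / max_dwell
    return adjusted


# Hypothetical event log for a four-frame clip.
log = [("fast_forward", 0, 0.0), ("rewatch", 1, 0.0),
       ("dwell", 2, 30.0), ("dwell", 3, 600.0)]
new_weights = adjust_weights([0.5, 0.5, 0.5, 0.5], log)
```

The adjusted weights can then be fed back into the same frame-selection step to regenerate the shortened video.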
  • A flowchart of another embodiment of image frame weight adjustment in the image processing method of the present disclosure is shown in FIG. 3B.
  • In step 321, for a single user, the user's playback behavior for videos with the same type tag is obtained, and the user's weight allocation policy for videos with that type tag is updated according to the playback behavior. For example, if two videos are both TV series tagged as time-travel and palace drama, the similarity between the two videos is very high and they can be classified as videos of the same type. For another example, although each episode of the same TV series is a different video, the episodes have a certain degree of coherence and similarity, so the shortened video of an unwatched episode can be adjusted according to the user's playback behavior on the episodes already watched.
  • In some embodiments, if the user re-watches certain scenes, such as close-ups or fighting scenes, when playing videos of the same type, the weight of image frames of such scenes should be increased; if the user fast-forwards through or skips certain image frames, the weight of the image frame type to which those frames belong should be reduced.
  • step 322 the weight of each image frame in the video of the same type of label is adjusted according to the updated weight allocation policy.
  • step 323 a user-type personalized shortened version video image is generated according to the weighted image frames, and a user-type personalized shortened video is generated.
  • the personal preferences of the same user can be analyzed to generate a user-specific weight distribution strategy, so as to realize user-adapted image shortening operations, ensure the appeal of video to each user, and further improve the user experience.
  • A flowchart of still other embodiments of image frame weight adjustment in the image processing method of the present disclosure is shown in FIG. 3C.
  • In step 331, the weight allocation policy for a type of user is updated according to the playback behavior of users of that type for the same video and/or videos with the same type tag.
  • In some embodiments, users' viewing preferences are determined by collecting the playback behavior of different users, the degree of similarity between users is determined according to their viewing preferences, and users whose degree of similarity exceeds a predetermined threshold are classified as the same type of user.
  • In step 332, the weight of each image frame in videos with the same type tag is adjusted according to the updated weight allocation policy.
  • In step 333, user-type personalized shortened-version video images are generated from the reweighted image frames, so as to generate a user-type personalized shortened video.
  • Because the behavior of some users is similar to a degree, the playback behavior of users of the same type can be used to generate personalized shortened videos for that type of user. This mitigates the problems of insufficient underlying data and large random effects caused by the limited playback behavior of a single user, and improves how well shortened videos adapt to individual users.
  • A flowchart of still other embodiments of the image processing method of the present disclosure is shown in FIG. 4.
  • In step 401, during the cold-start phase, semantic analysis is performed on the lines of the video to obtain condensed lines.
  • In some embodiments, the length of the condensed lines can be controlled according to the length of the target video.
  • In step 402, the weight of each frame image in the video is determined based on a predetermined image processing strategy.
  • In some embodiments, the weight of close-up image frames may be increased, the weight of image frames with rich facial emotion may be increased, and the weight of wide-angle image frames may be increased, each according to the weight allocation policy.
  • In some embodiments, the amount of weight adjustment may be determined according to the weight allocation policy corresponding to the video's tag.
  • In step 403, image frames are extracted in descending order of weight according to a predetermined video shortening ratio, to obtain shortened-version video images.
  • In step 404, a shortened video is obtained from the shortened-version video images and the condensed lines.
  • In step 411, the user's playback behavior is acquired over the course of the user's use.
  • In step 412, the weight of each image frame is adjusted by a user-based collaborative filtering algorithm.
  • The collected user behavior data reveals each user's viewing preferences for a video, and these preferences are measured and scored. The relationships between users are calculated from their attitudes and degrees of preference toward the same videos, and video clips of the same style are edited and integrated for users with the same preferences.
  • In step 413, weights are reallocated to the image frames, and then step 430 is performed.
  • In some embodiments, the approach of the embodiments related to FIG. 3A and/or FIG. 3B may be used to adjust the weight allocation based on user playback behavior.
  • In step 421, the weight of each image frame is adjusted by an item-based collaborative filtering algorithm.
  • The relationships between videos are obtained by calculating different users' ratings of different videos. Based on these relationships, new series and films of a similar kind can be edited and integrated according to the same scheme.
  • In some embodiments, the approach of the embodiments shown in FIG. 3A or FIG. 3C may be used to adjust the weight allocation based on the playback behavior of different users for videos of the same type.
  • In step 422, the adjustment to the weight of each image frame is determined by a machine learning algorithm.
  • By continuously adjusting the weight parameters through machine learning, the quality of the video can be continuously improved.
  • In some embodiments, step 421 and step 422 may be performed in either order.
  • In step 430, the image frame weights are updated by combining the image frame weight configuration from the cold-start process with the weight adjustment results of steps 413 and 422.
  • After the weights are updated, step 403 is performed to update the shortened-version video images and regenerate the shortened video.
  • In this way, the video is quickly filtered and processed according to different video types and styles, combined with people's preferences, leaving only the key content and main storyline. This improves viewing efficiency, saves viewing time, and lets people obtain more information per unit of time.
  • The image processing apparatus includes a line processing unit 51, a weight determination unit 52, an image shortening unit 53, and a shortened video acquisition unit 54.
  • The line processing unit 51 can perform semantic analysis on the lines of the video to obtain condensed lines. In some embodiments, all lines of the current video can be obtained and processed with NLU (natural language understanding), and the narrative structure and storyline of the video can be condensed and summarized.
  • The weight determination unit 52 can determine the weight of each frame image in the video based on a predetermined image processing strategy.
  • The image shortening unit 53 can extract image frames in descending order of weight according to a predetermined video shortening ratio to obtain shortened-version video images.
  • The shortened video acquisition unit 54 can obtain a shortened video from the shortened-version video images and the condensed lines, for example by merging the images and audio and controlling them to play in sync, thereby forming the shortened video.
  • Such a video processing apparatus can generate condensed lines that conform to the subject matter of the video, extract the important frames in the video, and automatically generate a shortened video for the user to watch, so that the user can efficiently obtain the useful information in the video and the user experience is improved.
  • In some embodiments, the shortened video acquisition unit 54 can determine the position of each frame image of the shortened-version video images on the timeline of the original video, determine the position on the original video timeline of the original line corresponding to each line of the condensed lines, and match the playback progress of the video images with the playback progress of the condensed lines according to the timeline to generate the shortened video. This keeps the lines and the image progress synchronized as far as possible, helps the user understand the video, improves the quality of the shortened video, and improves the user experience.
  • In some embodiments, the weight determination unit 52 can also adjust the weight of each frame image according to the playback behavior of users who have watched the video; the image shortening unit 53 can also generate shortened-version video images from the reweighted image frames, so that the shortened video acquisition unit updates the shortened video.
  • Such an apparatus can collect features of users' viewing behavior on the same video and, by analyzing their playback behavior, obtain feedback from users on the importance of image frames. This enables an image frame weight analysis tailored to the video itself, improves the accuracy of image frame extraction, and further improves how well the shortened video meets users' needs.
  • In some embodiments, the video processing apparatus may further include a user behavior acquisition unit 55 capable of acquiring a user's playback behavior for videos with the same type tag and updating the user's weight allocation policy for videos with the same type tag according to that playback behavior.
  • The weight determination unit 52 can also adjust the weight of each image frame in videos with the same type tag according to the updated weight allocation policy; the image shortening unit 53 can also generate user-personalized shortened-version video images from the reweighted image frames, so that the shortened video acquisition unit generates a user-personalized shortened video.
  • Such an apparatus can analyze the personal preferences of an individual user and generate a user-specific weight allocation policy, enabling user-adaptive image shortening, keeping the video appealing to each user, and further improving the user experience.
  • In some embodiments, the video processing apparatus may further include a policy adjustment unit 56 capable of updating the weight allocation policy for a type of user according to the playback behavior of users of that type for the same video and/or videos with the same type tag. The weight determination unit 52 can also adjust the weight of each image frame in videos with the same type tag according to the updated weight allocation policy, and the image shortening unit 53 can also generate user-type personalized shortened-version video images from the reweighted image frames, so that the shortened video acquisition unit generates a user-type personalized shortened video.
  • In some embodiments, the video processing apparatus may further include a user type determination unit 57 capable of determining users' viewing preferences by collecting the playback behavior of different users, determining the degree of similarity between users according to their viewing preferences, and classifying users whose degree of similarity exceeds a predetermined threshold as the same type of user, so that the policy adjustment unit can update the weight allocation policy for that type of user.
  • Such an apparatus can use the playback behavior of users of the same type to generate a personalized shortened video for that type of user. This mitigates the problems of insufficient underlying data and large random effects caused by the limited playback behavior of a single user, and improves how well shortened videos adapt to individual users.
  • In some embodiments, the video processing apparatus may further include a shortened video optimization unit 58 capable of adjusting the weight of each image frame through an item-based collaborative filtering algorithm and a machine learning algorithm, and generating shortened-version video images from the reweighted image frames so as to update the shortened video. The shortened video is thereby continuously optimized and adapts and evolves for each user.
  • The video processing apparatus includes a memory 601 and a processor 602.
  • The memory 601 may be a magnetic disk, flash memory, or any other non-volatile storage medium.
  • The memory is configured to store the instructions of the corresponding embodiments of the video processing method above.
  • The processor 602 is coupled to the memory 601 and may be implemented as one or more integrated circuits, such as a microprocessor or a microcontroller.
  • The processor 602 is configured to execute the instructions stored in the memory, enabling the user to efficiently obtain the useful information in a video and improving the user experience.
  • The video processing apparatus 700 includes a memory 701 and a processor 702.
  • The processor 702 is coupled to the memory 701 through a bus 703.
  • The video processing apparatus 700 may also be connected to an external storage device 705 through a storage interface 704 in order to call external data, and may also be connected to a network or another computer system (not shown) through a network interface 706. Details are not described here.
  • In this embodiment, data instructions are stored in the memory and processed by the processor, so that the user can efficiently obtain the useful information in the video and the user experience is improved.
  • A computer-readable storage medium stores computer program instructions that, when executed by a processor, implement the steps of the methods in the corresponding embodiments of the video processing method.
  • These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to work in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device that implements the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
  • These computer program instructions can also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
  • The methods and apparatus of the present disclosure may be implemented in many ways.
  • For example, the methods and apparatus of the present disclosure may be implemented by software, hardware, firmware, or any combination of software, hardware, and firmware.
  • The above order of the method steps is for illustration only; the steps of the methods of the present disclosure are not limited to the order specifically described above unless otherwise specifically stated.
  • In some embodiments, the present disclosure may also be implemented as programs recorded on a recording medium, the programs including machine-readable instructions for implementing the methods according to the present disclosure.
  • Thus, the present disclosure also covers a recording medium storing a program for executing the methods according to the present disclosure.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Social Psychology (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Library & Information Science (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Television Signal Processing For Recording (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

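The present disclosure provides a video processing method and apparatus, and relates to the field of multimedia technology. A video processing method of the present disclosure includes: performing semantic analysis on the lines of a video to obtain condensed lines; determining a weight for each frame image in the video based on a predetermined image processing strategy; extracting image frames in descending order of weight according to a predetermined video shortening ratio to obtain shortened-version video images; and obtaining a shortened video from the shortened-version video images and the condensed lines. In this way, condensed lines that conform to the subject matter of the video can be generated, the important frames in the video can be extracted, and a shortened video can be generated automatically for the user to watch, so that the user can efficiently obtain the useful information in the video and the user experience is improved.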

Description

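Video processing method and apparatus
Cross-reference to related applications
This application is based on, and claims priority to, CN application No. 201810843764.5, filed on July 27, 2018, the disclosure of which is incorporated herein in its entirety.
Technical field
The present disclosure relates to the field of multimedia technology, and in particular to a video processing method and apparatus.
Background
As the pace of life keeps accelerating, people increasingly want to obtain more information in limited, fragmented time. Information can be obtained through text, images, video, and so on. Video is now an important source for obtaining information quickly, and at the same time expectations for how video is played and watched keep rising.
To improve the efficiency of obtaining information while watching video, players offer two schemes:
(1) Fast-forward scheme. In a video player, the user can fast-forward and rewind by swiping left or right on the screen or by pressing the left and right keys on the keyboard, and can also do so by dragging the progress bar directly.
(2) Speed-playback scheme. The user can select a playback speed in the display interface of the video player to achieve faster playback.
Summary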
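According to some embodiments of the present disclosure, a video processing method is provided, including: performing semantic analysis on the lines of a video to obtain condensed lines; determining a weight for each frame image in the video based on a predetermined image processing strategy; extracting image frames in descending order of weight according to a predetermined video shortening ratio to obtain shortened-version video images; and obtaining a shortened video from the shortened-version video images and the condensed lines.
In some embodiments, obtaining the shortened video includes: determining the position of each frame image of the shortened-version video images on the timeline of the original video; determining the position, on the timeline of the original video, of the original line corresponding to each line of the condensed lines; and matching the playback progress of the video images with the playback progress of the condensed lines according to the timeline to generate the shortened video.
In some embodiments, the predetermined image processing strategy includes: determining a weight allocation policy according to the type tag of the video; and performing one or more of the following operations according to the weight allocation policy: increasing the weight of close-up image frames; increasing the weight of image frames with rich facial emotion; or increasing the weight of wide-angle image frames.
In some embodiments, the video processing method further includes: adjusting the weight of each frame image according to the playback behavior of users who have watched the video; and generating shortened-version video images from the reweighted image frames so as to update the shortened video.
In some embodiments, the video processing method further includes, for a single user: obtaining the user's playback behavior for videos with the same type tag, and updating the user's weight allocation policy for videos with the same type tag according to that playback behavior; adjusting the weight of each image frame in videos with the same type tag according to the updated weight allocation policy; and generating user-personalized shortened-version video images from the reweighted image frames so as to generate a user-personalized shortened video.
In some embodiments, the video processing method further includes: updating the weight allocation policy for a type of user according to the playback behavior of users of that type for the same video and/or videos with the same type tag; adjusting the weight of each image frame in videos with the same type tag according to the updated weight allocation policy; and generating user-type personalized shortened-version video images from the reweighted image frames so as to generate a user-type personalized shortened video.
In some embodiments, the video processing method further includes: determining users' viewing preferences by collecting their playback behavior; determining the degree of similarity between users according to their viewing preferences; and classifying users whose degree of similarity exceeds a predetermined threshold as the same type of user.
In some embodiments, the video processing method further includes: adjusting the weight of each image frame by an item-based collaborative filtering algorithm and/or a machine learning algorithm, and generating shortened-version video images from the reweighted image frames so as to update the shortened video.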
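According to other embodiments of the present disclosure, a video processing apparatus is provided, including: a line processing unit configured to perform semantic analysis on the lines of a video to obtain condensed lines; a weight determination unit configured to determine a weight for each frame image in the video based on a predetermined image processing strategy; an image shortening unit configured to extract image frames in descending order of weight according to a predetermined video shortening ratio to obtain shortened-version video images; and a shortened video acquisition unit configured to obtain a shortened video from the shortened-version video images and the condensed lines.
In some embodiments, the shortened video acquisition unit is configured to: determine the position of each frame image of the shortened-version video images on the timeline of the original video; determine the position, on the timeline of the original video, of the original line corresponding to each line of the condensed lines; and match the playback progress of the video images with the playback progress of the condensed lines according to the timeline to generate the shortened video.
In some embodiments, the predetermined image processing strategy includes: determining a weight allocation policy according to the type tag of the video; and performing one or more of the following operations according to the weight allocation policy: increasing the weight of close-up image frames; increasing the weight of image frames with rich facial emotion; or increasing the weight of wide-angle image frames.
In some embodiments, the weight determination unit is further configured to adjust the weight of each frame image according to the playback behavior of users who have watched the video; the image shortening unit is further configured to generate shortened-version video images from the reweighted image frames, so that the shortened video acquisition unit updates the shortened video.
In some embodiments, the video processing apparatus further includes a user behavior acquisition unit configured to, for a single user, obtain the user's playback behavior for videos with the same type tag and update the user's weight allocation policy for videos with the same type tag according to that playback behavior; the weight determination unit is further configured to adjust the weight of each image frame in videos with the same type tag according to the updated weight allocation policy; and the image shortening unit is further configured to generate user-personalized shortened-version video images from the reweighted image frames, so that the shortened video acquisition unit generates a user-personalized shortened video.
In some embodiments, the video processing apparatus further includes a policy adjustment unit configured to update the weight allocation policy for a type of user according to the playback behavior of users of that type for the same video and/or videos with the same type tag; the weight determination unit is further configured to adjust the weight of each image frame in videos with the same type tag according to the updated weight allocation policy; and the image shortening unit is further configured to generate user-type personalized shortened-version video images from the reweighted image frames, so that the shortened video acquisition unit generates a user-type personalized shortened video.
In some embodiments, the video processing apparatus further includes a user type determination unit configured to: determine users' viewing preferences by collecting their playback behavior; determine the degree of similarity between users according to their viewing preferences; and classify users whose degree of similarity exceeds a predetermined threshold as the same type of user.
In some embodiments, the video processing apparatus further includes a shortened video optimization unit configured to adjust the weight of each image frame by an item-based collaborative filtering algorithm and/or a machine learning algorithm, and to generate shortened-version video images from the reweighted image frames so as to update the shortened video.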
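According to still other embodiments of the present disclosure, a video processing apparatus is provided, including: a memory; and a processor coupled to the memory, the processor being configured to perform any of the video processing methods above based on instructions stored in the memory.
According to yet other embodiments of the present disclosure, a computer-readable storage medium is provided, on which computer program instructions are stored; when executed by a processor, the instructions implement the steps of any of the video processing methods above.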
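Brief description of the drawings
The drawings described here are provided for a further understanding of the present disclosure and form part of it. The illustrative embodiments of the present disclosure and their descriptions serve to explain the present disclosure and do not unduly limit it. In the drawings:
FIG. 1 is a flowchart of some embodiments of the image processing method of the present disclosure.
FIG. 2 is a flowchart of some embodiments of audio and image matching in the image processing method of the present disclosure.
FIG. 3A is a flowchart of some embodiments of image frame weight adjustment in the image processing method of the present disclosure.
FIG. 3B is a flowchart of other embodiments of image frame weight adjustment in the image processing method of the present disclosure.
FIG. 3C is a flowchart of still other embodiments of image frame weight adjustment in the image processing method of the present disclosure.
FIG. 4 is a flowchart of still other embodiments of the image processing method of the present disclosure.
FIG. 5 is a schematic diagram of some embodiments of the image processing apparatus of the present disclosure.
FIG. 6 is a schematic diagram of other embodiments of the image processing apparatus of the present disclosure.
FIG. 7 is a schematic diagram of still other embodiments of the image processing apparatus of the present disclosure.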
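Detailed description
The technical solutions of the present disclosure are described in further detail below with reference to the drawings and embodiments.
The inventor found that the fast-forward scheme in the related art requires constant manual intervention: the user keeps swiping forward or backward to get the desired viewing effect, and after missing content has to go back and watch it again. As a result the user cannot concentrate on the video content, the storyline obtained is discontinuous, the experience is poor, and key content is easily missed. The user spends the time without actually achieving fast viewing.
In the speed-playback scheme, the maximum speed is usually 2x, and playback below 2x still consumes a lot of the user's time. For example, a 40-minute TV episode still takes about 27 minutes at 1.5x. At 2x, high demands are placed on the user's listening and comprehension, and the user has to stay highly concentrated and tense throughout, which takes away the pleasure of watching.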
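A flowchart of some embodiments of the image processing method of the present disclosure is shown in FIG. 1.
In step 101, semantic analysis is performed on the lines of the video to obtain condensed lines. In some embodiments, all lines of the current video can be obtained and processed with NLU (Natural Language Understanding) to condense and summarize the narrative structure, storyline, and so on of the video. New lines containing the complete narrative are formed through syntactic analysis, information extraction, and language organization. In some embodiments, the length of the condensed lines can be controlled as needed, for example to about one quarter of the length of the original lines.
In step 102, the weight of each frame image in the video is determined based on a predetermined image processing strategy. In some embodiments, every frame of the video can be analyzed, for example by labeling whether a shot is a wide-angle or close-up shot, labeling which characters and roles appear in the shot, and recognizing the emotions shown on their faces. The video content is split into images frame by frame, and the images are then labeled, for example as wide-angle shots or close-up shots, with facial expressions such as joy, anger, sorrow, and happiness, and with the leading roles, supporting roles, and extras that appear. In some embodiments, different weights can be set for different video frames according to the tags, lines, and picture settings of the video.
In step 103, image frames are extracted in descending order of weight according to a predetermined video shortening ratio to obtain shortened-version video images.
In step 104, a shortened video is obtained from the shortened-version video images and the condensed lines, for example by merging the images and audio and controlling them to play in sync, forming the shortened video.
In this way, condensed lines that conform to the subject matter of the video can be generated, the important frames in the video can be extracted, and a shortened video can be generated automatically for the user to watch. For example, a 40-minute TV episode is shortened to 10 minutes and a 2-hour movie to 30 minutes. The shortened video retains the main content the video expresses, which greatly improves viewing efficiency and information acquisition efficiency and improves the user experience.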
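Steps 102 and 103 can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosure's implementation: the `POLICIES` table, the label names, the boost values, and the additive scoring are all hypothetical, and the frame labels are assumed to come from an upstream image analysis step.

```python
# Hypothetical per-tag boosts; names and numbers are illustrative only.
POLICIES = {
    "romance": {"close_up": 0.3, "rich_emotion": 0.3},
    "gunfight": {"fight_scene": 0.5, "wide_angle": 0.1},
}


def frame_weight(labels, video_tag, base=1.0):
    """Step 102 (sketch): base weight plus the boosts the tag's policy assigns to each label."""
    policy = POLICIES.get(video_tag, {})
    return base + sum(policy.get(label, 0.0) for label in labels)


def select_frames(weights, ratio):
    """Step 103 (sketch): keep the top `ratio` share of frames by weight.

    Returns the kept frame indices in playback order.
    """
    if not 0 < ratio <= 1:
        raise ValueError("ratio must be in (0, 1]")
    keep = max(1, round(len(weights) * ratio))
    # Rank indices from highest to lowest weight; ties keep their original order.
    ranked = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    # Restore chronological order so the shortened video still plays forward.
    return sorted(ranked[:keep])
```

With a shortening ratio of 0.25, a 40-minute episode keeps roughly a quarter of its frames, which matches the 40-minute to 10-minute example in the text. Selecting by weight and then re-sorting by index is one simple way to keep the result watchable; the text itself only specifies extraction in descending order of weight.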
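In some embodiments, to keep the audio and images of the video playing in sync, a flowchart of some embodiments of audio and image matching in the image processing method of the present disclosure is shown in FIG. 2.
In step 201, the time point on the timeline of the original video corresponding to each frame image of the shortened-version video images is determined.
In step 202, the position on the original video timeline of the original line corresponding to each line of the condensed lines is determined. In some embodiments, because the lines have been condensed, the time period on the original video timeline corresponding to each line is obtained.
In step 203, the playback progress of the video images is matched with the playback progress of the condensed lines according to the timeline to generate the shortened video.
In some embodiments, if the time point of an image frame on the timeline falls within the time period corresponding to a line, that line is played in sync with that image frame.
In some embodiments, if the time point of an image frame on the timeline does not fall within the time period of any line of the condensed lines, the other image frames can be kept in sync with their corresponding lines by lowering the audio playback speed, adding pauses, inserting background sound, or similar means.
In some embodiments, if no selected image frame exists in the time period corresponding to a line, a suitable number of image frames can be selected for playback from that time period according to image frame weight.
In this way, the lines and the image progress are kept synchronized as far as possible, which helps the user understand the video, improves the quality of the shortened video, and improves the user experience.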
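The three matching cases of FIG. 2 can be sketched as below. The function name, the `(start, end)` segment representation in seconds, the constant frame rate, and the `fill` parameter are assumptions made for illustration; how unmatched frames are then handled (slower audio, pauses, background sound) is left to the caller.

```python
def align_frames_to_lines(selected, weights, fps, segments, fill=1):
    """Map selected frame indices to condensed-line time segments.

    selected: frame indices kept in the shortened video (original numbering)
    weights:  weight of every frame of the original video
    fps:      frames per second of the original video (assumed constant)
    segments: (start_s, end_s) of the original line behind each condensed line
    fill:     how many frames to pull in for a line that has none
    """
    by_line = {k: [] for k in range(len(segments))}
    unmatched = []
    for idx in selected:
        t = idx / fps  # time point of the frame on the original timeline
        for k, (start, end) in enumerate(segments):
            if start <= t < end:
                by_line[k].append(idx)  # frame plays together with line k
                break
        else:
            unmatched.append(idx)  # falls in no line: caller pads the audio
    for k, (start, end) in enumerate(segments):
        if not by_line[k]:
            # No selected frame in this line's period: take the heaviest ones.
            candidates = [i for i in range(len(weights)) if start <= i / fps < end]
            best = sorted(candidates, key=lambda i: weights[i], reverse=True)[:fill]
            by_line[k] = sorted(best)
    return by_line, unmatched
```

Treating each segment as a half-open interval avoids assigning a boundary frame to two lines at once.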
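In some embodiments, in obtaining the condensed lines, word2vector modeling can be used to understand the meaning of each word in context, and lines with the same meaning are re-extracted and summarized. A supervised learning method is used for semantic representation, because unsupervised learning cannot understand the scene. For example, to unsupervised learning the lines “Whose dream is to travel around the world?” and “Traveling around the world is whose dream?” are the same, whereas in fact they are not. The order of the context raises a time-series problem; the LSTM (Long Short-Term Memory) network among RNN (Recurrent Neural Network) models uses the features of an event over a past period to predict the features of that event over a future period. In some embodiments, the uneven distribution of line text can cause semantic skew; the idea of TF_IDF (Term Frequency–Inverse Document Frequency) is fused into the word2vector model to perform semantic smoothing, with dimensionality reduction where necessary.
In this way, the semantics can be understood correctly and condensed appropriately, so that the condensed lines retain much of the important information of the original lines and express them correctly, improving the user experience.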
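One common way to combine word vectors with TF-IDF is to weight each word's vector by its TF-IDF score when building a vector for a line. The sketch below shows that idea only; it is an assumption about what fusing TF_IDF into the word vector model could look like, not the disclosure's method. The smoothed IDF formula and the toy `vectors` lookup (standing in for a trained word vector model) are hypothetical choices.

```python
import math
from collections import Counter


def tfidf_weights(doc, corpus):
    """TF-IDF weight of each word of `doc` (a token list) relative to `corpus`."""
    counts = Counter(doc)
    n = len(corpus)
    weights = {}
    for word, count in counts.items():
        df = sum(1 for other in corpus if word in other)
        idf = math.log((1 + n) / (1 + df)) + 1.0  # smoothed IDF, never zero
        weights[word] = (count / len(doc)) * idf
    return weights


def line_vector(doc, corpus, vectors):
    """TF-IDF weighted average of the word vectors of one line."""
    weights = tfidf_weights(doc, corpus)
    dim = len(next(iter(vectors.values()), []))
    total = [0.0] * dim
    norm = 0.0
    for word, w in weights.items():
        if word in vectors:  # words without a vector are skipped
            total = [t + w * v for t, v in zip(total, vectors[word])]
            norm += w
    return [t / norm for t in total] if norm else total
```

A weighted average ignores word order, so it cannot tell apart the two example lines in the text; that is the part the text assigns to a sequence model such as an LSTM.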
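In some embodiments, different image weight setting strategies can be set for videos of different regions, styles, and/or types. For example, videos can be classified by region into Chinese-language, Hong Kong, European and American, Japanese and Korean, and so on; by type into comedy, tragedy, romance, action, gunfight, thriller, suspense, and so on; and by style into categories such as sweet, wilderness, worth watching, artificial intelligence, robots, and so on. Different content can be filtered for videos of different regions, styles, and/or types. For example, romance films focus more on the storyline, so the weights of storyline and character close-up shots are raised accordingly; gunfight films focus more on fight scenes, so the weights of fight scenes are raised accordingly; suspense films place higher demands on scenes and sound effects, so the weights of slowly changing scenes are raised accordingly. In some embodiments, video classification information is added manually when the video data is entered and can be read directly from the video. Tags can be selected by users while watching, according to the video content and their own impressions, and users can add their own if none is suitable. The K-means clustering algorithm is then used to classify the users' tags, the tag closest to the center point is selected as the tag of the video, and the tag is updated at a predetermined frequency to keep it current. In some embodiments, an image recognition system (such as the open-source TensorFlow) can be trained on the video images, and the training results are finally combined with the classification and tags of the video to identify the corresponding plot segments.
In this way, the image weight setting strategy can be adjusted to the characteristics of different videos, so that the shortened video better matches the features of videos of the corresponding region, style, and/or type, retains more of the original video's information, highlights the key information, and improves the user experience.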
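The tag selection described above (cluster the user-supplied tags with K-means, then take the tag closest to a center) can be sketched in plain Python. Several details are assumptions: tags are taken to be already embedded as numeric vectors, the first `k` points seed the centers so the run is deterministic, and the largest cluster is the one whose center is used, none of which the text specifies.

```python
import math


def kmeans(points, k, iterations=20):
    """Tiny K-means with deterministic seeding (the first k points)."""
    k = max(1, min(k, len(points)))
    centers = [list(p) for p in points[:k]]
    assign = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest center.
        assign = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers, assign


def representative_tag(tag_vectors, k=2):
    """Return the tag nearest the center of the largest cluster.

    tag_vectors: non-empty dict mapping tag name to its embedding (assumed given).
    """
    tags = list(tag_vectors)
    points = [tag_vectors[t] for t in tags]
    centers, assign = kmeans(points, k)
    biggest = max(range(len(centers)), key=assign.count)
    members = [i for i, a in enumerate(assign) if a == biggest]
    best = min(members, key=lambda i: math.dist(points[i], centers[biggest]))
    return tags[best]
```

Re-running `representative_tag` at a fixed interval on the latest user tags corresponds to the periodic update mentioned in the text.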
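In some embodiments, when a video is first released, a shortened video can be generated by the method above and then adjusted according to users' viewing behavior to form a shortened video that better meets users' needs. A flowchart of some embodiments of image frame weight adjustment in the image processing method of the present disclosure is shown in FIG. 3A.
In step 311, the weight of each frame image is adjusted according to the playback behavior of users who have watched the video. In some embodiments, the user's playback behavior is recorded, for example: where the user fast-forwards; where the user plays at increased speed; where the user looks back at content; and where the user dwells, and for how long. In some embodiments, user behavior can be collected by capturing mouse click events and recording the scroll position of the video timeline. For fast-forwarded image frames, that part of the video can be considered unimportant and its weight is reduced; for image frames that are looked back at, the weight is increased; for image frames with a longer dwell time (but within a predetermined range, so that a user pausing the video and walking away does not distort the importance judgment), the weight is increased (for example, the longer the dwell time within the predetermined range, the greater the weight).
In step 312, shortened-version video images are generated from the reweighted image frames so as to update the shortened video.
In this way, features of users' viewing behavior on the same video can be collected, and feedback from users on the importance of image frames can be obtained by analyzing their playback behavior. This enables an image frame weight analysis tailored to the video itself, improves the accuracy of image frame extraction, and further improves how well the shortened video meets users' needs.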
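The adjustment rules of step 311 can be sketched as follows. The event structure, the fixed `step` size, and the dwell bounds are hypothetical values chosen for illustration; the text only says that fast-forwarding lowers the weight, looking back raises it, and a dwell inside a predetermined range raises it more the longer it lasts.

```python
from dataclasses import dataclass


@dataclass
class PlaybackEvent:
    kind: str             # "fast_forward", "rewatch", or "dwell"
    start: int            # first affected frame index (inclusive)
    end: int              # last affected frame index (inclusive)
    seconds: float = 0.0  # dwell duration; only used when kind == "dwell"


def adjust_weights(weights, events, step=0.1, min_dwell=2.0, max_dwell=60.0):
    """Return a new weight list adjusted by the recorded playback events."""
    adjusted = list(weights)
    for ev in events:
        if ev.kind == "fast_forward":
            delta = -step
        elif ev.kind == "rewatch":
            delta = step
        elif ev.kind == "dwell" and min_dwell <= ev.seconds <= max_dwell:
            # Inside the allowed range, a longer dwell gives a larger boost.
            delta = step * (ev.seconds / max_dwell)
        else:
            # Unknown events and out-of-range dwells (for example the user
            # paused and walked away) leave the weights unchanged.
            delta = 0.0
        first = max(ev.start, 0)
        last = min(ev.end, len(adjusted) - 1)
        for i in range(first, last + 1):
            adjusted[i] = max(0.0, adjusted[i] + delta)
    return adjusted
```

Returning a new list rather than mutating the input keeps the cold-start weights available for later recombination.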
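In some embodiments, because different users have different viewing habits and care about different parts, the shortened video provided to an individual user can be adjusted according to that user's own playback habits. A flowchart of other embodiments of image frame weight adjustment in the image processing method of the present disclosure is shown in FIG. 3B.
In step 321, for a single user, the user's playback behavior for videos with the same type tag is obtained, and the user's weight allocation policy for videos with the same type tag is updated according to that playback behavior. For example, if two videos are both TV series, both time-travel dramas, and both palace dramas, they are highly similar and can be classified as videos with the same type tag. As another example, within one TV series each episode is a different video, but the episodes have a degree of continuity and similarity, so the shortened videos of unwatched episodes can be adjusted according to the user's playback behavior on the episodes already watched.
In some embodiments, based on the user's playback behavior for videos of the same type: if the user prefers watching close-ups, fight scenes, and the like, the weight of image frames of such scenes should be increased; if the user fast-forwards or skips certain image frames, the weight of the image frame type to which those frames belong should be reduced.
In step 322, the weight of each image frame in videos with the same type tag is adjusted according to the updated weight allocation policy.
In step 323, user-personalized shortened-version video images are generated from the reweighted image frames, and a user-personalized shortened video is generated.
In this way, the personal preferences of an individual user can be analyzed to generate a user-specific weight allocation policy, enabling user-adaptive image shortening, keeping the video appealing to each user, and further improving the user experience.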
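The per-user policy update of step 321 can be sketched as a running adjustment of the boost given to each frame type. The representation of a policy as a mapping from frame label to boost, the `(label, action)` observation pairs, and the fixed learning `rate` are assumptions for illustration only.

```python
def update_user_policy(policy, observations, rate=0.05):
    """Return a copy of `policy` nudged by one user's playback observations.

    policy:       dict mapping frame label (e.g. "close_up") to its weight boost
    observations: iterable of (frame_label, action), action being "rewatch"
                  (the user looked back at such frames) or "skip"
                  (the user fast-forwarded or skipped them)
    """
    updated = dict(policy)
    for label, action in observations:
        if action == "rewatch":
            updated[label] = updated.get(label, 0.0) + rate
        elif action == "skip":
            updated[label] = updated.get(label, 0.0) - rate
    return updated
```

Because the policy is keyed by frame type rather than by frame, what is learned from watched episodes carries over to unwatched episodes with the same type tag, which is the point of the TV series example in the text.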
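A flowchart of still other embodiments of image frame weight adjustment in the image processing method of the present disclosure is shown in FIG. 3C.
In step 331, the weight allocation policy for a type of user is updated according to the playback behavior of users of that type for the same video and/or videos with the same type tag. In some embodiments, users' viewing preferences are determined by collecting the playback behavior of different users, the degree of similarity between users is determined according to their viewing preferences, and users whose degree of similarity exceeds a predetermined threshold are classified as the same type of user.
In step 332, the weight of each image frame in videos with the same type tag is adjusted according to the updated weight allocation policy.
In step 333, user-type personalized shortened-version video images are generated from the reweighted image frames, so as to generate a user-type personalized shortened video.
Because the behavior of some users is similar to a degree, the playback behavior of users of the same type can be used to generate personalized shortened videos for that type of user. This mitigates the problems of insufficient underlying data and large random effects caused by the limited playback behavior of a single user, and improves how well shortened videos adapt to individual users.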
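Grouping users whose similarity exceeds a threshold can be sketched as below. The use of cosine similarity, the preference vectors (one score per frame type), the threshold value, and the greedy seed-based grouping are all assumptions; the text only requires a similarity measure and a predetermined threshold.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length preference vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def group_users(preferences, threshold=0.9):
    """Greedily group users whose similarity to a group's first member exceeds the threshold.

    preferences: dict mapping user id to a preference vector
    """
    groups = []  # each group is a list of user ids; the first one is the seed
    for user, vec in preferences.items():
        for group in groups:
            if cosine(vec, preferences[group[0]]) > threshold:
                group.append(user)
                break
        else:
            groups.append([user])  # no similar group yet: start a new one
    return groups
```

Pooling the playback events of one group before updating a shared policy is what addresses the sparse-data problem of a single user described above.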
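A flowchart of still other embodiments of the image processing method of the present disclosure is shown in FIG. 4.
In step 401, during the cold-start phase, semantic analysis is performed on the lines of the video to obtain condensed lines. In some embodiments, the length of the condensed lines can be controlled according to the length of the target video.
In step 402, the weight of each frame image in the video is determined based on a predetermined image processing strategy. In some embodiments, the weight of close-up image frames may be increased, the weight of image frames with rich facial emotion may be increased, and the weight of wide-angle image frames may be increased, each according to the weight allocation policy. In some embodiments, the amount of weight adjustment may be determined according to the weight allocation policy corresponding to the video's tag.
In step 403, image frames are extracted in descending order of weight according to a predetermined video shortening ratio to obtain shortened-version video images.
In step 404, a shortened video is obtained from the shortened-version video images and the condensed lines.
In step 411, the user's playback behavior is acquired over the course of the user's use.
In step 412, the weight of each image frame is adjusted by a user-based collaborative filtering algorithm. The collected user behavior data reveals each user's viewing preferences for a video, and these preferences are measured and scored. The relationships between users are calculated from their attitudes and degrees of preference toward the same videos, and video clips of the same style are edited and integrated for users with the same preferences.
In step 413, weights are reallocated to the image frames, and then step 430 is performed. In some embodiments, the approach of the embodiments related to FIG. 3A and/or FIG. 3B may be used to adjust the weight allocation based on user playback behavior.
In step 421, the weight of each image frame is adjusted by an item-based collaborative filtering algorithm. The relationships between videos are obtained by calculating different users' ratings of different videos. Based on these relationships, new series and films of a similar kind can be edited and integrated according to the same scheme. In some embodiments, the approach of the embodiments shown in FIG. 3A or FIG. 3C may be used to adjust the weight allocation based on the playback behavior of different users for videos of the same type.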
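The item-based relationship in step 421 can be sketched as a video-to-video similarity computed from user ratings. Cosine similarity over the users who rated both videos is one standard choice for item-based collaborative filtering and is an assumption here, as are the rating scale and the function names.

```python
import math


def item_similarity(ratings, a, b):
    """Cosine similarity of videos `a` and `b` over the users who rated both.

    ratings: dict mapping user id to a dict of {video id: score}
    """
    common = [u for u, r in ratings.items() if a in r and b in r]
    if not common:
        return 0.0
    dot = sum(ratings[u][a] * ratings[u][b] for u in common)
    norm_a = math.sqrt(sum(ratings[u][a] ** 2 for u in common))
    norm_b = math.sqrt(sum(ratings[u][b] ** 2 for u in common))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar_video(ratings, target):
    """Return the video most similar to `target`, or None if there is no other video."""
    others = sorted({v for r in ratings.values() for v in r} - {target})
    if not others:
        return None
    return max(others, key=lambda v: item_similarity(ratings, target, v))
```

Once the most similar existing video is known, its weight allocation policy can serve as the starting point for the new one, which is how the text suggests handling similar new series and films.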
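In step 422, the adjustment to the weight of each image frame is determined by a machine learning algorithm. By continuously adjusting the weight parameters through machine learning, the quality of the video can be continuously improved.
In some embodiments, step 421 and step 422 may be performed in either order.
Step 430 is then performed.
In step 430, the image frame weights are updated by combining the image frame weight configuration from the cold-start process with the weight adjustment results of steps 413 and 422. After the weights are updated, step 403 is performed to update the shortened-version video images and regenerate the shortened video.
In this way, the video is quickly filtered and processed according to different video types and styles, combined with people's preferences, leaving only the key content and main storyline. This improves viewing efficiency, saves viewing time, and lets people obtain more information per unit of time.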
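Step 430 merges the cold-start weights with the two adjustment results. The text does not say how they are combined, so the linear blend below, including the `alpha` and `beta` mixing coefficients and the clamp at zero, is purely an illustrative assumption.

```python
def combine_weights(cold_start, user_delta, item_delta, alpha=0.5, beta=0.5):
    """Blend cold-start weights with per-frame adjustments from steps 413 and 422.

    cold_start: weights assigned during the cold-start phase
    user_delta: per-frame adjustment from the user-behavior branch (step 413)
    item_delta: per-frame adjustment from the item-based / learned branch (step 422)
    """
    if not (len(cold_start) == len(user_delta) == len(item_delta)):
        raise ValueError("all weight lists must cover the same frames")
    return [
        max(0.0, base + alpha * u + beta * i)  # weights never go negative
        for base, u, i in zip(cold_start, user_delta, item_delta)
    ]
```

The combined list can then be passed back to the frame extraction of step 403 to regenerate the shortened video.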
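A schematic diagram of some embodiments of the image processing apparatus of the present disclosure is shown in FIG. 5. The image processing apparatus includes a line processing unit 51, a weight determination unit 52, an image shortening unit 53, and a shortened video acquisition unit 54.
The line processing unit 51 can perform semantic analysis on the lines of the video to obtain condensed lines. In some embodiments, all lines of the current video can be obtained and processed with NLU to condense and summarize the narrative structure, storyline, and so on of the video. The weight determination unit 52 can determine the weight of each frame image in the video based on a predetermined image processing strategy. The image shortening unit 53 can extract image frames in descending order of weight according to a predetermined video shortening ratio to obtain shortened-version video images. The shortened video acquisition unit 54 can obtain a shortened video from the shortened-version video images and the condensed lines, for example by merging the images and audio and controlling them to play in sync, thereby forming the shortened video.
Such a video processing apparatus can generate condensed lines that conform to the subject matter of the video, extract the important frames in the video, and automatically generate a shortened video for the user to watch, so that the user can efficiently obtain the useful information in the video and the user experience is improved.
In some embodiments, the shortened video acquisition unit 54 can determine the position of each frame image of the shortened-version video images on the timeline of the original video, determine the position on the original video timeline of the original line corresponding to each line of the condensed lines, and match the playback progress of the video images with the playback progress of the condensed lines according to the timeline to generate the shortened video. This keeps the lines and the image progress synchronized as far as possible, helps the user understand the video, improves the quality of the shortened video, and improves the user experience.
In some embodiments, the weight determination unit 52 can also adjust the weight of each frame image according to the playback behavior of users who have watched the video; the image shortening unit 53 can also generate shortened-version video images from the reweighted image frames, so that the shortened video acquisition unit updates the shortened video.
Such an apparatus can collect features of users' viewing behavior on the same video and, by analyzing their playback behavior, obtain feedback from users on the importance of image frames. This enables an image frame weight analysis tailored to the video itself, improves the accuracy of image frame extraction, and further improves how well the shortened video meets users' needs.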
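In some embodiments, the video processing apparatus may further include a user behavior acquisition unit 55 capable of acquiring a user's playback behavior for videos with the same type tag and updating the user's weight allocation policy for videos with the same type tag according to that playback behavior. The weight determination unit 52 can also adjust the weight of each image frame in videos with the same type tag according to the updated weight allocation policy; the image shortening unit 53 can also generate user-personalized shortened-version video images from the reweighted image frames, so that the shortened video acquisition unit generates a user-personalized shortened video.
Such an apparatus can analyze the personal preferences of an individual user and generate a user-specific weight allocation policy, enabling user-adaptive image shortening, keeping the video appealing to each user, and further improving the user experience.
In some embodiments, the video processing apparatus may further include a policy adjustment unit 56 capable of updating the weight allocation policy for a type of user according to the playback behavior of users of that type for the same video and/or videos with the same type tag; the weight determination unit 52 can also adjust the weight of each image frame in videos with the same type tag according to the updated weight allocation policy; the image shortening unit 53 can also generate user-type personalized shortened-version video images from the reweighted image frames, so that the shortened video acquisition unit generates a user-type personalized shortened video. In some embodiments, the video processing apparatus may further include a user type determination unit 57 capable of determining users' viewing preferences by collecting the playback behavior of different users, determining the degree of similarity between users according to their viewing preferences, and classifying users whose degree of similarity exceeds a predetermined threshold as the same type of user, so that the policy adjustment unit can update the weight allocation policy for that type of user.
Such an apparatus can use the playback behavior of users of the same type to generate a personalized shortened video for that type of user. This mitigates the problems of insufficient underlying data and large random effects caused by the limited playback behavior of a single user, and improves how well shortened videos adapt to individual users.
In some embodiments, the video processing apparatus may further include a shortened video optimization unit 58 capable of adjusting the weight of each image frame through an item-based collaborative filtering algorithm and a machine learning algorithm, and generating shortened-version video images from the reweighted image frames so as to update the shortened video. The shortened video is thereby continuously optimized and adapts and evolves for each user.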
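A schematic structural diagram of some embodiments of the video processing apparatus of the present disclosure is shown in FIG. 6. The video processing apparatus includes a memory 601 and a processor 602. The memory 601 may be a magnetic disk, flash memory, or any other non-volatile storage medium. The memory is used to store the instructions of the corresponding embodiments of the video processing method above. The processor 602 is coupled to the memory 601 and may be implemented as one or more integrated circuits, such as a microprocessor or a microcontroller. The processor 602 is used to execute the instructions stored in the memory, enabling the user to efficiently obtain the useful information in a video and improving the user experience.
In some embodiments, as shown in FIG. 7, the video processing apparatus 700 includes a memory 701 and a processor 702. The processor 702 is coupled to the memory 701 through a bus 703. The video processing apparatus 700 may also be connected to an external storage device 705 through a storage interface 704 in order to call external data, and may also be connected to a network or another computer system (not shown) through a network interface 706. Details are not described here.
In this embodiment, data instructions are stored in the memory and processed by the processor, so that the user can efficiently obtain the useful information in the video and the user experience is improved.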
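In other embodiments, a computer-readable storage medium stores computer program instructions that, when executed by a processor, implement the steps of the methods in the corresponding embodiments of the video processing method. Those skilled in the art should understand that embodiments of the present disclosure may be provided as a method, an apparatus, or a computer program product. Accordingly, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present disclosure may take the form of a computer program product implemented on one or more computer-usable non-transitory storage media (including but not limited to disk storage, CD-ROM, optical storage, and the like) containing computer-usable program code.
The present disclosure is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present disclosure. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to work in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device that implements the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.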
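The present disclosure has now been described in detail. Some details well known in the art are not described, in order to avoid obscuring the concepts of the present disclosure. From the above description, those skilled in the art can fully understand how to implement the technical solutions disclosed here.
The methods and apparatus of the present disclosure may be implemented in many ways. For example, they may be implemented by software, hardware, firmware, or any combination of software, hardware, and firmware. The above order of the method steps is for illustration only; the steps of the methods of the present disclosure are not limited to the order specifically described above unless otherwise specifically stated. In addition, in some embodiments, the present disclosure may also be implemented as programs recorded on a recording medium, the programs including machine-readable instructions for implementing the methods according to the present disclosure. Thus, the present disclosure also covers a recording medium storing a program for executing the methods according to the present disclosure.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present disclosure and not to limit them. Although the present disclosure has been described in detail with reference to preferred embodiments, those of ordinary skill in the art should understand that the specific implementations of the present disclosure can still be modified, or some technical features can be replaced by equivalents, without departing from the spirit of the technical solutions of the present disclosure, and all such modifications shall fall within the scope of the technical solutions claimed in the present disclosure.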

Claims (18)

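  1. A video processing method, comprising:
    performing semantic analysis on the lines of a video to obtain condensed lines;
    determining a weight for each frame image in the video based on a predetermined image processing strategy;
    extracting image frames in descending order of weight according to a predetermined video shortening ratio to obtain shortened-version video images;
    obtaining a shortened video from the shortened-version video images and the condensed lines.
  2. The video processing method according to claim 1, wherein obtaining the shortened video comprises:
    determining the position of each frame image of the shortened-version video images on the timeline of the original video;
    determining the position, on the timeline of the original video, of the original line corresponding to each line of the condensed lines;
    matching the playback progress of the video images with the playback progress of the condensed lines according to the timeline to generate the shortened video.
  3. The video processing method according to claim 1, wherein the predetermined image processing strategy comprises:
    determining a weight allocation policy according to the type tag of the video;
    and performing one or more of the following operations according to the weight allocation policy:
    increasing the weight of close-up image frames according to the weight allocation policy;
    increasing the weight of image frames with rich facial emotion according to the weight allocation policy; or
    increasing the weight of wide-angle image frames according to the weight allocation policy.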
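  4. The video processing method according to claim 1, further comprising:
    adjusting the weight of each frame image according to the playback behavior of users who have watched the video;
    generating shortened-version video images from the reweighted image frames so as to update the shortened video.
  5. The video processing method according to claim 1, further comprising, for a single user:
    obtaining the user's playback behavior for videos with the same type tag, and updating the user's weight allocation policy for videos with the same type tag according to the user's playback behavior;
    adjusting the weight of each image frame in videos with the same type tag according to the updated weight allocation policy;
    generating user-personalized shortened-version video images from the reweighted image frames so as to generate a user-personalized shortened video.
  6. The video processing method according to claim 1, further comprising:
    updating the weight allocation policy for a type of user according to the playback behavior of users of that type for videos of at least one of the same video or the same type tag;
    adjusting the weight of each image frame in videos with the same type tag according to the updated weight allocation policy;
    generating user-type personalized shortened-version video images from the reweighted image frames so as to generate a user-type personalized shortened video.
  7. The video processing method according to claim 6, further comprising:
    determining users' viewing preferences by collecting their playback behavior;
    determining the degree of similarity between users according to their viewing preferences;
    classifying users whose degree of similarity exceeds a predetermined threshold as the same type of user.
  8. The video processing method according to any one of claims 1 to 7, further comprising:
    adjusting the weight of each image frame by an item-based collaborative filtering algorithm and by a machine learning algorithm, and generating shortened-version video images from the reweighted image frames so as to update the shortened video.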
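  9. A video processing apparatus, comprising:
    a line processing unit configured to perform semantic analysis on the lines of a video to obtain condensed lines;
    a weight determination unit configured to determine a weight for each frame image in the video based on a predetermined image processing strategy;
    an image shortening unit configured to extract image frames in descending order of weight according to a predetermined video shortening ratio to obtain shortened-version video images;
    a shortened video acquisition unit configured to obtain a shortened video from the shortened-version video images and the condensed lines.
  10. The video processing apparatus according to claim 9, wherein the shortened video acquisition unit is configured to:
    determine the position of each frame image of the shortened-version video images on the timeline of the original video;
    determine the position, on the timeline of the original video, of the original line corresponding to each line of the condensed lines;
    match the playback progress of the video images with the playback progress of the condensed lines according to the timeline to generate the shortened video.
  11. The video processing apparatus according to claim 9, wherein the predetermined image processing strategy comprises:
    determining a weight allocation policy according to the type tag of the video;
    and performing one or more of the following operations according to the weight allocation policy:
    increasing the weight of close-up image frames according to the weight allocation policy;
    increasing the weight of image frames with rich facial emotion according to the weight allocation policy; or
    increasing the weight of wide-angle image frames according to the weight allocation policy.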
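  12. The video processing apparatus according to claim 9, wherein the weight determination unit is further configured to adjust the weight of each frame image according to the playback behavior of users who have watched the video;
    the image shortening unit is further configured to generate shortened-version video images from the reweighted image frames, so that the shortened video acquisition unit updates the shortened video.
  13. The video processing apparatus according to claim 9, further comprising:
    a user behavior acquisition unit configured to, for a single user, obtain the user's playback behavior for videos with the same type tag and update the user's weight allocation policy for videos with the same type tag according to the user's playback behavior;
    the weight determination unit is further configured to adjust the weight of each image frame in videos with the same type tag according to the updated weight allocation policy;
    the image shortening unit is further configured to generate user-personalized shortened-version video images from the reweighted image frames, so that the shortened video acquisition unit generates a user-personalized shortened video.
  14. The video processing apparatus according to claim 9, further comprising:
    a policy adjustment unit configured to update the weight allocation policy for a type of user according to the playback behavior of users of that type for videos of at least one of the same video or the same type tag;
    the weight determination unit is further configured to adjust the weight of each image frame in videos with the same type tag according to the updated weight allocation policy;
    the image shortening unit is further configured to generate user-type personalized shortened-version video images from the reweighted image frames, so that the shortened video acquisition unit generates a user-type personalized shortened video.
  15. The video processing apparatus according to claim 14, further comprising a user type determination unit configured to:
    determine users' viewing preferences by collecting their playback behavior;
    determine the degree of similarity between users according to their viewing preferences;
    classify users whose degree of similarity exceeds a predetermined threshold as the same type of user.
  16. The video processing apparatus according to any one of claims 9 to 15, further comprising:
    a shortened video optimization unit configured to adjust the weight of each image frame by an item-based collaborative filtering algorithm and by a machine learning algorithm, and to generate shortened-version video images from the reweighted image frames so as to update the shortened video.
  17. A video processing apparatus, comprising:
    a memory; and
    a processor coupled to the memory, the processor being configured to perform the method according to any one of claims 1 to 8 based on instructions stored in the memory.
  18. A computer-readable storage medium on which computer program instructions are stored, the instructions, when executed by a processor, implementing the steps of the method according to any one of claims 1 to 8.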
PCT/CN2019/097527 2018-07-27 2019-07-24 Video processing method and apparatus Ceased WO2020020241A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US17/263,425 US11445272B2 (en) 2018-07-27 2019-07-24 Video processing method and apparatus
EP19841881.6A EP3826312A4 (en) 2018-07-27 2019-07-24 VIDEO PROCESSING METHOD AND APPARATUS

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810843764.5A CN110769279B (zh) 2018-07-27 2018-07-27 Video processing method and apparatus
CN201810843764.5 2018-07-27

Publications (1)

Publication Number Publication Date
WO2020020241A1 true WO2020020241A1 (zh) 2020-01-30

Family

ID=69181325

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/097527 Ceased WO2020020241A1 (zh) 2018-07-27 2019-07-24 Video processing method and apparatus

Country Status (4)

Country Link
US (1) US11445272B2 (zh)
EP (1) EP3826312A4 (zh)
CN (1) CN110769279B (zh)
WO (1) WO2020020241A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115022705A (zh) * 2022-05-24 2022-09-06 咪咕文化科技有限公司 Video playing method, apparatus and device

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111918112B (zh) * 2020-06-29 2021-07-20 北京大学 Video optimization method, apparatus, storage medium and terminal
US11418859B1 (en) 2021-07-13 2022-08-16 Rovi Guides, Inc. System and methods to determine user interest in unresolved plot of a series and resolving it
CN114297439B (zh) * 2021-12-20 2023-05-23 天翼爱音乐文化科技有限公司 一种短视频标签确定方法、系统、装置及存储介质
US11769531B1 (en) 2023-01-03 2023-09-26 Roku, Inc. Content system with user-input based video content generation feature
US12445684B2 (en) * 2023-07-06 2025-10-14 Sony Group Corporation Content category based media clip generation from media content using machine learning (ML) model

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5664227A (en) * 1994-10-14 1997-09-02 Carnegie Mellon University System and method for skimming digital audio/video data
CN1969552A (zh) * 2004-06-17 2007-05-23 皇家飞利浦电子股份有限公司 Personalized summaries using personality attributes
US20110305439A1 (en) * 2009-02-20 2011-12-15 Subhasis Chaudhuri Device and method for automatically recreating a content preserving and compression efficient lecture video
US20120033949A1 (en) * 2010-08-06 2012-02-09 Futurewei Technologies, Inc. Video Skimming Methods and Systems
CN105761263A (zh) * 2016-02-19 2016-07-13 浙江大学 Video key frame extraction method based on shot boundary detection and clustering
CN106888407A (zh) * 2017-03-28 2017-06-23 腾讯科技(深圳)有限公司 Video summary generation method and apparatus

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040125877A1 (en) * 2000-07-17 2004-07-01 Shin-Fu Chang Method and system for indexing and content-based adaptive streaming of digital video content
KR100411342B1 (ko) * 2001-05-22 2003-12-18 엘지전자 주식회사 Method for generating video text synthesis key frame
KR100411437B1 (ko) * 2001-12-28 2003-12-18 엘지전자 주식회사 Intelligent news video browsing system
US20070245379A1 (en) 2004-06-17 2007-10-18 Koninklijke Phillips Electronics, N.V. Personalized summaries using personality attributes
KR20140049832A (ko) * 2012-10-18 2014-04-28 삼성전자주식회사 Blu-ray disc, Blu-ray disc player for playing the same, and subtitle display method thereof
US20160029106A1 (en) * 2013-03-06 2016-01-28 Zhibo Chen Pictorial summary of a video
CN103150373A (zh) * 2013-03-08 2013-06-12 北京理工大学 High-satisfaction video summary generation method
US9286938B1 (en) 2014-01-02 2016-03-15 Google Inc. Generating and providing different length versions of a video
EP3340103A1 (en) * 2016-12-21 2018-06-27 Axis AB Method for identifying events in a motion video
CN107943990B (zh) * 2017-12-01 2020-02-14 天津大学 Multi-video summarization method based on weighted archetypal analysis

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3826312A4 *

Also Published As

Publication number Publication date
US11445272B2 (en) 2022-09-13
EP3826312A1 (en) 2021-05-26
CN110769279A (zh) 2020-02-07
US20210314675A1 (en) 2021-10-07
CN110769279B (zh) 2023-04-07
EP3826312A4 (en) 2022-04-27

Similar Documents

Publication Publication Date Title
US11445272B2 (en) Video processing method and apparatus
CN109922373B (zh) Video processing method, apparatus, and storage medium
US9961403B2 (en) Visual summarization of video for quick understanding by determining emotion objects for semantic segments of video
JP5010292B2 (ja) Video attribute information output apparatus, video summarization apparatus, program, and video attribute information output method
US8750681B2 (en) Electronic apparatus, content recommendation method, and program therefor
Hua et al. AVE: automated home video editing
CN110769314A (zh) Video playing method and apparatus, and computer-readable storage medium
CN111757170A (zh) Video segmentation and marking method and apparatus
US20220167055A1 (en) Method and device for controlling video playback
CN110519620A (zh) Method for recommending television programs on a television, and television
Chu et al. On broadcasted game video analysis: event detection, highlight detection, and highlight forecast
US12482499B1 (en) System and method for AI-powered narrative analysis of video content
Midoglu et al. MMSys'22 grand challenge on AI-based video production for soccer
KR20180089977A (ko) System and method for video segmentation by event unit
KR20200044435A (ko) User-customized video recommendation system using shot classification of video
Wang et al. From Long Videos to Engaging Clips: A Human-Inspired Video Editing Framework with Multimodal Narrative Understanding
CN113012723B (zh) Multimedia file playing method and apparatus, and electronic device
US20220164024A1 (en) User-driven adaptation of immersive experiences
CN111163366B (zh) Video processing method and terminal
Yu et al. Text2Video: automatic video generation based on text scripts
WO2022200815A1 (en) Video content item selection
CN114022814A (zh) Video processing method and apparatus, electronic device, and computer-readable storage medium
Zwicklbauer et al. Video analysis for interactive story creation: the sandmännchen showcase
US12439108B2 (en) Video clip learning model
US20260059180A1 (en) System and Method for AI-Powered Generation and Delivery of Video Clips

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 19841881; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
ENP Entry into the national phase (Ref document number: 2019841881; Country of ref document: EP; Effective date: 20210222)