WO2022121796A1 - Image processing method and electronic device - Google Patents

Image processing method and electronic device

Info

Publication number
WO2022121796A1
Authority
WO
WIPO (PCT)
Prior art keywords
style
electronic device
image
video
images
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2021/135353
Other languages
English (en)
French (fr)
Inventor
陈文东
陈帅
刘蒙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to EP21902503.8A (patent EP4246955A4)
Priority to US18/256,158 (patent US12567129B2)
Publication of WO2022121796A1
Anticipated expiration
Ceased

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/50 Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60 Control of cameras or camera modules
    • H04N23/62 Control of parameters via user interfaces
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60 Control of cameras or camera modules
    • H04N23/698 Control of cameras or camera modules for achieving an enlarged field of view, e.g. panoramic image capture
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/00 Two-dimensional [2D] image generation
    • G06T11/10 Texturing; Colouring; Generation of textures or colours
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60 Control of cameras or camera modules
    • H04N23/617 Upgrading or updating of programs or applications for camera control
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60 Control of cameras or camera modules
    • H04N23/63 Control of cameras or camera modules by using electronic viewfinders
    • H04N23/631 Graphical user interfaces [GUI] specially adapted for controlling image capture or setting capture parameters
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60 Control of cameras or camera modules
    • H04N23/63 Control of cameras or camera modules by using electronic viewfinders
    • H04N23/631 Graphical user interfaces [GUI] specially adapted for controlling image capture or setting capture parameters
    • H04N23/632 Graphical user interfaces [GUI] specially adapted for controlling image capture or setting capture parameters for displaying or modifying preview images prior to image capturing, e.g. variety of image resolutions or capturing parameters
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60 Control of cameras or camera modules
    • H04N23/667 Camera operation mode switching, e.g. between still and video, sport and normal or high- and low-resolution modes
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60 Control of cameras or camera modules
    • H04N23/68 Control of cameras or camera modules for stable pick-up of the scene, e.g. compensating for camera body vibrations
    • H04N23/682 Vibration or motion blur correction
    • H04N23/683 Vibration or motion blur correction performed by a processor, e.g. controlling the readout of an image memory
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/80 Camera processing pipelines; Components thereof
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2200/00 Indexing scheme for image data processing or generation, in general
    • G06T2200/24 Indexing scheme for image data processing or generation, in general involving graphical user interfaces [GUIs]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20212 Image combination
    • G06T2207/20221 Image fusion; Image merging

Definitions

  • the present application relates to the field of terminal technologies, and in particular, to an image processing method and an electronic device.
  • the electronic device may perform frame extraction processing on the video to achieve the above-mentioned time-lapse photography effect.
  • users often need to keep electronic devices in one place for a long time. This imposes high restrictions on the scene, equipment and time of time-lapse photography.
  • The present application provides an image processing method and an electronic device that can transfer the style of a video captured by the user, so that a video captured in a short time has the time-lapse effect of a video captured over a long time, improving the convenience and fun of time-lapse photography for the user.
  • the present application provides an image processing method.
  • the method includes: the electronic device acquiring a first sequence of images.
  • the electronic device may process the first image sequence based on the target transfer style to obtain the second image sequence.
  • Both the first image sequence and the second image sequence contain n frames of images.
  • the high-level semantic information of the ith frame image in the first image sequence is the same as that of the ith frame image in the second image sequence.
  • the style of the ith frame image in the first image sequence is different from the style of the ith frame image in the second image sequence.
  • The target transfer style indicates that, from the first frame image to the n-th frame image of the second image sequence, the style changes among M styles in a first style order.
  • n and M are integers greater than 1.
  • i is a positive integer less than or equal to n.
  • the electronic device may save the second sequence of images.
  • the above-mentioned first image sequence may be a video or a multi-frame image obtained by segmenting a panorama.
  • the style of the image may include the texture feature of the image and the artistic expression of the image.
  • the content of the image may include low-level semantic information and high-level semantic information of the image.
  • the low-level semantic information of the image is the style of the above image.
  • the high-level semantic information of an image refers to what the image expresses that is closest to human understanding.
  • the electronic device processing the first image sequence based on the target transfer style may specifically be performing style transfer processing.
  • the above-described style transfer process may be to change the style of the image. That is, the style of the image processed by style transfer changes, but the high-level semantic information of the original image remains unchanged.
  • the above-mentioned target transfer style may be, for example, a day-to-night transition style, a changing-seasons style, a style alternating between rain and shine, and the like.
  • the above-mentioned target transfer style may be determined by the electronic device based on the style option selected by the user.
  • the above target transfer style can be used to determine the size of M above. That is, the number of style transfer models used for fusion.
  • In the second image sequence, whose style changes in the first style order, the high-level semantic information of the images may appear to change in the order of natural time.
  • the target transfer style is day-night transfer style.
  • the above M styles can be day style and night style.
  • the above-mentioned first style order may be an order changing from day style to night style.
  • the target transfer style is the seasonal change style.
  • the above M styles can be spring style, summer style, autumn style and winter style.
  • the above-mentioned first style order may be from spring style to summer style, then from summer style to autumn style, and then from autumn style to winter style. This embodiment of the present application does not limit the order in which the M styles are arranged in the above-mentioned first style order.
  • the electronic device may use k fusion style transfer models to process the first image sequence based on the above target transfer style.
  • k is less than or equal to n.
  • the output images of the k fused style transfer models are the above-mentioned second image sequence.
  • the output image of one fusion style transfer model may be one frame of images or consecutive multiple frames of images in the second image sequence.
  • One of the above k fusion style transfer models may be weighted and generated by M single style transfer models.
  • the styles of the respective output images of the above-mentioned M single-style transfer models constitute the above-mentioned M styles, and j is a positive integer less than or equal to M.
  • the target transfer style is day-night transfer style.
  • the single style transfer model used by the electronic device to generate the fusion model can be a day style transfer model and a night style transfer model.
  • the weights of the daytime style transfer model and the nighttime style transfer model in the fusion style transfer model for performing style transfer processing on the first image sequence are different. Specifically, from the first fusion style transfer model to the k-th fusion style transfer model, the weight of the daytime style transfer model can gradually decrease, and the weight of the nighttime style transfer model can gradually increase.
  • the electronic device can use the k fused style transfer models to perform style transfer processing on the first image sequence to obtain the second image sequence.
  • the styles of the first frame image to the nth frame image in the second image sequence may gradually change from a day style to a night style.
  • the above k fused style transfer models and the above M single style transfer models are both neural network models and have the same neural network structure.
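  • The weighting of the k fusion style transfer models described above can be sketched as follows. This is a minimal illustration, not the method of the application: the linear schedule between consecutive styles, the even assignment of frames to models, and the names `fusion_weights` and `model_index_for_frame` are all assumptions introduced here.

```python
def fusion_weights(k, m):
    """Return k weight vectors of length m (one weight per single-style model).

    Assumption: the first fused model is purely the first style, the last is
    purely the last style, and weights move linearly between consecutive styles.
    """
    if k < 1 or m < 2:
        raise ValueError("need k >= 1 and m >= 2")
    schedule = []
    for idx in range(k):
        # Position along the style order: 0.0 is the first style, m - 1 the last.
        pos = 0.0 if k == 1 else idx * (m - 1) / (k - 1)
        lower = min(int(pos), m - 2)   # index of the style being left
        frac = pos - lower             # progress toward the next style
        weights = [0.0] * m
        weights[lower] = 1.0 - frac
        weights[lower + 1] = frac
        schedule.append(weights)
    return schedule


def model_index_for_frame(i, n, k):
    """Map frame i (0-based, out of n frames) to one of k fused models, k <= n.

    Consecutive frames may share a model, so one model can output one frame
    or several consecutive frames.
    """
    if not (1 <= k <= n and 0 <= i < n):
        raise ValueError("need 1 <= k <= n and 0 <= i < n")
    return i * k // n
```

  • With M = 2 (day, night) and k = 5, the day weight falls from 1.0 to 0.0 while the night weight rises from 0.0 to 1.0, matching the gradual decrease and increase described above.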
  • the above single style transfer model for generating the fusion style transfer model is obtained after training.
  • the electronic device may obtain the training data set.
  • the above training data set may include one or more frames of style images and multiple frames of content images in the first video.
  • the style of one or more style images is the style of the output image of the trained single style transfer model.
  • the above content images may be images that need to be styled.
  • the electronic device may process multiple frames of content images in the first video by using the single style transfer model to be trained to obtain multiple frames of composite images.
  • the electronic device can use the loss function to train the above-mentioned single-style transfer model to be trained to obtain a trained single-style transfer model.
  • the loss function may include a high-level semantic information loss function, a style loss function, and a time domain constraint loss function.
  • the above-mentioned high-level semantic information loss function is determined by the high-level semantic information of the multi-frame content image and the high-level semantic information of the multi-frame composite image.
  • the above style loss function is determined by the style of the multi-frame content image and the style of the multi-frame composite image.
  • the above-mentioned time-domain constrained loss function is determined by the style of one frame of composite image in the multi-frame composite image and the style of the multi-frame composite image adjacent to the one frame of composite image.
  • By introducing the above-mentioned time-domain constraint loss function into the loss function used to train the single-style transfer model, the electronic device can take into account the connection between consecutive frames of content images, reducing the probability that the style jumps between adjacent frames when the single-style transfer model performs style transfer processing on multiple frames of images in a video.
  • the trained style transfer model can improve the consistency of the stylization effect of images of consecutive multiple frames of content in the video, and reduce the flickering phenomenon during the video playback process when performing style transfer on the multi-frame images in the video.
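  • One plausible way to write the three-part loss described above is shown below. The text only names the three terms; combining them as a weighted sum, the non-negative weights α, β, γ, the style-feature extractor S, the neighbor set N(t), and the squared-difference form of the temporal term are all assumptions made for illustration. Here x̂_t denotes the t-th composite frame and T the number of composite frames.

```latex
\mathcal{L} = \alpha\,\mathcal{L}_{\mathrm{sem}} + \beta\,\mathcal{L}_{\mathrm{style}} + \gamma\,\mathcal{L}_{\mathrm{temp}},
\qquad
\mathcal{L}_{\mathrm{temp}} = \sum_{t=1}^{T} \sum_{s \in N(t)} \bigl\lVert S(\hat{x}_t) - S(\hat{x}_s) \bigr\rVert_2^2
```

  • Under this reading, the temporal term is small when a composite frame and its adjacent composite frames have similar styles, which is what discourages style jumps and flicker between neighboring frames.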
  • the above trained single style transfer model may be stored in an electronic device.
  • the electronic device can obtain a single style transfer model locally for fusion.
  • the above-trained single-style transfer model can be stored in the cloud.
  • the electronic device can upload the video that needs to be style-transferred and the target style to be transferred to the cloud.
  • the cloud can use the fusion style transfer model to perform style transfer on the video, and send the obtained style-transferred video to the electronic device.
  • the electronic device 100 may send only the target transfer style to the cloud.
  • the cloud can send the single style transfer model that needs to be integrated to the electronic device according to the above target transfer style.
  • the electronic device may turn on the camera to collect the first video, and obtain n frames of images in the first image sequence according to the first video.
  • the first video includes z-frame images.
  • the above-mentioned n-frame images are extracted from the above-mentioned z-frame images.
  • When the electronic device acquires the above-mentioned first video, it may further perform anti-shake processing on the first video.
  • the above-mentioned anti-shake processing may be, for example, electronic anti-shake and/or optical anti-shake.
  • the electronic device can perform frame extraction and style transfer processing on the video shot by the user in real time.
  • a video shot by a user in a short period of time can have a time-lapse effect of shooting a video for a long time.
  • For example, a time-lapse video shot by the user within 1 minute may show a rapid change from day to night, an effect that originally took 12 hours or even longer to shoot.
  • the electronic device performs anti-shake processing on the captured video. In this way, the user can hold the electronic device by hand when shooting a time-lapse video, without needing a fixing device to keep the electronic device in one place.
  • the above method enables time-lapse photography to break through the restrictions on shooting scenes, equipment and time, and improves the convenience and fun of time-lapse photography for users.
  • the electronic device may obtain the first video from locally stored videos according to the first video selected by the user, and obtain n frames of images in the first image sequence according to the first video.
  • the first video contains z-frame images, and n-frame images are extracted from the z-frame images.
  • the user can select the corresponding video from locally stored videos such as the gallery app.
  • the electronic device can perform the above frame extraction and style transfer processing on the video selected by the user, so that the processed video can have a time-lapse effect of shooting a video for a long time.
  • the electronic device can also obtain videos from the cloud to perform the above frame extraction and style transfer processing.
  • the above-mentioned frame extraction rate may be determined by the playback duration of the first image sequence selected by the user.
  • the frame extraction rate may be the ratio of the playback duration of the first image sequence to the capture duration of the first video.
  • For example, the capture duration of the above-mentioned first video is 1 minute.
  • the playback duration of the first image sequence selected by the user is 10 seconds. Then, the electronic device can extract the above-mentioned first image sequence from the multiple frames of images of the first video at a ratio of 1:6.
  • the frame extraction rate for the first video can be customized according to the playback duration of the first image sequence that the user selects to be generated. In this way, the maximum frame extraction rate that the electronic device can provide will not be limited by the capture duration of the first video.
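  • The frame extraction described above can be sketched as follows. This is a minimal sketch under stated assumptions: the extracted sequence is assumed to play at the same frame rate at which the first video was captured, the kept frames are assumed to be spread evenly, and the function name `extract_frame_indices` is hypothetical.

```python
def extract_frame_indices(z, capture_seconds, playback_seconds):
    """Pick indices of frames to keep so the result plays for playback_seconds.

    Assumes playback at the capture frame rate, so the share of frames kept
    equals playback_seconds / capture_seconds.
    """
    if z <= 0 or capture_seconds <= 0 or not (0 < playback_seconds <= capture_seconds):
        raise ValueError("invalid frame count or durations")
    n = max(1, round(z * playback_seconds / capture_seconds))
    # Spread the n kept frames evenly over the z captured frames.
    return [i * z // n for i in range(n)]
```

  • For a 1-minute first video of 1800 frames (30 frames per second is an assumed capture rate) and a 10-second playback duration, this keeps 300 frames, one out of every six, which matches the 1:6 ratio in the example above.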
  • the electronic device may store the n frames of images in the above-mentioned second image sequence in series as a video. During the playing process of the video, the effect that the above-mentioned M styles are sequentially changed according to the above-mentioned first style can be presented.
  • the electronic device may acquire the first image, and segment the first image to obtain n frames of images in the above-mentioned first image sequence.
  • the electronic device may segment the above-mentioned first image by cropping images with a sliding window.
  • the length of the sliding window may be the first length.
  • the distance of each slide of the sliding window may be the first sliding distance.
  • the electronic device may slide the sliding window from one side of the first image to the other side n-1 times to obtain n frames of images, each of the first length.
  • the above-mentioned first sliding distance may be smaller than the first length. That is, adjacent images in the above n frames of images have overlapping parts.
  • the electronic device may cut out a stitching region from each frame of image in the second image sequence to obtain n stitching regions.
  • the above n stitching regions do not overlap one another.
  • the electronic device may stitch the above n stitching regions together to obtain a second image, and store the second image.
  • the resolution of the second image is the same as the resolution of the first image.
  • the high-level semantic information of the second image is the same as the high-level semantic information of the first image.
  • the resolutions of the above n stitching regions may be the same or different.
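  • The sliding-window segmentation and the non-overlapping stitching regions can be sketched in one dimension (column ranges) as follows. This is an illustrative sketch only: it assumes the sliding distance does not exceed the window length, so that adjacent frames overlap or touch, and it places each cut at the midpoint of the overlap, which is just one possible choice. The function names are hypothetical.

```python
def sliding_windows(width, window, stride):
    """Return (start, end) column ranges of windows slid across the image.

    Assumes stride <= window, and that (width - window) is a multiple of
    stride so the last window ends exactly at the image edge.
    """
    if not (0 < stride <= window <= width) or (width - window) % stride:
        raise ValueError("window and stride do not tile the image width")
    n = (width - window) // stride + 1   # n windows require n - 1 slides
    return [(s * stride, s * stride + window) for s in range(n)]


def stitch_regions(windows):
    """Choose one non-overlapping (start, end) region inside each window.

    Cuts are placed at the midpoint of the overlap between neighbors, so the
    regions together cover the full width exactly once.
    """
    cuts = [windows[0][0]]
    for (_, left_end), (right_start, _) in zip(windows, windows[1:]):
        cuts.append((right_start + left_end) // 2)
    cuts.append(windows[-1][1])
    return list(zip(cuts, cuts[1:]))
```

  • For a 1000-column image, a 400-column window and a 200-column slide, this yields four windows and the stitching regions (0, 300), (300, 500), (500, 700), (700, 1000). The regions differ in width but add up to the original width, consistent with the second image having the same resolution as the first image.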
  • the above-mentioned first image may be obtained by the user turning on the camera of the electronic device to shoot immediately, or may be selected by the user from a gallery application of the electronic device.
  • the target transfer style is a day-night transfer style.
  • the electronic device may perform style transfer processing on the first image sequence by using the fusion style transfer model.
  • the second image that the electronic device obtains by stitching the stitching regions from the second image sequence may present a gradual change from a daytime style to a nighttime style from one side to the other.
  • the styles of the stitching regions cut from adjacent images can transition more smoothly. That is, the above-mentioned method of segmenting the first image and cutting out a stitching region from each image obtained through style transfer can improve the smoothness of the stylization effect of the first image.
  • the style of the second image obtained by stitching can change more smoothly, from one side to the other, among the above-mentioned M styles in the first style order.
  • the second image with the style gradient effect can be obtained. This can make it more interesting for users to capture images, especially panoramas.
  • the present application provides an electronic device, which may include a display screen, a memory, and one or more processors.
  • This memory can be used to store multiple single style transfer models.
  • the memory can also be used to store computer programs.
  • the processor can be used to invoke a computer program in the memory, so that the electronic device executes any one of the possible implementations of the first aspect.
  • the present application provides a computer storage medium, including instructions, which, when the above-mentioned instructions are executed on an electronic device, cause the above-mentioned electronic device to execute any one of the possible implementation manners of the above-mentioned first aspect.
  • an embodiment of the present application provides a chip applied to an electronic device. The chip includes one or more processors, and the processor is configured to invoke computer instructions to cause the electronic device to execute any one of the possible implementations of the first aspect.
  • an embodiment of the present application provides a computer program product containing instructions, which, when the computer program product is run on an electronic device, enables the electronic device to execute any one of the possible implementations of the first aspect.
  • the electronic device provided in the second aspect, the computer storage medium provided in the third aspect, the chip provided in the fourth aspect, and the computer program product provided in the fifth aspect are all used to execute the methods provided by the embodiments of the present application. Therefore, for the beneficial effects that can be achieved, reference may be made to the beneficial effects in the corresponding method, which will not be repeated here.
  • FIGS. 1A to 1F are schematic diagrams of some user interfaces for shooting time-lapse videos provided by embodiments of the present application.
  • FIGS. 2A to 2C are schematic diagrams of some user interfaces for playing time-lapse videos provided by embodiments of the present application.
  • FIG. 3 is a schematic diagram of a method for obtaining a fusion style transfer model by an electronic device according to an embodiment of the present application.
  • FIG. 4 is a schematic diagram of a method for performing style transfer on a video by an electronic device using a fusion style transfer model provided by an embodiment of the present application.
  • FIG. 5 is a flowchart of a method for training a style transfer model provided by an embodiment of the present application.
  • FIG. 6 is a flowchart of a shooting method provided by an embodiment of the present application.
  • FIGS. 7A to 7G are schematic diagrams of some user interfaces for performing style transfer on videos provided by embodiments of the present application.
  • FIGS. 8A to 8E are schematic diagrams of some user interfaces for shooting panoramas provided by embodiments of the present application.
  • FIG. 9 is a schematic diagram of a method, provided by an embodiment of the present application, in which an electronic device performs style transfer on a panorama by using a fusion style transfer model.
  • FIG. 10 is a schematic structural diagram of an electronic device 100 provided by an embodiment of the present application.
  • Time-lapse photography is a photography technique that compresses time.
  • the electronic device can capture images of a scenic spot from morning (e.g., 7:00) to night (e.g., 22:00) at the normal capture rate (e.g., 30 frames per second) to obtain an original video. Then, the electronic device can perform frame extraction on the original video. For example, the electronic device may extract 1 frame out of every 1800 frames. The electronic device can concatenate the frames obtained through frame extraction in order of capture time to obtain a time-lapse video. At a video playback rate of 30 frames per second, the electronic device thereby compresses a video with a playback time of 15 hours into a video with a playback time of 30 seconds. That is, the electronic device can present the change of the scenic spot from 7:00 in the morning to 22:00 in the evening in a 30-second time-lapse video.
  • the electronic device may adjust the rate of capturing images to obtain multiple frames of images included in the time-lapse video. Specifically, the electronic device may capture images according to a delay rate input by a user or a preset delay rate. For example, the delay rate is 2 frames per second. Then, the electronic device can concatenate the multiple frames of images collected at the above-mentioned delay rate and play them in the form of video.
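  • The arithmetic behind the two conventional approaches above can be checked with a short sketch. The function names are hypothetical, and for the capture-rate approach the 30 frames per second playback rate is an assumption carried over from the frame-extraction example.

```python
def playback_seconds_after_extraction(capture_hours, capture_fps, keep_every, playback_fps):
    """Playback duration after keeping 1 frame out of every `keep_every` frames."""
    captured_frames = capture_hours * 3600 * capture_fps
    kept_frames = captured_frames // keep_every
    return kept_frames / playback_fps


def speedup_from_capture_rate(capture_rate_fps, playback_fps):
    """How many times faster the scene appears when frames are captured slowly."""
    return playback_fps / capture_rate_fps
```

  • Capturing 15 hours at 30 frames per second gives 1,620,000 frames; keeping 1 in 1800 leaves 900 frames, which play for 30 seconds at 30 frames per second. Capturing at 2 frames per second and playing at an assumed 30 frames per second makes the scene appear 15 times faster.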
  • time-lapse photography generally takes a long time to shoot.
  • users often need to use a fixed device such as a tripod to fix the electronic device used for shooting in one place when performing time-lapse photography.
  • It is difficult for a user holding an electronic device (such as a mobile phone) by hand to shoot a video with a time-lapse effect (such as a rapid change from day to night) in a short period of time.
  • the above-mentioned time-lapse effect may refer to compressing a slow change of an object or scene into a short period of time, so as to present a rapid change of the object or scene.
  • the embodiments of the present application provide an image processing method.
  • the electronic device can perform anti-shake processing on the first video obtained by shooting the first scene in the first time period to obtain the second video. Then, the electronic device may perform frame extraction processing on the second video to obtain a third video.
  • the electronic device may use the fusion style transfer model to perform style transfer on the multiple frames of images in the third video to obtain the fourth video.
  • the above-mentioned fusion style transfer model is obtained by integrating at least two style transfer models in the electronic device.
  • the electronic device may use the model obtained by fusing the daytime style transfer model and the nighttime style transfer model to perform style transfer on multiple frames of images in the third video.
  • the fourth video can present the effect that the first scene changes rapidly in time from day to night.
  • This embodiment of the present application does not limit the length of the above-mentioned first time period.
  • the first period of time may be a short period of time, such as 30 seconds, 1 minute, or the like.
  • the electronic device can perform the processing in the above image processing method on the video shot in a short time, so that the video shot in a short time can have a time-lapse effect of shooting a video for a long time.
  • the above image processing method enables time-lapse photography to break through the limitations of shooting scenes, equipment and time, and improves the convenience and fun of time-lapse photography for users.
  • the style of the image can include the texture features of the image and the artistic expression of the image.
  • the style of the image may be cartoon style, manga style, oil painting style, realistic style, ukiyo-e style, day style, night style, spring style, summer style, autumn style, winter style, sunny style, rain style, etc.
  • the embodiment of the present application does not limit the style of the image.
  • Performing style transfer on an image may refer to merging a first image with style transfer requirements and a second image with a target style to generate a third image.
  • the above-mentioned fusion process may be a process of processing the first image by using a style transfer model.
  • the above style transfer model can be used to output images with the above target style. For the description of the above style transfer model, please refer to the introduction of the third point below.
  • the above-mentioned third image has high-level semantic information in the content of the first image and the style of the second image.
  • the content of the image may include low-level semantic information and high-level semantic information of the image.
  • the low-level semantic information may refer to the color, texture, etc. of the image.
  • the low-level semantic information of the image is also the style of the image.
  • High-level semantic information can refer to what the image expresses that is closest to human understanding. For example, consider an image that contains sand, blue sky, and sea water.
  • the low-level semantic information of the image may include the color and texture of sand, blue sky, and sea water.
  • the high-level semantic information of the image may be that the image contains sand, blue sky, sea water, and that the image is a beach image.
  • the above-mentioned first image with the need for style transfer may be a content image.
  • the above-mentioned second image with the target style may be a style image.
  • An image obtained by performing style transfer on the content image can be a composite image.
  • When the electronic device performs style transfer on the content image, it preserves the high-level semantic information of the content image and replaces the style of the content image with the style of the style image.
  • the content image is the above-mentioned beach image.
  • the current style of the content image is realistic style.
  • the style of the style image is cartoon style.
  • the electronic device performs style transfer on the above-mentioned beach image, and can convert a realistic-style beach image into a cartoon-style beach image. Sand, blue sky, and sea water are still present in cartoon-style beach images after style transfer.
  • the low-level semantic information such as color and texture of the cartoon-style beach image has changed.
  • a style transfer model can be used to take in content images and generate synthetic images.
  • the synthetic image may have high-level semantic information of the content image and a style corresponding to the style transfer model.
  • a style transfer model can correspond to a style.
  • a cartoon style transfer model may correspond to cartoon style.
  • the cartoon style transfer model can replace the style of the received content image with the cartoon style.
  • the style transfer model may be a neural network model.
  • the style transfer model can be obtained using a large amount of training data.
  • Each piece of training data may consist of a training sample and a training result corresponding to the training sample.
  • the above-mentioned training samples may include a content image (the aforementioned first image) and a style image (the aforementioned second image).
  • the training result corresponding to the above training sample may be a synthetic image (the aforementioned third image).
  • the fusion style transfer model is a style transfer model obtained by fusing two or more style transfer models.
  • the weights of parameters in multiple style transfer models for fusion can be different.
  • the electronic device can change the style corresponding to the obtained fused style transfer model by changing the weights of the parameters in the multiple style transfer models to be fused.
  • the electronic device may fuse a daytime style transfer model and a nighttime style transfer model.
  • By changing the weights of the parameters in the daytime style transfer model and the weights of the parameters in the nighttime style transfer model, the electronic device can obtain a fusion style transfer model whose style lies between the daytime style and the nighttime style (for example, the dusk style).
  • the electronic device may use the fusion style transfer model obtained by fusion to perform style transfer on the multiple frames of images in the third video above.
  • the weights of parameters in the daytime style transfer model and the weights of parameters in the nighttime style transfer model can be changed. Therefore, the fourth video processed by the fusion style transfer model can present the process of the first scene changing rapidly from day to night. That is to say, a video shot by a user in a short period of time can also have a time-lapse effect of shooting a video for a long time.
  • a neural network can be composed of neural units. A neural unit can be an operation unit that takes $x_s$ and an intercept of 1 as inputs, and the output of the operation unit can be given by the following formula (1):
  • $$\text{output} = f\left(\sum_{s} W_s x_s + b\right) \quad (1)$$
  • $W_s$ is the weight of $x_s$.
  • $b$ is the bias of the neural unit.
  • f is an activation function of the neural unit, which is used to introduce nonlinear characteristics into the neural network to convert the input signal in the neural unit into an output signal. The output signal of this activation function can be used as the input of the next convolutional layer.
  • the activation function can be a sigmoid function.
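  • As a minimal sketch of the neural unit described above, the following Python snippet computes f(Σ W_s·x_s + b) with a sigmoid activation. The function names and the example values are illustrative assumptions and do not come from this application.

```python
import math

def sigmoid(z):
    """Logistic sigmoid activation: maps any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neural_unit(x, w, b, f=sigmoid):
    """Output of one neural unit: f(sum_s w[s] * x[s] + b)."""
    if len(x) != len(w):
        raise ValueError("x and w must have the same length")
    z = sum(w_s * x_s for w_s, x_s in zip(w, x)) + b
    return f(z)

# Hypothetical example values: three inputs, three weights, one bias.
# The weighted sum is 0.5 - 0.5 + 0.0 + 0.0 = 0, so the unit outputs sigmoid(0).
output = neural_unit(x=[1.0, 2.0, 3.0], w=[0.5, -0.25, 0.0], b=0.0)
```

  • In a multi-layer network, the value returned by `neural_unit` would in turn serve as one of the inputs `x` of units in the next layer.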
  • a neural network is a network formed by connecting many of the above single neural units together, that is, the output of one neural unit can be the input of another neural unit.
  • the input of each neural unit can be connected with the local receptive field of the previous layer to extract the features of the local receptive field, and the local receptive field can be an area composed of several neural units.
  • a convolutional neural network is a neural network with a convolutional structure.
  • a convolutional neural network contains a feature extractor composed of convolutional layers and subsampling layers.
  • the feature extractor can be viewed as a filter, and the convolution process can be viewed as convolving an input image or a convolutional feature map with a trainable filter.
  • the convolutional layer refers to the neuron layer in the convolutional neural network that convolves the input signal.
  • In a convolutional layer, a neuron may be connected to only some of the neurons in the adjacent layer.
  • a convolutional layer usually contains several feature planes, and each feature plane can be composed of some neural units arranged in a rectangle.
  • Neural units in the same feature plane share weights, and the shared weights here are convolution kernels.
  • Weight sharing can be understood as meaning that the way image information is extracted is independent of location. The underlying principle is that the statistics of one part of the image are the same as those of the other parts. This means that image information learned in one part can also be used in another part. So for all positions on the image, the same learned image information can be used.
  • multiple convolution kernels can be used to extract different image information. Generally, the more convolution kernels, the richer the image information reflected by the convolution operation.
  • the convolution kernel can be initialized in the form of a matrix of random size, and the convolution kernel can obtain reasonable weights by learning during the training process of the convolutional neural network.
  • the immediate benefit of sharing weights is to reduce the connections between the layers of the convolutional neural network, while reducing the risk of overfitting.
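  • The weight sharing described above can be sketched as follows: a single small kernel is slid over the whole image, so the same weights are reused at every position. This is a simplified illustration in plain Python (no padding, stride 1, one channel, and no kernel flipping, as is common in convolutional neural networks); the image and kernel values are hypothetical.

```python
def conv2d_valid(image, kernel):
    """Slide one shared kernel over a 2-D image (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # The same kernel weights are reused at every position (r, c).
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        feature_map.append(row)
    return feature_map

# Hypothetical 3x4 image with a vertical edge between columns 1 and 2.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
edge_kernel = [[-1, 1]]  # responds where intensity rises from left to right
feature_map = conv2d_valid(image, edge_kernel)
```

  • Here the two kernel weights produce the whole feature map, and the response appears only at the column where the edge lies, regardless of the row.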
  • the convolutional neural network can use the error back propagation (BP) algorithm to correct the values of the parameters in the initial super-resolution model during the training process, so that the reconstruction error loss of the super-resolution model becomes smaller and smaller. Specifically, propagating the input signal forward to the output produces an error loss, and the parameters in the initial super-resolution model are updated by back-propagating the error loss information, so that the error loss converges.
  • the back-propagation algorithm is a back-propagation motion dominated by the error loss, aiming to obtain the parameters of the optimal super-resolution model, such as the weight matrix.
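  • The forward pass and error back propagation described above can be illustrated on the smallest possible case, a single sigmoid unit trained with a squared-error loss. This is only a sketch under assumed values (learning rate, target, and inputs are hypothetical); a real model would apply the same chain-rule update to every layer.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_step(w, b, x, target, lr):
    """One forward pass and one back-propagation update for a single unit."""
    # Forward pass: compute the output and the squared-error loss.
    z = sum(w_s * x_s for w_s, x_s in zip(w, x)) + b
    y = sigmoid(z)
    loss = 0.5 * (y - target) ** 2
    # Backward pass: chain rule, dLoss/dz = (y - target) * y * (1 - y).
    dz = (y - target) * y * (1.0 - y)
    new_w = [w_s - lr * dz * x_s for w_s, x_s in zip(w, x)]
    new_b = b - lr * dz
    return new_w, new_b, loss

# Hypothetical training setup: start from zero parameters and fit one sample.
w, b = [0.0, 0.0], 0.0
losses = []
for _ in range(200):
    w, b, loss = train_step(w, b, x=[1.0, 2.0], target=0.9, lr=0.5)
    losses.append(loss)
```

  • The recorded `losses` shrink over the iterations, which corresponds to the error loss converging as the parameters are updated.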
  • the electronic device 100 may include a camera 193 .
  • the camera 193 may be a front camera.
  • Camera 193 may also include a rear camera.
  • the electronic device 100 may display the user interface 210 shown in FIG. 1A .
  • the user interface 210 may include an application icon display area 211 and a tray 212 with icons of frequently used applications, wherein:
  • the application icon display area 211 may include a gallery icon 211A.
  • the electronic device 100 may open the gallery application, thereby displaying information such as pictures and videos stored in the electronic device 100 .
  • the pictures and videos stored in the electronic device 100 include photos and videos shot by the electronic device 100 through a camera application.
  • the application icon display area 211 may further include more application program icons, such as mail icons, music icons, sports and health icons, and the like, which are not limited in this embodiment of the present application.
  • a tray 212 with frequently used application icons may display a camera icon 212A.
  • the electronic device 100 can open the camera application, so as to perform functions such as photographing and video recording.
  • the camera 193 (front camera and/or rear camera)
  • the tray 212 with icons of frequently used applications may also display icons of more applications, such as dial icons, information icons, contact icons, etc., which are not limited in this embodiment of the present application.
  • User interface 210 may also contain more or less content, such as controls to display the current time and date, controls to display the weather, and the like. It can be understood that FIG. 1A only exemplarily shows a user interface on the electronic device 100 , and should not constitute a limitation to the embodiments of the present application.
  • the electronic device 100 may display a user interface 220 as shown in FIG. 1B .
  • the user interface 220 may include a preview area 221, a flash control 222, a settings control 223, a camera mode option 201, a gallery shortcut control 202, a shutter control 203, and a camera flip control 204, wherein:
  • the preview area 221 may be used to display images captured by the camera 193 in real time.
  • the electronic device can refresh the displayed content in real time, so that the user can preview the image currently captured by the camera 193 .
  • Flash control 222 may be used to turn the flash on or off.
  • the setting control 223 can be used to adjust parameters for taking pictures (such as resolution, filters, etc.) and turn on or off some methods for taking pictures (such as timed photo, smile capture, voice-activated photo, etc.).
  • the setting control 223 may be used to set more other shooting functions, which are not limited in this embodiment of the present application.
  • the one or more shooting mode options may be displayed in the camera mode options 201 .
  • the one or more shooting mode options may include: large aperture mode option 201A, video mode option 201B, photo mode option 201C, time-lapse photography mode option 201D, and portrait mode option 201E.
  • the one or more shooting mode options can be represented as text information on the interface, such as "large aperture”, “video recording”, “photography”, “time-lapse photography", and “portrait”.
  • the one or more shooting mode options may also be represented as icons or other forms of interactive elements (IEs) on the interface.
  • the electronic device 100 can start the shooting mode selected by the user.
  • the camera mode options 201 may also include more or less shooting mode options. The user can browse other shooting mode options by swiping left/right in the camera mode options 201 .
  • Gallery shortcut control 202 may be used to launch the Gallery application.
  • the electronic device 100 can open the gallery application.
  • the gallery application is a picture management application on an electronic device such as a smart phone and a tablet computer, and may also be called an "album", and the name of the application is not limited in this embodiment.
  • the gallery application can support the user to perform various operations on the pictures stored on the electronic device 100, such as browsing, editing, deleting, selecting and other operations.
  • the shutter control 203 can be used to monitor the user operation that triggers the photographing.
  • the electronic device 100 may detect a user operation acting on the shutter control 203, and in response to the operation, the electronic device 100 may save the image in the preview area 221 as a picture in the gallery application.
  • the electronic device 100 may also display a thumbnail of the saved image in the gallery shortcut control 202. That is, the user can tap the shutter control 203 to trigger taking a photo.
  • the shutter control 203 may be a button or other forms of control.
  • the camera flip control 204 can be used to listen for a user operation that triggers flipping of the camera.
  • the electronic device 100 can detect a user operation, such as a tap operation, acting on the camera flip control 204, and in response to the operation, the electronic device 100 can flip the camera used for shooting, for example, switching from the rear camera to the front camera, or from the front camera to the rear camera.
  • the user interface 220 may further include more or less content, which is not limited in this embodiment of the present application.
  • FIGS. 1C to 1F exemplarily show a user interface of the electronic device 100 for time-lapse photography.
  • the electronic device 100 may display a user interface 230 as shown in FIG. 1C .
  • User interface 230 contains substantially the same basic controls as user interface 220. Additionally, style options 231 may be included in the user interface 230.
  • One or more style options may be included in style options 231 .
  • For example, the style options 231 may include a day and night transition style option 231A, a four seasons transition style option 231B, and a sunny and rainy transition style option 231C.
  • the one or more style options may appear as textual information on the interface, for example, “day and night”, “four seasons”, and “sunny and rainy”. Not limited to this, the one or more style options may also be represented as icons or other interactive elements on the interface.
  • Each style option can be used to instruct the electronic device 100 to perform style transfer on the captured time-lapse video, and convert the style of the video to the style corresponding to the style option.
  • the style transfer model corresponding to the above-mentioned day and night transition style option 231A may be a fusion style transfer model obtained by fusing the daytime style transfer model and the nighttime style transfer model.
  • the electronic device 100 can use the fusion style transfer model to perform style transfer on multiple frames of images in the video, so as to obtain a video that can present a process in which the captured content changes rapidly from day to night.
  • the style transfer model corresponding to the above-mentioned four seasons transition style option 231B may be a fusion style transfer model obtained by fusing the spring style transfer model, the summer style transfer model, the autumn style transfer model, and the winter style transfer model.
  • the electronic device 100 can use the fusion style transfer model to perform style transfer on multiple frames of images in the video, and obtain a video that can present a process of rapid gradation of the captured content from spring to summer, from summer to autumn, and then from autumn to winter.
  • the style transfer model corresponding to the sunny and rainy transition style option 231C may be a fusion style transfer model obtained by fusing the sunny weather style transfer model and the rainy weather style transfer model.
  • the electronic device 100 can use the fusion style transfer model to perform style transfer on multiple frames of images in the video, so as to obtain a video that can present a process in which the captured content changes rapidly from sunny days to rainy days.
  • style options 231 may also contain more or fewer style options. Not limited to the options corresponding to the fusion style transfer models shown in FIG. 1C, the style options 231 may also include an option corresponding to a single style transfer model (eg, a cartoon style transfer model).
  • the embodiment of the present application does not limit the specific way of changing the style of the video obtained by performing style transfer using the fusion style transfer model.
  • the electronic device 100 may also perform style transfer processing on the video, so that the video shows a process of rapidly changing the captured content from night to day during playback.
  • the day and night transition style option 231A, the four seasons transition style option 231B, and the sunny and rainy transition style option 231C are all unselected.
  • the color of each of the above style options is white.
  • the electronic device 100 may display a user interface 230 as shown in FIG. 1D .
  • the day and night transition style option 231A may appear to be selected.
  • the color of the day and night transition style option 231A may be changed to gray.
  • the embodiments of the present application do not limit the manner in which the style option presents an unselected state and a selected state.
  • the electronic device 100 may display a user interface 230 as shown in FIG. 1E.
  • user interface 230 may include time selection box 232 .
  • the time selection box 232 may be used by the user to select the length of time for the generated time-lapse video.
  • the time selection box 232 may include a prompt 232A, a time option 232B, a confirmation control 232C, and a cancel control 232D, wherein:
  • the prompt 232A may be used to prompt the user to select the time length of the generated time-lapse video.
  • the prompt 232A may include a text prompt "Please determine the duration of the video obtained by time-lapse photography".
  • Time option 232B may be used by the user to select the length of time for the time-lapse video, for example, 10 seconds.
  • Confirmation control 232C may be used to instruct electronic device 100 to begin time-lapse photography.
  • the electronic device 100 may store the length of time indicated by the time option 232B. Further, the electronic device 100 may process the captured video to obtain a time-lapse video whose time length is the time length indicated by the time option 232B (ie, the time length selected by the user).
  • the cancel control 232D may be used by the user to cancel the selection of the length of time for the time-lapse video.
  • the electronic device 100 may display the user interface 230 as shown in FIG. 1D .
  • the embodiment of the present application does not limit the specific expression form of the above-mentioned time selection box.
  • the electronic device 100 may acquire the time length of the time-lapse video that the user wants to generate. After the original video is obtained by shooting, the electronic device can extract frames from the original video, so that the time length of the generated time-lapse video is the time length desired by the user. In some embodiments, the electronic device may determine the rate at which images are captured or the frame rate of the original video obtained by shooting by providing a delay rate. In that approach, however, the maximum delay rate that the electronic device can provide is often limited. When the shooting time is too long, the time-lapse video generated by the electronic device is still long, and the time-lapse effect presented by the time-lapse video is not obvious.
  • the frame extraction rate determined during time-lapse photography in some embodiments of the present application may be determined according to the actual shooting time length and the time length indicated by the above time option 232B. That is, the user can customize the frame extraction rate applied to the original video by selecting the time length of the time-lapse video to be generated in the above-mentioned time option 232B.
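  • One possible way to derive the frame extraction from the actual shooting time length and the time length selected by the user is sketched below. It assumes, purely for illustration, that the time-lapse video is played back at the same frame rate as the original video and that frames are kept at evenly spaced positions; the function name and the 30 fps value are hypothetical and not taken from this application.

```python
def select_frame_indices(num_frames, shot_seconds, target_seconds):
    """Pick evenly spaced frame indices so that playback lasts about
    target_seconds at the original frame rate."""
    if shot_seconds <= 0 or target_seconds <= 0:
        raise ValueError("durations must be positive")
    if target_seconds >= shot_seconds:
        return list(range(num_frames))  # nothing to drop
    step = shot_seconds / target_seconds  # keep one frame out of every `step`
    kept = int(num_frames * target_seconds / shot_seconds)
    return [int(k * step) for k in range(kept)]

# 1 minute of footage condensed to 10 seconds.
# The 30 fps frame rate is an assumed value, not taken from the source.
fps = 30
indices = select_frame_indices(num_frames=60 * fps, shot_seconds=60, target_seconds=10)
```

  • With these values, one frame out of every six is kept, so 1800 captured frames become 300 frames, which play for 10 seconds at the assumed 30 fps.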
  • the electronic device 100 may display a time-lapse photography interface 240 as shown in FIG. 1F .
  • the time-lapse photography interface 240 may include a preview area 221, a capture time indicator 241, and a stop capture control 205, wherein:
  • the preview area 221 may display the images captured by the electronic device 100 through the camera during time-lapse photography.
  • the electronic device 100 may store a series of images successively displayed in the preview area 221 during the time-lapse photography (ie, the time period from the start of the time-lapse photography to the end of the time-lapse photography) as the original video .
  • the electronic device 100 can perform anti-shake processing, frame extraction processing, and style transfer processing on the original video to obtain a time-lapse video.
  • the shooting time indicator 241 may be used to indicate the length of time the electronic device 100 has shot for time-lapse photography. As shown in FIG. 1F , the inclusion of “00:01:00” in the shooting time indicator 241 may indicate that the electronic device 100 has taken time-lapse photography for 1 minute.
  • Stop capture control 205 can be used to end time lapse photography. As shown in FIG. 1F , when the time length of shooting is 1 minute, in response to a user operation acting on the stop shooting control 205 , the electronic device 100 may end the time-lapse shooting. Wherein, the electronic device 100 can obtain an original video with a time length of 1 minute.
  • the electronic device 100 can perform anti-shake processing, frame extraction processing and style transfer on the original video according to the style selected by the user and the time length for which the time-lapse video is to be generated to obtain the time-lapse video.
  • FIGS. 2A to 2C exemplarily show the user interface of the electronic device 100 for playing the time-lapse video obtained during the shooting process shown in FIGS. 1C to 1F .
  • the electronic device 100 may display the video playing interface 250 as shown in FIG. 2A .
  • the video playback interface 250 may include a time control 251, an image display area 252, a pause control 253, a progress bar 254, an elapsed video playback time 255, and a total video duration 256, wherein:
  • the time control 251 may indicate the time at which the electronic device 100 stored the time-lapse video, for example, 8:00 am on November 9, 2020.
  • the image display area 252 can be used to display frame by frame images included in the time-lapse video.
  • Pause control 253 can be used to pause playback of the time-lapse video.
  • the progress bar 254 can be used to compare the played time of the video with the total duration of the video to indicate the progress of the video playing.
  • Video elapsed time 255 may be used to indicate how long the video has played.
  • Total video duration 256 may be used to indicate the total duration of the time-lapse video. It can be seen from FIG. 2A that the total duration of the time-lapse video is 10 seconds. That is, the electronic device 100 shoots a video with a time length of 1 minute and processes it into a time-lapse video with a time length of 10 seconds according to the time length indicated by the time option 232B shown in FIG. 1E . Then, the electronic device 100 can present the content shot in 1 minute in 10 seconds.
  • the electronic device 100 may use the corresponding style transfer model to perform style transfer on the multiple frames of images in the video respectively.
  • the electronic device 100 may perform style transfer on the video using the fused style transfer model incorporating the daytime style transfer model and the nighttime style transfer model.
  • the time-lapse video obtained through the above style transfer can show the process of rapid gradation of the captured content from day to night.
  • the images included in the time-lapse video retain high-level semantic information (eg, trees, rivers) of the images included in the original video.
  • from the first frame to the last frame, the style of the images included in the time-lapse video gradually changes from the daytime style to the nighttime style.
  • the style of the image is the daytime style. That is, the image of the time-lapse video when the playback time is the second second can present the scene of the captured content (such as trees and rivers) in the daytime.
  • the style of the image is between the style of daytime and the style of nighttime (eg, dusk style).
  • the image of the time-lapse video when the playback time is 4 seconds can present the scene of the captured content (such as trees and rivers) at dusk.
  • the style of the image is night style. That is, the image of the time-lapse video when the playback time is the second second can present the scene of the captured content (such as trees and rivers) in the dark night.
  • FIGS. 2A to 2C are only exemplary descriptions for gradually changing the style of an image from a daytime style to a nighttime style, and do not limit the specific presentation content of the images corresponding to the style.
  • a video shot by a user in a short period of time may have a time-lapse effect of shooting a video for a long time.
  • the above-mentioned time-lapse video, shot by the user within 1 minute, may present a rapid change from day to night, an effect that would otherwise take 12 hours or even longer to shoot.
  • the electronic device can perform anti-shake processing on the captured video. In this way, the user can hold the electronic device by hand when shooting a time-lapse video, without needing a fixing device to hold the electronic device in one place.
  • the above-mentioned image processing method enables time-lapse photography to break through the limitations of shooting scenes, equipment and time, and improves the convenience and fun of time-lapse photography for users.
  • the jitter in the captured video is generally due to changes in the pose of the electronic device used for capturing during the capturing process.
  • the electronic device can process each frame of image in the video by calculating the change of its own posture during the shooting process to eliminate the jitter.
  • the electronic device 100 may use a motion sensor (eg, a gyroscope sensor, an acceleration sensor) to calculate the change of its own pose during the shooting process.
  • the electronic device 100 may determine the original motion path during the shooting process according to the change of its own posture. Further, the electronic device 100 may perform smoothing processing on the original motion path (ie, eliminate the jittered part on the motion path), so as to obtain the change of the pose of the electronic device 100 in a stable shooting state.
  • the electronic device 100 can perform image registration on this frame of image, thereby obtaining the coordinates corresponding to each pixel of this frame of image in the steady shooting state.
  • the electronic device 100 can connect the frames of images that have undergone image registration in series in the order of acquisition time, so as to obtain a more stable video.
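  • The path smoothing step described above can be sketched in one dimension as follows. This is a deliberately simplified illustration: the real pose has several degrees of freedom and the registration warps whole frames, whereas here the "pose" is a single horizontal position per frame and the smoothing is an assumed moving average. All names and values are hypothetical.

```python
def smooth_path(path, radius):
    """Moving-average smoothing of a 1-D motion path (window clipped at ends)."""
    smoothed = []
    for t in range(len(path)):
        lo = max(0, t - radius)
        hi = min(len(path), t + radius + 1)
        window = path[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed

def stabilizing_offsets(path, radius):
    """Per-frame correction: how far to shift each frame so that it follows
    the smoothed path instead of the original, jittery one."""
    smoothed = smooth_path(path, radius)
    return [s - p for s, p in zip(smoothed, path)]

# Hypothetical horizontal camera position per frame, in pixels: a slow pan
# (0, 1, 2, ...) with alternating +/-2 pixel jitter added on top.
original = [t + (2 if t % 2 else -2) for t in range(10)]
offsets = stabilizing_offsets(original, radius=2)
stabilized = [p + o for p, o in zip(original, offsets)]
```

  • Applying each offset to its frame, and then concatenating the frames in acquisition order, corresponds to the registration and series connection steps described in the text.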
  • the electronic device 100 may also reduce or eliminate the jitter of the captured video through the optical image stabilization method.
  • the lens group of the camera of the electronic device 100 includes a magnetic levitation lens.
  • the electronic device 100 may detect shaking using the motion sensor. According to the measurement value of the motion sensor, the electronic device 100 can control the magnetic levitation lens to compensate the optical path and avoid the optical path from shaking. In this way, the electronic device 100 can reduce or eliminate the shake of the captured video.
  • the electronic device 100 may further perform anti-shake processing in combination with the aforementioned electronic anti-shake and optical anti-shake methods.
  • the embodiments of the present application do not limit the method for performing anti-shake processing on the electronic device 100.
  • For the anti-shake processing method, reference may also be made to other video anti-shake methods in the prior art.
  • FIG. 3 exemplarily shows a flow chart of the electronic device 100 fusing M style transfer models.
  • M is a positive integer greater than or equal to 2. Wherein:
  • the first style transfer model, the second style transfer model, ..., the Mth style transfer model are all style transfer models that have been trained, and all correspond to a single style.
  • the style corresponding to the first style transfer model is the night style.
  • the first style transfer model can transform the style of the input image into a night style.
  • the M style transfer models may specifically be neural network models, such as convolutional neural network models. Moreover, the network structure of these M style transfer models is the same.
  • the electronic device 100 may fuse the M style transfer models into a fused style transfer model with a specific style by means of interpolation and fusion. Specifically, the electronic device 100 may perform interpolation and fusion on the parameters of the M style transfer models at the same position, and use the parameters obtained after the interpolation and fusion as the parameters of the fusion style transfer model at this position.
  • the method for the electronic device 100 to interpolate and fuse the parameters at the same position of the M style transfer models may refer to the following formula (2):
  • $$\theta_{interp} = \alpha_1\theta_1 + \alpha_2\theta_2 + \ldots + \alpha_i\theta_i + \ldots + \alpha_M\theta_M \quad (2)$$
  • $\theta_i$ can represent the parameter of the i-th style transfer model at the first position.
  • the above-mentioned first position may be any position in the i-th style transfer model.
  • the parameter can be, for example, the bias $b$ of a certain neural unit in the i-th style transfer model, and the weight $W_s$ of each neural unit in the layer preceding this neural unit.
  • $\alpha_i$ can represent the fusion weight of the i-th style transfer model.
  • the value of $\alpha_i$ is not otherwise limited in this embodiment of the present application.
  • $\theta_{interp}$ can represent the parameter obtained after interpolation and fusion, that is, the parameter of the fusion style transfer model at the first position.
  • the electronic device 100 can determine the values of the parameters of the fusion style transfer model at each position, so as to obtain a fusion style transfer model that fuses the above M style transfer models.
  • the network structure of the fusion style transfer model is the same as the network structure of the M style transfer models.
  • the first position of the i-th style transfer model and the first position of the above-mentioned fusion style transfer model are the same position in the same network structure.
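  • The position-by-position interpolation of formula (2) can be sketched as follows. Each model is represented, as an assumption for illustration only, by a dictionary that maps a parameter position to a single number; in a real network each position would hold a tensor of weights or biases. The position names and values are hypothetical.

```python
def fuse_models(models, weights):
    """Interpolate same-structured models position by position (formula (2)).

    models  -- list of M dicts mapping a parameter position to its value
    weights -- list of M fusion weights, one per model
    """
    if len(models) != len(weights) or len(models) < 2:
        raise ValueError("need M >= 2 models and one weight per model")
    positions = models[0].keys()
    if any(m.keys() != positions for m in models):
        raise ValueError("all models must share the same network structure")
    return {
        pos: sum(a * m[pos] for a, m in zip(weights, models))
        for pos in positions
    }

# Hypothetical two-parameter "models"; real models hold many tensors.
day = {"conv1.weight": 1.0, "conv1.bias": 0.0}
night = {"conv1.weight": -1.0, "conv1.bias": 2.0}
dusk = fuse_models([day, night], weights=[0.5, 0.5])
```

  • The check on the key sets reflects the requirement that the M style transfer models and the fusion style transfer model share the same network structure, so that "the same position" is well defined.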
  • The following description takes the electronic device 100 fusing the daytime style transfer model and the nighttime style transfer model as an example.
  • the calculation formula of the parameters of the fusion style transfer model obtained by fusion in the first position can refer to the following formula (3):
  • $$\theta_{interp} = \alpha_{day}\theta_{day} + \alpha_{night}\theta_{night} \quad (3)$$
  • the style corresponding to the fusion style transfer model is a style between the day style and the night style, and is determined by the values of $\alpha_{day}$ and $\alpha_{night}$.
  • the larger the value of $\alpha_{day}$, the closer the style corresponding to the fusion style transfer model is to the daytime style.
  • the larger the value of $\alpha_{night}$, the closer the style corresponding to the fusion style transfer model is to the night style.
  • a video that presents the time-lapse effect of a rapid gradient from day to night can be obtained by changing the values of $\alpha_{day}$ and $\alpha_{night}$.
  • the value of $\alpha_{day}$ can gradually decrease, and the value of $\alpha_{night}$ can gradually increase.
  • a video contains n frames of images.
  • n is an integer greater than 1.
  • for the fusion style transfer model that performs style transfer on the j-th frame (1 ≤ j ≤ n), the calculation formula of the parameter of the first position may refer to the following formula (4):
  • $$\theta_{interp}^{(j)} = \frac{n-j+1}{n}\theta_{day} + \frac{j-1}{n}\theta_{night} \quad (4)$$
  • the first position is any position in the fusion style transfer model.
  • the weights of the style transfer models being fused can be adjusted, so that the video shows the effect of gradually changing from one style to another.
  • the electronic device 100 may determine which style transfer models to fuse according to the style option selected in the foregoing embodiments.
  • the style options are day and night transition style options.
  • the electronic device 100 may determine that the fused style transfer models include a daytime style transfer model and a nighttime style transfer model.
  • the style of the images contained in the video can gradually change from a daytime style to a nighttime style. That is, the above-mentioned style-transferred video can present the time change of the scene in the video from day to night during the playback process.
  • FIG. 4 exemplarily shows an implementation manner of the electronic device 100 performing style transfer on a video by using a fusion style transfer model incorporating a daytime style transfer model and a nighttime style transfer model.
  • the electronic device 100 may perform interpolation and fusion on the daytime style transfer model and the nighttime style transfer model according to the above formula (4).
  • the video contains n frames of images.
  • the model for performing style transfer on the jth frame of the n frames of images may be a fusion style transfer model j.
  • the parameters of the fusion style transfer model j at the first position may be $\frac{n-j+1}{n}\theta_{day} + \frac{j-1}{n}\theta_{night}$.
  • the electronic device 100 may calculate the parameters of the fusion style transfer model j at all positions, so as to obtain the fusion style transfer model j.
  • the electronic device 100 may use the fusion style transfer model 1 to the fusion style transfer model n to perform style transfer on the first frame image to the nth frame image, respectively.
  • the styles corresponding to the fusion style transfer model 1 to the fusion style transfer model n are respectively style 1 to style n.
  • the above-mentioned fusion style transfer model 1 may be a daytime style transfer model.
  • Style 1 can be a day style.
  • the above-mentioned fusion style transfer model n may be a night style transfer model.
  • Style n can be night style.
  • the styles of the first frame image to the nth frame image are style 1 to style n, respectively.
  • the style of the first frame image to the nth frame image gradually changes from a day style to a night style.
  • the style-transferred video can show a rapid transition from day to night during playback.
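  • As a minimal sketch of the per-frame interpolation described above, the following Python applies the weights (n-j+1)/n and (j-1)/n to two sets of model parameters. The names `fusion_weights` and `fuse_models`, and the representation of a model as a dictionary of flat parameter lists, are illustrative assumptions and not part of the application. With the weights exactly as written, model 1 coincides with the daytime model, while model n carries a nighttime weight of (n-1)/n, which approaches 1 as n grows.

```python
def fusion_weights(j, n):
    """Return (day_weight, night_weight) for frame j (1-based) of n frames."""
    if not 1 <= j <= n:
        raise ValueError("j must satisfy 1 <= j <= n")
    return (n - j + 1) / n, (j - 1) / n


def fuse_models(day_params, night_params, j, n):
    """Interpolate two models position by position.

    day_params / night_params: dict mapping a (hypothetical) layer name
    to a flat list of floats; both dicts must share keys and lengths.
    """
    w_day, w_night = fusion_weights(j, n)
    fused = {}
    for name, day_values in day_params.items():
        night_values = night_params[name]
        if len(day_values) != len(night_values):
            raise ValueError(f"shape mismatch at {name}")
        fused[name] = [w_day * d + w_night * k
                       for d, k in zip(day_values, night_values)]
    return fused


# Usage: build the fused model for frame 3 of a 4-frame video.
day = {"conv1": [1.0, 2.0]}
night = {"conv1": [3.0, 6.0]}
model_3 = fuse_models(day, night, j=3, n=4)
```

  • In a real framework the same interpolation would be applied to every tensor of the two networks, which must share one architecture so that each position has a counterpart.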
  • the number of fused style transfer models obtained by the electronic device 100 using the M style transfer models may be less than the number of frames of images in a video that needs to perform style transfer.
  • a fusion style transfer model can perform style transfer on a frame of images or consecutive multiple frames of images in a video that needs to be style-transferred.
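  • When fewer fused models than frames are available, one possible assignment is to let each fused model serve a run of consecutive frames. The sketch below assumes an even partition of the frames; the function name `model_index_for_frame` and the partition rule are assumptions made for illustration, since the application does not fix how frames are grouped.

```python
def model_index_for_frame(frame, n_frames, k_models):
    """Map a 1-based frame number to a 1-based fused-model index.

    Consecutive frames share a model; the frames are split into
    k_models runs of (nearly) equal length.
    """
    if not 1 <= k_models <= n_frames:
        raise ValueError("need 1 <= k_models <= n_frames")
    if not 1 <= frame <= n_frames:
        raise ValueError("frame out of range")
    return (frame - 1) * k_models // n_frames + 1


# Usage: 6 frames served by 3 fused models.
assignment = [model_index_for_frame(f, 6, 3) for f in range(1, 7)]
```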
  • the device for training the style transfer model may be a training device.
  • the style transfer model obtained by training is a style transfer model corresponding to a style.
  • the following takes the training of a dark night style transfer model, that is, the style transfer model corresponding to the night style, as an example.
  • the training set used for training the dark night style transfer model may include the content images that need to be style-transferred and the dark night style images.
  • the training device can input the content image into the dark night style transfer model to be trained to obtain a synthetic image.
  • the training device can calculate the loss of the loss function. This loss can represent the gap between the style of the synthetic image and the style of the above-mentioned dark night style image, as well as the gap in high-level semantic information between the content of the synthetic image and the content image input to the dark night style transfer model.
  • the training device can adjust the parameters in the dark night style transfer model to be trained through the back-propagation algorithm.
  • the training device adjusts the parameters in the dark night style transfer model to be trained in the direction that lowers the value of the loss, that is, the direction that narrows both the gap between the style of the synthetic image and the style of the above-mentioned dark night style image and the gap in high-level semantic information between the content of the synthetic image and the content image input to the dark night style transfer model.
  • the training device can obtain the trained dark night style transfer model.
  • the training equipment can consider the connection between consecutive multi-frame content images when training the style transfer model. Specifically, when a training device trains a style transfer model, a multi-frame temporal loss can be introduced into the loss function.
  • FIG. 5 exemplarily shows a flowchart of another method for training a style transfer model. This method is particularly suitable for training style transfer models for video style transfer. As shown in FIG. 5, the training method may include steps S101-S104. The training set used to train the style transfer model in the training method may contain videos that need to be style-transferred. Specifically:
  • S101 The training device inputs the content image of the rth frame of the video into the style transfer model to be trained, and calculates the loss loss_cur corresponding to the content image of the rth frame.
  • the training device may sequentially use the first frame of content images to the last frame of content images in the video to train the style transfer model.
  • for the method of calculating the loss corresponding to the content image of the rth frame, reference may be made to the description in the foregoing implementation manner.
  • S102 The training device obtains the h frames of content images before the rth frame of content images and inputs them into the style transfer model to be trained to obtain h frames of synthetic images.
  • Both r and h above are positive integers, and h is less than r.
  • S103 The training device calculates the difference between the synthetic image obtained by inputting the content image of the rth frame into the style transfer model to be trained and the h frames of synthetic images, and obtains the multi-frame temporal loss L ct .
  • the training device can refer to the following formula (5) to calculate the multi-frame temporal loss L ct .
  • N can represent the style transfer model to be trained.
  • f cur can represent the content image of the current frame used for training the style transfer model, that is, the content image of the rth frame of the video.
  • N(f cur ) can represent the synthetic image obtained by inputting the content image of the rth frame into the style transfer model to be trained.
  • f pre_i may represent the i-th frame of content images preceding the r-th frame of content images.
  • N(f pre_i ) may represent the synthetic image obtained by inputting the content image of the i-th frame before the content image of the r-th frame into the style transfer model to be trained.
  • S104 The training device adjusts the parameters of the style transfer model to be trained by using a back-propagation algorithm.
  • the training device can use the sum of the above loss_cur and the multi-frame temporal loss L ct as the loss corresponding to inputting the rth frame of the content image into the style transfer model to be trained, and then use the back-propagation algorithm to adjust the parameters of the style transfer model to be trained.
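  • The steps above can be sketched as follows. Formula (5) is not reproduced in this text, so the sketch assumes one plausible measure, the mean squared difference between the current synthetic image and each of the h preceding synthetic images, averaged over h. The functions `mean_squared_diff`, `temporal_loss` and `total_loss`, and the representation of an image as a flat list of floats, are illustrative assumptions and not the formula of the application.

```python
def mean_squared_diff(a, b):
    """Mean squared difference between two equally sized flat images."""
    if len(a) != len(b):
        raise ValueError("images must have the same size")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def temporal_loss(model, frames, r, h):
    """Assumed multi-frame temporal loss for frame r (1-based).

    model:  callable N mapping a content image to a synthetic image
    frames: list of content images of one video
    h:      number of preceding frames to compare against, with h < r
    """
    if not 1 <= h < r <= len(frames):
        raise ValueError("need 1 <= h < r <= number of frames")
    out_cur = model(frames[r - 1])                      # N(f_cur)
    total = 0.0
    for i in range(1, h + 1):
        out_prev = model(frames[r - 1 - i])             # N(f_pre_i)
        total += mean_squared_diff(out_cur, out_prev)
    return total / h


def total_loss(loss_cur, l_ct):
    """Loss used for back-propagation: loss_cur plus the temporal term."""
    return loss_cur + l_ct


# Usage with an identity stand-in for the model being trained.
video = [[0.0, 0.0], [1.0, 1.0], [3.0, 3.0]]
l_ct = temporal_loss(lambda f: f, video, r=3, h=2)
```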
  • the training device can use multiple videos to train the style transfer model.
  • the training device can obtain the trained style transfer model.
  • the trained style transfer model can improve the consistency of the stylization effect of continuous multi-frame content images in the video when performing style transfer on multiple frames of images in the video, and reduce the flickering phenomenon during the video playback process.
  • the method for calculating the above-mentioned multi-frame temporal loss is not limited to the calculation method indicated by formula (5); other measures of the gap between synthetic images may also be used.
  • the training device may also obtain several frames of images after the rth frame of content images to calculate the multi-frame temporal loss.
  • the above training device and the electronic device 100 in this application may be the same device.
  • the fusion style transfer model that the electronic device obtains by interpolating and fusing two or more of these style transfer models can also reduce the style differences between consecutive frames in the style-transferred video. That is to say, after the electronic device uses the fusion style transfer model to perform style transfer on the video according to the method in the foregoing embodiment, the styles of adjacent frame images in the video can transition smoothly. During video playback, the flicker caused by style changes between adjacent frame images is reduced.
  • the above-mentioned trained style transfer models may be stored in the electronic device 100 .
  • the electronic device 100 can obtain the style transfer model locally for fusion.
  • the above-mentioned trained style transfer model can be stored in the cloud.
  • the electronic device 100 uploads the video that needs to be style-transferred and the selected style option (such as the day and night style option 231A shown in FIG. 1D ) to the cloud.
  • the cloud can use the fusion style transfer model to perform style transfer on the video, and send the obtained style-transferred video to the electronic device 100 .
  • the electronic device 100 may also send only the selected style option to the cloud.
  • the cloud can send the style transfer model that needs to be merged to the electronic device 100 according to the above style options.
  • This embodiment of the present application does not specifically limit the storage location of the above trained style transfer model.
  • FIG. 6 exemplarily shows a flowchart of a photographing method provided by the present application.
  • the method may include steps S201-S207. Specifically:
  • S201 The electronic device 100 starts a camera application and a camera.
  • the electronic device 100 may open the camera application and the camera.
  • S202 The electronic device 100 receives a first user operation for selecting a time-lapse photography mode in the camera mode, and displays an option for determining the style of the time-lapse video.
  • the above-mentioned first user operation may be, for example, the user operation acting on the time-lapse photography option 201D shown in FIG. 1B .
  • the electronic device 100 may display options for determining the style of the time-lapse video on the user interface. For example, as shown in FIG. 1C , the style option 231 for performing style transfer on the captured video is shown.
  • the above-mentioned time-lapse video may be the video obtained by processing the images collected by the camera during time-lapse photography through the following steps S205-S207 and connecting them in series in order of collection time. The user can view the time-lapse video obtained by the electronic device through the gallery application.
  • S203 The electronic device 100 receives a second user operation for selecting the style and duration of the time-lapse video, and stores the selected style and duration of the time-lapse video.
  • the above-mentioned second user operation may be, for example, the user operation performed on the day/night switching style option 231A shown in FIG. 1C and the user operation performed on the confirmation control 232C after selecting a time in the time option 232B shown in FIG. 1E .
  • the style option selected above may be used to instruct the electronic device 100 to perform style transfer on the video captured by the camera according to the style option.
  • the electronic device 100 may obtain the daytime style transfer model and the nighttime style transfer model locally or from the cloud.
  • the electronic device 100 may perform interpolation and fusion on the daytime style transfer model and the nighttime style transfer model, and use the obtained fusion style transfer model to perform style transfer on the video.
  • the above-mentioned selected time length can be used to determine the frame extraction rate applied when frames are extracted from the video captured by the camera of the electronic device 100.
  • the electronic device 100 may first display an option for determining the time of the time-lapse video. Afterwards, the electronic device 100 displays an option for determining the style of the time-lapse video.
  • the electronic device 100 may also simultaneously display options for determining the style and time of the time-lapse video.
  • This embodiment of the present application does not limit the manner in which the electronic device 100 displays options for determining the style and time of the time-lapse video.
  • S204 The electronic device 100 receives user operations for starting and ending time-lapse photography, and obtains a first video, where the first video includes the images captured by the camera from the start to the end of time-lapse photography.
  • the above-mentioned user operation for starting time-lapse photography can be, for example, the user operation that acts on the confirmation control 232C after selecting the time in the time option 232B as shown in FIG. 1E .
  • the above-mentioned user operation for ending time-lapse photography may be, for example, the user operation acting on the stop shooting control 205 shown in FIG. 1F .
  • the above process from starting the time-lapse photography to ending the time-lapse photography is the process of performing the time-lapse photography.
  • the camera of the electronic device 100 may capture images at the normal video recording rate (for example, 30 frames of images per second).
  • the electronic device 100 may connect the images collected by the camera during the time-lapse photography in series in the order of collection time to obtain the first video.
  • S205 The electronic device 100 performs anti-shake processing on the first video to obtain a second video.
  • the electronic device 100 can perform anti-shake processing on the first video to obtain the second video.
  • the specific method of anti-shake processing will not be repeated here.
  • S206 The electronic device 100 extracts frames from the second video according to the time length selected by the user in the second user operation to obtain a third video, where the time length of the third video is the time length selected by the user.
  • the electronic device 100 may determine the frame sampling rate according to the time length of the process of time-lapse photography performed by the camera and the time length selected by the user in the second user operation.
  • the electronic device 100 can perform frame extraction on the second video according to the frame extraction rate, and the extracted images are connected in series in the order of acquisition time to obtain the third video.
  • for example, the electronic device 100 receives a user operation acting on the stop shooting control 205 one minute after time-lapse photography starts, and the electronic device 100 stops shooting. That is, the time-lapse photography performed by the camera lasts 1 minute. As shown in FIG. 1E , the time length selected by the user in the above-mentioned second user operation is 10 seconds. The electronic device 100 may then determine that the frame extraction rate is 1:6, that is, the electronic device 100 may extract 1 frame out of every 6 frames of images. In a possible implementation manner, the electronic device 100 may extract frames at equal intervals, that is, the 1st frame image, the 7th frame image, the 13th frame image, and so on in the second video. The electronic device 100 may then connect the extracted images in series in the order of acquisition time to obtain the third video.
  • the electronic device 100 may also extract frames in other ways according to the obtained frame sampling rate.
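  • The worked example above (1 minute of capture compressed to 10 seconds) can be sketched in a few lines. The function names `frame_step` and `extract_frames` are hypothetical, and rounding the ratio to the nearest integer is an assumption for cases where the durations do not divide evenly.

```python
def frame_step(capture_seconds, target_seconds):
    """Number of captured frames per kept frame (6 means a 1:6 rate)."""
    if target_seconds <= 0 or capture_seconds < target_seconds:
        raise ValueError("target must be positive and not exceed capture time")
    return max(1, round(capture_seconds / target_seconds))


def extract_frames(frames, step):
    """Keep every step-th frame at equal intervals, starting with the first."""
    return frames[::step]


# Usage: frames are numbered from 1 to match the description.
step = frame_step(capture_seconds=60, target_seconds=10)
kept = extract_frames(list(range(1, 20)), step)
```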
  • S207 The electronic device 100 uses the fusion style transfer model to perform style transfer on the third video according to the style selected by the user in the second user operation, and saves the style-transferred video as a time-lapse video.
  • the electronic device 100 may determine the style transfer model to be merged according to the style selected by the user in the second user operation. Further, the electronic device 100 may determine a fusion style transfer model for performing style transfer on each frame of images in the third video according to the number of frames of the images included in the third video. According to the method for performing style transfer on a video using the fusion style transfer model in the foregoing embodiment, the electronic device 100 can use the obtained fusion style transfer model to perform style transfer on each frame of the third video to obtain a time-lapse video. The electronic device 100 may save the time-lapse video. The specific implementation method for the electronic device 100 to perform style transfer on the third video will not be repeated here.
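  • Putting the processing steps together, the sketch below chains anti-shake processing, frame extraction and per-frame style transfer in the order described. The function `make_time_lapse` and its `stabilize` and `stylize_frame` callables are hypothetical stand-ins: the application does not specify the anti-shake method, and `stylize_frame(frame, j, n)` is assumed to apply the fusion style transfer model j of n to one frame.

```python
def make_time_lapse(frames, capture_seconds, target_seconds,
                    stabilize, stylize_frame):
    """Return the style-transferred time-lapse frames.

    frames:        images of the first video, in capture order
    stabilize:     callable taking and returning a list of frames
    stylize_frame: callable (frame, j, n) -> styled frame, j is 1-based
    """
    # Anti-shake processing: first video -> second video.
    second_video = stabilize(frames)

    # Frame extraction at equal intervals: second video -> third video.
    step = max(1, round(capture_seconds / target_seconds))
    third_video = second_video[::step]

    # Style transfer: frame j of n is handled by fused model j.
    n = len(third_video)
    return [stylize_frame(frame, j, n)
            for j, frame in enumerate(third_video, start=1)]


# Usage with trivial stand-ins that only record what would be applied.
result = make_time_lapse(
    frames=list(range(12)),
    capture_seconds=60,
    target_seconds=10,
    stabilize=lambda fs: list(fs),
    stylize_frame=lambda frame, j, n: (frame, j, n),
)
```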
  • the electronic device may extract frames from the captured video, so as to compress the time length of the original video to the time length desired by the user.
  • the electronic device performs style transfer on the video according to the video style selected by the user, so that the video shot by the user in a short time has the time-lapse effect of the video shot for a long time. For example, a time-lapse video shot within 1 minute can have a time-lapse effect that rapidly fades from day to night, which would have taken 12 hours or even longer.
  • the electronic device can perform anti-shake processing on the captured video.
  • the above shooting method enables time-lapse photography to break through the limitations of shooting scenes, equipment and time, and improves the convenience and fun of time-lapse photography for users.
  • the electronic device 100 does not receive the user-selected style of the time-lapse video.
  • the electronic device 100 may not perform style transfer on the video.
  • the user does not select a style in the style option 231 . That is, all styles in style option 231 are unchecked.
  • the electronic device 100 can start time-lapse photography.
  • the electronic device 100 may perform anti-shake processing and frame extraction processing to obtain a time-lapse video. That is to say, the user can choose to compress only the original video captured by the camera in time, so as to obtain a time-lapse video with a corresponding time-lapse effect.
  • the electronic device 100 may perform anti-shake processing on the captured original video, and then perform style transfer on the anti-shake-processed video according to the style of the time-lapse video selected by the user, to obtain a time-lapse video. That is to say, the electronic device 100 may not perform frame extraction processing on the video.
  • the electronic device 100 may first perform anti-shake processing on the captured original video.
  • the electronic device 100 may perform frame interpolation on the anti-shake processed video to increase the time length of the video to the time length of the time-lapse video selected by the user. Finally, the electronic device 100 may perform style transfer on the video subjected to frame insertion processing according to the style of the time-lapse video selected by the user to obtain a time-lapse video.
  • for the frame insertion method for video, reference may be made to the frame insertion method in the prior art, which is not limited in this embodiment of the present application.
  • the electronic device 100 may process the video captured in the non-time-lapse photography mode (e.g., by frame extraction and style transfer) to obtain a time-lapse video.
  • the electronic device 100 may open the gallery application and display the gallery interface 260 as shown in FIG. 7B.
  • the gallery interface 260 may include a first time indicator 261 , a second time indicator 265 , a first video thumbnail 262 , a second video thumbnail 263 , a first photo thumbnail 264 , and a second photo thumbnail 266 . Specifically:
  • the first video thumbnail 262 and the second video thumbnail 263 may be the covers of the first video and the second video, respectively.
  • the electronic device 100 may use the first frame image of the video as the cover of the video thumbnail.
  • the electronic device 100 may display a user interface for playing the first video or the second video.
  • the first photo thumbnail 264 and the second photo thumbnail 266 may be thumbnails of the first photo and the second photo, respectively.
  • the electronic device 100 may display the first photo or the second photo.
  • the first time indicator 261 and the second time indicator 265 may be used to indicate when the videos and photos listed under each of them were taken.
  • the time indicated by the first time indicator 261 is today (today being November 9, the date displayed on the user interface 210).
  • the first video thumbnail 262 , the second video thumbnail 263 , and the first photo thumbnail 264 are located under the first time indicator 261 . That is to say, the first video, the second video, and the first photo were shot by the electronic device 100 on November 9.
  • the time indicated by the second time indicator 265 is yesterday (yesterday is November 8).
  • the second photo thumbnail 266 is located under the second time indicator 265 . That is to say, the second photo was taken by the electronic device 100 on November 8.
  • the electronic device 100 may display more content on the gallery interface 260 in response to a user operation acting on the gallery interface 260 to slide up and down.
  • the electronic device 100 may display a user interface 270 as shown in FIG. 7C.
  • User interface 270 may include a time control 271 , a video playback area 272 , and setting options 273 . Specifically:
  • the time control 271 may be used to indicate the time at which the electronic device 100 stores the first video. For example, November 9, 2020 at 7:30 am.
  • the above-mentioned time for storing the first video may be the time when the shooting of the first video is completed.
  • Video playback area 272 may include playback controls 272A.
  • the play control 272A can be used to instruct the electronic device 100 to play the first video.
  • Setting options 273 may include a share option 273A, a favorite option 273B, an edit option 273C, and a delete option 273D.
  • the sharing option 273A can be used by the user to share the first video to other devices.
  • the favorite option 273B can be used by the user to add the first video to favorites.
  • the editing option 273C may be used by the user to perform editing operations on the first video, such as rotating and cropping, adding filters, and the like.
  • the delete option 273D may be used by the user to delete the first video from the electronic device 100 .
  • the electronic device 100 may display the video editing interface 280 as shown in FIG. 7D.
  • the video editing interface 280 may include a video playback area 281 and editing options 282 . Specifically:
  • Editing options 282 may include rotate crop options 282A, filter options 282B, soundtrack options 282C, text options 282D, watermark options 282E, time-lapse photography options 282F.
  • the rotate crop option 282A can be used to rotate and crop each frame image in the first video.
  • the filter option 282B, the soundtrack option 282C, the text option 282D, and the watermark option 282E can respectively add filters, add background music, add text, and add watermarks to each frame of the first video.
  • the time-lapse photography option 282F can be used to perform frame extraction processing and style transfer processing on the first video, so as to obtain a video with a time-lapse effect.
  • Edit options 282 may contain more or fewer options.
  • Video editing interface 280 may also include style selection options 283 .
  • the style selection options 283 may include a prompt control 283A, a day/night switching style option 283B, a season changing style option 283C, a rainy or sunny style option 283D, a cancel control 283E, and a next step control 283F. Specifically:
  • the prompt control 283A may be used to prompt the user to select the style to be applied to the first video by style transfer.
  • Prompt control 283A may include the text prompt "Style Selection”.
  • the embodiment of the present application does not limit the specific form of the prompt control 283A.
  • Cancel control 283E may be used by the user to cancel the style selection.
  • the electronic device 100 may display the video editing interface 280 as shown in FIG. 7D.
  • the next step control 283F can be used by the user to further complete related settings of time-lapse photography editing, for example, setting the time length of the time-lapse video.
  • style options may also be included in the style selection options 283 described above.
  • the electronic device 100 may display the video editing interface 280 as shown in FIG. 7F .
  • the video editing interface 280 may also include a duration selection option 284 .
  • the duration selection option 284 may include a prompt control 284A, a time option 284B, a previous step control 284C, and a save control 284D. Specifically:
  • Prompt control 284A may be used to prompt the user to select the length of time for the final generated time-lapse video.
  • the prompt control 284A may include a text prompt "Duration selection (please determine the duration of the processed video)".
  • for the role of the time option 284B, reference may be made to the introduction of the time option 232B in FIG. 1E in the foregoing embodiment, and details are not repeated here.
  • the previous step control 284C can be used by the user to return to the previous step and reselect the style for performing style transfer processing on the first video.
  • the electronic device 100 may display the video editing interface 280 as shown in FIG. 7D.
  • the save control 284D may be used to instruct the electronic device 100 to store the user-selected style (e.g., the day and night style) and time length (e.g., 10 seconds).
  • the electronic device 100 may perform frame extraction and style transfer on the first video according to the style and time length selected by the user, to obtain a time-lapse video.
  • the time length of the time-lapse video is the time length selected by the user.
  • the electronic device 100 may use a fusion style transfer model (e.g., a fusion style transfer model incorporating a daytime style transfer model and a nighttime style transfer model) to perform style transfer on the frame-extracted first video.
  • the electronic device 100 may display a user interface 270 as shown in FIG. 7G.
  • the content contained in the user interface 270 shown in FIG. 7G is consistent with the controls contained in the user interface 270 shown in FIG. 7C .
  • the difference is that the image displayed in the video playback area 272 is the cover of the time-lapse video obtained by performing frame extraction and style transfer on the first video.
  • the cover of the time-lapse video may be the first frame of the time-lapse video.
  • the time control 271 may indicate the time at which the electronic device 100 stored the above-mentioned time-lapse video, for example, 8:00 am on November 9, 2020. It can be seen that the time at which the electronic device 100 stored the first video is different from the time at which it stored the time-lapse video. The first video was shot and stored in the electronic device 100 at 7:30 am on November 9, 2020. The time-lapse video was obtained by the electronic device 100 processing the first video at 8:00 am on November 9, 2020, and was then stored in the electronic device 100 .
  • the electronic device 100 may play the time-lapse video in response to a user operation acting on the play control 272A shown in FIG. 7G .
  • for the specific process of playing the time-lapse video, reference may be made to the video playback process shown in FIG. 2A to FIG. 2C. Details are not repeated here.
  • the user can select the style and time length to perform time-lapse processing on the video that has been shot.
  • the above-mentioned video that has been shot may be, for example, a video obtained by the electronic device 100 through the camera when the video recording mode option 201B is selected in the camera mode option shown in FIG. 1B . That is to say, the electronic device 100 can perform frame extraction and style transfer processing on any video to obtain a time-lapse video.
  • the time-lapse effect of the time-lapse video may not be affected by the time length of the original video. Any video shot by a user in a short period of time can have a time-lapse effect of shooting a video for a long time after the frame extraction and style transfer processing in the embodiments of the present application.
  • the following describes a scene for shooting a panorama image with a style gradient effect provided by an embodiment of the present application.
  • FIG. 8A to FIG. 8E exemplarily show schematic diagrams of scenes in which the electronic device 100 shoots a panorama image with a style gradient effect.
  • the electronic device 100 may open the camera application and the camera. Wherein, the electronic device 100 may display the user interface 290 as shown in FIG. 8B .
  • the user interface 290 may include a preview area 291 , a style option 292 , a camera mode option 201 , a gallery shortcut control 202 , a shutter control 203 , and a camera flip control 204 . Specifically:
  • the camera mode option 201 may further include a time-lapse panorama mode option 201G, and the time-lapse panorama mode option 201G is in a selected state.
  • the electronic device 100 can capture a panorama image with a style gradient effect.
  • one or more style options may be included in the style options 292 , for example, the day and night changing style option 292A, the four season changing style option 292B, and the rainy and sunny changing style option 292C. All of the above style options can be used to instruct the electronic device 100 to perform style transfer on the captured panorama, so as to convert the style of the panorama to the style corresponding to the style option.
  • the style transfer model corresponding to the above-mentioned day-night transition style option 292A may be a fusion style transfer model obtained by fusing the daytime style transfer model and the nighttime style transfer model.
  • the electronic device 100 may divide the captured panorama into m regions from left to right. There may be overlapping portions in the m regions. Further, the electronic device 100 may use the above-mentioned fusion style transfer model to perform style transfer on the m regions segmented from the panorama.
  • the electronic device 100 may select a splicing area from each of the above-mentioned m areas that have undergone style transfer, and stitch the m areas to obtain a panorama image that has undergone style transfer.
  • the style from the left to the right of the panorama obtained by the stitching may gradually change from a daytime style to a nighttime style.
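  • The segmentation, per-region style transfer and stitching described above can be sketched on a single row of pixels. The functions `region_bounds` and `stylize_panorama`, the fixed overlap width, and the rule that each region contributes only its non-overlapping core as the splicing area are assumptions made for illustration; `stylize_region(pixels, i, m)` is a hypothetical stand-in for applying the fusion style transfer model of region i of m, and is assumed to preserve the region width.

```python
def region_bounds(width, m, overlap):
    """Split [0, width) into m left-to-right regions that overlap."""
    if m < 1 or overlap < 0 or width < m:
        raise ValueError("invalid segmentation parameters")
    bounds = []
    for i in range(m):
        core_start = i * width // m
        core_end = (i + 1) * width // m
        bounds.append((max(0, core_start - overlap),
                       min(width, core_end + overlap)))
    return bounds


def stylize_panorama(row, m, overlap, stylize_region):
    """Style each overlapping region, then stitch the region cores."""
    width = len(row)
    stitched = []
    bounds = region_bounds(width, m, overlap)
    for i, (start, end) in enumerate(bounds, start=1):
        styled = stylize_region(row[start:end], i, m)
        core_start = (i - 1) * width // m
        core_end = i * width // m
        # Keep only the splicing area (the core) of this styled region.
        stitched.extend(styled[core_start - start:core_end - start])
    return stitched


# Usage: tag every pixel with the index of the region model that styled it.
panorama = stylize_panorama(
    list(range(8)), m=2, overlap=1,
    stylize_region=lambda pixels, i, m: [(p, i) for p in pixels],
)
```

  • Styling overlapping regions and keeping only their cores gives each model some context beyond the seam; blending across the overlap instead of cutting at the core boundary is another option the sketch does not cover.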
  • the style transfer model corresponding to the above-mentioned four season changing style option 292B may be a fusion style transfer model obtained by fusing the spring style transfer model, the summer style transfer model, the autumn style transfer model, and the winter style transfer model.
  • the electronic device 100 may use the above-mentioned fusion style transfer model to perform style transfer on the m regions obtained by segmenting the panorama image.
  • the electronic device 100 may select a splicing area from each of the above-mentioned m areas that have undergone style transfer, and stitch the m areas to obtain a panorama image that has undergone style transfer.
  • the style from the left to the right of the panorama obtained by the stitching can be gradually changed from spring style to summer style, then from summer style to autumn style, and then from autumn style to winter style.
  • the style transfer model corresponding to the above-mentioned rainy and sunny changing style option 292C may be a fusion style transfer model obtained by fusing the sunny weather style transfer model and the rainy weather style transfer model.
  • the electronic device 100 may use the above-mentioned fusion style transfer model to perform style transfer on the m regions obtained by segmenting the panorama image.
  • the electronic device 100 may select a splicing area from each of the above-mentioned m areas that have undergone style transfer, and stitch the m areas to obtain a panorama image that has undergone style transfer.
  • the style from the left to the right of the panorama obtained by the stitching can be gradually changed from a sunny style to a rainy style.
  • style option 292 may also include a single style transfer model (e.g., a cartoon style transfer model).
  • the embodiment of the present application does not limit the specific way of changing the style of the panorama image obtained by performing style transfer using the fusion style transfer model.
  • the electronic device 100 may also perform style transfer processing on the multi-frame images obtained by segmenting the panorama, and then stitch the multi-frame images into a panorama.
  • the style from the left to the right of the panorama obtained by the above splicing may gradually change from a night style to a day style.
  • the manner of dividing the panorama image by the electronic device 100 may also be dividing from the upper side to the lower side.
  • the style from the upper side to the lower side of the panorama obtained through the above style transfer processing and splicing may be gradually changed from a daytime style to a nighttime style, or gradually changed from a nighttime style to a daytime style.
  • the electronic device 100 may also segment the panoramic image along any direction of the panoramic image.
  • the preview area 291 can be used to display images captured by the camera in real time.
  • the preview area 291 may contain an operation prompt 291A and a shooting progress indication 291B, where:
  • the operation prompt 291A may be used to prompt the user for an operation instruction for shooting a panorama.
  • the operation prompt 291A may include a text prompt "press the shutter key and move slowly in the direction of the arrow".
  • the “shutter key” in the above text prompt is the shutter control 203 .
  • the “arrow” in the above text prompt is the arrow in the shooting progress indication 291B.
  • the shooting progress indication 291B may include a panorama thumbnail and arrows.
  • the above-mentioned panorama thumbnails may be used to present thumbnails of panorama images obtained from the time when the panorama shooting is started to the current moment.
  • the above arrows can be used to indicate the direction in which the electronic device 100 moves during the panorama shooting process.
  • the above-mentioned arrow pointing in the horizontal right direction may indicate that during the panorama shooting process, the electronic device 100 moves in the horizontal right direction from the position at the moment when the panorama shooting starts.
  • the electronic device 100 receives a user operation acting on the day/night switching style option 292A and a user operation acting on the shutter control 203 .
  • the electronic device 100 can start panorama shooting.
  • the electronic device 100 may display the user interface 290 as shown in FIG. 8C .
  • User interface 290 may include the preview area 291 and a pause capture control 206.
  • the preview area can be used to display the images captured by the camera in real time.
  • the preview area 291 may include a shooting progress indication 291B and an operation prompt 291C.
  • for the above-mentioned shooting progress indication 291B, reference may be made to the descriptions of the foregoing embodiments.
  • the above-mentioned operation prompt 291C can be used to prompt the user with operation instructions during the panorama shooting process.
  • the operation prompt 291C may include the text prompt "Please keep the arrow on the center line".
  • Pause capture control 206 may be used to end panorama capture.
  • the electronic device 100 may stitch images captured by the camera into a panorama image during the process from starting the panorama shooting to ending the panorama shooting.
  • the arrow in the above-mentioned shooting progress indication 291B may move with the movement of the electronic device 100 .
  • the electronic device 100 may end the panorama shooting.
  • the ending position of the above arrow may be, for example, the rightmost position in the shooting progress indication 291B.
  • the electronic device 100 may stitch the images collected by the camera into a panorama during the process from starting the panorama shooting to ending the panorama shooting.
  • the electronic device 100 may perform anti-shake processing on the multi-frame images collected during the above-mentioned panorama shooting process. Further, the electronic device 100 may stitch multiple frames of images subjected to anti-shake processing into a panorama.
  • the embodiment of the present application does not limit the manner in which the electronic device 100 splices the images collected by the camera to obtain the panorama image, and the specific implementation method may refer to the method for capturing the panorama image in the prior art.
  • the electronic device 100 may perform style transfer on the above-mentioned original panorama according to the style selected by the user to obtain a panorama with a style gradient effect.
  • the electronic device 100 may store the above-mentioned panorama image with a style gradient effect.
  • the electronic device 100 may display the user interface 290 .
  • for the content contained on the user interface 290, reference may be made to the description of the user interface 290 shown in FIG. 8B in the foregoing embodiment.
  • the user can view the panorama image with the style gradient effect obtained by the above shooting through the gallery shortcut key 202 .
  • the electronic device 100 may display the user interface 310 as shown in FIG. 8E.
  • User interface 310 may include time controls 311 and first panorama 312 .
  • the first panorama 312 is a panorama obtained through the panorama shooting process shown in FIG. 8B and FIG. 8C .
  • the style of the first panorama 312 from left to right may gradually change from a day style to a night style.
  • the above-mentioned time control 311 may be used to indicate the time at which the electronic device 100 stored the first panorama 312.
  • the user interface 310 may further include more content, which is not limited in this embodiment of the present application.
  • the electronic device 100 can use the fusion style transfer model to perform style transfer on the panorama image to obtain a panorama image with a style gradient effect.
  • the above-mentioned method for processing a panorama image improves the interest of a user in shooting a panorama image.
  • FIG. 9 exemplarily shows a method for implementing style transfer for panorama images.
  • the main steps of performing style transfer on the panorama to obtain a panorama with a style gradient effect may include: dividing the panorama, performing style transfer on the divided regions and selecting a splicing area from each, and stitching the splicing areas to obtain a panorama with a style gradient effect.
  • the first panorama may be an original panorama obtained by stitching the images collected by the camera during the panorama shooting process shown in FIG. 8B to FIG. 8C .
  • the electronic device 100 may divide the first panorama into m regions from left to right.
  • the electronic device 100 may segment the first panorama image by capturing an image through a sliding window.
  • the length of the sliding window may be d.
  • the distance of each sliding of the sliding window is Δc.
  • the electronic device 100 slides the above-mentioned sliding window m-1 times to the right from the leftmost side of the first panorama, so as to obtain m regions each with a length of d.
  • the above-mentioned Δc is smaller than the above-mentioned d. That is, adjacent regions overlap.
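The sliding-window division described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `split_into_regions`, the pixel values, and the requirement that the window tile the panorama exactly are assumptions made for the example.

```python
def split_into_regions(total_length, window, stride):
    """Return (start, end) intervals for a window of length `window` (d)
    slid left to right in steps of `stride` (delta-c) over `total_length`."""
    if not 0 < stride < window <= total_length:
        raise ValueError("expected 0 < stride < window <= total_length")
    if (total_length - window) % stride != 0:
        # Assumption for this sketch: the last window ends exactly at the right edge.
        raise ValueError("window and stride must tile the panorama exactly")
    slides = (total_length - window) // stride  # equals m - 1
    return [(i * stride, i * stride + window) for i in range(slides + 1)]


# Hypothetical sizes: a 1000-pixel-wide panorama, d = 400, delta-c = 200.
regions = split_into_regions(total_length=1000, window=400, stride=200)
print(regions)  # [(0, 400), (200, 600), (400, 800), (600, 1000)]
```

Because the stride is smaller than the window, each region shares part of its width with its neighbor, which is the overlap the text refers to.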
  • the electronic device 100 may perform style transfer for the m regions respectively according to the style selected by the user.
  • for the above method of performing style transfer, reference may be made to the description in the foregoing embodiment of the electronic device 100 using the fusion style transfer model to perform style transfer on multiple frames of images in a video.
  • the style selected by the user is a day and night transition style.
  • the electronic device 100 may perform interpolation fusion on the daytime style transfer model and the nighttime style transfer model.
  • the network structure of the resulting fusion style transfer model is the same as that of the daytime style transfer model and the nighttime style transfer model.
  • the model for the electronic device 100 to perform style transfer on the j-th region is the fusion style transfer model a_j.
  • the parameter of the fusion style transfer model a_j at the first position may be ((m-j+1)/m)*θ_day + ((j-1)/m)*θ_night.
  • the above θ_day and θ_night are the parameters of the daytime style transfer model and the nighttime style transfer model at the first position, respectively.
  • the above-mentioned first position is any position in the network structure of the above-mentioned fusion style transfer model a_j.
  • the first position of the daytime style transfer model, the first position of the nighttime style transfer model, and the first position of the above-mentioned fusion style transfer model a_j are all the same position in the same network structure.
  • the above j is an integer greater than or equal to 1 and less than or equal to m.
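The interpolation of parameters described above can be sketched as follows. This is a hedged illustration only: the function names, the dictionary-of-lists parameter layout, and the key `"conv1.weight"` are assumptions for the example, and a real model would hold tensors rather than Python lists.

```python
def fusion_weights(j, m):
    """Weights applied to the day and night parameters for the j-th region,
    following the stated formula ((m-j+1)/m, (j-1)/m)."""
    if not 1 <= j <= m:
        raise ValueError("j must satisfy 1 <= j <= m")
    return (m - j + 1) / m, (j - 1) / m


def fuse_parameters(theta_day, theta_night, j, m):
    """Interpolate two parameter sets that share one network structure,
    position by position."""
    w_day, w_night = fusion_weights(j, m)
    return {
        name: [w_day * a + w_night * b
               for a, b in zip(theta_day[name], theta_night[name])]
        for name in theta_day
    }


# Hypothetical two-value "layer" shared by both models.
theta_day = {"conv1.weight": [1.0, 0.0]}
theta_night = {"conv1.weight": [0.0, 1.0]}

first = fuse_parameters(theta_day, theta_night, j=1, m=4)
middle = fuse_parameters(theta_day, theta_night, j=3, m=4)
print(first, middle)
```

With the formula as written, the two weights always sum to 1 and j = 1 reproduces the daytime parameters exactly, while j = m leaves a 1/m share of the daytime parameters in the result.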
  • the electronic device 100 can use the fusion style transfer models a_1 to a_m to perform style transfer on the first region to the m-th region, respectively.
  • the styles corresponding to the fusion style transfer models a_1 to a_m are style 1 to style m, respectively.
  • the above fusion style transfer model a_1 can be a daytime style transfer model.
  • Style 1 can be a day style.
  • the above-mentioned fusion style transfer model a_m may be a night style transfer model.
  • Style m can be night style.
  • the styles of the first region to the mth region are style 1 to style m, respectively.
  • the style of the first area to the mth area gradually changes from a daytime style to a nighttime style.
  • the electronic device 100 may cut out a part of the splicing area from each area from the first area to the mth area that has undergone style transfer.
  • the electronic device 100 may perform splicing of the obtained splicing regions. In this way, the electronic device 100 can obtain a panorama image with a style gradient effect.
  • the electronic device 100 may cut out splicing regions of the same length from each region.
  • the length of each splicing region may be Δc'.
  • Δc' = L/m.
  • to cut the k-th splicing region out of the k-th region, the electronic device 100 starts at the position at a distance of (k-1)*(Δc'-Δc) from the leftmost side of the k-th region and cuts out a splicing region of length Δc'.
  • the above k is an integer greater than or equal to 1 and less than or equal to m.
  • the electronic device 100 starts from the leftmost side of the first region that has undergone style transfer and cuts out a splicing region of length Δc' to obtain the first splicing region.
  • the electronic device 100 starts from the position at a distance of Δc'-Δc from the leftmost side of the second region that has undergone style transfer and cuts out a splicing region of length Δc' to obtain the second splicing region.
  • the electronic device 100 starts from the position at a distance of (m-1)*(Δc'-Δc) from the leftmost side of the m-th region that has undergone style transfer and cuts out a splicing region of length Δc' to obtain the m-th splicing region.
  • the electronic device 100 may splice the above-mentioned first splicing region to m-th splicing region in order from left to right. The splicing regions do not overlap one another during splicing. As shown in FIG. 9 , the electronic device 100 can obtain a panoramic image with a length L.
  • the panorama has a stylized gradient effect. For example, the panorama's style from left to right is gradually changing from a day style to a night style.
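The cutting and splicing geometry described above can be checked with a small sketch. It is illustrative only: the function name `stitch`, the toy sizes, and the use of lists of column indices in place of image data are assumptions, and the style transfer step is left out (each region is passed through unchanged) so that only the offsets are exercised.

```python
def stitch(regions, stride, splice_len):
    """Cut one splicing region of length `splice_len` from each region,
    starting (k-1)*(splice_len - stride) from its left edge, and join them."""
    out = []
    for k, region in enumerate(regions, start=1):
        start = (k - 1) * (splice_len - stride)
        out.extend(region[start:start + splice_len])
    return out


# Hypothetical sizes: L = 12 columns, window d = 6, stride = 2, so m = 4.
L, d, stride = 12, 6, 2
m = (L - d) // stride + 1
splice_len = L // m  # 3 columns per splicing region
panorama = list(range(L))  # each entry stands for one image column
regions = [panorama[i * stride:i * stride + d] for i in range(m)]

stitched = stitch(regions, stride, splice_len)
print(stitched)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
```

In this toy case the spliced result covers every column of the original exactly once, so the output has length L with no overlap between splicing regions.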
  • the electronic device 100 may also acquire panorama images or other types of images from a gallery application, and perform style transfer on the obtained panorama images or other types of images by using the method in the aforementioned embodiment shown in FIG. 9 .
  • the styles of the spliced regions cut out from adjacent regions can transition more smoothly. That is, the above-mentioned method of segmenting the first panoramic image and cutting out the splicing area from each area obtained through style transfer can improve the smoothness of the stylized effect of the panoramic image.
  • the left-to-right style of the panorama generated by the electronic device 100 can be more smoothly changed from a daytime style to a nighttime style.
  • the embodiments of the present application do not limit the specific values of the length d of the sliding window, the distance Δc for each sliding of the sliding window, the number of regions m obtained by dividing the first panoramic image, and the length Δc' of each splicing region.
  • each sliding distance of the sliding window may be different. That is, the lengths of the regions obtained by dividing the first panoramic image by the electronic device 100 may not be equal.
  • the length of the electronic device 100 intercepting the splicing area from each area may also be different. That is, the lengths of the stitching regions used for stitching the panorama may not be equal.
  • FIG. 10 shows a schematic structural diagram of the electronic device 100 .
  • the electronic device 100 may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charge management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, a headphone jack 170D, a sensor module 180, buttons 190, a motor 191, an indicator 192, a camera 193, a display screen 194, a subscriber identification module (SIM) card interface 195, and so on.
  • the sensor module 180 may include a pressure sensor 180A, a gyroscope sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, etc.
  • the structures illustrated in the embodiments of the present invention do not constitute a specific limitation on the electronic device 100 .
  • the electronic device 100 may include more or fewer components than shown, or combine some components, or separate some components, or arrange the components differently.
  • the illustrated components may be implemented in hardware, software, or a combination of software and hardware.
  • the processor 110 may include one or more processing units. For example, the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a memory, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU). Different processing units may be independent devices, or may be integrated in one or more processors.
  • the controller may be the nerve center and command center of the electronic device 100 .
  • the controller can generate an operation control signal according to the instruction operation code and timing signal, and complete the control of fetching and executing instructions.
  • a memory may also be provided in the processor 110 for storing instructions and data.
  • the memory in the processor 110 is a cache memory. This memory may hold instructions or data that the processor 110 has just used or uses repeatedly. If the processor 110 needs to use the instructions or data again, they can be called directly from this memory. This avoids repeated accesses and reduces the waiting time of the processor 110, thereby increasing the efficiency of the system.
  • the USB interface 130 is an interface that conforms to the USB standard specification, and may specifically be a Mini USB interface, a Micro USB interface, a USB Type C interface, and the like.
  • the USB interface 130 can be used to connect a charger to charge the electronic device 100, and can also be used to transmit data between the electronic device 100 and peripheral devices.
  • the charging management module 140 is used to receive charging input from the charger.
  • the power management module 141 is used for connecting the battery 142 , the charging management module 140 and the processor 110 .
  • the power management module 141 receives input from the battery 142 and/or the charging management module 140 and supplies power to the processor 110 , the internal memory 121 , the external memory, the display screen 194 , the camera 193 , and the wireless communication module 160 .
  • the wireless communication function of the electronic device 100 may be implemented by the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, the modulation and demodulation processor, the baseband processor, and the like.
  • Antenna 1 and Antenna 2 are used to transmit and receive electromagnetic wave signals.
  • Each antenna in electronic device 100 may be used to cover a single or multiple communication frequency bands.
  • the mobile communication module 150 may provide wireless communication solutions including 2G/3G/4G/5G etc. applied on the electronic device 100 .
  • the mobile communication module 150 may include at least one filter, switch, power amplifier, low noise amplifier (LNA) and the like.
  • the mobile communication module 150 can receive electromagnetic waves from the antenna 1, filter and amplify the received electromagnetic waves, and transmit them to the modulation and demodulation processor for demodulation.
  • the modem processor may include a modulator and a demodulator.
  • the modulator is used to modulate the low frequency baseband signal to be sent into a medium and high frequency signal.
  • the demodulator is used to demodulate the received electromagnetic wave signal into a low frequency baseband signal. Then the demodulator transmits the demodulated low-frequency baseband signal to the baseband processor for processing.
  • the low frequency baseband signal is processed by the baseband processor and passed to the application processor.
  • the application processor outputs sound signals through audio devices (not limited to the speaker 170A, the receiver 170B, etc.), or displays images or videos through the display screen 194 .
  • the wireless communication module 160 can provide wireless communication solutions applied on the electronic device 100, including wireless local area network (WLAN) (such as a wireless fidelity (Wi-Fi) network), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), and infrared (IR) technology.
  • the wireless communication module 160 may be one or more devices integrating at least one communication processing module.
  • the wireless communication module 160 receives electromagnetic waves via the antenna 2 , frequency modulates and filters the electromagnetic wave signals, and sends the processed signals to the processor 110 .
  • the wireless communication module 160 can also receive the signal to be sent from the processor 110 , perform frequency modulation on it, amplify it, and convert it into electromagnetic waves for radiation through the antenna 2 .
  • the antenna 1 of the electronic device 100 is coupled with the mobile communication module 150, and the antenna 2 is coupled with the wireless communication module 160, so that the electronic device 100 can communicate with the network and other devices through wireless communication technology.
  • the electronic device 100 implements a display function through a GPU, a display screen 194, an application processor, and the like.
  • the GPU is a microprocessor for image processing, and is connected to the display screen 194 and the application processor.
  • the GPU is used to perform mathematical and geometric calculations for graphics rendering.
  • Processor 110 may include one or more GPUs that execute program instructions to generate or alter display information.
  • Display screen 194 is used to display images, videos, and the like.
  • Display screen 194 includes a display panel.
  • the display panel can be a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a Mini LED, a Micro LED, a Micro OLED, a quantum dot light-emitting diode (QLED), and so on.
  • the electronic device 100 may include one or N display screens 194 , where N is a positive integer greater than one.
  • the electronic device 100 may implement a shooting function through an ISP, a camera 193, a video codec, a GPU, a display screen 194, an application processor, and the like.
  • the ISP is used to process the data fed back by the camera 193 .
  • the shutter is opened, the light is transmitted to the camera photosensitive element through the lens, the light signal is converted into an electrical signal, and the camera photosensitive element transmits the electrical signal to the ISP for processing, and converts it into an image visible to the naked eye.
  • ISP can also perform algorithm optimization on image noise, brightness, and skin tone.
  • ISP can also optimize the exposure, color temperature and other parameters of the shooting scene.
  • the ISP may be provided in the camera 193 .
  • the ISP may also perform anti-shake processing on multiple frames of images in the video.
  • the ISP can compensate the image according to the data collected by the motion sensor, so as to reduce problems such as image instability and out-of-focus caused by the shaking of the electronic device 100 during the shooting process.
  • Camera 193 is used to capture still images or video.
  • the object is projected through the lens to generate an optical image onto the photosensitive element.
  • the photosensitive element may be a charge coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor.
  • the photosensitive element converts the optical signal into an electrical signal, and then transmits the electrical signal to the ISP to convert it into a digital image signal.
  • the ISP outputs the digital image signal to the DSP for processing.
  • DSP converts digital image signals into standard RGB, YUV and other formats of image signals.
  • the electronic device 100 may include 1 or N cameras 193 , where N is a positive integer greater than 1.
  • the digital signal processor is used to process digital signals; in addition to digital image signals, it can also process other digital signals. For example, when the electronic device 100 selects a frequency point, the digital signal processor is used to perform a Fourier transform on the frequency point energy.
  • Video codecs are used to compress or decompress digital video.
  • the electronic device 100 may support one or more video codecs.
  • the electronic device 100 can play or record videos in various encoding formats, such as Moving Picture Experts Group (MPEG) 1, MPEG2, MPEG3, and MPEG4.
  • the NPU is a neural-network (NN) computing processor.
  • Applications such as intelligent cognition of the electronic device 100 can be implemented through the NPU, such as image recognition, face recognition, speech recognition, text understanding, and the like.
  • multiple style transfer models may be stored in the NPU.
  • the NPU can use the style transfer model to perform style transfer on the images processed by ISP.
  • the NPU can fuse multiple style transfer models using the methods shown in Figures 3 and 4, and use the fusion style transfer models obtained by fusion to perform style transfer on the multiple frames of images respectively.
  • the external memory interface 120 can be used to connect an external memory card to expand the storage capacity of the electronic device 100 .
  • the external memory card communicates with the processor 110 through the external memory interface 120 to implement the data storage function, for example, saving files such as music and videos in the external memory card.
  • Internal memory 121 may be used to store computer executable program code, which includes instructions.
  • the processor 110 executes various functional applications and data processing of the electronic device 100 by executing the instructions stored in the internal memory 121 .
  • the electronic device 100 may implement audio functions, such as music playback and recording, through an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone interface 170D, an application processor, and the like.
  • the audio module 170 is used for converting digital audio information into analog audio signal output, and also for converting analog audio input into digital audio signal.
  • the speaker 170A, also referred to as a “loudspeaker”, is used to convert audio electrical signals into sound signals.
  • the receiver 170B, also referred to as an “earpiece”, is used to convert audio electrical signals into sound signals.
  • the microphone 170C, also called a “mic”, is used to convert sound signals into electrical signals.
  • the earphone jack 170D is used to connect wired earphones.
  • the pressure sensor 180A is used to sense pressure signals, and can convert the pressure signals into electrical signals.
  • the gyro sensor 180B may be used to determine the motion attitude of the electronic device 100 .
  • the angular velocity of the electronic device 100 about three axes (i.e., the x, y, and z axes) may be determined by the gyro sensor 180B.
  • the gyro sensor 180B can be used for image stabilization.
  • the gyro sensor 180B detects the shaking angle of the electronic device 100, calculates the distance that the lens module needs to compensate according to the angle, and allows the lens to offset the shaking of the electronic device 100 through reverse motion to achieve anti-shake.
  • the gyro sensor 180B can also be used for navigation and somatosensory game scenarios.
  • the air pressure sensor 180C is used to measure air pressure.
  • the electronic device 100 calculates the altitude through the air pressure value measured by the air pressure sensor 180C to assist in positioning and navigation.
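The text does not say how the altitude is derived from the measured air pressure. One widely used approximation is the international barometric formula, sketched below; treating it as the method used by the electronic device 100 is an assumption, as are the function name `altitude_m` and the default sea-level reference of 1013.25 hPa.

```python
SEA_LEVEL_HPA = 1013.25  # standard-atmosphere sea-level pressure (assumed reference)


def altitude_m(pressure_hpa, sea_level_hpa=SEA_LEVEL_HPA):
    """Approximate altitude in meters from air pressure in hPa using the
    international barometric formula."""
    if pressure_hpa <= 0 or sea_level_hpa <= 0:
        raise ValueError("pressures must be positive")
    return 44330.0 * (1.0 - (pressure_hpa / sea_level_hpa) ** (1.0 / 5.255))


# A lower pressure reading corresponds to a higher altitude.
print(round(altitude_m(900.0)))
```

In practice the sea-level reference varies with the weather, so an estimate like this is better suited to tracking relative height changes than to absolute positioning.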
  • the acceleration sensor 180E can detect the magnitude of the acceleration of the electronic device 100 in various directions (generally three axes).
  • the magnitude and direction of gravity can be detected when the electronic device 100 is stationary. The acceleration sensor 180E can also be used to identify the posture of the electronic device, in applications such as switching between landscape and portrait modes and pedometers.
  • the electronic device 100 can measure the distance through infrared or laser. In some embodiments, when shooting a scene, the electronic device 100 can use the distance sensor 180F to measure the distance to achieve fast focusing.
  • Proximity light sensor 180G may include, for example, light emitting diodes (LEDs) and light detectors, such as photodiodes.
  • the light emitting diodes may be infrared light emitting diodes.
  • the electronic device 100 can use the proximity light sensor 180G to detect that the user holds the electronic device 100 close to the ear to talk, so as to automatically turn off the screen to save power.
  • the ambient light sensor 180L is used to sense ambient light brightness.
  • the ambient light sensor 180L can also cooperate with the proximity light sensor 180G to detect whether the electronic device 100 is in a pocket, so as to prevent accidental touch.
  • the fingerprint sensor 180H is used to collect fingerprints.
  • the electronic device 100 can use the collected fingerprint characteristics to realize fingerprint unlocking, accessing application locks, taking pictures with fingerprints, answering incoming calls with fingerprints, and the like.
  • the temperature sensor 180J is used to detect the temperature.
  • Touch sensor 180K also called “touch panel”.
  • the touch sensor 180K may be disposed on the display screen 194, and the touch sensor 180K and the display screen 194 together form a touchscreen.
  • the touch sensor 180K is used to detect a touch operation on or near it.
  • the touch sensor can pass the detected touch operation to the application processor to determine the type of touch event.
  • Visual output related to touch operations may be provided through display screen 194 .
  • the bone conduction sensor 180M can acquire vibration signals. In some embodiments, the bone conduction sensor 180M can acquire the vibration signal of a bone that vibrates when a person speaks.
  • the keys 190 include a power-on key, a volume key, and the like. The keys 190 may be mechanical keys or touch keys.
  • the electronic device 100 may receive key inputs and generate key signal inputs related to user settings and function control of the electronic device 100 .
  • Motor 191 can generate vibrating cues.
  • the motor 191 can be used for vibrating alerts for incoming calls, and can also be used for touch vibration feedback.
  • the indicator 192 can be an indicator light, which can be used to indicate the charging state, the change of the power, and can also be used to indicate a message, a missed call, a notification, and the like.
  • the SIM card interface 195 is used to connect a SIM card.
  • the electronic device may acquire the first image sequence.
  • the above-mentioned first image sequence may be a video or a multi-frame image obtained by segmenting a panorama.
  • the first image sequence is a video
  • the above-mentioned first image sequence may be obtained by turning on the camera of the electronic device shown in the foregoing embodiments of FIGS. 1A to 1F and shooting.
  • the above-mentioned first image sequence may be obtained from a gallery application by the electronic device shown in the foregoing embodiments of FIG. 7A to FIG. 7C .
  • the panorama may be photographed by turning on the camera of the electronic device shown in FIGS. 8A to 8D .
  • the electronic device may process the first image sequence based on the target transfer style to obtain the second image sequence.
  • the above-mentioned target transfer style may be, for example, the day and night transition style, the four-season change style, or the sunny and rainy alternate style in the foregoing embodiments.
  • the electronic device 100 may determine the above-mentioned target transfer style according to a user operation acting on any style option in the style options 231 .
  • the above-mentioned target transfer style may be used to indicate that the style of the first frame of images in the second image sequence to the style of the n-th frame of images changes in the M styles in the order of the first style.
  • the above-mentioned target transfer style can be used to determine the size of the above-mentioned M. That is, the number of style transfer models used for fusion.
  • the high-level semantic information of the images in the second image sequence obtained by changing the sequence of the first style may present a sequence change in natural time.
  • the target transfer style is day-night transfer style.
  • the above M styles can be day style and night style.
  • the above-mentioned first style order may be an order changing from day style to night style.
  • the target migration style is the seasonal change style.
  • the above M styles can be spring style, summer style, autumn style and winter style.
  • the above-mentioned first style sequence may be from spring style to summer style, then from summer style to autumn style, and then from autumn style to night style. This embodiment of the present application does not limit the order in which the M styles are arranged in the above-mentioned first style order.
  • the electronic device may use k fusion style transfer models to process the above-mentioned first image sequence based on the above-mentioned target transfer style.
  • the k fusion style transfer models can be weighted and generated by M single style transfer models.
  • The term "when" may be interpreted to mean "if", "after", "in response to determining", or "in response to detecting", depending on the context.
  • Similarly, the phrases "when it is determined" or "if (the stated condition or event) is detected" can be interpreted to mean "if it is determined", "in response to determining", "on detecting (the stated condition or event)", or "in response to detecting (the stated condition or event)", depending on the context.
  • The above-mentioned embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof.
  • When implemented in software, they can be implemented in whole or in part in the form of a computer program product.
  • The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, all or part of the processes or functions described in the embodiments of the present application are generated.
  • The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device.
  • The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire (e.g., coaxial cable, optical fiber, digital subscriber line) or wirelessly (e.g., infrared, radio, microwave).
  • The computer-readable storage medium can be any available medium that a computer can access, or a data storage device, such as a server or data center, that integrates one or more available media.
  • The available media may be magnetic media (e.g., floppy disks, hard disks, magnetic tapes), optical media (e.g., DVDs), or semiconductor media (e.g., solid state drives), and the like.
  • All or part of the processes in the foregoing method embodiments can be completed by a computer program instructing the relevant hardware, and the program can be stored in a computer-readable storage medium.
  • When the program is executed, it may include the processes of the foregoing method embodiments.
  • The aforementioned storage medium includes ROM, random access memory (RAM), magnetic disks, optical disks, and other media that can store program code.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Studio Devices (AREA)
  • Television Signal Processing For Recording (AREA)
  • Image Processing (AREA)

Abstract

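The present application provides an image processing method and an electronic device. In this method, the electronic device may, based on a target transfer style, perform style transfer processing on a first image sequence using a fusion style transfer model that fuses multiple single-style transfer models, to obtain a second image sequence. The styles of the first through last frames of the second image sequence change, in a first style order, among the styles of the images output by the multiple single-style transfer models. The first image sequence may come from a video shot by the electronic device. The electronic device may save the multiple frames of the second image sequence as a video, which during playback can present the effect of time passing quickly. The image processing method lets a video that a user shoots in a short time have the time-lapse effect of a video shot over a long time, making time-lapse photography more convenient and more fun for the user.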

Description

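Image processing method and electronic device
This application claims priority to Chinese patent application No. 202011420630.6, entitled "Image processing method and electronic device", filed with the China Patent Office on December 7, 2020, the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of terminal technologies, and in particular to an image processing method and an electronic device.
Background
With the development of electronic technology, users can perform time-lapse photography with camera-equipped electronic devices such as mobile phones and tablet computers. A video shot over a long period, for example several hours, several days, or even several years, is compressed into a video with a short playback time, revealing striking scenes that the naked eye cannot normally perceive.
During time-lapse photography, the electronic device may extract frames from the video to achieve the time-lapse effect. When shooting a time-lapse video, however, the user usually has to keep the electronic device fixed in one place for a long time. This places considerable constraints on the scene, the equipment, and the time available for time-lapse photography.
Summary
This application provides an image processing method and an electronic device that can perform style transfer on a video shot by a user, so that a video shot in a short time has the time-lapse effect of a video shot over a long time, making time-lapse photography more convenient and more fun for the user.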
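In a first aspect, this application provides an image processing method. The method includes: an electronic device acquires a first image sequence. The electronic device may process the first image sequence based on a target transfer style to obtain a second image sequence. The first image sequence and the second image sequence each contain n frames of images. The i-th frame of the first image sequence and the i-th frame of the second image sequence have the same high-level semantic information but different styles. The target transfer style may be used to indicate that the styles of the first through n-th frames of the second image sequence change among M styles in a first style order. n and M are integers greater than 1, and i is a positive integer less than or equal to n. The electronic device may save the second image sequence.
The first image sequence may be a video, or multiple frames of images obtained by segmenting a panorama.
The style of an image may include its texture features and its form of artistic expression. The content of an image may include low-level semantic information and high-level semantic information. The low-level semantic information of an image is the style of the image. The high-level semantic information of an image may refer to what the image expresses at the level closest to human understanding. The processing that the electronic device performs on the first image sequence based on the target transfer style may specifically be style transfer processing, which changes the style of an image. That is, an image that has undergone style transfer processing has a changed style, while the high-level semantic information of the original image is unchanged.
The target transfer style may be, for example, a day-night transition style, a changing-seasons style, or a sunny-rainy alternation style. The target transfer style may be determined by the electronic device based on a style option selected by the user.
The target transfer style may be used to determine the value of M, that is, the number of style transfer models used for fusion. In some embodiments, the high-level semantic information of the images in the second image sequence, obtained by changing styles in the first style order, may present a change that follows the order of natural time. For example, if the target transfer style is the day-night transition style, the M styles may be the day style and the night style, and the first style order may be an order changing from the day style to the night style. If the target transfer style is the changing-seasons style, the M styles may be the spring style, summer style, autumn style, and winter style, and the first style order may be from the spring style to the summer style, then from the summer style to the autumn style, and then from the autumn style to the winter style. The embodiments of this application do not limit the order in which the M styles are arranged in the first style order.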
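In some embodiments, the electronic device may use k fusion style transfer models to process the first image sequence based on the target transfer style, where k is less than or equal to n. The output images of the k fusion style transfer models form the second image sequence. The output of one fusion style transfer model may be one frame or several consecutive frames of the second image sequence.
Each of the k fusion style transfer models may be generated as a weighted combination of M single-style transfer models. The closer the style of a fusion style transfer model's output images is to the style of the j-th single-style transfer model's output images, the larger the weight of the j-th single-style transfer model when that fusion model is generated. The styles of the images output by the M single-style transfer models make up the M styles, and j is a positive integer less than or equal to M.
For example, the target transfer style is the day-night transition style. The single-style transfer models that the electronic device uses to generate the fusion models may be a day style transfer model and a night style transfer model. The fusion style transfer models used to perform style transfer on the first image sequence differ in the weights given to the day style transfer model and the night style transfer model. Specifically, from the first fusion style transfer model to the k-th, the weight of the day style transfer model may gradually decrease and the weight of the night style transfer model may gradually increase. The electronic device may use these k fusion style transfer models to perform style transfer on the first image sequence to obtain the second image sequence. The styles of the first through n-th frames of the second image sequence may then change gradually from the day style to the night style.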
在一些实施例中,上述k个融合风格迁移模型和上述M个单风格迁移模型均为神经网络模型,且具有相同的神经网络结构。
上述用于生成融合风格迁移模型的单风格迁移模型是经过训练得到的。
在一些实施例中,电子设备可以获取训练数据集。上述训练数据集可包含一帧或多帧风格图像以及第一视频中的多帧内容图像。一帧或多帧风格图像的风格为经过训练的单风格迁移模型的输出图像的风格。上述内容图像可以为需要进行风格迁移的图像。电子设备可以利用待训练的单风格迁移模型处理第一视频中的多帧内容图像,得到多帧合成图像。
电子设备可以利用损失函数训练上述待训练的单风格迁移模型,得到经过训练的单风格迁移模型。其中,损失函数可以包括高层语义信息损失函数、风格损失函数、时域约束损失函数。上述高层语义信息损失函数由多帧内容图像的高层语义信息和多帧合成图像的高层语义信息确定。上述风格损失函数由多帧内容图像的风格和多帧合成图像的风格确定。上述时域约束损失函数由多帧合成图像中一帧合成图像的风格和与一帧合成图像相邻的多帧合成图像的风格确定。
在上述实施例中,电子设备在用于训练单风格迁移模型的损失函数中引入上述时域约束损失函数,可以考虑连续多帧内容图像之间的联系,减少单风格迁移模型在对视频中的多帧图像进行风格迁移处理时,相邻帧图像的风格出现跳变的概率。这样,训练好的风格迁移模型在对视频中的多帧图像进行风格迁移时,可以提高视频中连续多帧内容图像风格化效果的一致性,减少视频播放过程中的闪烁现象。
在一些实施例中,上述训练好的单风格迁移模型可存储在电子设备中。在需要利用融合风格迁移模型对视频进行风格迁移时,电子设备可以从本地获取单风格迁移模型进行融合。
可选的,上述训练好的单风格迁移模型可存储在云端。电子设备可以将需要进行风格迁 移的视频以及目标迁移风格上传至云端。云端可以利用融合风格迁移模型对视频进行风格迁移,并将得到的经过风格迁移的视频发送给电子设备。或者,电子设备100也可以仅将目标迁移风格发送给云端。云端可以根据上述目标迁移风格将需要融合的单风格迁移模型发送给电子设备。
在一些实施例中,电子设备可以开启摄像头采集得到第一视频,并根据第一视频得到上述第一图像序列中的n帧图像。其中,第一视频包含z帧图像。上述n帧图像是从上述z帧图像中抽取得到的。
在一些实施例中,电子设备在采集得到上述第一视频时,还可以对该第一视频进行防抖处理。上述防抖处理可以例如是电子防抖和/或光学防抖等方法中的防抖处理。
由上述实施例可知,电子设备可以对用户即时拍摄得到的视频进行抽帧以及风格迁移处理。这样,用户在短时间内拍摄出来的视频可以具有长时间拍摄视频的延时效果。例如,上述用户在1分钟内拍摄得到的延时视频可以具有原来需要拍摄12个小时甚至更长时间才能得到的从白天到黑夜快速渐变的延时效果。并且,电子设备对拍摄得到的视频进行防抖处理。这样,用户在拍摄延时视频时可以手持电子设备进行拍摄,而无需固定设备将用于拍摄的电子设备固定在一个地方。上述方法使得延时摄影突破了对拍摄场景、设备和时间的限制,提高了用户进行延时摄影的便利性和趣味性。
在一些实施例中,电子设备可以根据用户选择的第一视频从本地存储的视频中获取第一视频,并根据第一视频得到第一图像序列中的n帧图像。第一视频包含z帧图像,n帧图像是从z帧图像中抽取得到的。
也即是说,用户可以从图库应用程序等本地存储的视频中选择相应的视频。电子设备可以对用户选择的视频进行上述抽帧以及风格迁移处理,使得处理得到的视频可以具有长时间拍摄视频的延时效果。不限于本地存储的视频,电子设备还可以从云端获取视频进行上述抽帧以及风格迁移处理。
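In some embodiments, the frame extraction rate may be determined by the playback duration of the first image sequence selected by the user. The frame extraction rate may be the ratio of the playback duration of the first image sequence to the capture duration of the first video. For example, the capture duration of the first video is 1 minute, and the playback duration of the first image sequence selected by the user is 10 seconds. The electronic device may then extract the first image sequence from the frames of the first video at a ratio of 1:6.
With this frame extraction method, the user customizes the frame extraction rate for the first video by selecting the desired playback duration of the first image sequence. In this way, the maximum frame extraction rate that the electronic device can provide is not limited by the capture duration of the first video.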
在一些实施例中,电子设备可以将上述第二图像序列中的n帧图像按先后顺序串联保存为视频。该视频在播放的过程中可以呈现出上述M个风格按上述第一风格顺序变化的效果。
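In some embodiments, the electronic device may acquire a first image and segment it to obtain the n frames of images of the first image sequence.
The electronic device may segment the first image by cropping it with a sliding window. The length of the sliding window may be a first length, and the distance of each slide may be a first sliding distance. The electronic device may slide the window from one side of the first image to the other n-1 times, obtaining n frames of images that each have the first length. The first sliding distance may be less than the first length, so that adjacent images among the n frames overlap.
In some embodiments, the electronic device may crop one stitching region from each frame of the second image sequence to obtain n stitching regions. The n stitching regions do not overlap. The electronic device may stitch the n stitching regions into a second image and store it. The resolution of the second image is the same as that of the first image, and the high-level semantic information of the second image is the same as that of the first image.
The resolutions of the n stitching regions may be the same or different.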
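The sliding-window segmentation and the choice of non-overlapping stitching regions described above can be sketched as follows. This is a minimal one-dimensional sketch under stated assumptions, not the method of the application: the function names are hypothetical, the last window is assumed to land exactly on the far edge of the image, and each seam is placed in the middle of the overlap between neighbouring windows, a choice the text does not specify.

```python
from typing import List, Tuple


def window_starts(image_len: int, window_len: int, stride: int) -> List[int]:
    """Start offsets of the sliding windows (assumed layout, 1-D only)."""
    # Adjacent windows overlap only if the stride is smaller than the window.
    if not 0 < stride < window_len <= image_len:
        raise ValueError("need 0 < stride < window_len <= image_len")
    # Assumption: the last window ends exactly at the edge of the image.
    if (image_len - window_len) % stride != 0:
        raise ValueError("windows must tile the image exactly")
    count = (image_len - window_len) // stride + 1  # n windows, n-1 slides
    return [i * stride for i in range(count)]


def stitch_regions(image_len: int, window_len: int, stride: int) -> List[Tuple[int, int]]:
    """One non-overlapping [start, end) region per window, covering the image."""
    starts = window_starts(image_len, window_len, stride)
    bounds = [0]
    for left, right in zip(starts, starts[1:]):
        overlap_begin = right            # where the next window starts
        overlap_end = left + window_len  # where the current window ends
        bounds.append((overlap_begin + overlap_end) // 2)  # seam mid-overlap
    bounds.append(image_len)
    return list(zip(bounds, bounds[1:]))


if __name__ == "__main__":
    print(window_starts(1000, 400, 200))
    print(stitch_regions(1000, 400, 200))
```

With a 1000-pixel-wide image, a 400-pixel window, and a 200-pixel stride, the sketch yields four windows and four stitching regions of unequal width, which is consistent with the statement that the stitching regions may differ in resolution.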
上述第一图像可以是用户开启电子设备的摄像头即时拍摄得到的,或者可以是用户从电子设备的图库应用程序中选择的。
示例性的,目标迁移风格为昼夜转换风格。电子设备可以利用融合风格迁移模型对第一图像序列进行风格迁移处理。电子设备利用上述第二图像序列中的拼接区域进行拼接得到的第二图像从一侧至另一侧可以呈现从白天风格逐渐变化为黑夜风格的过程。
在上述方法中,由于第一图像序列中相邻图像存在重叠的部分。电子设备利用融合风格迁移模型对分别对第一图像序列进行风格迁移之后,从相邻图像截取的拼接区域的风格可以更加平滑地过渡。即上述对第一图像进行分割,以及从经过风格迁移得到的各图像截取拼接区域的方法可以提高第一图像风格化效果的平滑度。电子设备拼接得到的第二图像从一侧至另一侧的风格可以更加平滑地在上述M个风格中按第一风格顺序变化。
上述对第一图像进行分割得到第一图像序列并从第二图像序列中截取拼接区域进行拼接得到第二图像的实施例可以得到具有风格渐变效果的第二图像。这可以提高用户拍摄图像,特别是拍摄全景图的趣味性。
第二方面,本申请提供一种电子设备,该电子设备可包括显示屏、存储器、一个或多个处理器。该存储器可用于存储多个单风格迁移模型。该存储器还可用于存储计算机程序。该处理器可用于调用存储器中的计算机程序,使得电子设备执行上述第一方面中任一种可能的实现方式。
第三方面,本申请提供一种计算机存储介质,包括指令,当上述指令在电子设备上运行时,使得上述电子设备执行上述第一方面中任一种可能的实现方式。
第四方面,本申请实施例提供一种芯片,该芯片应用于电子设备,该芯片包括一个或多个处理器,该处理器用于调用计算机指令以使得该电子设备执行上述第一方面中任一种可能的实现方式。
第五方面,本申请实施例提供一种包含指令的计算机程序产品,当上述计算机程序产品在设备上运行时,使得上述电子设备执行上述第一方面中任一种可能的实现方式。
可以理解地,上述第二方面提供的电子设备、第三方面提供的计算机存储介质、第四方面提供的芯片、第五方面提供的计算机程序产品均用于执行本申请实施例所提供的方法。因此,其所能达到的有益效果可参考对应方法中的有益效果,此处不再赘述。
附图说明
图1A~图1F是本申请实施例提供的一些拍摄延时视频的用户界面示意图;
图2A~图2C是本申请实施例提供的一些播放延时视频的用户界面示意图;
图3是本申请实施例提供的一种电子设备得到融合风格迁移模型的方法示意图;
图4是本申请实施例提供的一种电子设备利用融合风格迁移模型对视频进行风格迁移的方法示意图;
图5是本申请实施例提供的一种训练风格迁移模型的方法流程图;
图6是本申请实施例提供的一种拍摄方法流程图;
图7A~图7G是本申请实施例提供的一些对视频进行风格迁移的用户界面示意图;
图8A~图8E是本申请实施例提供的一些拍摄全景图的用户界面示意图;
图9是本申请实施例提供的一种电子设备利用融合风格迁移模型对全景图进行风格迁移 的方法示意图;
图10是本申请实施例提供的电子设备100的结构示意图。
具体实施方式
本申请以下实施例中所使用的术语只是为了描述特定实施例的目的,而并非旨在作为对本申请的限制。如在本申请的说明书和所附权利要求书中所使用的那样,单数表达形式“一个”、“一种”、“所述”、“上述”、“该”和“这一”旨在也包括复数表达形式,除非其上下文中明确地有相反指示。还应当理解,本申请中使用的术语“和/或”是指并包含一个或多个所列出项目的任何或所有可能组合。
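Time-lapse photography is a photographic technique that compresses time.
In one possible implementation, the electronic device may capture images of a scenic spot from morning (e.g. 7:00) to night (e.g. 22:00) at the normal capture rate (e.g. 30 frames per second) to obtain an original video. The electronic device may then extract frames from the original video, for example 1 frame out of every 1800 frames, and concatenate the extracted frames in order of capture time to obtain a time-lapse video. At a playback rate of 30 frames per second, the electronic device thereby compresses a video with a playback time of 15 hours into a video with a playback time of 30 seconds. That is, the electronic device can present the changes of the scenic spot from 7:00 to 22:00 in a 30-second time-lapse video.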
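The arithmetic in this example (15 hours at 30 frames per second, keeping 1 frame in 1800, played back at 30 frames per second) can be checked with a short sketch. The function name and parameters are hypothetical and only illustrate the calculation.

```python
def timelapse_duration_s(capture_s: float, capture_fps: int,
                         keep_every: int, playback_fps: int) -> float:
    """Playback duration after keeping 1 frame out of every `keep_every`."""
    total_frames = int(capture_s * capture_fps)  # frames in the original video
    kept_frames = total_frames // keep_every     # frames left after extraction
    return kept_frames / playback_fps


if __name__ == "__main__":
    # 7:00 to 22:00 is 15 hours of capture.
    print(timelapse_duration_s(15 * 3600, 30, 1800, 30))
```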
在另一种可能的实现方式中,电子设备可以调整采集图像的速率来得到延时视频中包含的多帧图像。具体的,电子设备可以根据用户输入的延时速率或者预设的延时速率来采集图像。例如,延时速率为每秒采集2帧图像。然后,电子设备可以将在上述延时速率下采集得到的多帧图像串联,并以视频的形式播放。
可以看出,延时摄影一般需要花费较长的时间进行拍摄。并且为了避免拍摄视频出现抖动,用户在进行延时摄影时,往往需要利用三脚架等固定设备将用于拍摄的电子设备固定在一个地方。这对延时摄影的拍摄场景、设备以及时间都有极大的限制。用户手持电子设备(如手机)在短时间内就难以拍摄出具有延时效果(如白天到黑夜的快速变化)的视频。上述延时效果可以指将物体或者景物缓慢变化的过程压缩在较短的时间内,呈现出物体或者景物快速变化的过程。
本申请实施例提供一种图像处理方法。在该方法中,电子设备可以对在第一时间段内拍摄第一景物得到的第一视频进行防抖处理,得到第二视频。然后,电子设备可以对第二视频进行抽帧处理,得到第三视频。电子设备可以利用融合风格迁移模型对第三视频中的多帧图像进行风格迁移,得到第四视频。其中,上述融合风格迁移模型是电子设备至少融合了两个风格迁移模型得到的。例如,电子设备可以利用融合白天风格迁移模型和黑夜风格迁移模型得到的模型对第三视频中的多帧图像进行风格迁移。从而第四视频可以呈现出第一景物在时间上按照从白天到黑夜的先后顺序快速变化的效果。本申请实施例对上述第一时间段的长度不作限定。第一时间段可以是较短的时间段,例如30秒、1分钟等。
可以看出,电子设备可以对在短时间内拍摄得到的视频进行上述图像处理方法中的处理,使得在短时间内拍摄出来的视频可以具有长时间拍摄视频的延时效果。上述图像处理方法使得延时摄影突破了对拍摄场景、设备和时间的限制,提高了用户进行延时摄影的便利性和趣 味性。
为了便于理解本申请中的图像处理方法,这里对图像的风格、风格迁移、风格迁移模型、融合风格迁移模型的概念进行介绍。
1、图像的风格
图像的风格可以包括图像的纹理特征、图像的艺术表现形式。例如,图像的风格可以为卡通风格、漫画风格、油画风格、写实风格、浮世绘风格、白天风格、黑夜风格、春季风格、夏季风格、秋季风格、冬季风格、晴天风格、雨天风格等等。本申请实施例对图像的风格不作限定。
2、风格迁移
对图像进行风格迁移可以指将一幅具有风格迁移需要的第一图像与具有目标风格的第二图像融合,生成第三图像。上述融合的过程可以为利用风格迁移模型对第一图像进行处理的过程。上述风格迁移模型可用于输出具有上述目标风格的图像。上述风格迁移模型的说明可以参考下述第三点的介绍。
上述第三图像具有第一图像的内容中的高层语义信息和第二图像的风格。其中,图像的内容可以包括图像的低层语义信息和高层语义信息。低层语义信息可以指图像的颜色、纹理等。图像的低层语义信息也即为图像的风格。高层语义信息可以指图像所表达的最接近人类理解的事物。例如一幅包含沙子、蓝天、海水的图像。该图像的低层语义信息可以包括沙子、蓝天以及海水的颜色和纹理。该图像的高层语义信息可以为图像中包含沙子、蓝天、海水以及该图像是一幅海滩图像。
上述具有风格迁移需要的第一图像可以为内容图像。上述具有目标风格的第二图像可以为风格图像。对内容图形进行风格迁移得到的图像可以为合成图像。电子设备对内容图像进行风格迁移即为保存内容图像的高层语义信息,并将内容图像的风格替换为风格图像的风格。示例性的,内容图像为上述海滩图像。内容图像当前的风格为写实风格。风格图像的风格为卡通风格。电子设备对上述海滩图像进行风格迁移,可以将一幅写实风格的海滩图像转换为一幅卡通风格的海滩图像。进行风格迁移后得到的卡通风格的海滩图像中仍能呈现出沙子、蓝天、海水。但相比于写实风格的海滩图像的低层语义信息,该卡通风格的海滩图像的颜色、纹理等低层语义信息发生了改变。
3、风格迁移模型
风格迁移模型可用于接收内容图像,并生成合成图像。该合成图像可具有内容图像的高层语义信息和该风格迁移模型对应的风格。一个风格迁移模型可对应一种风格。例如,卡通风格迁移模型可对应卡通风格。卡通风格迁移模型可以将接收的内容图像的风格替换为卡通风格。风格迁移模型具体可以为神经网络模型。风格迁移模型可以利用大量训练数据得到。其中,一个训练数据可以由训练样本和与该训练样本对应的训练结果组成。上述训练样本可包括内容图像(前述第一图像)和风格图像(前述第二图像)。上述训练样本对应的训练结果可以为合成图像(前述第三图像)。
关于训练风格迁移模型的具体方法将在后续实施例中进行说明,这里先不展开介绍。
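4. Fusion style transfer model
A fusion style transfer model is a style transfer model obtained by fusing two or more style transfer models. The parameters of the models being fused may be given different weights. By changing these weights, the electronic device can change the style that the resulting fusion style transfer model corresponds to. For example, the electronic device may fuse a day style transfer model and a night style transfer model. By changing the weight of the parameters of the day style transfer model and the weight of the parameters of the night style transfer model, the electronic device can obtain a fusion style transfer model whose style lies between the day style and the night style (for example, a dusk style). The electronic device may use the fusion style transfer models obtained in this way to perform style transfer on each frame of the third video, with the weights of the day and night model parameters changing from frame to frame. The fourth video obtained in this way can present the first scene changing rapidly and gradually from day to night. In other words, a video that the user shoots in a short time can also have the time-lapse effect of a video shot over a long time.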
由于本申请涉及神经网络的应用,为了便于理解,下面对本申请实施例可能涉及的神经网络的相关术语进行介绍。
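1. Neural network
A neural network may be composed of neural units. A neural unit may be an operation unit that takes $x_s$ and an intercept of 1 as inputs, and its output may be given by formula (1):
$$\text{output} = f\left(\sum_{s=1}^{n} W_s x_s + b\right) \qquad (1)$$
where s = 1, 2, ..., n, n is a natural number greater than 1, $W_s$ is the weight of $x_s$, and b is the bias of the neural unit. f is the activation function of the neural unit, which introduces non-linearity into the neural network and converts the input signal of the neural unit into an output signal. The output signal of the activation function may serve as the input of the next convolutional layer. The activation function may be a sigmoid function. A neural network is a network formed by connecting many such single neural units, so that the output of one neural unit can be the input of another. The input of each neural unit may be connected to a local receptive field of the previous layer to extract the features of that local receptive field, which may be a region composed of several neural units.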
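The output of a single neural unit can be illustrated with a small sketch. It assumes the weighted-sum form described by the definitions above and a sigmoid activation, which the text names as one possible choice. The function names are hypothetical.

```python
import math
from typing import Sequence


def sigmoid(z: float) -> float:
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(x: Sequence[float], w: Sequence[float], b: float) -> float:
    """f(sum_s W_s * x_s + b) with f chosen as the sigmoid."""
    if len(x) != len(w):
        raise ValueError("x and w must have the same length")
    return sigmoid(sum(ws * xs for ws, xs in zip(w, x)) + b)


if __name__ == "__main__":
    print(neuron_output([1.0, 2.0], [0.5, -0.25], 0.0))
```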
2、卷积神经网络
卷积神经网络(convolutional neuron network,CNN)是一种带有卷积结构的神经网络。卷积神经网络包含了一个由卷积层和子采样层构成的特征抽取器。该特征抽取器可以看作是滤波器,卷积过程可以看作是使用一个可训练的滤波器与一个输入的图像或者卷积特征平面(feature map)做卷积。卷积层是指卷积神经网络中对输入信号进行卷积处理的神经元层。在卷积神经网络的卷积层中,一个神经元可以只与部分邻层神经元连接。一个卷积层中,通常包含若干个特征平面,每个特征平面可以由一些矩形排列的神经单元组成。同一特征平面的神经单元共享权重,这里共享的权重就是卷积核。共享权重可以理解为提取图像信息的方式与位置无关。这其中隐含的原理是:图像的某一部分的统计信息与其他部分是一样的。即意味着在某一部分学习的图像信息也能用在另一部分上。所以对于图像上的所有位置,都能使用同样的学习得到的图像信息。在同一卷积层中,可以使用多个卷积核来提取不同的图像信息,一般地,卷积核数量越多,卷积操作反映的图像信息越丰富。
卷积核可以以随机大小的矩阵的形式初始化,在卷积神经网络的训练过程中卷积核可以通过学习得到合理的权重。另外,共享权重带来的直接好处是减少卷积神经网络各层之间的连接,同时又降低了过拟合的风险。
3、损失函数
在训练神经网络的过程中,因为希望神经网络的输出尽可能的接近真正想要预测的值,所以可以通过比较当前网络的预测值和真正想要的目标值,再根据两者之间的差异情况来更新每一层神经网络的权重向量(当然,在第一次更新之前通常会有初始化的过程,即为神经网络中的各层预先配置参数),比如,如果网络的预测值高了,就调整权重向量让它预测低一 些,不断的调整,直到神经网络能够预测出真正想要的目标值或与真正想要的目标值非常接近的值。因此,就需要预先定义“如何比较预测值和目标值之间的差异”,这便是损失函数(loss function)或目标函数(objective function),它们是用于衡量预测值和目标值的差异的重要方程。其中,以损失函数举例,损失函数的输出值(loss)越高表示差异越大,那么神经网络的训练就变成了尽可能缩小这个loss的过程。
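4. Back propagation algorithm
A convolutional neural network may use the error back propagation (BP) algorithm to correct the parameter values of the initial model during training, so that the model's error loss becomes smaller and smaller. Specifically, passing the input signal forward until output produces an error loss, and the parameters of the initial model are updated by propagating the error loss information backward, so that the error loss converges. The back propagation algorithm is a backward pass driven by the error loss, and aims to obtain the optimal model parameters, such as the weight matrices.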
下面介绍本申请涉及的一种典型的拍摄场景。
如图1A所示,电子设备100可以包括摄像头193。其中,摄像头193可以为前置摄像头。摄像头193还可以包含后置摄像头。电子设备100可以显示图1A所示的用户界面210。用户界面210可包括应用图标显示区域211、具有常用应用程序图标的托盘212。其中:
应用图标显示区域211可包含图库图标211A。响应于作用在图库图标211A的用户操作,例如触摸操作,电子设备100可以开启图库应用程序,从而显示电子设备100中存储的图片和视频等信息。电子设备100中存储的图片和视频中包括电子设备100通过相机应用程序拍摄的照片和视频。应用图标显示区域211还可以包括更多的应用程序图标,例如邮件图标、音乐图标、运动健康图标等等,本申请实施例对此不作限定。
具有常用应用程序图标的托盘212可展示相机图标212A。响应于作用在相机图标212A上的用户操作,例如触摸操作,电子设备100可以开启相机应用程序,从而进行拍照以及录像等功能。其中,电子设备100开启相机应用程序时,可以开启摄像头193(前置摄像头和/或后置摄像头),来实现拍照以及录像等功能。具有常用应用程序图标的托盘212还可以展示更多的应用程序的图标,例如拨号图标、信息图标、联系人图标等等,本申请实施例对此不作限定。
用户界面210还可以包含更多或更少的内容,例如显示当前时间和日期的控件、显示天气的控件等等。可以理解的是,图1A仅仅示例性示出了电子设备100上的用户界面,不应构成对本申请实施例的限定。
响应于作用在相机图标212A的用户操作,电子设备100可以显示如图1B所示的用户界面220。用户界面220可包括预览区域221、闪光灯控件222、设置控件223、相机模式选项201、图库快捷控件202、快门控件203、摄像头翻转控件204。其中:
预览区域221可用于显示摄像头193实时采集的图像。电子设备可以实时刷新其中的显示内容,以便于用户预览摄像头193当前采集的图像。
闪光灯控件222可用于开启或者关闭闪光灯。
设置控件223可用于调整拍摄照片的参数(如分辨率、滤镜等)以及开启或关闭一些用于拍照的方式(如定时拍照、微笑抓拍、声控拍照等)等。设置控件223可用于设置更多其他拍摄的功能,本申请实施例对此不作限定。
相机模式选项201中可以显示有一个或多个拍摄模式选项。这一个或多个拍摄模式选项可以包括:大光圈模式选项201A、录像模式选项201B、拍照模式选项201C、延时摄影模式 选项201D和人像模式选项201E。这一个或多个拍摄模式选项在界面上可以表现为文字信息,例如“大光圈”、“录像”、“拍照”、“延时摄影”、“人像”。不限于此,这一个或多个摄像选项在界面上还可以表现为图标或者其他形式的交互元素(interactive element,IE)。当检测到作用于拍摄模式选项上的用户操作,电子设备100可以开启用户选择的拍摄模式。不限于图1B所示,相机模式选项201中还可以包含更多或更少的拍摄模式选项。用户可以通过在相机模式选项201中向左/右滑动来浏览其他拍摄模式选项。
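The gallery shortcut control 202 may be used to open the gallery application. In response to a user operation on the gallery shortcut control 202, such as a tap, the electronic device 100 may open the gallery application. In this way, the user can conveniently view the photos and videos taken without first exiting the camera application and then opening the gallery application. The gallery application is a picture management application on electronic devices such as smartphones and tablet computers, and may also be called an "album"; this embodiment does not limit the name of the application. The gallery application may support various operations on the pictures stored on the electronic device 100, such as browsing, editing, deleting, and selecting.
The shutter control 203 may be used to listen for the user operation that triggers taking a photo. The electronic device 100 may detect a user operation on the shutter control 203 and, in response, save the image in the preview area 221 as a picture in the gallery application. The electronic device 100 may also display a thumbnail of the saved image in the gallery shortcut control 202. That is, the user can tap the shutter control 203 to trigger taking a photo. The shutter control 203 may be a button or another form of control.
The camera flip control 204 may be used to listen for the user operation that triggers flipping the camera. The electronic device 100 may detect a user operation on the camera flip control 204, such as a tap, and in response switch the camera used for shooting, for example from the rear camera to the front camera, or from the front camera to the rear camera.
The user interface 220 may also contain more or less content, which is not limited in the embodiments of this application.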
下面对电子设备100在进行延时摄影时的一些用户界面进行介绍。
图1C~图1F示例性示出了电子设备100进行延时摄影的用户界面。
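As shown in FIG. 1B, in response to a user operation on the time-lapse photography mode option 201D, the electronic device 100 may display the user interface 230 shown in FIG. 1C. The basic controls of the user interface 230 are essentially the same as those of the user interface 220. In addition, the user interface 230 may contain the style options 231.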
风格选项231中可包含一个或多个风格选项。例如,昼夜转换风格选项231A、四季更迭风格选项231B、晴雨交替风格选项231C。这一个或多个风格选项在界面上可表现为文字信息。例如“昼夜转换”、“四季更迭”、“晴雨交替”。不限于此,这一个或多个风格选项在界面上还可以表现为图标或者其他形式的交互元素。每一个风格选项均可用于指示电子设备100对拍摄得到的延时视频进行风格迁移,将视频的风格转换为该风格选项对应的风格。
例如,上述昼夜转换风格选项231A对应的风格迁移模型可以为融合白天风格迁移模型和黑夜风格迁移模型得到的融合风格迁移模型。电子设备100可以利用该融合风格迁移模型对视频中的多帧图像进行风格迁移,得到可以呈现拍摄的内容从白天到黑夜快速渐变的过程的视频。上述四季更迭转换风格选项231B对应的风格迁移模型可以为春季风格迁移模型、夏季风格迁移模型、秋季风格迁移模型、冬季风格迁移模型得到的融合风格迁移模型。电子设备100可以利用该融合风格迁移模型对视频中的多帧图像进行风格迁移,得到可以呈现拍摄的内容从春季到夏季,从夏季到秋季,再从秋季到冬季快速渐变的过程的视频。上述晴雨交替风格选项231C对应的风格迁移模型可以为融合晴天风格迁移模型和雨天风格迁移模型得到的融合风格迁移模型。电子设备100可以利用该融合风格迁移模型对视频中的多帧图像 进行风格迁移,得到可以呈现拍摄的内容从晴天到雨天快速渐变的过程的视频。
上述风格选项231中还可以包含更多或更少的风格选项。不限于图1C所示的融合风格迁移风格模型,风格选项231中也可以包含单个风格迁移模型(如卡通风格迁移模型)。
本申请实施例对上述利用融合风格迁移模型进行风格迁移得到的视频的风格的具体变化方式不作限定。例如,当接收到用户选择的风格为昼夜转换风格,电子设备100也可以对视频进行风格迁移处理,使得视频在播放的过程中呈现拍摄的内容从黑夜到白天快速渐变的过程。
如图1C所示,昼夜转换风格选项231A、四季更迭转换风格选项231B、晴雨交替风格选项231C均为未选中状态。例如,上述风格选项的颜色均为白色。响应于作用在昼夜转换风格选项231A的用户操作,例如触摸操作,电子设备100可以显示如图1D所示的用户界面230。在图1D中,昼夜转换风格选项231A可以呈现被选中状态。例如,昼夜转换风格选项231A的颜色可以变为灰色。本申请实施例对风格选项呈现未选中状态和被选中状态的表现方式不作限定。
响应于作用在快门控件203的用户操作,例如触摸操作,电子设备100可以显示如图1E所示的用户界面230。在图1E中,用户界面230可包含时间选择框232。该时间选择框232可用于用户选择生成的延时视频的时间长度。时间选择框232可包括提示语232A、时间选项232B、确认控件232C和取消控件232D。其中:
提示语232A可用于提示用户选择生成的延时视频的时间长度。例如,提示语232A可包含文字提示“请确定延时摄影得到的视频的时长”。
时间选项232B可用于用户选择延时视频的时间长度。例如10秒。
确认控件232C可用于指示电子设备100开始进行延时摄影。其中,电子设备100可以存储时间选项232B所指示的时间长度。进一步的,电子设备100可以对拍摄得到的视频进行处理,以得到时间长度为时间选项232B所指示的时间长度(即用户选择的时间长度)的延时视频。
取消控件232D可用于用户取消选择延时视频的时间长度。响应于作用在取消控件232D的用户操作,电子设备100可以显示如图1D所示的用户界面230。
本申请实施例对上述时间选择框的具体表现形式不作限定。
电子设备100在开始进行延时摄影前,可以获取用户想要生成的延时视频的时间长度。经过拍摄得到原始视频后,电子设备可以对该原始视频进行抽帧,使得生成的延时视频的时间长度为用户想要的时间长度。在一些实施例中,电子设备可以通过提供延时速率来确定在拍摄时采集图像的速率或者对拍摄得到的原始视频的抽帧率。但在这种方式中,电子设备可提供的最高延时速率往往是有限的。当拍摄的时间过长,电子设备生成的延时视频的时间长度仍然较长,延时视频所呈现的延时效果不明显。相比于上述实施例中的实现方式,本申请一些实施例在进行延时摄影时确定的抽帧率可以根据实际拍摄的时间长度和上述时间选项232B指示的时间长度来确定。即用户可以通过在上述时间选项232B中选择想要生成的延时视频的时间长度来自定义对原始视频的抽帧率。
如图1E所示,时间选项232B指示的时间长度为10秒。响应于作用在确认控件232C的用户操作,电子设备100可以显示如图1F所示的延时摄影界面240。延时摄影界面240中可包括预览区域221、拍摄时间指示符241和停止拍摄控件205。其中:
预览区域221可以显示在进行延时摄影时,电子设备100通过摄像头采集的图像。当结束延时摄影时,电子设备100可以将在延时摄影过程中(即从开始进行延时摄影到结束延时 摄影的时间段)先后显示在预览区域221中的一系列图像存储为原始视频。这样,电子设备100可以对该原始视频进行防抖处理、抽帧处理以及风格迁移等处理,来得到延时视频。
拍摄时间指示符241可用于指示电子设备100进行延时摄影已经拍摄的时间长度。如图1F所示,拍摄时间指示符241中包含“00:01:00”可以指示电子设备100进行延时摄影已经拍摄了1分钟。
停止拍摄控件205可用于结束延时摄影。如图1F所示,在拍摄的时间长度为1分钟时,响应于作用在停止拍摄控件205的用户操作,电子设备100可以结束延时摄影。其中,电子设备100可以得到时间长度为1分钟的原始视频。
电子设备100可以根据用户选择的风格以及想要生成延时视频的时间长度,对原始视频进行防抖处理、抽帧处理以及风格迁移,来得到延时视频。
图2A~图2C示例性示出了电子设备100播放在图1C~图1F所示拍摄过程中得到的延时视频的用户界面。
在播放延时视频时,电子设备100可以显示如图2A所示的视频播放界面250。视频播放界面250可包括时间控件251、图像显示区域252、暂停控件253、进度条254、视频已播放时间255和视频总时长256。其中:
时间控件251可以指示电子设备100存储该延时视频的时间。例如2020年11月9日上午8点过8分。
图像显示区域252可用于显示延时视频中包含的一帧帧图像。
暂停控件253可用于暂停播放延时视频。
进度条254可用于对比视频已播放时间和视频总时长,指示视频播放的进度。
视频已播放时间255可用于指示视频已播放的时间。
视频总时长256可用于指示延时视频的总时长。由图2A可以看出,该延时视频的总时长为10秒。即电子设备100根据图1E所示的时间选项232B指示的时间长度将拍摄时间长度为1分钟的视频,处理为时间长度为10秒的延时视频。那么,电子设备100可以将在1分钟内拍摄的内容在10秒的时间内呈现出来。
由图2A~图2C所示的视频播放过程可以看出,该延时视频是经过风格迁移的。其中,根据用户选择的风格选项,电子设备100可以利用对应的风格迁移模型对视频中的多帧图像分别进行风格迁移。例如,根据昼夜转换风格选项231A,电子设备100可以利用融合有白天风格迁移模型和黑夜风格迁移模型的融合风格迁移模型对视频进行风格迁移。经过上述风格迁移得到的延时视频可以呈现出拍摄的内容从白天到黑夜快速渐变的过程。
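In FIGS. 2A to 2C, the images in the time-lapse video retain the high-level semantic information (such as trees and a river) of the images in the original video, but the style changes gradually from the day style to the night style between the first frame and the last frame. As shown in FIG. 2A, when the time-lapse video has played to the 2nd second, the style of the image is the day style. That is, the image at the 2nd second presents the captured content (such as trees and a river) as it appears in the daytime. As shown in FIG. 2B, when the time-lapse video has played to the 4th second, the style of the image lies between the day style and the night style (e.g. a dusk style). That is, the image at the 4th second presents the captured content as it appears at dusk. As shown in FIG. 2C, when the time-lapse video has played to the 8th second, the style of the image is the night style. That is, the image at the 8th second presents the captured content as it appears at night.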
上述图2A~图2C仅是对图像的风格从白天风格逐渐变化为黑夜风格的示例性说明,不对对应风格的图像的具体呈现内容产生限定。
由上述拍摄和播放延时视频的实施例可以看出,用户在短时间内拍摄出来的视频可以具有长时间拍摄视频的延时效果。例如,上述用户在1分钟内拍摄得到的延时视频可以具有原来需要拍摄12个小时甚至更长时间才能得到的从白天到黑夜快速渐变的延时效果。另外,电子设备可以对拍摄得到的视频进行防抖处理。这样,用户在拍摄延时视频时可以手持电子设备进行拍摄,而无需固定设备将用于拍摄的电子设备固定在一个地方。上述图像处理方法使得延时摄影突破了对拍摄场景、设备和时间的限制,提高了用户进行延时摄影的便利性和趣味性。
下面具体介绍本申请实施例中电子设备100进行防抖处理的实现方式。
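Shake in a captured video is generally caused by changes in the pose of the electronic device during shooting. The electronic device can compute how its pose changed during shooting and process each frame of the video accordingly to remove the shake. Specifically, the electronic device 100 may use motion sensors (such as a gyroscope sensor and an acceleration sensor) to compute the change in its pose during shooting, and determine from it the original motion path during shooting. The electronic device 100 may then smooth the original motion path (that is, remove the shaky parts of the path) to obtain the pose changes it would have had in a steady shooting state. Based on the transformation between the actual pose of the electronic device 100 when a frame was captured and the pose it would have had for that frame in the steady shooting state, the electronic device 100 may perform image registration on the frame to obtain the coordinates that each pixel of the frame would have in the steady shooting state. The electronic device 100 may concatenate the registered frames in order of capture time to obtain a more stable video.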
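The path-smoothing step of electronic stabilization can be sketched for a single pose component. This is only an illustration under stated assumptions: the text does not say how the path is smoothed, so a centered moving average is assumed here, the pose is reduced to one number per frame (for example a rotation angle), and the function names are hypothetical. A real implementation would convert each per-frame correction into an image warp for registration.

```python
from typing import List, Sequence


def smooth_path(path: Sequence[float], radius: int) -> List[float]:
    """Centered moving average over the per-frame pose values (assumed method)."""
    smoothed = []
    for i in range(len(path)):
        lo = max(0, i - radius)
        hi = min(len(path), i + radius + 1)
        window = path[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed


def corrections(path: Sequence[float], radius: int) -> List[float]:
    """Per-frame offset from the actual pose to the smoothed (steady) pose."""
    return [s - p for p, s in zip(path, smooth_path(path, radius))]


if __name__ == "__main__":
    shaky = [0.0, 3.0, 0.0]
    print(smooth_path(shaky, 1))
    print(corrections(shaky, 1))
```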
不限于上述电子防抖的实现方式,电子设备100还可以通过光学防抖的方法来减少或者消除拍摄得到的视频的抖动。例如,电子设备100摄像头的镜片组中包含磁力悬浮镜片。在拍摄过程中,电子设备100可以利用运动传感器检测到抖动。根据运动传感器的测量值,电子设备100可以控制磁力悬浮镜片,对光路进行补偿,避免光路发生抖动。这样,电子设备100可以减少或者消除拍摄得到的视频的抖动。
在一些实施例中,电子设备100还可以结合上述电子防抖和光学防抖的方法进行防抖处理。
本申请实施例对电子设备100进行防抖处理的方法不作限定,上述防抖处理的方法还可以参考现有技术中其他的视频防抖方法。
下面具体介绍本申请实施例中电子设备100融合多个风格迁移模型并利用融合风格迁移模型对视频进行风格迁移的一种实现方式。
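1. Fusing multiple style transfer models
FIG. 3 shows an example flow in which the electronic device 100 fuses M style transfer models, where M is a positive integer greater than or equal to 2.
The first style transfer model, the second style transfer model, ..., and the M-th style transfer model are all trained style transfer models, and each corresponds to a single style. For example, if the style corresponding to the first style transfer model is the night style, the first style transfer model can convert the style of an input image into the night style.
The M style transfer models may specifically be neural network models, such as convolutional neural network models, and they all have the same network structure.
The electronic device 100 may fuse the M style transfer models into one fusion style transfer model with a specific style by interpolation. Specifically, the electronic device 100 may interpolate the parameters at the same position in the M style transfer models and use the interpolated value as the parameter of the fusion style transfer model at that position.
The interpolation of the parameters at the same position in the M style transfer models may follow formula (2):
$$\theta_{interp} = \alpha_1\theta_1 + \alpha_2\theta_2 + \dots + \alpha_i\theta_i + \dots + \alpha_M\theta_M \qquad (2)$$
where i is a positive integer greater than or equal to 1 and less than or equal to M. $\theta_i$ denotes the parameter of the i-th style transfer model at a first position, which may be any position in the i-th style transfer model. The parameter may be, for example, the bias b of a neural unit in the i-th style transfer model, or the weights $W_s$ of the neural units in the layer preceding that neural unit. $\alpha_i$ denotes the fusion weight of the i-th style transfer model. $\alpha_i$ is greater than or equal to 0, and $\alpha_1 + \alpha_2 + \dots + \alpha_i + \dots + \alpha_M = 1$. The embodiments of this application place no other limits on the value of $\alpha_i$. $\theta_{interp}$ denotes the parameter obtained by interpolation, that is, the parameter of the fusion style transfer model at the first position.
According to formula (2), the electronic device 100 can determine the value of the fusion style transfer model's parameter at every position, and thereby obtain the fusion style transfer model that fuses the M style transfer models. The network structure of the fusion style transfer model is the same as that of the M style transfer models, and the first position of the i-th style transfer model and the first position of the fusion style transfer model are the same position in the same network structure.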
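The position-wise interpolation of formula (2) can be sketched with plain Python containers. This is a minimal sketch under stated assumptions: each model is represented as a dictionary that maps a hypothetical parameter name to a flat list of floats, all models share the same names and shapes (the same network structure), and the function name `fuse_models` is illustrative only.

```python
from typing import Dict, List, Sequence

# Hypothetical representation: parameter name -> flat list of values.
Params = Dict[str, List[float]]


def fuse_models(models: Sequence[Params], alphas: Sequence[float]) -> Params:
    """theta_interp = sum_i alpha_i * theta_i at every parameter position."""
    if len(models) < 2 or len(models) != len(alphas):
        raise ValueError("need M >= 2 models and one weight per model")
    if any(a < 0 for a in alphas) or abs(sum(alphas) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    fused: Params = {}
    for name in models[0]:
        # Group the values that sit at the same position in every model.
        columns = zip(*(m[name] for m in models))
        fused[name] = [sum(a * v for a, v in zip(alphas, vals)) for vals in columns]
    return fused


if __name__ == "__main__":
    day = {"conv1.weight": [1.0, 2.0], "conv1.bias": [0.0]}
    night = {"conv1.weight": [3.0, 6.0], "conv1.bias": [1.0]}
    print(fuse_models([day, night], [0.25, 0.75]))
```

Because only parameters are mixed, the fused model keeps the structure of its inputs, which is why the text requires the M models to share one network structure.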
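The following takes the fusion of a day style transfer model and a night style transfer model by the electronic device 100 as an example.
The parameter of the resulting fusion style transfer model at the first position may be calculated with formula (3):
$$\theta_{interp} = \alpha_{day}\theta_{day} + \alpha_{night}\theta_{night} \qquad (3)$$
where $\alpha_{day}$ and $\alpha_{night}$ denote the fusion weights of the day style transfer model and the night style transfer model respectively. $\alpha_{day}$ and $\alpha_{night}$ are both greater than or equal to 0, and $\alpha_{day} + \alpha_{night} = 1$. $\theta_{day}$ and $\theta_{night}$ denote the parameters of the day style transfer model and the night style transfer model at the first position.
The style corresponding to this fusion style transfer model lies between the day style and the night style and is determined by the values of $\alpha_{day}$ and $\alpha_{night}$. The larger $\alpha_{day}$ and the smaller $\alpha_{night}$, the closer the style is to the day style. The smaller $\alpha_{day}$ and the larger $\alpha_{night}$, the closer the style is to the night style.
When the electronic device 100 uses the fusion style transfer model to perform style transfer on the frames of a video, it can change the values of $\alpha_{day}$ and $\alpha_{night}$ to obtain a video that presents the time-lapse effect of a rapid, gradual change from day to night. Specifically, across the fusion style transfer models used for the first through last frames of the video, the value of $\alpha_{day}$ may gradually decrease while the value of $\alpha_{night}$ gradually increases.
For example, the video contains n frames of images, where n is an integer greater than 1. In the fusion style transfer model that the electronic device 100 uses for the j-th of these n frames, the parameter at the first position may be calculated with formula (4):
$$\theta_{interp} = \frac{n-j+1}{n}\,\theta_{day} + \frac{j-1}{n}\,\theta_{night} \qquad (4)$$
where j is an integer greater than or equal to 1 and less than or equal to n, and the first position is any position in the fusion style transfer model.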
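2. Performing style transfer on a video with fusion style transfer models
When the electronic device 100 uses fusion style transfer models to perform style transfer on the frames of a video, it can adjust the weights of the style transfer models being fused so that the video changes gradually from one style to another. The closer the style of a fusion style transfer model's output is to the style of one of the fused models' output, the larger the weight of that fused model.
The electronic device 100 may determine the style transfer models to fuse according to the style option in the foregoing embodiments. For example, if the style option is the day-night transition style option, the electronic device 100 may determine that the models to fuse include the day style transfer model and the night style transfer model. After the electronic device performs style transfer on a video with fusion style transfer models that fuse the day and night style transfer models, the style of the images in the video can change gradually from the day style to the night style. That is, during playback the style-transferred video can present the scene changing over time from day to night.
FIG. 4 shows an example of how the electronic device 100 performs style transfer on a video using fusion style transfer models that fuse the day style transfer model and the night style transfer model.
The electronic device 100 may interpolate and fuse the day style transfer model and the night style transfer model according to formula (4).
As shown in FIG. 4, the video contains n frames of images. The model used for style transfer on the j-th frame may be fusion style transfer model j, whose parameter at the first position may be $\frac{n-j+1}{n}\theta_{day} + \frac{j-1}{n}\theta_{night}$. The electronic device 100 may compute the parameters of fusion style transfer model j at all positions to obtain fusion style transfer model j.
The electronic device 100 may use fusion style transfer models 1 to n to perform style transfer on frames 1 to n respectively. The styles corresponding to fusion style transfer models 1 to n are style 1 to style n. According to formula (4), fusion style transfer model 1 is the day style transfer model, so style 1 is the day style. Fusion style transfer model n gives the night style transfer model a weight of (n-1)/n, so it is close to the night style transfer model and style n is approximately the night style.
As can be seen from FIG. 4, the styles of frames 1 to n are style 1 to style n, changing gradually from the day style to the night style. During playback, the style-transferred video can present a rapid, gradual change from day to night.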
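The per-frame weight schedule described above can be sketched as follows, using the expression for fusion style transfer model j given in the description of FIG. 4. The function name is hypothetical. Note that with these weights the last model keeps a day weight of 1/n, so it approaches the pure night model only as n grows. A schedule with denominators n-1 would reach it exactly, but that variant is an assumption and is not stated in the source.

```python
from typing import Tuple


def day_night_weights(j: int, n: int) -> Tuple[float, float]:
    """(alpha_day, alpha_night) for the j-th of n frames, with j 1-based."""
    if not 1 <= j <= n:
        raise ValueError("j must satisfy 1 <= j <= n")
    return (n - j + 1) / n, (j - 1) / n


if __name__ == "__main__":
    n = 4
    for j in range(1, n + 1):
        print(j, day_night_weights(j, n))
```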
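In some embodiments, the number of fusion style transfer models that the electronic device 100 obtains from the M style transfer models may be smaller than the number of frames in the video to be style-transferred. In that case, one fusion style transfer model may perform style transfer on one frame or on several consecutive frames of the video.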
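When there are fewer fusion models than frames, each model has to serve a run of consecutive frames. The text does not say how frames are assigned, so the following sketch assumes an even split. The function name is hypothetical.

```python
def model_index_for_frame(frame: int, n: int, k: int) -> int:
    """Map a 1-based frame index to a 1-based fusion model index.

    Assumption: the n frames are divided into k runs of consecutive
    frames that are as equal in length as possible.
    """
    if not (1 <= k <= n and 1 <= frame <= n):
        raise ValueError("need 1 <= k <= n and 1 <= frame <= n")
    return (frame - 1) * k // n + 1


if __name__ == "__main__":
    print([model_index_for_frame(f, 6, 3) for f in range(1, 7)])
```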
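The following describes a method for training a style transfer model provided by the embodiments of this application.
The device that trains the style transfer model may be a training device. The trained model is a style transfer model corresponding to one style, for example the night style. Training a night style transfer model is used as the example here.
In one possible implementation, the training set for the night style transfer model may contain content images that need style transfer and night style images. The training device may input a content image into the night style transfer model to be trained to obtain a composite image. The training device may then compute the loss of the loss function. This loss can represent both the gap between the style of the composite image and the style of the night style image, and the gap in high-level semantic information between the composite image and the content image that was input to the model. The higher the loss, the larger these two gaps. Based on the computed loss, the training device may adjust the parameters of the night style transfer model to be trained with the back propagation algorithm, in the direction that lowers the loss (that is, that reduces both the style gap and the high-level semantic gap). When the loss is below a preset threshold, the training device obtains the trained night style transfer model.
Unlike style transfer on individual images, style transfer on the frames of a video must take into account that the frames are consecutive. After a style transfer model trained as described above is applied to the frames of a video, the style may jump between two adjacent composite frames, and undesired flicker may appear during playback. To make the stylization of consecutive content frames more consistent and to reduce flicker during playback, the training device may take the relationship between consecutive content frames into account when training the style transfer model. Specifically, it may introduce a multi-frame temporal loss into the loss function.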
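FIG. 5 shows a flowchart of another method for training a style transfer model. The method is particularly suitable for training style transfer models that are applied to videos. As shown in FIG. 5, the training method may include steps S101 to S104. The training set used in this method may contain videos that need style transfer.
S101. The training device inputs the r-th content frame of a video into the style transfer model to be trained and computes the loss loss_cur corresponding to the r-th content frame.
The training device may train the style transfer model with the first through last content frames of the video in turn. The loss corresponding to the r-th content frame may be computed as described in the foregoing implementation.
S102. The training device inputs the h content frames preceding the r-th content frame into the style transfer model to be trained to obtain h composite frames.
r and h are both positive integers, and h is less than r.
S103. The training device computes the gap between the composite image obtained from the r-th content frame and each of the h composite frames to obtain the multi-frame temporal loss $L_{ct}$.
The training device may compute the multi-frame temporal loss $L_{ct}$ with formula (5):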
Figure PCTCN2021135353-appb-000003
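where N denotes the style transfer model to be trained. $f_{cur}$ denotes the current content frame used for training, that is, the r-th content frame of the video. $N(f_{cur})$ denotes the composite image obtained by inputting the r-th content frame into the style transfer model to be trained. $f_{pre\_i}$ denotes the i-th content frame before the r-th content frame, and $N(f_{pre\_i})$ denotes the composite image obtained by inputting that frame into the style transfer model to be trained. $\lambda_{pre\_i}$ denotes the weight coefficient of the i-th content frame before the r-th content frame. $\lambda_{pre\_i}$ is greater than or equal to 0, and $\lambda_{pre\_1} + \dots + \lambda_{pre\_i} + \dots + \lambda_{pre\_h} = 1$.
S104. Based on the loss loss_cur and the multi-frame temporal loss $L_{ct}$, the training device adjusts the parameters of the style transfer model to be trained with the back propagation algorithm.
The training device may take the sum of loss_cur and the multi-frame temporal loss $L_{ct}$ as the loss for the r-th content frame, and then adjust the parameters of the style transfer model to be trained with the back propagation algorithm.
The training device may train the style transfer model with multiple videos. When the loss that includes the multi-frame temporal loss is below a preset threshold, the training device obtains the trained style transfer model. When this model performs style transfer on the frames of a video, it makes the stylization of consecutive content frames more consistent and reduces flicker during playback.
The multi-frame temporal loss is not limited to the calculation indicated by formula (5). The training device may also use other methods to compute the gap between the composite image obtained from the r-th content frame and the h composite frames.
In addition, the multi-frame temporal loss is not limited to the h content frames preceding the r-th content frame. The training device may also use several frames following the r-th content frame to compute it.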
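Steps S103 and S104 can be sketched as a weighted sum of per-frame gaps. Formula (5) is not legible in this copy, so the distance measure below, a mean squared difference, is an assumption. The text itself notes that other ways of measuring the gap may be used. Images are represented as flat lists of pixel values, and all function names are hypothetical.

```python
from typing import Sequence


def mean_squared_diff(a: Sequence[float], b: Sequence[float]) -> float:
    """Assumed gap measure between two composite images (flat pixel lists)."""
    if len(a) != len(b) or not a:
        raise ValueError("images must be non-empty and of equal size")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def multi_frame_temporal_loss(current_output: Sequence[float],
                              previous_outputs: Sequence[Sequence[float]],
                              weights: Sequence[float]) -> float:
    """L_ct as a weighted sum of gaps between N(f_cur) and each N(f_pre_i)."""
    if len(previous_outputs) != len(weights):
        raise ValueError("one weight per previous frame is required")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    return sum(w * mean_squared_diff(current_output, prev)
               for w, prev in zip(weights, previous_outputs))


def total_loss(loss_cur: float, l_ct: float) -> float:
    """S104: the loss for frame r is the sum of loss_cur and L_ct."""
    return loss_cur + l_ct


if __name__ == "__main__":
    cur = [1.0, 1.0]
    prevs = [[1.0, 1.0], [3.0, 1.0]]
    l_ct = multi_frame_temporal_loss(cur, prevs, [0.5, 0.5])
    print(l_ct, total_loss(2.0, l_ct))
```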
在一些实施例中上述训练设备与本申请中的电子设备100可以为同一个设备。
可以理解的,由于引入多帧时域损失训练得到的风格迁移模型可以减小经过风格迁移的 视频中连续多帧图像风格的差异,那么电子设备利用这些风格迁移模型中的两个或多个进行插值融合得到的融合风格迁移模型也可以减小经过风格迁移的视频中连续多帧图像风格的差异。也即是说,电子设备根据前述实施例中的方法利用融合风格迁移模型对视频进行风格迁移后,视频中相邻帧图像的风格可以平滑过渡,视频播放过程中的由于相邻帧图像的风格跳变导致的闪烁现象得以减少。
在一些实施例中,上述训练好的风格迁移模型(如白天风格迁移模型、黑夜风格迁移模型)可存储在电子设备100中。在需要利用融合风格迁移模型对视频进行风格迁移时,电子设备100可以从本地获取风格迁移模型进行融合。
在另一些实施例中,上述训练好的风格迁移模型可存储在云端。电子设备100将需要进行风格迁移的视频以及被选择的风格选项(如图1D所示的昼夜转换风格选项231A)上传至云端。云端可以利用融合风格迁移模型对视频进行风格迁移,并将得到的经过风格迁移的视频发送给电子设备100。可选的,电子设备100也可以仅将被选择的风格选项发送给云端。云端可以根据上述风格选项将需要融合的风格迁移模型发送给电子设备100。
本申请实施例对上述训练好的风格迁移模型的存储位置不作具体限定。
下面介绍本申请提供的一种拍摄方法。
图6示例性示出了本申请提供的一种拍摄方法的流程图。该方法可包括步骤S201~S207。其中:
S201、电子设备100开启相机应用程序和摄像头。
响应于开启相机应用程序的用户操作,例如图1A所示作用于相机图标212A的触摸操作,电子设备100可以开启相机应用程序和摄像头。
S202、电子设备100接收到用于选择相机模式中延时摄影模式的第一用户操作,显示用于确定延时视频的风格的选项。
相机应用程序中可包括多种相机模式。这多种相机模式用于实现不同的拍摄功能。上述第一用户操作可以例如是图1B所示作用于延时摄影选项201D的用户操作。
响应于该第一用户操作,电子设备100可以在用户界面上显示用于确定延时视频的风格的选项。例如,如图1C所示的对拍摄得到的视频进行风格迁移的风格选项231。
上述延时视频可以为摄像头在进行延时摄影过程中采集的图像在经过下述步骤S205~S207处理后,按照采集时间的先后顺序串联得到的视频。其中,用户可以通过图库应用程序来查看电子设备得到的延时视频。
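S203. The electronic device 100 receives a second user operation for selecting the style and duration of the time-lapse video, and stores the selected style and duration.
The second user operation may be, for example, the user operation on the day-night transition style option 231A shown in FIG. 1C, together with the user operation on the confirmation control 232C after a time is selected in the time option 232B shown in FIG. 1E.
The selected style option may be used to instruct the electronic device 100 to perform style transfer on the video captured by the camera according to that option. For example, if the selected style option is the day-night transition style option 231A, the electronic device 100 may obtain the day style transfer model and the night style transfer model from local storage or from the cloud. Once the number of frames in the time-lapse video is determined, the electronic device 100 may interpolate and fuse the day style transfer model and the night style transfer model, and use the resulting fusion style transfer models to perform style transfer on the video.
The selected duration may be used to indicate the frame extraction rate at which the electronic device 100 extracts frames from the video captured by the camera.
In some embodiments, upon receiving the first user operation, the electronic device 100 may first display the option for determining the duration of the time-lapse video, and afterwards display the options for determining its style.
Alternatively, upon receiving the first user operation, the electronic device 100 may display the options for determining the style and the duration of the time-lapse video at the same time. The embodiments of this application do not limit the manner in which the electronic device 100 displays these options.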
S204、电子设备100接收到开始进行延时摄影和结束延时摄影的用户操作,得到第一视频,第一视频包含从开始进行延时摄影到结束延时摄影过程中摄像头采集的图像。
上述开始进行延时摄影的用户操作可以例如是图1E所示在时间选项232B中选择时间后作用在确认控件232C的用户操作。上述结束延时摄影的用户操作可以例如是图1F所示作用在停止拍摄控件205的用户操作。上述从开始进行延时摄影到结束延时摄影的这一过程为进行延时摄影的过程。
电子设备100的摄像头在延时摄影的过程中可以按照正常录像采集图像的速率来采集图像(例如每秒采集30帧图像)。电子设备100可以将摄像头在延时摄影过程中采集的图像按照采集时间的先后顺序串联,得到第一视频。
S205、电子设备100对第一视频进行防抖处理,得到第二视频。
根据前述实施例中对视频进行防抖处理的方法,电子设备100可以对第一视频进行防抖处理,得到第二视频。这里对防抖处理的具体方法不再赘述。
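S206. The electronic device 100 extracts frames from the second video according to the duration selected by the user in the second user operation to obtain a third video, whose duration is the duration selected by the user.
The electronic device 100 may determine the frame extraction rate from the duration of the time-lapse capture and the duration selected by the user in the second user operation. It may then extract frames from the second video at that rate and concatenate the extracted frames in order of capture time to obtain the third video.
For example, as shown in FIG. 1F, when the time-lapse capture has run for 1 minute, the electronic device 100 receives the user operation on the stop control 205 and may stop shooting. That is, the time-lapse capture lasts 1 minute. As shown in FIG. 1E, the duration selected by the user in the second user operation is 10 seconds. The electronic device 100 may then determine a frame extraction rate of 1:6, that is, it may extract 1 frame out of every 6 frames. In one possible implementation, the electronic device 100 extracts frames at equal intervals, taking the 1st, 7th, 13th, ... frames of the second video. The electronic device 100 may then concatenate the extracted frames in order of capture time to obtain the third video.
Frame extraction is not limited to equal intervals. The electronic device 100 may also extract frames in other ways according to the frame extraction rate.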
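The equal-interval case of step S206 can be sketched as follows. The function names are hypothetical, and rounding the duration ratio to a whole step is an assumption for cases where the two durations do not divide evenly.

```python
from typing import List


def decimation_step(capture_s: float, target_s: float) -> int:
    """Keep 1 frame out of every `step` frames; 60 s down to 10 s gives 6."""
    if capture_s <= 0 or target_s <= 0:
        raise ValueError("durations must be positive")
    # Assumption: round to the nearest whole step, never below 1.
    return max(1, round(capture_s / target_s))


def equal_interval_indices(num_frames: int, step: int) -> List[int]:
    """0-based indices of the frames kept by equal-interval extraction."""
    return list(range(0, num_frames, step))


if __name__ == "__main__":
    step = decimation_step(60, 10)
    kept = equal_interval_indices(60 * 30, step)   # 1 minute at 30 fps
    print(step, [i + 1 for i in kept[:3]], len(kept) / 30)
```

In 1-based terms the kept frames are the 1st, 7th, 13th, and so on, matching the example in the text, and 300 kept frames play for 10 seconds at 30 frames per second.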
S207、电子设备100根据第二用户操作中用户选择的风格,利用融合风格迁移模型对第三视频进行风格迁移,并将经过风格迁移的视频保存为延时视频。
电子设备100可以根据第二用户操作中用户选择的风格,确定需要融合的风格迁移模型。进一步的,电子设备100可以根据第三视频包含的图像的帧数,确定对第三视频中各帧图像进行风格迁移的融合风格迁移模型。根据前述实施例中利用融合风格迁移模型对视频进行风格迁移的方法,电子设备100可以利用得到的融合风格迁移模型对第三视频中的各帧图像进行风格迁移,得到延时视频。电子设备100可以保存该延时视频。电子设备100对第三视频进行风格迁移的具体实现方法这里不再赘述。
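As the shooting method in FIG. 6 shows, the user can choose the desired video style and the duration of the time-lapse video when performing time-lapse photography. The electronic device may extract frames from the captured video to compress the original video to the duration the user wants. By performing style transfer on the video according to the style the user selected, the electronic device lets a video shot in a short time have the time-lapse effect of a video shot over a long time. For example, a time-lapse video shot within 1 minute can show the rapid, gradual change from day to night that would otherwise take 12 hours or more of shooting. In addition, the electronic device may stabilize the captured video, so the user can shoot a time-lapse video handheld instead of fixing the electronic device in one place with a mounting device. This shooting method frees time-lapse photography from its constraints on scene, equipment, and time, and makes it more convenient and more fun for the user.
In some embodiments, the electronic device 100 does not receive a style selected by the user for the time-lapse video, and may then skip style transfer. For example, as shown in FIG. 1C, the user has not selected a style in the style options 231, so all styles in the style options 231 are unselected. When the electronic device 100 then receives the user operation for starting time-lapse photography, it may start time-lapse photography. For the original video captured during time-lapse photography, the electronic device 100 may perform stabilization and frame extraction to obtain the time-lapse video. That is, the user may choose to compress the original video captured by the camera in time only, obtaining a time-lapse video with the corresponding time-lapse effect.
In some embodiments, if the electronic device 100 determines that the duration selected by the user for the time-lapse video is longer than the duration of the time-lapse capture, it may stabilize the captured original video and perform style transfer on the stabilized video according to the selected style to obtain the time-lapse video. That is, the electronic device 100 may skip frame extraction. Alternatively, in the same situation, the electronic device 100 may first stabilize the captured original video, then interpolate frames into the stabilized video to extend it to the duration selected by the user, and finally perform style transfer on the frame-interpolated video according to the selected style to obtain the time-lapse video. For the specific frame interpolation method, reference may be made to frame interpolation methods in the prior art, which is not limited in the embodiments of this application.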
下面介绍本申请实施例提供的另一种得到延时视频的场景。
图7A~图7G示例性示出了一种得到延时视频的场景。在该场景中,电子设备100可以对在非延时摄影模式下采集得到的视频进行处理(如抽帧、风格迁移),得到延时视频。
如图7A所示,响应于作用在用户界面210中图库图标211A的用户操作,电子设备100可以开启图库应用程序,显示如图7B所示的图库界面260。图库界面260可包含第一时间指示符261、第二时间指示符265、第一视频缩略图262、第二视频缩略图263、第一照片缩略图264、第二照片缩略图266。其中:
第一视频缩略图262、第二视频缩略图263可以分别为第一视频和第二视频的封面。其中,电子设备100可以以视频的第一帧图像作为视频缩略图的封面。响应于作用在第一视频缩略图262或第二视频缩略图263上的用户操作,电子设备100可以显示用于播放第一视频或第二视频的用户界面。
第一照片缩略图264、第二照片缩略图266可分别为第一照片和第二照片的缩略图。响应于作用在第一照片缩略图264或第二照片缩略图266的用户操作,电子设备100可以显示第一照片或第二照片。
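The first time indicator 261 and the second time indicator 265 may indicate when the videos and photos listed under them were taken. For example, the time indicated by the first time indicator 261 is today (today being the date shown on the user interface 210, November 9). The first video thumbnail 262, the second video thumbnail 263, and the first photo thumbnail 264 are located under the first time indicator 261. That is, the first video, the second video, and the first photo were taken by the electronic device 100 on November 9. The time indicated by the second time indicator 265 is yesterday (November 8). The second photo thumbnail 266 is located under the second time indicator 265. That is, the second photo was taken by the electronic device 100 on November 8.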
响应于作用在图库界面260上下滑动的用户操作,电子设备100可以在图库界面260上显示更多的内容。
响应于作用在第一视频缩略图262的用户操作,电子设备100可以显示如图7C所示的用户界面270。用户界面270可包括时间控件271、视频播放区域272、设置选项273。其中:
时间控件271可用于指示电子设备100存储第一视频的时间。例如2020年11月9日上午7点30分。上述存储第一视频的时间可以是第一视频拍摄完成的时间。
The video playback area 272 may contain a play control 272A, which may be used to instruct the electronic device 100 to play the first video.
设置选项273可包含分享选项273A、收藏选项273B、编辑选项273C和删除选项273D。其中,分享选项273A可用于用户将第一视频分享给其他设备。收藏选项273B可用于用户收藏第一视频。编辑选项273C可用于用户对第一视频进行例如旋转裁剪、添加滤镜等编辑操作。删除选项273D可用于用户将第一视频从电子设备100中删除。
响应于作用在编辑选项273C的用户操作,电子设备100可以显示如图7D所示的视频编辑界面280。视频编辑界面280可包括视频播放区域281、编辑选项282。其中:
视频播放区域281可以参考前述图7C视频播放区域272的介绍。
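The editing options 282 may include a rotate-and-crop option 282A, a filter option 282B, a soundtrack option 282C, a text option 282D, a watermark option 282E, and a time-lapse photography option 282F. The rotate-and-crop option 282A may be used to rotate and crop each frame of the first video. The filter option 282B, the soundtrack option 282C, the text option 282D, and the watermark option 282E may be used respectively to add a filter, background music, text, and a watermark to the first video. The time-lapse photography option 282F may be used to perform frame extraction and style transfer on the first video to obtain a video with a time-lapse effect.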
编辑选项282中可以包含更多或更少的选项。
示例性的,响应于作用在延时摄影选项282F的用户操作,电子设备100可以显示如图7E所示的视频编辑界面280。视频编辑界面280还可以包含风格选择选项283。风格选择选项283可包括提示语控件283A、昼夜转换风格选项283B、四季更迭风格选项283C、晴雨交替风格选项283D、取消控件283E和下一步控件283F。其中:
提示语控件283A可用于提示用户选择需要对第一视频进行风格迁移处理的风格。提示语控件283A中可包括文字提示“风格选择”。本申请实施例对提示语控件283A的具体形式不作限定。
昼夜转换风格选项283B、四季更迭风格选项283C、晴雨交替风格选项283D的作用可以参考前述实施例对图1C中昼夜转换风格选项231A、四季更迭风格选项231B、晴雨交替风格选项231C的介绍,这里不再赘述。
取消控件283E可用于用户取消风格选择。响应于作用在取消控件283E的用户操作,电子设备100可以显示如图7D所示的视频编辑界面280。
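The next-step control 283F may be used by the user to complete further settings for time-lapse editing, for example setting the duration of the time-lapse video.
The style selection options 283 may also contain more or fewer style options.
As shown in FIG. 7E, upon receiving a user operation on the day-night transition style option 283B and a user operation on the next-step control 283F, the electronic device 100 may display the video editing interface 280 shown in FIG. 7F. The video editing interface 280 may further contain duration selection options 284, which may include a prompt control 284A, a time option 284B, a previous-step control 284C, and a save control 284D, where: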
提示语控件284A可用于提示用户选择最终生成的延时视频的时间长度。提示语控件284A中可包括文字提示“时长选择(请确定处理后视频的时长)”。
时间选项284B的作用可以参考前述实施例对图1E中时间选项232B的介绍,这里不再赘述。
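The previous-step control 284C may be used by the user to go back one step and reselect the style for style transfer on the first video. In response to a user operation on the previous-step control 284C, the electronic device 100 may display the video editing interface 280 shown in FIG. 7E.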
保存控件284D可用于电子设备100存储用户选择的风格(如昼夜转换风格)和时间长度(如10秒)。响应于作用在保存控件284D的用户操作,电子设备100可以根据用户选择的风格和时间长度,对第一视频进行抽帧和风格迁移,得到延时视频。该延时视频的时间长度即为用户选择的时间长度。电子设备100可以利用融合风格迁移模型(如融合有白天风格迁移模型和黑夜风格迁移模型的融合风格迁移模型)对经过抽帧处理的第一视频进行风格迁移。利用融合风格迁移模型对视频进行风格迁移的具体实现方式可以参考前述实施例的介绍,这里不再赘述。
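After the time-lapse video is obtained, the electronic device 100 may display the user interface 270 shown in FIG. 7G. The user interface 270 shown in FIG. 7G contains the same controls as the user interface 270 shown in FIG. 7C. The difference is that the video playback area 272 shows the cover of the time-lapse video obtained by performing frame extraction and style transfer on the first video. The cover of the time-lapse video may be its first frame.
In addition, the time control 271 may indicate the time at which the electronic device 100 stored the time-lapse video, for example, 8:08 a.m. on November 9, 2020. As can be seen, the time at which the electronic device 100 stored the first video differs from the time at which it stored the time-lapse video. The first video was shot and stored on the electronic device 100 at 7:30 a.m. on November 9, 2020. The time-lapse video was obtained by the electronic device 100 by processing the first video, and was stored on the electronic device 100, at 8:08 a.m. on November 9, 2020.
In response to a user operation on the playback control 272A shown in FIG. 7G, the electronic device 100 may play the time-lapse video. For the playback process, refer to the video playback process shown in FIG. 2A to FIG. 2C. Details are not repeated here.
As can be seen from the foregoing embodiments, the user may select a style and a duration to apply time-lapse processing to a video that has already been shot. Such a video may be, for example, one captured by the camera of the electronic device 100 while the video recording mode option 201B among the camera mode options shown in FIG. 1B is selected. In other words, the electronic device 100 may perform frame extraction and style transfer on any video to obtain a time-lapse video. The time-lapse effect of the resulting video need not depend on the duration of the original video. Any video that the user shoots in a short time can, after the frame extraction and style transfer described in the embodiments of this application, have the time-lapse effect of a video shot over a long time.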
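The frame-extraction step can be illustrated with a small sketch. The text only states that frames are extracted so that the result has the user-selected duration; the even spacing, the function name `sample_frame_indices`, and the `playback_fps` parameter below are assumptions made for illustration, not the method disclosed in this application.

```python
def sample_frame_indices(total_frames, target_seconds, playback_fps):
    """Pick evenly spaced frame indices so the result plays for target_seconds.

    Hypothetical helper: even spacing is an assumption, not taken from the text.
    """
    if total_frames <= 0:
        raise ValueError("total_frames must be positive")
    # Number of frames needed to fill the requested playback duration.
    n = int(round(target_seconds * playback_fps))
    # The result cannot contain more frames than the source video has.
    n = max(1, min(n, total_frames))
    if n == 1:
        return [0]
    # Spread the n indices across the whole source video, first to last frame.
    step = (total_frames - 1) / (n - 1)
    return [round(i * step) for i in range(n)]


# A 60-second clip at 30 fps (1800 frames) reduced to a 10-second time-lapse.
indices = sample_frame_indices(1800, 10, 30)
```

Each selected frame would then be passed to the fused style transfer model that corresponds to its position in the sequence.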
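The following describes a scenario, provided in the embodiments of this application, of shooting a panorama with a style gradient effect.
FIG. 8A to FIG. 8E are example schematic diagrams of a scenario in which the electronic device 100 shoots a panorama with a style transfer effect.
As shown in FIG. 8A, in response to a user operation on the camera icon 212A on the user interface 210, the electronic device 100 may start the camera application and the camera. The electronic device 100 may display the user interface 290 shown in FIG. 8B.
As shown in FIG. 8B, the user interface 290 may include a preview area 291, style options 292, camera mode options 201, a gallery shortcut control 202, a shutter control 203, and a camera flip control 204, where:
For the camera mode options 201, the gallery shortcut control 202, the shutter control 203, and the camera flip control 204, refer to the description in the foregoing embodiments of the corresponding items shown in FIG. 1B. In FIG. 8B, the camera mode options 201 may further include a time-lapse panorama mode option 201G, which is in the selected state. In the time-lapse panorama mode, the electronic device 100 may shoot a panorama with a style gradient effect.
The style options 292 may include one or more style options, for example, a day-night transition style option 292A, a four-seasons style option 292B, and a sunny-rainy alternation style option 292C. Each of these style options may be used to instruct the electronic device 100 to perform style transfer on the captured panorama, converting the style of the panorama into the style corresponding to that option.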
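For example, the style transfer model corresponding to the day-night transition style option 292A may be a fused style transfer model obtained by fusing a day style transfer model and a night style transfer model. The electronic device 100 may divide the captured panorama from left to right into m regions. These m regions may overlap. The electronic device 100 may then use the fused style transfer model to perform style transfer on the m regions, select one stitching region from each of the m style-transferred regions, and stitch the selected regions together to obtain a style-transferred panorama. From left to right, the style of the stitched panorama may change gradually from the day style to the night style.
The style transfer model corresponding to the four-seasons style option 292B may be a fused style transfer model obtained by fusing a spring style transfer model, a summer style transfer model, an autumn style transfer model, and a winter style transfer model. The electronic device 100 may process the m regions in the same way: style transfer with the fused model, selection of one stitching region per region, and stitching. From left to right, the style of the stitched panorama may change gradually from the spring style to the summer style, then from the summer style to the autumn style, and then from the autumn style to the winter style.
The style transfer model corresponding to the sunny-rainy alternation style option 292C may be a fused style transfer model obtained by fusing a sunny style transfer model and a rainy style transfer model. The electronic device 100 may again process the m regions in the same way. From left to right, the style of the stitched panorama may change gradually from the sunny style to the rainy style.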
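The style options 292 may include more or fewer style options. The style options are not limited to the fused style transfer models shown in FIG. 8B; the style options 292 may also include a single style transfer model (for example, a cartoon style transfer model).
The embodiments of this application do not limit the specific way in which the style changes across a panorama obtained by style transfer with a fused style transfer model. For example, when the style selected by the user is the day-night transition style, the electronic device 100 may also perform style transfer on the multiple frames obtained by dividing the panorama and then stitch these frames into a panorama whose style changes gradually from the night style to the day style from left to right.
Optionally, the electronic device 100 may divide the panorama from top to bottom. After the style transfer and stitching described above, the style of the resulting panorama may change gradually from top to bottom, either from the day style to the night style or from the night style to the day style. The division is not limited to the ways shown in the foregoing embodiments; the electronic device 100 may divide the panorama along any direction.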
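The preview area 291 may be used to display images captured by the camera in real time. The preview area 291 may include an operation prompt 291A and a shooting progress indicator 291B, where:
The operation prompt 291A may be used to give the user instructions for shooting a panorama. The operation prompt 291A may include the text prompt "Press the shutter button and move slowly in the direction of the arrow". The "shutter button" in the text prompt is the shutter control 203. The "arrow" in the text prompt is the arrow in the shooting progress indicator 291B.
The shooting progress indicator 291B may include a panorama thumbnail and an arrow. The panorama thumbnail may be used to present a thumbnail of the panorama obtained from the start of panorama shooting up to the current moment. The arrow may be used to indicate the direction in which the electronic device 100 moves during panorama shooting. For example, an arrow pointing horizontally to the right may indicate that, during panorama shooting, the electronic device 100 moves horizontally to the right from its position at the moment shooting started.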
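As shown in FIG. 8B, upon receiving a user operation on the day-night transition style option 292A and a user operation on the shutter control 203, the electronic device 100 may start panorama shooting. For example, the electronic device 100 may display the user interface 290 shown in FIG. 8C. The user interface 290 may include the preview area 291 and a pause-shooting control 206. The preview area may be used to display images captured by the camera in real time. In FIG. 8C, the preview area 291 may include the shooting progress indicator 291B and an operation prompt 291C. For the shooting progress indicator 291B, refer to the description in the foregoing embodiments. The operation prompt 291C may be used to give the user instructions during panorama shooting. The operation prompt 291C may include the text prompt "Keep the arrow on the center line".
The pause-shooting control 206 may be used to end panorama shooting. In response to a user operation on the pause-shooting control 206, the electronic device 100 may stitch the images captured by the camera between the start and the end of panorama shooting into a panorama.
In some embodiments, the arrow in the shooting progress indicator 291B may move as the electronic device 100 moves. When the arrow has moved from its initial position to its end position, the electronic device 100 may end panorama shooting. The end position of the arrow may be, for example, the rightmost position in the shooting progress indicator 291B. When panorama shooting ends, the electronic device 100 may stitch the images captured by the camera between the start and the end of panorama shooting into a panorama.
In some embodiments, the electronic device 100 may apply image stabilization to the multiple frames captured during panorama shooting and then stitch the stabilized frames into a panorama.
The embodiments of this application do not limit the way in which the electronic device 100 stitches the images captured by the camera into a panorama. For specific implementations, refer to panorama shooting methods in the prior art.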
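After the original panorama is obtained, the electronic device 100 may perform style transfer on it according to the style selected by the user to obtain a panorama with a style gradient effect. The electronic device 100 may store the panorama with the style gradient effect.
As shown in FIG. 8D, when panorama shooting ends, the electronic device 100 may display the user interface 290. For the content of the user interface 290, refer to the description of the user interface 290 shown in FIG. 8B in the foregoing embodiments. The user may view the captured panorama with the style gradient effect through the gallery shortcut control 202.
Specifically, in response to a user operation on the gallery shortcut control 202, the electronic device 100 may display the user interface 310 shown in FIG. 8E. The user interface 310 may include a time control 311 and a first panorama 312.
The first panorama 312 is the panorama obtained through the panorama shooting process shown in FIG. 8B and FIG. 8C. From left to right, the style of the first panorama 312 may change gradually from the day style to the night style.
The time control 311 may be used to indicate the time at which the electronic device 100 stored the first panorama 312.
The user interface 310 may include further content, which is not limited in the embodiments of this application.
As can be seen from the foregoing embodiments, the electronic device 100 may use a fused style transfer model to perform style transfer on a panorama and obtain a panorama with a style gradient effect. This way of processing panoramas makes panorama shooting more interesting for the user.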
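The following describes in detail a method, provided in the embodiments of this application, for performing style transfer on a panorama.
FIG. 9 shows an example of a method for performing style transfer on a panorama. The main steps for obtaining a panorama with a style gradient effect may include: dividing the panorama, performing style transfer on the resulting regions and selecting stitching regions, and stitching the stitching regions into a panorama with a style gradient effect.
1. Dividing the panorama
As shown in FIG. 9, the length of the first panorama is L. The first panorama may be the original panorama stitched from the images captured by the camera during the panorama shooting process shown in FIG. 8B and FIG. 8C. The electronic device 100 may divide the first panorama from left to right into m regions.
In one possible implementation, the electronic device 100 may divide the first panorama by cropping images with a sliding window. Specifically, the length of the sliding window may be d, and the distance of each slide is Δc. By sliding the window to the right m-1 times from the leftmost side of the first panorama, the electronic device 100 obtains m regions, each of length d. Here Δc is smaller than d, so adjacent regions overlap. The length L of the first panorama, Δc, and d satisfy the following relationship: d + (m-1) × Δc = L.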
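The sliding-window division can be sketched in a few lines. The function name `split_panorama` and the use of integer column indices are illustrative assumptions; only the relationship d + (m-1) × Δc = L and the condition Δc < d come from the text.

```python
def split_panorama(length, window, stride):
    """Return the (start, end) column range of each overlapping region.

    length is L, window is d, and stride is the slide distance (delta c).
    """
    if not 0 < stride < window <= length:
        raise ValueError("need 0 < stride < window <= length")
    if (length - window) % stride != 0:
        raise ValueError("d + (m-1)*stride must equal L exactly")
    # m - 1 slides of the window cover the panorama from left edge to right edge.
    m = (length - window) // stride + 1
    return [(j * stride, j * stride + window) for j in range(m)]


# L = 1000, d = 400, stride = 200 gives m = 4 regions that overlap by 200 columns.
regions = split_panorama(1000, 400, 200)
```

In this example the last region ends exactly at column 1000, which is the relationship d + (m-1) × Δc = L with m = 4.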
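2. Performing style transfer on the regions and selecting stitching regions
After the m regions are obtained, the electronic device 100 may perform style transfer on each of them according to the style selected by the user. For the style transfer method, refer to the description in the foregoing embodiments of how the electronic device 100 uses fused style transfer models to perform style transfer on multiple frames of a video.
For example, suppose the style selected by the user is the day-night transition style. The electronic device 100 may fuse the day style transfer model and the night style transfer model by interpolation. The resulting fused style transfer model has the same network structure as the day style transfer model and the night style transfer model. The model used by the electronic device 100 to perform style transfer on the j-th region is the fused style transfer model a_j. The parameter of the fused style transfer model a_j at a first position may be ((m-j+1)/m) × θ_day + ((j-1)/m) × θ_night, where θ_day and θ_night are the parameters of the day style transfer model and the night style transfer model, respectively, at the first position. The first position is any position in the network structure of the fused style transfer model a_j. The first position of the day style transfer model, the first position of the night style transfer model, and the first position of the fused style transfer model a_j are the same position in the same network structure. Here j is an integer greater than or equal to 1 and less than or equal to m.
The electronic device 100 may use the fused style transfer models a_1 to a_m to perform style transfer on region 1 to region m, respectively. The styles corresponding to the fused style transfer models a_1 to a_m are style 1 to style m, respectively. According to the formula for the fused style transfer model a_j, the fused style transfer model a_1 is the day style transfer model, so style 1 may be the day style. For j = m the weight of θ_night is (m-1)/m, so the fused style transfer model a_m approaches the night style transfer model as m grows; the fused style transfer model a_m may be treated as the night style transfer model, and style m may be the night style.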
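The interpolation of model parameters can be sketched as follows. Representing a model as a flat list of floats and the names `fusion_weights` and `fuse_parameters` are assumptions for illustration; a real model would hold many tensors, but the same weighting would apply at every position. The weights follow the formula given in the text.

```python
def fusion_weights(j, m):
    """Return (w_day, w_night) for region j (1-based) out of m regions."""
    if not 1 <= j <= m:
        raise ValueError("j must satisfy 1 <= j <= m")
    return (m - j + 1) / m, (j - 1) / m


def fuse_parameters(theta_day, theta_night, j, m):
    """Blend two parameter lists of identical layout position by position."""
    if len(theta_day) != len(theta_night):
        raise ValueError("both models must share the same network structure")
    w_day, w_night = fusion_weights(j, m)
    return [w_day * a + w_night * b for a, b in zip(theta_day, theta_night)]


# Toy two-parameter "models"; region 3 of 4 is an even blend of day and night.
blended = fuse_parameters([1.0, 2.0], [3.0, 6.0], 3, 4)
```

With the formula as printed, `fusion_weights(m, m)` is (1/m, (m-1)/m), so the last model is close to the night model without reaching it exactly. Dividing by m-1 instead of m would reach the pure night model at j = m, but that variant is an assumption and does not appear in the text.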
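As can be seen from FIG. 9, the styles of region 1 to region m are style 1 to style m, respectively. The styles of region 1 to region m change gradually from the day style to the night style.
Further, the electronic device 100 may crop a stitching region from each of the style-transferred regions 1 to m and stitch the cropped regions together. In this way, the electronic device 100 obtains a panorama with a style gradient effect.
In one possible implementation, the electronic device 100 may crop stitching regions of the same length from every region. The length of a stitching region may be Δc′, where Δc′ = L/m. To ensure that the stitching regions do not overlap, when cropping the k-th stitching region from the k-th region, the electronic device 100 may start at a distance of (k-1) × (Δc′-Δc) from the leftmost side of the k-th region and crop a stitching region of length Δc′. Here k is an integer greater than or equal to 1 and less than or equal to m.
For example, the electronic device 100 crops a stitching region of length Δc′ starting from the leftmost side of the style-transferred region 1 to obtain stitching region 1. The electronic device 100 crops a stitching region of length Δc′ starting at a distance of Δc′-Δc from the leftmost side of the style-transferred region 2 to obtain stitching region 2. And so on, until the electronic device 100 crops a stitching region of length Δc′ starting at a distance of (m-1) × (Δc′-Δc) from the leftmost side of the style-transferred region m to obtain stitching region m.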
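The crop offsets can be checked numerically. The sketch below computes, for each region k, the crop range inside that region, using exact fractions so that Δc′ = L/m need not be an integer. The function name `stitch_crops` is an illustrative assumption; the offset formula (k-1) × (Δc′-Δc) is the one given in the text.

```python
from fractions import Fraction


def stitch_crops(length, window, stride):
    """Return the (start, end) crop range inside each style-transferred region.

    length is L, window is d, stride is the slide distance (delta c).
    """
    m = (length - window) // stride + 1
    crop = Fraction(length, m)  # stitching-region length, L / m
    crops = []
    for k in range(1, m + 1):
        # Region k starts at (k-1)*stride in the panorama, while stitching
        # region k must start at (k-1)*crop, hence this offset inside region k.
        offset = (k - 1) * (crop - stride)
        crops.append((offset, offset + crop))
    return crops


# L = 1000, d = 400, stride = 200: m = 4 and each stitching region is 250 wide.
crops = stitch_crops(1000, 400, 200)
```

In panorama coordinates, region k starts at (k-1) × Δc, so stitching region k covers the range from (k-1) × Δc′ to k × Δc′. The m stitching regions therefore tile the full length L without gaps or overlap, and the last crop ends exactly at the right edge of region m.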
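3. Stitching the stitching regions into a panorama with a style gradient effect
The electronic device 100 may stitch stitching region 1 to stitching region m together in order from left to right. The stitching regions do not overlap when stitched. As shown in FIG. 9, the electronic device 100 obtains a panorama of length L with a style gradient effect. For example, the style of the panorama changes gradually from the day style to the night style from left to right.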
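The three steps can be tied together in a toy end-to-end sketch. The panorama is modeled as a plain list of column values, and the `stylize` callback stands in for the fused style transfer model a_k; both are assumptions for illustration, as is the restriction that L be divisible by m so that crops fall on whole columns.

```python
def stylize_panorama(columns, window, stride, stylize):
    """Split, stylize each region with its own model, crop, and stitch."""
    length = len(columns)
    if not 0 < stride < window <= length:
        raise ValueError("need 0 < stride < window <= length")
    if (length - window) % stride != 0:
        raise ValueError("d + (m-1)*stride must equal L exactly")
    m = (length - window) // stride + 1
    if length % m != 0:
        raise ValueError("this sketch needs L divisible by m")
    crop = length // m  # stitching-region length, L / m
    stitched = []
    for k in range(1, m + 1):
        start = (k - 1) * stride
        region = columns[start:start + window]      # step 1: divide
        styled = stylize(region, k, m)              # step 2: style transfer
        offset = (k - 1) * (crop - stride)
        stitched.extend(styled[offset:offset + crop])  # step 3: crop and stitch
    return stitched


def tag_with_style(region, k, m):
    """Placeholder model: label every column with the index of its style."""
    return [(k, column) for column in region]


result = stylize_panorama(list(range(1000)), 400, 200, tag_with_style)
```

In the result every original column appears exactly once and in order, and the style index steps from 1 to 4 across the width. A real model would blend styles continuously, and because it sees the overlapping context of each region, the transition between neighboring stitching regions would be smoother than this placeholder suggests.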
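In some embodiments, the electronic device 100 may also obtain a panorama or another type of image from the gallery application and perform style transfer on it using the method of the embodiment shown in FIG. 9.
In the foregoing method for performing style transfer on a panorama, adjacent regions used for style transfer overlap. After the electronic device 100 performs style transfer on each of these regions with the fused style transfer models, the styles of the stitching regions cropped from adjacent regions transition more smoothly. That is, dividing the first panorama in this way and cropping stitching regions from the style-transferred regions can improve the smoothness of the stylization of the panorama. The style of the panorama generated by the electronic device 100 can change more smoothly from the day style to the night style from left to right.
The embodiments of this application do not limit the specific values of the sliding window length d, the slide distance Δc, the number m of regions obtained by dividing the first panorama, or the stitching region length Δc′.
In some embodiments, the slide distance may differ from one slide to the next, and the regions obtained by the electronic device 100 from dividing the first panorama may differ in length. The lengths of the stitching regions that the electronic device 100 crops from the regions may also differ. That is, the stitching regions used to stitch the panorama need not be of equal length.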
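FIG. 10 is a schematic structural diagram of the electronic device 100.
The electronic device 100 may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, a headset jack 170D, a sensor module 180, a button 190, a motor 191, an indicator 192, a camera 193, a display 194, a subscriber identification module (SIM) card interface 195, and the like. The sensor module 180 may include a pressure sensor 180A, a gyroscope sensor 180B, a barometric pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, and the like.
It can be understood that the structure illustrated in the embodiments of this application does not constitute a specific limitation on the electronic device 100. In other embodiments of this application, the electronic device 100 may include more or fewer components than shown, combine some components, split some components, or arrange the components differently. The components shown may be implemented in hardware, software, or a combination of software and hardware.
The processor 110 may include one or more processing units. For example, the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a memory, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU). Different processing units may be independent devices or may be integrated into one or more processors.
The controller may be the nerve center and command center of the electronic device 100. The controller may generate operation control signals according to instruction opcodes and timing signals to control instruction fetching and instruction execution.
A memory may also be provided in the processor 110 to store instructions and data. In some embodiments, the memory in the processor 110 is a cache. The memory may hold instructions or data that the processor 110 has just used or uses cyclically. If the processor 110 needs the instructions or data again, it can fetch them directly from this memory. This avoids repeated accesses and reduces the waiting time of the processor 110, thereby improving system efficiency.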
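The USB interface 130 is an interface that complies with the USB standard specification, and may specifically be a Mini USB interface, a Micro USB interface, a USB Type-C interface, or the like. The USB interface 130 may be used to connect a charger to charge the electronic device 100, and may also be used to transfer data between the electronic device 100 and peripheral devices.
The charging management module 140 is used to receive charging input from a charger.
The power management module 141 is used to connect the battery 142, the charging management module 140, and the processor 110. The power management module 141 receives input from the battery 142 and/or the charging management module 140 and supplies power to the processor 110, the internal memory 121, an external memory, the display 194, the camera 193, the wireless communication module 160, and the like.
The wireless communication function of the electronic device 100 may be implemented by the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, the modem processor, the baseband processor, and the like.
The antenna 1 and the antenna 2 are used to transmit and receive electromagnetic wave signals. Each antenna in the electronic device 100 may be used to cover one or more communication frequency bands.
The mobile communication module 150 may provide wireless communication solutions applied on the electronic device 100, including 2G/3G/4G/5G. The mobile communication module 150 may include at least one filter, a switch, a power amplifier, a low noise amplifier (LNA), and the like. The mobile communication module 150 may receive electromagnetic waves through the antenna 1, filter and amplify the received electromagnetic waves, and pass them to the modem processor for demodulation.
The modem processor may include a modulator and a demodulator. The modulator is used to modulate a low-frequency baseband signal to be sent into a medium- or high-frequency signal. The demodulator is used to demodulate a received electromagnetic wave signal into a low-frequency baseband signal. The demodulator then passes the demodulated low-frequency baseband signal to the baseband processor. After being processed by the baseband processor, the low-frequency baseband signal is passed to the application processor. The application processor outputs a sound signal through an audio device (not limited to the speaker 170A, the receiver 170B, and the like), or displays an image or video through the display 194.
The wireless communication module 160 may provide wireless communication solutions applied on the electronic device 100, including wireless local area networks (WLAN) (such as wireless fidelity (Wi-Fi) networks), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), and infrared (IR) technology. The wireless communication module 160 may be one or more devices integrating at least one communication processing module. The wireless communication module 160 receives electromagnetic waves via the antenna 2, performs frequency modulation and filtering on the electromagnetic wave signal, and sends the processed signal to the processor 110. The wireless communication module 160 may also receive a signal to be sent from the processor 110, perform frequency modulation and amplification on it, and convert it into electromagnetic waves radiated through the antenna 2.
In some embodiments, the antenna 1 of the electronic device 100 is coupled to the mobile communication module 150 and the antenna 2 is coupled to the wireless communication module 160, so that the electronic device 100 can communicate with networks and other devices through wireless communication technologies.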
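The electronic device 100 implements the display function through the GPU, the display 194, the application processor, and the like. The GPU is a microprocessor for image processing and connects the display 194 and the application processor. The GPU performs mathematical and geometric calculations for graphics rendering. The processor 110 may include one or more GPUs that execute program instructions to generate or change display information.
The display 194 is used to display images, videos, and the like. The display 194 includes a display panel. The display panel may be a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), Mini LED, Micro LED, Micro OLED, quantum dot light-emitting diodes (QLED), or the like. In some embodiments, the electronic device 100 may include 1 or N displays 194, where N is a positive integer greater than 1.
The electronic device 100 may implement the shooting function through the ISP, the camera 193, the video codec, the GPU, the display 194, the application processor, and the like.
The ISP is used to process data fed back by the camera 193. For example, when a photo is taken, the shutter opens, light passes through the lens to the camera's photosensitive element, the optical signal is converted into an electrical signal, and the photosensitive element passes the electrical signal to the ISP, which converts it into an image visible to the naked eye. The ISP may also algorithmically optimize the noise, brightness, and skin tone of the image, and may optimize parameters of the shooting scene such as exposure and color temperature. In some embodiments, the ISP may be located in the camera 193.
In some embodiments, the ISP may also apply image stabilization to the multiple frames of a video. The ISP may compensate the images according to data collected by a motion sensor, reducing problems such as an unsteady picture and loss of focus caused by shaking of the electronic device 100 during shooting.
The camera 193 is used to capture still images or video. An object generates an optical image through the lens, which is projected onto the photosensitive element. The photosensitive element may be a charge coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor. The photosensitive element converts the optical signal into an electrical signal and passes it to the ISP, which converts it into a digital image signal. The ISP outputs the digital image signal to the DSP for processing. The DSP converts the digital image signal into an image signal in a standard format such as RGB or YUV. In some embodiments, the electronic device 100 may include 1 or N cameras 193, where N is a positive integer greater than 1.
The digital signal processor is used to process digital signals. Besides digital image signals, it can process other digital signals. For example, when the electronic device 100 selects a frequency point, the digital signal processor is used to perform a Fourier transform on the frequency point energy.
The video codec is used to compress or decompress digital video. The electronic device 100 may support one or more video codecs. In this way, the electronic device 100 can play or record video in multiple encoding formats, for example, moving picture experts group (MPEG) 1, MPEG2, MPEG3, and MPEG4.
The NPU is a neural-network (NN) computing processor. By drawing on the structure of biological neural networks, for example the way signals are passed between neurons in the human brain, it processes input information quickly and can also learn continuously. The NPU enables applications such as intelligent cognition on the electronic device 100, for example image recognition, face recognition, speech recognition, and text understanding.
In some embodiments of this application, multiple style transfer models may be stored in the NPU. The NPU may use a style transfer model to perform style transfer on images processed by the ISP. For example, upon receiving multiple frames of a video, the NPU may fuse multiple style transfer models using the methods shown in FIG. 3 and FIG. 4 and use the resulting fused style transfer models to perform style transfer on each of these frames.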
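The external memory interface 120 may be used to connect an external memory card to expand the storage capacity of the electronic device 100. The external memory card communicates with the processor 110 through the external memory interface 120 to implement a data storage function, for example, saving music, video, and other files on the external memory card.
The internal memory 121 may be used to store computer-executable program code, which includes instructions. The processor 110 executes the various functional applications and data processing of the electronic device 100 by running the instructions stored in the internal memory 121.
The electronic device 100 may implement audio functions, such as music playback and recording, through the audio module 170, the speaker 170A, the receiver 170B, the microphone 170C, the headset jack 170D, the application processor, and the like.
The audio module 170 is used to convert digital audio information into an analog audio signal for output, and also to convert analog audio input into a digital audio signal.
The speaker 170A, also called a "loudspeaker", is used to convert an audio electrical signal into a sound signal.
The receiver 170B, also called an "earpiece", is used to convert an audio electrical signal into a sound signal.
The microphone 170C, also called a "mic" or "sound transducer", is used to convert a sound signal into an electrical signal.
The headset jack 170D is used to connect wired headsets.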
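The pressure sensor 180A is used to sense pressure signals and can convert them into electrical signals.
The gyroscope sensor 180B may be used to determine the motion posture of the electronic device 100. In some embodiments, the angular velocity of the electronic device 100 around three axes (namely, the x, y, and z axes) may be determined through the gyroscope sensor 180B. The gyroscope sensor 180B may be used for image stabilization during shooting. For example, when the shutter is pressed, the gyroscope sensor 180B detects the angle by which the electronic device 100 shakes, calculates from that angle the distance the lens module needs to compensate, and lets the lens counteract the shake of the electronic device 100 through reverse motion. The gyroscope sensor 180B may also be used in navigation and motion-sensing game scenarios.
The barometric pressure sensor 180C is used to measure air pressure. In some embodiments, the electronic device 100 calculates altitude from the air pressure value measured by the barometric pressure sensor 180C to assist positioning and navigation.
The acceleration sensor 180E can detect the magnitude of the acceleration of the electronic device 100 in all directions (generally along three axes). When the electronic device 100 is stationary, it can detect the magnitude and direction of gravity. It can also be used to recognize the posture of the electronic device, for applications such as switching between landscape and portrait orientation and pedometers.
The distance sensor 180F is used to measure distance. The electronic device 100 may measure distance by infrared or laser. In some embodiments, in a shooting scene, the electronic device 100 may use the distance sensor 180F to measure distance for fast focusing.
The proximity light sensor 180G may include, for example, a light-emitting diode (LED) and a light detector such as a photodiode. The light-emitting diode may be an infrared light-emitting diode. The electronic device 100 may use the proximity light sensor 180G to detect that the user is holding the electronic device 100 close to the ear during a call, so that the screen is turned off automatically to save power.
The ambient light sensor 180L is used to sense ambient light brightness. The ambient light sensor 180L may also work with the proximity light sensor 180G to detect whether the electronic device 100 is in a pocket, to prevent accidental touches.
The fingerprint sensor 180H is used to collect fingerprints. The electronic device 100 may use the collected fingerprint characteristics for fingerprint unlocking, accessing an application lock, fingerprint-triggered photographing, fingerprint-based call answering, and the like.
The temperature sensor 180J is used to detect temperature.
The touch sensor 180K is also called a "touch panel". The touch sensor 180K may be disposed on the display 194; together, the touch sensor 180K and the display 194 form a touchscreen. The touch sensor 180K is used to detect touch operations on or near it. The touch sensor may pass a detected touch operation to the application processor to determine the type of touch event. Visual output related to the touch operation may be provided through the display 194.
The bone conduction sensor 180M can acquire vibration signals. In some embodiments, the bone conduction sensor 180M can acquire the vibration signal of the bone mass that vibrates with the human voice.
The button 190 includes a power button, volume buttons, and the like. The button 190 may be a mechanical button or a touch button. The electronic device 100 may receive button input and generate key signal input related to user settings and function control of the electronic device 100.
The motor 191 may generate vibration alerts. The motor 191 may be used for incoming call vibration alerts and for touch vibration feedback.
The indicator 192 may be an indicator light, and may be used to indicate charging status and battery level changes, as well as messages, missed calls, notifications, and the like.
The SIM card interface 195 is used to connect a SIM card.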
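In the embodiments of this application, the electronic device may obtain a first image sequence. The first image sequence may be a video or multiple frames obtained by dividing a panorama. When the first image sequence is a video, it may be captured by the electronic device using the camera as shown in the embodiments of FIG. 1A to FIG. 1F, or obtained by the electronic device from the gallery application as shown in the embodiments of FIG. 7A to FIG. 7C. When the first image sequence consists of multiple frames obtained by dividing a panorama, the panorama may be captured by the electronic device using the camera as shown in FIG. 8A to FIG. 8D.
In the embodiments of this application, the electronic device may process the first image sequence based on a target transfer style to obtain a second image sequence. The target transfer style may be, for example, the day-night transition style, the four-seasons style, or the sunny-rainy alternation style of the foregoing embodiments. As in the embodiment shown in FIG. 1C, the electronic device 100 may determine the target transfer style according to a user operation on any of the style options 231.
In the embodiments of this application, the target transfer style may be used to indicate that the styles of the first frame through the n-th frame in the second image sequence change among M styles in a first style order. The target transfer style may be used to determine the value of M, that is, the number of style transfer models used for fusion. In some embodiments, the high-level semantic information of the images in a second image sequence that changes in the first style order may show a progression in natural time order. For example, if the target transfer style is the day-night transition style, the M styles may be the day style and the night style, and the first style order may be from the day style to the night style. If the target transfer style is the four-seasons style, the M styles may be the spring, summer, autumn, and winter styles, and the first style order may be from the spring style to the summer style, then from the summer style to the autumn style, and then from the autumn style to the winter style. The embodiments of this application do not limit the order in which the M styles are arranged in the first style order.
In the embodiments of this application, the electronic device may, based on the target transfer style, use k fused style transfer models to process the first image sequence. The k fused style transfer models may be generated by weighting M single-style transfer models. For how the electronic device generates the k fused style transfer models and uses them to process the first image sequence, refer to the embodiment shown in FIG. 4.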
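One way to weight M single-style transfer models into k fused models is sketched below. The piecewise-linear scheme, in which each fused model blends at most two adjacent styles in the first style order, and the function name `style_weights` are assumptions made for illustration. The weighting actually used is described with FIG. 4, which is outside this section, and may differ.

```python
def style_weights(p, k, num_styles):
    """Weights over num_styles single-style models for fused model p of k.

    p is 0-based. Hypothetical scheme: the weights are non-negative, sum to 1,
    and move linearly from the first style to the last as p increases.
    """
    if k < 2 or num_styles < 2 or not 0 <= p < k:
        raise ValueError("need k >= 2, num_styles >= 2 and 0 <= p < k")
    # Position of this model along the style order, from 0 to num_styles - 1.
    t = p * (num_styles - 1) / (k - 1)
    lower = min(int(t), num_styles - 2)  # index of the earlier adjacent style
    frac = t - lower
    weights = [0.0] * num_styles
    weights[lower] = 1.0 - frac
    weights[lower + 1] = frac
    return weights


# Four seasons (M = 4) spread across k = 7 fused models.
midpoint = style_weights(3, 7, 4)
```

Here the fourth of seven models is an even blend of the second and third styles (summer and autumn in the four-seasons example). Under this scheme a fused model gives more weight to a single-style model the closer its output style is to that model's style.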
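As used in the foregoing embodiments, depending on the context, the term "when" may be interpreted to mean "if", "after", "in response to determining", or "in response to detecting". Similarly, depending on the context, the phrase "when it is determined" or "if (a stated condition or event) is detected" may be interpreted to mean "if it is determined", "in response to determining", "when (the stated condition or event) is detected", or "in response to detecting (the stated condition or event)".
The foregoing embodiments may be implemented wholly or partly by software, hardware, firmware, or any combination thereof. When implemented by software, they may be implemented wholly or partly in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions according to the embodiments of this application are produced wholly or partly. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (for example, coaxial cable, optical fiber, or digital subscriber line) or wireless means (for example, infrared, radio, or microwave). The computer-readable storage medium may be any available medium accessible to a computer, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, a DVD), a semiconductor medium (for example, a solid-state drive), or the like.
A person of ordinary skill in the art can understand that all or part of the processes in the methods of the foregoing embodiments may be completed by a computer program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, may include the processes of the foregoing method embodiments. The storage medium includes various media that can store program code, such as ROM, random access memory (RAM), magnetic disks, and optical discs.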

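Claims (15)

  1. An image processing method, wherein the method comprises:
    obtaining, by an electronic device, a first image sequence;
    processing, by the electronic device, the first image sequence based on a target transfer style to obtain a second image sequence, wherein the first image sequence and the second image sequence each contain n frames of images, the i-th frame of the first image sequence and the i-th frame of the second image sequence have the same high-level semantic information, and the i-th frame of the first image sequence and the i-th frame of the second image sequence have different styles; the target transfer style is used to indicate that the styles of the first frame through the n-th frame of the second image sequence change among M styles in a first style order, n and M are integers greater than 1, and i is a positive integer less than or equal to n; and
    saving, by the electronic device, the second image sequence.
  2. The method according to claim 1, wherein the processing, by the electronic device, of the first image sequence based on the target transfer style specifically comprises:
    processing, by the electronic device based on the target transfer style, the first image sequence using k fused style transfer models, wherein k is less than or equal to n; the output images of the k fused style transfer models are the second image sequence, and the output image of one fused style transfer model is one frame or multiple consecutive frames of the second image sequence.
  3. The method according to claim 2, wherein the one fused style transfer model is generated by weighting M single-style transfer models; the closer the style of the output image of the one fused style transfer model is to the style of the output image of the j-th single-style transfer model, the greater the weight of the j-th single-style transfer model when the one fused style transfer model is generated; the styles of the output images of the M single-style transfer models constitute the M styles, and j is a positive integer less than or equal to M.
  4. The method according to claim 3, wherein the k fused style transfer models and the M single-style transfer models are neural network models having the same neural network structure.
  5. The method according to claim 4, wherein the single-style transfer model is trained, and the method further comprises:
    obtaining, by the electronic device, a training data set, wherein the training data set contains one or more frames of style images and multiple frames of content images in a first video, and the style of the one or more frames of style images is the style of the output image of the trained single-style transfer model;
    processing, by the electronic device, the multiple frames of content images in the first video using a single-style transfer model to be trained, to obtain multiple frames of composite images; and
    training, by the electronic device, the single-style transfer model to be trained using a loss function, to obtain the trained single-style transfer model, wherein the loss function comprises a high-level semantic information loss function, a style loss function, and a temporal constraint loss function; the high-level semantic information loss function is determined by the high-level semantic information of the multiple frames of content images and the high-level semantic information of the multiple frames of composite images; the style loss function is determined by the style of the multiple frames of content images and the style of the multiple frames of composite images; and the temporal constraint loss function is determined by the style of one frame of the multiple frames of composite images and the styles of multiple frames of composite images adjacent to that frame.
  6. The method according to any one of claims 1 to 5, wherein the obtaining, by the electronic device, of the first image sequence specifically comprises:
    starting, by the electronic device, a camera to capture a first video, and obtaining the n frames of images of the first image sequence from the first video, wherein the first video contains z frames of images and the n frames of images are extracted from the z frames of images.
  7. The method according to claim 6, wherein before the electronic device obtains the n frames of images of the first image sequence from the first video, the method further comprises:
    performing, by the electronic device, image stabilization on the first video.
  8. The method according to any one of claims 1 to 5, wherein the obtaining, by the electronic device, of the first image sequence specifically comprises:
    obtaining, by the electronic device, a first video selected by a user from locally stored videos, and obtaining the n frames of images of the first image sequence from the first video, wherein the first video contains z frames of images and the n frames of images are extracted from the z frames of images.
  9. The method according to any one of claims 6 to 8, wherein the frame extraction rate of the extraction is determined by the playback duration of the first image sequence selected by the user, and the frame extraction rate is the ratio of the playback duration of the first image sequence to the capture duration of the first video.
  10. The method according to any one of claims 6 to 9, wherein the saving, by the electronic device, of the second image sequence specifically comprises:
    concatenating, by the electronic device, the n frames of images of the second image sequence in order and saving them as a video.
  11. The method according to any one of claims 1 to 5, wherein the obtaining, by the electronic device, of the first image sequence specifically comprises:
    obtaining, by the electronic device, a first image, and dividing the first image to obtain the n frames of images of the first image sequence.
  12. The method according to claim 11, wherein the saving, by the electronic device, of the second image sequence specifically comprises:
    cropping, by the electronic device, one stitching region from each frame of the second image sequence to obtain n stitching regions, wherein the n stitching regions do not overlap; and
    stitching, by the electronic device, the n stitching regions to obtain a second image, and storing the second image, wherein the resolution of the second image is the same as the resolution of the first image.
  13. The method according to claim 11 or 12, wherein every frame of the first image sequence has the same resolution, and adjacent frames of the first image sequence overlap.
  14. An electronic device, comprising a display, a memory, and one or more processors, wherein the memory is used to store multiple single-style transfer models and a computer program, and the processor is used to invoke the computer program so that the electronic device performs the method according to any one of claims 1 to 13.
  15. A computer storage medium, comprising computer instructions, wherein when the computer instructions are run on an electronic device, the electronic device is caused to perform the method according to any one of claims 1 to 13.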
PCT/CN2021/135353 2020-12-07 2021-12-03 图像处理方法及电子设备 Ceased WO2022121796A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP21902503.8A EP4246955A4 (en) 2020-12-07 2021-12-03 IMAGE PROCESSING METHOD AND ELECTRONIC DEVICE
US18/256,158 US12567129B2 (en) 2020-12-07 2021-12-03 Image processing method and electronic device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011420630.6A CN114615421B (zh) 2020-12-07 2020-12-07 图像处理方法及电子设备
CN202011420630.6 2020-12-07

Publications (1)

Publication Number Publication Date
WO2022121796A1 true WO2022121796A1 (zh) 2022-06-16

Family

ID=81856144

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/135353 Ceased WO2022121796A1 (zh) 2020-12-07 2021-12-03 图像处理方法及电子设备

Country Status (4)

Country Link
US (1) US12567129B2 (zh)
EP (1) EP4246955A4 (zh)
CN (1) CN114615421B (zh)
WO (1) WO2022121796A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116527940A (zh) * 2023-04-07 2023-08-01 腾讯科技(深圳)有限公司 一种视频编码方法、装置及计算机设备、介质
CN117440181A (zh) * 2023-10-24 2024-01-23 广州虎牙科技有限公司 图像渲染方法、装置、计算机设备及可读存储介质

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117750186B (zh) * 2022-09-22 2025-11-04 荣耀终端股份有限公司 相机功能控制方法、电子设备及存储介质
US12536713B2 (en) * 2023-05-16 2026-01-27 Salesforce, Inc. Systems and methods for controllable image generation
CN119130778A (zh) * 2023-05-31 2024-12-13 北京字跳网络技术有限公司 一种图像处理方法、装置、计算机设备及存储介质
CN118014854B (zh) * 2023-11-20 2024-09-27 北京汇畅数宇科技发展有限公司 基于ai模型的人脸风格化处理方法、装置及计算机设备

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH09189610A (ja) * 1996-01-11 1997-07-22 Nippon Avionics Co Ltd 2次元時間遅延積分型熱画像装置
CN109360261A (zh) * 2018-09-28 2019-02-19 北京达佳互联信息技术有限公司 图像处理方法、装置、电子设备及存储介质
CN109636712A (zh) * 2018-12-07 2019-04-16 北京达佳互联信息技术有限公司 图像风格迁移及数据存储方法、装置和电子设备
CN109697690A (zh) * 2018-11-01 2019-04-30 北京达佳互联信息技术有限公司 图像风格迁移方法和系统
CN109919829A (zh) * 2019-01-17 2019-06-21 北京达佳互联信息技术有限公司 图像风格迁移方法、装置和计算机可读存储介质
CN110909790A (zh) * 2019-11-20 2020-03-24 Oppo广东移动通信有限公司 图像的风格迁移方法、装置、终端及存储介质
US20200105029A1 (en) * 2018-09-28 2020-04-02 Samsung Electronics Co., Ltd. Display apparatus control method and display apparatus using the same

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9615177B2 (en) * 2014-03-06 2017-04-04 Sphere Optics Company, Llc Wireless immersive experience capture and viewing
US9762846B2 (en) 2015-05-08 2017-09-12 Microsoft Technology Licensing, Llc Real-time hyper-lapse video creation via frame selection
US9846840B1 (en) * 2016-05-25 2017-12-19 Adobe Systems Incorporated Semantic class localization in images
US10147459B2 (en) * 2016-09-22 2018-12-04 Apple Inc. Artistic style transfer for videos
US10565757B2 (en) * 2017-06-09 2020-02-18 Adobe Inc. Multimodal style-transfer network for applying style features from multi-resolution style exemplars to input images
CN110278368A (zh) * 2018-03-15 2019-09-24 株式会社理光 图像处理装置、摄影系统、图像处理方法
CN110363293A (zh) * 2018-03-26 2019-10-22 腾讯科技(深圳)有限公司 神经网络模型的训练、延时摄影视频的生成方法及设备
JP7117872B2 (ja) * 2018-03-28 2022-08-15 キヤノン株式会社 画像処理装置、撮像装置、画像処理方法、及びプログラム
CN110086985B (zh) * 2019-03-25 2021-03-30 华为技术有限公司 一种延时摄影的录制方法及电子设备
CN110175951B (zh) 2019-05-16 2022-12-02 西安电子科技大学 基于时域一致性约束的视频风格迁移方法
CN110460770B (zh) * 2019-07-25 2021-01-26 上海晰图信息科技有限公司 一种图像处理方法和系统
CN111294509A (zh) * 2020-01-22 2020-06-16 Oppo广东移动通信有限公司 视频拍摄方法、装置、终端及存储介质
CN111556244B (zh) 2020-04-23 2022-03-11 北京百度网讯科技有限公司 视频风格迁移方法和装置
CN111667399B (zh) * 2020-05-14 2023-08-25 华为技术有限公司 风格迁移模型的训练方法、视频风格迁移的方法以及装置

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP4246955A4

Also Published As

Publication number Publication date
US12567129B2 (en) 2026-03-03
CN114615421B (zh) 2023-06-30
EP4246955A4 (en) 2024-05-15
US20240037708A1 (en) 2024-02-01
CN114615421A (zh) 2022-06-10
EP4246955A1 (en) 2023-09-20

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21902503

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 18256158

Country of ref document: US

ENP Entry into the national phase

Ref document number: 2021902503

Country of ref document: EP

Effective date: 20230612

NENP Non-entry into the national phase

Ref country code: DE

WWG Wipo information: grant in national office

Ref document number: 18256158

Country of ref document: US