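WO2022007565A1 - Augmented reality image processing method, apparatus, electronic device, and storage medium - Google Patents
Augmented reality image processing method, apparatus, electronic device, and storage medium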
- Publication number
- WO2022007565A1 (PCT/CN2021/098456)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- augmented reality
- target
- audio data
- model
- reality model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T19/00—Manipulating three-dimensional [3D] models or images for computer graphics
- G06T19/006—Mixed reality
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T13/00—Animation
- G06T13/20—Three-dimensional [3D] animation
- G06T13/205—Three-dimensional [3D] animation driven by audio data
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T19/00—Manipulating three-dimensional [3D] models or images for computer graphics
- G06T19/20—Editing of three-dimensional [3D] images, e.g. changing shapes or colours, aligning objects or positioning parts
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/174—Facial expression recognition
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/63—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for estimating an emotional state
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2219/00—Indexing scheme for manipulating 3D models or images for computer graphics
- G06T2219/20—Indexing scheme for editing of 3D models
- G06T2219/2012—Colour editing, changing, or manipulating; Use of colour codes
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2219/00—Indexing scheme for manipulating 3D models or images for computer graphics
- G06T2219/20—Indexing scheme for editing of 3D models
- G06T2219/2021—Shape modification
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
Definitions
- Embodiments of the present invention relate to virtual reality technology, and in particular, to an augmented reality image processing method, device, electronic device, and storage medium.
- Augmented Reality is a technology in which real information and virtual information are superimposed.
- The computer system processes real information and, according to the real information, generates matching virtual information containing virtual objects, sounds, or text; the virtual information is then superimposed on the human-computer interaction interface that displays the real information, thereby enhancing the user's perception of the real world.
- Augmented reality models can only be displayed in a preset, fixed manner; the display is monotonous, lacks interaction, and is not easy to use.
- the present invention provides an augmented reality image processing method, device, electronic device and storage medium, so as to improve the interactivity of the augmented reality model and the ease of use.
- an embodiment of the present invention provides an augmented reality image processing method, including:
- the augmented reality model is driven according to the playback progress and audio features of the target audio data.
- an embodiment of the present invention further provides an augmented reality image processing device, including:
- the target image acquisition module is used to acquire the target image in response to the image acquisition instruction triggered by the user, and the target image contains the target object;
- the augmented reality model acquisition module is used to acquire the augmented reality model of the target object, and output the augmented reality model in combination with the target object;
- the target audio acquisition module is used to acquire the target audio data selected by the user;
- the audio feature determination module is used to determine time-series audio features according to the target audio data;
- the output module is used to drive the augmented reality model according to the playback progress and audio features of the target audio data when the target audio data is output.
- an embodiment of the present invention further provides an electronic device, where the electronic device includes:
- one or more processors;
- when one or more programs are executed by the one or more processors, the one or more processors implement the augmented reality image processing method shown in the embodiments of the present disclosure.
- Embodiments of the present disclosure further provide a storage medium containing computer-executable instructions, where the computer-executable instructions, when executed by a computer processor, are used to execute the augmented reality image processing method shown in the embodiments of the present disclosure.
- An embodiment of the present invention further provides a computer program product, where the computer program product includes a computer program stored in a readable storage medium; one or more processors of the electronic device may read the computer program from the readable storage medium, and the one or more processors execute the computer program, so that the electronic device executes the augmented reality image processing method shown in the embodiments of the present disclosure.
- An embodiment of the present disclosure further provides a computer program, where the computer program is stored in a readable storage medium; one or more processors of an electronic device can read the computer program from the readable storage medium, and the one or more processors execute the computer program, so that the electronic device executes the augmented reality image processing method shown in the embodiments of the present disclosure.
- The augmented reality image processing solution disclosed in the embodiments of the present disclosure can: acquire a target image in response to an image acquisition instruction triggered by a user, where the target image contains a target object; acquire an augmented reality model of the target object and output the augmented reality model in combination with the target object; acquire the target audio data selected by the user and determine time-series audio features according to the target audio data; and, when outputting the target audio data, drive the augmented reality model according to the playback progress and audio features of the target audio data.
- Current augmented reality models lack interactivity and are not easy to use. By comparison, the augmented reality image processing solution disclosed in the embodiments of the present disclosure can, when the augmented reality model is output, drive the output of the augmented reality model in combination with the audio features of the target audio data selected by the user. The user can thus participate in the display process of the augmented reality model: by selecting different target audio data, the user drives the augmented reality model to display according to the audio features of that audio data, which improves ease of use.
- FIG. 1 is a flowchart of an augmented reality image processing method in Embodiment 1 of the present invention;
- FIG. 2 is a schematic structural diagram of an augmented reality image processing apparatus in Embodiment 2 of the present invention;
- FIG. 3 is a schematic structural diagram of an electronic device in Embodiment 3 of the present invention.
- FIG. 1 is a flowchart of an augmented reality image processing method provided in Embodiment 1 of the present invention. This embodiment is applicable to the case of displaying an augmented reality model.
- The method can be executed by an electronic device that implements augmented reality; the electronic device may be a smartphone, a tablet computer, or the like. The method includes the following steps:
- Step 110: In response to an image acquisition instruction triggered by the user, acquire a target image, where the target image includes a target object.
- the user can issue an image acquisition instruction in the preset application.
- the electronic device acquires the target image through the camera.
- When the user intends to use the augmented reality model, the user can open the preset application.
- the target object can be an object with an augmented reality model such as a landmark building.
- Landmark buildings can be buildings with a unique design in an area.
- the preset application may be a photographing application of an electronic device, or an application having an augmented reality function.
- the camera of the electronic device acquires the target image, and the electronic device displays the acquired target image on the preview page.
- the preview page can provide users with real-time images obtained by the camera.
- Step 120: Acquire an augmented reality model of the target object, and output the augmented reality model in combination with the target object.
- the augmented reality model is mapped to the target object in the target image, and the augmented reality model is output in combination with the target object.
- An interface for manually adjusting the size of the augmented reality model can be provided for the user. If the automatic combination is not effective, for example when the augmented reality model cannot be accurately combined with the target object, the user can adjust the size of the augmented reality model through this interface, which improves ease of use.
- obtaining an augmented reality model of the target object can be implemented in the following manner:
- the target object identifier is determined according to the current position information of the electronic device and the shooting angle, and the augmented reality model represented by the target object identifier is determined as the augmented reality model of the target object.
- Global Positioning System (GPS) positioning information of the electronic device is acquired, the orientation of the electronic device is obtained through the gyroscope, and the orientation is used as the shooting angle. According to the positioning information and the shooting angle, it is determined whether there is a landmark building within a certain shooting range. If there is a landmark building, the augmented reality model of the landmark building is used as the augmented reality model of the target object.
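- As a non-limiting illustration, the position-and-orientation lookup described above might be sketched as follows. The `Landmark` record, the `find_target` helper, the 2000 m range, and the 30 degree half field of view are all hypothetical names and values chosen for this sketch; the source does not specify how the shooting range is evaluated.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Landmark:
    identifier: str  # hypothetical target object identifier
    lat: float       # latitude in degrees
    lon: float       # longitude in degrees


def distance_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres (haversine formula)."""
    r = 6371000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def bearing_deg(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Initial compass bearing from point 1 to point 2, in degrees [0, 360)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    y = math.sin(dlmb) * math.cos(p2)
    x = math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dlmb)
    return math.degrees(math.atan2(y, x)) % 360.0


def find_target(lat: float, lon: float, heading_deg: float,
                landmarks: List[Landmark],
                max_range_m: float = 2000.0,      # assumed shooting range
                half_fov_deg: float = 30.0) -> Optional[str]:
    """Return the identifier of the nearest landmark inside the assumed
    shooting range and field of view, or None if there is none."""
    best, best_dist = None, max_range_m
    for lm in landmarks:
        d = distance_m(lat, lon, lm.lat, lm.lon)
        # Smallest absolute angle between the device heading and the landmark bearing.
        diff = abs((bearing_deg(lat, lon, lm.lat, lm.lon) - heading_deg + 180.0) % 360.0 - 180.0)
        if d <= best_dist and diff <= half_fov_deg:
            best, best_dist = lm, d
    return best.identifier if best else None
```

- In this sketch the returned identifier plays the role of the target object identifier, which would then be used to look up the corresponding augmented reality model.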
- acquiring the augmented reality model of the target object may be implemented in the following manner:
- the basic image of the target object is determined from the target image, and the augmented reality model of the target object is determined according to the basic image.
- the basic image may be the image of the target object within the target image.
- the basic image may be analyzed by the network-side server, so as to determine what the target object in the basic image is.
- the target objects can be objects such as buildings, vehicles, clothing, shoes and hats.
- the above two implementation manners can be used to find the augmented reality model of the target object, and can also be used to check whether the augmented reality model of the target object is accurate.
- Step 130: Acquire the target audio data selected by the user, and determine time-series audio features according to the target audio data.
- the target audio data may be audio data such as songs or recordings selected by the user.
- the above audio data may be locally stored audio data. It can also be audio data selected by the user in the playlist provided by the server.
- the target audio data may also be audio data input by the user in real time.
- Audio feature detection is performed on the target audio data to obtain time-series audio features of the target audio data, where the audio features include one or a combination of accents, beats, and strong beats.
- Audio feature detection is used to analyze the locations or patterns of beats, accents, and strong beats in the target audio data.
- A beat is a unit representing the rhythm of the target audio data.
- A series of beats with a certain pattern of strong and weak beats is repeated at regular intervals, as in 2/4 time (two quarter-note beats per measure), 4/4 time (four quarter-note beats per measure), 3/4 time (three quarter-note beats per measure), etc.
- The beat changes periodically over time. Accents are the louder notes in the target audio data.
- A strong beat is the stressed beat within a measure.
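- The source does not specify a detection algorithm. As a minimal sketch under that caveat, accents (the louder notes) could be approximated by thresholding the short-time energy of the audio samples. The function names, the frame size, and the `mean + k * std` threshold are assumptions made for illustration; a production system would more likely use a dedicated beat-tracking method.

```python
import math
from typing import List, Sequence, Tuple


def frame_energies(samples: Sequence[float], frame_size: int) -> List[float]:
    """RMS energy of consecutive non-overlapping frames."""
    energies = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        energies.append(math.sqrt(sum(s * s for s in frame) / frame_size))
    return energies


def detect_accents(samples: Sequence[float], sample_rate: int,
                   frame_size: int = 1024, k: float = 1.5) -> List[Tuple[float, float]]:
    """Return time-ordered (time_in_seconds, strength) pairs for frames whose
    energy exceeds mean + k * std. Strength is normalised to the loudest frame."""
    energies = frame_energies(samples, frame_size)
    if not energies:
        return []
    mean = sum(energies) / len(energies)
    std = math.sqrt(sum((e - mean) ** 2 for e in energies) / len(energies))
    threshold = mean + k * std
    peak = max(energies)
    return [(i * frame_size / sample_rate, e / peak)
            for i, e in enumerate(energies)
            if peak > 0 and e > threshold]
```

- Each returned pair carries a timestamp, so the result is one possible representation of the time-series audio features that later steps consume.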
- Step 140: When outputting the target audio data, drive the augmented reality model according to the playback progress and audio features of the target audio data.
- the target audio data is output through the audio output module of the electronic device. While outputting the target audio data, the augmented reality model is driven according to the current playback progress and the target audio features.
- the overall color of the augmented reality model may be driven to change.
- The color change differs for strong beats, beats, and accents.
- the augmented reality model includes a plurality of model units.
- Driving the augmented reality model according to the playback progress and audio features of the target audio data includes: driving model units in the augmented reality model according to the playback progress and audio features of the target audio data.
- The augmented reality model can be composed of multiple model units, and each model unit can be a cube unit. Multiple cube units are spliced together to form the augmented reality model of the target object. The cubes can be processed in parallel by shaders. The model units in the augmented reality model are driven according to the playback progress and audio features of the target audio data.
- Driving the model units in the augmented reality model according to the playback progress and audio features of the target audio data includes:
- Step 1: Determine the target time and target amplitude for the morphological change of the model unit according to the time-series audio features.
- The times at which beats, strong beats, and accents occur can be determined as target times, and the target amplitude is determined according to the strength of the beat, strong beat, or accent.
- Step 2: When the playback progress of the target audio data reaches the target time, drive the model units in the augmented reality model according to the target amplitude.
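- Steps 1 and 2 can be sketched as a small scheduler. The `DriveEvent` type and both helper functions are hypothetical names for this sketch. Because a renderer samples the playback progress once per frame and rarely lands exactly on a target time, the sketch assumes an event is due when its target time falls between the previous and the current progress values.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class DriveEvent:
    target_time: float       # seconds from the start of the target audio data
    target_amplitude: float  # normalised to the range 0..1


def build_events(features: Iterable[Tuple[float, float]]) -> List[DriveEvent]:
    """Step 1: turn (time, strength) audio features into time-ordered events,
    clamping the strength into 0..1 to obtain the target amplitude."""
    events = [DriveEvent(t, max(0.0, min(1.0, s))) for t, s in features]
    return sorted(events, key=lambda e: e.target_time)


def due_events(events: List[DriveEvent], previous_progress: float,
               current_progress: float) -> List[DriveEvent]:
    """Step 2: events whose target time lies in (previous_progress, current_progress]."""
    return [e for e in events
            if previous_progress < e.target_time <= current_progress]
```

- A render loop would call `due_events` once per frame with the last and current playback positions and pass each returned amplitude to whatever routine drives the model units.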
- The driving mode of the model unit can include a bulging action, a color change, a transparency change, etc.
- Driving the model units in the augmented reality model according to the target amplitude includes:
- driving preset model units in the augmented reality model to perform a bulging action; or driving multiple model units in the augmented reality model to perform color changes; or driving multiple model units in the augmented reality model to perform transparency changes.
- When a preset model unit in the augmented reality model is driven to perform a bulging action, the preset model unit may be a randomly selected model unit. The bulge amplitude of the bulging action is determined according to the target amplitude.
- When multiple model units in the augmented reality model are driven to perform a color change, the color change may be applied to multiple model units randomly selected from all model units, or to all model units.
- Transparency refers to the transparency of the texture image of the model unit.
- the texture image can be a solid color or the actual texture pattern of the target object.
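- The three driving modes described above might look like the following on the CPU side. This is only a sketch: the `ModelUnit` fields, the maximum offset, the two colors, and the minimum alpha are assumed values, and the source indicates that the cubes can be processed in parallel by shaders rather than in a Python loop.

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Tuple

Color = Tuple[float, float, float]


@dataclass
class ModelUnit:
    offset: float = 0.0              # bulge distance along the unit's outward direction
    color: Color = (1.0, 1.0, 1.0)   # RGB components in 0..1
    alpha: float = 1.0               # 1.0 is opaque


def drive_bulge(units: List[ModelUnit], amplitude: float, count: int,
                max_offset: float = 0.5,
                rng: Optional[random.Random] = None) -> List[int]:
    """Bulge `count` randomly selected units; the bulge scales with the amplitude."""
    if rng is None:
        rng = random.Random()
    chosen = rng.sample(range(len(units)), min(count, len(units)))
    for i in chosen:
        units[i].offset = amplitude * max_offset
    return chosen


def drive_color(units: List[ModelUnit], amplitude: float,
                base: Color = (0.2, 0.4, 1.0),
                accent: Color = (1.0, 0.3, 0.2)) -> None:
    """Blend every unit from the base color toward the accent color."""
    blended = tuple(b + (a - b) * amplitude for b, a in zip(base, accent))
    for unit in units:
        unit.color = blended


def drive_transparency(units: List[ModelUnit], amplitude: float,
                       min_alpha: float = 0.3) -> None:
    """Make every unit more transparent as the amplitude grows."""
    for unit in units:
        unit.alpha = 1.0 - (1.0 - min_alpha) * amplitude
```

- Passing a seeded `random.Random` makes the random selection reproducible, which can help when comparing renders.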
- the method further includes:
- An emotional feature of the target audio data is acquired, and a first deformation feature of the augmented reality model is determined according to the emotional feature, where the first deformation feature is used to drive the shape of the augmented reality model to be consistent with the emotion expressed by the emotional feature; when outputting the target audio data, the augmented reality model is driven according to the first deformation feature.
- The emotional feature may be determined from the song title of the target audio data, or according to the text content entered by the user in the target audio.
- the emotional feature is used to represent the emotional bias of the audio content of the target audio data, such as cheerfulness, melancholy, etc.
- Deformation features corresponding to different emotional features can be preconfigured. For example, in the first deformation feature corresponding to the cheerful emotion feature, the bulge amplitude of the model unit is higher, and the bulge frequency is faster. For example, in the first deformation feature corresponding to the melancholic emotion feature, the bulge amplitude of the model unit is lower, and the bulge frequency is slower.
- Exemplarily, if the emotional feature is sadness, the TV tower model is displayed in a bent shape. If the building model is deformed, the original image of the building is removed by cropping, and the model is used to cover the original image area.
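- The preconfigured correspondence between emotional features and deformation features could be held in a simple lookup table. In the sketch below the keyword list, the numeric amplitudes and frequencies, and the neutral default are invented for illustration; the source only states that a cheerful emotion maps to a higher, faster bulge and a melancholic emotion to a lower, slower one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DeformationFeature:
    bulge_amplitude: float     # relative bulge height, 0..1
    bulge_frequency_hz: float  # how often the units bulge


# Preconfigured deformation features per emotional feature (assumed values).
PRECONFIGURED = {
    "cheerful": DeformationFeature(bulge_amplitude=0.8, bulge_frequency_hz=4.0),
    "melancholy": DeformationFeature(bulge_amplitude=0.2, bulge_frequency_hz=1.0),
}

# Hypothetical keyword table for guessing an emotion from a song title.
KEYWORDS = {"happy": "cheerful", "party": "cheerful",
            "sad": "melancholy", "rain": "melancholy"}

DEFAULT = DeformationFeature(bulge_amplitude=0.5, bulge_frequency_hz=2.0)


def emotion_from_title(title: str) -> Optional[str]:
    """Return the emotion for the first known keyword in the title, if any."""
    for word in title.lower().split():
        if word in KEYWORDS:
            return KEYWORDS[word]
    return None


def first_deformation_feature(title: str) -> DeformationFeature:
    """Look up the preconfigured deformation feature, falling back to a neutral default."""
    return PRECONFIGURED.get(emotion_from_title(title), DEFAULT)
```

- Keyword matching on the title is a deliberately crude stand-in; any classifier that yields an emotion label could feed the same table.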
- the method further includes:
- the user's limb movements can be obtained through limb detection.
- the user's expression can be obtained through face recognition.
- A second deformation feature is generated based on the limb movements or the expression. Exemplarily, after the TV tower is photographed, if the person's facial expression is sad, the second deformation feature is a stooped shape, and the TV tower model is displayed in a stooped shape. If the building model is deformed, the original image of the building is removed by cropping, and the model is used to cover the original image area.
- The augmented reality image processing method disclosed in the embodiments of the present disclosure can: acquire a target image in response to an image acquisition instruction triggered by a user, where the target image contains a target object; acquire an augmented reality model of the target object and output the augmented reality model in combination with the target object; acquire the target audio data selected by the user and determine time-series audio features according to the target audio data; and, when outputting the target audio data, drive the augmented reality model according to the playback progress and audio features of the target audio data.
- Current augmented reality models lack interactivity and are not easy to use. By comparison, the augmented reality image processing method disclosed in the embodiments of the present disclosure can, when the augmented reality model is output, drive the output of the augmented reality model in combination with the audio features of the target audio data selected by the user. The user can thus participate in the display process of the augmented reality model: by selecting different target audio data, the user drives the augmented reality model to display according to the audio features of that audio data, which improves ease of use.
- FIG. 2 is a schematic structural diagram of an image processing apparatus for augmented reality provided by Embodiment 2 of the present disclosure. This embodiment is applicable to a situation in which an augmented reality model is displayed.
- the method may be performed by an electronic device that implements augmented reality.
- The electronic device may be a smartphone, a tablet computer, or the like. The apparatus includes: a target image acquisition module 210, an augmented reality model acquisition module 220, a target audio acquisition module 230, an audio feature determination module 240, and an output module 250.
- a target image acquisition module 210 configured to acquire a target image in response to an image acquisition instruction triggered by a user, and the target image includes a target object;
- the augmented reality model acquisition module 220 is used to acquire the augmented reality model of the target object, and output the augmented reality model in combination with the target object;
- the target audio acquisition module 230 is used to acquire the target audio data selected by the user;
- the audio feature determination module 240 is configured to determine time-series audio features according to the target audio data;
- the output module 250 is configured to drive the augmented reality model according to the playback progress and audio features of the target audio data when outputting the target audio data.
- the audio feature determination module 240 is used for:
- the augmented reality model includes a plurality of model units; the output module 250 is used for:
- the model unit in the augmented reality model is driven.
- the output module 250 is used for:
- the model unit in the augmented reality model is driven according to the target amplitude.
- the output module 250 is used for:
- the first deformation feature acquisition module is used for:
- the augmented reality model is driven according to the first deformation feature.
- the second deformation feature acquisition module is used for:
- the augmented reality model is driven according to the second deformation feature.
- In the augmented reality image processing apparatus disclosed in the embodiments of the present disclosure, the target image acquisition module 210 acquires a target image in response to an image acquisition instruction triggered by a user, where the target image contains a target object; the augmented reality model acquisition module 220 acquires an augmented reality model of the target object and outputs the augmented reality model in combination with the target object; the target audio acquisition module 230 acquires the target audio data selected by the user, and the audio feature determination module 240 determines time-series audio features according to the target audio data; when outputting the target audio data, the output module 250 drives the augmented reality model according to the playback progress and audio features of the target audio data.
- Current augmented reality models lack interactivity and are not easy to use. By comparison, the augmented reality image processing apparatus disclosed in the embodiments of the present disclosure can, when outputting the augmented reality model, drive the output of the augmented reality model in combination with the audio features of the target audio data selected by the user. The user can thus participate in the display process of the augmented reality model: by selecting different target audio data, the user drives the augmented reality model to display according to the audio features of that audio data, which improves ease of use.
- the augmented reality image processing apparatus provided by the embodiment of the present invention can execute the augmented reality image processing method provided by any embodiment of the present invention, and has functional modules and beneficial effects corresponding to the execution method.
- Terminal devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, personal digital assistants (PDAs), tablet computers (PADs, portable Android devices), portable multimedia players (PMPs), and in-vehicle terminals (e.g., in-vehicle navigation terminals), as well as stationary terminals such as digital TVs and desktop computers.
- The electronic device 800 may include a processing device (e.g., a central processing unit, a graphics processor, etc.) 801, which may operate according to a program stored in a read-only memory (ROM) 802 or a program loaded from a storage device 808 into a random access memory (RAM) 803.
- the processing device 801, the ROM 802, and the RAM 803 are connected to each other through a bus 804.
- An input/output (I/O) interface 805 is also connected to the bus 804.
- The following devices can be connected to the I/O interface 805: input devices 806 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; output devices 807 including, for example, a liquid crystal display (LCD), speaker, vibrator, etc.; storage devices 808 including, for example, magnetic tape, hard disk, etc.; and a communication device 809.
- The communication device 809 may allow the electronic device 800 to communicate wirelessly or by wire with other devices to exchange data. While FIG. 3 shows the electronic device 800 having various devices, it should be understood that not all of the illustrated devices are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided.
- embodiments of the present disclosure include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the methods illustrated in the flowcharts.
- the computer program may be downloaded and installed from the network via the communication device 809, or from the storage device 808, or from the ROM 802.
- the steps in the method of the embodiment of the present disclosure are executed to realize the above-mentioned functions defined by the computer program.
- A computer program product includes a computer program stored in a readable storage medium; one or more processors of an electronic device can read the computer program from the readable storage medium, and the one or more processors execute the computer program, so that the electronic device executes the solution provided by any of the foregoing embodiments.
- One or more embodiments of the present disclosure provide a computer program, where the computer program is stored in a readable storage medium; one or more processors of an electronic device can read the computer program from the readable storage medium, and the one or more processors execute the computer program, so that the electronic device executes the solution provided by any of the foregoing embodiments.
- the computer-readable medium mentioned above in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the above two.
- The computer-readable storage medium can be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to, electrical connections with one or more wires, portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the above.
- a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
- a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave with computer-readable program code embodied thereon. Such propagated data signals may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
- A computer-readable signal medium can also be any computer-readable medium other than a computer-readable storage medium that can transmit, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
- the program code embodied on the computer readable medium may be transmitted by any suitable medium, including but not limited to: electric wire, optical fiber cable, radio frequency (RF), etc., or any suitable combination of the above.
- the above-mentioned computer-readable medium may be included in the above-mentioned electronic device; or may exist alone without being assembled into the electronic device.
- The above-mentioned computer-readable medium carries one or more programs, and when the one or more programs are executed by the electronic device, the electronic device: acquires at least two Internet Protocol addresses; sends a node evaluation request including the at least two Internet Protocol addresses to a node evaluation device, wherein the node evaluation device selects an Internet Protocol address from the at least two Internet Protocol addresses and returns it; and receives the Internet Protocol address returned by the node evaluation device; wherein the obtained Internet Protocol address indicates an edge node in the content distribution network.
- Alternatively, the above computer-readable medium carries one or more programs, and when the one or more programs are executed by the electronic device, the electronic device: receives a node evaluation request including at least two Internet Protocol addresses; selects an Internet Protocol address from the at least two Internet Protocol addresses; and returns the selected Internet Protocol address; wherein the received Internet Protocol address indicates an edge node in the content distribution network.
- Computer program code for carrying out operations of the present disclosure may be written in one or more programming languages, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
- the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server.
- the remote computer can be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or can be connected to an external computer (for example, via the Internet using an Internet service provider).
- each block in the flowchart or block diagrams may represent a module, segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions.
- the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
- each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented in dedicated hardware-based systems that perform the specified functions or operations, or can be implemented in a combination of dedicated hardware and computer instructions.
- the units involved in the embodiments of the present disclosure may be implemented in a software manner, and may also be implemented in a hardware manner.
- under certain circumstances, the name of a unit does not constitute a limitation on the unit itself; for example, the first obtaining unit may also be described as "a unit that obtains at least two Internet Protocol addresses".
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Multimedia (AREA)
- Human Computer Interaction (AREA)
- General Health & Medical Sciences (AREA)
- Software Systems (AREA)
- Computer Graphics (AREA)
- General Engineering & Computer Science (AREA)
- Computer Hardware Design (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Psychiatry (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- Acoustics & Sound (AREA)
- Architecture (AREA)
- Social Psychology (AREA)
- Oral & Maxillofacial Surgery (AREA)
- Signal Processing (AREA)
- Hospice & Palliative Care (AREA)
- Child & Adolescent Psychology (AREA)
- Processing Or Creating Images (AREA)
- User Interface Of Digital Computer (AREA)
Claims (12)
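- An image processing method for augmented reality, characterized by comprising: acquiring a target image in response to an image acquisition instruction triggered by a user, the target image containing a target object; acquiring an augmented reality model of the target object, and outputting the augmented reality model in combination with the target object; acquiring target audio data selected by the user, and determining time-sequenced audio features according to the target audio data; and, when outputting the target audio data, driving the augmented reality model according to the playback progress of the target audio data and the audio features.
- The method according to claim 1, characterized in that determining time-sequenced audio features according to the target audio data comprises: performing audio feature detection on the target audio data to obtain the time-sequenced audio features of the target audio data, the audio features comprising one or a combination of more of accents, downbeats, or beats.
- The method according to claim 1 or 2, characterized in that the augmented reality model comprises a plurality of model units; and driving the augmented reality model according to the playback progress of the target audio data and the audio features comprises: driving the model units in the augmented reality model according to the playback progress of the target audio data and the audio features.
- The method according to claim 3, characterized in that driving the model units in the augmented reality model according to the playback progress of the target audio data and the audio features comprises: determining, according to the time-sequenced audio features, a target time and a target amplitude at which the model units undergo a morphological change; and, if the playback progress of the target audio data is at the target time, driving the model units in the augmented reality model according to the target amplitude.
- The method according to claim 4, characterized in that driving the model units in the augmented reality model according to the target amplitude comprises: driving a preset model unit in the augmented reality model to perform a bulging action; or driving a plurality of model units in the augmented reality model to perform a color change; or driving a plurality of model units in the augmented reality model to perform a transparency change.
- The method according to any one of claims 1-5, characterized in that, after acquiring the target audio data selected by the user, the method further comprises: acquiring an emotion feature of the target audio data; determining a first deformation feature of the augmented reality model according to the emotion feature, the first deformation feature being used to drive the shape of the augmented reality model to be consistent with the emotion expressed by the emotion feature; and, when outputting the target audio data, driving the augmented reality model according to the first deformation feature.
- The method according to any one of claims 1-5, characterized in that, after acquiring the target audio data selected by the user, the method further comprises: acquiring a body movement or an expression of the user; determining a second deformation feature of the augmented reality model according to the body movement or expression, the second deformation feature being used to drive the shape of the augmented reality model to be consistent with the body movement or expression; and, when outputting the target audio data, driving the augmented reality model according to the second deformation feature.
- An image processing apparatus for augmented reality, characterized by comprising: a target image acquisition module, configured to acquire a target image in response to an image acquisition instruction triggered by a user, the target image containing a target object; an augmented reality model acquisition module, configured to acquire an augmented reality model of the target object and output the augmented reality model in combination with the target object; a target audio acquisition module, configured to acquire target audio data selected by the user; an audio feature determination module, configured to determine time-sequenced audio features according to the target audio data; and an output module, configured to, when outputting the target audio data, drive the augmented reality model according to the playback progress of the target audio data and the audio features.
- An electronic device, characterized in that the electronic device comprises: one or more processors; and a storage apparatus for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the image processing method for augmented reality according to any one of claims 1-7.
- A storage medium containing computer-executable instructions which, when executed by a computer processor, are used to perform the image processing method for augmented reality according to any one of claims 1-7.
- A computer program product, characterized by comprising computer program instructions that cause a computer to execute the image processing method for augmented reality according to any one of claims 1-7.
- A computer program, characterized in that the computer program causes a computer to execute the image processing method for augmented reality according to any one of claims 1-7.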
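
The claimed step of driving model units when playback reaches a feature's target time can be illustrated with a minimal Python sketch. It is not the applicant's implementation. The names `AudioFeature`, `ModelUnit`, `AudioDrivenModel`, the `bulge` attribute, and the `tolerance_s` parameter are all hypothetical, and the time-sequenced features (accents, downbeats, beats) are assumed to have been detected beforehand. The claims speak of the playback progress being at the target time; the sketch relaxes this to "has reached the target time, within a small tolerance" so that features are not missed between rendered frames.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AudioFeature:
    """A time-stamped audio feature (hypothetical representation)."""
    time_s: float    # target time: position of the feature in the track, in seconds
    kind: str        # e.g. "accent", "downbeat", or "beat"
    strength: float  # normalized 0.0..1.0, used here as the target amplitude


@dataclass
class ModelUnit:
    """One unit of the AR model; `bulge` is an assumed deformation value."""
    name: str
    bulge: float = 0.0


class AudioDrivenModel:
    """Drives model units from time-sequenced features and playback progress."""

    def __init__(self, units: List[ModelUnit], features: List[AudioFeature],
                 tolerance_s: float = 0.02) -> None:
        self.units = units
        # Sort by time so features can be consumed in playback order.
        self.features = sorted(features, key=lambda f: f.time_s)
        self.tolerance_s = tolerance_s
        self._next = 0  # index of the next feature that has not fired yet

    def update(self, playback_s: float) -> List[AudioFeature]:
        """Fire every pending feature whose target time has been reached.

        Returns the features fired by this call, in time order.
        """
        fired: List[AudioFeature] = []
        while (self._next < len(self.features)
               and self.features[self._next].time_s
               <= playback_s + self.tolerance_s):
            feature = self.features[self._next]
            # Apply the target amplitude to every unit (a bulging action).
            for unit in self.units:
                unit.bulge = feature.strength
            fired.append(feature)
            self._next += 1
        return fired
```

In a renderer, `update` would be called once per frame with the audio player's current position. A color or transparency change could be driven the same way by writing to a different attribute of each unit.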
Priority Applications (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2023501106A JP7674462B2 (ja) | 2020-07-10 | 2021-06-04 | Image processing method and apparatus for augmented reality, electronic device and storage medium |
| EP21837648.1A EP4167192B1 (en) | 2020-07-10 | 2021-06-04 | Image processing method and apparatus for augmented reality, electronic device and storage medium |
| US18/053,476 US11756276B2 (en) | 2020-07-10 | 2022-11-08 | Image processing method and apparatus for augmented reality, electronic device, and storage medium |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202010662819.X | 2020-07-10 | | |
| CN202010662819.XA CN111833460B (zh) | 2020-07-10 | 2020-07-10 | Image processing method and apparatus for augmented reality, electronic device and storage medium |
Related Child Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US18/053,476 Continuation US11756276B2 (en) | Image processing method and apparatus for augmented reality, electronic device, and storage medium | 2020-07-10 | 2022-11-08 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2022007565A1 (zh) | 2022-01-13 |
Family
ID=72899721
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2021/098456 Ceased WO2022007565A1 (zh) | Image processing method and apparatus for augmented reality, electronic device and storage medium | 2020-07-10 | 2021-06-04 |
Country Status (5)
| Country | Link |
|---|---|
| US (1) | US11756276B2 (zh) |
| EP (1) | EP4167192B1 (zh) |
| JP (1) | JP7674462B2 (zh) |
| CN (1) | CN111833460B (zh) |
| WO (1) | WO2022007565A1 (zh) |
Families Citing this family (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN111833460B (zh) * | 2020-07-10 | 2024-07-26 | 北京字节跳动网络技术有限公司 | Image processing method and apparatus for augmented reality, electronic device and storage medium |
| CN112288877B (zh) * | 2020-10-28 | 2024-12-13 | 北京字节跳动网络技术有限公司 | Video playback method and apparatus, electronic device and storage medium |
| CN112672185B (zh) * | 2020-12-18 | 2023-07-07 | 脸萌有限公司 | Augmented reality-based display method, apparatus, device and storage medium |
| CN113031781A (zh) * | 2021-04-16 | 2021-06-25 | 深圳市慧鲤科技有限公司 | Augmented reality resource display method and apparatus, electronic device and storage medium |
| WO2022222082A1 (zh) * | 2021-04-21 | 2022-10-27 | 深圳传音控股股份有限公司 | Image control method, mobile terminal and storage medium |
| US11769289B2 (en) | 2021-06-21 | 2023-09-26 | Lemon Inc. | Rendering virtual articles of clothing based on audio characteristics |
| JP2025056741A (ja) * | 2023-09-26 | 2025-04-08 | ソフトバンクグループ株式会社 | System |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
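| CN105608745A (zh) * | 2015-12-21 | 2016-05-25 | 大连新锐天地传媒有限公司 | AR display system applied to images or video |
| CN106506464A (zh) * | 2016-10-17 | 2017-03-15 | 武汉秀宝软件有限公司 | Augmented reality-based toy interaction method and system |
| WO2019029100A1 (zh) * | 2017-08-08 | 2019-02-14 | 山东科技大学 | Multi-interaction implementation method for mining operations based on virtual reality and augmented reality |
| CN109407918A (zh) * | 2018-09-25 | 2019-03-01 | 苏州梦想人软件科技有限公司 | Method for implementing multi-level interaction modes for augmented reality content |
| CN111833460A (zh) * | 2020-07-10 | 2020-10-27 | 北京字节跳动网络技术有限公司 | Image processing method and apparatus for augmented reality, electronic device and storage medium |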
Family Cites Families (20)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP3037865B2 (ja) * | 1994-04-01 | 2000-05-08 | シャープ株式会社 | Three-dimensional sprite drawing device |
| JPH0816820A (ja) * | 1994-04-25 | 1996-01-19 | Fujitsu Ltd | Three-dimensional animation creation device |
| JP4764663B2 (ja) * | 2005-05-30 | 2011-09-07 | 富士通株式会社 | Method for automatically recognizing geometric shapes in a virtual three-dimensional coordinate space, and three-dimensional CAD system and three-dimensional CAD program therefor |
| JP4919467B2 (ja) * | 2006-03-17 | 2012-04-18 | 三菱重工業株式会社 | Activity support device |
| US8370747B2 (en) * | 2006-07-31 | 2013-02-05 | Sony Mobile Communications Ab | Method and system for adapting a visual user interface of a mobile radio terminal in coordination with music |
| KR20100028858A (ko) * | 2008-09-05 | 2010-03-15 | 엔에이치엔(주) | System and method for providing an online music game |
| CN101577114B (zh) * | 2009-06-18 | 2012-01-25 | 无锡中星微电子有限公司 | Method and apparatus for implementing audio visualization |
| US9514570B2 (en) * | 2012-07-26 | 2016-12-06 | Qualcomm Incorporated | Augmentation of tangible objects as user interface controller |
| CN103544724A (zh) * | 2013-05-27 | 2014-01-29 | 华夏动漫集团有限公司 | System and method for realizing virtual animation characters on a mobile intelligent terminal using augmented reality and card recognition technology |
| JP6268287B2 (ja) * | 2014-06-20 | 2018-01-24 | 株式会社ソニー・インタラクティブエンタテインメント | Moving image generation device, moving image generation method, and program |
| US10445936B1 (en) * | 2016-08-01 | 2019-10-15 | Snap Inc. | Audio responsive augmented reality |
| WO2018139117A1 (ja) * | 2017-01-27 | 2018-08-02 | ソニー株式会社 | Information processing device, information processing method, and program therefor |
| US11232645B1 (en) * | 2017-11-21 | 2022-01-25 | Amazon Technologies, Inc. | Virtual spaces as a platform |
| CN108322802A (zh) * | 2017-12-29 | 2018-07-24 | 广州市百果园信息技术有限公司 | Sticker processing method for video images, computer-readable storage medium and terminal |
| CN108769535B (zh) * | 2018-07-04 | 2021-08-10 | 腾讯科技(深圳)有限公司 | Image processing method and apparatus, storage medium and computer device |
| US10679393B2 (en) * | 2018-07-24 | 2020-06-09 | Snap Inc. | Conditional modification of augmented reality object |
| CN109144610B (zh) * | 2018-08-31 | 2020-11-10 | 腾讯科技(深圳)有限公司 | Audio playback method and apparatus, electronic apparatus and computer-readable storage medium |
| CN110072047B (zh) * | 2019-01-25 | 2020-10-09 | 北京字节跳动网络技术有限公司 | Image deformation control method, apparatus and hardware apparatus |
| US10924875B2 (en) * | 2019-05-24 | 2021-02-16 | Zack Settel | Augmented reality platform for navigable, immersive audio experience |
| EP4295314A4 (en) * | 2021-02-08 | 2025-04-16 | Sightful Computers Ltd | EXTENDED REALITY CONTENT SHARING |
2020
- 2020-07-10: CN, application CN202010662819.XA, patent CN111833460B (zh), Active

2021
- 2021-06-04: WO, application PCT/CN2021/098456, patent WO2022007565A1 (zh), Ceased
- 2021-06-04: JP, application JP2023501106A, patent JP7674462B2 (ja), Active
- 2021-06-04: EP, application EP21837648.1A, patent EP4167192B1 (en), Active

2022
- 2022-11-08: US, application US18/053,476, patent US11756276B2 (en), Active
Non-Patent Citations (1)
| Title |
|---|
| See also references of EP4167192A4 * |
Also Published As
| Publication number | Publication date |
|---|---|
| CN111833460A (zh) | 2020-10-27 |
| JP2023533295A (ja) | 2023-08-02 |
| EP4167192A4 (en) | 2023-12-13 |
| US20230061012A1 (en) | 2023-03-02 |
| US11756276B2 (en) | 2023-09-12 |
| EP4167192A1 (en) | 2023-04-19 |
| EP4167192B1 (en) | 2025-07-09 |
| CN111833460B (zh) | 2024-07-26 |
| EP4167192C0 (en) | 2025-07-09 |
| JP7674462B2 (ja) | 2025-05-09 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
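| WO2022007565A1 (zh) | | Image processing method and apparatus for augmented reality, electronic device and storage medium |
| US11158102B2 (en) | | Method and apparatus for processing information |
| JP7199527B2 (ja) | | Image processing method, apparatus, and hardware apparatus |
| CN112672185B (zh) | | Augmented reality-based display method, apparatus, device and storage medium |
| WO2020248900A1 (zh) | | Panoramic video processing method and apparatus, and storage medium |
| WO2021057740A1 (zh) | | Video generation method and apparatus, electronic device and computer-readable medium |
| CN112051961A (zh) | | Virtual interaction method and apparatus, electronic device and computer-readable storage medium |
| CN113806306B (zh) | | Media file processing method, apparatus, device, readable storage medium and product |
| CN112132859B (zh) | | Sticker generation method, apparatus, medium and electronic device |
| US12555607B2 (en) | | Audio data processing method and apparatus, and device and storage medium |
| CN114461064A (zh) | | Virtual reality interaction method, apparatus, device and storage medium |
| CN109600559B (zh) | | Video special effect adding method and apparatus, terminal device and storage medium |
| CN110069191B (zh) | | Terminal-based method and apparatus for implementing image drag deformation |
| WO2020155915A1 (zh) | | Method and apparatus for playing audio |
| CN111652675A (zh) | | Display method, apparatus and electronic device |
| US11810336B2 (en) | | Object display method and apparatus, electronic device, and computer readable storage medium |
| WO2020233143A1 (zh) | | Playback progress display method and apparatus, electronic device and storage medium |
| CN114677738A (zh) | | MV recording method and apparatus, electronic device and computer-readable storage medium |
| WO2025113515A1 (zh) | | Augmented reality method and apparatus, electronic device and storage medium |
| WO2022012349A1 (zh) | | Animation processing method and apparatus, electronic device and storage medium |
| CN112380380A (zh) | | Method, apparatus and device for displaying lyrics, and computer-readable storage medium |
| CN113687902A (zh) | | Resource display method and apparatus, computer device and storage medium |
| CN114329001B (zh) | | Method and apparatus for displaying dynamic pictures, electronic device and storage medium |
| CN111292773A (zh) | | Audio and video synthesis method and apparatus, electronic device and medium |
| JP2025519066A (ja) | | Method and apparatus for determining special effect video, electronic device and storage medium |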
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 21837648; Country of ref document: EP; Kind code of ref document: A1 |
| | ENP | Entry into the national phase | Ref document number: 2023501106; Country of ref document: JP; Kind code of ref document: A |
| | ENP | Entry into the national phase | Ref document number: 2021837648; Country of ref document: EP; Effective date: 20230112 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | WWG | WIPO information: grant in national office | Ref document number: 2021837648; Country of ref document: EP |