WO2024002065A1 - Video encoding method and apparatus, electronic device, and medium - Google Patents

Video encoding method and apparatus, electronic device, and medium

Info

Publication number
WO2024002065A1
WO2024002065A1 PCT/CN2023/102731 CN2023102731W WO2024002065A1 WO 2024002065 A1 WO2024002065 A1 WO 2024002065A1 CN 2023102731 W CN2023102731 W CN 2023102731W WO 2024002065 A1 WO2024002065 A1 WO 2024002065A1
Authority
WO
WIPO (PCT)
Prior art keywords
video frame
information
pose information
target
pixel block
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2023/102731
Other languages
English (en)
Chinese (zh)
Inventor
高立鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Vivo Mobile Communication Co Ltd
Original Assignee
Vivo Mobile Communication Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Vivo Mobile Communication Co Ltd
Publication of WO2024002065A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51 Motion estimation or motion compensation
    • H04N19/513 Processing of motion vectors
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136 Incoming video signal characteristics or properties
    • H04N19/137 Motion inside a coding unit, e.g. average field, frame or block difference
    • H04N19/139 Analysis of motion vectors, e.g. their magnitude, direction, variance or reliability
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/182 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a pixel

Definitions

  • This application belongs to the field of video processing technology, and specifically relates to a video encoding method, device, electronic equipment and media.
  • The purpose of the embodiments of the present application is to provide a video encoding method, device, electronic equipment and medium that can solve the problem of video encoding taking a long time.
  • In a first aspect, embodiments of the present application provide a video encoding method.
  • The method includes: obtaining relative pose information of a first video frame and a second video frame, where the first video frame is a video frame to be encoded and the second video frame is a reference video frame; determining, according to the relative pose information, the motion vector and residual information of a first pixel block, where the first pixel block is the pixel block to be encoded in the first video frame; and encoding the motion vector and the residual information.
  • Embodiments of the present application provide a video encoding device.
  • the video encoding device includes: an acquisition module, a determination module, and an encoding module.
  • the acquisition module is used to acquire the relative pose information of a first video frame and a second video frame
  • the first video frame is the video frame to be encoded
  • the second video frame is the reference video frame.
  • the determination module is configured to determine the motion vector and residual information of the first pixel block based on the relative pose information obtained by the acquisition module.
  • the encoding module is used to encode the motion vector and residual information determined by the determination module.
  • Embodiments of the present application provide an electronic device.
  • the electronic device includes a processor and a memory.
  • the memory stores programs or instructions that can be run on the processor.
  • When the programs or instructions are executed by the processor, the steps of the method described in the first aspect are implemented.
  • embodiments of the present application provide a readable storage medium.
  • Programs or instructions are stored on the readable storage medium.
  • When the programs or instructions are executed by a processor, the steps of the method described in the first aspect are implemented.
  • Embodiments of the present application provide a chip.
  • the chip includes a processor and a communication interface.
  • the communication interface is coupled to the processor.
  • The processor is used to run programs or instructions to implement the steps of the method described in the first aspect.
  • embodiments of the present application provide a computer program product, the program product is stored in a storage medium, and the program product is executed by at least one processor to implement the steps of the method described in the first aspect.
  • The electronic device can obtain the relative pose information of the first video frame and the second video frame, and determine the motion vector and residual information of the first pixel block to be encoded in the first video frame based on the relative pose information, so that the electronic device can encode the motion vector and residual information. Since the electronic device can determine the motion vector and residual information of the first pixel block directly from the relative pose information, without first searching the second video frame for a pixel block matching the first pixel block, the time it takes to determine the motion vector and residual information is reduced. This reduces the time it takes the electronic device to encode the first pixel block and, in turn, the time it takes to encode the video.
  • Figure 1 is the first schematic flowchart of the video encoding method provided by the embodiment of the present application.
  • Figure 2 is the second schematic flowchart of the video encoding method provided by the embodiment of the present application.
  • Figure 3 is the third schematic flowchart of the video encoding method provided by the embodiment of the present application.
  • Figure 4 is the fourth schematic flowchart of the video encoding method provided by the embodiment of the present application.
  • Figure 5 is the fifth schematic flowchart of the video encoding method provided by the embodiment of the present application.
  • Figure 6 is the sixth schematic flowchart of the video encoding method provided by the embodiment of the present application.
  • Figure 7 is the seventh schematic flowchart of the video encoding method provided by the embodiment of the present application.
  • Figure 8 is a schematic diagram of the mapping relationship between the first pixel block and the second pixel block provided by the embodiment of the present application.
  • Figure 9 is a schematic structural diagram of a video encoding device provided by an embodiment of the present application.
  • Figure 10 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • Figure 11 is a schematic diagram of the hardware structure of an electronic device provided by an embodiment of the present application.
  • The terms "first", "second", etc. in the description and claims of this application are used to distinguish similar objects and are not used to describe a specific order or sequence. It is to be understood that the terms so used are interchangeable under appropriate circumstances, so that the embodiments of the present application can be practiced in orders other than those illustrated or described herein. The objects distinguished by "first", "second", etc. are usually of one type, and the number of objects is not limited. For example, the first object can be one or multiple.
  • “and/or” in the description and claims indicates at least one of the connected objects, and the character “/" generally indicates that the related objects are in an "or” relationship.
  • The electronic device can directly obtain the relative pose information of a given video frame and its reference video frame, determine the motion vector and residual information of a given pixel block of that video frame based on the relative pose information, and then encode the motion vector and residual information. In other words, the electronic device determines the motion vector and residual information of the pixel block directly from the relative pose information of the video frame and the reference video frame, without using the size of the pixel block as a sliding window to search all the pixel blocks of the reference video frame one by one for a matching block.
  • This reduces the time it takes for the electronic device to determine the motion vector and residual information of the pixel block, and therefore the time it takes to encode the pixel block. In this way, the time it takes for the electronic device to encode the video can be reduced.
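  • For context, the sliding-window search that the method avoids can be sketched as follows. This is a generic full-search block matcher using the sum of absolute differences (SAD), not code from the application; the names `sad` and `full_search`, the list-of-lists frame layout, and the SAD cost are all illustrative assumptions.

```python
def sad(frame_a, ax, ay, frame_b, bx, by, size):
    """Sum of absolute differences between two size x size blocks."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(frame_a[ay + dy][ax + dx] - frame_b[by + dy][bx + dx])
    return total


def full_search(cur, ref, x, y, size):
    """Slide a size x size window over every position of `ref` and return
    (best_x, best_y, best_cost) for the block of `cur` whose corner is (x, y)."""
    height, width = len(ref), len(ref[0])
    best = None
    for ry in range(height - size + 1):
        for rx in range(width - size + 1):
            cost = sad(cur, x, y, ref, rx, ry, size)
            if best is None or cost < best[2]:
                best = (rx, ry, cost)
    return best


# Toy 6x6 frames (assumed data): the reference frame holds a 2x2 patch at
# x=1, y=1; in the current frame the same patch sits at x=3, y=2.
ref = [[0] * 6 for _ in range(6)]
cur = [[0] * 6 for _ in range(6)]
for dy in range(2):
    for dx in range(2):
        ref[1 + dy][1 + dx] = 200 + dx + 2 * dy
        cur[2 + dy][3 + dx] = 200 + dx + 2 * dy

match = full_search(cur, ref, 3, 2, 2)
```

  • Every candidate position costs one SAD evaluation, so the work per block grows with the frame area; this is the cost that a pose-based lookup sidesteps.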
  • Figure 1 shows a flow chart of a video encoding method provided by an embodiment of the present application.
  • the video encoding method provided by the embodiment of the present application may include the following steps 101 to 103.
  • Step 101 The electronic device obtains the relative pose information of the first video frame and the second video frame.
  • The electronic device may be any of the following: a mobile phone, a tablet computer, a laptop computer, a wearable device, an extended reality (XR) device, etc.
  • XR equipment can specifically be XR head-mounted display equipment, such as XR glasses, XR helmets, etc.
  • the first video frame is a video frame to be encoded
  • the second video frame is a reference video frame
  • the first video frame may be a video frame currently being encoded by the electronic device; the second video frame may be a specific video frame, or may be a video frame preceding the first video frame.
  • A "video frame preceding the first video frame" can be understood as a video frame that comes before the first video frame in the acquired video frame sequence.
  • When the electronic device opens an XR application, the camera of the electronic device can be turned on and N video frames can be obtained through the camera, where N is a positive integer, so that the electronic device can encode the N video frames.
  • When encoding the first video frame, the electronic device can obtain the relative pose information.
  • the N video frames may all be XR video frames.
  • The first video frame and the second video frame may both be video frames among the N video frames.
  • The electronic device may first obtain the pose information of the first video frame in the first coordinate system and the pose information of the second video frame in the first coordinate system, and then determine the relative pose information based on these two pieces of pose information.
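  • Deriving a relative pose from two poses expressed in one shared coordinate system can be sketched as below. The source does not fix a pose convention, so this sketch assumes each pose is a camera-to-world pair (R, t) with X_world = R * X_cam + t; the helper names are hypothetical.

```python
def transpose(m):
    """Transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    """Matrix product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def matvec(m, v):
    """Matrix-vector product."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]


def relative_pose(r1, t1, r2, t2):
    """Return (R_rel, t_rel) such that X_cam2 = R_rel * X_cam1 + t_rel,
    assuming both input poses are camera-to-world."""
    r2_t = transpose(r2)  # inverse of a rotation matrix is its transpose
    r_rel = matmul(r2_t, r1)
    t_rel = matvec(r2_t, [a - b for a, b in zip(t1, t2)])
    return r_rel, t_rel


# Assumed example: frame 1 at the world origin with identity orientation;
# frame 2 rotated 90 degrees about z and located at (1, 0, 0).
r1 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
t1 = [0, 0, 0]
r2 = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]
t2 = [1, 0, 0]
r_rel, t_rel = relative_pose(r1, t1, r2, t2)
```

  • With a world-to-camera convention the composition order would change, so the convention has to be settled before the result is used for mapping pixel blocks.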
  • the electronic device may obtain the pose information of the first video frame in the first coordinate system and the pose information of the second video frame in the first coordinate system through the following various examples.
  • The electronic device can directly perform calculations based on either one of the first video frame and the second video frame to obtain the pose information of that video frame in the first coordinate system.
  • The electronic device may perform calculations based on either one of the first video frame and the second video frame and the motion information corresponding to that video frame (that is, the motion information obtained by the inertial measurement unit (IMU) when that video frame was acquired) to obtain the pose information of that video frame in the first coordinate system.
  • the electronic device can calculate the pose information of any video frame in the first coordinate system based on the motion tracking image frame and the motion information corresponding to the motion tracking image frame. It should be noted that the timing of the motion tracking image frame is synchronized with the timing of any one of the first video frame and the second video frame.
  • the motion information corresponding to the motion tracking image frame refers to the motion information obtained by the IMU when acquiring the motion tracking image frame.
  • The electronic device can collect motion tracking image frames through at least one motion tracking camera, such as a fisheye camera, and use a timing synchronization module to synchronize the timing of the motion tracking image frame with the timing of the video frame, so that the electronic device can obtain the pose information of the video frame in the first coordinate system based on the motion tracking image frame and the motion information corresponding to the motion tracking image frame.
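  • The source does not describe how the timing synchronization module works. One simple possibility, shown here purely as an assumption, is to pair each video frame with the motion tracking frame whose timestamp is nearest; `nearest_index` and the millisecond timestamps are hypothetical.

```python
from bisect import bisect_left


def nearest_index(sorted_ts, target):
    """Index of the entry in the ascending list `sorted_ts` closest to
    `target`; ties go to the earlier entry."""
    i = bisect_left(sorted_ts, target)
    if i == 0:
        return 0
    if i == len(sorted_ts):
        return len(sorted_ts) - 1
    before, after = sorted_ts[i - 1], sorted_ts[i]
    return i if after - target < target - before else i - 1


# Assumed timestamps in milliseconds.
tracking_ts = [0, 33, 66, 100, 133]   # motion tracking camera frames
video_ts = [30, 70, 101]              # RGB video frames
pairs = [(t, tracking_ts[nearest_index(tracking_ts, t)]) for t in video_ts]
```

  • A hardware trigger shared by the cameras would remove the need for this matching step; the sketch covers only the software fallback.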
  • The electronic device can directly obtain the pose information, in the first coordinate system, of the video frame preceding either one of the first video frame and the second video frame, and then calculate the pose information of that video frame in the first coordinate system based on the pose information of the preceding video frame and the motion information corresponding to that video frame.
  • The motion information corresponding to a video frame refers to the motion information obtained by the IMU when that video frame is acquired.
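  • Updating a pose from the preceding frame's pose plus IMU readings can be illustrated with a heavily simplified planar model. Everything here is an assumption for illustration: the pose is (x, y, yaw), acceleration is taken as already expressed in the first coordinate system with gravity removed, and a velocity estimate is carried alongside the pose. A real visual-inertial pipeline is considerably more involved.

```python
def propagate_pose(pose, velocity, accel, yaw_rate, dt):
    """Constant-acceleration, constant-yaw-rate update over one frame interval.

    pose     -- (x, y, yaw) of the preceding frame
    velocity -- (vx, vy) at the preceding frame
    accel    -- (ax, ay) measured over the interval (assumed world-aligned)
    yaw_rate -- angular velocity about the vertical axis, rad/s
    dt       -- time between the two frames, seconds
    """
    x, y, yaw = pose
    vx, vy = velocity
    ax, ay = accel
    new_pose = (
        x + vx * dt + 0.5 * ax * dt * dt,
        y + vy * dt + 0.5 * ay * dt * dt,
        yaw + yaw_rate * dt,
    )
    new_velocity = (vx + ax * dt, vy + ay * dt)
    return new_pose, new_velocity


# Assumed readings for one 100 ms frame interval.
new_pose, new_velocity = propagate_pose(
    pose=(0.0, 0.0, 0.0), velocity=(1.0, 0.0),
    accel=(0.0, 2.0), yaw_rate=0.5, dt=0.1)
```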
  • While the electronic device obtains the pose information of one of the first video frame and the second video frame in the first coordinate system, it can also obtain, through the above examples, the pose information of the other video frame, that is, the one of the first video frame and the second video frame that is not the video frame just mentioned.
  • the way in which the electronic device obtains the pose information of any video frame in the first coordinate system may be the same as the way in which it obtains the pose information of another video frame in the first coordinate system, or may be different.
  • For example, the electronic device can perform calculations based on the first video frame to obtain the pose information of the first video frame in the first coordinate system, and perform calculations based on the second video frame to obtain the pose information of the second video frame in the first coordinate system. In this case, the way in which the electronic device obtains the pose information of the first video frame in the first coordinate system is the same as the way in which it obtains the pose information of the second video frame in the first coordinate system.
  • As another example, the electronic device can perform calculations based on the first video frame to obtain the pose information of the first video frame in the first coordinate system, and calculate the pose information of the second video frame in the first coordinate system based on the second video frame and the motion information corresponding to the second video frame. In this case, the way in which the electronic device obtains the pose information of the first video frame in the first coordinate system differs from the way in which it obtains the pose information of the second video frame in the first coordinate system.
  • Step 102 The electronic device determines the motion vector and residual information of the first pixel block based on the relative pose information.
  • the first pixel block is a pixel block to be encoded in the first video frame.
  • the first pixel block may specifically be a pixel block currently being encoded by the electronic device.
  • step 102 can be specifically implemented through the following steps 102a and 102b.
  • Step 102a The electronic device determines a second pixel block in the second video frame that matches the first pixel block based on the relative pose information.
  • Any pair of matching feature points (pixel blocks) will be located on each other's epipolar lines, that is, any pair of matching feature points (pixel blocks) satisfies the epipolar constraint. Therefore, the electronic device can determine the pixel block in the second video frame that matches the first pixel block based on the relative pose information.
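  • The epipolar constraint mentioned here can be checked numerically. The sketch below uses the standard essential-matrix form x2^T ([t]x R) x1 = 0 on pixel coordinates normalized by a pinhole intrinsic matrix; the convention X_cam2 = R * X_cam1 + t, the intrinsic values, and all function names are assumptions, not taken from the application.

```python
def matmul(a, b):
    """Matrix product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def matvec(m, v):
    """Matrix-vector product."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]


def skew(t):
    """Cross-product matrix [t]x, so that skew(t) * v equals t x v."""
    return [[0, -t[2], t[1]], [t[2], 0, -t[0]], [-t[1], t[0], 0]]


def normalize(pixel, fx, fy, cx, cy):
    """Apply the inverse pinhole intrinsic matrix to a pixel (u, v)."""
    u, v = pixel
    return [(u - cx) / fx, (v - cy) / fy, 1.0]


def project(point, fx, fy, cx, cy):
    """Pinhole projection of a 3D point in camera coordinates."""
    x, y, z = point
    return (fx * x / z + cx, fy * y / z + cy)


def epipolar_residual(p1, p2, r, t, fx, fy, cx, cy):
    """x2^T E x1 with E = [t]x R; zero for a geometrically consistent pair."""
    x1 = normalize(p1, fx, fy, cx, cy)
    x2 = normalize(p2, fx, fy, cx, cy)
    e_x1 = matvec(matmul(skew(t), r), x1)
    return sum(a * b for a, b in zip(x2, e_x1))


# Assumed setup: 90 degree rotation about z plus a sideways translation.
fx = fy = 100.0
cx = cy = 50.0
r = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]
t = [0.5, 0.0, 0.0]
point_cam1 = [1.0, 2.0, 4.0]
point_cam2 = [a + b for a, b in zip(matvec(r, point_cam1), t)]
p1 = project(point_cam1, fx, fy, cx, cy)
p2 = project(point_cam2, fx, fy, cx, cy)

residual_match = epipolar_residual(p1, p2, r, t, fx, fy, cx, cy)
residual_mismatch = epipolar_residual(p1, (p2[0], p2[1] + 20.0), r, t, fx, fy, cx, cy)
```

  • The constraint narrows a match to a line rather than a single point, so some additional information (for example depth) is needed to pin down the exact position.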
  • The electronic device may determine the second pixel block by calculating its position information from the position information of the first pixel block according to the relative pose information.
  • The position information of the first pixel block may be the position information of the first pixel block in the first video frame; the position information of the second pixel block may be the position information of the second pixel block in the second video frame.
  • step 102a may be specifically implemented through the following steps 102a1 and 102a2.
  • Step 102a1 The electronic device determines the mapping parameters based on the relative pose information and the target internal parameter matrix.
  • The above-mentioned target internal parameter matrix is the internal parameter matrix of the first camera, and the above-mentioned first video frame and second video frame are obtained by the first camera; the above-mentioned mapping parameters are used to indicate the mapping relationship between pixel blocks of the first video frame and pixel blocks of the second video frame.
  • the above-mentioned first camera may specifically be: a color (red green blue, RGB) camera.
  • The relative pose information may include a translation vector and a rotation matrix, such as the third translation vector and the third rotation matrix in the following embodiments.
  • The translation vector is the relative translation vector, in the first coordinate system, between the moment the first video frame is acquired and the moment the second video frame is acquired.
  • The rotation matrix is the relative rotation matrix, in the first coordinate system, between the moment the first video frame is acquired and the moment the second video frame is acquired. The electronic device can therefore use the epipolar constraint equation to calculate the fourth translation vector based on the translation vector, and use the epipolar constraint equation to calculate the fourth rotation matrix based on the rotation matrix, to obtain the mapping parameters.
  • the first coordinate system may be a world coordinate system.
  • mapping parameters include a fourth translation vector and a fourth rotation matrix.
  • Step 102a2 The electronic device calculates the position information of the second pixel block based on the position information and mapping parameters of the first pixel block.
  • The position information of the first pixel block may be the coordinate information of the first pixel block in the first video frame; the position information of the second pixel block may be the coordinate information of the second pixel block in the second video frame.
  • The electronic device may translate the coordinate information of the first pixel block according to the fourth translation vector and rotate it according to the fourth rotation matrix, so as to obtain the coordinate information of the second pixel block.
  • When the electronic device determines the second pixel block corresponding to the first pixel block, it can determine the mapping parameters directly from the relative pose information and the target internal parameter matrix, and use the mapping parameters to calculate the position information of the second pixel block from the position information of the first pixel block, without searching the second video frame for the second pixel block that matches the first pixel block. This removes the time spent searching for the second pixel block and therefore reduces the time the electronic device takes to encode the video.
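  • The source does not spell out how the mapping parameters are applied to pixel coordinates. As a stand-in, the sketch below uses standard pinhole reprojection: back-project the pixel with the intrinsic parameters, apply a rotation and translation, and project again. The per-block depth value is an added assumption (the source only mentions depth as possible image information), as are the function name and the numeric values.

```python
def map_pixel(pixel, depth, r, t, fx, fy, cx, cy):
    """Map a pixel of the frame being encoded into the reference frame.

    Assumes X_ref = R * X_cur + t for points in camera coordinates and a
    known depth (distance along the optical axis) for the pixel.
    """
    u, v = pixel
    # Back-project to a 3D point in the current camera's coordinates.
    x_cur = [(u - cx) / fx * depth, (v - cy) / fy * depth, depth]
    # Move the point into the reference camera's coordinates.
    x_ref = [sum(r[i][k] * x_cur[k] for k in range(3)) + t[i] for i in range(3)]
    # Project back onto the reference image plane.
    return (fx * x_ref[0] / x_ref[2] + cx, fy * x_ref[1] / x_ref[2] + cy)


# Assumed values: identity rotation, 0.5 unit sideways translation,
# focal length 100 px, principal point (50, 50), block depth 4 units.
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
mapped = map_pixel((75.0, 100.0), 4.0, identity, [0.5, 0.0, 0.0],
                   100.0, 100.0, 50.0, 50.0)
```

  • For a pixel block, the same mapping could be applied to the block's corner or centre; which reference point to use is a design choice the source leaves open.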
  • Step 102b The electronic device determines the motion vector and residual information of the first pixel block based on the position information of the first pixel block and the second pixel block.
  • the electronic device may calculate the motion vector of the first pixel block based on the position information of the first pixel block and the position information of the second pixel block.
  • the electronic device may determine the difference between the coordinate information of the first pixel block and the coordinate information of the second pixel block as the motion vector of the first pixel block.
  • the above residual information is used to indicate the residual between the first pixel block and the second pixel block.
  • The above-mentioned residual information may specifically include brightness residual information and color difference residual information. The electronic device can determine the difference between the brightness component of the first pixel block and the brightness component of the second pixel block as the brightness residual information, and determine the difference between the color difference component of the first pixel block and the color difference component of the second pixel block as the color difference residual information, thereby determining the residual information.
  • The electronic device can directly determine the second pixel block in the second video frame that matches the first pixel block based on the relative pose information, without searching the second video frame for it. Therefore, the electronic device can quickly determine the motion vector and residual information of the first pixel block based on the position information of the first pixel block and the second pixel block, which improves the efficiency with which the electronic device obtains the motion vector and residual information of the first pixel block.
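  • Step 102b can be sketched as two small functions. The sign convention for the motion vector (reference position minus current position), the dictionary layout with `y`, `cb` and `cr` planes, and the sample values are assumptions for illustration; the source states only that the motion vector is a coordinate difference and the residuals are component differences.

```python
def motion_vector(pos_cur, pos_ref):
    """Displacement from the block being encoded to its matched reference block."""
    return (pos_ref[0] - pos_cur[0], pos_ref[1] - pos_cur[1])


def plane_residual(cur_plane, ref_plane):
    """Element-wise difference (current minus reference) of two sample planes."""
    return [[c - r for c, r in zip(cur_row, ref_row)]
            for cur_row, ref_row in zip(cur_plane, ref_plane)]


# Assumed 2x2 luma block with 4:2:0-style single chroma samples.
cur_block = {"y": [[120, 121], [119, 120]], "cb": [[64]], "cr": [[70]]}
ref_block = {"y": [[120, 120], [119, 121]], "cb": [[64]], "cr": [[69]]}

mv = motion_vector((16, 8), (13, 8))
residual = {name: plane_residual(cur_block[name], ref_block[name])
            for name in cur_block}
```

  • When the matched block is a good prediction, most residual entries are zero or close to it, which is what makes the later encoding step effective.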
  • Step 103 The electronic device encodes the motion vector and residual information.
  • The residual information may contain many zero values, so when the electronic device encodes the motion vector and the residual information, those zero values can be compressed, improving the efficiency of video compression.
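  • Why runs of zeros compress well can be shown with a toy run-length coder. This is a stand-in technique chosen for brevity; practical video encoders apply transform, quantization and entropy coding instead, and the token format and function names below are hypothetical.

```python
def encode_zero_runs(values):
    """Collapse runs of zeros into ("Z", run_length) tokens and keep
    non-zero values as ("V", value) tokens."""
    tokens = []
    run = 0
    for value in values:
        if value == 0:
            run += 1
            continue
        if run:
            tokens.append(("Z", run))
            run = 0
        tokens.append(("V", value))
    if run:
        tokens.append(("Z", run))
    return tokens


def decode_zero_runs(tokens):
    """Inverse of encode_zero_runs."""
    values = []
    for kind, payload in tokens:
        if kind == "Z":
            values.extend([0] * payload)
        else:
            values.append(payload)
    return values


# Assumed flattened residual of a 4x4 block with two non-zero samples.
flat_residual = [0, 0, 0, 0, 1, 0, 0, -1, 0, 0, 0, 0, 0, 0, 0, 0]
tokens = encode_zero_runs(flat_residual)
```

  • Sixteen samples become five tokens here; the better the predicted block matches, the longer the zero runs and the fewer tokens remain.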
  • After the electronic device encodes the motion vector and residual information, it can transmit the encoded video stream to the recipient of the XR application.
  • When the electronic device encodes the first pixel block in the first video frame, it can obtain the relative pose information and directly determine the motion vector and residual information of the first pixel block based on the relative pose information, so that the electronic device can encode the motion vector and residual information.
  • The electronic device can obtain the relative pose information of the first video frame and the second video frame, and determine the motion vector and residual information of the first pixel block to be encoded in the first video frame based on the relative pose information, so that the electronic device can encode the motion vector and residual information.
  • The electronic device can determine the motion vector and residual information of the first pixel block directly from the obtained relative pose information, without first searching the second video frame for a pixel block matching the first pixel block. This reduces the time it takes for the electronic device to determine the motion vector and residual information of the first pixel block, which reduces the time it takes to encode the first pixel block and, in turn, the time it takes to encode the video.
  • Since the electronic device does not need to search the second video frame for the second pixel block matching the first pixel block, the computing power and energy that a search algorithm would consume are saved. In addition, when the user makes a video call through an XR application (such as a holographic video call, a remote expert system, or telemedicine), the reduced encoding time improves the real-time performance of the video call.
  • the first video frame and the second video frame are acquired by a second camera.
  • The video encoding method provided by the embodiment of the present application can also include the following step 201, and the above step 101 can be implemented through the following step 101a.
  • Step 201 The electronic device obtains the first pose information and the second pose information.
  • the above-mentioned first pose information is the pose information of the first video frame in the first coordinate system
  • the above-mentioned second pose information is the pose information of the second video frame in the first coordinate system
  • the first coordinate system may be a world coordinate system.
  • the second camera may be an RGB camera.
  • the second camera can be the same camera as the first camera.
  • step 201 can be specifically implemented through the following step 201a.
  • Step 201a The electronic device calculates the first target pose information based on the image information of the first target video frame.
  • the first target video frame is a first video frame or a second video frame.
  • When the first target video frame is the first video frame, the first target pose information is the first pose information; when the first target video frame is the second video frame, the first target pose information is the second pose information.
  • the image information of the first target video frame may specifically include at least one of the following: position information of feature points in the first target video frame, and depth information of the first target video frame.
  • the electronic device may use a simultaneous localization and mapping (SLAM) algorithm to calculate the first target pose information based on the image information of the first target video frame.
  • the electronic device can directly calculate the first pose information (or the second pose information) based on the image information of the first video frame (or the second video frame), so that the electronic device can quickly obtain the relative pose information, Therefore, the time-consuming video encoding performed by the electronic device can be further reduced.
  • step 201 can be specifically implemented through the following steps 201b to 201d.
  • Step 201b The electronic device obtains the second target video frame and the first motion information.
  • the second target video frame is a first video frame or a second video frame.
  • the above-mentioned first motion information is the motion information obtained by the IMU when the second camera acquires the second target video frame.
  • the above-mentioned first motion information may be at least one of the following: acceleration information and angular velocity information.
  • When the user watches XR images or makes a video call through an XR application, the electronic device can obtain the second target video frame through the second camera and obtain the first motion information through the IMU.
  • Step 201c The electronic device calculates the third pose information based on the second target video frame, the first motion information and the first external parameter matrix.
  • The above-mentioned first external parameter matrix is the external parameter matrix between the second camera and the IMU; the above-mentioned third pose information is the pose information of the IMU in the first coordinate system.
  • The first external parameter matrix is pre-stored in the electronic device, so that the electronic device can directly obtain the first external parameter matrix and calculate the third pose information based on the second target video frame, the first motion information and the first external parameter matrix.
  • the electronic device may use a SLAM algorithm to calculate the third pose information based on the second target video frame, the first motion information and the first external parameter matrix.
  • Step 201d The electronic device calculates the second target pose information based on the third pose information and the first external parameter matrix.
  • When the second target video frame is the first video frame, the second target pose information is the first pose information; when the second target video frame is the second video frame, the second target pose information is the second pose information.
  • the electronic device can obtain the first motion information when acquiring the first video frame or the second video frame, the electronic device can directly calculate the first posture information or the second posture information, so that the electronic device can Relative pose information can be obtained quickly, thus further reducing the time consuming of video encoding by electronic devices.
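  • Steps 201c and 201d convert an IMU pose into a camera pose through the camera-IMU external parameter matrix. The sketch below illustrates only that last conversion. It assumes that poses and extrinsics are 4x4 homogeneous transforms and that each matrix maps points from the frame named on the right into the frame named on the left; these conventions, the function names and the numeric values are illustrative assumptions and are not specified in the text.

```python
def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def make_transform(rotation, translation):
    """Pack a 3x3 rotation and a 3-vector translation into a 4x4 homogeneous matrix."""
    rows = [list(rotation[i]) + [translation[i]] for i in range(3)]
    rows.append([0.0, 0.0, 0.0, 1.0])
    return rows


def camera_pose_from_imu_pose(world_from_imu, imu_from_camera):
    """Compose the IMU pose with the camera-IMU extrinsics (assumed convention):
    world_from_camera = world_from_imu @ imu_from_camera."""
    return mat_mul(world_from_imu, imu_from_camera)


IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Hypothetical values: the IMU sits at (1, 2, 3) in the first coordinate system,
# and the camera is mounted 5 cm along the IMU x-axis with no relative rotation.
world_from_imu = make_transform(IDENTITY, [1.0, 2.0, 3.0])
imu_from_camera = make_transform(IDENTITY, [0.05, 0.0, 0.0])

world_from_camera = camera_pose_from_imu_pose(world_from_imu, imu_from_camera)
camera_position = [row[3] for row in world_from_camera[:3]]
```

  • Under these assumptions, the camera position is the IMU position offset by the mounting translation; a real extrinsic matrix would normally also carry a rotation.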
  • step 201 can be specifically implemented through the following steps 201e to 201g.
  • Step 201e The electronic device obtains the motion tracking image frame and the second motion information.
  • the above-mentioned second motion information is motion information obtained by the IMU when at least one third camera acquires a motion tracking image frame.
  • the at least one third camera may specifically include X third cameras, where X is a positive even number.
  • each third camera can be a motion tracking camera.
  • the X third cameras may be two third cameras, or four third cameras, or the like.
  • the second motion information may be at least one of the following: acceleration information and angular velocity information.
  • the above-mentioned motion tracking image frame may be a first motion tracking image frame or a second motion tracking image frame. Thus, when the user watches XR images or makes a video call through an XR application, the electronic device can acquire the first video frame through the second camera, the first motion tracking image frame through the at least one third camera, and the second motion information through the IMU; or, the electronic device can acquire the second video frame through the second camera, the second motion tracking image frame through the at least one third camera, and the second motion information through the IMU.
  • Step 201f The electronic device calculates fourth pose information based on the motion tracking image frame, the second motion information and at least one second external parameter matrix.
  • the fourth pose information is the pose information of the IMU in the first coordinate system
  • the at least one second extrinsic parameter matrix is an extrinsic parameter matrix between at least one third camera and the IMU.
  • the electronic device may use a SLAM algorithm to calculate the fourth pose information based on the motion tracking image frame, the second motion information and at least one second external parameter matrix.
  • Step 201g The electronic device calculates the third target pose information based on the fourth pose information and the first external parameter matrix.
  • the above-mentioned first external parameter matrix is an external parameter matrix between the second camera and the IMU.
  • when the third target video frame is the first video frame, the third target pose information is the first pose information; when the third target video frame is the second video frame, the third target pose information is the second pose information; the above-mentioned motion tracking image frame is synchronized in timing with the third target video frame.
  • when the third target video frame is the first video frame, the motion tracking image frame is the first motion tracking image frame; when the third target video frame is the second video frame, the motion tracking image frame is the second motion tracking image frame.
  • since the electronic device can obtain the second motion information while acquiring the motion tracking image frame that is synchronized in timing with the first video frame (or the second video frame), it can directly calculate the first pose information or the second pose information and thus obtain the relative pose information quickly, which further reduces the time the electronic device spends on video encoding.
  • step 201 can be implemented through the following steps 201h to 201j.
  • Step 201h The electronic device obtains the second pose information and the third motion information.
  • the above-mentioned third motion information is the motion information obtained by the IMU when the second camera acquires the first video frame.
  • the third motion information may be at least one of the following: acceleration information and angular velocity information.
  • the second video frame is the previous video frame of the first video frame.
  • the electronic device can obtain the first video frame through the second camera and obtain the third motion information through the IMU.
  • Step 201i The electronic device integrates and calculates the fifth pose information based on the second pose information and the third motion information.
  • the fifth pose information is the pose information of the IMU in the first coordinate system.
  • Step 201j The electronic device calculates the first posture information based on the fifth posture information and the first external parameter matrix.
  • the above-mentioned first external parameter matrix is an external parameter matrix between the second camera and the IMU.
  • the electronic device can obtain the second pose information in the same manner as steps 201h to 201j above. That is, the electronic device can obtain the pose information, in the first coordinate system, of the video frame preceding the second video frame, together with fourth motion information; integrate that pose information and the fourth motion information to obtain the pose information of the IMU in the first coordinate system; and then calculate the second pose information based on that IMU pose information and the first external parameter matrix.
  • since the electronic device can directly obtain the pose information of the second video frame in the first coordinate system and the third motion information obtained by the IMU when the second camera acquires the first video frame, it can directly calculate the first pose information or the second pose information and thus obtain the relative pose information quickly, which further reduces the time the electronic device spends on video encoding.
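  • The integration in step 201i can be pictured with a simple dead-reckoning loop. The sketch below is only an illustration under several assumptions that the text does not state: the state also carries a velocity, the acceleration samples are already gravity-compensated and expressed in the IMU body frame, the samples arrive at a fixed interval dt, and plain Euler integration is accurate enough. All names are hypothetical.

```python
import math


def mat_mul(a, b):
    """Multiply two 3x3 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def mat_vec(m, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]


def rotation_from_rotvec(rotvec):
    """Rodrigues' formula: rotation matrix for a rotation vector (axis * angle)."""
    angle = math.sqrt(sum(comp * comp for comp in rotvec))
    if angle < 1e-12:
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    x, y, z = (comp / angle for comp in rotvec)
    c, s = math.cos(angle), math.sin(angle)
    t = 1.0 - c
    return [
        [c + x * x * t, x * y * t - z * s, x * z * t + y * s],
        [y * x * t + z * s, c + y * y * t, y * z * t - x * s],
        [z * x * t - y * s, z * y * t + x * s, c + z * z * t],
    ]


def integrate_imu(rotation, position, velocity, samples, dt):
    """Propagate an IMU pose through (angular_velocity, acceleration) samples."""
    for angular_velocity, acceleration in samples:
        # Rotate the body-frame acceleration into the first coordinate system.
        world_acc = mat_vec(rotation, acceleration)
        position = [p + v * dt + 0.5 * a * dt * dt
                    for p, v, a in zip(position, velocity, world_acc)]
        velocity = [v + a * dt for v, a in zip(velocity, world_acc)]
        # Apply the incremental rotation measured by the gyroscope.
        step = rotation_from_rotvec([w * dt for w in angular_velocity])
        rotation = mat_mul(rotation, step)
    return rotation, position, velocity


# Hypothetical input: one second of samples at 10 Hz, turning about the z-axis
# at pi/2 rad/s while coasting along x at 1 m/s with zero acceleration.
start_rotation = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
samples = [((0.0, 0.0, math.pi / 2), (0.0, 0.0, 0.0))] * 10
rotation, position, velocity = integrate_imu(
    start_rotation, [0.0, 0.0, 0.0], [1.0, 0.0, 0.0], samples, 0.1)
```

  • In this example the integrated orientation is a quarter turn about the z-axis and the position advances one metre along x. The resulting IMU pose would then be converted to a camera pose with the first external parameter matrix, as in step 201j.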
  • Step 101a The electronic device determines relative pose information based on the first pose information and the second pose information.
  • the electronic device may use a preset algorithm to determine the relative pose information based on the first pose information and the second pose information.
  • since the electronic device can determine the relative pose information directly from the first pose information and the second pose information, the accuracy of the relative pose information can be improved; in this way, the electronic device can more accurately determine the second pixel block corresponding to the first pixel block.
  • the following will take the preset algorithm including the first preset algorithm and the second preset algorithm as an example for illustration.
  • the above-mentioned first pose information includes a first translation vector and a first rotation matrix; the first translation vector is the translation vector of the second camera relative to the origin of the first coordinate system when acquiring the first video frame, and the first rotation matrix is the rotation matrix of the second camera relative to the first coordinate axis of the first coordinate system when acquiring the first video frame.
  • the above-mentioned second pose information includes a second translation vector and a second rotation matrix; the second translation vector is the translation vector of the second camera relative to the origin of the first coordinate system when acquiring the second video frame, and the second rotation matrix is the rotation matrix of the second camera relative to the first coordinate axis of the first coordinate system when acquiring the second video frame.
  • the above step 101a can be implemented through the following steps 101a1 to 101a3.
  • Step 101a1 The electronic device determines a third rotation matrix based on the first rotation matrix and the second rotation matrix.
  • the above-mentioned first coordinate axis may be the X-axis, the Y-axis, or the Z-axis.
  • the electronic device may use a first preset algorithm to determine the third rotation matrix based on the first rotation matrix and the second rotation matrix.
  • R is the third rotation matrix
  • R1 is the first rotation matrix
  • R2 is the second rotation matrix
  • Step 101a2 The electronic device calculates a third translation vector based on the first translation vector, the second translation vector and the first rotation matrix.
  • the electronic device may use a second preset algorithm to determine the third translation vector based on the first translation vector, the second translation vector and the first rotation matrix.
  • t is the third translation vector
  • t1 is the first translation vector
  • t2 is the second translation vector
  • R1 is the first rotation matrix
  • the relative pose information includes a third rotation matrix and a third translation vector.
  • the electronic device can accurately determine the third translation vector and the third rotation matrix based on the first translation vector, the first rotation matrix, the second translation vector and the second rotation matrix, which improves the accuracy of the relative pose information.
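  • The formulas of the first and second preset algorithms are not reproduced in this text. The sketch below shows one common way to compute a relative pose that uses exactly the inputs named in steps 101a1 and 101a2. It assumes that each pose (Ri, ti) maps camera coordinates into the first coordinate system, in which case R = R1^T R2 and t = R1^T (t2 - t1) map coordinates of the second camera position into those of the first. This convention is an assumption and may differ from the preset algorithms of the application.

```python
def transpose(m):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def mat_mul(a, b):
    """Multiply two 3x3 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def mat_vec(m, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]


def relative_pose(r1, t1, r2, t2):
    """Return (R, t) with R = R1^T R2 and t = R1^T (t2 - t1) (assumed convention)."""
    r1_t = transpose(r1)
    r = mat_mul(r1_t, r2)
    t = mat_vec(r1_t, [b - a for a, b in zip(t1, t2)])
    return r, t


# Hypothetical poses: both frames share a 90-degree rotation about the z-axis,
# and the camera moved 2 units along the y-axis of the first coordinate system.
r1 = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
r2 = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
t1 = [1.0, 0.0, 0.0]
t2 = [1.0, 2.0, 0.0]

r, t = relative_pose(r1, t1, r2, t2)
```

  • Here the relative rotation is the identity, and the displacement along the world y-axis appears along the x-axis of the first camera because of the shared 90-degree rotation.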
  • the way in which the electronic device obtains the first pose information may be the same as or different from the way in which it obtains the second pose information.
  • the electronic device can obtain the first pose information using the method of Example 1 and obtain the second pose information using the method of Example 1 as well; that is, the electronic device obtains the first pose information and the second pose information in the same way.
  • the electronic device can obtain the first pose information using the method of Example 1 and obtain the second pose information using the method of Example 3; that is, the electronic device obtains the first pose information and the second pose information in different ways.
  • Scenario 1 Users watch XR images through XR applications.
  • the electronic device can turn on its camera (such as the first camera or the second camera) to collect N video frames and obtain N pieces of motion information through the IMU, where the N pieces of motion information correspond one-to-one to the N video frames; then, when encoding the first video frame among the N video frames, the electronic device can read the first external parameter matrix between the camera (i.e., the first camera or the second camera) and the IMU.
  • the electronic device can determine the relative pose information of the first video frame and the second video frame based on the first pose information and the second pose information; at this time, the electronic device can read the target internal parameter matrix of its camera (i.e., the first camera or the second camera) and determine mapping parameters based on the relative pose information and the target internal parameter matrix.
  • the mapping parameters are used to indicate the mapping relationship between the pixel blocks of the first video frame and the pixel blocks of the second video frame. The electronic device can then calculate the position information of the second pixel block based on the position information of the first pixel block and the mapping parameters, determine the motion vector and residual information of the first pixel block based on the position information of the first pixel block and the second pixel block, and encode the motion vector and residual information.
  • for example, assume that the first pixel block is a 4*4 pixel block and that its position information in the first video frame (for example, I1) is (x, y); the electronic device can then calculate the position information of the second pixel block in the second video frame (for example, I2) based on the position information of the first pixel block and the mapping parameters, and the position information of the second pixel block in I2 is (x', y').
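  • The exact form of the mapping parameters is not given in this text. As one possible illustration, the sketch below assumes a pinhole camera with internal parameter matrix K and takes the mapping parameters to be the rotation-only homography H = K R K^-1, which ignores the translation component and is reasonable only for distant scenes or small camera displacement. Here R is assumed to rotate camera coordinates of the first video frame into those of the second; the function names and numeric values are hypothetical.

```python
import math


def mat_mul(a, b):
    """Multiply two 3x3 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def mat_vec(m, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]


def intrinsic_matrix(fx, fy, cx, cy):
    """Pinhole internal parameter matrix K (no skew)."""
    return [[fx, 0.0, cx], [0.0, fy, cy], [0.0, 0.0, 1.0]]


def inverse_intrinsic_matrix(fx, fy, cx, cy):
    """Closed-form inverse of the skew-free K above."""
    return [[1.0 / fx, 0.0, -cx / fx], [0.0, 1.0 / fy, -cy / fy], [0.0, 0.0, 1.0]]


def mapping_parameters(k, k_inv, rotation_2_from_1):
    """Assumed mapping: rotation-only homography H = K R K^-1."""
    return mat_mul(mat_mul(k, rotation_2_from_1), k_inv)


def map_position(h, x, y):
    """Map pixel (x, y) of the first frame to (x', y') in the second frame."""
    u, v, w = mat_vec(h, [x, y, 1.0])
    return u / w, v / w


# Hypothetical camera: 500 px focal length, principal point (320, 240),
# rotated by 0.02 rad about the camera y-axis between the two frames.
fx, fy, cx, cy = 500.0, 500.0, 320.0, 240.0
theta = 0.02
rotation_2_from_1 = [
    [math.cos(theta), 0.0, math.sin(theta)],
    [0.0, 1.0, 0.0],
    [-math.sin(theta), 0.0, math.cos(theta)],
]
h = mapping_parameters(intrinsic_matrix(fx, fy, cx, cy),
                       inverse_intrinsic_matrix(fx, fy, cx, cy),
                       rotation_2_from_1)

x, y = 320.0, 240.0                      # position of the first pixel block
x_mapped, y_mapped = map_position(h, x, y)
motion_vector = (x_mapped - x, y_mapped - y)
```

  • With these values, a block at the principal point shifts horizontally by fx * tan(theta), roughly 10 pixels, and does not move vertically. A mapping that also accounts for translation would additionally need per-pixel depth or a scene-plane assumption.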
  • Scenario 2 Users make video calls through XR applications.
  • the electronic device can turn on its camera (such as the first camera or the second camera) to collect N video frames, turn on at least one third camera of the electronic device to collect N motion tracking image frames, and obtain N pieces of motion information through the IMU; the N pieces of motion information correspond one-to-one to the N motion tracking image frames.
  • the electronic device can first synchronize the timing of each video frame with that of each motion tracking image frame, and then read the external parameter matrix between the at least one third camera and the IMU.
  • the electronic device can determine the first pose information of the first video frame in the first coordinate system based on the first motion tracking image frame synchronized in timing with the first video frame, the motion information corresponding to the first motion tracking image frame, and the at least one second external parameter matrix; and determine the second pose information of the second video frame in the first coordinate system based on the second motion tracking image frame synchronized in timing with the second video frame, the motion information corresponding to the second motion tracking image frame, and the at least one second external parameter matrix.
  • the electronic device can determine the relative pose information of the first video frame and the second video frame based on the first pose information and the second pose information; at this time, the electronic device can read the target internal parameter matrix of its camera (i.e., the first camera or the second camera) and determine mapping parameters based on the relative pose information and the target internal parameter matrix.
  • the mapping parameters are used to indicate the mapping relationship between the pixel blocks of the first video frame and the pixel blocks of the second video frame.
  • the electronic device can calculate the position information of the second pixel block based on the position information of the first pixel block and the mapping parameters, and determine the motion vector and residual information of the first pixel block based on the position information of the first pixel block and the second pixel block; the electronic device can then encode the motion vector and residual information and transmit the encoded video stream to the receiver of the XR application.
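  • Scenario 2 relies on synchronizing the timing of video frames with motion tracking image frames, but the text does not say how this is done. A minimal sketch, assuming that every frame carries a capture timestamp and that synchronization means choosing the motion tracking image frame with the nearest timestamp, could look like this; the function names and timestamps are hypothetical.

```python
from bisect import bisect_left


def nearest_tracking_frame(video_timestamp, tracking_timestamps):
    """Return the index of the tracking frame closest in time to a video frame.

    tracking_timestamps must be non-empty and sorted in ascending order.
    """
    i = bisect_left(tracking_timestamps, video_timestamp)
    # Only the neighbours on either side of the insertion point can be nearest.
    candidates = [j for j in (i - 1, i) if 0 <= j < len(tracking_timestamps)]
    return min(candidates, key=lambda j: abs(tracking_timestamps[j] - video_timestamp))


def synchronize(video_timestamps, tracking_timestamps):
    """Pair every video frame with its nearest motion tracking image frame."""
    return [nearest_tracking_frame(ts, tracking_timestamps) for ts in video_timestamps]


# Hypothetical timestamps in milliseconds.
tracking_timestamps = [0, 33, 66, 100]
video_timestamps = [1, 34, 99]
pairing = synchronize(video_timestamps, tracking_timestamps)
```

  • Each entry of pairing is the index of the motion tracking image frame matched to the corresponding video frame; a practical system might also reject pairs whose time difference exceeds a threshold.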
  • the execution subject may be a video encoding device.
  • a video encoding device performing a video encoding method is used as an example to illustrate the video encoding device provided by the embodiments of the present application.
  • FIG. 9 shows a possible structural diagram of the video encoding device involved in the embodiment of the present application.
  • the video encoding device 60 may include: an acquisition module 61 , a determination module 62 and an encoding module 63 .
  • the acquisition module 61 is used to acquire the relative pose information of the first video frame and the second video frame, the first video frame is the video frame to be encoded, and the second video frame is the reference video frame.
  • the determination module 62 is configured to determine the motion vector and residual information of the first pixel block based on the relative pose information obtained by the acquisition module 61 .
  • the encoding module 63 is used to encode the motion vector and residual information determined by the determination module 62 .
  • the above-mentioned determination module 62 is specifically configured to determine, based on the relative pose information, a second pixel block in the second video frame that matches the first pixel block; and to determine the motion vector and residual information of the first pixel block based on the position information of the first pixel block and the second pixel block.
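  • Once the position of the second pixel block is known, the motion vector and residual follow directly from the two positions. The sketch below assumes that frames are 2D lists of luma samples, that positions are integer top-left corners, that both blocks lie fully inside their frames, and that the residual is the sample-wise difference between the two blocks; these representations and all names are illustrative assumptions.

```python
def extract_block(frame, x, y, size=4):
    """Return the size x size block whose top-left corner is (x, y)."""
    return [row[x:x + size] for row in frame[y:y + size]]


def motion_vector_and_residual(current, reference, position, matched_position, size=4):
    """Compute the motion vector and residual of a block without any search."""
    (x, y), (ref_x, ref_y) = position, matched_position
    motion_vector = (ref_x - x, ref_y - y)
    current_block = extract_block(current, x, y, size)
    reference_block = extract_block(reference, ref_x, ref_y, size)
    residual = [[c - r for c, r in zip(cur_row, ref_row)]
                for cur_row, ref_row in zip(current_block, reference_block)]
    return motion_vector, residual


# Hypothetical 8x8 frames: the current frame is the reference frame shifted
# two samples to the left, with every sample brightened by 1.
reference = [[10 * row + col for col in range(8)] for row in range(8)]
current = [[reference[row][min(col + 2, 7)] + 1 for col in range(8)] for row in range(8)]

motion_vector, residual = motion_vector_and_residual(
    current, reference, position=(0, 0), matched_position=(2, 0))
```

  • In this example the motion vector is (2, 0) and every residual sample is 1, so only a small residual would remain to be encoded.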
  • the above-mentioned determination module 62 is specifically used to determine the mapping parameters according to the relative pose information and the target internal parameter matrix, where the target internal parameter matrix is the internal parameter matrix of the first camera, and the first video frame and the second video frame are acquired by the first camera; the mapping parameters are used to indicate the mapping relationship between the pixel blocks of the first video frame and the pixel blocks of the second video frame; and to calculate the position information of the second pixel block based on the position information of the first pixel block and the mapping parameters.
  • the first video frame and the second video frame are acquired by a second camera.
  • the above-mentioned acquisition module 61 is specifically used to obtain the first pose information and the second pose information; the first pose information is the pose information of the first video frame in the first coordinate system, and the second pose information is the pose information of the second video frame in the first coordinate system.
  • the above-mentioned determining module 62 is specifically used to determine relative pose information based on the first pose information and the second pose information acquired by the acquisition module 61 .
  • the above-mentioned determination module 62 is also used to calculate the first target pose information based on the image information of the first target video frame.
  • when the first target video frame is the first video frame, the first target pose information is the first pose information; when the first target video frame is the second video frame, the first target pose information is the second pose information.
  • the above-mentioned acquisition module 61 is also used to acquire the second target video frame and the first motion information; the first motion information is the motion information acquired by the IMU when the second camera acquires the second target video frame.
  • the above-mentioned determination module 62 is also used to calculate the third pose information based on the second target video frame, the first motion information and the first external parameter matrix acquired by the acquisition module 61, where the third pose information is the pose information of the IMU in the first coordinate system and the first external parameter matrix is the external parameter matrix between the second camera and the IMU; and to calculate the second target pose information based on the third pose information and the first external parameter matrix.
  • when the second target video frame is the first video frame, the second target pose information is the first pose information; when the second target video frame is the second video frame, the second target pose information is the second pose information.
  • the above-mentioned acquisition module 61 is also used to acquire a motion tracking image frame and second motion information; the second motion information is the motion information acquired by the IMU when at least one third camera acquires the motion tracking image frame.
  • the above-mentioned determination module 62 is also used to calculate the fourth pose information based on the motion tracking image frame, the second motion information and at least one second external parameter matrix acquired by the acquisition module 61, where the fourth pose information is the pose information of the IMU in the first coordinate system and the at least one second external parameter matrix is the external parameter matrix between the at least one third camera and the IMU; and to calculate the third target pose information based on the fourth pose information and the first external parameter matrix, where the first external parameter matrix is the external parameter matrix between the second camera and the IMU.
  • when the third target video frame is the first video frame, the third target pose information is the first pose information; when the third target video frame is the second video frame, the third target pose information is the second pose information; the motion tracking image frame is synchronized in timing with the third target video frame.
  • the above-mentioned acquisition module 61 is also used to acquire the second pose information and the third motion information; the third motion information is the motion information acquired by the IMU when the second camera acquires the first video frame.
  • the above-mentioned determination module 62 is also used to obtain the fifth pose information by integral calculation based on the second pose information and the third motion information acquired by the acquisition module 61, where the fifth pose information is the pose information of the IMU in the first coordinate system; and to calculate the first pose information based on the fifth pose information and the first external parameter matrix, where the first external parameter matrix is the external parameter matrix between the second camera and the IMU.
  • the first pose information includes a first translation vector and a first rotation matrix, and the second pose information includes a second translation vector and a second rotation matrix.
  • the above-mentioned determination module 62 is specifically configured to calculate a third rotation matrix based on the first rotation matrix and the second rotation matrix; and to calculate a third translation vector based on the first translation vector, the second translation vector and the first rotation matrix.
  • the above-mentioned relative pose information includes the third rotation matrix and the third translation vector.
  • the above-mentioned first translation vector is the translation vector of the second camera relative to the origin of the first coordinate system when acquiring the first video frame, and the above-mentioned first rotation matrix is the rotation matrix of the second camera relative to the first coordinate axis of the first coordinate system when acquiring the first video frame.
  • the above-mentioned second translation vector is the translation vector of the second camera relative to the origin of the first coordinate system when acquiring the second video frame, and the above-mentioned second rotation matrix is the rotation matrix of the second camera relative to the first coordinate axis of the first coordinate system when acquiring the second video frame.
  • the video encoding device can determine the motion vector and residual information of the first pixel block directly from the obtained relative pose information, without first searching the second video frame for a pixel block that matches the first pixel block. This reduces the time the video encoding device takes to determine the motion vector and residual information of the first pixel block, and therefore the time it takes to encode the first pixel block and, in turn, the video.
  • the video encoding device in the embodiment of the present application may be an electronic device or a component in the electronic device, such as an integrated circuit or chip.
  • the electronic device may be a terminal or other devices other than the terminal.
  • the electronic device can be a mobile phone, a tablet computer, a notebook computer, a handheld computer, a vehicle-mounted electronic device, a mobile internet device (MID), or an augmented reality (AR)/virtual reality (VR) device.
  • the video encoding device in the embodiment of the present application may be a device with an operating system.
  • the operating system can be an Android operating system, an iOS operating system, or another possible operating system, which is not specifically limited in the embodiments of this application.
  • the video encoding device provided by the embodiments of the present application can implement various processes implemented by the method embodiments of Figures 1 to 8. To avoid repetition, they will not be described again here.
  • this embodiment of the present application also provides an electronic device 70, which includes a processor 71 and a memory 72.
  • the memory 72 stores a program or instructions that can run on the processor 71; when executed by the processor 71, the program or instructions implement each process step of the above video encoding method embodiment and can achieve the same technical effect. To avoid repetition, the details are not described again here.
  • the electronic devices in the embodiments of the present application include the above-mentioned mobile electronic devices and non-mobile electronic devices.
  • Figure 11 is a schematic diagram of the hardware structure of an electronic device that implements an embodiment of the present application.
  • the electronic device 1100 includes but is not limited to components such as: radio frequency unit 1101, network module 1102, audio output unit 1103, input unit 1104, sensor 1105, display unit 1106, user input unit 1107, interface unit 1108, memory 1109 and processor 1110.
  • the electronic device 1100 may also include a power supply (such as a battery) that supplies power to various components.
  • the power supply may be logically connected to the processor 1110 through a power management system, so that functions such as charging management, discharging management and power consumption management are implemented through the power management system.
  • the structure of the electronic device shown in Figure 11 does not constitute a limitation on the electronic device.
  • the electronic device may include more or fewer components than shown in the figure, combine certain components, or arrange the components differently, which will not be described again here.
  • the processor 1110 is used to obtain the relative pose information of the first video frame and the second video frame, where the first video frame is the video frame to be encoded and the second video frame is the reference video frame; determine, according to the relative pose information, the motion vector and residual information of the first pixel block, which is the pixel block to be encoded in the first video frame; and encode the motion vector and residual information.
  • the electronic device provided by the embodiments of the present application can determine the motion vector and residual information of the first pixel block directly from the obtained relative pose information, without first searching the second video frame for a pixel block that matches the first pixel block. This reduces the time the electronic device takes to determine the motion vector and residual information of the first pixel block, and therefore the time it takes to encode the first pixel block and, in turn, the video.
  • the processor 1110 is specifically configured to determine, according to the relative pose information, a second pixel block in the second video frame that matches the first pixel block; and to determine the motion vector and residual information of the first pixel block according to the position information of the first pixel block and the second pixel block.
  • since the electronic device can determine the second pixel block in the second video frame that matches the first pixel block directly from the relative pose information, it does not need to search the second video frame for that block. The electronic device can therefore quickly determine the motion vector and residual information of the first pixel block based on the position information of the first pixel block and the second pixel block, which improves the efficiency with which it obtains the motion vector and residual information of the first pixel block.
  • the processor 1110 is specifically configured to determine the mapping parameters according to the relative pose information and the target internal parameter matrix, where the target internal parameter matrix is the internal parameter matrix of the first camera, and the first video frame and the second video frame are acquired by the first camera; the mapping parameters are used to indicate the mapping relationship between the pixel blocks of the first video frame and the pixel blocks of the second video frame; and to calculate the position information of the second pixel block according to the position information of the first pixel block and the mapping parameters.
  • when determining the second pixel block corresponding to the first pixel block, the electronic device can determine the mapping parameters directly from the relative pose information and the target internal parameter matrix, and calculate the position information of the second pixel block directly from the position information of the first pixel block and the mapping parameters, without searching the second video frame for a second pixel block that matches the first pixel block. This reduces the time spent searching for the second pixel block and, in turn, the time the electronic device spends on video encoding.
  • the first video frame and the second video frame are acquired by a second camera.
  • the processor 1110 is also configured to obtain the first pose information and the second pose information; the first pose information is the pose information of the first video frame in the first coordinate system, and the second pose information is the pose information of the second video frame in the first coordinate system.
  • the processor 1110 is specifically configured to determine relative pose information based on the first pose information and the second pose information.
  • since the electronic device can determine the relative pose information directly from the first pose information and the second pose information, the accuracy of the relative pose information can be improved; in this way, the electronic device can more accurately determine the second pixel block corresponding to the first pixel block.
  • the processor 1110 is also configured to calculate first target pose information based on the image information of a first target video frame.
  • when the first target video frame is the first video frame, the first target pose information is the first pose information
  • when the first target video frame is the second video frame, the first target pose information is the second pose information
  • the electronic device can calculate the first pose information (or the second pose information) directly from the image information of the first video frame (or the second video frame), so it can obtain the relative pose information quickly. Therefore, the time the electronic device spends on video encoding can be further reduced.
  • the processor 1110 is also configured to obtain a second target video frame and first motion information.
  • the first motion information is the motion information obtained by the IMU when the second camera acquires the second target video frame; third pose information is calculated based on the second target video frame, the first motion information, and the first extrinsic parameter matrix.
  • the third pose information is the pose information of the IMU in the first coordinate system.
  • the first extrinsic parameter matrix is the extrinsic parameter matrix between the second camera and the IMU; second target pose information is calculated based on the third pose information and the first extrinsic parameter matrix.
  • when the second target video frame is the first video frame, the second target pose information is the first pose information
  • when the second target video frame is the second video frame, the second target pose information is the second pose information
  • because the electronic device can obtain the first motion information when acquiring the first video frame or the second video frame, it can calculate the first pose information or the second pose information directly and obtain the relative pose information quickly, which further reduces the time the electronic device spends on video encoding.
  • the processor 1110 is also configured to obtain a motion tracking image frame and second motion information.
  • the second motion information is the motion information acquired by the IMU when at least one third camera acquires the motion tracking image frame; fourth pose information is calculated according to the motion tracking image frame, the second motion information, and at least one second extrinsic parameter matrix.
  • the fourth pose information is the pose information of the IMU in the first coordinate system, and the at least one second extrinsic parameter matrix is the extrinsic parameter matrix between the at least one third camera and the IMU; third target pose information is calculated based on the fourth pose information and the first extrinsic parameter matrix, and the first extrinsic parameter matrix is the extrinsic parameter matrix between the second camera and the IMU.
  • when the third target video frame is the first video frame, the third target pose information is the first pose information
  • when the third target video frame is the second video frame, the third target pose information is the second pose information
  • the motion tracking image frame is time-synchronized with the third target video frame.
  • because the electronic device can obtain the second motion information when acquiring a motion tracking image frame that is time-synchronized with the first video frame (or the second video frame), it can calculate the first pose information or the second pose information directly and obtain the relative pose information quickly. Therefore, the time the electronic device spends on video encoding can be further reduced.
  • the processor 1110 is also configured to obtain the second pose information and third motion information.
  • the third motion information is the motion information acquired by the IMU when the second camera acquires the first video frame; fifth pose information is obtained by integration based on the second pose information and the third motion information, the fifth pose information being the pose information of the IMU in the first coordinate system; and the first pose information is calculated according to the fifth pose information and the first extrinsic parameter matrix.
  • the first extrinsic parameter matrix is the extrinsic parameter matrix between the second camera and the IMU.
  • because the electronic device can directly obtain the pose information of the second video frame in the first coordinate system and the third motion information acquired by the IMU when the second camera acquires the first video frame, it can calculate the first pose information or the second pose information directly and obtain the relative pose information quickly. Therefore, the time the electronic device spends on video encoding can be further reduced.
  • the first pose information includes a first translation vector and a first rotation matrix
  • the second pose information includes a second translation vector and a second rotation matrix
  • the processor 1110 is specifically configured to calculate a third rotation matrix based on the first rotation matrix and the second rotation matrix; and calculate the third translation vector based on the first translation vector, the second translation vector, and the first rotation matrix.
  • the above-mentioned relative pose information includes a third rotation matrix and a third translation vector;
  • the above-mentioned first translation vector is the translation vector of the second camera relative to the origin of the first coordinate system when acquiring the first video frame
  • the above-mentioned first rotation matrix is the rotation matrix of the second camera relative to the first coordinate axis of the first coordinate system when acquiring the first video frame;
  • the above-mentioned second translation vector is the translation vector of the second camera relative to the origin of the first coordinate system when acquiring the second video frame
  • the above-mentioned second rotation matrix is the rotation matrix of the second camera relative to the first coordinate axis of the first coordinate system when acquiring the second video frame.
  • the electronic device can accurately determine the third translation vector and the third rotation matrix from the first translation vector, the first rotation matrix, the second translation vector, and the second rotation matrix, which improves the accuracy of the relative pose information.
  • the input unit 1104 may include a graphics processing unit (GPU) 11041 and a microphone 11042.
  • the graphics processor 11041 processes the image data of still pictures or videos obtained by an image capture device (such as a camera) in the video capture mode or the image capture mode.
  • the display unit 1106 may include a display panel 11061, which may be configured in the form of a liquid crystal display, an organic light emitting diode, or the like.
  • the user input unit 1107 includes at least one of a touch panel 11071 and other input devices 11072.
  • the touch panel 11071 is also called a touch screen.
  • the touch panel 11071 may include two parts: a touch detection device and a touch controller.
  • Other input devices 11072 may include but are not limited to physical keyboards, function keys (such as volume control keys, switch keys, etc.), trackballs, mice, and joysticks, which will not be described again here.
  • Memory 1109 may be used to store software programs as well as various data.
  • the memory 1109 may mainly include a first storage area for storing programs or instructions and a second storage area for storing data, wherein the first storage area may store an operating system and the application programs or instructions required for at least one function (such as a sound playback function or an image playback function).
  • memory 1109 may include volatile memory or nonvolatile memory, or memory 1109 may include both volatile and nonvolatile memory.
  • non-volatile memory can be read-only memory (ROM), programmable read-only memory (programmable ROM, PROM), erasable programmable read-only memory (erasable PROM, EPROM), or electrically erasable programmable read-only memory (EEPROM).
  • Volatile memory can be random access memory (RAM), static random access memory (static RAM, SRAM), dynamic random access memory (dynamic RAM, DRAM), synchronous dynamic random access memory (synchronous DRAM, SDRAM), double data rate synchronous dynamic random access memory (double data rate SDRAM, DDR SDRAM), enhanced synchronous dynamic random access memory (enhanced SDRAM, ESDRAM), synchlink dynamic random access memory (synchlink DRAM, SLDRAM), or direct Rambus random access memory (direct Rambus RAM, DRRAM).
  • Memory 1109 in embodiments of the present application includes, but is not limited to, these and any
  • the processor 1110 may include one or more processing units; optionally, the processor 1110 integrates an application processor and a modem processor, where the application processor mainly handles operations related to the operating system, user interface, and application programs, and the modem processor, such as a baseband processor, mainly processes wireless communication signals. It can be understood that the modem processor may also not be integrated into the processor 1110.
  • Embodiments of the present application also provide a readable storage medium.
  • Programs or instructions are stored on the readable storage medium.
  • when the program or instructions are executed by a processor, each process of the above video encoding method embodiment is implemented and the same technical effect can be achieved; to avoid repetition, the details are not repeated here.
  • the processor is the processor in the electronic device described in the above embodiment.
  • the readable storage medium includes a computer-readable storage medium, such as a computer read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
  • An embodiment of the present application further provides a chip.
  • the chip includes a processor and a communication interface.
  • the communication interface is coupled to the processor.
  • the processor is used to run programs or instructions to implement each process of the above video encoding method embodiment, and the same technical effect can be achieved; to avoid repetition, the details are not repeated here.
  • the chip mentioned in the embodiments of this application may also be called a system-level chip, a system chip, a chip system, or a system-on-chip.
  • Embodiments of the present application provide a computer program product.
  • the program product is stored in a storage medium.
  • the program product is executed by at least one processor to implement each process of the above video encoding method embodiment, and the same technical effect can be achieved; to avoid repetition, the details are not repeated here.
  • the methods of the above embodiments can be implemented by means of software plus the necessary general-purpose hardware platform. They can also be implemented in hardware, but in many cases the former is the better implementation.
  • the technical solution of the present application, in essence or in the part that contributes to the prior art, can be embodied in the form of a computer software product.
  • the computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions that cause a terminal (which can be a mobile phone, computer, server, network device, etc.) to execute the methods described in the various embodiments of this application.
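
The pose-based mapping described in the embodiments above (relative pose from two poses, then a mapping parameter from the relative pose and the intrinsic matrix) can be illustrated with a minimal Python sketch. It rests on assumptions that the text does not state: each pose (R, t) is taken to map camera coordinates into the first coordinate system, the mapping parameter is taken to be a rotation-only homography (translation is ignored, which is reasonable only for distant scenes), and NumPy is assumed to be available. All function names are hypothetical and are not taken from the application.

```python
import numpy as np

def relative_pose(R1, t1, R2, t2):
    """Relative pose between two frames (hypothetical helper).

    Assumes X_world = R_i @ X_cam_i + t_i for both frames, so that
    X_cam1 = R3 @ X_cam2 + t3.
    """
    R3 = R1.T @ R2           # third rotation matrix: from R1 and R2
    t3 = R1.T @ (t2 - t1)    # third translation vector: from t1, t2 and R1
    return R3, t3

def mapping_parameter(K, R3):
    """Rotation-only homography taking frame-1 pixels to frame-2 pixels.

    Assumption: translation t3 is ignored, so x2 ~ K @ R3.T @ inv(K) @ x1.
    """
    return K @ R3.T @ np.linalg.inv(K)

def map_block_position(H, x, y):
    """Apply the mapping parameter to a pixel-block position (x, y)."""
    p = H @ np.array([x, y, 1.0])
    return p[0] / p[2], p[1] / p[2]

def motion_vector(H, x, y):
    """Displacement from the first pixel block to the mapped second block."""
    x2, y2 = map_block_position(H, x, y)
    return x2 - x, y2 - y

# Example with made-up values: a small rotation about the camera's y axis.
theta = 0.01
c, s = np.cos(theta), np.sin(theta)
R_first = np.eye(3)
R_second = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
K_example = np.array([[500.0, 0.0, 320.0],
                      [0.0, 500.0, 240.0],
                      [0.0, 0.0, 1.0]])
R_rel, t_rel = relative_pose(R_first, np.zeros(3), R_second, np.zeros(3))
H_example = mapping_parameter(K_example, R_rel)
mv_example = motion_vector(H_example, 320.0, 240.0)
```

Because the second block position is computed in closed form, no block-matching search over the second video frame is needed in this sketch; an encoder would still compute the residual between the first pixel block and the block found at the mapped position.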

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The present application relates to the technical field of video processing and discloses a video encoding method and apparatus, an electronic device, and a computer-readable medium. The method comprises: obtaining relative pose information of a first video frame and a second video frame, the first video frame being a video frame to be encoded and the second video frame being a reference video frame; determining a motion vector and residual information of a first pixel block according to the relative pose information, the first pixel block being a pixel block to be encoded of the first video frame; and encoding the motion vector and the residual information.
PCT/CN2023/102731 2022-06-30 2023-06-27 Video encoding method and apparatus, electronic device, and medium Ceased WO2024002065A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210768613.4A CN115065827B (zh) 2022-06-30 2022-06-30 Video encoding method and apparatus, electronic device, and medium
CN202210768613.4 2022-06-30

Publications (1)

Publication Number Publication Date
WO2024002065A1 (fr) 2024-01-04

Family

ID=83203396

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/102731 Ceased WO2024002065A1 (fr) 2022-06-30 2023-06-27 Video encoding method and apparatus, electronic device, and medium

Country Status (2)

Country Link
CN (1) CN115065827B (fr)
WO (1) WO2024002065A1 (fr)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115065827B (zh) * 2022-06-30 2025-08-22 维沃移动通信有限公司 Video encoding method and apparatus, electronic device, and medium
CN120881291A (zh) * 2024-04-22 2025-10-31 海信视像科技股份有限公司 Video decoding method, video encoding method, and apparatus

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111583350A (zh) * 2020-05-29 2020-08-25 联想(北京)有限公司 Image processing method, apparatus, system, and server
WO2021035669A1 (fr) * 2019-08-30 2021-03-04 深圳市大疆创新科技有限公司 Pose prediction method, map construction method, movable platform, and storage medium
WO2022072242A1 (fr) * 2020-10-01 2022-04-07 Qualcomm Incorporated Coding video data using pose information of a user
CN115065827A (zh) * 2022-06-30 2022-09-16 维沃移动通信有限公司 Video encoding method and apparatus, electronic device, and medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017031671A1 (fr) * 2015-08-24 2017-03-02 华为技术有限公司 Motion vector field encoding method and decoding method, and encoding and decoding apparatuses
CN113810696B (zh) * 2020-06-12 2024-09-17 华为技术有限公司 Information transmission method, related device, and system

Also Published As

Publication number Publication date
CN115065827B (zh) 2025-08-22
CN115065827A (zh) 2022-09-16

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23830242

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 23830242

Country of ref document: EP

Kind code of ref document: A1