WO2022267781A1 - Modeling method, related electronic device, and storage medium
- Publication number
- WO2022267781A1 (application PCT/CN2022/093934, CN2022093934W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- target object
- terminal device
- images
- modeling
- frame
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/50—Depth or shape recovery
- G06T7/55—Depth or shape recovery from multiple images
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T19/00—Manipulating three-dimensional [3D] models or images for computer graphics
- G06T19/20—Editing of three-dimensional [3D] images, e.g. changing shapes or colours, aligning objects or positioning parts
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T17/00—Three-dimensional [3D] modelling for computer graphics
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F1/00—Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
- G06F1/16—Constructional details or arrangements
- G06F1/1613—Constructional details or arrangements for portable computers
- G06F1/1633—Constructional details or arrangements of portable computers not specific to the type of enclosures covered by groups G06F1/1615 - G06F1/1626
- G06F1/1684—Constructional details or arrangements related to integrated I/O peripherals not covered by groups G06F1/1635 - G06F1/1675
- G06F1/1686—Constructional details or arrangements related to integrated I/O peripherals not covered by groups G06F1/1635 - G06F1/1675 the I/O peripheral being an integrated camera
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/048—Interaction techniques based on graphical user interfaces [GUI]
- G06F3/0484—Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
- G06F3/04845—Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range for image manipulation, e.g. dragging, rotation, expansion or change of colour
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/048—Interaction techniques based on graphical user interfaces [GUI]
- G06F3/0487—Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser
- G06F3/0488—Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser using a touch-screen or digitiser, e.g. input of commands through traced gestures
- G06F3/04883—Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser using a touch-screen or digitiser, e.g. input of commands through traced gestures for inputting data by handwriting, e.g. gesture or text
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/048—Interaction techniques based on graphical user interfaces [GUI]
- G06F3/0487—Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser
- G06F3/0488—Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser using a touch-screen or digitiser, e.g. input of commands through traced gestures
- G06F3/04886—Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser using a touch-screen or digitiser, e.g. input of commands through traced gestures by partitioning the display area of the touch-screen or the surface of the digitising tablet into independently controllable areas, e.g. virtual keyboards or menus
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T17/00—Three-dimensional [3D] modelling for computer graphics
- G06T17/10—Constructive solid geometry [CSG] using solid primitives, e.g. cylinders, cubes
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/64—Three-dimensional [3D] objects
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2203/00—Indexing scheme relating to G06F3/00 - G06F3/048
- G06F2203/048—Indexing scheme relating to G06F3/048
- G06F2203/04806—Zoom, i.e. interaction techniques or interactors for controlling the zooming operation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2203/00—Indexing scheme relating to G06F3/00 - G06F3/048
- G06F2203/048—Indexing scheme relating to G06F3/048
- G06F2203/04808—Several contacts: gestures triggering a specific function, e.g. scrolling, zooming, right-click, when the user establishes several contacts with the surface simultaneously; e.g. using several fingers or a combination of fingers and pen
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2200/00—Indexing scheme for image data processing or generation, in general
- G06T2200/08—Indexing scheme for image data processing or generation, in general involving all processing steps from image acquisition to 3D model generation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2200/00—Indexing scheme for image data processing or generation, in general
- G06T2200/24—Indexing scheme for image data processing or generation, in general involving graphical user interfaces [GUIs]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2210/00—Indexing scheme for image generation or computer graphics
- G06T2210/12—Bounding box
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2219/00—Indexing scheme for manipulating 3D models or images for computer graphics
- G06T2219/20—Indexing scheme for editing of 3D models
- G06T2219/2016—Rotation, translation, scaling
Definitions
- the present application relates to the field of three-dimensional reconstruction, in particular to a modeling method and related electronic equipment and storage media.
- 3D reconstruction applications/software can be used to model objects in 3D.
- users first need to use a mobile tool (such as a mobile phone or camera) to collect the data (such as pictures and depth information) required for 3D modeling of an object; the object is then reconstructed in 3D from that data to obtain the corresponding 3D model.
- the process of collecting the data required for 3D modeling and the process of reconstructing objects in 3D from the collected data are relatively complicated and place high demands on device hardware.
- the process of collecting the data required for 3D modeling requires the collection device (such as the above-mentioned mobile terminal tool) to be equipped with a special device such as a laser radar (light detection and ranging, LIDAR) sensor or an RGB depth (RGB-D) camera.
- the process of 3D reconstruction of objects according to the collected data required for 3D modeling requires a processing device running 3D reconstruction applications to be equipped with a high-performance independent graphics card.
- the embodiments of the present application provide a modeling method, a related electronic device, and a storage medium, which simplify the process of collecting the data required for 3D modeling and the process of reconstructing objects in 3D from the collected data, and which place low demands on device hardware.
- an embodiment of the present application provides a modeling method, the method is applied to a terminal device, and the method includes:
- the terminal device displays a first interface, where the first interface includes the shooting picture of the terminal device.
- the terminal device collects multiple frames of images corresponding to the target object to be modeled, and acquires an association relationship among the multiple frames of images.
- the terminal device acquires a three-dimensional model corresponding to the target object according to the multi-frame images and the association relationship between the multi-frame images.
- the terminal device displays a three-dimensional model corresponding to the target object.
- the terminal device displays a first virtual bounding volume; the first virtual bounding volume includes multiple patches.
- the terminal device collecting, in response to the collection operation, multiple frames of images corresponding to the target object to be modeled, and acquiring the association relationship between the multiple frames of images, includes:
- when the terminal device is in the first pose, the terminal device collects the first image and changes the display effect of the patch corresponding to the first image; when the terminal device is in the second pose, the terminal device collects the second image and changes the display effect of the patch corresponding to the second image.
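The capture loop described above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the `PatchModel` class, the `patch_for_pose` mapping, and the layout of 20 equal azimuth sectors per layer are all assumptions introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class PatchModel:
    """Hypothetical virtual bounding volume: `layers` rings of `per_layer` patches."""
    layers: int = 2
    per_layer: int = 20
    lit: set = field(default_factory=set)       # patches whose display effect changed
    frames: dict = field(default_factory=dict)  # patch index -> list of image ids

    def patch_for_pose(self, azimuth_deg: float, layer: int) -> int:
        """Map a camera pose (azimuth around the object, layer) to a patch index."""
        sector = int((azimuth_deg % 360.0) / (360.0 / self.per_layer))
        return layer * self.per_layer + sector

    def capture(self, image_id: str, azimuth_deg: float, layer: int) -> int:
        """Record an image for the current pose and light up its patch."""
        patch = self.patch_for_pose(azimuth_deg, layer)
        self.frames.setdefault(patch, []).append(image_id)
        self.lit.add(patch)
        return patch


model = PatchModel()
first = model.capture("img_001", azimuth_deg=5.0, layer=0)    # first pose
second = model.capture("img_002", azimuth_deg=25.0, layer=0)  # second pose
```

Keeping the image-to-patch mapping next to the set of lit patches means the same structure can later be used to derive which images are associated with each other.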
- the terminal device may determine the matching information of the key frame according to the association relationship between the patch corresponding to the key frame and other patches.
- the relationship between the patch corresponding to the key frame and other patches may include: in the patch model, which patches correspond to the top, bottom, left, and right directions of the patch corresponding to the key frame.
- the mobile phone can determine that the other key frames associated with this key frame include the key frame corresponding to patch 21, the key frame corresponding to patch 20, and the key frame corresponding to patch 2. Therefore, the mobile phone can obtain the matching information of this key frame, which includes the identification information of the key frames corresponding to patch 21, patch 20, and patch 2.
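As a sketch of how matching information could be derived from patch adjacency, the example below assumes a two-layer patch model with 20 patches per layer, numbered from 1, and assumes the key frame in question sits on patch 1. Under those assumptions the adjacent patches are 20, 2, and 21, which agrees with the example above. The function names and the layout are hypothetical.

```python
def neighbor_patches(patch, layers=2, per_layer=20):
    """Return the patches adjacent to `patch`: left, right, then vertical neighbours.

    Patches are numbered from 1, ring by ring; each ring wraps around.
    """
    layer, sector = divmod(patch - 1, per_layer)
    left = layer * per_layer + (sector - 1) % per_layer + 1
    right = layer * per_layer + (sector + 1) % per_layer + 1
    result = [left, right]
    for other in (layer - 1, layer + 1):  # vertically adjacent rings, if any
        if 0 <= other < layers:
            result.append(other * per_layer + sector + 1)
    return result


def matching_info(patch_to_keyframes, patch):
    """Collect identifiers of the key frames on the patches adjacent to `patch`."""
    ids = []
    for n in neighbor_patches(patch):
        ids.extend(patch_to_keyframes.get(n, []))
    return ids


keyframes = {1: ["kf_a"], 2: ["kf_b"], 20: ["kf_c"], 21: ["kf_d"], 5: ["kf_e"]}
print(sorted(neighbor_patches(1)))  # [2, 20, 21]
print(matching_info(keyframes, 1))  # ['kf_c', 'kf_b', 'kf_d']
```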
- the terminal device relies only on an ordinary RGB camera to collect the data required for 3D modeling, so the collection process does not depend on the terminal device having special hardware such as a LIDAR sensor or an RGB-D camera.
- the terminal device obtains the 3D model corresponding to the target object according to the multi-frame images and the correlation between the multi-frame images, which can effectively reduce the calculation load of the 3D modeling process and improve the efficiency of 3D modeling.
- the user only needs to perform operations related to collecting data required for 3D modeling on the terminal device side, and then view or preview the final 3D model on the terminal device. For the user, all operations are completed on the terminal device side, which makes the operation easier and the user experience can be better.
- the terminal device includes a first application, and before the terminal device displays the first interface, the method further includes: the terminal device displays a second interface in response to an operation of opening the first application.
- the terminal device displaying the first interface includes: the terminal device displays the first interface in response to an operation of starting the 3D modeling function of the first application on the second interface.
- the second interface may include a function control for starting the 3D modeling function
- the user may click or touch the function control on the second interface
- the mobile phone may respond to the operation of the user clicking or touching the function control on the second interface to start the 3D modeling function of the first application. That is, the user's operation of clicking or touching the functional control on the second interface is an operation of starting the 3D modeling function of the first application on the second interface.
- the first virtual bounding volume includes one or more layers, and the multiple patches are distributed on the one or more layers.
- the structure of the patch model may include upper and lower layers, and each layer may include multiple patches.
- the structure of the patch model may include upper, middle and lower layers, and each layer may include multiple patches.
- the structure of the patch model may be a one-layer structure composed of multiple patches. No limitation is imposed here.
- the method further includes: the terminal device displays first prompt information, where the first prompt information is used to remind the user to place the target object at the center of the shooting picture.
- the first prompt information may be "please place the target object at the center of the screen".
- the method further includes: the terminal device displays second prompt information; the second prompt information is used to remind the user to adjust one or more of the shooting environment where the target object is located, the way of shooting the target object, and the proportion of the screen occupied by the target object.
- the second prompt information may be "place the object still on a solid-color plane with soft lighting, shoot around the object, and keep the object as large and complete in the frame as possible".
- the subsequent data collection process can be faster, and the quality of the collected data can be better.
- the method further includes: the terminal device detects an operation of generating the 3D model; in response to the operation of generating the 3D model, the terminal device displays third prompt information, where the third prompt information is used to prompt the user that the target object is being modeled.
- the third prompt information may be "modeling in progress".
- the method further includes: the terminal device displays fourth prompt information, where the fourth prompt information is used to prompt the user that the modeling of the target object has been completed.
- the fourth prompt information may be "modeling completed".
- the terminal device displaying the 3D model corresponding to the target object further includes: in response to an operation of changing the display angle of the 3D model corresponding to the target object, the terminal device changes the display angle of the 3D model corresponding to the target object; the operation of changing the display angle of the 3D model corresponding to the target object includes an operation of dragging the 3D model to rotate clockwise or counterclockwise along the first direction.
- the first direction may be any direction, such as a horizontal direction, a vertical direction, and the like.
- the terminal device changes the display angle of the 3D model corresponding to the target object, so as to achieve the effect of presenting the 3D model to the user at different angles.
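A minimal sketch of turning a drag gesture into a change of display angle, assuming the first direction is horizontal and the model rotates about its vertical axis. The `degrees_per_px` sensitivity and both function names are illustrative assumptions.

```python
import math


def drag_to_angle(drag_px, degrees_per_px=0.5):
    """Convert a signed drag distance in pixels to a rotation angle in degrees.

    The sign of the drag selects the rotation direction.
    """
    return drag_px * degrees_per_px


def rotate_about_vertical(point, angle_deg):
    """Rotate an (x, y, z) point about the vertical (y) axis."""
    x, y, z = point
    a = math.radians(angle_deg)
    return (x * math.cos(a) + z * math.sin(a), y, -x * math.sin(a) + z * math.cos(a))


angle = drag_to_angle(180)  # a 180-pixel drag gives 90 degrees
rotated = rotate_about_vertical((1.0, 0.0, 0.0), angle)
```

Applying the same rotation to every vertex of the model (or, equivalently, to the viewing camera) presents the model from the new angle.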
- the terminal device displaying the 3D model corresponding to the target object further includes: in response to an operation of changing the display size of the 3D model corresponding to the target object, the terminal device changes the display size of the 3D model corresponding to the target object; the operation of changing the display size of the 3D model corresponding to the target object includes an operation of zooming the 3D model in or out.
- the zoom-out operation may be an operation in which the user slides two fingers inward (toward each other) on the 3D model preview interface.
- the zoom-in operation may be an operation in which the user slides two fingers outward (away from each other) on the 3D model preview interface.
- the 3D model preview interface is also an interface where the terminal device displays the 3D model corresponding to the target object.
- the zoom-in or zoom-out operation performed by the user on the 3D model corresponding to the target object can also be a double-click operation or a long-press operation; alternatively, the 3D model preview interface can include function controls for zooming in or out. This is not limited here.
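The two-finger zoom gestures above can be reduced to a single scale factor. The sketch below assumes touch positions are given as (x, y) pixel coordinates; `pinch_scale` is a hypothetical helper, not a platform API.

```python
import math


def pinch_scale(start_a, start_b, end_a, end_b):
    """Return the zoom factor for a two-finger gesture.

    Each argument is an (x, y) touch position. A result above 1 means the
    fingers moved apart (zoom in); below 1 means they moved together (zoom out).
    """
    d0 = math.dist(start_a, start_b)
    d1 = math.dist(end_a, end_b)
    if d0 == 0:
        raise ValueError("start touches coincide")
    return d1 / d0


zoom_in = pinch_scale((100, 100), (200, 100), (50, 100), (250, 100))   # 2.0
zoom_out = pinch_scale((50, 100), (250, 100), (100, 100), (200, 100))  # 0.5
```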
- the association relationship between the multiple frames of images includes matching information of each frame of image in the multiple frames of images; the matching information of each frame of image includes identification information of the other images, among the multiple frames of images, that are associated with that image; the matching information of each frame of image is obtained from the association between that image and its corresponding patch, and from the association relationship among the multiple patches.
- the identification information of the other key frames associated with the key frame with picture number 18 is the picture numbers of those key frames, such as 26, 45, 59, 78, 89, 100, 449, etc.
- the terminal device collecting, in response to the collection operation, multiple frames of images corresponding to the target object to be modeled, and acquiring the association relationship between the multiple frames of images, further includes: the terminal device determines the target object according to the shooting picture; when the terminal device collects the multiple frames of images, the target object is located at the center of the shooting picture.
- the terminal device collecting multiple frames of images corresponding to the target object to be modeled includes: during the process of shooting the target object, the terminal device performs blur detection on each captured frame, and uses the images whose sharpness is greater than a first threshold as the images corresponding to the target object.
- by performing blur detection on the pictures taken at the shooting position, the terminal device can obtain key frame pictures of better quality; these key frame pictures are the images corresponding to the target object.
- the number of key frame images corresponding to each patch can be one or more.
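The text does not say which blur metric is used. As one common choice, the sketch below scores sharpness by the variance of a Laplacian response and keeps frames that exceed a threshold; the metric, the threshold value, and the function names are assumptions made for illustration.

```python
from statistics import pvariance


def sharpness(gray):
    """Variance of a 4-neighbour Laplacian over a 2D list of grey values.

    Higher values indicate stronger edges, i.e. a sharper image.
    The image must be at least 3x3 so that interior pixels exist.
    """
    h, w = len(gray), len(gray[0])
    responses = [
        gray[y - 1][x] + gray[y + 1][x] + gray[y][x - 1] + gray[y][x + 1] - 4 * gray[y][x]
        for y in range(1, h - 1)
        for x in range(1, w - 1)
    ]
    return pvariance(responses)


def select_keyframes(frames, threshold):
    """Keep only the frames whose sharpness exceeds `threshold`."""
    return [name for name, gray in frames if sharpness(gray) > threshold]


sharp = [[0, 255, 0, 255], [255, 0, 255, 0], [0, 255, 0, 255], [255, 0, 255, 0]]
flat = [[128] * 4 for _ in range(4)]
selected = select_keyframes([("sharp", sharp), ("flat", flat)], threshold=100.0)
print(selected)  # ['sharp']
```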
- displaying the three-dimensional model corresponding to the target object on the terminal device includes: displaying the three-dimensional model corresponding to the target object by the terminal device in response to an operation of previewing the three-dimensional model corresponding to the target object.
- the terminal device may display a view button, and the user may click the view button, and the terminal device may display the 3D model corresponding to the target object in response to the user's operation of clicking the view button.
- the user's operation of clicking the view button is an operation of previewing the 3D model corresponding to the target object.
- the three-dimensional model corresponding to the target object includes a basic three-dimensional model of the target object and a texture of the surface of the target object.
- the texture on the surface of the target object may be a texture map on the surface of the target object.
- the 3D model of the target object can be generated according to the basic 3D model of the target object and the texture of the surface of the target object. Mapping the texture of the surface of the target object onto the surface of the basic 3D model of the target object in a specific way can restore the surface of the target object more realistically and make the target object look more real.
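To illustrate mapping a texture onto the basic 3D model, the sketch below assumes each vertex carries UV coordinates in [0, 1] and looks up the nearest texel. The text only says the texture is mapped "in a specific way", so this nearest-neighbour UV lookup is an assumed stand-in, and the function names are hypothetical.

```python
def sample_texture(texture, u, v):
    """Nearest-neighbour lookup of a texel at UV coordinates in [0, 1]."""
    height, width = len(texture), len(texture[0])
    col = min(int(u * width), width - 1)
    row = min(int(v * height), height - 1)
    return texture[row][col]


def texture_vertices(vertex_uvs, texture):
    """Assign a colour to each vertex of the basic model from the texture map."""
    return [sample_texture(texture, u, v) for u, v in vertex_uvs]


# 2x2 RGB texture map and the UV coordinates of three mesh vertices.
texture = [[(255, 0, 0), (0, 255, 0)],
           [(0, 0, 255), (255, 255, 255)]]
uvs = [(0.1, 0.1), (0.9, 0.1), (0.1, 0.9)]
colours = texture_vertices(uvs, texture)
```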
- the terminal device is connected to the server; the terminal device obtaining the three-dimensional model corresponding to the target object according to the multi-frame images and the association relationship between the multi-frame images includes: the terminal device sends the multi-frame images and the association relationship between the multi-frame images to the server; the terminal device receives the 3D model corresponding to the target object sent from the server.
- in this design, the process of generating the three-dimensional model corresponding to the target object according to the multi-frame images and the association relationship between the multi-frame images can be completed on the server side. That is, this design realizes 3D modeling by using the computing resources of the server. It can be applied to scenarios where the computing power of the terminal device is weak, which improves the universality of the 3D modeling method.
- the method further includes: the terminal device sends camera intrinsic parameters, gravity direction information, image name, image number, camera pose information, and time stamps respectively corresponding to the multiple frames of images to the server.
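One possible shape for the per-frame data listed above is shown below. The field names, the JSON encoding, and the example values are assumptions; the text does not specify a wire format.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class FrameMetadata:
    """Per-frame data sent alongside each image (field names are illustrative)."""
    image_name: str
    image_number: int
    timestamp: float
    intrinsics: list         # [fx, fy, cx, cy]
    gravity_direction: list  # unit vector in the camera frame
    camera_pose: list        # 4x4 camera-to-world matrix, row-major


def encode(frames):
    """Serialize a list of FrameMetadata records to a JSON string."""
    return json.dumps([asdict(f) for f in frames])


frame = FrameMetadata(
    image_name="18.jpg",
    image_number=18,
    timestamp=0.6,
    intrinsics=[1500.0, 1500.0, 960.0, 540.0],
    gravity_direction=[0.0, -1.0, 0.0],
    camera_pose=[1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1],
)
payload = encode([frame])
```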
- the method further includes: the terminal device receiving an indication message from the server, where the indication message is used to indicate to the terminal device that the server has completed modeling of the target object.
- the terminal device may display the fourth prompt information after receiving the indication message.
- the method further includes: the terminal device sends a download request message to the server, where the download request message is used to request to download, from the server, the 3D model corresponding to the target object.
- the server may send the three-dimensional model corresponding to the target object to the terminal device.
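The exchange between terminal device and server (upload, indication message, download request, model delivery) can be sketched with an in-memory stand-in for the server. The class, message fields, and job identifier are hypothetical, and the reconstruction itself is replaced by a placeholder string.

```python
class ModelingServer:
    """In-memory stand-in for the server side of the exchange."""

    def __init__(self):
        self.jobs = {}

    def upload(self, job_id, images, association):
        """Receive the frames and their association relationship, then 'model'."""
        # Placeholder for real reconstruction: only record what was received.
        self.jobs[job_id] = {"model": f"model<{len(images)} frames>", "done": True}
        return {"type": "indication", "job_id": job_id, "status": "completed"}

    def download(self, job_id):
        """Answer a download request with the finished model."""
        job = self.jobs.get(job_id)
        if job is None or not job["done"]:
            raise KeyError(f"no finished model for job {job_id}")
        return job["model"]


def run_terminal(server):
    """Terminal side: upload, wait for the indication, then download."""
    images = ["18.jpg", "26.jpg", "45.jpg"]
    association = {"18.jpg": ["26.jpg", "45.jpg"]}
    indication = server.upload("job-1", images, association)
    prompts = []
    if indication["status"] == "completed":
        prompts.append("modeling completed")  # fourth prompt information
        return prompts, server.download("job-1")
    return prompts, None


prompts, model = run_terminal(ModelingServer())
```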
- when the terminal device collects the data required for 3D modeling of the target object, it can also display the scanning progress on the first interface.
- the first interface may include a scan button, and the terminal device may display the scan progress through the circular black filling effect of the scan button on the first interface.
- when the UI presentation effect of the scan button differs, the way the mobile phone displays the scanning progress on the first interface can differ accordingly, which is not limited here.
- the terminal device may also not display the scanning progress, in which case the user can learn the scanning progress from the lighting-up of the patches in the first virtual bounding volume.
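A simple way to derive a scanning progress value is the fraction of patches that have been lit. This is an assumed definition, since the text does not say how progress is computed; the text bar stands in for the fill effect of the scan button.

```python
def scan_progress(lit_patches, total_patches):
    """Fraction of patches already captured, clamped to [0, 1]."""
    if total_patches <= 0:
        raise ValueError("total_patches must be positive")
    return min(len(set(lit_patches)) / total_patches, 1.0)


def progress_bar(fraction, width=20):
    """Text rendering of a fill effect, e.g. for a scan button."""
    filled = round(fraction * width)
    return "#" * filled + "." * (width - filled)


fraction = scan_progress(lit_patches=[1, 2, 3, 20, 21], total_patches=40)
print(f"{fraction:.1%} {progress_bar(fraction)}")
```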
- an embodiment of the present application provides a modeling device, which can be applied to a terminal device, and used to implement the modeling method described in the first aspect above.
- the functions of the device can be realized by hardware, and can also be realized by executing corresponding software by hardware.
- the hardware or software includes one or more modules or units corresponding to the above functions, for example, the device may include: a display unit and a processing unit. The display unit and the processing unit can be used to cooperate to implement the modeling method described in the first aspect above.
- the display unit is configured to display a first interface, where the first interface includes the shooting picture of the terminal device.
- the processing unit is configured to, in response to the collection operation, collect multiple frames of images corresponding to the target object to be modeled and acquire the association relationship between the multiple frames of images; and to acquire a three-dimensional model corresponding to the target object according to the multi-frame images and the association relationship between the multi-frame images.
- the display unit is also used to display the three-dimensional model corresponding to the target object.
- the display unit is further configured to display a first virtual bounding volume; the first virtual bounding volume includes multiple patches.
- the processing unit is specifically configured to: collect the first image when the terminal device is in the first pose, and change the display effect of the patch corresponding to the first image; collect the second image when the terminal device is in the second pose, and change the display effect of the patch corresponding to the second image; and, after changing the display effect of the multiple patches of the first virtual bounding volume, acquire the association relationship between the multiple frames of images according to the multiple patches.
- the display unit and the processing unit are further configured to implement other display functions and processing functions in the method described in the first aspect above, which will not be repeated here.
- when the terminal device described in the first aspect above sends the multi-frame images and the association relationship between the multi-frame images to the server, and the server generates the 3D model corresponding to the target object according to the multi-frame images and the association relationship between the multi-frame images, the modeling device may also include a sending unit and a receiving unit; the sending unit is used to send the multi-frame images and the association relationship between the multi-frame images to the server, and the receiving unit is used to receive the 3D model corresponding to the target object sent from the server.
- an embodiment of the present application provides an electronic device, including: a processor; a memory; and a computer program; wherein the computer program is stored in the memory, and when the computer program is executed by the processor, the electronic device implements the method described in the first aspect and any possible implementation manner of the first aspect.
- an embodiment of the present application provides a computer-readable storage medium, the computer-readable storage medium includes a computer program, and when the computer program is run on an electronic device, the electronic device implements the method described in the first aspect and any possible implementation manner of the first aspect.
- the embodiment of the present application further provides a computer program product, including computer readable code; when the computer readable code is run in an electronic device, the electronic device implements the method described in the first aspect and any possible implementation manner of the first aspect.
- the embodiment of the present application also provides a modeling method, the method is applied to a server, and the server is connected to the terminal device; the method includes: the server receives the multiple frames of images corresponding to the target object sent from the terminal device and the association relationship between the multiple frames of images; the server generates a three-dimensional model corresponding to the target object according to the multi-frame images and the association relationship between the multi-frame images; the server sends the three-dimensional model corresponding to the target object to the terminal device.
- the method can utilize computing resources of the server to implement 3D modeling, and the process of generating a 3D model corresponding to the target object according to the multi-frame images and the association relationship between the multi-frame images can be completed on the server side.
- the server performs 3D modeling in combination with the association relationship among the multiple frames of images, which can effectively reduce the computing load of the server and improve modeling efficiency.
- when the server performs 3D modeling, it only needs to combine the association relationship between the multiple frames of images and perform feature detection and matching between each frame of image and the other images associated with that image, rather than between each image and all other images. In this way, two adjacent frames of images can be compared quickly, which effectively reduces the computing load of the server and improves the efficiency of 3D modeling.
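The saving from matching only associated images can be made concrete by counting image pairs. The sketch below compares exhaustive pairing with pairing guided by matching information; the five-frame ring of associations is a hypothetical example.

```python
from itertools import combinations


def exhaustive_pairs(image_ids):
    """Every unordered pair of images: n * (n - 1) / 2 comparisons."""
    return set(combinations(sorted(image_ids), 2))


def guided_pairs(matching_info):
    """Only the pairs named in the matching information."""
    pairs = set()
    for image_id, associated in matching_info.items():
        for other in associated:
            pairs.add(tuple(sorted((image_id, other))))
    return pairs


# Hypothetical matching information for five key frames arranged in a ring.
matching_info = {1: [2, 5], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 1]}
print(len(exhaustive_pairs(matching_info)))  # 10
print(len(guided_pairs(matching_info)))      # 5
```

With n images and at most k associated images each, guided matching needs at most n * k / 2 pair comparisons instead of n * (n - 1) / 2, so the gap widens as more frames are captured.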
- the server can combine the matching information of the first frame image to quickly and accurately determine the mapping relationship between the texture of the other images associated with the first frame image and the surface of the basic 3D model of the target object. Similarly, for each subsequent frame of image, the server can combine the matching information of that image to quickly and accurately determine the mapping relationship between the texture of the other images associated with that image and the surface of the basic 3D model of the target object.
- the method can be applied to some scenarios where the computing power of some terminal devices is weak, and the universality of the 3D modeling method can be improved.
- this method also has the other beneficial effects described in the first aspect above, for example: the process of collecting the data required for 3D modeling does not need to rely on the terminal device having special hardware such as a LIDAR sensor or an RGB-D camera.
- the server obtains the 3D model corresponding to the target object according to the multi-frame images and the association relationship between the multi-frame images, which can effectively reduce the calculation load of the 3D modeling process and improve the efficiency of the 3D modeling, etc., and will not be repeated here.
- the method further includes: the server receives camera intrinsic parameters, gravity direction information, image name, image number, camera pose information, and time stamps respectively corresponding to the multi-frame images sent from the terminal device.
- the server generating a three-dimensional model corresponding to the target object according to the multi-frame images and the association relationship between the multi-frame images includes: the server generates the three-dimensional model corresponding to the target object according to the multi-frame images, the association relationship between the multi-frame images, and the camera intrinsic parameters, gravity direction information, image name, image number, camera pose information, and time stamp corresponding to each of the multi-frame images.
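- as a rough illustration of the per-frame data listed above, the following sketch groups the fields into one record per image and serializes them into an upload manifest. The class name, the field layouts (for example `fx, fy, cx, cy` for the intrinsics and a 7-value pose), and the JSON format are assumptions made for illustration; the application does not specify them.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List, Tuple


@dataclass
class FrameRecord:
    """One captured image plus the metadata the server receives with it."""
    image_name: str
    image_number: int
    timestamp: float                                 # seconds (assumed unit)
    intrinsics: Tuple[float, float, float, float]    # fx, fy, cx, cy (assumed layout)
    gravity_direction: Tuple[float, float, float]    # unit vector in the camera frame (assumed)
    camera_pose: Tuple[float, ...]                   # 3 position + 4 quaternion values (assumed)
    associated_images: List[int] = field(default_factory=list)


def build_upload_manifest(frames):
    """Return a JSON-serializable description of all frames, ordered by image number."""
    ordered = sorted(frames, key=lambda f: f.image_number)
    return {"frame_count": len(ordered), "frames": [asdict(f) for f in ordered]}


# Hypothetical two-frame capture.
frames = [
    FrameRecord("img_0001.jpg", 1, 0.5, (1000.0, 1000.0, 640.0, 360.0),
                (0.0, -1.0, 0.0), (0.1, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0), [0]),
    FrameRecord("img_0000.jpg", 0, 0.0, (1000.0, 1000.0, 640.0, 360.0),
                (0.0, -1.0, 0.0), (0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0), [1]),
]
manifest = build_upload_manifest(frames)
print(json.dumps(manifest)[:80])
```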
- the association relationship between the multiple frames of images includes the matching information of each frame of image in the multiple frames of images; the matching information of each frame of image includes the identification information of the other images, among the multiple frames of images, that are associated with that image; the matching information of each frame of image is obtained according to the association relationship between each frame of image and the patch corresponding to that image, and the association relationship between the multiple patches.
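- one way to read the paragraph above is that two images are associated when their patches are the same or neighbouring patches. The sketch below derives matching information under that assumption; the adjacency rule, the function name, and the sample data are hypothetical.

```python
def build_matching_info(image_to_patch, patch_neighbours):
    """For each image, list the other images whose patch is the same as or adjacent to its patch."""
    # Invert the image -> patch association so images can be looked up per patch.
    patch_to_images = {}
    for image_id, patch_id in image_to_patch.items():
        patch_to_images.setdefault(patch_id, []).append(image_id)

    matching = {}
    for image_id, patch_id in image_to_patch.items():
        related = set()
        for p in [patch_id, *patch_neighbours.get(patch_id, [])]:
            related.update(patch_to_images.get(p, []))
        related.discard(image_id)  # an image is not associated with itself
        matching[image_id] = sorted(related)
    return matching


# Hypothetical data: four patches arranged in a ring, one image captured per patch.
patch_neighbours = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
image_to_patch = {"img_0": 0, "img_1": 1, "img_2": 2, "img_3": 3}
matching_info = build_matching_info(image_to_patch, patch_neighbours)
print(matching_info["img_0"])
```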
- the embodiment of the present application provides a modeling device, which can be applied to a server to implement the modeling method described in the sixth aspect.
- the functions of the device can be realized by hardware, and can also be realized by executing corresponding software by hardware.
- Hardware or software includes one or more modules or units corresponding to the above functions, for example, the device may include: a receiving unit, a processing unit, and a sending unit. The receiving unit, the processing unit and the sending unit may be used to cooperate to implement the modeling method described in the sixth aspect.
- the receiving unit may be configured to receive multiple frames of images corresponding to the target object sent from the terminal device and an association between the multiple frames of images.
- the processing unit may be configured to generate a three-dimensional model corresponding to the target object according to the multi-frame images and the association relationship between the multi-frame images.
- the sending unit may be used to send the three-dimensional model corresponding to the target object to the terminal device.
- the receiving unit, the processing unit, and the sending unit may be used to implement all the functions that the server in the method described in the sixth aspect may implement, and details will not be repeated here.
- the embodiment of the present application provides an electronic device, including: a processor; a memory; and a computer program; wherein the computer program is stored in the memory, and when the computer program is executed by the processor, the electronic device implements the method described in the sixth aspect or any possible implementation of the sixth aspect.
- an embodiment of the present application provides a computer-readable storage medium, the computer-readable storage medium includes a computer program, and when the computer program is run on an electronic device, the electronic device implements the method described in the sixth aspect or any possible implementation of the sixth aspect.
- the embodiment of the present application also provides a computer program product including computer-readable code; when the computer-readable code is run in an electronic device, the electronic device implements the method described in the sixth aspect or any possible implementation of the sixth aspect.
- the embodiment of the present application further provides a device-cloud collaboration system, including a terminal device and a server, where the terminal device is connected to the server. The terminal device displays a first interface, and the first interface includes the shooting picture of the terminal device. In response to a collection operation, the terminal device collects multiple frames of images corresponding to the target object to be modeled and obtains the association relationship between the multiple frames of images. While collecting the multiple frames of images corresponding to the target object, the terminal device displays a first virtual bounding volume, and the first virtual bounding volume includes a plurality of patches. Collecting the multiple frames of images and obtaining the association relationship between them includes: when the terminal device is in a first pose, the terminal device collects a first image and changes the display effect of the patch corresponding to the first image; when the terminal device is in a second pose, the terminal device collects a second image and changes the display effect of the patch corresponding to the second image
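- the relationship between device pose, captured image, and patch can be sketched as follows. The sketch assumes, purely for illustration, that the bounding volume is a hemisphere around the object divided into azimuth and elevation bins, and that "changing the display effect" is modeled as setting a flag; none of these details come from the application.

```python
import math

AZIMUTH_BINS = 8     # assumed number of patches around the object
ELEVATION_BINS = 2   # assumed number of patch rings from horizon to top


def patch_index(camera_pos, center=(0.0, 0.0, 0.0)):
    """Map a camera position to the index of the patch it faces the object through."""
    dx, dy, dz = (c - o for c, o in zip(camera_pos, center))
    azimuth = math.atan2(dy, dx) % (2 * math.pi)               # 0 .. 2*pi
    elevation = math.atan2(max(dz, 0.0), math.hypot(dx, dy))   # 0 .. pi/2
    a = min(int(azimuth / (2 * math.pi) * AZIMUTH_BINS), AZIMUTH_BINS - 1)
    e = min(int(elevation / (math.pi / 2) * ELEVATION_BINS), ELEVATION_BINS - 1)
    return e * AZIMUTH_BINS + a


class BoundingVolume:
    """Tracks which patches already have an image captured for them."""

    def __init__(self):
        self.captured = [False] * (AZIMUTH_BINS * ELEVATION_BINS)
        self.patch_of_image = {}

    def on_image_captured(self, image_id, camera_pos):
        idx = patch_index(camera_pos)
        self.captured[idx] = True            # stands in for changing the patch's display effect
        self.patch_of_image[image_id] = idx  # image-to-patch association
        return idx


volume = BoundingVolume()
first = volume.on_image_captured("first_image", (1.0, 0.5, 0.0))    # first pose
second = volume.on_image_captured("second_image", (-1.0, 0.5, 2.0))  # second pose
print(first, second, sum(volume.captured))
```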
- FIG. 1 is a schematic diagram of the composition of the device-cloud collaboration system provided by the embodiment of the present application.
- FIG. 2 is a schematic structural diagram of a terminal device provided in an embodiment of the present application.
- FIG. 3 is a schematic diagram of the main interface of the mobile phone provided by the embodiment of the present application.
- FIG. 4 is a schematic diagram of the main interface of the first application provided by the embodiment of the present application.
- FIG. 5 is a schematic diagram of the 3D modeling data collection interface provided by the embodiment of the present application.
- FIG. 6 is another schematic diagram of the 3D modeling data acquisition interface provided by the embodiment of the present application.
- FIG. 7A is another schematic diagram of the 3D modeling data acquisition interface provided by the embodiment of the present application.
- FIG. 7B is another schematic diagram of the 3D modeling data acquisition interface provided by the embodiment of the present application.
- FIG. 7C is another schematic diagram of the 3D modeling data acquisition interface provided by the embodiment of the present application.
- FIG. 7D is another schematic diagram of the 3D modeling data collection interface provided by the embodiment of the present application.
- FIG. 7E is another schematic diagram of the 3D modeling data acquisition interface provided by the embodiment of the present application.
- FIG. 7F is another schematic diagram of the 3D modeling data collection interface provided by the embodiment of the present application.
- FIG. 8 is a schematic structural diagram of a patch model provided by an embodiment of the present application.
- FIG. 9 is another schematic diagram of the 3D modeling data collection interface provided by the embodiment of the present application.
- FIG. 10 is another schematic diagram of the 3D modeling data collection interface provided by the embodiment of the present application.
- FIG. 11 is another schematic diagram of the 3D modeling data collection interface provided by the embodiment of the present application.
- FIG. 12 is a schematic diagram of the 3D model preview interface provided by the embodiment of the present application.
- FIG. 13 is another schematic diagram of the 3D model preview interface provided by the embodiment of the present application.
- FIG. 14 is another schematic diagram of the 3D model preview interface provided by the embodiment of the present application.
- FIG. 15 is a schematic diagram of the user performing a counterclockwise rotation operation along the horizontal direction on the 3D model of the toy car provided by the embodiment of the present application.
- FIG. 16 is another schematic diagram of the 3D model preview interface provided by the embodiment of the present application.
- FIG. 17 is a schematic diagram of the user performing a zoom-out operation on the 3D model of the toy car provided by the embodiment of the present application.
- FIG. 18 is another schematic diagram of the 3D model preview interface provided by the embodiment of the present application.
- FIG. 19 is a schematic flow chart of the 3D modeling method provided by the embodiment of the present application.
- FIG. 20 is a logical schematic diagram of a 3D modeling method implemented by a device-cloud collaboration system provided in an embodiment of the present application.
- FIG. 21 is a schematic structural diagram of a modeling device provided by an embodiment of the present application.
- FIG. 22 is another schematic structural diagram of the modeling device provided by the embodiment of the present application.
- FIG. 23 is another schematic structural diagram of the modeling device provided by the embodiment of the present application.
- reference to “one embodiment” or “some embodiments” or the like in this specification means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the present application.
- appearances of the phrases “in one embodiment,” “in some embodiments,” “in other embodiments,” “in still other embodiments,” etc. in various places in this specification do not necessarily all refer to the same embodiment, but mean “one or more but not all embodiments” unless specifically stated otherwise.
- the terms “including”, “comprising”, “having” and variations thereof mean “including but not limited to”, unless specifically stated otherwise.
- the term “connected” includes both direct and indirect connections, unless otherwise stated.
- the terms “first” and “second” are used for descriptive purposes only, and cannot be understood as indicating or implying relative importance or implicitly specifying the quantity of the indicated technical features. Thus, a feature defined as “first” or “second” may explicitly or implicitly include one or more of these features.
- Three-dimensional (3D) reconstruction technology is widely used in virtual reality (VR), augmented reality (AR), extended reality (XR), mixed reality (MR), games, film and television, education, medical care, and other fields.
- for example, 3D reconstruction technology can be used to model characters, props, vegetation, etc. in games, to model characters in film and television, to model chemistry-related structures in the field of education, or to model human body structures in the medical field.
- at present, most 3D reconstruction applications/software that can be used to realize 3D modeling need to run on a personal computer (PC), and only a small number of 3D reconstruction applications can realize 3D modeling on a mobile terminal (such as a mobile phone).
- when a 3D reconstruction application on the PC side realizes 3D modeling, the user needs to use a mobile tool (such as a mobile phone or a camera) to collect the data required for 3D modeling (such as pictures and depth information) and upload the collected data to the PC side; the 3D reconstruction application on the PC side can then perform 3D modeling processing based on the uploaded data.
- when a 3D reconstruction application on the mobile terminal realizes 3D modeling, the user can directly use the mobile terminal to collect the data required for 3D modeling, and the 3D reconstruction application on the mobile terminal can directly perform 3D modeling processing based on the data collected by the mobile terminal.
- however, when the user uses the mobile terminal to collect the data required for 3D modeling, the collection must rely on special hardware of the mobile terminal, such as a light detection and ranging (LIDAR) sensor or an RGB depth (RGB-D) camera, so the data collection process required for 3D modeling places high requirements on hardware.
- 3D modeling by the 3D reconstruction application on the PC/mobile terminal also places relatively high requirements on the hardware of the PC/mobile terminal.
- for example, the PC/mobile terminal may be required to be equipped with a high-performance discrete graphics card.
- in addition, the above-mentioned method of realizing 3D modeling on the PC side is relatively cumbersome. For example, after performing the relevant data collection operations on the mobile side, the user not only needs to copy or transfer the collected data to the PC side, but also needs to perform the related modeling operations in the 3D reconstruction application on the PC side.
- the embodiment of the present application provides a 3D modeling method, which can be applied to a device-cloud collaboration system composed of a terminal device and a cloud.
- the “device” in device-cloud collaboration refers to the terminal device, and the “cloud” refers to the cloud, which can also be called a cloud server or a cloud platform.
- the terminal device can collect the data required for 3D modeling, preprocess that data, and upload the preprocessed data to the cloud;
- the cloud can receive the preprocessed data uploaded by the terminal device and perform 3D modeling based on it;
- the terminal device can download from the cloud the 3D model obtained by the cloud through 3D modeling, and provide a preview function for the 3D model.
- the terminal device can realize 3D modeling only by relying on the ordinary RGB camera to collect the data required for 3D modeling.
- the process of collecting the data required for 3D modeling does not need to rely on the terminal device having special hardware such as a LIDAR sensor or an RGB-D camera; the process of 3D modeling is completed on the cloud and does not rely on a high-performance discrete graphics card configured on the terminal device. That is, the 3D modeling method has relatively low requirements on the hardware of the terminal device.
- the user only needs to perform the operations related to collecting the data required for 3D modeling on the terminal device side, and can then view or preview the final 3D model on the terminal device.
- for the user, all operations are completed on the terminal device side, which makes the operation simpler and the user experience better.
- FIG. 1 is a schematic composition diagram of a device-cloud collaboration system provided by an embodiment of the present application.
- the device-cloud collaboration system provided by the embodiment of the present application may include: a cloud 100 and a terminal device 200, and the terminal device 200 may be connected to the cloud 100 through a wireless network.
- the cloud 100 is also a server.
- the cloud 100 may be a single server or a server cluster composed of multiple servers, and the present application does not limit the implementation architecture of the cloud 100 .
- the terminal device 200 may be an interactive electronic whiteboard with a shooting function, a mobile phone, a wearable device (such as a smart watch or a smart bracelet), a tablet computer, a notebook computer, a desktop computer, a portable electronic device (such as a laptop), an ultra-mobile personal computer (UMPC), a netbook, a personal digital assistant (PDA), a smart TV (such as a smart screen), an in-vehicle computer, a smart speaker, an augmented reality (AR) device, a virtual reality (VR) device, or another smart device with a display screen; it may also be professional shooting equipment such as a digital camera, an SLR/mirrorless camera, an action camera, a pan-tilt camera, or an unmanned aerial vehicle. The embodiment of the present application does not limit the specific type of the terminal device.
- when the terminal device is a shooting device such as a pan-tilt camera or a drone, it also includes a display device that can provide a shooting interface, which is used to display the collection interface for collecting the data required for 3D modeling, the preview interface of the 3D model, and so on.
- the display device of the pan-tilt camera can be a mobile phone
- the display device of the aerial drone can be a remote control device, etc.
- a terminal device 200 is exemplarily shown in FIG. 1 .
- the terminal device 200 in the device-cloud collaboration system may include one or more terminal devices 200, and the multiple terminal devices 200 may be the same, different or partly the same, which are not limited herein.
- the 3D modeling method provided in the embodiment of the present application is a process for realizing 3D modeling through interaction between each terminal device 200 and the cloud 100 .
- FIG. 2 is a schematic structural diagram of the terminal device provided in the embodiment of the present application.
- the mobile phone can include a processor 210, an external memory interface 220, an internal memory 221, a universal serial bus (USB) interface 230, a charging management module 240, a power management module 241, a battery 242, an antenna 1, an antenna 2, a mobile communication module 250, a wireless communication module 260, an audio module 270, a speaker 270A, a receiver 270B, a microphone 270C, an earphone jack 270D, a sensor module 280, a button 290, a motor 291, an indicator 292, a camera 293, a display screen 294, a subscriber identification module (SIM) card interface 295, and so on.
- the processor 210 may include one or more processing units, for example: the processor 210 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a memory, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU), etc. Different processing units may be independent devices, or may be integrated in one or more processors.
- the controller can be the nerve center and command center of the mobile phone.
- the controller can generate an operation control signal according to the instruction opcode and timing signal, and complete the control of fetching and executing the instruction.
- a memory may also be provided in the processor 210 for storing instructions and data.
- the memory in processor 210 is a cache memory.
- the memory may hold instructions or data that the processor 210 has just used or uses cyclically. If the processor 210 needs to use the instructions or data again, it can fetch them directly from this memory. Repeated access is thereby avoided and the waiting time of the processor 210 is reduced, which improves the efficiency of the system.
- processor 210 may include one or more interfaces.
- the interface may include an inter-integrated circuit (I2C) interface, an inter-integrated circuit sound (I2S) interface, a pulse code modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input/output (GPIO) interface, a SIM interface, and/or a USB interface, etc.
- the external memory interface 220 can be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the mobile phone.
- the external memory card communicates with the processor 210 through the external memory interface 220 to implement a data storage function, for example, saving files such as music and videos in the external memory card.
- the internal memory 221 may be used to store computer-executable program codes including instructions.
- the processor 210 executes various functional applications and data processing of the mobile phone by executing instructions stored in the internal memory 221 .
- the internal memory 221 may also include an area for storing programs and an area for storing data.
- the program storage area may store an operating system, at least one application required by a function (such as the first application described in the embodiment of the present application), and the like.
- the storage data area can store data created during the use of the mobile phone (such as image data, phone book) and the like.
- the internal memory 221 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, flash memory device, universal flash storage (universal flash storage, UFS) and the like.
- the charging management module 240 is configured to receive charging input from the charger. While the charging management module 240 is charging the battery 242 , it can also provide power for the mobile phone through the power management module 241 .
- the power management module 241 is used for connecting the battery 242 , the charging management module 240 , and the processor 210 .
- the power management module 241 can also receive the input of the battery 242 to provide power for the mobile phone.
- the wireless communication function of the mobile phone can be realized by the antenna 1, the antenna 2, the mobile communication module 250, the wireless communication module 260, the modem processor and the baseband processor.
- Antenna 1 and Antenna 2 are used to transmit and receive electromagnetic wave signals.
- Each antenna in a mobile phone can be used to cover single or multiple communication bands. Different antennas can also be multiplexed to improve the utilization of the antennas.
- Antenna 1 can be multiplexed as a diversity antenna of a wireless local area network.
- the antenna may be used in conjunction with a tuning switch.
- the mobile phone can realize audio functions, such as music playback and recording, through the audio module 270, the speaker 270A, the receiver 270B, the microphone 270C, the earphone jack 270D, and the application processor.
- the sensor module 280 may include a pressure sensor 280A, a gyro sensor 280B, an air pressure sensor 280C, a magnetic sensor 280D, an acceleration sensor 280E, a distance sensor 280F, a proximity light sensor 280G, a fingerprint sensor 280H, a temperature sensor 280J, a touch sensor 280K, an ambient light sensor 280L, bone conduction sensor 280M, etc.
- the camera 293 may include various types.
- the camera 293 may include a telephoto camera, a wide-angle camera or an ultra-wide-angle camera with different focal lengths.
- the field of view of the telephoto camera is small, which is suitable for shooting distant scenes in a small range; the field of view of the wide-angle camera is relatively large.
- the telephoto camera with a smaller field of view can be rotated so as to capture scenes in different ranges.
- the mobile phone can capture raw images (also called RAW images or digital negatives) through the camera 293 .
- the camera 293 includes at least a lens (lens) and a sensor (sensor).
- when the shutter is opened, light can be transmitted to the sensor through the lens of the camera 293.
- the sensor can convert the optical signal passing through the lens into an electrical signal, then perform analog-to-digital (A/D) conversion on the electrical signal, and output a corresponding digital signal.
- This digital signal is the RAW image.
- the mobile phone can perform subsequent ISP processing and YUV domain processing on the RAW image through the processor (such as: ISP, DSP, etc.), and convert the RAW image into an image that can be used for display, such as: JPEG image or high-efficiency image file format (high efficiency image file format, HEIF) image.
- JPEG images or HEIF images can be transmitted to the display screen of the mobile phone for display, and/or transmitted to the memory of the mobile phone for storage.
- the mobile phone can realize the function of shooting.
- the photosensitive element of the sensor may be a charge coupled device (CCD), and the sensor also includes an A/D converter.
- the photosensitive element of the sensor may be a complementary metal-oxide-semiconductor (CMOS).
- the ISP processing may include: bad pixel correction (DPC), RAW domain noise reduction, black level correction (BLC), lens shading correction (LSC), auto white balance (AWB), demosaic color interpolation, color correction (color correction matrix, CCM), dynamic range compression (DRC), gamma correction, 3D look-up table (LUT), YUV domain noise reduction, sharpening, detail enhancement, etc.
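- to make two of the listed stages concrete, the following minimal sketch applies black level correction and gamma encoding to a handful of RAW sample values. The 10-bit sample values, the black level of 64, and the gamma of 2.2 are illustrative assumptions, and a real ISP performs these steps in dedicated hardware on full image planes.

```python
def black_level_correction(raw, black_level):
    """Subtract the sensor's black level and clamp at zero."""
    return [max(v - black_level, 0) for v in raw]


def normalise(values, white_level):
    """Scale values to the range 0.0 .. 1.0."""
    return [min(v / white_level, 1.0) for v in values]


def gamma_encode(values, gamma=2.2):
    """Apply a simple power-law gamma curve to linear values in 0.0 .. 1.0."""
    return [v ** (1.0 / gamma) for v in values]


def to_8bit(values):
    """Quantise 0.0 .. 1.0 values to 8-bit display codes."""
    return [round(v * 255) for v in values]


raw = [64, 192, 512, 1023]  # hypothetical 10-bit RAW samples
black = 64                  # assumed black level
linear = normalise(black_level_correction(raw, black), 1023 - black)
display = to_8bit(gamma_encode(linear))
print(display)
```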
- YUV domain processing can include: multi-frame registration, fusion, and noise reduction for high dynamic range (HDR) images, as well as super-resolution (SR) algorithms for improving clarity, skin beautification algorithms, distortion correction algorithms, blur algorithms, etc.
- the display screen 294 is used to display images, videos and the like.
- Display 294 includes a display panel.
- the display panel can be a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a Mini-LED, a Micro-LED, a Micro-OLED, quantum dot light-emitting diodes (QLED), etc.
- the mobile phone may include 1 or N display screens 294, where N is a positive integer greater than 1.
- display screen 294 may be used to display application program interfaces.
- the mobile phone realizes the display function through the GPU, the display screen 294, and the application processor.
- GPU is a microprocessor for image processing, connected to display screen 294 and application processor. GPUs are used to perform mathematical and geometric calculations for graphics rendering.
- Processor 210 may include one or more GPUs that execute program instructions to generate or change display information.
- the structure shown in FIG. 2 does not constitute a specific limitation on the mobile phone.
- the mobile phone may also include more or fewer components than those shown in FIG. 2 , or combine certain components, or separate certain components, or arrange different components, etc.
- some components shown in FIG. 2 may be implemented in hardware, software, or a combination of software and hardware.
- when the terminal device 200 is another form of terminal device, such as an interactive electronic whiteboard, a wearable device, a tablet computer, a notebook computer, a desktop computer, a portable electronic device, a UMPC, a netbook, a PDA, a smart TV, an in-vehicle computer, a smart speaker, an AR device, a VR device, or another smart device with a display screen, or a digital camera, an SLR/mirrorless camera, an action camera, a pan-tilt camera, or a drone, the specific structure of that terminal device can also refer to FIG. 2. Exemplarily, on the basis of the structure shown in FIG. 2, other forms of terminal device may have components added or removed, which will not be repeated here.
- the terminal device 200 (such as a mobile phone) can run one or more application programs that support functions such as collecting the data required for 3D modeling, preprocessing the data required for 3D modeling, and previewing the 3D model; such an application program may be called, for example, a 3D modeling application or a 3D reconstruction application.
- the application program can call the camera of the terminal device 200 to take pictures according to the user's operation, collect the data required for 3D modeling, and preprocess the data required for 3D modeling.
- the application program can also display a preview interface of the 3D model through the display screen of the terminal device 200 for the user to view and preview the 3D model.
- the embodiment of the present application uses a mobile phone as an example of the terminal device 200 for illustration, but it should be understood that the 3D modeling method provided in the embodiment of the present application is also applicable to the other above-mentioned terminal devices with a shooting function; the specific type of the terminal device is not limited.
- the 3D modeling method may include the following three parts:
- Part 1: the user uses the mobile phone to collect the data required for 3D modeling of the target object.
- Part 2: the mobile phone preprocesses the collected data required for 3D modeling of the target object and uploads the preprocessed data to the cloud, and the cloud performs 3D modeling based on the data uploaded by the mobile phone to obtain the 3D model of the target object.
- Part 3: the mobile phone downloads the 3D model from the cloud for the user to preview.
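- the flow described above (collection on the phone, preprocessing and upload, modeling in the cloud, download for preview) can be sketched end to end with stand-in functions. Everything here is a placeholder: `FakeCloud`, the package layout, and the rule that associates each frame with its previous and next frame are assumptions for illustration, and no real networking or reconstruction is performed.

```python
def collect_frames(count):
    """Stand-in for capturing `count` images with the phone camera."""
    return [{"number": i, "pixels": f"<frame {i}>"} for i in range(count)]


def preprocess(frames):
    """Stand-in for on-device preprocessing: attach an association list to each frame."""
    last = len(frames) - 1
    for frame in frames:
        i = frame["number"]
        # Assumed rule: a frame is associated with the frames captured just before and after it.
        frame["associated"] = [j for j in (i - 1, i + 1) if 0 <= j <= last]
    return {"frames": frames}


class FakeCloud:
    """In-memory stand-in for the cloud; real modeling is replaced by a summary dict."""

    def __init__(self):
        self._models = {}

    def upload(self, package):
        job_id = len(self._models) + 1
        self._models[job_id] = {"frames_used": len(package["frames"]), "mesh": "<placeholder>"}
        return job_id

    def download(self, job_id):
        return self._models[job_id]


def run_pipeline(cloud, frame_count):
    frames = collect_frames(frame_count)   # collection on the phone
    package = preprocess(frames)           # preprocessing on the phone
    job_id = cloud.upload(package)         # upload; the cloud "models" the object
    return cloud.download(job_id)          # download for preview on the phone


model = run_pipeline(FakeCloud(), 5)
print(model["frames_used"])
```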
- the first application may be installed in the mobile phone, and the first application is the 3D modeling application or the 3D reconstruction application described in the foregoing embodiments.
- the name of the first application may be "3D Rubik's Cube", and there is no limitation on the name of the first application here.
- the main interface of the first application may include a function control for starting the 3D modeling function, and the user may click or touch the function control on the main interface of the first application.
- the mobile phone can start the 3D modeling function of the first application in response to the user's operation of clicking or touching the function control.
- The mobile phone can switch the display interface from the main interface of the first application to the 3D modeling data collection interface, start the shooting function of the camera, and display the pictures captured by the camera on the 3D modeling data collection interface.
- the user can hold the mobile phone to collect data required for 3D modeling of the target object.
- the mobile phone may display a function control for starting the first application on the main interface (or called desktop), such as: an application icon (or called a button) of the first application.
- When the user wants to use the first application to perform 3D modeling of a certain target object, the user may click or touch the application icon of the first application.
- the mobile phone may start and run the first application and display the main interface of the first application in response to the user's operation of clicking or touching the application icon of the first application.
- FIG. 3 is a schematic diagram of a main interface of a mobile phone provided by an embodiment of the present application.
- the main interface 301 of the mobile phone may include an application icon 302 of the first application.
- the main interface 301 of the mobile phone may also include application icons of application A, application B, application C and other applications.
- the user can click or touch the application icon 302 on the main interface 301 of the mobile phone to trigger the mobile phone to start and run the first application and display the main interface of the first application.
- the mobile phone may also display a function control for starting the first application on another display interface such as a pull-down interface or a negative screen.
- the functional controls of the first application may be presented in the form of application icons, or in the form of other functional buttons, which is not limited here.
- the drop-down interface refers to the display interface that appears after sliding down the top of the main interface of the mobile phone. Buttons for commonly used functions of the user can be displayed in the drop-down interface, such as WLAN, Bluetooth, etc., so that the user can quickly use related functions. For example, when the current display interface of the mobile phone is the desktop, the user can perform a downward sliding operation on the top of the mobile phone screen to trigger the mobile phone to switch the display interface from the desktop to the drop-down interface (or overlay and display the drop-down interface on the desktop).
- the negative screen refers to the display interface that appears after sliding the main interface (or desktop) of the mobile phone to the right.
- the negative screen can display the user's frequently used applications, functions, subscribed services and information, etc., which is convenient for users to quickly browse and use. For example, when the current display interface of the mobile phone is the desktop, the user can perform a rightward sliding operation on the screen of the mobile phone to trigger the mobile phone to switch the display interface from the desktop to a negative screen.
- The term "negative screen" is merely a name used in the embodiment of the present application. Its meaning has been recorded in the embodiment of the present application, and the name does not constitute any limitation on the embodiment of the present application. In some other embodiments, the negative screen may also be called by other names such as "desktop assistant", "shortcut menu", or "Widget collection interface", which is not limited here.
- When the user wants to use the first application to perform 3D modeling of a certain target object, the voice assistant may also be used to control the mobile phone to start and run the first application.
- the present application does not limit the way of starting the first application here.
- FIG. 4 is a schematic diagram of a main interface of a first application provided in an embodiment of the present application.
- The main interface 401 of the first application may include a functional control, "start modeling" 402, which is the above-mentioned functional control for starting the 3D modeling function.
- the user may click or touch "start modeling" 402 on the main interface 401 of the first application.
- In response to the user's operation of clicking or touching "start modeling" 402, the mobile phone can start the 3D modeling function of the first application, switch the display interface from the main interface 401 of the first application to the 3D modeling data collection interface, start the shooting function of the camera, and display the pictures captured by the camera on the 3D modeling data collection interface.
- FIG. 5 is a schematic diagram of a 3D modeling data collection interface provided by an embodiment of the present application.
- The 3D modeling data collection interface 501 displayed by the mobile phone may include a functional control, namely a scan button 502, and the picture captured by the camera of the mobile phone.
- For example, continuing to refer to FIG. 5, assume that the user wants to perform 3D modeling on a toy car placed on a table. The user can point the camera of the mobile phone at the toy car.
- In this case, the picture captured by the camera and displayed in the 3D modeling data collection interface 501 may include the toy car 503 and the table 504.
- The user can change the shooting angle of the mobile phone so that the toy car 503 is at the center of the mobile phone screen (that is, of the 3D modeling data collection interface 501), and then click or touch the scan button 502 in the 3D modeling data collection interface 501. In response to this operation, the mobile phone starts collecting the data required for 3D modeling of the target object at the center of the screen (that is, the toy car 503).
- That is, the mobile phone can take the object located at the center of the screen as the target object.
- the 3D modeling data collection interface may be referred to as a first interface, and the main interface of the first application may be referred to as a second interface.
- FIG. 6 is another schematic diagram of the 3D modeling data collection interface provided by the embodiment of the present application.
- The mobile phone can also display a prompt message on the modeling data collection interface 501: "Please place the target object in the center of the screen" 505.
- The prompt information can be used to remind the user to place the target object at the center of the mobile phone screen.
- "Please place the target object in the center of the screen" 505 may be displayed in the modeling data collection interface 501 above the scan button 502, below the center of the screen, or elsewhere. Its display position in the modeling data collection interface 501 is not limited here.
- The prompt message "Please place the target object in the center of the screen" 505 is only an example. Other wording that reminds the user to place the target object at the center of the mobile phone screen may also be used, and this application does not limit the content of the prompt information.
- The prompt information used to remind the user to place the target object at the center of the mobile phone screen may be referred to as the first prompt information.
- After the mobile phone responds to the user's operation of clicking or touching the scan button 502 and starts to collect the data required for 3D modeling of the target object, the user can hold the mobile phone and shoot around the target object.
- the mobile phone can collect 360-degree panoramic data of the target object.
- the data collected by the mobile phone required for 3D modeling of the target object may include: the picture/image of the target object captured by the mobile phone during the process of shooting around the target object, and the picture may be in JPG/JPEG format.
- the mobile phone can collect the RAW image corresponding to the target object through the camera, and then, the processor of the mobile phone can perform ISP processing and JPEG encoding on the RAW image to obtain the corresponding JPG/JPEG format image of the target object.
- FIG. 7A is another schematic diagram of the 3D modeling data collection interface provided by the embodiment of the present application.
- After the mobile phone responds to the user's operation of clicking or touching the scan button 502 and starts to collect the data required for 3D modeling of the target object (taking a toy car as an example), it may also display a mesh model 701 (also called a patch model, a bounding volume, or a virtual bounding volume) around the target object in the picture of the 3D modeling data collection interface 501.
- The mesh model 701 can surround the target object, with its central axis passing through the center of the target object.
- The mesh model 701 may include an upper layer and a lower layer, and each layer may include multiple patches. The upper layer may be called the first layer, and the lower layer may be called the second layer.
- Each patch in each layer may correspond to a range of angles within 360 degrees around the target object. For example, assuming that the number of patches in each layer is 20, each patch in each layer corresponds to an angle range of 18 degrees.
- When the user holds the mobile phone and shoots around the target object, two rounds may be shot: in the first round, the mobile phone camera looks down at the target object (for example, at a downward angle of 30 degrees, without limitation) while moving around it; in the second round, the camera looks straight at the target object while moving around it.
- During the first round, the mobile phone can sequentially light up the patches in the first layer as it moves around the target object.
- During the second round, the mobile phone can sequentially light up the patches in the second layer as it moves around the target object.
- For example, in the first round, when the mobile phone captures a picture of the target object within the angle range of 0 degrees to 18 degrees, it can light up the first patch in the first layer.
- When the mobile phone captures a picture of the target object within the angle range of 18 degrees to 36 degrees, it can light up the second patch in the first layer.
- By analogy, when the mobile phone captures a picture of the target object within the angle range of 342 degrees to 360 degrees, it can light up the 20th patch in the first layer.
- After the user, with the mobile phone camera looking down at the target object, completes the first round of shooting around the target object in a clockwise or counterclockwise direction, all 20 patches in the first layer can be lit.
- After the user, with the mobile phone camera looking straight at the target object, completes the second round of shooting around the target object in a clockwise or counterclockwise direction, all 20 patches in the second layer can be lit.
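The mapping from shooting angle to patch described above can be sketched as follows. This is a minimal illustration that assumes 20 patches per layer and 1-based patch numbers, as in the example; the names `patch_index` and `LayerProgress` are hypothetical and do not come from the application.

```python
PATCHES_PER_LAYER = 20                       # value used in the example above
DEGREES_PER_PATCH = 360 / PATCHES_PER_LAYER  # 18 degrees per patch


def patch_index(angle_deg):
    """Return the 1-based number of the patch whose angle range contains angle_deg."""
    wrapped = angle_deg % 360.0
    # min() guards against float rounding that could make `wrapped` equal 360.0.
    return min(int(wrapped // DEGREES_PER_PATCH) + 1, PATCHES_PER_LAYER)


class LayerProgress:
    """Tracks which patches of one layer have been lit during one round."""

    def __init__(self):
        self.lit = set()

    def on_picture(self, angle_deg):
        """Light the patch for a picture taken at angle_deg and return its number."""
        index = patch_index(angle_deg)
        self.lit.add(index)
        return index

    def complete(self):
        """True once every patch in the layer has been lit."""
        return len(self.lit) == PATCHES_PER_LAYER
```

Calling `on_picture` once per accepted picture while the user walks a full circle lights the patches one after another; `complete()` then indicates that the round is finished.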
- the picture corresponding to the first patch may be called a first image
- the picture corresponding to the second patch may be called a second image.
- the pose when the mobile phone captures the first image may be called the first pose
- the pose when the mobile phone shoots the second image may be called the second pose.
- For example, after the mobile phone lights up the first patch in the first layer, the effect can be shown as 702 in FIG. 7A.
- The lit patch may present a different pattern or color from the unlit patches (that is, its display effect changes).
- FIG. 7B is another schematic diagram of the 3D modeling data collection interface provided by the embodiment of the present application.
- With the mobile phone camera looking down at the target object, during the first round of shooting counterclockwise around the target object, when the user holds the mobile phone and rotates counterclockwise around the target object by a certain angle (moves a certain distance), the mobile phone can continue to light up the second patch, the third patch, and so on in the first layer.
- FIG. 7C is another schematic diagram of the 3D modeling data collection interface provided by the embodiment of the present application.
- With the mobile phone camera looking down at the target object, during the first round of shooting counterclockwise around the target object, as the user continues to hold the mobile phone and move around the target object, the mobile phone can light up half or more of the patches in the first layer.
- FIG. 7D is another schematic diagram of the 3D modeling data collection interface provided by the embodiment of the present application.
- With the mobile phone camera looking down at the target object, during the first round of shooting counterclockwise around the target object, when the user holds the mobile phone and moves counterclockwise around the target object back to the initial shooting position described in FIG. 7A (that is, when the user has gone once around the target object counterclockwise), the mobile phone can light up all the patches in the first layer.
- After the first round, the user can adjust the shooting position of the mobile phone relative to the target object, lower the mobile phone by a certain distance so that the camera looks straight at the target object, and move around the target object counterclockwise for the second round of shooting.
- FIG. 7E is another schematic diagram of the 3D modeling data collection interface provided by the embodiment of the present application.
- the mobile phone can light up the first patch in the second layer.
- FIG. 7F is another schematic diagram of the 3D modeling data collection interface provided by the embodiment of the present application.
- the rules for the mobile phone to light up each patch can be as follows:
- 1) When shooting the target object, the mobile phone may collect a preview stream corresponding to the target object. The preview stream includes multiple frames of pictures.
- the mobile phone can shoot the target object at a frame rate of 24 frames per second, 30 frames per second, etc., and there is no limitation here.
- The mobile phone can perform blur detection on each captured frame and obtain the pictures whose sharpness (clarity) is greater than the first threshold.
- The first threshold can be determined according to the requirements and the blur detection algorithm, and its value is not limited. If the sharpness of the current frame does not meet the requirement (for example, it is less than or equal to the first threshold), the mobile phone continues to acquire the next frame.
- The mobile phone can then perform key frame selection (also called key frame screening) on the pictures whose sharpness is greater than the first threshold, and obtain the pictures whose features meet the requirements.
- the features of the picture that meet the requirements may include: the features contained in the picture are relatively clear and rich, the features contained in the picture are easy to extract, and the features contained in the picture have less redundant information, etc. There are no restrictions on the algorithm and specific requirements for key frame selection.
- By performing blur detection and key frame selection on the pictures captured at each shooting position, the mobile phone can obtain key frame pictures of better quality. Each patch may correspond to one or more key frame pictures.
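The application does not fix a blur detection algorithm. As one possible illustration, the sketch below scores sharpness with the variance of a 4-neighbour Laplacian, a commonly used focus measure, and keeps only frames whose score is greater than the first threshold. The choice of measure, the function names, and the representation of a frame as a 2D list of grayscale values are all assumptions made for this example.

```python
def laplacian_variance(gray):
    """Sharpness score: variance of the 4-neighbour Laplacian over interior pixels.

    `gray` is a 2D list of grayscale values with at least 3 rows and 3 columns.
    """
    height, width = len(gray), len(gray[0])
    responses = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            responses.append(
                gray[y - 1][x] + gray[y + 1][x] + gray[y][x - 1] + gray[y][x + 1]
                - 4 * gray[y][x]
            )
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)


def sharp_frames(frames, first_threshold):
    """Keep only the frames whose sharpness score is greater than the first threshold."""
    return [frame for frame in frames if laplacian_variance(frame) > first_threshold]
```

A blurred frame has weak edges, so its Laplacian responses cluster near zero and the variance is small; such a frame is dropped and the next frame of the preview stream is examined instead.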
- 2) The mobile phone calculates the camera pose information (that is, the pose information of the mobile phone camera) corresponding to the picture obtained in 1).
- When the mobile phone supports a capability such as AR Engine, ARCore, or ARKit, it can call that capability to directly obtain the camera pose information corresponding to the picture.
- the camera pose information may include qw, qx, qy, qz, tx, ty, tz.
- qw, qx, qy, and qz are the components of a unit quaternion that represents the rotation (and from which the rotation matrix can be obtained), and tx, ty, and tz can form a translation matrix.
- the rotation matrix and translation matrix can represent the relative positional relationship and angle between the camera (mobile phone camera) and the target object.
- the mobile phone can convert the coordinates of the target object from the world coordinate system to the camera coordinate system through the aforementioned rotation matrix and translation matrix, and obtain the coordinates of the target object in the camera coordinate system.
- the world coordinate system may refer to a coordinate system whose origin is the center of the target object
- the camera coordinate system may refer to a coordinate system whose origin is the camera center.
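The coordinate conversion described above can be sketched as follows. The sketch assumes that the pose is given in the world-to-camera convention, so that a point is converted by p_cam = R * p_world + t, and it uses the standard conversion from a unit quaternion to a rotation matrix. The function names are hypothetical.

```python
import math


def rotation_from_quaternion(qw, qx, qy, qz):
    """Convert a quaternion to a 3x3 rotation matrix (normalising it first)."""
    norm = math.sqrt(qw * qw + qx * qx + qy * qy + qz * qz)
    w, x, y, z = qw / norm, qx / norm, qy / norm, qz / norm
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


def world_to_camera(point_world, rotation, translation):
    """Apply p_cam = R * p_world + t (assumed world-to-camera convention)."""
    return tuple(
        sum(rotation[i][j] * point_world[j] for j in range(3)) + translation[i]
        for i in range(3)
    )
```

With the world origin at the center of the target object, `world_to_camera((0, 0, 0), R, t)` simply returns `t`, that is, the position of the target object as seen from the camera.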
- 3) The mobile phone determines the relationship between the picture obtained in 1) and each patch in the patch model, and obtains the patch corresponding to the picture.
- The mobile phone can convert the coordinates of the target object from the world coordinate system to the camera coordinate system according to the camera pose information (rotation matrix and translation matrix) corresponding to the picture, and obtain the coordinates of the target object in the camera coordinate system. Then, from the coordinates of the target object in the camera coordinate system and the camera coordinates, the mobile phone can determine the line connecting the camera coordinates and the coordinates of the target object. The patch in the patch model that this line intersects is the patch corresponding to the picture.
- the camera coordinates are known parameters for the mobile phone.
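One way to picture the intersection test is sketched below. It is a simplification, not the method of the application: it works in the world coordinate system (origin at the center of the target object), assumes the z axis points up, assumes a two-layer model with 20 patches per layer numbered 1 to 20 (upper layer) and 21 to 40 (lower layer), and picks the layer with a hypothetical elevation threshold of 15 degrees. The patch is then found from the direction of the line between the camera and the target object.

```python
import math

PATCHES_PER_LAYER = 20
LAYER_SPLIT_DEG = 15.0  # hypothetical elevation threshold between the two layers


def camera_center(rotation, translation):
    """Camera position in world coordinates, C = -R^T t, for a world-to-camera pose."""
    return tuple(
        -sum(rotation[j][i] * translation[j] for j in range(3)) for i in range(3)
    )


def patch_for_camera(center):
    """Return the number of the patch crossed by the line from the camera to the origin."""
    x, y, z = center
    azimuth = math.degrees(math.atan2(y, x)) % 360.0
    elevation = math.degrees(math.atan2(z, math.hypot(x, y)))
    layer = 1 if elevation >= LAYER_SPLIT_DEG else 2
    # min() guards against float rounding that could make `azimuth` equal 360.0.
    index_in_layer = min(
        int(azimuth // (360.0 / PATCHES_PER_LAYER)) + 1, PATCHES_PER_LAYER
    )
    return (layer - 1) * PATCHES_PER_LAYER + index_in_layer
```

A camera that looks down at the target object from above the threshold maps to a patch in the first layer, and a camera roughly level with the target object maps to the patch directly below it in the second layer.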
- the frame sequence file includes pictures corresponding to each lighted patch, and these pictures can be used as data required for 3D modeling of the target object.
- the format of the picture included in the frame sequence file may be JPG format.
- the pictures saved in the frame sequence file may be sequentially numbered as 001.jpg, 002.jpg, 003.jpg...etc.
- Each picture included in the above frame sequence file can be called a key frame, and these key frames serve as the data collected by the mobile phone in the first part for 3D modeling of the target object.
- a frame sequence file can also be called a key frame sequence file.
- the user can hold the mobile phone and shoot the target object within 1.5 meters from the target object.
- If the shooting distance is too short (for example, the 3D modeling data collection interface cannot present the whole target object), the mobile phone can turn on the wide-angle camera for shooting.
- In the first part, when the user holds the mobile phone and shoots around the target object, the mobile phone lights up the patches in the patch model one after another as it moves around the target object. This guides the user in collecting the data required for 3D modeling of the target object through a dynamic UI on a 3D guidance interface (that is, the 3D modeling data collection interface that displays the mesh model), which enhances user interactivity and lets the user intuitively perceive the data collection process.
- the description about the patch model (or called the bounding volume) in the first part above is only an example.
- The mesh model can also include more or fewer layers, and the number of patches in each layer can be greater than or less than 20.
- The present application does not limit the number of layers of the mesh model or the number of patches in each layer.
- FIG. 8 is a schematic structural diagram of a mesh model provided by an embodiment of the present application. Please refer to FIG. 8.
- The structure of the mesh model can be as shown in (a) in FIG. 8 (that is, the structure described in the foregoing embodiments).
- the structure of the mesh model may be as shown in (b) in FIG. 8 , including three layers, upper, middle and lower, and each layer may include multiple meshes.
- the structure of the mesh model may be a one-layer structure composed of multiple meshes, as shown in (c) in FIG. 8 .
- the structure of the mesh model may also be shown in (d) in FIG. 8 , including two upper and lower layers, and each layer may include multiple meshes.
- the structures of the mesh models shown in FIG. 8 are all illustrative. The present application does not limit the structure of the mesh model and the inclination angle of each layer in the mesh model (the inclination angle relative to the central axis).
- the mesh model described in the embodiment of the present application is a virtual model, and the mesh model can be preset in the mobile phone, for example, it can be configured in the file directory of the first application in the form of a configuration file.
- multiple mesh models can be preset in the mobile phone.
- When the mobile phone collects the data required for 3D modeling of the target object, it can recommend a target mesh model that matches the shape of the target object, or select a target mesh model from the multiple mesh models according to the user's selection operation, and use the target mesh model to implement the guiding function described in the foregoing embodiments.
- the target patch model can also be called a first virtual bounding volume.
- FIG. 9 is another schematic diagram of the 3D modeling data collection interface provided by the embodiment of the present application.
- After the mobile phone switches the display interface from the main interface 401 of the first application to the 3D modeling data collection interface 501, when it detects a target object in the picture, it can also display a prompt message on the modeling data collection interface 501: "Target object detected, click the button to start scanning" 506.
- the button is the scan button 502 , and the prompt information can be used to remind the user to click the scan button 502 so that the mobile phone starts collecting data required for 3D modeling of the target object.
- After the mobile phone, in response to the user's operation of clicking or touching "start modeling" 402, starts the 3D modeling function of the first application and switches the display interface from the main interface 401 of the first application to the 3D modeling data collection interface, it can also display prompt information on the 3D modeling data collection interface to remind the user to adjust the shooting environment of the target object, the way the target object is shot, and the proportion of the screen occupied by the object.
- FIG. 10 is another schematic diagram of the 3D modeling data collection interface provided by the embodiment of the present application.
- After the mobile phone, in response to the user's operation of clicking or touching "start modeling" 402, starts the 3D modeling function of the first application and switches the display interface from the main interface 401 of the first application to the 3D modeling data collection interface, it can also display prompt information 1001 on the 3D modeling data collection interface.
- The content of the prompt information 1001 can be "Place the object on a solid color plane, with soft light, and shoot around the object. The object should fill as much of the screen as possible and be shown completely", which can be used to remind the user to adjust the shooting environment of the target object.
- The mobile phone may also display a function control 1002 on the 3D modeling data collection interface; for example, the function control may be labeled "Got it", "Confirm", and so on. After the user clicks the function control 1002, the mobile phone no longer displays the prompt information 1001 and the function control 1002, and presents the 3D modeling data collection interface as shown in FIG. 5 above.
- With the prompt information 1001 guiding the user, the subsequent data collection process can be faster and the quality of the collected data can be better.
- the prompt information 1001 may also be called second prompt information.
- After the mobile phone switches the display interface from the main interface 401 of the first application to the 3D modeling data collection interface, it can also display the prompt information 1001 on the 3D modeling data collection interface for a preset duration. After the preset duration elapses, the mobile phone may automatically stop displaying the prompt information 1001 and present the 3D modeling data collection interface as shown in FIG. 5 above.
- the preset duration may be 20 seconds, 30 seconds, etc., which is not limited here.
- After the first part is completed, the second part can be executed automatically.
- Alternatively, after the mobile phone collects the data required for 3D modeling of the target object (each frame picture included in the frame sequence file) according to the method described in the first part above, it can display on the 3D modeling data collection interface a functional control for uploading to the cloud for 3D modeling.
- the user can click the functional control for uploading to the cloud for 3D modeling.
- the mobile phone may execute the second part in response to the operation of the user clicking the functional control for uploading to the cloud for 3D modeling.
- FIG. 11 is another schematic diagram of the 3D modeling data collection interface provided by the embodiment of the present application.
- After the mobile phone collects the data required for 3D modeling of the target object (each frame picture included in the frame sequence file) according to the method described in the first part above, it can display a functional control on the 3D modeling data collection interface: "upload cloud modeling" 1101. "Upload cloud modeling" 1101 is the functional control for uploading to the cloud for 3D modeling.
- the user may click "Upload Cloud Modeling" 1101, and the mobile phone may execute the second part in response to the user's operation of clicking "Upload Cloud Modeling" 1101.
- the operation of the user clicking "upload cloud modeling" 1101 may be referred to as an operation of generating a 3D model.
- After the mobile phone collects the data required for 3D modeling of the target object (each frame picture included in the frame sequence file) according to the method described in the first part above, it can also display on the 3D modeling data collection interface a prompt message indicating that the data required for 3D modeling of the target object has been collected, such as "scanning is complete" 1102.
- The mobile phone can also display an exit button 1103 (marked only in FIG. 11) in the 3D modeling data collection interface. While the mobile phone is executing the first part, the user can click the exit button 1103 at any time, and the mobile phone exits the execution of the first part in response. After exiting, the mobile phone can switch the display interface from the 3D modeling data collection interface to the main interface of the first application as shown in FIG. 4.
- the data collected by the mobile phone in the first part and required for 3D modeling of the target object is each frame picture (ie key frame) included in the frame sequence file mentioned in the first part.
- That the mobile phone preprocesses the collected data required for 3D modeling of the target object means that the mobile phone preprocesses the key frames included in the frame sequence file collected in the first part, specifically as follows:
- the mobile phone calculates the matching information of each key frame, and saves the matching information of each key frame in the first file.
- The first file may be a file in JavaScript Object Notation (JSON) format.
- the matching information of the key frame may include: identification information of other key frames associated with the key frame.
- For example, the matching information of a given key frame may include the identification information of the key frames (such as the nearest key frames) in the four directions above, below, to the left of, and to the right of that key frame. These key frames in the four directions are the other key frames associated with that key frame.
- the identification information of the key frame may be the picture number of the key frame.
- the matching information of the key frame of the frame may be used to indicate: which other pictures in the frame sequence file are associated with the key frame of the frame.
- the matching information of each key frame is obtained according to the association relationship between each key frame and the patch corresponding to each key frame, and the association relationship among the plurality of patches. That is, for each key frame, the mobile phone can determine the matching information of the key frame according to the association relationship between the patch corresponding to the key frame and other patches.
- the relationship between the patch corresponding to the key frame and other patches may include: in the patch model, which patches are corresponding to the top, bottom, left, and right directions of the patch corresponding to the key frame.
- For example, assuming that a key frame corresponds to patch 1, the association between patch 1 and other patches can include: the patch below patch 1 is patch 21, the patch on the left of patch 1 is patch 20, and the patch on the right of patch 1 is patch 2.
- The mobile phone can then determine that the other key frames associated with that key frame include the key frame corresponding to patch 21, the key frame corresponding to patch 20, and the key frame corresponding to patch 2. Therefore, the matching information that the mobile phone obtains for that key frame includes the identification information of the key frames corresponding to patch 21, patch 20, and patch 2.
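The derivation of matching information from patch adjacency can be sketched as follows. The sketch assumes the two-layer model with 20 patches per layer, numbered 1 to 20 in the upper layer and 21 to 40 in the lower layer, which is consistent with the patch 1 example above. The function names and the dictionary-based data layout are hypothetical.

```python
PATCHES_PER_LAYER = 20
LAYERS = 2


def neighbour_patches(patch):
    """Return the patches above, below, left of and right of a 1-based patch number."""
    layer, pos = divmod(patch - 1, PATCHES_PER_LAYER)

    def number(layer_index, position):
        # Positions wrap around because each layer is a closed ring.
        return layer_index * PATCHES_PER_LAYER + position % PATCHES_PER_LAYER + 1

    neighbours = {"left": number(layer, pos - 1), "right": number(layer, pos + 1)}
    if layer > 0:
        neighbours["up"] = number(layer - 1, pos)
    if layer < LAYERS - 1:
        neighbours["down"] = number(layer + 1, pos)
    return neighbours


def matching_info(patch_to_pictures):
    """Map each picture number to the picture numbers of key frames on adjacent patches."""
    matches = {}
    for patch, pictures in patch_to_pictures.items():
        related = []
        for neighbour in neighbour_patches(patch).values():
            related.extend(patch_to_pictures.get(neighbour, []))
        for picture in pictures:
            matches[picture] = sorted(related)
    return matches
```

For patch 1, `neighbour_patches(1)` yields patch 20 on the left, patch 2 on the right, and patch 21 below, so a key frame on patch 1 is matched with the key frames saved for those three patches.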
- the first file also includes: camera intrinsics corresponding to each key frame, gravity direction information (gravity), picture (image) name, picture number (index), camera pose information (slampose), Timestamp (timestamp) and other information.
- The first file includes three parts: "intrinsics", "keyframes", and "matching_list".
- The "intrinsics" part contains the camera intrinsic parameters;
- the "keyframes" part contains the gravity direction information, picture name, picture number, camera pose information, timestamp, and other information corresponding to each key frame;
- the "matching_list" part contains the matching information of each key frame.
- the content of the first file could look like this:
- cx, cy, fx, fy, height, k1, k2, k3, p1, p2, and width are all camera intrinsic parameters.
- cx and cy represent the offset of the optical axis from the coordinate center of the projection plane;
- fx and fy represent the focal lengths of the camera in the x and y directions when shooting;
- k1, k2, and k3 represent the radial distortion coefficients;
- p1 and p2 represent the tangential distortion coefficients;
- height and width indicate the resolution of the camera when shooting.
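To show how these intrinsic parameters are typically used together, the following sketch projects a 3D point in camera coordinates to pixel coordinates. The application does not state which distortion model it uses; the code assumes the widely used Brown-Conrady convention (radial terms k1, k2, k3 and tangential terms p1, p2). The function name `project_point` and all numeric values are hypothetical.

```python
def project_point(X, Y, Z, intr):
    """Project a camera-frame point to pixel coordinates (assumed Brown-Conrady model)."""
    if Z <= 0:
        raise ValueError("point must lie in front of the camera (Z > 0)")
    # Normalized image coordinates.
    x, y = X / Z, Y / Z
    r2 = x * x + y * y
    # Radial distortion factor from k1, k2, k3.
    radial = 1.0 + intr["k1"] * r2 + intr["k2"] * r2 ** 2 + intr["k3"] * r2 ** 3
    # Tangential distortion from p1, p2.
    xd = x * radial + 2.0 * intr["p1"] * x * y + intr["p2"] * (r2 + 2.0 * x * x)
    yd = y * radial + intr["p1"] * (r2 + 2.0 * y * y) + 2.0 * intr["p2"] * x * y
    # Focal lengths scale the coordinates; cx, cy shift them to the pixel origin.
    return intr["fx"] * xd + intr["cx"], intr["fy"] * yd + intr["cy"]


# Hypothetical intrinsics without distortion.
intrinsics = {"fx": 500.0, "fy": 500.0, "cx": 320.0, "cy": 240.0,
              "k1": 0.0, "k2": 0.0, "k3": 0.0, "p1": 0.0, "p2": 0.0}
u, v = project_point(0.4, -0.2, 2.0, intrinsics)
```

A point on the optical axis projects to (cx, cy) regardless of the distortion coefficients, which is a quick sanity check for any parameter set.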
- the gravity direction information can be obtained by the mobile phone according to the built-in gyroscope, which can indicate the offset angle when the mobile phone takes pictures.
- 18.jpg is the picture (image) name, and 18 is the picture number (18.jpg is used here only as an example). That is, the above example gives the camera intrinsics (intrinsics), gravity direction information (gravity), picture name (image), picture number (index), camera pose information (slampose), timestamp (timestamp), and matching information corresponding to 18.jpg.
- qw, qx, qy, qz, tx, ty, tz are all camera pose information.
- qw, qx, qy, and qz are the components of a unit quaternion that defines a rotation matrix, and tx, ty, and tz can form a translation matrix.
- the rotation matrix and translation matrix can represent the relative positional relationship and angle between the camera (mobile phone camera) and the target object.
- the mobile phone can convert the coordinates of the target object from the world coordinate system to the camera coordinate system through the aforementioned rotation matrix and translation matrix, and obtain the coordinates of the target object in the camera coordinate system.
- the world coordinate system may refer to a coordinate system whose origin is the center of the target object
- the camera coordinate system may refer to a coordinate system whose origin is the camera center.
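The world-to-camera conversion described above can be sketched as follows. The quaternion-to-rotation-matrix formula is the standard one for unit quaternions; that the pose is applied as p_cam = R * p_world + t is an assumption about the convention, and the function names are hypothetical.

```python
import math


def quaternion_to_rotation(qw, qx, qy, qz):
    """Return the 3x3 rotation matrix of a quaternion (normalized first for safety)."""
    n = math.sqrt(qw * qw + qx * qx + qy * qy + qz * qz)
    w, x, y, z = qw / n, qx / n, qy / n, qz / n
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


def world_to_camera(point, pose):
    """Apply p_cam = R * p_world + t (assumed convention for the stored pose)."""
    R = quaternion_to_rotation(pose["qw"], pose["qx"], pose["qy"], pose["qz"])
    t = (pose["tx"], pose["ty"], pose["tz"])
    return tuple(
        sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3)
    )


# Hypothetical pose: 90 degree rotation about the z axis, then a shift of 2 along z.
half = math.sqrt(0.5)
pose = {"qw": half, "qx": 0.0, "qy": 0.0, "qz": half, "tx": 0.0, "ty": 0.0, "tz": 2.0}
p_cam = world_to_camera((1.0, 0.0, 0.0), pose)
```

If the stored pose instead maps camera coordinates to world coordinates, the inverse transform (transpose of R, and -R^T t as translation) would be needed.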
- timestamp is the time at which the camera captured the key frame.
- src_id indicates the picture number of each key frame, for example, in the content of the first file given above, the picture number is 18, and the "matching_list" part is the matching information of the key frame with picture number 18.
- tgt_id indicates the picture numbers of other key frames associated with the key frame with picture number 18 (ie, identification information of other key frames associated with the key frame with picture number 18).
- the picture numbers of other key frames associated with the key frame with picture number 18 include: 26, 45, 59, 78, 89, 100, 449 and so on.
- That is, the key frames associated with the key frame with picture number 18 include: the key frame with picture number 26, the key frame with picture number 45, the key frame with picture number 59, the key frame with picture number 78, the key frame with picture number 89, the key frame with picture number 100, the key frame with picture number 449, etc.
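One possible layout of a file with the three parts and fields described above is sketched below. Only the key names (intrinsics, keyframes, matching_list, src_id, tgt_id, and the individual field names) are taken from the description; the nesting, the JSON encoding, the representation of gravity as a vector, and every numeric value are assumptions made purely for illustration. The helper `associated_key_frames` is hypothetical.

```python
import json

# Hypothetical layout and values; only the key names come from the description.
first_file = {
    "intrinsics": {
        "cx": 320.0, "cy": 240.0, "fx": 500.0, "fy": 500.0,
        "height": 480, "width": 640,
        "k1": 0.0, "k2": 0.0, "k3": 0.0, "p1": 0.0, "p2": 0.0,
    },
    "keyframes": [
        {
            "gravity": [0.0, -1.0, 0.0],  # assumed vector form
            "image": "18.jpg",
            "index": 18,
            "slampose": {"qw": 1.0, "qx": 0.0, "qy": 0.0, "qz": 0.0,
                         "tx": 0.0, "ty": 0.0, "tz": 0.0},
            "timestamp": 0.0,
        }
    ],
    "matching_list": [
        # The description lists these picture numbers "and so on"; further
        # numbers are not reproduced here.
        {"src_id": 18, "tgt_id": [26, 45, 59, 78, 89, 100, 449]}
    ],
}


def associated_key_frames(doc, picture_number):
    """Return the picture numbers matched to the given key frame."""
    for entry in doc["matching_list"]:
        if entry["src_id"] == picture_number:
            return list(entry["tgt_id"])
    return []


text = json.dumps(first_file, indent=2)
restored = json.loads(text)
```

Looking up picture number 18 in this sketch returns the associated picture numbers named in the description.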
- The mobile phone packages the first file together with all key frames in the frame sequence file (that is, all pictures included in the above frame sequence file).
- The package obtained in this way is the preprocessed data, that is, the result of the second part, in which the mobile phone preprocesses the data collected in the first part that is required for 3D modeling of the target object.
- In other words, the preprocessed data may include: each key frame picture saved in the frame sequence file while the mobile phone was shooting around the target object, and the first file containing the matching information of each key frame.
- After the mobile phone obtains the above preprocessed data, it can send (that is, upload) the preprocessed data to the cloud. The cloud can then execute the third part: perform 3D modeling according to the data uploaded by the mobile phone and obtain the 3D model of the target object.
- the process of 3D modeling in the cloud based on the data uploaded by the mobile phone can be as follows:
- the cloud decompresses the received data packet (the data packet includes the frame sequence file and the first file) from the mobile phone, and extracts the frame sequence file and the above-mentioned first file.
- the cloud performs 3D modeling processing according to the key frame picture included in the frame sequence file and the above-mentioned first file to obtain a 3D model of the target object.
- The 3D modeling processing that the cloud performs according to the key frame pictures included in the frame sequence file and the above-mentioned first file may at least include: key target extraction, feature detection and matching, global optimization and fusion, sparse point cloud computing, dense point cloud computing, surface reconstruction, and texture generation.
- Key target extraction refers to separating the target object of interest in the key frame picture from the background, identifying and interpreting meaningful object entities in the image, and extracting different image features.
- Feature detection and matching refers to: detecting distinctive pixels in a key frame picture as the feature points of that picture; describing the salient feature points in different key frame pictures; and comparing the similarity of two descriptions to judge whether feature points in different key frame pictures are the same feature.
- When the cloud performs feature detection and matching, for each key frame, the cloud can use the matching information of the key frame included in the first file (that is, the identification information of other key frames associated with the key frame) to determine the other key frames associated with the key frame, and perform feature detection and matching between the key frame and those associated key frames.
- For example, the cloud can determine from the first file that the other key frames associated with the key frame with picture number 18 include the key frames with picture numbers 26, 45, 59, 78, 89, 100, 449, and so on. The cloud can then perform feature detection and matching between the key frame with picture number 18 and the key frames with picture numbers 26, 45, 59, 78, 89, 100, 449, and so on, without having to perform feature detection and matching between the key frame with picture number 18 and all other key frames in the frame sequence file.
- That is, for each key frame, the cloud can use the matching information included in the first file and perform feature detection and matching only between the key frame and the other key frames associated with it, rather than between the key frame and all other key frames in the frame sequence file. In this way, the computing load on the cloud can be effectively reduced and the efficiency of 3D modeling can be improved.
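The reduction in matching work can be made concrete by counting image pairs. The sketch below compares exhaustive pairing with pairing guided by a matching list. The function names are hypothetical, and the example data (100 key frames, each associated with its two neighbors on a ring) is invented for illustration and is not taken from the application.

```python
from itertools import combinations


def exhaustive_pairs(frame_ids):
    """All unordered pairs: every key frame is matched against every other one."""
    return set(combinations(sorted(frame_ids), 2))


def guided_pairs(matching_list):
    """Only the unordered pairs named in the matching information."""
    pairs = set()
    for src, targets in matching_list.items():
        for tgt in targets:
            if tgt != src:
                pairs.add((min(src, tgt), max(src, tgt)))
    return pairs


# Hypothetical example: 100 key frames, each associated with its two ring neighbors.
frame_ids = list(range(100))
ring = {i: [(i - 1) % 100, (i + 1) % 100] for i in frame_ids}

all_pairs = exhaustive_pairs(frame_ids)
kept_pairs = guided_pairs(ring)
```

For n key frames the exhaustive approach grows as n(n-1)/2 pairs, while the guided approach grows with the number of associations per key frame, which is bounded by the patch adjacency.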
- Global optimization and fusion refers to the use of global optimization and fusion algorithms to optimize and fuse the matching results of feature detection and matching.
- the results of global optimization and fusion can be used to generate basic 3D models.
- Sparse point cloud computing and dense point cloud computing refer to generating 3D point cloud data corresponding to the target object according to the results of global optimization and fusion. Compared with images, point clouds have one irreplaceable advantage: depth.
- 3D point cloud data directly provides data in 3D space, whereas 3D data has to be inferred from images through perspective geometry.
- Surface reconstruction refers to the use of 3D point cloud data to accurately restore the 3D surface shape of an object to obtain the basic 3D model of the target object.
- Texture generation refers to: generating the texture (also called texture map) of the surface of the target object according to the key frame picture or the characteristics of the key frame picture. After obtaining the surface texture of the target object, the texture is mapped to the surface of the basic 3D model of the target object in a specific way, so that the surface of the target object can be restored more realistically and the target object looks more real.
- the cloud can also quickly and accurately determine the mapping relationship between the texture and the surface of the basic 3D model of the target object based on the matching information of each key frame included in the first file, which can further improve the modeling efficiency and effectiveness.
- For example, after the cloud determines the mapping relationship between the texture of a first key frame and the surface of the basic 3D model of the target object, it can combine the matching information of the first key frame to quickly and accurately determine the mapping relationship between the textures of the other key frames associated with the first key frame and the surface of the basic 3D model of the target object.
- Similarly, for each key frame, the cloud can combine the matching information of the key frame to quickly and accurately determine the mapping relationship between the textures of the other key frames associated with it and the surface of the basic 3D model of the target object.
- the cloud can generate the 3D model of the target object according to the basic 3D model of the target object and the texture of the surface of the target object.
- the cloud can save the basic 3D model of the target object and the texture of the surface of the target object for downloading by the mobile phone.
- The matching information of each key frame included in the first file can effectively increase the processing speed of 3D modeling, reduce the computing load on the cloud, and improve the efficiency of 3D modeling.
- the basic 3D model of the target object can be stored in OBJ format, and the texture of the surface of the target object can be stored in JPG format (such as texture map).
- the basic 3D model of the target object can be an OBJ file, and the texture on the surface of the target object can be a JPG file.
- The cloud can save the basic 3D model of the target object and the texture of the surface of the target object for a certain period of time (such as 7 days).
- the cloud may also permanently retain the basic 3D model of the target object and the texture of the surface of the target object, which is not limited here.
- the mobile phone can realize 3D modeling only by relying on ordinary RGB cameras (cameras) to collect the data required for 3D modeling.
- the process of collecting data required for modeling does not need to rely on special hardware such as LIDAR sensors or RGB-D cameras on mobile phones.
- the process of 3D modeling is completed in the cloud, and there is no need to rely on the mobile phone to be equipped with a high-performance discrete graphics card.
- the 3D modeling method can obviously lower the threshold of 3D modeling, and has higher universal applicability to terminal equipment.
- the mobile phone can enhance user interaction through a dynamic UI guidance, allowing the user to intuitively perceive the data collection process.
- When the mobile phone shoots the target object at a certain position, it performs blur detection on each captured frame to obtain pictures whose definition meets the requirements. This screens the key frames and yields key frames that are effective for modeling.
- The mobile phone extracts the matching information of each key frame and sends the first file, which includes that matching information, together with the frame sequence file composed of the key frames, to the cloud for modeling (there is no need to send all captured pictures). This can greatly reduce the complexity of 3D modeling on the cloud side, reduce the consumption of cloud hardware resources during 3D modeling, effectively reduce the computing load of cloud modeling, and improve the speed and effect of 3D modeling.
- the cloud may send an indication message to the mobile phone to indicate that the cloud has completed the 3D modeling.
- the mobile phone after receiving the above instruction message from the cloud, can automatically download the 3D model of the target object from the cloud for the user to preview the 3D model.
- FIG. 12 is a schematic diagram of a 3D model preview interface provided by the embodiment of the present application.
- the display interface can be switched from the 3D modeling data collection interface shown in FIG. 11 to the 3D model preview interface shown in FIG. 12 .
- The mobile phone may display prompt information on the 3D model preview interface: "modeling in progress" 1201, which prompts the user that the target object is being 3D modeled.
- "Modeling in progress" 1201 may be referred to as the third prompt information.
- the mobile phone may display the third prompt information after detecting that the user clicks on the above-mentioned function control "upload cloud modeling" 1101.
- After the cloud completes the 3D modeling of the target object and obtains the 3D model of the target object, it can send an indication message to the mobile phone to indicate that the cloud has completed the 3D modeling. After receiving the indication message, the mobile phone can automatically download the 3D model of the target object from the cloud. For example, the mobile phone can send a download request message to the cloud, and the cloud can send the 3D model of the target object (that is, the basic 3D model of the target object and the texture of the surface of the target object) to the mobile phone according to the download request message.
- FIG. 13 is another schematic diagram of the 3D model preview interface provided by the embodiment of the present application.
- The mobile phone may also change the prompt information from "modeling in progress" 1201 to "modeling completed" 1301 to remind the user that 3D modeling of the target object has been completed.
- a view button 1302 may also be included in the 3D model preview interface. The user can click the view button 1302, and the mobile phone can display the 3D model of the target object downloaded from the cloud in the 3D model preview interface in response to the user's operation of clicking the view button 1302.
- "Modeling completed" 1301 may be referred to as the fourth prompt information.
- FIG. 14 is another schematic diagram of the 3D model preview interface provided by the embodiment of the present application.
- the mobile phone may display the 3D model 1401 of the toy car in the 3D model preview interface in response to the user's operation of clicking the view button 1302 .
- the user can view the 3D model 1401 of the toy car in the 3D model preview interface shown in FIG. 14 .
- the user's operation of clicking the view button 1302 is an operation of previewing the 3D model corresponding to the target object.
- The user can perform a counterclockwise rotation operation on the 3D model of the toy car along any direction (such as the horizontal direction or the vertical direction).
- The mobile phone can respond to the aforementioned operations of the user and display the 3D model of the toy car at different angles (through 360 degrees) on the 3D model preview interface.
- FIG. 15 is a schematic diagram of the user performing a counterclockwise rotation operation along the horizontal direction on the 3D model of the toy car provided by the embodiment of the present application. As shown in FIG. 15, the user can drag the 3D model of the toy car with a finger on the 3D model preview interface to rotate it counterclockwise along the horizontal direction.
- FIG. 16 is another schematic diagram of the 3D model preview interface provided by the embodiment of the present application.
- In response to the user dragging the 3D model of the toy car counterclockwise along the horizontal direction, the mobile phone can display, on the 3D model preview interface, the presentation effects at the angles shown in (a) and (b) in FIG. 16.
- the presentation effects of the angles shown in (a), (b) and so on in FIG. 16 are only illustrative.
- the angle at which the 3D model of the toy car is presented is related to the direction, distance, and number of times the user drags the 3D model of the toy car, and will not be shown here one by one.
- When the user views the 3D model of the toy car in the 3D model preview interface, the user can also perform zoom-in or zoom-out operations on the 3D model of the toy car. In response to such an operation, the mobile phone can display the zoomed-in or zoomed-out 3D model of the toy car on the 3D model preview interface.
- FIG. 17 is a schematic diagram of a user performing a zoom-out operation on a 3D model of a toy car provided by an embodiment of the present application.
- The user can slide two fingers inward (toward each other) on the 3D model preview interface; this sliding operation is a zoom-out operation.
- FIG. 18 is another schematic diagram of the 3D model preview interface provided by the embodiment of the present application.
- In response to the zoom-out operation performed by the user on the 3D model of the toy car, the mobile phone can display the zoomed-out presentation effect of the 3D model of the toy car on the 3D model preview interface.
- Similarly, the user can slide two fingers outward (away from each other) on the 3D model preview interface; this sliding operation is a zoom-in operation.
- In response to the zoom-in operation performed by the user on the 3D model of the toy car, the mobile phone may display the zoomed-in presentation effect of the 3D model of the toy car on the 3D model preview interface. Details are not repeated here.
- the zoom-in operation or zoom-out operation performed by the user on the 3D model of the toy car above is an exemplary description.
- The zoom-in or zoom-out operation performed by the user on the 3D model of the toy car can also be a double-click operation or a long-press operation, or the 3D model preview interface can include function controls for zooming in or out, and so on, which is not limited here.
- After the mobile phone receives the above indication message from the cloud, it may also only display the 3D model preview interface shown in FIG. 13 above.
- In that case, the mobile phone downloads the 3D model of the target object from the cloud in response to the user's operation of clicking the view button 1302, and displays the 3D model of the target object in the 3D model preview interface for the user to preview.
- the present application does not limit the triggering conditions for the mobile phone to download the 3D model of the target object from the cloud.
- the user only needs to perform operations related to collecting data required for 3D modeling on the terminal device side, and then view or preview the final 3D model on the terminal device. For the user, all operations are completed on the terminal device side, which makes the operation easier and the user experience can be better.
- FIG. 19 is a schematic flowchart of a 3D modeling method provided by an embodiment of the present application.
- the 3D modeling method may include S1901-S1913.
- the mobile phone receives a first operation, where the first operation is an operation for starting a first application.
- the first operation may be the above-mentioned operation of clicking or touching the application icon 302 of the first application in the main interface of the mobile phone shown in FIG. 3 .
- The first operation may be an operation of clicking or touching a function control of the first application on a drop-down interface or on another display interface such as the negative one screen.
- the first operation may also be the above-mentioned operation of controlling the mobile phone to start and run the first application through the voice assistant.
- In response to the first operation, the mobile phone starts and runs the first application, and displays a main interface of the first application.
- the main interface of the first application may be referred to as a second interface.
- the mobile phone receives a second operation, where the second operation is an operation of starting the 3D modeling function of the first application.
- the second operation may be the above-mentioned operation of clicking or touching the function control "start modeling" 402 in the main interface 401 of the first application shown in FIG. 4 .
- the mobile phone displays a 3D modeling data collection interface, and starts a camera shooting function, and displays images captured by the camera on the 3D modeling data collection interface.
- the 3D modeling data collection interface displayed by the mobile phone in response to the second operation may refer to the above-mentioned FIG. 5 .
- the 3D modeling data acquisition interface may be referred to as a first interface.
- the mobile phone receives a third operation.
- the third operation is an operation of controlling the mobile phone to collect data required for 3D modeling of the target object.
- the third operation may include the operation of the user clicking or touching the scan button 502 in the 3D modeling data collection interface shown in FIG. 5 above, and the operation of the user holding the mobile phone around the target object to take pictures.
- the third operation may also be referred to as an acquisition operation.
- the mobile phone acquires a frame sequence file composed of key frame pictures corresponding to the target object.
- the mobile phone obtains the matching information of key frames of each frame in the frame sequence file, and obtains the first file.
- The matching information of a key frame included in the first file may include: the identification information of the nearest key frames in the four directions (up, down, left, and right) of the key frame. For example, the identification information can be the aforementioned picture number.
- the mobile phone sends the frame sequence file and the first file to the cloud.
- the cloud receives the frame sequence file and the first file.
- the cloud performs 3D modeling according to the frame sequence file and the first file to obtain a 3D model of the target object.
- The 3D model of the target object may include the basic 3D model of the target object and the texture of the surface of the target object. Mapping the texture (texture map) of the surface of the target object onto the basic 3D model of the target object yields the 3D model of the target object.
- the cloud sends an indication message to the mobile phone, which is used to indicate that the cloud has completed the 3D modeling.
- the mobile phone receives the indication message.
- the mobile phone sends a download request message to the cloud, for requesting to download the 3D model of the target object.
- the cloud receives the download request message.
- the cloud sends the 3D model of the target object to the mobile phone.
- the mobile phone receives a 3D model of the target object.
- the mobile phone displays the 3D model of the target object.
- the mobile phone displays the 3D model of the target object, which can be used for the user to preview the 3D model of the target object.
- For the effect of the mobile phone displaying the 3D model of the target object, refer to the above-mentioned FIG. 12, FIG. 13, FIG. 14, FIG. 16, and FIG. 18.
- The 3D model of the target object displayed on the mobile phone can be rotated, zoomed in, zoomed out, and so on.
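The message exchange between the mobile phone and the cloud in this flow (upload, indication message, download request, model delivery) can be sketched as a small in-process simulation. The class names, message fields, job identifier, and file names are hypothetical, and the modeling step itself is replaced by a placeholder; the sketch only illustrates the order of the messages.

```python
class Cloud:
    def __init__(self):
        self.models = {}

    def receive_upload(self, job_id, frame_sequence, first_file):
        # Placeholder for the real 3D modeling step, which is not implemented here.
        self.models[job_id] = {
            "mesh": "model.obj",        # basic 3D model (hypothetical file name)
            "texture": "texture.jpg",   # surface texture (hypothetical file name)
            "key_frame_count": len(frame_sequence),
        }
        # Indication message: modeling has been completed.
        return {"type": "modeling_done", "job_id": job_id}

    def handle_download(self, request):
        # Deliver the stored model in response to a download request message.
        return self.models[request["job_id"]]


class Phone:
    def __init__(self, cloud):
        self.cloud = cloud
        self.displayed_model = None

    def upload_and_wait(self, job_id, frame_sequence, first_file):
        indication = self.cloud.receive_upload(job_id, frame_sequence, first_file)
        if indication["type"] == "modeling_done":
            request = {"type": "download_request", "job_id": indication["job_id"]}
            self.displayed_model = self.cloud.handle_download(request)
        return self.displayed_model


phone = Phone(Cloud())
model = phone.upload_and_wait("job-1", ["18.jpg", "26.jpg"], {"matching_list": []})
```

A real deployment would run these steps over a network and asynchronously; the synchronous calls here are a simplification.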
- FIG. 20 is a schematic diagram of a 3D modeling method implemented by the device-cloud collaboration system provided in the embodiment of the present application.
- the mobile phone may at least include an RGB camera (such as a camera) and a first application.
- the first application is the above-mentioned 3D modeling application.
- the RGB camera can be used to realize the shooting function of the mobile phone, shoot the target object to be modeled, and obtain the corresponding picture of the target object.
- the RGB camera can transmit the captured pictures to the first application.
- the first application may include a data acquisition and dynamic guidance module, a data processing module, a 3D model preview module, and a 3D model export module.
- the data acquisition and dynamic guidance module can realize functions such as blur detection, key frame selection, guidance information calculation, and guidance interface update.
- Through the blur detection function, blur detection can be performed on each captured frame (which can be called an input frame) to obtain pictures whose definition meets the requirements as key frames; if the definition of the current frame does not meet the requirements, the next frame is obtained.
- Through the key frame selection function, it can be determined whether a picture has already been stored in the frame sequence file. If the picture is not stored in the frame sequence file, the picture is added to the frame sequence file.
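Blur detection followed by key frame selection can be sketched as below. The application does not name a blur metric; the variance of a 4-neighbor Laplacian is used here as one common sharpness measure and is an assumption, as are the function names, the threshold, identifying a picture by its number, and the tiny synthetic images.

```python
def laplacian_variance(gray):
    """Variance of the 4-neighbor Laplacian over the interior pixels of a grayscale image."""
    height, width = len(gray), len(gray[0])
    values = []
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            lap = (gray[r - 1][c] + gray[r + 1][c] + gray[r][c - 1]
                   + gray[r][c + 1] - 4 * gray[r][c])
            values.append(lap)
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def select_key_frames(frames, threshold):
    """Keep sharp frames that are not yet stored; frames are (picture_number, image) pairs."""
    sequence = {}
    for number, gray in frames:
        if laplacian_variance(gray) < threshold:
            continue  # too blurry: move on to the next frame
        if number not in sequence:
            sequence[number] = gray  # not stored yet: add to the frame sequence
    return sequence


# Synthetic 5x5 images: a flat (featureless) one and a checkerboard (sharp edges).
flat = [[128] * 5 for _ in range(5)]
sharp = [[255 if (r + c) % 2 else 0 for c in range(5)] for r in range(5)]
selected = select_key_frames([(1, flat), (2, sharp), (2, sharp)], threshold=100.0)
```

In this example the flat image is rejected as blurry and the checkerboard is stored once, even though it is offered twice.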
- Through the guidance information calculation function, the relationship between the picture and each patch in the patch model can be determined according to the camera pose information corresponding to the picture, and the patch corresponding to the picture can be obtained.
- The correspondence between pictures and patches is the guidance information.
- Through the guidance interface update function, the display effect of the patches in the patch model can be updated (that is, changed), for example by lighting up the patches, according to the guidance information obtained through the aforementioned calculation.
- the data processing module can realize matching relationship calculation, matching list calculation, data packaging and other functions.
- the matching relationship between key frames in the frame sequence file can be calculated through the matching relationship calculation function, such as whether they are adjacent or not.
- The data processing module can calculate the matching relationship between the key frames in the frame sequence file according to the association relationship between the patches in the patch model through the matching relationship calculation function.
- the matching list calculation function can generate the matching list of each key frame according to the calculation result of the matching relationship calculation function, and the matching list of each key frame includes the matching information of each key frame.
- the matching information of a key frame includes identification information of other key frames associated with the key frame of the frame.
- the first file including the matching information of each key frame and the frame sequence file can be packaged through the data packaging function. After the first file and the frame sequence file are packaged, the mobile phone can send the packaged data package (including the first file and the frame sequence file) to the cloud.
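The data packaging step, and the corresponding unpacking on the cloud side, can be sketched with a ZIP archive built in memory. The application does not specify the package format; the use of ZIP, the JSON encoding of the first file, the archive member names (`first_file.json`, `frames/...`), and the function names are all assumptions.

```python
import io
import json
import zipfile


def pack(first_file, frames):
    """Package the first file and the key frame pictures into one archive (bytes)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("first_file.json", json.dumps(first_file))
        for name, data in frames.items():
            archive.writestr(f"frames/{name}", data)
    return buffer.getvalue()


def unpack(package):
    """Extract the first file and the key frame pictures from the archive."""
    with zipfile.ZipFile(io.BytesIO(package)) as archive:
        first_file = json.loads(archive.read("first_file.json"))
        frames = {
            name[len("frames/"):]: archive.read(name)
            for name in archive.namelist()
            if name.startswith("frames/")
        }
    return first_file, frames


# Hypothetical content: one matching entry and two placeholder "pictures".
first_file = {"matching_list": [{"src_id": 18, "tgt_id": [26]}]}
frames = {"18.jpg": b"\xff\xd8fake-jpeg-18", "26.jpg": b"\xff\xd8fake-jpeg-26"}

package = pack(first_file, frames)
restored_file, restored_frames = unpack(package)
```

The round trip returns the same first file and the same picture bytes, which is the property the cloud relies on when it parses the received data packet.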
- the cloud may include a data analysis module, a 3D modeling module, and a data storage module.
- the data analysis module can analyze the received data packet to obtain the frame sequence file and the first file.
- the 3D modeling module can perform 3D modeling according to the frame sequence file and the first file to obtain a 3D model.
- the 3D modeling module can realize functions such as key target extraction, feature detection and matching, global optimization and fusion, sparse point cloud computing, dense point cloud computing, surface reconstruction, and texture generation.
- Through the key target extraction function, the target object of interest in the key frame picture can be separated from the background, and different image features can be extracted by identifying and interpreting meaningful object entities in the image.
- Through the feature detection and matching function, distinctive pixels in a key frame picture can be detected as the feature points of that picture; the salient feature points in different key frame pictures are described, and the similarity between two descriptions is compared to judge whether feature points in different key frame pictures are the same feature.
- The 3D point cloud data corresponding to the target object can be generated according to the results of feature detection and matching.
- Through the surface reconstruction function, the 3D surface shape of the object can be accurately restored by using the 3D point cloud data, and the basic 3D model of the target object can be obtained.
- Through the texture generation function, the texture (also called texture map) of the surface of the target object can be generated according to the key frame pictures or the features of the key frame pictures. After the texture of the surface of the target object is obtained, the texture is mapped to the surface of the basic 3D model of the target object in a specific manner to obtain the 3D model of the target object.
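The comparison of feature descriptions mentioned for the feature detection and matching function can be illustrated with a nearest-neighbor search over descriptor vectors. The application does not say how similarity is measured; Euclidean distance with a ratio test is a common choice and is used here as an assumption. The function name, the ratio value, and the two-dimensional toy descriptors are hypothetical.

```python
import math


def match_descriptors(desc_a, desc_b, ratio=0.8):
    """Match each descriptor in desc_a to its nearest neighbor in desc_b.

    A match is kept only if the best distance is clearly smaller than the
    second-best distance (ratio test), which rejects ambiguous matches.
    """
    matches = []
    for i, da in enumerate(desc_a):
        distances = sorted((math.dist(da, db), j) for j, db in enumerate(desc_b))
        if not distances:
            continue
        if len(distances) == 1 or distances[0][0] < ratio * distances[1][0]:
            matches.append((i, distances[0][1]))
    return matches


# Toy 2D descriptors for feature points in two key frame pictures.
frame_a = [(0.0, 0.0), (10.0, 10.0)]
frame_b = [(0.1, 0.0), (10.0, 10.1), (5.0, 5.0)]
matches = match_descriptors(frame_a, frame_b)

# An ambiguous case: two candidates at the same distance give no match.
ambiguous = match_descriptors([(0.0, 0.0)], [(1.0, 0.0), (-1.0, 0.0)])
```

Real descriptors have many more dimensions, but the decision rule (accept the nearest candidate only when it is distinctly closer than the runner-up) is the same.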
- the 3D modeling module can store the 3D model of the target object in the data storage module.
- the first application of the mobile phone can download the 3D model of the target object from the data storage module in the cloud. After downloading the 3D model of the target object, the first application may provide the user with a 3D model preview function through the 3D model preview module, or provide the user with a 3D model export function through the 3D model export module.
- For the specific process by which the first application provides the user with a 3D model preview function through the 3D model preview module, refer to the foregoing embodiments.
- the scanning progress can also be displayed on the 3D modeling data collection interface.
- the mobile phone can display the scanning progress through the circular black filling effect in the scan button in the 3D modeling data collection interface.
- When the UI presentation of the scan button differs, the manner in which the mobile phone displays the scanning progress on the 3D modeling data collection interface may also differ, which is not limited here.
- the mobile phone does not need to display the scanning progress, and the user can know the scanning progress according to the lighting of the patches in the patch model.
- the steps of the 3D modeling method provided in the embodiments of the present application may also all be implemented on the terminal device side.
- the functions implemented on the cloud side described in the foregoing embodiments may also all be implemented in the terminal device. That is, after obtaining the above-mentioned frame sequence file and the first file, the terminal device can directly locally generate a 3D model of the target object according to the frame sequence file and the first file, and provide functions such as preview and export of the 3D model.
- the specific principle by which the terminal device locally generates the 3D model of the target object based on the frame sequence file and the first file is the same as the principle by which the cloud generates the 3D model of the target object based on the frame sequence file and the first file described in the foregoing embodiments, and is not repeated here.
- an embodiment of the present application provides a modeling device, which can be applied to a terminal device and is used to implement the steps performed by the terminal device in the 3D modeling method described in the foregoing embodiments.
- the functions of the device can be implemented by hardware, or by hardware executing corresponding software.
- the hardware or software includes one or more modules or units corresponding to the functions described above.
- FIG. 21 is a schematic structural diagram of a modeling device provided by an embodiment of the present application.
- the device may include: a display unit 2101 and a processing unit 2102 .
- the display unit 2101 and the processing unit 2102 can be used to cooperate to implement the functions of the terminal device in the modeling method described in the foregoing method embodiments.
- the display unit 2101 is configured to display a first interface, where the first interface includes a captured image of the terminal device.
- the processing unit 2102 is configured to, in response to the collection operation, collect multiple frames of images corresponding to the target object to be modeled, and acquire the association relationship between the multiple frames of images; and to acquire a three-dimensional model corresponding to the target object according to the multi-frame images and the association relationship between the multi-frame images.
- the display unit 2101 is further configured to display a three-dimensional model corresponding to the target object.
- the display unit 2101 is further configured to display a first virtual bounding volume; the first virtual bounding volume includes multiple patches.
- the processing unit 2102 is specifically configured to: when the terminal device is in the first pose, collect the first image and change the display effect of the patch corresponding to the first image; when the terminal device is in the second pose, collect the second image and change the display effect of the patch corresponding to the second image; and, after the display effects of the plurality of patches of the first virtual bounding volume have been changed, acquire the association relationship between the multiple frames of images according to the plurality of patches.
- the display unit 2101 and the processing unit 2102 are also configured to implement other display functions and processing functions of the terminal device in the modeling method described in the foregoing method embodiments, which will not be repeated here.
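As an illustration of how an association relationship between frames could be derived from the patches, the following sketch treats the bounding volume as one or more rings of patches and associates each frame with the frames collected on adjacent patches. The grid layout, the 4-neighbour adjacency rule, and all names (`neighbor_patches`, `matching_info`, the `img_*` identifiers) are assumptions made for this example; the application does not specify them here.

```python
from typing import Dict, List, Tuple

Patch = Tuple[int, int]  # hypothetical patch address: (layer, column)


def neighbor_patches(patch: Patch, layers: int, columns: int) -> List[Patch]:
    """Patches adjacent to `patch`; each layer is a ring, so columns wrap."""
    layer, col = patch
    result = [(layer, (col - 1) % columns), (layer, (col + 1) % columns)]
    if layer > 0:
        result.append((layer - 1, col))
    if layer < layers - 1:
        result.append((layer + 1, col))
    return result


def matching_info(frame_to_patch: Dict[str, Patch],
                  layers: int, columns: int) -> Dict[str, List[str]]:
    """For each frame, list the IDs of frames collected on adjacent patches."""
    patch_to_frame = {patch: frame for frame, patch in frame_to_patch.items()}
    info: Dict[str, List[str]] = {}
    for frame, patch in frame_to_patch.items():
        related = {patch_to_frame[n]
                   for n in neighbor_patches(patch, layers, columns)
                   if n in patch_to_frame}
        related.discard(frame)  # a frame is never associated with itself
        info[frame] = sorted(related)
    return info


# One layer of four patches, one frame collected per patch.
frames = {"img_0": (0, 0), "img_1": (0, 1), "img_2": (0, 2), "img_3": (0, 3)}
for frame, related in matching_info(frames, layers=1, columns=4).items():
    print(frame, "->", related)
```

The point of the sketch is that the association comes from where each frame was captured on the bounding volume, not from comparing image content, so it can be computed as soon as the patches have been lit.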
- FIG. 22 is another schematic structural diagram of the modeling device provided by the embodiment of the present application.
- when the terminal device sends the multi-frame images and the association relationship between the multi-frame images to the server, and the server generates the three-dimensional model corresponding to the target object from them, the modeling device may also include a sending unit 2103 and a receiving unit 2104. The sending unit 2103 is configured to send the multi-frame images and the association relationship between the multi-frame images to the server.
- the receiving unit 2104 is configured to receive the 3D model corresponding to the target object sent from the server.
- the sending unit 2103 is also configured to implement other sending functions that the terminal device can implement in the methods described in the foregoing method embodiments, such as sending a download request message.
- the receiving unit 2104 is also configured to implement other receiving functions that the terminal device can implement in the methods described in the foregoing method embodiments, such as receiving indication messages, which are not described here one by one.
- the apparatus may further include other modules or units configured to implement the functions of the terminal device in the methods described in the foregoing embodiments, which are not shown here one by one.
- the embodiment of the present application further provides a modeling device, which can be applied to a server, and used to realize the function of the server in the 3D modeling method described in the foregoing embodiments.
- the functions of the device can be implemented by hardware, or by hardware executing corresponding software.
- the hardware or software includes one or more modules or units corresponding to the functions described above.
- FIG. 23 is another schematic structural diagram of the modeling device provided by the embodiment of the present application.
- the apparatus may include: a receiving unit 2301 , a processing unit 2302 and a sending unit 2303 .
- the receiving unit 2301, the processing unit 2302, and the sending unit 2303 may be configured to cooperate to realize the server function in the modeling method described in the foregoing method embodiments.
- the receiving unit 2301 may be configured to receive multiple frames of images corresponding to the target object sent from the terminal device and the association relationship among the multiple frames of images.
- the processing unit 2302 may be configured to generate a three-dimensional model corresponding to the target object according to the multi-frame images and the association relationship between the multi-frame images.
- the sending unit 2303 may be configured to send the 3D model corresponding to the target object to the terminal device.
- the receiving unit 2301, the processing unit 2302, and the sending unit 2303 may be configured to implement all functions that can be implemented by the server in the modeling method described in the foregoing method embodiments, which will not be repeated here.
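One plausible way for the processing unit on the server side to use the received association relationship is to restrict feature matching to associated frame pairs instead of matching every frame against every other frame. This use is an assumption made for illustration and is not stated in this passage; the function names and the `img_*` identifiers are likewise hypothetical.

```python
from typing import Dict, List, Tuple


def candidate_pairs(matching_info: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Unordered frame pairs that the association relationship marks as related."""
    pairs = set()
    for frame, related in matching_info.items():
        for other in related:
            if other != frame:
                # Sort the two IDs so (a, b) and (b, a) count as one pair.
                pairs.add(tuple(sorted((frame, other))))
    return sorted(pairs)


def exhaustive_pair_count(frame_count: int) -> int:
    """Number of pairs if every frame were matched against every other frame."""
    return frame_count * (frame_count - 1) // 2


# Matching information for four frames captured around one ring of patches.
info = {
    "img_0": ["img_1", "img_3"],
    "img_1": ["img_0", "img_2"],
    "img_2": ["img_1", "img_3"],
    "img_3": ["img_0", "img_2"],
}
pairs = candidate_pairs(info)
print(len(pairs), "associated pairs instead of", exhaustive_pair_count(len(info)))
```

With only four frames the saving is small, but the number of exhaustive pairs grows quadratically with the frame count, whereas the number of associated pairs in this sketch grows linearly.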
- the division of units (or called modules) in the above device is only a division of logical functions, and may be fully or partially integrated into a physical entity or physically separated during actual implementation.
- the units in the device can all be implemented in the form of software called by the processing element; they can also be implemented in the form of hardware; some units can also be implemented in the form of software called by the processing element, and some units can be implemented in the form of hardware.
- each unit can be a separate processing element, or it can be integrated in a chip of the device. A unit can also be stored in the memory in the form of a program, which is called by a processing element of the device to execute the function of the unit. In addition, all or part of these units can be integrated together, or implemented independently.
- the processing element described here may also be referred to as a processor, and may be an integrated circuit with a signal processing capability. During implementation, each step of the above method or each unit above may be implemented by an integrated logic circuit of hardware in the processing element, or in the form of software called by the processing element.
- the units in the above device may be one or more integrated circuits configured to implement the above method, for example: one or more application specific integrated circuits (ASIC), one or more digital signal processors (DSP), one or more field programmable gate arrays (FPGA), or a combination of at least two of these integrated circuit forms.
- the processing element can be a general-purpose processor, such as a central processing unit (CPU) or another processor that can call programs.
- these units can be integrated together and implemented in the form of a system-on-a-chip (SOC).
- the units of the above apparatus that implement the corresponding steps of the above method may be implemented in the form of a processing element scheduling a program.
- the apparatus may include a processing element and a storage element, and the processing element invokes a program stored in the storage element to execute the methods described in the above method embodiments.
- the storage element may be a storage element on the same chip as the processing element, that is, an on-chip storage element.
- the program for executing the above method may be stored in a storage element on a different chip from the processing element, that is, an off-chip storage element.
- the processing element invokes the program from the off-chip storage element, or loads it onto the on-chip storage element, so as to invoke and execute the steps performed by the terminal device or the server in the methods described in the above method embodiments.
- the embodiment of the present application may also provide an apparatus, such as an electronic device.
- the electronic device may include: a processor; a memory; and a computer program; wherein the computer program is stored in the memory, and when the computer program is executed by the processor, the electronic device implements the steps performed by the terminal device or the server in the 3D modeling method described in the foregoing embodiments.
- the memory can be located inside the electronic device or outside the electronic device.
- there may be one or more processors.
- the electronic device may be a terminal device such as a mobile phone, a large screen (such as a smart screen), a tablet computer, a wearable device (such as a smart watch or a smart bracelet), a TV, an in-vehicle device, an augmented reality (AR)/virtual reality (VR) device, a notebook computer, an ultra-mobile personal computer (UMPC), a netbook, or a personal digital assistant (PDA).
- the units of the device that implement the steps of the above method may be configured as one or more processing elements, where the processing elements may be integrated circuits, for example: one or more ASICs, one or more DSPs, one or more FPGAs, or a combination of these types of integrated circuits. These integrated circuits can be integrated together to form a chip.
- an embodiment of the present application further provides a chip, and the chip can be applied to the above-mentioned electronic device.
- the chip includes one or more interface circuits and one or more processors; the interface circuits and processors are interconnected through lines; the processor receives computer instructions from the memory of the electronic device through the interface circuits and executes them, so as to implement the steps performed by the terminal device or the server in the 3D modeling method described in the foregoing embodiments.
- the embodiment of the present application also provides a computer program product, including computer-readable code; when the computer-readable code runs in an electronic device, the electronic device implements the steps performed by the terminal device or the server in the 3D modeling method described in the foregoing embodiments.
- each functional unit in each embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
- the above-mentioned integrated units can be implemented in the form of hardware or in the form of software functional units.
- if the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a readable storage medium.
- the software product is stored in a program product, such as a computer-readable storage medium, and includes several instructions that cause a device (which may be a single-chip microcomputer, a chip, etc.) or a processor to execute all or some of the steps of the methods described in the various embodiments of the present application.
- the aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, ROM, RAM, a magnetic disk, or an optical disk.
- an embodiment of the present application may also provide a computer-readable storage medium; the computer-readable storage medium includes a computer program, and when the computer program is run on an electronic device, the electronic device implements the steps performed by the terminal device or the server in the 3D modeling method described in the foregoing embodiments.
- the embodiment of the present application also provides a device-cloud collaboration system.
- for the composition of the device-cloud collaboration system, refer to the above-mentioned FIG. 1 or FIG. 20; the system includes a terminal device and a server, and the terminal device is connected to the server.
- the terminal device displays a first interface, and the first interface includes a shooting picture of the terminal device; in response to the collection operation, the terminal device collects multiple frames of images corresponding to the target object to be modeled and acquires the association relationship between the multiple frames of images; during the process of collecting the multiple frames of images corresponding to the target object, the terminal device displays a first virtual bounding volume; the first virtual bounding volume includes a plurality of patches.
- the terminal device collecting, in response to the collection operation, the multiple frames of images corresponding to the target object to be modeled and acquiring the association relationship between the multiple frames of images includes: when the terminal device is in the first pose, the terminal device collects the first image and changes the display effect of the patch corresponding to the first image; when the terminal device is in the second pose, the terminal device collects the second image and changes the display effect of the patch corresponding to the second image; after the display effects of the plurality of patches of the first virtual bounding volume are changed, the terminal device acquires the association relationship between the multiple frames of images according to the plurality of patches.
- the terminal device can implement all the functions that the terminal device can implement in the 3D modeling method described in the foregoing method embodiments, and the server can implement all the functions that the server can implement in the 3D modeling method described in the foregoing method embodiments, which are not repeated here.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Human Computer Interaction (AREA)
- Software Systems (AREA)
- Computer Hardware Design (AREA)
- Computer Graphics (AREA)
- Geometry (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Architecture (AREA)
- Processing Or Creating Images (AREA)
Claims (24)
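- A modeling method, characterized in that the method is applied to a terminal device, and the method comprises: the terminal device displays a first interface, the first interface including a shooting picture of the terminal device; in response to a collection operation, the terminal device collects multiple frames of images corresponding to a target object to be modeled, and acquires an association relationship between the multiple frames of images; wherein, in the process of collecting the multiple frames of images corresponding to the target object, the terminal device displays a first virtual bounding volume; the first virtual bounding volume includes a plurality of patches; the terminal device collecting, in response to the collection operation, the multiple frames of images corresponding to the target object to be modeled and acquiring the association relationship between the multiple frames of images includes: when the terminal device is in a first pose, the terminal device collects a first image and changes the display effect of the patch corresponding to the first image; when the terminal device is in a second pose, the terminal device collects a second image and changes the display effect of the patch corresponding to the second image; after the display effects of the plurality of patches of the first virtual bounding volume are changed, the terminal device acquires the association relationship between the multiple frames of images according to the plurality of patches; the terminal device acquires a three-dimensional model corresponding to the target object according to the multiple frames of images and the association relationship between the multiple frames of images; and the terminal device displays the three-dimensional model corresponding to the target object.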
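- The method according to claim 1, characterized in that the terminal device includes a first application, and before the terminal device displays the first interface, the method further comprises: the terminal device displays a second interface in response to an operation of opening the first application; and the terminal device displaying the first interface includes: the terminal device displays the first interface in response to an operation of starting the three-dimensional modeling function of the first application on the second interface.
- The method according to claim 1 or 2, characterized in that the first virtual bounding volume includes one or more layers, and the plurality of patches are distributed over the one or more layers.
- The method according to any one of claims 1-3, characterized in that the method further comprises: the terminal device displays first prompt information, the first prompt information being used to remind the user to place the target object at the central position of the shooting picture.
- The method according to any one of claims 1-4, characterized in that the method further comprises: the terminal device displays second prompt information; the second prompt information is used to remind the user to adjust one or more of: the shooting environment in which the target object is located, the manner of shooting the target object, and the proportion of the screen occupied by the target object.
- The method according to any one of claims 1-5, characterized in that, before the terminal device acquires the three-dimensional model corresponding to the target object according to the multiple frames of images and the association relationship between the multiple frames of images, the method further comprises: the terminal device detects an operation of generating a three-dimensional model; and, in response to the operation of generating a three-dimensional model, the terminal device displays third prompt information, the third prompt information being used to prompt the user that the target object is being modeled.
- The method according to any one of claims 1-6, characterized in that, after the terminal device acquires the three-dimensional model corresponding to the target object according to the multiple frames of images and the association relationship between the multiple frames of images, the method further comprises: the terminal device displays fourth prompt information, the fourth prompt information being used to prompt the user that the modeling of the target object has been completed.
- The method according to any one of claims 1-7, characterized in that the terminal device displaying the three-dimensional model corresponding to the target object further includes: in response to an operation of changing the display angle of the three-dimensional model corresponding to the target object, the terminal device changes the display angle of the three-dimensional model corresponding to the target object; the operation of changing the display angle of the three-dimensional model corresponding to the target object includes an operation of dragging the three-dimensional model corresponding to the target object to rotate clockwise or counterclockwise along a first direction.
- The method according to any one of claims 1-8, characterized in that the terminal device displaying the three-dimensional model corresponding to the target object further includes: in response to an operation of changing the display size of the three-dimensional model corresponding to the target object, the terminal device changes the display size of the three-dimensional model corresponding to the target object; the operation of changing the display size of the three-dimensional model corresponding to the target object includes an operation of zooming in or zooming out on the three-dimensional model corresponding to the target object.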
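- The method according to any one of claims 1-9, characterized in that the association relationship between the multiple frames of images includes matching information of each frame of image among the multiple frames of images; the matching information of each frame of image includes identification information of the other images, among the multiple frames of images, that are associated with that image; the matching information of each frame of image is obtained according to the association relationship between each frame of image and the patch corresponding to that frame of image, and the association relationship among the plurality of patches.
- The method according to any one of claims 1-10, characterized in that the terminal device collecting, in response to the collection operation, the multiple frames of images corresponding to the target object to be modeled and acquiring the association relationship between the multiple frames of images further includes: the terminal device determines the target object according to the shooting picture; when the terminal device collects the multiple frames of images, the position of the target object in the shooting picture is the central position of the shooting picture.
- The method according to any one of claims 1-11, characterized in that the terminal device collecting the multiple frames of images corresponding to the target object to be modeled includes: in the process of shooting the target object, the terminal device performs blur detection on each captured frame of image, and collects images whose sharpness is greater than a first threshold as the images corresponding to the target object.
- The method according to any one of claims 1-12, characterized in that the terminal device displaying the three-dimensional model corresponding to the target object includes: the terminal device displays the three-dimensional model corresponding to the target object in response to an operation of previewing the three-dimensional model corresponding to the target object.
- The method according to any one of claims 1-13, characterized in that the three-dimensional model corresponding to the target object includes a basic three-dimensional model of the target object and the texture of the surface of the target object.
- The method according to any one of claims 1-14, characterized in that the terminal device is connected to a server; the terminal device acquiring the three-dimensional model corresponding to the target object according to the multiple frames of images and the association relationship between the multiple frames of images includes: the terminal device sends the multiple frames of images and the association relationship between the multiple frames of images to the server; and the terminal device receives the three-dimensional model corresponding to the target object sent from the server.
- The method according to claim 15, characterized in that the method further comprises: the terminal device sends to the server the camera intrinsic parameters, gravity direction information, image name, image number, camera pose information, and timestamp respectively corresponding to the multiple frames of images.
- The method according to claim 15 or 16, characterized in that the method further comprises: the terminal device receives an indication message from the server, the indication message being used to indicate to the terminal device that the server has completed the modeling of the target object.
- The method according to any one of claims 15-17, characterized in that, before the terminal device receives the three-dimensional model corresponding to the target object sent from the server, the method further comprises: the terminal device sends a download request message to the server, the download request message being used to request the server to download the three-dimensional model corresponding to the target object.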
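- A modeling method, characterized in that the method is applied to a server, and the server is connected to a terminal device; the method comprises: the server receives multiple frames of images corresponding to a target object and an association relationship between the multiple frames of images sent from the terminal device; the server generates a three-dimensional model corresponding to the target object according to the multiple frames of images and the association relationship between the multiple frames of images; and the server sends the three-dimensional model corresponding to the target object to the terminal device.
- The method according to claim 19, characterized in that the method further comprises: the server receives the camera intrinsic parameters, gravity direction information, image name, image number, camera pose information, and timestamp respectively corresponding to the multiple frames of images sent from the terminal device; the server generating the three-dimensional model corresponding to the target object according to the multiple frames of images and the association relationship between the multiple frames of images includes: the server generates the three-dimensional model corresponding to the target object according to the multiple frames of images, the association relationship between the multiple frames of images, and the camera intrinsic parameters, gravity direction information, image name, image number, camera pose information, and timestamp respectively corresponding to the multiple frames of images.
- The method according to claim 19 or 20, characterized in that the association relationship between the multiple frames of images includes matching information of each frame of image among the multiple frames of images; the matching information of each frame of image includes identification information of the other images, among the multiple frames of images, that are associated with that image; the matching information of each frame of image is obtained according to the association relationship between each frame of image and the patch corresponding to that frame of image, and the association relationship among the plurality of patches.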
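- A device-cloud collaboration system, characterized by comprising: a terminal device and a server, the terminal device being connected to the server; the terminal device displays a first interface, the first interface including a shooting picture of the terminal device; in response to a collection operation, the terminal device collects multiple frames of images corresponding to a target object to be modeled, and acquires an association relationship between the multiple frames of images; wherein, in the process of collecting the multiple frames of images corresponding to the target object, the terminal device displays a first virtual bounding volume; the first virtual bounding volume includes a plurality of patches; the terminal device collecting, in response to the collection operation, the multiple frames of images corresponding to the target object to be modeled and acquiring the association relationship between the multiple frames of images includes: when the terminal device is in a first pose, the terminal device collects a first image and changes the display effect of the patch corresponding to the first image; when the terminal device is in a second pose, the terminal device collects a second image and changes the display effect of the patch corresponding to the second image; after the display effects of the plurality of patches of the first virtual bounding volume are changed, the terminal device acquires the association relationship between the multiple frames of images according to the plurality of patches; the terminal device sends the multiple frames of images and the association relationship between the multiple frames of images to the server; the server acquires a three-dimensional model corresponding to the target object according to the multiple frames of images and the association relationship between the multiple frames of images; the server sends the three-dimensional model corresponding to the target object to the terminal device; and the terminal device displays the three-dimensional model corresponding to the target object.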
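- An electronic device, characterized by comprising: a processor; a memory; and a computer program; wherein the computer program is stored in the memory, and when the computer program is executed by the processor, the electronic device is caused to implement the method according to any one of claims 1-18, or the method according to any one of claims 19-21.
- A computer-readable storage medium, the computer-readable storage medium including a computer program, characterized in that, when the computer program runs on an electronic device, the electronic device is caused to implement the method according to any one of claims 1-18, or the method according to any one of claims 19-21.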
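The blur-detection step recited in the claims (keeping only frames whose sharpness exceeds a first threshold) can be sketched as follows. The claims do not say how sharpness is measured; this example assumes the variance of a 4-neighbour Laplacian response as the measure, and the function names, frame names, and threshold value are hypothetical.

```python
from typing import List, Sequence, Tuple

Gray = Sequence[Sequence[float]]  # grayscale image as rows of pixel values


def laplacian_variance(gray: Gray) -> float:
    """Assumed sharpness score: variance of the Laplacian over interior pixels."""
    height, width = len(gray), len(gray[0])
    if height < 3 or width < 3:
        raise ValueError("image must be at least 3x3")
    responses = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            responses.append(gray[y - 1][x] + gray[y + 1][x]
                             + gray[y][x - 1] + gray[y][x + 1]
                             - 4 * gray[y][x])
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)


def select_sharp_frames(frames: List[Tuple[str, Gray]],
                        first_threshold: float) -> List[str]:
    """Keep the names of frames whose sharpness is greater than the threshold."""
    return [name for name, gray in frames
            if laplacian_variance(gray) > first_threshold]


# A featureless frame (no edges) and a high-contrast checkerboard frame.
flat = [[100.0] * 4 for _ in range(4)]
checker = [[255.0 if (x + y) % 2 else 0.0 for x in range(4)] for y in range(4)]
print(select_sharp_frames([("flat", flat), ("checker", checker)],
                          first_threshold=100.0))
```

A blurred frame has weak edges and therefore a low Laplacian variance, so it falls below the threshold and is not collected; only the threshold comparison itself comes from the claims.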
Priority Applications (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| EP22827279.5A EP4343698A4 (en) | 2021-06-26 | 2022-05-19 | MODELING METHOD AND ASSOCIATED ELECTRONIC DEVICE, AND STORAGE MEDIUM |
| US18/573,668 US12482212B2 (en) | 2021-06-26 | 2022-05-19 | Modeling method, related electronic device, and storage medium |
| US19/361,280 US20260112138A1 (en) | 2021-06-26 | 2025-10-17 | Modeling method, related electronic device, and storage medium |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202110715044.2 | 2021-06-26 | | |
| CN202110715044.2A CN115526925B (zh) | 2021-06-26 | 2021-06-26 | Modeling method, related electronic device, and storage medium |
Related Child Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US18/573,668 A-371-Of-International US12482212B2 (en) | 2021-06-26 | 2022-05-19 | Modeling method, related electronic device, and storage medium |
| US19/361,280 Continuation US20260112138A1 (en) | 2021-06-26 | 2025-10-17 | Modeling method, related electronic device, and storage medium |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2022267781A1 (zh) | 2022-12-29 |
Family
ID=84544082
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2022/093934 Ceased WO2022267781A1 (zh) | 2021-06-26 | 2022-05-19 | Modeling method, related electronic device, and storage medium |
Country Status (4)
| Country | Link |
|---|---|
| US (2) | US12482212B2 (zh) |
| EP (1) | EP4343698A4 (zh) |
| CN (2) | CN115526925B (zh) |
| WO (1) | WO2022267781A1 (zh) |
Families Citing this family (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
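| CN121646794A (zh) * | 2024-06-28 | 2026-03-10 | 京东方科技集团股份有限公司 | Augmented-reality-based three-dimensional reconstruction method and apparatus |
| CN119784679B (zh) * | 2024-11-26 | 2025-12-19 | 深圳创景数科信息技术有限公司 | Multimodal-model-based textile fabric recognition method, apparatus, device, and storage medium |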
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20160335985A1 (en) * | 2015-05-14 | 2016-11-17 | Box, Inc. | Rendering high bit depth grayscale images using gpu color spaces and acceleration |
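| CN108108748A (zh) * | 2017-12-08 | 2018-06-01 | 联想(北京)有限公司 | Information processing method and electronic device |
| CN109658507A (zh) * | 2018-11-27 | 2019-04-19 | 联想(北京)有限公司 | Information processing method and apparatus, and electronic device |
| CN110473292A (zh) * | 2019-07-16 | 2019-11-19 | 江苏艾佳家居用品有限公司 | Method for automated loading and layout of models in a three-dimensional scene |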
Family Cites Families (9)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
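| CN104573597B (zh) | 2013-10-10 | 2018-12-11 | 腾讯科技(深圳)有限公司 | Two-dimensional code recognition method and apparatus |
| CN104574501B (zh) * | 2014-12-19 | 2017-07-21 | 浙江大学 | High-quality texture mapping method for complex three-dimensional scenes |
| CN108229232B (zh) | 2016-12-21 | 2021-02-19 | 腾讯科技(深圳)有限公司 | Method and apparatus for batch scanning of two-dimensional codes |
| CN106951812B (zh) | 2017-03-31 | 2018-12-07 | 腾讯科技(深圳)有限公司 | Method, apparatus, and terminal for recognizing a two-dimensional code |
| CN112861560B (zh) | 2017-09-27 | 2023-12-22 | 创新先进技术有限公司 | Two-dimensional code positioning method and apparatus |
| CN110020571B (zh) | 2019-03-18 | 2022-05-13 | 创新先进技术有限公司 | Two-dimensional code correction method, apparatus, and device |
| CN110064200B (zh) * | 2019-04-25 | 2022-02-22 | 腾讯科技(深圳)有限公司 | Virtual-environment-based object construction method and apparatus, and readable storage medium |
| CN115456002A (zh) | 2019-05-31 | 2022-12-09 | 创新先进技术有限公司 | Two-dimensional code recognition method, method for establishing a two-dimensional code positioning and recognition model, and apparatus therefor |
| CN112785492A (zh) * | 2021-01-20 | 2021-05-11 | 北京百度网讯科技有限公司 | Image processing method and apparatus, electronic device, and storage medium |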
2021
- 2021-06-26 CN CN202110715044.2A patent/CN115526925B/zh active Active
- 2021-06-26 CN CN202510890245.4A patent/CN120997269A/zh active Pending
2022
- 2022-05-19 EP EP22827279.5A patent/EP4343698A4/en active Pending
- 2022-05-19 WO PCT/CN2022/093934 patent/WO2022267781A1/zh not_active Ceased
- 2022-05-19 US US18/573,668 patent/US12482212B2/en active Active
2025
- 2025-10-17 US US19/361,280 patent/US20260112138A1/en active Pending
Non-Patent Citations (1)
| Title |
|---|
| See also references of EP4343698A4 |
Also Published As
| Publication number | Publication date |
|---|---|
| US12482212B2 (en) | 2025-11-25 |
| CN115526925A (zh) | 2022-12-27 |
| US20260112138A1 (en) | 2026-04-23 |
| EP4343698A4 (en) | 2024-09-04 |
| US20240331324A1 (en) | 2024-10-03 |
| CN115526925B (zh) | 2025-07-11 |
| EP4343698A1 (en) | 2024-03-27 |
| CN120997269A (zh) | 2025-11-21 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 22827279; Country of ref document: EP; Kind code of ref document: A1 |
| | WWE | Wipo information: entry into national phase | Ref document number: 2022827279; Country of ref document: EP |
| | WWE | Wipo information: entry into national phase | Ref document number: 18573668; Country of ref document: US |
| | ENP | Entry into the national phase | Ref document number: 2022827279; Country of ref document: EP; Effective date: 20231213 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | WWG | Wipo information: grant in national office | Ref document number: 18573668; Country of ref document: US |


