WO2025029057A1 - Image processing method using a generative model and computing device for performing the same
- Publication number
- WO2025029057A1 (PCT/KR2024/011285)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image
- area
- generation
- computing device
- input image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—Two-dimensional [2D] image generation
- G06T11/60—Creating or editing images; Combining images with text
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/048—Interaction techniques based on graphical user interfaces [GUI]
- G06F3/0484—Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0475—Generative networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
Definitions
- The present disclosure relates to a method, device, and system for editing an image using a generative model, and more particularly, to a method, device, and system for editing an image so that it reflects a transformation of an object when the position or size of the object in the image is changed.
- Generative AI technology refers to technology that learns the patterns and structures of massive training data and, on that basis, generates new data similar to the input data. Using generative AI technology, images corresponding to text can be obtained, or an image can be expanded into areas that were not included in the original image.
- Generative AI technology can be applied to the field of image processing to support outpainting or inpainting.
- Outpainting refers to expanding an image while maintaining the style and content of the image.
- Inpainting refers to generating an image to fill a specific area within the image.
- Some devices or programs support functions that allow users to change the location or size of objects included in an image.
- If part of an object is cut off in the image after the object is changed, or the object is displayed in a way that does not match its surroundings, user satisfaction may decrease.
- a method executed by at least one processor for processing an image using one or more generative models may include the steps of: receiving a user input requesting movement of at least one object included in an input image; expanding the input image in a direction determined based on the movement of the object; determining a generation required area, which is an area requiring generation of a partial image of the at least one object, based on the expanded input image; generating an image for the generation required area using at least one generative model; and outputting a reconstructed image based on the input image and the partial image of the at least one object.
- a computing device includes an input/output interface for receiving a user input requesting processing of an input image and outputting a reconstructed image processed according to the user input, a memory storing instructions for processing the input image, and at least one processor for executing the instructions, wherein, when the at least one processor receives a user input requesting movement of at least one object included in the input image, the at least one processor expands the input image in a direction determined based on the movement of the object, determines a generation required area, which is an area requiring generation of a partial image of the at least one object, based on the expanded input image, and generates an image for the generation required area using at least one generative model, and then outputs a reconstructed image based on the input image and the partial image of the at least one object.
- A computer-readable, non-transitory recording medium stores instructions executable by at least one processor, wherein, when a user input requesting movement of at least one object included in an input image is received, the instructions cause the at least one processor to expand the input image in a direction determined based on the movement of the object, determine a generation required area, which is an area requiring generation of a partial image of the at least one object, based on the expanded input image, generate an image for the generation required area using at least one generative model, and then output a reconstructed image based on the input image and the partial image of the at least one object.
- a computer program may be stored in a medium for performing at least one of the embodiments of the disclosed method on a computer.
- FIG. 1 is a diagram illustrating a process of editing an image using a generative model when transforming an object included in an image according to one embodiment of the present disclosure.
- FIG. 2 is a diagram illustrating a process of expanding an image by considering inverse transformation when an object included in the image is reduced according to one embodiment of the present disclosure.
- FIG. 3 is a diagram illustrating a process of expanding an image by considering inverse transformation when an object included in the image is rotated, according to one embodiment of the present disclosure.
- FIG. 4a is a diagram illustrating a method for expanding an image based on the size of a selected object according to one embodiment of the present disclosure.
- FIG. 4b is a diagram illustrating a process of expanding an image based on the size of the image according to one embodiment of the present disclosure.
- FIG. 5 is a diagram illustrating a process of expanding an image based on a transformable range of a selected object according to one embodiment of the present disclosure.
- FIG. 6 is a diagram illustrating a process of expanding an image by moving all borders located in the reverse transformation direction according to one embodiment of the present disclosure.
- FIG. 7 is a diagram illustrating a process of inferring an object proposal region from an image according to one embodiment of the present disclosure.
- FIGS. 8 and 9 are diagrams illustrating a process of inferring an object proposal region without expanding an input image and expanding the object proposal region by considering the inverse transformation of the object.
- FIG. 10 is a diagram illustrating a process of editing an image by using, as is, the object image generated in the process of expanding the input image by considering the inverse of the object's transformation.
- FIG. 11 is a diagram illustrating an example of a configuration of a computing device for performing image editing using a generative model according to one embodiment of the present disclosure.
- FIG. 12 is a diagram illustrating the types and roles of generative models according to one embodiment of the present disclosure.
- FIGS. 13 to 19 are flowcharts illustrating a process of editing an image using a generative model according to embodiments of the present disclosure.
- FIGS. 20 to 23 are flowcharts illustrating a process of editing an image using a generative model according to embodiments of the present disclosure.
- each block of the flowchart diagrams and combinations of the flowchart diagrams can be performed by computer program instructions.
- the computer program instructions can be installed on a processor of a general-purpose computer, a special-purpose computer, or other programmable data processing apparatus, and the instructions, when executed by the processor of the computer or other programmable data processing apparatus, can create means for performing the functions described in the flowchart block(s).
- the computer program instructions can also be stored in a computer-available or computer-readable memory that can direct a computer or other programmable data processing apparatus to implement the functions in a particular manner, and the instructions stored in the computer-available or computer-readable memory can also produce an article of manufacture that includes instruction means for performing the functions described in the flowchart block(s).
- the computer program instructions can also be installed on a computer or other programmable data processing apparatus.
- each block of the flowchart diagram may represent a module, segment, or portion of code that includes one or more executable instructions for performing a specified logical function(s).
- the functions mentioned in the blocks may occur out of order.
- two blocks depicted in succession may be performed substantially simultaneously or may be performed in reverse order depending on the functionality.
- The term '~unit' used in one embodiment of the present disclosure may represent software or a hardware component such as a Field Programmable Gate Array (FPGA) or an Application Specific Integrated Circuit (ASIC), and the '~unit' may perform a specific role. Meanwhile, the '~unit' is not limited to software or hardware.
- The '~unit' may be configured to reside in an addressable storage medium and may be configured to execute on one or more processors.
- The '~unit' may include components such as software components, object-oriented software components, class components, and task components, as well as processes, functions, properties, procedures, subroutines, segments of program code, drivers, firmware, microcode, circuits, data, databases, data structures, tables, arrays, and variables.
- The functions provided through a specific component or a specific '~unit' may be combined into a smaller number of components or separated into additional components.
- The '~unit' may include one or more processors.
- Embodiments of the present disclosure relate to a method of editing an image using a generative model. Before describing specific embodiments, the meanings of terms frequently used in this specification are defined.
- 'Editing' of an image may mean processing an image so that at least a portion of the image is changed.
- When an object in an image is transformed, a device performing image processing may generate the image as it appears after the transformation, which may be expressed as editing the image.
- Instead of 'editing' of an image, terms such as 'recomposition' of an image may also be used.
- the 'transformation' of an object may include movement of the object, resize (e.g., enlargement or reduction), rotation of the object, etc.
- the present disclosure describes processes involving movement of an object. However, those skilled in the art will understand that the present disclosure is not limited to such embodiments. Additionally, even though the embodiments are for a case where an object is moved, other types of transformations such as resize or rotation may be applied to the object. Instead of 'transformation' of an object, a term such as 'change' of an object may be used. Also, instead of 'movement' of an object, a term such as 'translation' of an object may be used.
- a 'partial object' may mean an object whose appearance is at least partially not displayed in an image. For example, if an object is located at the edge of an image so that only a part of the object is displayed in the image and the rest is not displayed in the image, the object may be referred to as a partial object. Or, for example, if a part of the object is obscured by another object or background and is not displayed, the object may be referred to as a partial object.
- a computing device may determine whether an object is a partial object based on whether the object touches at least one of the edges of an image.
- the computing device may perform object recognition on an image and determine whether the object is a partial object based on the result of recognizing the object.
- Instead of 'partial object', terms such as 'incomplete object' may be used.
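- The edge-contact criterion for a partial object can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the `Box` type, the function names, and the pixel tolerance `tol` are hypothetical, and the bounding box is assumed to come from an upstream object detector. The sketch does not cover the occlusion case, which would need the recognition-based check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in pixels; right and bottom are exclusive."""
    left: int
    top: int
    right: int
    bottom: int


def touched_borders(box: Box, width: int, height: int, tol: int = 0) -> list:
    """Return the image borders that the box touches, within `tol` pixels."""
    borders = []
    if box.left <= tol:
        borders.append("left")
    if box.top <= tol:
        borders.append("top")
    if box.right >= width - tol:
        borders.append("right")
    if box.bottom >= height - tol:
        borders.append("bottom")
    return borders


def is_partial_object(box: Box, width: int, height: int, tol: int = 0) -> bool:
    """Treat an object as partial if its box touches at least one border."""
    return len(touched_borders(box, width, height, tol)) > 0


# Hypothetical 400x300 image: one object cut off at the bottom edge,
# one object fully inside the frame.
cut_off = Box(left=150, top=180, right=260, bottom=300)
inside = Box(left=20, top=20, right=90, bottom=120)
print(touched_borders(cut_off, 400, 300))
print(is_partial_object(inside, 400, 300))
```

Returning the list of touched borders, rather than only a boolean, lets later steps select which border to move when the image is expanded.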
- a 'full object' can mean an object whose entire appearance is displayed in the image. Instead of 'full object', terms such as 'complete object' or 'whole object' can also be used.
- 'Generative AI' can refer to artificial intelligence technology that can generate new text, images, etc. in response to input data (e.g. text, images, etc.). Representative examples of generative AI are explained in the 'Generative Model' section below.
- a 'generative model' can refer to a neural network model that implements generative AI technology.
- a generative model can generate new data with similar characteristics to the input data or new data corresponding to the input data by learning the patterns and structures of training data. For example, if the input data is an image and the generative model is requested to expand the image, the generative model can generate a new image in the outer area of the original image while maintaining the content or style of the original image. Or, for example, if the input data is text containing a question, the generative model can generate and output an answer to the question.
- the term 'object proposal area' may mean an area in an image where an object is judged or predicted to exist.
- a computing device performing image processing may infer, determine, or extract a certain area (e.g., a bounding box) including an object as an object proposal area.
- the computing device may separate an object from an image through segmentation and infer (determine) only the separated object as the object proposal area.
- the computing device may infer (determine) the object proposal area to include not only the separated object as a result of performing segmentation, but also a portion of a margin around the object.
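- The variants above (bounding box, segmented object only, segmented object plus a margin) can be illustrated with a short sketch. It assumes the segmentation result is available as a binary mask given as a list of rows; the function name `proposal_area` and the pixel-based `margin` parameter are hypothetical and are not taken from the disclosure.

```python
def proposal_area(mask, margin=0):
    """Bounding box of the truthy mask pixels, padded by `margin` pixels.

    Returns (left, top, right, bottom) with right/bottom exclusive, clamped
    to the image, or None when the mask contains no object pixels.
    """
    height, width = len(mask), len(mask[0])
    rows = [y for y, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [x for x in range(width) if any(row[x] for row in mask)]
    left = max(cols[0] - margin, 0)
    top = max(rows[0] - margin, 0)
    right = min(cols[-1] + 1 + margin, width)
    bottom = min(rows[-1] + 1 + margin, height)
    return (left, top, right, bottom)


# Hypothetical 6x5 mask with a small object near the top.
mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
print(proposal_area(mask))            # tight box around the object
print(proposal_area(mask, margin=1))  # box including a 1-pixel margin
```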
- the term 'generation required area' may refer to an area where an image needs to be generated for editing the image. For example, when moving an object within an image, an image needs to be generated for a part of the object that was not visible in the original image, and the area where a new image needs to be generated may correspond to the generation required area.
- one or more generation models can generate an image for the generation required area.
- By inferring (determining) the minimum area where an image needs to be generated as the generation required area, it is possible to prevent the output image from becoming unnatural due to the instability of the generative model and to prevent image generation from taking a long time.
- 'Outpainting' may refer to a technique of expanding the outer border of an image while maintaining the style and content of the image.
- Outpainting may be used when expanding an input image considering inverse transformation.
- 'Inpainting' may refer to a technique of creating a new image and filling in a specific area within an image.
- Inpainting may be used when creating an image for a generation required area.
- It is assumed that the processes described with reference to FIGS. 1 to 10 are performed by a computing device supporting an image processing function. Therefore, in describing FIGS. 1 to 10, the computing device is described as performing the processes. Detailed configurations included in the computing device according to one embodiment are illustrated in FIG. 11, and the corresponding configurations are described in detail later.
- the computing device may perform outpainting to expand an image, or inpainting when a transformation of an object requires generation of a new image (object or background) within the image.
- the generation of such an image may be performed using a single generation model, but efficiency and performance may be improved by using generation models with different characteristics for each operation.
- the first generative model (1210) is a model for performing outpainting for image expansion.
- the second generative model (1220) and the third generative model (1230) are both models for performing inpainting for image generation after object transformation, wherein the second generative model (1220) is a model for generating an image of an object, and the third generative model (1230) is a model for generating an image of a background.
- The first generative model (1210) may be a model that prioritizes generation speed over the quality of the generated image. This is because, when expanding an image for inference (determination) of an object proposal region, fast generation is more important than the quality of the generated image.
- The second generative model (1220) may be a model that prioritizes the quality of the generated image over generation speed. This is because the image generated by the second generative model (1220) corresponds to the output image. Therefore, the second generative model (1220) may be a model with higher performance than the first generative model (1210).
- In other words, compared with the second generative model (1220), the first generative model (1210) may be a model that produces lower-quality output data in exchange for faster generation.
- The second generative model (1220) may also be a model with higher performance than the third generative model (1230).
- processes described in the following examples using different generative models may also be performed using one identical generative model or multiple identical generative models.
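- The division of roles among the three generative models can be summarized as a small lookup table. This is only an organizational sketch: the `Task` and `ModelRole` names and the `favors` field are hypothetical, the numbers are the reference numerals used in the description, and the priority of the third generative model (1230) is marked "unspecified" because the text only states that it has lower performance than the second.

```python
from dataclasses import dataclass
from enum import Enum


class Task(Enum):
    EXPAND_IMAGE = "outpainting for image expansion"
    GENERATE_OBJECT = "inpainting of an object"
    GENERATE_BACKGROUND = "inpainting of a background"


@dataclass(frozen=True)
class ModelRole:
    reference: int  # reference numeral used in the description
    task: Task
    favors: str     # "speed", "quality", or "unspecified"


ROLES = {
    Task.EXPAND_IMAGE: ModelRole(1210, Task.EXPAND_IMAGE, "speed"),
    Task.GENERATE_OBJECT: ModelRole(1220, Task.GENERATE_OBJECT, "quality"),
    Task.GENERATE_BACKGROUND: ModelRole(1230, Task.GENERATE_BACKGROUND, "unspecified"),
}

for role in ROLES.values():
    print(role.reference, role.task.value, role.favors)
```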
- FIG. 1 is a diagram for explaining a process of editing an image using a generation model when transforming an object included in an image according to one embodiment of the present disclosure.
- FIG. 1 illustrates images corresponding to the first operation (1a) to the sixth operation (1f) in the order of processing the images.
- the computing device can obtain an input image including an object (10).
- the computing device can receive a user input requesting a transformation of the object.
- the user can request a movement of the object (10) by touching the object (10) on the screen of the computing device on which the input image is displayed and moving the finger while maintaining the touch.
- The user input for selecting (specifying) the object (10) and requesting a transformation of the object (10) can also take various other forms (e.g., clicking the mouse button while the pointer is positioned on the object and moving the pointer while maintaining the click).
- the transformation of the object (10) requested by the user is a 'translation' of the object, but is not limited thereto, and the user may also request a resize (enlargement, reduction) or rotation of the object (10).
- the computing device can expand the input image considering the transformation (movement) requested by the user, or considering the inverse transformation for the transformation requested by the user. In other words, the computing device can expand the input image in a direction determined based on the movement of the object requested by the user.
- the computing device can perform outpainting using the first generation model (1210) described above. In order for the first generation model (1210) to perform the outpainting, the computing device can input a prompt instructing the first generation model (1210) to perform the outpainting.
- the computing device can input a prompt to the first generation model (1210) to expand the input image by moving a certain border of the input image in a certain direction and a certain distance, and generate an image for the expanded area while maintaining the features of the input image (e.g., the overall atmosphere and the contents of the foreground and background, etc.).
- the computing device can generate a prompt that instructs outpainting based on the user input (transformation request) and the input image. That is, the computing device can generate a prompt that instructs the first generative model (1210) to perform outpainting in the manner described in the embodiments below.
- a detailed description of a specific method in which the computing device expands the input image by considering the inverse transformation in the third operation (1c) is as follows.
- the computing device can determine whether an object (10) that a user has requested to transform (move), i.e., an object (10) selected by a user input, is in contact with at least one border among the borders of an input image. Referring to FIG. 1, it can be seen that a first border (B1) among the borders of the input image is in contact with the object (10). Accordingly, the computing device can select the first border (B1) that is in contact with the object (10) among the borders of the input image.
- The computing device can move the first border (B1) by the distance and in the direction given by the inverse of the transformation (movement) requested by the user input.
- That is, the computing device can move the first border (B1) in the direction opposite to the movement of the object (10) requested by the user input in the second operation (1b).
- In the example of FIG. 1, the user input received by the computing device in the second operation (1b) includes a transformation request to move the object (10) in the upper left diagonal direction.
- Accordingly, the computing device can move the first border (B1) in the lower right diagonal direction.
- the distance by which the first border (B1) is moved can be determined based on the distance by which the object (10) is moved by the user input in the second operation (1b). For example, the distance by which the first border (B1) is moved can be the same as the distance by which the object (10) is moved by the user input in the second operation (1b).
- the computing device can expand the input image by generating (outpainting) an image up to the area where the first border (B1) is moved using the first generation model (1210).
- the computing device can generate a prompt that instructs the first generation model (1210) to generate an image up to the area where the first border (B1) is moved while maintaining the characteristics of the input image.
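- The border movement for a translation can be sketched as a padding calculation. This is a simplified illustration, not the disclosed implementation: the function names are hypothetical, image coordinates are assumed to have x pointing right and y pointing down, and only the component of the inverse movement that points outward through a touched border is assumed to enlarge the canvas. The outpainting call to the first generative model (1210) is omitted.

```python
def expansion_padding(move, touched):
    """Padding in pixels per side for a requested object movement.

    move: (dx, dy) of the object, x to the right and y downward.
    touched: names of the borders that the selected object touches.
    """
    dx, dy = move
    inv_dx, inv_dy = -dx, -dy  # inverse of the requested movement
    # Component of the inverse movement along each border's outward normal.
    outward = {"left": -inv_dx, "top": -inv_dy, "right": inv_dx, "bottom": inv_dy}
    touched = set(touched)
    return {
        side: max(outward[side], 0) if side in touched else 0
        for side in ("left", "top", "right", "bottom")
    }


def expanded_size(width, height, pad):
    """Canvas size after applying the per-side padding."""
    return (width + pad["left"] + pad["right"], height + pad["top"] + pad["bottom"])


# Object touching the bottom border is dragged toward the upper left,
# so the bottom border moves downward by the vertical drag distance.
pad = expansion_padding(move=(-40, -30), touched=["bottom"])
print(pad)
print(expanded_size(400, 300, pad))
```

Under these assumptions, dragging the object further into the image exposes more of its hidden part, so the canvas grows on the side where the object was cut off; dragging it toward the touched border yields no expansion.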
- the computing device can also expand the input image in various other ways.
- the method by which the computing device expands the input image may vary even when a transformation other than movement (resize, rotation, etc.) is applied to the object (10). This will be described in detail with reference to FIGS. 2 to 6.
- FIG. 2 is a diagram for explaining a method of expanding an image by considering inverse transformation when an object included in the image is reduced according to one embodiment of the present disclosure.
- FIG. 2 illustrates images corresponding to the first operation (2a) to the third operation (2c) in the order of image processing.
- In a first operation (2a), the computing device can obtain an input image including an object (10).
- The computing device can receive a user input requesting reduction of the object (10). For example, a user can request reduction of the object (10) by touching the object (10) with two fingers on a screen of the computing device on which the input image is displayed and performing a pinch-in gesture.
- The computing device can move the first border (B1) according to the inverse of the transformation (reduction) requested by the user input.
- The computing device can move the first border (B1) so that the input image expands in the direction of the first border (B1) by the ratio at which the object (10) is reduced by the user input (e.g., if the size of the object (10) is reduced by 15% by the user input, the computing device moves the first border (B1) so that the input image expands in the direction of the first border (B1) by 15%).
- the computing device can extend the length of the first border (B1) by moving the first border (B1) in a direction away from the center of the object (10).
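- The ratio-based expansion for a reduction can be worked through numerically. The sketch below assumes that "expanding in the direction of the first border (B1) by 15%" means the image extent along the axis perpendicular to that border grows by 15%; this interpretation, the function name, and the rounding to whole pixels are assumptions, not statements from the disclosure.

```python
def expand_for_reduction(width, height, border, reduction):
    """Return the (width, height) of the canvas after moving `border` outward.

    reduction: fraction by which the object is reduced, e.g. 0.15 for 15%.
    """
    if not 0 <= reduction < 1:
        raise ValueError("reduction must be in [0, 1)")
    if border in ("left", "right"):
        return (width + round(width * reduction), height)
    if border in ("top", "bottom"):
        return (width, height + round(height * reduction))
    raise ValueError(f"unknown border: {border}")


# A 400x300 image whose object touches the bottom border is reduced by 15%.
print(expand_for_reduction(400, 300, "bottom", 0.15))
```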
- FIG. 3 is a diagram for explaining a method of expanding an image by considering inverse transformation when an object included in the image is rotated according to one embodiment of the present disclosure.
- FIG. 3 illustrates images corresponding to the first operation (3a) to the third operation (3c) in the order of processing the images.
- the computing device can obtain an input image including an object (10).
- the computing device can receive a user input requesting rotation of the object (10). For example, the user can touch the object (10) with two fingers on a screen of the computing device on which the input image is displayed, and rotate the object (10) while maintaining the touch.
- the computing device can expand the input image by considering the inverse transformation for the transformation (rotation) requested by the user. Specifically, the computing device can determine whether the object (10) for which the user requested rotation, i.e., the object (10) selected by the user input, touches at least one border among the borders of the input image. Referring to FIG. 3, it can be seen that the first border (B1) among the borders of the input image touches the object (10). Therefore, the computing device can select the first border (B1) that touches the object (10) among the borders of the input image.
- the computing device can expand the input image by generating an image up to the area where the first border (B1) is moved using the first generation model (1210).
- the computing device can expand the input image even before a request for transformation of the object (10) is received when the object (10) is selected by a user input.
- the user can request the movement of the object (10) by touching the object (10) included in the input image with a finger for a predetermined period of time and moving the finger while maintaining the touch, and the computing device can expand the input image in advance before the user moves the finger.
- the computing device can improve the speed at which the image is edited by expanding the image in advance before the user makes a decision on the transformation. From the user's perspective, the time taken for the edited image to be output is shortened, so the user experience can be improved.
- The computing device may expand an input image in advance, before a user input requesting selection of an object (10) or transformation of an object (10) is received, and then infer (determine) an object proposal region based on the user input when the user input is received.
- The computing device may expand an original image by a predetermined ratio (e.g., 1.5 times) through outpainting, temporarily store the expanded original image, and then perform a process for inferring (determining) an object proposal region when a user input is received.
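- The expand-in-advance behavior can be sketched as a small cache. Everything here is hypothetical scaffolding: `PreExpansionCache`, the `outpaint` callable, and `fake_outpaint` are stand-ins, and the stand-in only scales an image size instead of generating pixels. The point of the sketch is that the costly outpainting step runs once, before the user input arrives, and is reused afterwards.

```python
class PreExpansionCache:
    """Temporarily stores images that were expanded ahead of user input."""

    def __init__(self, outpaint, ratio=1.5):
        self._outpaint = outpaint  # callable(image, ratio) -> expanded image
        self._ratio = ratio
        self._cache = {}

    def warm(self, image_id, image):
        """Expand ahead of time, e.g. when the image is opened."""
        if image_id not in self._cache:
            self._cache[image_id] = self._outpaint(image, self._ratio)

    def get(self, image_id, image):
        """Return the expanded image, expanding on demand on a cache miss."""
        self.warm(image_id, image)
        return self._cache[image_id]


calls = []


def fake_outpaint(size, ratio):
    """Stand-in for a generative model: scales a (width, height) tuple."""
    calls.append(size)
    return (round(size[0] * ratio), round(size[1] * ratio))


cache = PreExpansionCache(fake_outpaint)
cache.warm("photo-1", (400, 300))               # before any user input
expanded = cache.get("photo-1", (400, 300))     # when the user input arrives
print(expanded, len(calls))
```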
- the computing device can expand the image based on the size or position of the selected object (10), the size of the input image, etc.
- The computing device can move the border of the input image by a distance determined based on the length of the selected object (10) (the length measured along an axis parallel to the direction of movement of the border), or can move the border of the input image by a distance determined based on the range in which the selected object (10) can be transformed within the input image (e.g., based on how far and in which direction the object (10) can be moved within the input image).
- The computing device can also move the border of the input image by a distance determined based on the length (width or height) of the input image. This will be described in detail below with reference to FIGS. 4a, 4b, and 5.
- FIG. 4a is a diagram for explaining a method of expanding an image based on the size of a selected object according to one embodiment of the present disclosure.
- FIG. 4a illustrates images corresponding to the first operation (4aa) to the third operation (4ac) in the order of processing the images.
- In a first operation (4aa), the computing device can obtain an input image including an object (10).
- the computing device can receive a user input for selecting the object (10).
- the user may select the object (10) as a target of transformation by touching the object (10) with a finger for a predetermined period of time, but may not have yet made an input requesting transformation (movement, resizing, rotation, etc.) of the object (10).
- the computing device may also receive a user input requesting transformation of the object (10).
- the computing device can expand the input image based on the size of the object (10).
- the computing device can determine whether the selected object (10) touches at least one border among the borders of the input image. Referring to FIG. 4a, it can be seen that the first border (B1) among the borders of the input image touches the object (10). Therefore, the computing device can select the first border (B1) among the borders of the input image that touches the object (10).
- the computing device can move the first border (B1) away from the center of the object (10) and determine the distance by which the first border (B1) is to be moved based on the size of the object (10). According to one embodiment, the computing device can measure the length (d1) of the object (10) based on an axis parallel to the moving direction of the first border (B1). Also, according to one embodiment, the computing device can measure the length (d1) of the object (10) based on an axis perpendicular to the first border (B1).
- the computing device can move the first border (B1) by a distance (d2) that is obtained by multiplying the length (d1) of the measured object (10) by a preset ratio (e.g. 0.2).
- the preset ratio used in calculating the distance (d2) can be set to an appropriate value depending on conditions such as the required image quality or processing speed.
- the computing device can expand the input image by generating an image up to the area where the first border (B1) is moved using the first generation model (1210).
- FIG. 4b is a diagram for explaining a method of expanding an image based on the size of the image according to one embodiment of the present disclosure.
- FIG. 4b illustrates images corresponding to the first operation (4ba) to the third operation (4bc) in the order of processing the images.
- In the first operation (4ba), the computing device can obtain an input image including an object (10).
- the computing device can receive a user input for selecting the object (10).
- the user may select the object (10) as a target of transformation by touching the object (10) with a finger for a predetermined period of time in the second operation (4bb), but may not have yet made an input requesting transformation (movement, resizing, rotation, etc.) of the object (10).
- the computing device may also receive a user input requesting transformation of the object (10).
- the computing device can expand the input image based on the size of the input image. For example, the computing device can expand the input image based on the length (width or height) of the input image.
- the computing device can determine whether the selected object (10) touches at least one border among the borders of the input image. Referring to FIG. 4b, it can be seen that the first border (B1) among the borders of the input image touches the object (10). Therefore, the computing device can select the first border (B1) among the borders of the input image that touches the object (10).
- the computing device can move the first border (B1) away from the center of the object (10) and determine the distance by which the first border (B1) is to be moved based on the length (width or height) of the input image. According to one embodiment, the computing device can measure the length (d1) of the input image based on an axis perpendicular to the first border (B1). Also, according to one embodiment, the computing device can measure the length (d1) of the input image based on an axis parallel to the movement direction of the first border (B1).
- the computing device can move the first border (B1) by a distance (d2) that is obtained by multiplying the length (d1) of the measured image by a preset ratio (e.g. 0.1).
- the preset ratio used in calculating the distance (d2) can be set to an appropriate value depending on conditions such as the required image quality or processing speed.
- the computing device can expand the input image by generating an image up to the area where the first border (B1) is moved using the first generation model (1210).
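The variant of FIG. 4b, in which the distance depends on the length of the input image rather than of the object, might look like the sketch below. The function names are hypothetical, and the ratio 0.1 is the example value given above.

```python
def shift_from_image(width: int, height: int, border: str, ratio: float = 0.1) -> int:
    """Distance d2 = d1 * ratio, where d1 is the image length measured
    perpendicular to the selected border."""
    if border in ("left", "right"):
        d1 = width
    elif border in ("top", "bottom"):
        d1 = height
    else:
        raise ValueError(f"unknown border: {border}")
    return round(d1 * ratio)


def expanded_size(width: int, height: int, border: str, ratio: float = 0.1) -> tuple[int, int]:
    """Size of the image after the selected border has been moved outward."""
    d2 = shift_from_image(width, height, border, ratio)
    if border in ("left", "right"):
        return (width + d2, height)
    return (width, height + d2)
```

For a 200x100 image whose right border touches the object, the expanded canvas would be 220x100 under these assumptions.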
- FIG. 5 is a diagram for explaining a method of expanding an image based on a transformable range of a selected object according to one embodiment of the present disclosure.
- FIG. 5 illustrates images corresponding to the first operation (5a) to the third operation (5c) in the order of processing the images.
- the computing device can obtain an input image including an object (10).
- the computing device can receive a user input for selecting the object (10). For example, the user may select the object (10) as a target of transformation by touching the object (10) with a finger for a predetermined period of time in the second operation (5b), but may not yet have made an input requesting transformation (movement, resizing, rotation, etc.) of the object (10).
- the computing device may also receive a user input requesting transformation of the object (10).
- the computing device can expand the input image based on the transformable range of the object (10). Specifically, the computing device can determine the range in which the object (10) can be transformed (e.g. moved) within the input image based on the size and position of the object (10), and expand the input image based on the determined range.
- the computing device can determine whether the selected object (10) is in contact with at least one border among the borders of the input image. Referring to FIG. 5, it can be seen that the first border (B1) among the borders of the input image is in contact with the object (10). Accordingly, the computing device can select the first border (B1) that is in contact with the object (10) among the borders of the input image.
- the computing device can move the first border (B1) away from the center of the object (10) and determine the distance by which the first border (B1) is to be moved based on the transformable range of the object (10).
- the computing device can measure the distance (d1) by which the object (10) can move based on an axis parallel to the moving direction of the first border (B1) in the third operation (5c).
- the computing device can move the first border (B1) by a distance (d2) obtained by multiplying the measured distance (d1) by a preset ratio (e.g. 1).
- the preset ratio used in calculating the distance (d2) can be set to an appropriate value depending on conditions such as the quality of the required image or the processing speed.
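A possible reading of the FIG. 5 calculation is sketched below. It assumes, as an interpretation not spelled out in the text, that the distance (d1) by which the object can move is the free space between the object and the border opposite to the selected border. All names are hypothetical.

```python
def movable_distance(obj_left: int, obj_top: int, obj_right: int, obj_bottom: int,
                     width: int, height: int, border: str) -> int:
    """Assumed d1: how far the object can move away from the touched border
    while staying inside the input image (right/bottom are exclusive)."""
    if border == "right":
        return obj_left
    if border == "left":
        return width - obj_right
    if border == "bottom":
        return obj_top
    if border == "top":
        return height - obj_bottom
    raise ValueError(f"unknown border: {border}")


def shift_from_range(d1: int, ratio: float = 1.0) -> int:
    """Distance d2 = d1 * ratio (the text gives 1 as an example ratio)."""
    return round(d1 * ratio)
```

With a ratio of 1, the border is moved by the full distance the object could travel, so the expanded area would cover the worst-case movement under this interpretation.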
- FIG. 6 is a diagram for explaining a method of expanding an image by moving all borders located in the reverse transformation direction according to one embodiment of the present disclosure.
- FIG. 6 illustrates images corresponding to the first operation (6a) to the third operation (6c) in the order of processing the images.
- In the first operation (6a), the computing device can obtain an input image including an object (10).
- the computing device can receive a user input requesting movement of the object (10). For example, the user can request movement of the object (10) by touching the object (10) on the screen of the computing device on which the input image is displayed and moving the finger while maintaining the touch.
- the computing device can expand the input image by considering the inverse transformation for the transformation (movement) requested by the user. Specifically, the computing device can select all the borders located in the inverse transformation direction among the borders of the input image.
- the user input received by the computing device includes a transformation request to move the object (10) in the upper left diagonal direction. Since the inverse transformation direction is the lower right direction, the computing device can select the borders (B1, B2) located on the lower and right sides.
- the computing device can move the first border (B1) and the second border (B2) by a distance and direction according to the inverse transformation (movement) according to the user input. For example, the computing device can move the first border (B1) downward by the distance by which the object (10) moves upward by the user input received in the second operation (6b). Similarly, the computing device can move the second border (B2) rightward by the distance by which the object (10) moves leftward by the user input received in the second operation (6b).
- the computing device can expand the input image by generating an image up to the area where the borders (B1, B2) are moved using the first generation model (1210).
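The border selection of FIG. 6 can be illustrated with a short sketch. It assumes image coordinates in which x grows to the right and y grows downward, and a hypothetical function name; the movement vector `(dx, dy)` stands for the movement requested by the user.

```python
def inverse_border_shifts(dx: int, dy: int) -> dict[str, int]:
    """Map each border lying in the inverse transformation direction to the
    distance it is moved outward. (dx, dy) is the requested object movement,
    with x growing rightward and y growing downward."""
    shifts: dict[str, int] = {}
    if dx < 0:
        shifts["right"] = -dx   # object moves left, so the right border moves right
    elif dx > 0:
        shifts["left"] = dx     # object moves right, so the left border moves left
    if dy < 0:
        shifts["bottom"] = -dy  # object moves up, so the bottom border moves down
    elif dy > 0:
        shifts["top"] = dy      # object moves down, so the top border moves up
    return shifts
```

For the upper-left movement in the example above, the lower and right borders are selected and each is moved by the corresponding component of the movement.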
- When the input image has been expanded in the third operation (1c), the computing device can infer (determine) the object proposal area (100a) in the fourth operation (1d). Meanwhile, the image included in the area expanded in the third operation (1c) can also be used as a guide when performing inpainting (e.g., generating an image for an area requiring generation) after object transformation. In other words, when performing inpainting after object transformation, the computing device can refer to the image generated when expanding the input image in the third operation (1c), and thus the image generation efficiency can be improved.
- the 'object proposal region' may mean a region in which an object (10) is judged or predicted to exist within an image.
- the computing device may infer (determine) a region including an object (10) in an extended input image as an object proposal region (100a).
- the computing device may infer (determine) an object proposal region (100b) corresponding to the object (10) transformed (moved) in a fifth operation (1e) based on the object proposal region (100a) inferred (determined) in a fourth operation (1d).
- Hereinafter, the object proposal region (100a) inferred (determined) in the fourth operation (1d) is referred to as a first object proposal region, and the object proposal region (100b) inferred (determined) in the fifth operation (1e) is referred to as a second object proposal region.
- the computing device can infer (determine) the first object proposal area (100a) in the form of a bounding box.
- the computing device can infer (determine) the bounding box including the object (10) as the first object proposal area (100a) through a method used in general pedestrian detection technology, and can also infer (determine) the first object proposal area (100a) through various other methods.
- the computing device can infer (determine) the first object proposal area (100a) in a form other than a bounding box, which will be described in detail with reference to FIG. 7.
- FIG. 7 is a diagram for explaining a method for inferring (determining) an object proposal region from an image according to one embodiment of the present disclosure.
- FIG. 7 illustrates three examples of different methods for inferring (determining) an object proposal region on the first to third screens (7a, 7b, and 7c).
- the first screen (7a) of Fig. 7 illustrates an example in which a computing device infers (determines) an object proposal area (700a) in the form of a bounding box containing an object (10). This is the same as the method described in the fourth operation (1d) of Fig. 1 above.
- the second screen (7b) of Fig. 7 illustrates an example in which a computing device separates an object (10) through segmentation and infers (determines) only the separated object (10) as an object proposal area (700b).
- the third screen (7c) of Fig. 7 illustrates an example in which a computing device separates an object (10) through segmentation and infers (determines) an object proposal area (700c) to include not only the separated object (10) but also a portion of the margin around the object (10).
- the embodiments are described on the assumption that the computing device infers (determines) the object proposal region in the form of a bounding box, but the computing device may also infer (determine) the object proposal region in the other two ways illustrated in FIG. 7, and may also infer (determine) the object proposal region in various other ways.
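The three forms of object proposal region in FIG. 7 can be compared on a small binary mask. The sketch below is illustrative only: it assumes the segmentation result is already available as a 0/1 grid, and it uses a simple square dilation as one possible way to add a margin; the text does not prescribe how the margin is computed.

```python
Mask = list[list[int]]


def bounding_box_region(mask: Mask) -> Mask:
    """Screen 7a: the smallest axis-aligned box containing the object."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    if not rows:
        raise ValueError("mask contains no object pixels")
    top, bottom, left, right = rows[0], rows[-1], cols[0], cols[-1]
    return [[1 if top <= r <= bottom and left <= c <= right else 0
             for c in range(len(mask[0]))] for r in range(len(mask))]


def segmentation_region(mask: Mask) -> Mask:
    """Screen 7b: only the segmented object itself."""
    return [row[:] for row in mask]


def segmentation_with_margin(mask: Mask, margin: int = 1) -> Mask:
    """Screen 7c: the segmented object plus a margin (square dilation)."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            if mask[r][c]:
                for rr in range(max(0, r - margin), min(h, r + margin + 1)):
                    for cc in range(max(0, c - margin), min(w, c + margin + 1)):
                        out[rr][cc] = 1
    return out


# Hypothetical 5x5 segmentation mask of an L-shaped object.
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
bbox = bounding_box_region(mask)
seg = segmentation_region(mask)
seg_margin = segmentation_with_margin(mask, margin=1)
```

The bounding box covers background pixels that the pure segmentation excludes, while the margin variant sits between the two in how much surrounding context it carries.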
- After the computing device infers (determines) a first object proposal region (100a) from the expanded input image in the fourth operation (1d), it can transform the object (10) and infer (determine) a second object proposal region (100b) corresponding to the transformed object (10) in the fifth operation (1e).
- the operation performed by the computing device in the fifth operation (1e) is described in detail as follows.
- the computing device can transform the object (10) according to the request included in the user input received in the second operation (1b). That is, the computing device can move the object (10) in the upper left direction on the input image.
- the computing device can infer (determine) the second object proposal area (100b) corresponding to the object (10) at the position after the movement based on the first object proposal area (100a) inferred (determined) in advance.
- the computing device can infer (determine) the second object proposal area (100b) corresponding to the moved object (10) by reflecting the size and shape of the first object proposal area (100a), the positional relationship between the first object proposal area (100a) and the object (10), etc.
- the object (10) included in the second object proposal area (100b) may be smaller than the object (10) included in the first object proposal area (100a) because it is extracted from an unexpanded input image.
- the second object proposal area (100b) may have the same size as the first object proposal area (100a), so there may be an area (200) in the second object proposal area (100b) where generation of an additional image of the object (10) is required.
- the computing device may infer (determine) this area (200) as a generation required area.
- the computing device may infer (determine) the second object proposal area (100b) by moving the first object proposal area (100a) in a distance and direction in which the object (10) moves according to a user's request.
- the computing device can infer (determine) the generation-required region (200) based on the second object proposal region (100b). Specifically, the computing device can infer (determine) the region where an image for the transformed object (10) must be additionally generated as the generation-required region (200) by comparing the transformed object (10) and the second object proposal region (100b).
- the 'region requiring generation' may mean a region requiring generation of an image for editing the image.
- the region requiring generation may be divided into an object portion and a background portion.
- the portion (200) requiring additional generation of the object (10) image may be referred to as a first region requiring generation
- the empty spaces (parts requiring additional generation of a background image) created due to transformation (movement) of the object (10) may be referred to as a second region requiring generation. That is, the computing device may infer (determine) the region requiring generation of a background image due to transformation of the object (10) as the second region requiring generation.
- the computing device can infer (determine) an area including an object (10) in the expanded input image as a first object proposal area (100a), and move (transform) the object (10) on the input image according to a user input received in the second operation (1b). Then, the computing device can infer (determine) a second object proposal area (100b) corresponding to the moved object (10) based on the first object proposal area (100a). Finally, the computing device can infer (determine) a first generation-required area (200) and a second generation-required area (empty spaces created due to the movement of the object (10)) based on the second object proposal area (100b).
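The sequence summarized above can be traced on pixel sets. The sketch below rests on two assumptions that are not stated in the text: the first generation-required area is taken to be the part of the first object proposal area that lay outside the original input image (where no original pixels exist), and the second generation-required area is taken to be the pixels the object vacated. All names are hypothetical.

```python
def rect(x0: int, y0: int, x1: int, y1: int) -> set[tuple[int, int]]:
    """Pixel set for the box [x0, x1) x [y0, y1)."""
    return {(x, y) for x in range(x0, x1) for y in range(y0, y1)}


def shift(pixels: set[tuple[int, int]], dx: int, dy: int) -> set[tuple[int, int]]:
    """Translate a pixel set by (dx, dy)."""
    return {(x + dx, y + dy) for x, y in pixels}


def plan_regions(image, object_pixels, first_proposal, dx, dy):
    """Return (second proposal area, first required area, second required area)."""
    second_proposal = shift(first_proposal, dx, dy)
    moved_object = shift(object_pixels, dx, dy)
    # Assumption: the part of the first proposal area outside the original
    # image has no original pixels, so it must be generated after the move.
    first_required = shift(first_proposal - image, dx, dy) & image
    # Assumption: pixels the object vacated and no longer covers.
    second_required = (object_pixels - moved_object) & image
    return second_proposal, first_required, second_required


# Hypothetical 10x10 image; the object touches the right border and the
# expanded image extends the proposal area two pixels past that border.
image = rect(0, 0, 10, 10)
object_pixels = rect(7, 4, 10, 7)
first_proposal = rect(7, 4, 12, 7)
second_proposal, first_required, second_required = plan_regions(
    image, object_pixels, first_proposal, dx=-4, dy=-2
)
```

After the upper-left movement, the strip that was previously cut off by the border lands inside the image and becomes the first generation-required area, while the object's old location becomes the second.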
- the computing device may infer (determine) the region requiring generation (200) by considering the object (10). For example, the computing device may determine the region requiring generation of an image of the object (10) based on the size of the object (10) and the direction of the object (10) (e.g., the direction in which the front of the object (10) is facing), and infer (determine) the determined region as the region requiring generation (200).
- the computing device may identify the object (10) by performing object recognition on an input image (e.g., identify the type of the object), and determine the region requiring generation of an image of the object (10) based on the identification result of the object (10) (e.g., the type of the object), and infer (determine) the determined region as the region requiring generation (200).
- the computing device may infer (determine) the entire second object proposal area (100b) as a region requiring generation. Accordingly, the computing device may generate a new image for the entire second object proposal area (100b) using the second generation model (1220) for object image generation. In this case, since the image for the entire second object proposal area (100b) is generated by reflecting the surrounding background at the position after the transformation of the object (10), the effect of generating the image to better match the surrounding background can be expected.
- the computing device may input a prompt instructing the generation model to inpaint a region requiring generation.
- the computing device may generate a prompt instructing the generation model to generate an image while maintaining the features of the input image (e.g., the overall atmosphere and the contents of the foreground and background, etc.) for the first generation requiring region (200) inferred (determined) in the fifth operation (1e), and input the prompt to the second generation model (1220).
- the computing device may generate a prompt instructing inpainting based on a user input (conversion request) and the input image.
- the image generated when the input image is expanded in the third operation (1c) can be used as a guide when generating an image for the first generation-required area (200).
- the computing device can generate an image of an object (10) in the first generation-required area (200) similar to the image of the object (10) included in the expanded input image in the third operation (1c), but can generate the image so as to match the background in the first generation-required area (200).
- the computing device can generate a prompt that instructs to generate an image for the first generation-required area (200) while maintaining the characteristics of the image of the expanded area in the third operation (1c), and input the prompt into the second generation model (1220).
- FIGS. 8 and 9 are diagrams for explaining embodiments of a method for inferring (determining) an object proposal region without expanding an input image and expanding the object proposal region by considering the inverse transformation of the object transformation.
- FIG. 8 illustrates images corresponding to the first operation (8a) to the eighth operation (8h) in the order of processing images.
- FIG. 9 illustrates images corresponding to the first operation (9a) to the seventh operation (9g) in the order of processing images.
- the computing device can obtain an input image including an object (10).
- the computing device can receive a user input requesting transformation of the object.
- the user can request movement of the object (10) by touching the object (10) on the screen of the computing device on which the input image is displayed and moving the finger while maintaining the touch.
- the transformation of the object (10) requested by the user is a 'movement' of the object, but is not limited thereto, and the user can also request resizing (enlarging, reducing) or rotating the object (10).
- the computing device can infer (determine) an area including an object (10) in the input image as an object proposal area (100a).
- the computing device can infer (determine) an object proposal area (100b) corresponding to the object (10) transformed (moved) in the fourth operation (8d) based on the object proposal area (100a) inferred (determined) in the third operation (8c).
- Hereinafter, the object proposal area (100a) inferred (determined) in the third operation (8c) is referred to as a first object proposal area, and the object proposal area (100b) inferred (determined) in the fourth operation (8d) is referred to as a second object proposal area.
- the computing device can infer (determine) the first object proposal area (100a) in the form of a bounding box.
- the computing device can infer (determine) the bounding box including the object (10) as the first object proposal area (100a) through a method used in general pedestrian detection technology, and can also infer (determine) the first object proposal area (100a) through various other methods.
- the computing device can infer (determine) the first object proposal area (100a) in a form other than a bounding box, as described above with reference to FIG. 7.
- After the computing device infers (determines) the first object proposal area (100a) from the input image in the third operation (8c), it can transform the object (10) and infer (determine) the second object proposal area (100b) corresponding to the transformed object (10) in the fourth operation (8d).
- the operation performed by the computing device in the fourth operation (8d) is described in detail as follows.
- the computing device can transform the object (10) according to the request included in the user input received in the second operation (8b). That is, the computing device can move the object (10) in the upper left direction on the input image.
- the computing device can infer (determine) the second object proposal area (100b) corresponding to the object (10) at the position after the movement based on the first object proposal area (100a) extracted in advance.
- the computing device can infer (determine) the second object proposal area (100b) corresponding to the moved object (10) by reflecting the size and shape of the first object proposal area (100a), the positional relationship between the first object proposal area (100a) and the object (10), etc.
- the computing device can expand the second object proposal area (100b) by considering the inverse transformation of the transformation of the object (10).
- the method of expanding the second object proposal area (100b) by considering the inverse transformation may be similar to the third operation (1c) of FIG. 1 and the method described with reference to FIGS. 2 to 6.
- in the previous embodiments, at least one border that touches the object (10) among the borders of the input image is moved by a distance and in a direction according to the inverse transformation.
- in this embodiment, by contrast, at least one border that touches the object (10) among the borders of the second object proposal area (100b) is moved by a distance and in a direction according to the inverse transformation, and an image can be generated (outpainted) using the first generation model (1210) up to the area where the border is moved.
- the method of moving the border according to the transformation of the object (10) may be the same as described in the previous embodiments.
- the extended second object proposal area (100c) considering the inverse transformation is illustrated in the sixth operation (8f).
- the computing device can infer (determine) the generation-required region (200) based on the extended second object proposal region (100c).
- a detailed description of how the computing device infers (determines) the generation-required region (200) is as follows.
- the computing device can transform (move) the object (10) according to a request included in a user input, and apply the extended second object proposal region (100c) to the transformed object (10).
- the computing device can infer (determine) the region in which an image for the transformed object (10) must be additionally generated as the generation-required region (200) by comparing the transformed object (10) and the extended second object proposal region (100c).
- the transformed object (10) may be smaller than the object (10) included in the extended second object proposal area (100c) of the sixth operation (8f), because the transformed object (10) is extracted from the non-extended input image. Accordingly, in the extended second object proposal area (100c), there may be an area (200) where generation of an additional image of the object (10) is required. The computing device may infer (determine) this area (200) as a region requiring generation.
- the computing device can infer (determine) the region requiring generation (200) based on the extended second object proposal region (100c). Specifically, the computing device can infer (determine) the region requiring generation (200) by comparing the transformed object (10) and the extended second object proposal region (100c).
- the region requiring generation can be divided into an object portion and a background portion.
- the portion (200) requiring additional generation of the object (10) image can be referred to as a first region requiring generation
- the empty spaces (parts requiring additional generation of a background image) created due to transformation (movement) of the object (10) can be referred to as a second region requiring generation. That is, the computing device can infer (determine) the region where the background image must be generated due to transformation of the object (10) as the second region requiring generation.
- the computing device can generate images for the first generation-required area (200) and the second generation-required area (empty spaces created due to movement of the object (10)), edit the images, and output the edited images.
- the computing device can generate an image for the first generation-required area (200) using the second generation model (1220), and can generate an image for the second generation-required area using the third generation model (1230).
- the expanded second object proposal region (100c) of the sixth operation (8f) already includes an image of the expanded object (10).
- nevertheless, the computing device compares the object (10) before expansion (the object whose position has only been moved due to transformation) with the expanded second object proposal region (100c) to infer (determine) the region requiring generation (200), and generates an image for the region requiring generation (200) using the second generation model (1220), thereby expanding the object (10).
- the reason for this is as follows.
- the image included in the extended second object proposal area (100c) is an image generated using the first generation model (1210).
- the first generation model (1210) is a model used to 'temporarily' expand an image in order to infer (determine) the generation-required area (200).
- the second generation model (1220) is a model used when generating an image to be output, and can generate a more precise image than the first generation model (1210).
- the computing device expands the second object proposal area (100b) using the first generation model (1210), and then infers (determines) the generation-required area (200) based on the expanded second object proposal area (100c), and generates an image for the generation-required area (200) again using the second generation model (1220), thereby improving the quality of the output image.
- the computing device can infer (determine) only the empty spaces (parts requiring additional generation of a background image) resulting from the transformation (movement) of the object (10) as the generation-required area (second generation-required area).
- the computing device can output an edited image by using the image of the object (10) included in the extended second object proposal area (100c) as it is and generating an image for the second generation-required area using the third generation model.
- Hereinafter, the method according to the embodiment illustrated in Fig. 1 is referred to as the first method, and the method according to the embodiment illustrated in Fig. 8 is referred to as the second method.
- In the first method, the image is expanded at the position before the object (10) is transformed, so the image is generated by reflecting the background at the position before the object (10) is transformed, and the object proposal area can be inferred (determined) accordingly.
- the object proposal area (100c) of the second method is inferred (determined) to better match the surroundings of the position after transformation of the object (10), and therefore, it is expected that a more natural output image will be obtained when using the second method.
- Fig. 10 is a drawing for explaining an embodiment of editing an image by using, as it is, the object image generated in the process of expanding an input image in consideration of the inverse transformation of the object transformation.
- Fig. 10 shows images corresponding to the first operation (10a) to the sixth operation (10f) in the order of image processing.
- the computing device can transform the object (10) according to a request included in the user input received in the second operation (10b). That is, the computing device can move the object (10) in the upper left direction on the input image.
- the computing device can infer (determine) a second object proposal area (100b) corresponding to the object (10) at the position after the movement, based on the first object proposal area (100a) inferred (determined) in advance.
- the computing device can infer (determine) a second object proposal area (100b) corresponding to the moved object (10) by reflecting the size and shape of the first object proposal area (100a), and the positional relationship between the first object proposal area (100a) and the object (10).
- the computing device can insert an image of an object (10) included in the first object proposal area (100a) into the second object proposal area (100b). Then, the computing device can infer (determine) only the empty spaces (areas where additional generation of a background image is required) resulting from the transformation (movement) of the object (10) as the generation-required area (the second generation-required area).
- the computing device can output an edited image by generating an image for the second generation required area using the third generation model.
- the computing device can increase the processing speed by using the image of the object (10) generated by the first generation model (1210) as it is at the position after the transformation of the object (10) when expanding the input image for inference (determination) of the object proposal area.
- the computing device can adjust the color or brightness, etc. of the image of the object (10) according to the surrounding background by using a harmonizer or the like so that the image of the object (10) generated by the first generation model (1210) can blend well with the surrounding background at the position after the transformation of the object (10).
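As a toy stand-in for the harmonizer mentioned above, the sketch below shifts the brightness of the object pixels toward the mean brightness of the surrounding background. This is only mean-brightness matching on grayscale values in the range 0 to 255, chosen for illustration; the text does not specify how the harmonizer works, and the function name and `strength` parameter are assumptions.

```python
from statistics import mean


def match_brightness(object_values: list[int], background_values: list[int],
                     strength: float = 1.0) -> list[int]:
    """Shift object pixel values so that their mean moves toward the mean of
    the surrounding background. strength=1.0 matches the means exactly;
    results are clamped to the range 0..255."""
    if not object_values or not background_values:
        return list(object_values)
    offset = (mean(background_values) - mean(object_values)) * strength
    return [min(255, max(0, round(v + offset))) for v in object_values]
```

For example, an object with values `[100, 110, 120]` placed against a background with values `[140, 160]` would be brightened by 40 at full strength, or by 20 with `strength=0.5`.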
- FIG. 11 is a diagram for explaining the configuration of a computing device for performing image editing using a generative model according to one embodiment of the present disclosure.
- the input/output interface (1100) may include an input interface (e.g., a touch screen, a hard button, a microphone, etc.) for receiving control commands or information from a user, and an output interface (e.g., a display panel, a speaker, etc.) for displaying the results of execution of an operation according to the user's control or the status of the computing device (1000).
- the memory (1200) is a configuration for storing various programs or data, and may be configured as a storage medium or a combination of storage media such as a ROM, a RAM, a hard disk, a CD-ROM, and a DVD.
- the memory (1200) may not exist separately and may be configured to be included in the processor (1300).
- the memory (1200) may be configured as a volatile memory, a nonvolatile memory, or a combination of volatile memory and nonvolatile memory. Programs or instructions for performing operations according to the embodiments described with reference to FIGS. 1 to 10 above may be stored in the memory (1200).
- the memory (1200) may also provide stored data to the processor (1300) at the request of the processor (1300).
- the processor (1300) is a configuration that controls a series of processes so that the computing device (1000) operates according to the embodiments described with reference to FIGS. 1 to 10 above, and may be composed of one or more processors.
- the one or more processors may be a general-purpose processor such as a CPU, an AP, or a DSP (Digital Signal Processor); a graphics-only processor such as a GPU or a VPU (Vision Processing Unit); or an artificial intelligence-only processor such as an NPU.
- when the one or more processors are artificial intelligence-only processors, the artificial intelligence-only processor may be designed with a hardware structure specialized for processing a specific artificial intelligence model.
- the processor (1300) can record data in the memory (1200), read data stored in the memory (1200), and process data according to predefined operation rules or artificial intelligence models, particularly by executing programs or commands stored in the memory (1200). Accordingly, the processor (1300) can perform the operations described in the embodiments described above, and the operations described as performed by the computing device (1000) in the embodiments described above can be viewed as performed by the processor (1300) unless otherwise specifically described.
- FIGS. 13 to 19 are flowcharts for explaining a method of editing an image using a generation model according to embodiments of the present disclosure.
- a method of editing an image using a generation model according to embodiments of the present disclosure will be explained with reference to FIGS. 13 to 19. Since the operations described below are performed by the computing device (1000) described so far, the contents included in the embodiments described above may be applied equally even if omitted below.
- a computing device may receive user input requesting transformation of at least one object included in an input image.
- the computing device can transform at least one object according to a user's request, and infer (determine) a generation-required region, which is a region where image generation is required, by considering the inverse transformation for the transformation.
- Operation 1302 is the same as operation 1403 of FIG. 14, so the contents described below for operation 1403 can be equally applied to operation 1302.
- the computing device can generate an image of a region requiring generation using one or more generative models.
- a computing device may receive user input requesting transformation of at least one object included in an input image.
- the computing device can determine whether at least one border among the borders of the input image touches the at least one object (the object that is the target of the transformation request). If the determination result shows that no border touches the at least one object, the computing device can proceed to operation 1405 to transform the at least one object according to the request included in the user input. Conversely, if the determination result shows that at least one border touches the at least one object, the computing device proceeds to operation 1403.
- the computing device can transform at least one object according to the user's request, and infer (determine) a generation-required region, which is a region where image generation is required, by considering the inverse transformation for the transformation.
- in operation 1501, the computing device can expand the input image by considering the inverse transformation of the object's transformation. Detailed operations included in operation 1501 are described with reference to FIGS. 16 to 18.
- the computing device can select at least one border that touches at least one object among the borders of the input image.
- the computing device can move the selected at least one border by a distance and direction according to the inverse transformation.
- the computing device can generate an image up to the area where the border is moved using one or more generative models.
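For the simple case of a translation, the border selection and border movement described above could be sketched as follows. This is only an illustration under stated assumptions, not the disclosed implementation: the `Box` type and the helper names `touched_borders` and `padding_from_inverse_translation` are hypothetical, only translation is handled, and the generative step that fills the added area is omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned box in pixel coordinates; right and bottom are exclusive."""
    left: int
    top: int
    right: int
    bottom: int


def touched_borders(obj: Box, width: int, height: int) -> set:
    """Return the names of the image borders that the object's box touches."""
    borders = set()
    if obj.left <= 0:
        borders.add("left")
    if obj.top <= 0:
        borders.add("top")
    if obj.right >= width:
        borders.add("right")
    if obj.bottom >= height:
        borders.add("bottom")
    return borders


def padding_from_inverse_translation(obj: Box, width: int, height: int,
                                     dx: int, dy: int) -> dict:
    """Pixels to add on each side when touched borders move along (-dx, -dy)."""
    inv_dx, inv_dy = -dx, -dy  # inverse of translating the object by (dx, dy)
    pad = {"left": 0, "top": 0, "right": 0, "bottom": 0}
    touched = touched_borders(obj, width, height)
    # A border only moves if the inverse translation points outward through it.
    if "left" in touched and inv_dx < 0:
        pad["left"] = -inv_dx
    if "right" in touched and inv_dx > 0:
        pad["right"] = inv_dx
    if "top" in touched and inv_dy < 0:
        pad["top"] = -inv_dy
    if "bottom" in touched and inv_dy > 0:
        pad["bottom"] = inv_dy
    return pad
```

In this sketch, an object cut off by the left border and dragged 40 pixels to the right yields 40 pixels of padding on the left, which is the area a generative model would then have to fill.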
- the computing device may select, from among the borders of the input image, at least one border that touches the at least one object or at least one border that cuts off the at least one object.
- the computing device may move the at least one selected border by a distance determined according to a size of the at least one object. For example, the computing device may move the at least one selected border by a distance that is a preset ratio multiplied by a length of the at least one object.
- the computing device may determine a movement distance based on a range in which the at least one object is transformable within the input image, and move the at least one selected border by the determined movement distance.
- the computing device can generate an image up to the region to which the at least one border has been moved, using one or more generative models.
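The size-based rule (a preset ratio multiplied by the length of the object) amounts to a one-line computation. The sketch below assumes a hypothetical function name, a default ratio of 0.5 chosen purely for illustration, and rounding up to whole pixels; none of these values come from the disclosure.

```python
import math


def border_shift_from_object(object_length: float, ratio: float = 0.5) -> int:
    """Distance (in pixels) to move a selected border: ratio x object length.

    object_length is the object's extent along the axis in which the border
    moves; ratio is the preset ratio (0.5 is an illustrative value only).
    """
    if object_length < 0 or ratio < 0:
        raise ValueError("object_length and ratio must be non-negative")
    # Round up so the expanded area is never smaller than the computed distance.
    return math.ceil(object_length * ratio)
```

For example, an object 80 pixels long with a ratio of 0.5 would move the selected border by 40 pixels.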
- the computing device may select, from among the borders of the input image, at least one border that touches the at least one object or at least one border that cuts off the at least one object.
- the computing device may move the at least one selected border by a distance determined based on a length of the input image measured with respect to an axis perpendicular to the at least one selected border. For example, the computing device may move the at least one selected border by a distance that is a preset ratio multiplied by the length of the input image.
- the computing device may generate an image up to an area where the at least one border is moved using one or more generative models.
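The image-length rule can be illustrated in the same way. In the sketch below, the length "measured with respect to an axis perpendicular to the selected border" is read as the image width for the left and right borders and the image height for the top and bottom borders; the function name and the default ratio of 0.25 are assumptions made for this example.

```python
import math


def border_shift_from_image(border: str, width: int, height: int,
                            ratio: float = 0.25) -> int:
    """Distance to move a border: ratio x image length perpendicular to it."""
    if border in ("left", "right"):
        length = width   # the axis perpendicular to a vertical border is horizontal
    elif border in ("top", "bottom"):
        length = height  # the axis perpendicular to a horizontal border is vertical
    else:
        raise ValueError(f"unknown border: {border!r}")
    return math.ceil(length * ratio)
```

With a 400 x 200 image and a ratio of 0.25, the left border would move by 100 pixels and the top border by 50 pixels.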
- the computing device can infer (determine) an area including at least one object in the extended input image as a first object proposal area.
- the computing device can transform the at least one object according to a request included in the user input received in operation 1401.
- the computing device can infer (determine) a second object proposal area corresponding to the transformed at least one object based on the first object proposal area.
- the computing device can infer (determine) a generation need area based on the second object proposal area.
- the computing device can infer (determine) an area including at least one object in an input image as a first object proposal area.
- the computing device can transform at least one object according to a request included in a user input received in operation 1401.
- the computing device can infer (determine) a second object proposal area corresponding to the transformed at least one object based on the first object proposal area.
- the computing device can extend the second object proposal area by considering an inverse transformation of the object.
- the computing device can infer (determine) a generation need area based on the extended second object proposal area.
- the computing device can generate an image of a region requiring generation using one or more generative models.
- the computing device may receive user input requesting movement of at least one object included in an input image.
- the user may request movement of an object included in the input image in various ways.
- the computing device can expand the input image in a direction determined based on the movement of the requested object.
- the computing device can determine the direction in which to expand the input image by considering the movement of the requested object. For example, the computing device can determine the direction opposite to the movement direction of the object as the expansion direction of the input image.
- FIG. 21 illustrates detailed operations included in operation 2002.
- the computing device may determine whether at least one object is a partial object in which only a portion of the object is displayed. According to one embodiment of the present disclosure, the computing device may perform object recognition on an input image, and then determine whether the object is a partial object based on the recognition result for at least one object. Alternatively, according to one embodiment of the present disclosure, the computing device may determine whether the object is a partial object based on whether the object touches a border of the input image or whether the object is cut off by the border of the input image. Detailed operations included in operation 2101 are illustrated in FIG. 22.
- the computing device may determine whether at least one object touches at least one border among borders of an input image or whether at least one object is cut by at least one border among borders of the input image. Then, in operation 2202, the computing device may determine at least one object as a partial object if at least one object touches at least one border or at least one object is cut by at least one border.
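The border-based test of operations 2201 and 2202 could be sketched on a binary object mask as follows. The mask representation (a list of rows, with 1 for object pixels) and the function name are assumptions for illustration; a real system would more likely obtain the mask from a segmentation model.

```python
def is_partial_object(mask: list) -> bool:
    """Return True if any object pixel lies on the outermost row or column.

    mask is a list of rows; each row is a list of 0/1 values, where 1 marks a
    pixel that belongs to the object. An object that reaches the image border
    is treated as touching or being cut off by that border.
    """
    if not mask or not mask[0]:
        return False
    # Top and bottom borders.
    if any(mask[0]) or any(mask[-1]):
        return True
    # Left and right borders.
    return any(row[0] or row[-1] for row in mask)
```

A mask whose object pixels reach the first column is classified as a partial object, while an object fully surrounded by background pixels is not.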
- the computing device may generate an image to extend the input image in a direction opposite to a movement direction of the requested object, if at least one object is a partial object.
- the computing device may select at least one border touching at least one object or at least one border cutting off the at least one object from among borders of the input image, move the selected at least one border in a direction opposite to a movement direction of the requested object, and then generate an image up to an area to which the at least one border has moved using one or more generation models.
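The geometry of moving a cut-off border in the direction opposite to the requested movement could be sketched as below. The `Canvas` type, the function name, and the choice to move the border by exactly the component of the movement along that border's normal are assumptions; the passage above does not fix the distance, and the image generation for the added area is left out.

```python
from typing import NamedTuple


class Canvas(NamedTuple):
    width: int
    height: int
    offset_x: int  # x position of the original image inside the expanded canvas
    offset_y: int  # y position of the original image inside the expanded canvas


def expand_opposite_to_movement(width: int, height: int, cut_border: str,
                                dx: int, dy: int) -> Canvas:
    """Expanded canvas after moving cut_border opposite to the movement (dx, dy)."""
    # Outward unit normal of each border (image y axis points downward).
    normals = {"left": (-1, 0), "right": (1, 0), "top": (0, -1), "bottom": (0, 1)}
    nx, ny = normals[cut_border]
    # Component of the reversed movement (-dx, -dy) along the outward normal;
    # a non-positive value means the border would move inward, so no expansion.
    shift = max(0, (-dx) * nx + (-dy) * ny)
    new_width = width + shift * abs(nx)
    new_height = height + shift * abs(ny)
    offset_x = shift if cut_border == "left" else 0
    offset_y = shift if cut_border == "top" else 0
    return Canvas(new_width, new_height, offset_x, offset_y)
```

Moving an object that is cut off by the left border 40 pixels to the right on a 200 x 100 image gives a 240 x 100 canvas with the original image placed at x = 40.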
- the computing device may infer (determine) a region requiring generation of an image of at least one object based on the expanded input image.
- the computing device may infer (determine) a region in which an image of the at least one object needs to be generated as the generation-required region, based on the size of the at least one object and the direction of the at least one object.
- the computing device may identify at least one object included in the input image (e.g., determine the type of the object), and then infer (determine) a region in which an image of the at least one object needs to be generated as the generation-required region, based on the identification result (e.g., the type of the object).
- the computing device may infer (determine) a region including an object in the expanded input image as an object suggestion region, and then infer (determine) a region requiring generation based on the object suggestion region.
- Detailed operations included in operation 2003 are illustrated in FIG. 23.
- the computing device can infer (determine) an area including at least one object in the extended input image as a first object proposal area.
- the computing device can move at least one object upon request.
- the computing device can infer (determine) a second object proposal area corresponding to at least one moved object based on the first object proposal area.
- the computing device can infer (determine) a region requiring generation based on a second object suggestion region.
- the computing device can infer (determine) a region requiring additional generation of an image for the at least one moved object as a first region requiring generation by comparing the at least one moved object with the second object suggestion region, and infer (determine) a region requiring generation of a background image due to movement of the object as a second region requiring generation.
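One possible reading of operations 2301 to 2304 uses pixel sets and set differences. The sketch below is an assumption-laden illustration rather than the method of the disclosure: it treats the movement as a translation, takes the second object proposal area to be the translated first proposal area, defines the first generation-required region as the part of the second proposal area not covered by the moved object, and defines the second generation-required region as the vacated object pixels outside the first region. All function names are hypothetical.

```python
def box_pixels(left: int, top: int, right: int, bottom: int) -> set:
    """Set of (x, y) pixels inside a box; right and bottom are exclusive."""
    return {(x, y) for y in range(top, bottom) for x in range(left, right)}


def translate(pixels: set, dx: int, dy: int) -> set:
    """Shift every pixel by (dx, dy)."""
    return {(x + dx, y + dy) for x, y in pixels}


def generation_regions(object_pixels: set, first_proposal: set,
                       dx: int, dy: int) -> tuple:
    """Return (first_region, second_region) for an object moved by (dx, dy)."""
    moved_object = translate(object_pixels, dx, dy)
    second_proposal = translate(first_proposal, dx, dy)
    # Object part still missing after the move: proposal minus visible object.
    first_region = second_proposal - moved_object
    # Background exposed by the move: vacated pixels not claimed by the object.
    second_region = (object_pixels - moved_object) - first_region
    return first_region, second_region
```

For an object visible at x = 2..3 whose proposal area spans x = 0..3, moving it 3 pixels to the right leaves x = 3..4 to be generated as object content and x = 2 to be generated as background in this sketch.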
- the specific method by which the computing device infers (determines) the first object proposal region, the second object proposal region, and the generation need region is as described above with reference to FIG. 1.
- a method of editing an image using a generative model may include the steps of receiving a user input requesting movement of at least one object included in an input image, expanding the input image in a direction determined based on the requested movement of the object, determining a generation required area, which is an area requiring generation of an image for the at least one object, based on the expanded input image, and generating an image for the generation required area using at least one generative model.
- the step of expanding the input image may include the step of determining whether the at least one object is a partial object in which only a portion of the object is displayed, and if the at least one object is a partial object, the step of generating an image to expand the input image in a direction opposite to a movement direction of the requested object.
- the step of determining whether the object is a partial object may include the step of performing object recognition on the input image and the step of determining whether the object is a partial object based on the recognition result for the at least one object.
- the step of inferring the region requiring generation may infer the region requiring generation of an image of the at least one object as the region requiring generation based on the size of the at least one object and the direction of the at least one object.
- the step of inferring the region requiring generation may include the step of identifying the at least one object and the step of inferring the region requiring generation of an image of the at least one object as the region requiring generation based on the identification result.
- the step of inferring the region requiring generation may include the steps of inferring an area including the at least one object in the extended input image as a first object proposal area, moving the at least one object according to the request, inferring a second object proposal area corresponding to the moved at least one object based on the first object proposal area, and inferring the region requiring generation based on the second object proposal area.
- the step of inferring the region requiring generation based on the second object suggestion region may include the step of inferring a region in which an image for the at least one moved object must be additionally generated as the first region requiring generation by comparing the at least one moved object with the second object suggestion region, and the step of inferring a region in which a background image must be generated due to movement of the object as the second region requiring generation.
- the input image is expanded using a first generation model, an image for the first generation-required region is generated using a second generation model, and an image for the second generation-required region is generated using a third generation model, wherein the second generation model may have higher performance than the first generation model or the third generation model.
- the step of generating an image for the region requiring generation may include the step of generating a prompt based on at least one of information about a location of the region requiring generation, information about a type of the at least one object, and information about a background including the region requiring generation, and the step of inputting the generated prompt into the at least one generation model.
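The prompt-generation step could look like the sketch below, which assembles a text prompt from whichever of the three kinds of information is available. The function name, the argument names, and the wording of the prompt are all assumptions made for illustration; the disclosure does not specify a prompt format, and the call to the generative model is omitted.

```python
from typing import Optional


def build_prompt(region_location: Optional[str] = None,
                 object_type: Optional[str] = None,
                 background: Optional[str] = None) -> str:
    """Build a text prompt from at least one of the three pieces of information."""
    parts = []
    if object_type:
        parts.append(f"missing part of a {object_type}")
    if region_location:
        parts.append(f"located at the {region_location} of the image")
    if background:
        parts.append(f"against a background of {background}")
    if not parts:
        raise ValueError("at least one piece of information is required")
    return ", ".join(parts)
```

The resulting string would then be passed to the generative model together with the image and the generation-required area.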
- a computing device includes an input/output interface for receiving a user input requesting processing of an image and outputting an image processed according to the user input, a memory storing commands for processing the image, and at least one processor, wherein the at least one processor executes the commands such that, when a user input requesting movement of at least one object included in an input image is received, the input image is expanded in a direction determined based on the requested movement of the object, and a generation required area, which is an area where generation of an image for the at least one object is required, is inferred based on the expanded input image, and then an image for the generation required area is generated using at least one generative model.
- in expanding the input image, the at least one processor may determine whether the at least one object is a partial object in which only a part of the object is displayed, and if the at least one object is a partial object, generate an image to expand the input image in a direction opposite to the requested movement direction of the object.
- when determining whether the at least one object is a partial object, the at least one processor may determine whether the at least one object is in contact with at least one border among borders of the input image, and if the at least one object is in contact with the at least one border, may determine the at least one object to be a partial object.
- the at least one processor may generate an image to expand the input image by selecting at least one border among borders of the input image that comes into contact with the at least one object, moving the selected at least one border in a direction opposite to a movement direction of the requested object, and then generating an image up to an area to which the at least one border has been moved using the at least one generation model.
- in determining whether the object is a partial object, the at least one processor may perform object recognition on the input image, and then determine whether the object is a partial object based on a recognition result for the at least one object.
- the at least one processor may infer a region requiring generation of an image of the at least one object as the region requiring generation based on a size of the at least one object and a direction of the at least one object.
- the at least one processor may identify the at least one object, and then infer a region requiring generation of an image of the at least one object as the region requiring generation based on the identification result.
- in inferring the area requiring generation, the at least one processor may infer an area including the at least one object in the extended input image as a first object proposal area, move the at least one object according to the request, infer a second object proposal area corresponding to the moved at least one object based on the first object proposal area, and then infer the area requiring generation based on the second object proposal area.
- the at least one processor may compare the at least one moved object with the second object suggestion area, thereby inferring an area in which an image for the at least one moved object must be additionally generated as a first generation need area, and then inferring an area in which a background image must be generated due to movement of the object as a second generation need area.
- the at least one processor expands the input image using a first generative model, generates an image for the first generation-required region using a second generative model, and generates an image for the second generation-required region using a third generative model, wherein the second generative model may have higher performance than the first generative model or the third generative model.
- the step of inferring the region requiring generation may include the steps of expanding the input image by considering the inverse transformation, inferring an area including the at least one object in the expanded input image as a first object proposal area, transforming the at least one object according to the request, inferring a second object proposal area corresponding to the transformed at least one object based on the first object proposal area, and inferring the region requiring generation based on the second object proposal area.
- the step of expanding the input image may include the steps of selecting at least one border among borders of the input image that touches the at least one object, moving the at least one selected border by a distance and direction according to the inverse transformation, and generating an image up to an area where the at least one border has been moved using the at least one generative model.
- the step of moving the at least one selected border by a distance determined based on a size of the at least one object may move the at least one selected border by a distance calculated by multiplying a length of the at least one object by a preset ratio.
- the step of moving the at least one selected border by a distance determined based on a size of the at least one object may determine a movement distance based on a range within which the at least one object is transformable within the input image, and move the at least one selected border by the determined movement distance.
- the step of expanding the input image may include the steps of selecting at least one border among borders of the input image that touches the at least one selected object, moving the at least one selected border by a distance determined based on a length of the input image measured relative to an axis perpendicular to the at least one selected border, and generating an image up to an area to which the at least one border has been moved using the at least one generative model.
- the step of moving by a distance determined according to a length of the input image may move the at least one selected border by a distance obtained by multiplying the length of the input image by a preset ratio.
- the step of inferring the region requiring generation based on the second object suggestion region may include the step of inferring a region in which an image for the at least one transformed object must be additionally generated as the first region requiring generation by comparing the at least one transformed object with the second object suggestion region, and the step of inferring a region in which a background image must be generated due to the transformation of the object as the second region requiring generation.
- the step of inferring the region requiring generation based on the extended second object proposal region may include the step of inferring a region in which an image for the at least one transformed object must be additionally generated as the first region requiring generation by comparing the at least one transformed object with the extended second object proposal region, and the step of inferring a region in which a background image must be generated due to the transformation of the object as the second region requiring generation.
- a computing device includes an input/output interface for receiving a user input requesting processing of an image and outputting an image processed according to the user input, a memory storing commands for processing the image, and at least one processor, wherein the at least one processor executes the commands so that, when receiving a user input requesting transformation of at least one object included in an input image, the computing device determines whether at least one border among borders of the input image touches the at least one object, and if the at least one border touches the at least one object, transforms the at least one object according to the request, infers a generation required area, which is an area where image generation is required, by considering an inverse transformation for the transformation, and generates an image for the generation required area using one or more generative models.
- the at least one processor may, in inferring the generation-required region, expand the input image by considering the inverse transformation, infer an area including the at least one object in the expanded input image as a first object proposal area, transform the at least one object according to the request, infer a second object proposal area corresponding to the transformed at least one object based on the first object proposal area, and infer the generation-required region based on the second object proposal area.
- the at least one processor may, when expanding the input image, select at least one border among borders of the input image that is in contact with the at least one selected object, move the at least one selected border by a distance determined according to a size of the at least one object, and generate an image up to an area to which the at least one border has been moved using the at least one generation model.
- when inferring the generation need area based on the second object suggestion area, the at least one processor can compare the at least one transformed object with the second object suggestion area, thereby inferring an area where an image for the at least one transformed object must be additionally generated as a first generation need area, and inferring an area where a background image must be generated due to the transformation of the object as a second generation need area.
- the at least one processor expands the input image using a first generative model, generates an image for the first generation-required region using a second generative model, and generates an image for the second generation-required region using a third generative model, wherein the second generative model may have higher performance than the first generative model or the third generative model.
- Various embodiments of the present disclosure may be implemented or supported by one or more computer programs, which may be formed from computer-readable program code and embodied in a computer-readable medium.
- the terms "application" and "program" may represent one or more computer programs, software components, sets of instructions, procedures, functions, objects, classes, instances, associated data, or portions thereof suitable for implementation in computer-readable program code.
- Computer-readable program code may include various types of computer code, including source code, object code, and executable code.
- the machine-readable storage medium may be provided in the form of a non-transitory storage medium.
- the 'non-transitory storage medium' is a tangible device and may exclude wired, wireless, optical, or other communication links that transmit temporary electrical or other signals. Meanwhile, the 'non-transitory storage medium' does not distinguish between cases where data is permanently stored in the storage medium and cases where it is temporarily stored.
- the 'non-transitory storage medium' may include a buffer where data is temporarily stored.
- the computer-readable medium may be any available medium that can be accessed by a computer, and may include both volatile and nonvolatile media, removable and non-removable media.
- the computer-readable medium includes media on which data can be permanently stored and media on which data can be stored and later overwritten, such as a rewritable optical disk or an erasable memory device.
- the method according to various embodiments disclosed in the present document may be provided as included in a computer program product.
- the computer program product may be traded between a seller and a buyer as a commodity.
- the computer program product may be distributed in the form of a machine-readable storage medium (e.g., a compact disc read only memory (CD-ROM)), or may be distributed online (e.g., downloaded or uploaded) via an application store or directly between two user devices (e.g., smartphones).
- At least a portion of the computer program product may be at least temporarily stored or temporarily generated in a machine-readable storage medium, such as a memory of a manufacturer's server, a server of an application store, or an intermediary server.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Biomedical Technology (AREA)
- Molecular Biology (AREA)
- Biophysics (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Human Computer Interaction (AREA)
- Processing Or Creating Images (AREA)
Claims (15)
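- A method, executed by at least one processor, of processing an image using one or more generative models, the method comprising: receiving a user input requesting movement of at least one object included in an input image; expanding the input image in a direction determined based on the movement of the object; determining, based on the expanded input image, a generation required area, which is an area in which generation of a partial image of the at least one object is required; generating an image for the generation required area using at least one generative model; and outputting a reconstructed image based on the input image and the partial image of the at least one object.
- The method of claim 1, wherein expanding the input image comprises: determining whether the at least one object is a partial object in which only a part of the object is displayed; and when the at least one object is a partial object, generating an image so as to expand the input image in a direction opposite to the movement direction of the object.
- The method of any one of claims 1 and 2, wherein determining whether the at least one object is a partial object comprises: checking whether the at least one object is cut off by at least one border among borders of the input image; and when the at least one object is cut off by the at least one border, determining the at least one object to be a partial object.
- The method of any one of claims 1 to 3, wherein generating the image so as to expand the input image comprises: selecting a first border from among at least one border that cuts off the at least one object among the borders of the input image; moving the first border in a direction opposite to the requested movement direction of the object; and generating an image up to an area to which the at least one border has been moved, using the one or more generative models.
- The method of any one of claims 1 to 4, wherein determining the generation required area comprises determining, based on a size of the at least one object and a movement direction of the at least one object, an area in which generation of the partial image of the at least one object is required as the generation required area.
- The method of any one of claims 1 to 5, wherein determining the generation required area comprises: determining an area including the at least one object in the expanded input image as a first object proposal area; moving the at least one object according to the user input; determining, based on the first object proposal area, a second object proposal area corresponding to the moved at least one object; and determining the generation required area based on the second object proposal area.
- The method of any one of claims 1 to 6, wherein determining the generation required area based on the second object proposal area comprises: determining, by comparing the moved at least one object with the second object proposal area, a first area in which a partial image of the moved at least one object needs to be additionally generated as a first generation required area; and determining a second area in which a background image needs to be generated due to the movement of the object as a second generation required area.
- The method of any one of claims 1 to 7, wherein the input image is expanded using a first generative model, the partial image of the moved at least one object in the first generation required area is generated using a second generative model, the background image in the second generation required area is generated using a third generative model, and the second generative model has higher performance than the first generative model or the third generative model.
- The method of any one of claims 1 to 8, wherein generating the image for the generation required area comprises: generating a prompt based on at least one of information about a location of the generation required area, information about a type of the at least one object, and information about a background including the generation required area; and inputting the generated prompt into the at least one generative model.
- A computing device comprising: an input/output interface (1100) for receiving a user input requesting processing of an input image and outputting a reconstructed image processed according to the user input; a memory (1200) storing instructions for processing the input image; and at least one processor (1300) for executing the instructions, wherein the instructions cause the at least one processor to: upon receiving a user input requesting movement of at least one object included in the input image, expand the input image in a direction determined based on the movement of the object; determine, based on the expanded input image, a generation required area, which is an area in which generation of a partial image of the at least one object is required; generate an image for the generation required area using at least one generative model; and then output a reconstructed image based on the input image and the partial image of the at least one object.
- The computing device of claim 10, wherein, in expanding the input image, the instructions cause the at least one processor (1300) to determine whether the at least one object is a partial object in which only a part of the object is displayed, and then, when the at least one object is a partial object, generate an image so as to expand the input image in a direction opposite to the movement direction of the object.
- The computing device of any one of claims 10 and 11, wherein, in determining whether the at least one object is a partial object, the instructions cause the at least one processor (1300) to check whether the at least one object is cut off by at least one border among borders of the input image, and then, when the at least one object is cut off by the at least one border, determine the at least one object to be a partial object.
- The computing device of any one of claims 10 to 12, wherein, in generating the image so as to expand the input image, the instructions cause the at least one processor (1300) to select a first border from among at least one border that cuts off the at least one object among the borders of the input image, move the first border in a direction opposite to the requested movement direction of the object, and then generate an image up to an area to which the at least one border has been moved, using the one or more generative models.
- The computing device of any one of claims 10 to 13, wherein, in determining the generation required area, the instructions cause the at least one processor (1300) to determine an area including the at least one object in the expanded input image as a first object proposal area, move the at least one object according to the request, determine, based on the first object proposal area, a second object proposal area corresponding to the moved at least one object, and then determine the generation required area based on the second object proposal area.
- The computing device of any one of claims 10 to 14, wherein, in determining the generation required area based on the second object proposal area, the instructions cause the at least one processor (1300) to determine, by comparing the moved at least one object with the second object proposal area, a first area in which a partial image of the moved at least one object needs to be additionally generated as a first generation required area, and then determine a second area in which a background image needs to be generated due to the movement of the object as a second generation required area.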
Priority Applications (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202480050152.4A CN121713215A (zh) | 2023-08-01 | 2024-07-31 | Image processing method using generative model and computing device for performing the same |
| EP24849624.2A EP4723041A4 (en) | 2023-08-01 | 2024-07-31 | Image processing method using generative model and computing device for performing the same |
| US18/811,005 US20250045877A1 (en) | 2023-08-01 | 2024-08-21 | Image processing method using generative model and computing device for performing the same |
Applications Claiming Priority (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| KR20230100706 | 2023-08-01 | | |
| KR10-2023-0100706 | 2023-08-01 | | |
| KR10-2023-0178039 | 2023-12-08 | | |
| KR1020230178039A KR20250019552A (ko) | 2023-08-01 | 2023-12-08 | Image editing method using generative model and computing device for performing the same |
Related Child Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US18/811,005 Continuation US20250045877A1 (en) | 2023-08-01 | 2024-08-21 | Image processing method using generative model and computing device for performing the same |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2025029057A1 (ko) | 2025-02-06 |
Family
ID=94395725
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/KR2024/011285 Pending WO2025029057A1 (ko) | 2023-08-01 | 2024-07-31 | Image processing method using generative model and computing device for performing the same |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2025029057A1 (ko) |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| KR20200094608A (ko) * | 2019-01-30 | 2020-08-07 | 삼성전자주식회사 | Method for processing image and apparatus therefor |
| KR102218255B1 (ko) * | 2020-09-25 | 2021-02-19 | 정안수 | Artificial intelligence-based image analysis system and method through learning of update areas, and computer program therefor |
| KR102230361B1 (ko) * | 2019-09-18 | 2021-03-23 | 고려대학교 산학협력단 | Background image restoration apparatus using a single image and operating method thereof |
| KR20210056944A (ko) * | 2019-11-11 | 2021-05-20 | 주식회사 날비컴퍼니 | Image transformation method |
| KR20230053275A (ko) * | 2021-10-14 | 2023-04-21 | 주식회사 인피닉 | Method for designating object area and detecting object, and computer program recorded on recording medium for executing the same |
- 2024-07-31 WO PCT/KR2024/011285 patent/WO2025029057A1/ko active Pending
Patent Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| KR20200094608A (ko) * | 2019-01-30 | 2020-08-07 | 삼성전자주식회사 | Method for processing image and apparatus therefor |
| KR102230361B1 (ko) * | 2019-09-18 | 2021-03-23 | 고려대학교 산학협력단 | Background image restoration apparatus using a single image and operating method thereof |
| KR20210056944A (ko) * | 2019-11-11 | 2021-05-20 | 주식회사 날비컴퍼니 | Image transformation method |
| KR102218255B1 (ko) * | 2020-09-25 | 2021-02-19 | 정안수 | Artificial intelligence-based image analysis system and method through learning of update areas, and computer program therefor |
| KR20230053275A (ko) * | 2021-10-14 | 2023-04-21 | 주식회사 인피닉 | Method for designating object area and detecting object, and computer program recorded on recording medium for executing the same |
Similar Documents
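| Publication | Title | Publication Date |
|---|---|---|
| WO2015041438A1 (en) | Method for screen mirroring and source device thereof | |
| WO2021172921A1 (ko) | Medical imaging apparatus and medical image processing method | |
| WO2017057960A1 (en) | Electronic device and method for controlling the same | |
| WO2022124865A1 (en) | Method, device, and computer program for detecting boundary of object in image | |
| WO2021221394A1 (ko) | Method and electronic device for image augmentation | |
| WO2019080401A1 (zh) | Script statement conversion method and apparatus, and computer-readable storage medium | |
| WO2025029057A1 (ko) | Image processing method using generative model and computing device for performing the same | |
| WO2026015009A1 (en) | Method, apparatus, and system for video enhancement, and storage medium | |
| WO2019074185A1 (en) | Electronic apparatus and control method thereof | |
| WO2023128424A1 (ko) | Electronic device for analyzing application screen and operating method thereof | |
| WO2024058405A1 (ko) | Electronic device and method for controlling electronic device | |
| WO2023229431A1 (ko) | Method for correcting image using neural network model, and computing device for executing neural network model for image correction | |
| WO2024186190A1 (ko) | Electronic device for evaluating image quality and operating method of electronic device | |
| WO2023224212A1 (ko) | Image processing method and image processing apparatus for facilitating three-dimensional object editing by user | |
| WO2020059914A1 (ko) | Terminal, control method thereof, and recording medium having recorded thereon program for implementing the method | |
| WO2022005246A1 (en) | Page navigation method and electronic device | |
| WO2024158073A1 (ko) | Determining similarity between images using learning model | |
| WO2019035536A1 (ko) | Electronic device and control method thereof | |
| WO2017107522A1 (zh) | Terminal operating method and terminal | |
| WO2021010558A1 (ko) | Terminal, control method thereof, and recording medium having recorded thereon program for implementing the method | |
| WO2024076169A1 (ko) | Method for training object recognition model using spatial information, and computing device for performing the same | |
| WO2026049307A1 (ko) | Electronic device and operating method of electronic device | |
| WO2024253431A1 (ko) | Head-mounted display device including eye-tracking sensor and operating method of head-mounted display device | |
| WO2025259079A1 (en) | Method and electronic device for automatic generation of high quality data for image editing applications | |
| WO2025084610A1 (ko) | Electronic device for performing convolution operation and operating method of electronic device | |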
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 24849624; Country of ref document: EP; Kind code of ref document: A1 |
| | WWE | Wipo information: entry into national phase | Ref document number: 2024849624; Country of ref document: EP |
| | ENP | Entry into the national phase | Ref document number: 2024849624; Country of ref document: EP; Effective date: 20251229 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | WWP | Wipo information: published in national office | Ref document number: 2024849624; Country of ref document: EP |