WO2011008065A2 - Method and apparatus for multi-view video coding and decoding - Google Patents
Method and apparatus for multi-view video coding and decoding
- Publication number
- WO2011008065A2 (application PCT/KR2010/004717)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- picture
- view
- base layer
- layer picture
- reconstructed
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/236—Assembling of a multiplex stream, e.g. transport stream, by combining a video stream with other content or additional data, e.g. inserting a URL [Uniform Resource Locator] into a video stream, multiplexing software data into a video stream; Remultiplexing of multiplex streams; Insertion of stuffing bits into the multiplex stream, e.g. to obtain a constant bit-rate; Assembling of a packetised elementary stream
- H04N21/2365—Multiplexing of several video streams
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/103—Selection of coding mode or of prediction mode
- H04N19/105—Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/157—Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
- H04N19/159—Prediction type, e.g. intra-frame, inter-frame or bidirectional frame prediction
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/172—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/176—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/187—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a scalable video layer
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/30—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/46—Embedding additional information in the video signal during the compression process
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/597—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/60—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
- H04N19/61—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/234—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
- H04N21/2343—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
- H04N21/234327—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements by decomposing into layers, e.g. base layer and one or more enhancement layers
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/238—Interfacing the downstream path of the transmission network, e.g. adapting the transmission rate of a video stream to network bandwidth; Processing of multiplex streams
- H04N21/2383—Channel coding or modulation of digital bit-stream, e.g. QPSK modulation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/25—Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
- H04N21/266—Channel or content management, e.g. generation and management of keys and entitlement messages in a conditional access system, merging a VOD unicast channel into a multicast channel
- H04N21/2662—Controlling the complexity of the video stream, e.g. by scaling the resolution or bitrate of the video stream based on the client capabilities
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/434—Disassembling of a multiplex stream, e.g. demultiplexing audio and video streams, extraction of additional data from a video stream; Remultiplexing of multiplex streams; Extraction or processing of SI; Disassembling of packetised elementary stream
- H04N21/4347—Demultiplexing of several video streams
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/438—Interfacing the downstream path of the transmission network originating from a server, e.g. retrieving encoded video stream packets from an IP network
- H04N21/4382—Demodulation or channel decoding, e.g. QPSK demodulation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/24—Systems for the transmission of television signals using pulse code modulation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/70—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
Definitions
- Apparatuses and methods consistent with exemplary embodiments relate generally to an apparatus and method for coding and decoding video sequences, and in particular, to a method and apparatus for coding and decoding multi-view video sequences such as stereoscopic video sequences in a layered coding structure, or a hierarchical coding structure.
- Typical examples of related art three-dimensional (3D) video coding methods include Multi-view Profile (MVP) based on MPEG-2 Part 2 Video (hereinafter, MPEG-2 MVP), and Multi-view Video Coding (MVC) based on H.264 (MPEG-4 AVC) Amendment 4 (hereinafter, H.264 MVC).
- MVP: Multi-view Profile
- MVC: Multi-view Video Coding
- the MPEG-2 MVP method for coding stereoscopic video performs video coding based on a main profile and a scalable profile of MPEG-2 using inter-view redundancy of video.
- the H.264 MVC method for coding multi-view video performs video coding based on H.264 using the inter-view redundancy of video.
- aspects of exemplary embodiments provide a video coding and decoding method and apparatus for providing multi-view video services while providing compatibility with various video codecs.
- aspects of exemplary embodiments also provide a video coding and decoding method and apparatus for providing multi-view video services based on a layered coding and decoding method.
- a multi-view video coding method for providing a multi-view video service, the method including: coding a base layer picture using an arbitrary video codec; generating a prediction picture using at least one of a reconstructed base layer picture, which is reconstructed from the coded base layer picture, and a reconstructed layer picture corresponding to a view different from a view of the base layer picture; and residual-coding a layer picture corresponding to the different view using the generated prediction picture.
- a multi-view video coding apparatus for providing a multi-view video service, the apparatus including: a base layer coder which codes a base layer picture using an arbitrary video codec; a view converter which generates a prediction picture using at least one of a reconstructed base layer picture, which is reconstructed from the coded base layer picture, and a reconstructed layer picture corresponding to a view different from a view of the base layer picture; and a residual coder which residual-codes a layer picture corresponding to the different view using the generated prediction picture.
- a multi-view video decoding method for providing a multi-view video service, the method including: reconstructing a base layer picture using an arbitrary video codec; generating a prediction picture using at least one of the reconstructed base layer picture and a reconstructed layer picture corresponding to a view different from a view of the base layer picture; and reconstructing a layer picture corresponding to the different view using a residual-decoded layer picture and the generated prediction picture.
- a multi-view video decoding apparatus for providing a multi-view video service, the apparatus including: reconstructing a base layer picture using an arbitrary video codec; generating a prediction picture using at least one of the reconstructed base layer picture and a reconstructed layer picture corresponding to a view different from a view of the base layer picture; and reconstructing a layer picture corresponding to the different view using a residual-decoded layer picture and the generated prediction picture.
- a multi-view video providing system including: a multi-view video coding apparatus, comprising: a base layer coder which codes a base layer picture using an arbitrary video codec, a view converter which generates a prediction picture using at least one of a reconstructed base layer picture, which is reconstructed from the coded base layer picture, and a reconstructed layer picture corresponding to a view different from a view of the base layer picture, a residual coder which residual-codes a layer picture corresponding to the different view using the generated prediction picture, and a multiplexer which multiplexes the coded base layer picture and the residual-coded layer picture into a bitstream, and outputs the bitstream; and a multi-view video decoding apparatus comprising: a demultiplexer which receives and demultiplexes the output bitstream into a base layer bitstream and a layer bitstream, a base layer decoder which reconstructs the base layer picture from the base layer bitstream using a video code
- FIG. 1 is a block diagram showing a structure of a multi-view video coder according to an exemplary embodiment.
- FIG. 2 is a block diagram showing a structure of a view converter in a multi-view video coder according to an exemplary embodiment.
- FIG. 3 is a flowchart showing a multi-view video coding method according to an exemplary embodiment.
- FIG. 4 is a flowchart showing a view conversion method performed in a multi-view video coder according to an exemplary embodiment.
- FIG. 5 is a block diagram showing a structure of a multi-view video decoder according to an exemplary embodiment.
- FIG. 6 is a block diagram showing a structure of a view converter in a multi-view video decoder according to an exemplary embodiment.
- FIG. 7 is a flowchart showing a multi-view video decoding method according to an exemplary embodiment.
- FIG. 8 is a flowchart showing a view conversion method performed in a multi-view video decoder according to an exemplary embodiment.
- FIG. 9 is a block diagram showing an exemplary structure of a multi-view video coder with N enhancement layers according to another exemplary embodiment.
- FIG. 10 is a block diagram showing an exemplary structure of a multi-view video decoder with N enhancement layers according to another exemplary embodiment.
- codecs such as H.264 and VC-1 are introduced as exemplary types of codecs, but these exemplary codecs are merely provided for a better understanding of exemplary embodiments, and are not intended to limit the scope of the exemplary embodiments.
- An exemplary embodiment provides a hierarchical structure of a video coder/decoder to provide multi-view video services such as three-dimensional (3D) video services while maintaining compatibility with any existing codec used for video coding/decoding.
- a video coder/decoder designed in a layered coding/decoding structure codes and decodes multi-view video including one base layer picture and at least one enhancement layer picture.
- the base layer picture as used herein refers to pictures which are compression-coded based on an existing scheme using existing video codecs such as VC-1 and H.264.
- the enhancement layer picture refers to pictures which are obtained by residual-coding pictures that have been view-converted using at least one of a base layer picture of one view and an enhancement layer picture of a view different from that of the base layer, regardless of the type of the video codec used in the base layer.
- the enhancement layer picture refers to pictures having different views from that of the base layer picture.
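- if the base layer picture is a left-view picture, the enhancement layer picture may be a right-view picture.
- if the base layer picture is a right-view picture, the enhancement layer picture may be a left-view picture.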
- the base layer picture and the enhancement layer picture are considered as left/right-view pictures, respectively, for convenience of description, though it is understood that the base layer picture and the enhancement layer picture may be pictures of various views such as front/rear-view pictures and top/bottom-view pictures. Therefore, the enhancement layer picture may be construed as a layer picture having a view different from that of the base layer picture.
- the layer picture having a different view and the enhancement layer picture may be construed to be the same. If the enhancement layer picture is plural in number, pictures of various views (such as front/rear-view pictures, top/bottom-view pictures, etc.) may be provided as multi-view video by using the base layer picture and the multiple enhancement layer pictures.
- an enhancement layer picture is generated by coding a residual picture.
- the residual picture is defined as a result of coding picture data obtained from a difference between an enhancement layer's input picture and a prediction picture generated by view conversion according to an exemplary embodiment.
- the prediction picture is generated using at least one of a reconstructed base layer picture and a reconstructed enhancement layer picture.
- the reconstructed base layer picture refers to a currently reconstructed base layer picture that is reconstructed by coding the input picture "view 0" by an arbitrary existing video codec, and then decoding the coded picture.
- the reconstructed enhancement layer picture used for generation of the prediction picture refers to a previously reconstructed enhancement layer picture generated by adding a previous residual picture to a previous prediction picture.
- the reconstructed enhancement layer picture refers to a currently reconstructed enhancement layer picture, which is generated by reconstructing the currently coded residual picture in another enhancement layer of a view different from that of the enhancement layer. View conversion for generating the prediction picture will be described in detail later.
- a multi-view video coder outputs a base layer picture of one view in a bitstream by coding a base layer's input picture using an arbitrary video codec, and outputs an enhancement layer picture having a view different from that of the base layer picture in a bitstream by performing residual coding on an enhancement layer's input picture using a prediction picture generated by the view conversion.
- a multi-view video decoder reconstructs a base layer picture of one view by decoding a coded base layer picture of the view using the arbitrary video codec, and residual-decodes a coded enhancement layer picture of a different view from that of the base layer picture and reconstructs the enhancement layer picture having the different view using a prediction picture generated by the view conversion.
- a two-dimensional (2D) picture of one view may be reconstructed by taking a base layer's bitstream from the bitstream and decoding the base layer's bitstream, and an enhancement layer picture having a different view in, for example, a 3D picture may be reconstructed by decoding the base layer's bitstream and then combining a prediction picture generated by performing view conversion according to an exemplary embodiment with a residual picture generated by decoding an enhancement layer's bitstream.
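The layered scheme described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the method of the embodiment: pictures are flat lists of integer samples, residual coding is lossless, and `view_convert` is a hypothetical placeholder that reuses the reconstructed base layer picture unchanged as the prediction picture.

```python
# Hypothetical sketch: pictures are flat lists of integer samples.

def view_convert(reconstructed_base):
    """Placeholder view conversion: reuse the base layer picture as the prediction."""
    return list(reconstructed_base)

def residual_code(input_picture, prediction):
    """Coder side: residual = enhancement layer input minus prediction picture."""
    return [x - p for x, p in zip(input_picture, prediction)]

def reconstruct(residual, prediction):
    """Decoder side: enhancement layer picture = prediction picture plus residual."""
    return [r + p for r, p in zip(residual, prediction)]

base = [10, 12, 14, 16]          # reconstructed base layer picture (one view)
enhancement = [11, 12, 15, 15]   # enhancement layer input picture (different view)

# Coder: only the residual is carried in the enhancement layer bitstream.
residual = residual_code(enhancement, view_convert(base))

# Decoder: a 2D client stops at `base`; a 3D client also rebuilds the other view.
decoded = reconstruct(residual, view_convert(base))
```

Because the decoder repeats the same view conversion on the same reconstructed base layer picture, it obtains the same prediction picture as the coder without that picture being transmitted.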
- a structure and operation of a multi-view video coder according to an exemplary embodiment will now be described in detail.
- the exemplary embodiment described below uses both a reconstructed current base layer picture and a reconstructed previous enhancement layer picture during view conversion, and the number of enhancement layers is 1.
- another exemplary embodiment is not limited thereto.
- FIG. 1 shows a structure of a multi-view video coder 100 according to an exemplary embodiment.
- P1 represents a base layer's input picture
- P2 represents an enhancement layer's input picture.
- a base layer coder 101 compression-codes the input picture P1 of one view in the base layer according to an existing scheme using an arbitrary video codec among existing video codecs (for example, VC-1, H.264, MPEG-4 Part 2 Visual, MPEG-2 Part 2 Video, AVS, JPEG2000, etc.), and outputs the coded base layer picture in a base layer bitstream P3.
- the base layer coder 101 reconstructs the coded base layer picture, and stores the reconstructed base layer picture P4 in a base layer buffer 103.
- a view converter 105 receives the currently reconstructed base layer picture (hereinafter, "current base layer picture") P8 from the base layer buffer 103.
- a residual coder 107 receives, through a subtractor 109, picture data obtained by subtracting the prediction picture P5, supplied by the view converter 105, from the enhancement layer's input picture P2, and residual-codes the received picture data.
- the residual-coded enhancement layer picture, or a coded residual picture is output in an enhancement layer bitstream P6.
- the residual coder 107 reconstructs the residual-coded enhancement layer picture, and outputs a reconstructed enhancement layer picture P7, or a reconstructed residual picture.
- the prediction picture P5 from the view converter 105 and the reconstructed enhancement layer picture P7 are added by an adder 111, and the sum is stored in an enhancement layer buffer 113.
- the view converter 105 receives, from the enhancement layer buffer 113, a previously reconstructed enhancement layer picture (hereinafter, "previous enhancement layer picture") P9. While the base layer buffer 103 and the enhancement layer buffer 113 are shown separately in the present exemplary embodiment, it is understood that the base layer buffer 103 and the enhancement layer buffer 113 may be implemented in one buffer according to another exemplary embodiment.
- previous enhancement layer picture: a previously reconstructed enhancement layer picture
- the view converter 105 receives the current base layer picture P8 and the previous enhancement layer picture P9 from the base layer buffer 103 and the enhancement layer buffer 113, respectively, and generates the view-converted prediction picture P5.
- the view converter 105 generates a control information bitstream P10 including the prediction picture's control information, to be described below, which is used for decoding in a multi-view video decoder.
- the generated prediction picture P5 is output to the subtractor 109 to be used to generate the enhancement layer bitstream P6, and output to the adder 111 to be used to generate the next prediction picture.
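The enhancement layer path of FIG. 1 (subtractor 109, residual coder 107, adder 111, enhancement layer buffer 113) can be sketched as follows. This is an illustrative sketch only: the uniform quantiser and its step `QSTEP` are assumptions standing in for whatever residual coding is actually used, and pictures are flat lists of integer samples.

```python
QSTEP = 4  # assumed quantiser step; the source does not specify the residual coder

def quantise(r):
    """Round-to-nearest uniform quantiser, symmetric around zero."""
    level = (abs(r) + QSTEP // 2) // QSTEP
    return level if r >= 0 else -level

def code_enhancement_picture(p2, p5, enhancement_buffer):
    """Code input picture P2 against prediction P5 and update the buffer."""
    diff = [x - p for x, p in zip(p2, p5)]       # subtractor 109
    p6 = [quantise(d) for d in diff]             # residual coder 107: coded levels
    p7 = [level * QSTEP for level in p6]         # reconstructed residual picture P7
    # adder 111: prediction + reconstructed residual, stored in buffer 113
    enhancement_buffer.append([p + r for p, r in zip(p5, p7)])
    return p6

buffer_113 = []
p2 = [105, 97, 101, 100]    # enhancement layer input picture
p5 = [100, 100, 100, 100]   # prediction picture from the view converter
p6 = code_enhancement_picture(p2, p5, buffer_113)
```

The buffer holds the reconstructed picture rather than the input picture P2, so the coder predicts the next picture from the same data the decoder will have.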
- a multiplexer (MUX) 115 multiplexes the base layer bitstream P3, the enhancement layer bitstream P6, and the control information bitstream P10, and outputs the multiplexed bitstreams P3, P6, P10 in one bitstream.
- MUX: multiplexer
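One way to picture the multiplexing of P3, P6, and P10 into a single bitstream is a simple tagged framing. The container layout below (a 1-byte stream identifier followed by a 4-byte big-endian length) and the `STREAM_IDS` values are hypothetical; the source does not define a multiplex format.

```python
import struct

# Hypothetical stream identifiers; not defined by the source.
STREAM_IDS = {"base": 0, "enhancement": 1, "control": 2}

def multiplex(p3, p6, p10):
    """Concatenate the three bitstreams, each prefixed by (id, length)."""
    out = bytearray()
    for name, payload in (("base", p3), ("enhancement", p6), ("control", p10)):
        out += struct.pack(">BI", STREAM_IDS[name], len(payload)) + payload
    return bytes(out)

def demultiplex(bitstream):
    """Split a multiplexed bitstream back into its named component streams."""
    names = {v: k for k, v in STREAM_IDS.items()}
    streams, pos = {}, 0
    while pos < len(bitstream):
        stream_id, length = struct.unpack_from(">BI", bitstream, pos)
        pos += struct.calcsize(">BI")
        streams[names[stream_id]] = bitstream[pos:pos + length]
        pos += length
    return streams

muxed = multiplex(b"base", b"enh", b"ctl")
streams = demultiplex(muxed)
```

With framing of this kind, a 2D-only receiver could keep the `base` stream and skip the other two by their length fields.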
- the multi-view video coder 100 is compatible with any video coding method, and can be implemented in existing systems and can efficiently support multi-view video services, including 3D video services.
- FIG. 2 shows a structure of a view converter 105 in a multi-view video coder 100 according to an exemplary embodiment.
- the view converter 105 divides picture data in units of M×N pixel blocks and sequentially generates a prediction picture block by block.
- a picture type decider 1051 decides whether to use a current base layer picture P8, a currently reconstructed enhancement layer picture (hereinafter, "current enhancement layer picture") of a view different from that of the base layer, or a combination of the current base layer picture P8 and a previous enhancement layer picture P9 in generating a prediction picture, according to a Picture Type (PT).
- PT: Picture Type
- generating a prediction picture using the current enhancement layer picture may be used when the enhancement layer is plural in number.
- the picture type decider 1051 determines a reference relationship, or use, of the current base layer picture P8 and the previous enhancement layer picture P9 according to the PT of the enhancement layer's input picture P2. For example, if a PT of the enhancement layer's input picture P2 to be currently coded is an intra-picture, view conversion for generation of the prediction picture P5 may be performed using the current base layer picture P8. Furthermore, if a plurality of enhancement layers are provided and the PT is an intra-picture, view conversion for generation of the prediction picture P5 may be performed using the current enhancement layer picture.
- if the PT of the enhancement layer's input picture P2 is an inter-picture, view conversion for generation of the prediction picture P5 may be performed using the current base layer picture P8 and the previous enhancement layer picture P9.
- the PT may be given in an upper layer of the system to which the multi-view video coder of the present exemplary embodiment is applied.
- the PT may be previously determined as one of the intra-picture or the inter-picture.
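The decision made by the picture type decider 1051 can be summarised as a small lookup. This sketch assumes that the combined use of the current base layer picture and the previous enhancement layer picture corresponds to the inter-picture case; the string labels and the function name are hypothetical.

```python
def candidate_references(picture_type, num_enhancement_layers=1):
    """Return the reference pictures that view conversion may use for this PT."""
    if picture_type == "intra":
        refs = ["current_base"]
        # With several enhancement layers, a currently reconstructed picture of
        # another enhancement layer (a different view) may also be used.
        if num_enhancement_layers > 1:
            refs.append("current_enhancement_other_view")
        return refs
    if picture_type == "inter":
        return ["current_base", "previous_enhancement"]
    raise ValueError(f"unknown picture type: {picture_type!r}")

intra_refs = candidate_references("intra")
inter_refs = candidate_references("inter")
multi_layer_intra_refs = candidate_references("intra", num_enhancement_layers=2)
```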
- Based on the decision results of the picture type decider 1051, a Disparity Estimator/Motion Estimator (DE/ME) 1053 outputs a disparity vector by performing Disparity Estimation (DE) on a block basis using the current base layer picture P8, or outputs a disparity vector and a motion vector of a pertinent block by performing DE and Motion Estimation (ME) on a block basis, respectively, using the current base layer picture P8 and the previous enhancement layer picture P9. If the enhancement layer is plural in number, the DE/ME 1053 may perform DE on a block basis using the current enhancement layer picture in another enhancement layer having a view different from the view of the enhancement layer's input picture.
- the disparity vector and the motion vector may be construed to be differently named according to which reference picture(s) is used among the current base layer picture and the previous/current enhancement layer pictures, and a prediction process and a vector outputting process based on the used reference picture(s) may be performed in the same manner.
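Since disparity and motion estimation differ only in the reference picture, one block-matching routine can serve both. The sketch below uses a full search with a sum of absolute differences (SAD) cost. The cost measure, the search strategy, and the reading of M×N as M columns by N rows are assumptions; the source does not fix any of them.

```python
def sad(cur, ref, bx, by, dx, dy, m, n):
    """Sum of absolute differences between a block and its displaced reference."""
    return sum(
        abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
        for y in range(n)
        for x in range(m)
    )

def estimate_vector(cur, ref, bx, by, m, n, search):
    """Full search in a +/-search window; returns ((dx, dy), cost)."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = (0 <= by + dy and by + dy + n <= height
                      and 0 <= bx + dx and bx + dx + m <= width)
            if not inside:
                continue  # candidate block would fall outside the reference
            cost = sad(cur, ref, bx, by, dx, dy, m, n)
            if best is None or cost < best[0]:
                best = (cost, (dx, dy))
    return best[1], best[0]

# Reference picture with unique sample values; the current picture shows the
# same content displaced horizontally by two samples.
ref = [[10 * y + x for x in range(8)] for y in range(4)]
cur = [[ref[y][x + 2] if x + 2 < 8 else 0 for x in range(8)] for y in range(4)]

vector, cost = estimate_vector(cur, ref, bx=2, by=1, m=2, n=2, search=2)
```

Passing the current base layer picture as `ref` yields a disparity vector, and passing the previous enhancement layer picture yields a motion vector.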
- the view converter 105 performs view conversion in units of macro blocks, or M×N pixel blocks.
- the DE/ME 1053 may output at least one of a disparity vector and a motion vector on an M×N pixel block basis.
- the DE/ME 1053 may divide each M×N pixel block into K partitions in various methods and output K disparity vectors and/or motion vectors.
- the DE/ME 1053 may output one disparity vector or motion vector in every 16×16 pixel block.
- the DE/ME 1053 may selectively output 1K disparity vectors or motion vectors on a 16×16 pixel block basis, or output 4K disparity vectors or motion vectors on an 8×8 pixel block basis.
- a mode selector 1055 determines whether to reference the current base layer picture or the previous enhancement layer picture in performing compensation on an M×N pixel block, a prediction picture of which is to be generated. If the enhancement layer is plural in number, the mode selector 1055 determines whether to reference the current enhancement layer picture in performing compensation in another enhancement layer having a view different from that of the enhancement layer.
- the mode selector 1055 selects an optimal mode from among a DE mode and an ME mode to perform Disparity Compensation (DC) on the current M×N pixel block according to the DE mode using a disparity vector, or to perform Motion Compensation (MC) on the current M×N pixel block according to the ME mode using a motion vector.
- the mode selector 1055 may divide an M×N pixel block into a plurality of partitions and determine whether to use a plurality of disparity vectors or a plurality of motion vectors. The determined information may be delivered to a multi-view video decoder with the prediction picture's control information to be described later. The number of divided partitions may be determined by default.
- a Disparity Compensator/Motion Compensator (DC/MC) 1057 generates a prediction picture P5 by performing DC or MC according to whether a mode with a minimum prediction cost, which is selected in the mode selector 1055, is the DE mode or the ME mode. If the mode selected in the mode selector 1055 is the DE mode, the DC/MC 1057 generates the prediction picture P5 by compensating the M×N pixel block using a disparity vector in the current base layer picture. If the selected mode is the ME mode, the DC/MC 1057 generates the prediction picture P5 by compensating the M×N pixel block using a motion vector in the previous enhancement layer picture.
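The compensation step can be pictured as copying a displaced block out of the selected reference picture. The following is a minimal sketch under that reading: `compensate_block` and `predict_block` are hypothetical names, vectors are integer (dx, dy) pairs, and sub-pixel interpolation and boundary handling are left out.

```python
def compensate_block(ref, bx, by, vector, width, height):
    """Return the width x height block of `ref` located at (bx, by)
    displaced by `vector` = (dx, dy). The copy is the same for DC and MC."""
    dx, dy = vector
    return [row[bx + dx: bx + dx + width]
            for row in ref[by + dy: by + dy + height]]


def predict_block(mode, base_pic, prev_enh_pic, bx, by, vector, width, height):
    """DE mode: disparity compensation from the current base layer picture.
    ME mode: motion compensation from the previous enhancement layer picture."""
    if mode == "DE":
        return compensate_block(base_pic, bx, by, vector, width, height)
    if mode == "ME":
        return compensate_block(prev_enh_pic, bx, by, vector, width, height)
    raise ValueError("mode must be 'DE' or 'ME'")


# Toy 4x4 reference pictures with easily recognisable sample values.
base = [[10 * y + x for x in range(4)] for y in range(4)]
prev_enh = [[100 + 10 * y + x for x in range(4)] for y in range(4)]
dc_block = predict_block("DE", base, prev_enh, 1, 1, (1, 0), 2, 2)
mc_block = predict_block("ME", base, prev_enh, 1, 1, (-1, -1), 2, 2)
```

Only the reference picture differs between the two modes, which is why a single compensator unit can serve both.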
- mode information indicating whether the selected mode is the DE mode or the ME mode may be delivered to the multi-view video decoder in the form of flag information, for example.
- An entropy coder 1059 entropy-codes the mode information and the prediction picture's control information including disparity vector information or motion vector information, for each block in which a prediction picture is generated, and outputs the coded information in a control information bitstream P10.
- the control information bitstream P10 may be delivered to the multi-view video decoder after being inserted into a picture header of the enhancement layer bitstream P6.
- the disparity vector information and the motion vector information in the prediction picture's control information may be inserted into the control information bitstream P10 using the same syntax during entropy coding.
- a multi-view video coding method will now be described with reference to FIGs. 3 and 4.
- FIG. 3 shows a multi-view video coding method according to an exemplary embodiment.
- a base layer coder 101 outputs a base layer bitstream by coding a base layer's input picture of a first view using a codec.
- the base layer coder 101 reconstructs the coded base layer picture, and stores the reconstructed base layer picture in a base layer buffer 103. It is assumed that at a prior time, a residual coder 107 residual-coded a previous input picture in an enhancement layer of a second view, reconstructed the coded enhancement layer picture, and output the reconstructed enhancement layer picture. Therefore, the previously reconstructed enhancement layer picture has been stored in an enhancement layer buffer 113 after being added to the prediction picture that was previously generated by the view converter 105.
- a view converter 105 receives the reconstructed base layer picture and the reconstructed enhancement layer picture from the base layer buffer 103 and the enhancement layer buffer 113, respectively. Thereafter, the view converter 105 generates a prediction picture that is view-converted with respect to an enhancement layer's input picture using at least one of the reconstructed base layer picture and the reconstructed enhancement layer picture. As described above, the view converter 105 may generate the prediction picture using the current base layer picture, or generate the prediction picture using the current base layer picture and the previous enhancement layer picture in the enhancement layer.
- the residual coder 107 residual-codes picture data obtained by subtracting the prediction picture from the enhancement layer's input picture of the second view, and outputs the coded enhancement layer picture.
- a multiplexer 115 multiplexes the base layer picture coded in step 301 and the enhancement layer picture coded in step 305, and outputs the multiplexed pictures in a bitstream.
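The enhancement layer path of FIG. 3 can be sketched as: subtract the prediction picture, residual-code the difference, then rebuild the picture from the coded residual and store it for later prediction. In the sketch below the residual codec is replaced by a toy uniform quantiser (`toy_residual_codec`, step size 4); this stand-in and all function names are assumptions for illustration only, since the residual coder itself is not specified here.

```python
def subtract(picture, prediction):
    """Residual = input picture minus prediction picture, sample by sample."""
    return [[p - q for p, q in zip(prow, qrow)]
            for prow, qrow in zip(picture, prediction)]


def add(residual, prediction):
    """Reconstruction = decoded residual plus prediction picture."""
    return [[r + q for r, q in zip(rrow, qrow)]
            for rrow, qrow in zip(residual, prediction)]


def toy_residual_codec(residual, step=4):
    """Hypothetical lossy stand-in for residual coding followed by its
    reconstruction: each sample is rounded to a multiple of `step`."""
    return [[step * round(v / step) for v in row] for row in residual]


enh_input = [[51, 60], [46, 55]]    # enhancement layer input picture (second view)
prediction = [[50, 53], [49, 50]]   # prediction picture from the view converter

residual = subtract(enh_input, prediction)
coded = toy_residual_codec(residual)
reconstructed = add(coded, prediction)
enhancement_buffer = reconstructed  # kept for generating the next prediction
```

Storing the reconstructed picture rather than the original input keeps the coder's reference identical to what a decoder can rebuild from the bitstream.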
- Although the number of enhancement layers is assumed to be one in the example of FIG. 3, the enhancement layer may be plural in number.
- the prediction picture may be generated using the current base layer picture and the previous enhancement layer picture, or the prediction picture may be generated using the current enhancement layer picture in another enhancement layer having a view different from that of the enhancement layer.
- FIG. 4 shows a view conversion method performed in a multi-view video coder according to an exemplary embodiment.
- a macro block processed during generation of a prediction picture is a 16×16 pixel block, though it is understood that this size is merely exemplary and another exemplary embodiment is not limited thereto.
- a picture type decider 1051 decides whether a PT of an input picture to be currently coded in the enhancement layer is an intra-picture or an inter-picture. If the PT is determined as an intra-picture in step 401, a DE/ME 1053 calculates, in step 403, a prediction cost of each pixel block by performing DE on a 16×16 pixel block basis and an 8×8 pixel block basis, using the current base layer picture as a reference picture.
- If the PT is determined as an inter-picture, the DE/ME 1053 calculates, in step 405, a prediction cost of each pixel block by performing DE and ME on a 16×16 pixel block basis and an 8×8 pixel block basis each, using the current base layer picture and the previous enhancement layer picture as reference pictures.
- the prediction cost calculated in steps 403 and 405 refers to a difference between the current input picture block and a block that corresponds to the current input picture block based on a disparity vector or a motion vector.
- Examples of the prediction cost include Sum of Absolute Differences (SAD), Sum of Squared Differences (SSD), etc.
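For concreteness, the two cost measures named above can be computed as follows. This is a minimal sketch in which blocks are equally sized lists of rows; the function names `sad` and `ssd` are chosen for the example.

```python
def sad(block_a, block_b):
    """Sum of Absolute Differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def ssd(block_a, block_b):
    """Sum of Squared Differences between two equally sized blocks."""
    return sum((a - b) ** 2
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


current_block = [[10, 12], [14, 16]]    # block of the input picture
candidate_block = [[9, 12], [17, 14]]   # block pointed to by a candidate vector
sad_cost = sad(current_block, candidate_block)  # 1 + 0 + 3 + 2
ssd_cost = ssd(current_block, candidate_block)  # 1 + 0 + 9 + 4
```

SSD penalises large individual errors more heavily than SAD, while SAD avoids multiplications.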
- a mode selector 1055 selects, in step 407, the DE mode having a minimum prediction cost by comparing a prediction cost obtained by performing DE on a 16×16 pixel block with a prediction cost obtained by performing DE on an 8×8 pixel block in the 16×16 pixel block.
- the mode selector 1055 determines whether a mode having the minimum prediction cost is the DE mode or the ME mode, by comparing a prediction cost obtained by performing DE on a 16×16 pixel block, a prediction cost obtained by performing DE on an 8×8 pixel block in the 16×16 pixel block, a prediction cost obtained by performing ME on a 16×16 pixel block, and a prediction cost obtained by performing ME on an 8×8 pixel block in the 16×16 pixel block.
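- If the mode having the minimum prediction cost is the DE mode, the mode selector 1055 sets flag information "VIEW_PRED_FLAG" to 1.
- If the mode having the minimum prediction cost is the ME mode, the mode selector 1055 sets "VIEW_PRED_FLAG" to 0.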
- If "VIEW_PRED_FLAG" is determined as 1 in step 409, a DC/MC 1057 performs DC from the current base layer picture using a disparity vector on a 16×16 pixel block basis or an 8×8 pixel block basis, which was generated by DE, in step 411. If "VIEW_PRED_FLAG" is determined as 0 in step 409, the DC/MC 1057 performs MC from the previous enhancement layer picture using a motion vector on a 16×16 pixel block basis or an 8×8 pixel block basis, which was generated by ME, in step 413. In this manner, "VIEW_PRED_FLAG" may indicate which of the base layer picture and the enhancement layer picture is referenced in a process of generating a prediction picture.
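The selection logic of steps 407 to 413 can be sketched as a minimum over candidate costs. In the example below, `select_mode` is a hypothetical helper; the costs are assumed to be already aggregated per 16×16 macro block (for the 8×8 case, summed over its 8×8 blocks), and a flag value of 1 is taken to mean disparity compensation from the base layer picture, consistent with a flag of 0 selecting motion compensation.

```python
def select_mode(costs, intra_picture):
    """Pick the candidate with the minimum prediction cost.

    costs: dict mapping (mode, block_size) -> cost, with mode "DE" or "ME".
    For an intra-picture only the DE candidates are considered, because the
    previous enhancement layer picture may not be referenced.
    Returns (mode, block_size, view_pred_flag).
    """
    candidates = {key: cost for key, cost in costs.items()
                  if key[0] == "DE" or not intra_picture}
    (mode, block_size), _ = min(candidates.items(), key=lambda item: item[1])
    view_pred_flag = 1 if mode == "DE" else 0  # 1: base layer, 0: previous enhancement layer
    return mode, block_size, view_pred_flag


# Made-up costs for one macro block.
costs = {("DE", 16): 120, ("DE", 8): 95, ("ME", 16): 80, ("ME", 8): 88}
inter_choice = select_mode(costs, intra_picture=False)
intra_choice = select_mode(costs, intra_picture=True)
```

With these made-up costs, the inter-picture case picks motion estimation on the 16×16 block, while the intra-picture case falls back to the best disparity candidate.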
- an entropy coder 1059 entropy-codes, in step 415, information about the disparity vector or the motion vector calculated by the DE/ME 1053 and information about the mode selected by the mode selector 1055, and outputs the results in a bitstream.
- the entropy coder 1059 entropy-codes "VIEW_PRED_FLAG" and mode information about use/non-use of the disparity vector or motion vector on a 16×16 pixel block basis or an 8×8 pixel block basis, and performs entropy coding on the disparity vector or motion vector as many times as the number of disparity vectors or motion vectors.
- the entropy coding on the disparity vector or motion vector is achieved by coding a differential value obtained by subtracting a prediction value of the disparity vector or motion vector from the actual vector value.
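A small sketch of differential vector coding follows. It assumes the sign convention differential = actual vector minus predicted vector, which is the convention under which a decoder recovers the vector by adding the differential to the prediction value. How the prediction value is derived is not covered here, so it is simply passed in; both function names are hypothetical.

```python
def vector_difference(actual, predicted):
    """Coder side: differential value that would be entropy-coded."""
    return (actual[0] - predicted[0], actual[1] - predicted[1])


def reconstruct_vector(differential, predicted):
    """Decoder side: add the differential back to the prediction value."""
    return (differential[0] + predicted[0], differential[1] + predicted[1])


actual_vector = (5, -2)      # vector found by DE or ME
predicted_vector = (3, 1)    # prediction value (assumed known to both sides)
differential = vector_difference(actual_vector, predicted_vector)
recovered = reconstruct_vector(differential, predicted_vector)
```

Because neighbouring blocks tend to have similar vectors, the differential is usually small and cheaper to entropy-code than the vector itself.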
- If the enhancement layer's input picture to be currently coded is an intra-picture, coding of "VIEW_PRED_FLAG" may be omitted: to guarantee random access, the previous picture cannot be referenced, so only DC from the base layer's picture may be used.
- the multi-view video decoder may perform DC by checking a header of an enhancement layer bitstream, indicating that the enhancement layer picture is an intra-picture.
- the view converter 105 goes to the next block in step 417, and steps 401 to 415 are performed on each block of the enhancement layer's input picture to be currently coded.
- a structure and operation of a multi-view video decoder according to an exemplary embodiment will now be described in detail.
- the exemplary embodiment described below uses both a reconstructed current base layer picture and a reconstructed previous enhancement layer picture during view conversion, and the number of enhancement layers is 1.
- another exemplary embodiment is not limited thereto.
- FIG. 5 shows a structure of a multi-view video decoder 500 according to an exemplary embodiment.
- a demultiplexer 501 demultiplexes a bitstream coded by a multi-view video coder 100 into a base layer bitstream Q1, an enhancement layer bitstream Q2, and a control information bitstream Q3 used during decoding of an enhancement layer picture. Furthermore, the demultiplexer 501 provides the base layer bitstream Q1 to a base layer decoder 503, the enhancement layer bitstream Q2 to a residual decoder 505, and the control information bitstream Q3 to a view converter 507.
- the base layer decoder 503 outputs a base layer picture Q4 of a first view by decoding the base layer bitstream Q1 using a scheme corresponding to a video codec used in the base layer coder 101.
- the base layer picture Q4 of the first view is stored in a base layer buffer 509 as a currently reconstructed base layer picture (hereinafter, "current base layer picture") Q5.
- the view converter 507 receives a previously reconstructed enhancement layer picture (hereinafter, "previous enhancement layer picture") Q9 from the enhancement layer buffer 513.
- the buffers 509, 513 may be realized in a single buffer according to another exemplary embodiment.
- the view converter 507 receives the current base layer picture Q8 and the previous enhancement layer picture Q9 from the base layer buffer 509 and the enhancement layer buffer 513, respectively, and generates a prediction picture Q6 that is view-converted at the present time.
- the prediction picture Q6 is added to the current enhancement layer picture, which is residual-decoded by the residual decoder 505, using the adder 511, and then output to the enhancement layer buffer 513.
- the currently reconstructed enhancement layer picture stored in the enhancement layer buffer 513 is output as a reconstructed enhancement layer picture Q7 of a second view. Subsequently, the currently reconstructed enhancement layer picture may be provided to the view converter 507 as the previous enhancement layer picture so as to be used to generate a next prediction picture.
- the multi-view video decoder 500 may support the existing 2D video services with one decoded view by decoding only the base layer bitstream. Although only one enhancement layer is shown in the example of FIG. 5, the multi-view video decoder 500 may support multi-view video services if the multi-view video decoder 500 outputs decoded views #1~N by decoding N enhancement layer bitstreams having different views along with the base layer bitstream. Based on the structure of FIG. 5, the scalability feature for various views may also be provided.
- FIG. 6 shows a structure of the view converter 507 in a multi-view video decoder 500 according to an exemplary embodiment.
- the view converter 507 divides picture data in units of M×N pixel blocks, and sequentially generates a prediction picture block by block.
- a picture type decider 5071 decides whether to use a current base layer picture, a currently reconstructed enhancement layer picture (hereinafter, "current enhancement layer picture") of a different view, or a combination of the current base layer picture and a previous enhancement layer picture in generating a prediction picture, according to the PT.
- generating a prediction picture using the current enhancement layer picture may be used when the enhancement layer is plural in number.
- the PT may be included in header information of the enhancement layer bitstream Q2 input to the residual decoder 505, and may be acquired from the header information by an upper layer of a system to which the multi-view video decoder of the present exemplary embodiment is applied.
- the picture type decider 5071 determines a reference relationship, or use, of the current base layer picture Q8 and the previous enhancement layer picture Q9 according to the PT. For example, if a PT of the enhancement layer bitstream Q2 to be currently decoded is an intra-picture, view conversion for generation of the prediction picture Q6 may be performed using only the current base layer picture Q8. Furthermore, if a plurality of enhancement layers are provided and the PT is an intra-picture, view conversion for generation of the prediction picture Q6 may be performed using the current enhancement layer picture.
- view conversion for generation of the prediction picture Q6 may be performed using the current base layer picture Q8 and the previous enhancement layer picture Q9.
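The reference relationship decided from the PT can be summarised as a small lookup. The sketch below is illustrative only: `select_references` and the string labels are hypothetical, and for the intra-picture case with several enhancement layers the returned list is read as the set of pictures that may be referenced, not as a requirement to use all of them.

```python
def select_references(picture_type, other_view_available=False):
    """Return the reconstructed pictures the view converter may reference.

    picture_type: "intra" or "inter" (the PT taken from the picture header).
    other_view_available: True when several enhancement layers exist and a
    currently reconstructed picture of another view can be referenced.
    """
    if picture_type == "intra":
        refs = ["current_base_layer"]
        if other_view_available:
            refs.append("current_enhancement_other_view")
        return refs
    if picture_type == "inter":
        return ["current_base_layer", "previous_enhancement_layer"]
    raise ValueError("picture_type must be 'intra' or 'inter'")


intra_refs = select_references("intra")
inter_refs = select_references("inter")
```

Excluding the previous enhancement layer picture for intra-pictures is what allows decoding to start at such a picture.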
- An entropy decoder 5073 entropy-decodes the control information bitstream Q3 received from the demultiplexer 501, and outputs the decoded prediction picture's control information to a DC/MC 5075.
- the prediction picture's control information includes mode information and at least one of disparity and motion information corresponding to each of the M×N pixel blocks.
- the mode information includes at least one of information indicating whether the DC/MC 5075 will perform DC using a disparity vector or perform MC using a motion vector in the current M×N pixel block, information indicating the number of disparity vectors or motion vectors that the DC/MC 5075 will select in each M×N pixel block, etc.
- Based on the prediction picture's control information, if the mode having the minimum prediction cost, selected during coding, is the DC mode, the DC/MC 5075 generates a prediction picture Q6 by performing DC using a disparity vector of the current base layer picture which is identical in time to the enhancement layer's picture to be decoded. Conversely, if the mode having the minimum prediction cost is the MC mode, the DC/MC 5075 generates a prediction picture Q6 by performing MC using a motion vector of the previous enhancement layer picture.
- a multi-view video decoding method will now be described with reference to FIGs. 7 and 8.
- FIG. 7 shows a multi-view video decoding method according to an exemplary embodiment.
- a multi-view video decoder 500 receives a bitstream coded by a multi-view video coder 100 (for example, the multi-view video coder 100 illustrated in FIG. 1).
- the input bitstream is demultiplexed into a base layer bitstream, an enhancement layer bitstream, and a control information bitstream by the demultiplexer 501.
- a base layer decoder 503 receives the base layer bitstream, and reconstructs a base layer picture of a first view by decoding the base layer bitstream using a scheme corresponding to a codec used in a base layer coder 101 of the multi-view video coder 100.
- the base layer decoder 503 stores the base layer picture reconstructed by decoding in a base layer buffer 509.
- a residual decoder 505 receives a current enhancement layer picture and residual-decodes the received current enhancement layer picture. It is assumed that an enhancement layer picture previously reconstructed by residual decoding and a prediction picture previously generated by a view converter 507 were previously added by an adder 511 and stored in an enhancement layer buffer 513 in advance.
- the view converter 507 receives the reconstructed base layer picture and the reconstructed enhancement layer picture from the base layer buffer 509 and the enhancement layer buffer 513, respectively.
- the view converter 507 generates a prediction picture which is view-converted with respect to the enhancement layer's input picture using at least one of the reconstructed base layer picture and the reconstructed enhancement layer picture.
- the view converter 507 may generate the prediction picture using the current base layer picture, or generate the prediction picture using the current base layer picture and the previous enhancement layer picture in the enhancement layer.
- the adder 511 reconstructs an enhancement layer picture of a second view by adding the prediction picture generated in step 703 to the current enhancement layer picture residual-decoded by the residual decoder 505.
- the currently reconstructed enhancement layer picture of the second view is stored in the enhancement layer buffer 513, and may be used as a previous enhancement layer picture when a next prediction picture is generated.
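The adder and buffer interaction on the decoder side can be sketched as below. `EnhancementLayerReconstructor` is a hypothetical class name, pictures are lists of rows, and sample clipping is omitted. In the usage lines the second prediction simply reuses the buffered picture unchanged, a toy stand-in for motion compensation with zero vectors.

```python
def add_pictures(prediction, residual):
    """Adder: prediction picture plus residual-decoded picture."""
    return [[p + r for p, r in zip(prow, rrow)]
            for prow, rrow in zip(prediction, residual)]


class EnhancementLayerReconstructor:
    """Keeps the most recently reconstructed enhancement layer picture."""

    def __init__(self):
        self.enhancement_buffer = None  # previous enhancement layer picture

    def reconstruct(self, prediction, residual):
        picture = add_pictures(prediction, residual)
        self.enhancement_buffer = picture  # available for the next prediction
        return picture


reconstructor = EnhancementLayerReconstructor()
first = reconstructor.reconstruct([[50, 53], [49, 50]], [[0, 8], [-4, 4]])
second = reconstructor.reconstruct(reconstructor.enhancement_buffer,
                                   [[1, -1], [0, 2]])
```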
- the enhancement layer may be plural in number so as to correspond to the number of enhancement layers in the multi-view video coder 100.
- the prediction picture may be generated using the current base layer picture and the previous enhancement layer picture, or the prediction picture may be generated using the current enhancement layer picture in another enhancement layer having a view different from that of the enhancement layer.
- Although decoding of the base layer picture and decoding of the enhancement layer picture are sequentially illustrated in the example of FIG. 7, it is understood that decoding of the base layer picture and decoding of the enhancement layer picture may be performed in parallel.
- FIG. 8 shows a view conversion method performed in a multi-view video decoder according to an exemplary embodiment.
- a macro block processed during generation of a prediction picture is a 16×16 pixel block, though it is understood that this size is merely exemplary and another exemplary embodiment is not limited thereto.
- a picture type decider 5071 determines whether a PT of an enhancement layer's input picture to be currently decoded is an intra-picture or an inter-picture.
- an entropy decoder 5073 performs entropy decoding according to the determined PT.
- the entropy decoder 5073 entropy-decodes, from a control information bitstream, "VIEW_PRED_FLAG," mode information about use/non-use of a disparity vector or a motion vector on a 16×16 pixel basis or an 8×8 pixel basis, and prediction picture control information including disparity vector information or motion vector information, for each block for which a prediction picture is generated.
- If the PT is an intra-picture, the entropy decoder 5073 may entropy-decode the remaining prediction picture control information in the same manner, omitting decoding of "VIEW_PRED_FLAG."
- the VIEW_PRED_FLAG, decoding of which is omitted, may be set to 1.
- the entropy decoder 5073 entropy-decodes mode information about use/non-use of a disparity vector or a motion vector, and performs entropy decoding on the disparity vector or motion vector as many times as the number of disparity vectors or motion vectors.
- the decoding results on the disparity vectors or motion vectors include a differential value of the disparity vectors or the motion vectors.
- the entropy decoder 5073 generates a disparity vector or a motion vector by adding the differential value to a prediction value of the disparity vector or the motion vector, and outputs the results to a DC/MC 5075.
- in step 806, the DC/MC 5075 receives the PT determined in step 801 and the "VIEW_PRED_FLAG" and the disparity vector or motion vector calculated in step 803, and checks the value of "VIEW_PRED_FLAG."
- a view converter 507 goes to the next block in step 811 so that steps 801 to 809 are performed on each block of the enhancement layer's picture to be currently decoded.
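The per-block loop on the decoder side can be sketched as follows. It assumes that the flag carries the same meaning as on the coder side (1 selects disparity compensation from the current base layer picture, 0 selects motion compensation from the previous enhancement layer picture) and that an untransmitted flag in an intra-picture defaults to 1. `build_prediction`, the `block_info` layout, and the 2×2 block size are hypothetical choices for the example; vectors are integer pairs and must stay inside the reference picture.

```python
def build_prediction(base_pic, prev_enh_pic, block_info, block=2, intra_picture=False):
    """Assemble a prediction picture block by block.

    block_info maps (bx, by) -> (view_pred_flag, (dx, dy)); the flag may be
    None when it was not transmitted.
    """
    height, width = len(base_pic), len(base_pic[0])
    prediction = [[0] * width for _ in range(height)]
    for (bx, by), (flag, (dx, dy)) in block_info.items():
        if intra_picture or flag is None:
            flag = 1  # flag omitted for intra-pictures: DC is implied
        ref = base_pic if flag == 1 else prev_enh_pic
        for y in range(block):
            for x in range(block):
                prediction[by + y][bx + x] = ref[by + dy + y][bx + dx + x]
    return prediction


base = [[1, 2, 3, 4], [5, 6, 7, 8]]
prev_enh = [[11, 12, 13, 14], [15, 16, 17, 18]]
info = {(0, 0): (1, (1, 0)),    # DC from the base layer picture, shifted right
        (2, 0): (0, (-1, 0))}   # MC from the previous enhancement layer picture
inter_prediction = build_prediction(base, prev_enh, info)
intra_prediction = build_prediction(base, prev_enh, info, intra_picture=True)
```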
- the multi-view video coder and decoder having a single enhancement layer have been described by way of example. It is understood that when a multi-view video service having N (where N is a natural number greater than or equal to 3) views is provided, the multi-view video coder and decoder may be extended to have N enhancement layers according to other exemplary embodiments, as shown in FIGs. 9 and 10, respectively.
- FIG. 9 shows an exemplary structure of a multi-view video coder 900 with N enhancement layers according to another exemplary embodiment.
- FIG. 10 shows an exemplary structure of a multi-view video decoder 1000 with N enhancement layers according to another exemplary embodiment.
- the multi-view video coder 900 includes first to N-th enhancement layer coding blocks 900_1~900_N corresponding to N enhancement layers.
- the first to N-th enhancement layer coding blocks 900_1~900_N are the same or similar in structure, and each of the first to N-th enhancement layer coding blocks 900_1~900_N codes its associated enhancement layer's input picture using a view-converted prediction picture according to an exemplary embodiment.
- Each enhancement layer coding block outputs the above-described control information bitstream and enhancement layer bitstream as coding results, for its associated enhancement layer (901).
- the enhancement layer coding blocks are the same or similar in structure and operation as those described in FIG. 1, and a detailed description thereof is therefore omitted herein.
- the multi-view video decoder 1000 includes first to N-th enhancement layer decoding blocks 1000_1~1000_N corresponding to N enhancement layers.
- the first to N-th enhancement layer decoding blocks 1000_1~1000_N are the same or similar in structure, and each of the first to N-th enhancement layer decoding blocks 1000_1~1000_N decodes its associated enhancement layer bitstream using a view-converted prediction picture according to an exemplary embodiment.
- Each enhancement layer decoding block receives the above-described control information bitstream and enhancement layer bitstream to decode its associated enhancement layer picture 1001.
- the enhancement layer decoding blocks are the same or similar in structure and operation as those described in FIG. 5, and a detailed description thereof is therefore omitted herein.
- the multi-view video coder 900 and decoder 1000 of FIGs. 9 and 10 each use a reconstructed base layer picture P4 in each enhancement layer during generation of a prediction picture.
- the multi-view video coder 900 and decoder 1000 may be adapted to use a currently reconstructed enhancement layer picture of a view different from that of the associated enhancement layer, rather than using the reconstructed base layer picture P4 in each enhancement layer during generation of a prediction picture.
- the multi-view video coder 900 and decoder 1000 may be adapted to use a currently reconstructed enhancement layer picture in an enhancement layer n-1, replacing the reconstructed base layer picture P4, when generating a prediction picture in an enhancement layer n, or to use the reconstructed picture in each of enhancement layers n-1 and n+1 when generating a prediction picture in an enhancement layer n.
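The inter-layer reference options described above can be expressed as a small selection function. This is a sketch under stated assumptions: layers are numbered 1 to N with index 0 standing for the base layer (so that layer 1's "n-1" neighbour is the base layer), and the scheme names `"base"`, `"previous"`, and `"neighbors"` as well as the function `reference_layers` are invented for the example.

```python
def reference_layers(n, num_layers, scheme="base"):
    """Return the layer indices whose reconstructed picture of the current
    time instant is referenced when predicting enhancement layer n."""
    if not 1 <= n <= num_layers:
        raise ValueError("n must be between 1 and num_layers")
    if scheme == "base":
        return [0]                    # always the reconstructed base layer picture
    if scheme == "previous":
        return [n - 1]                # layer n-1 replaces the base layer picture
    if scheme == "neighbors":
        return [k for k in (n - 1, n + 1) if k <= num_layers]
    raise ValueError("unknown scheme")


base_refs = reference_layers(2, 3, "base")
previous_refs = reference_layers(2, 3, "previous")
neighbor_refs = reference_layers(2, 3, "neighbors")
edge_refs = reference_layers(3, 3, "neighbors")   # layer N has no layer n+1
```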
- exemplary embodiments can also be embodied as computer-readable code on a computer-readable recording medium.
- the computer-readable recording medium is any data storage device that can store data that can be thereafter read by a computer system. Examples of the computer-readable recording medium include read-only memory (ROM), random-access memory (RAM), CD-ROMs, magnetic tapes, floppy disks, and optical data storage devices.
- the computer-readable recording medium can also be distributed over network-coupled computer systems so that the computer-readable code is stored and executed in a distributed fashion.
- exemplary embodiments may be written as computer programs transmitted over a computer-readable transmission medium, such as a carrier wave, and received and implemented in general-use or special-purpose digital computers that execute the programs.
- one or more units of the coder 100, 900 and decoder 500, 1000 can include a processor or microprocessor executing a computer program stored in a computer-readable medium.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Databases & Information Systems (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
- Testing, Inspecting, Measuring Of Stereoscopic Televisions And Televisions (AREA)
Claims (21)
- A multi-view video coding method for providing a multi-view video service, the multi-view video coding method comprising: coding a base layer picture using an arbitrary video codec; generating a prediction picture using at least one of a reconstructed base layer picture, which is reconstructed from the coded base layer picture, and a reconstructed layer picture corresponding to a view different from a view of the base layer picture; and residual-coding a layer picture corresponding to the different view using the generated prediction picture.
- The multi-view video coding method of claim 1, wherein the generating the prediction picture comprises generating the prediction picture according to a picture type.
- The multi-view video coding method of claim 1, wherein: the view of the base layer picture is a left view of a three-dimensional (3D) image and the view of the layer picture is a right view of the 3D image, or the view of the layer picture is the right view and the view of the base layer picture is the left view.
- The multi-view video coding method of claim 1, wherein the residual-coding the layer picture comprises: obtaining picture data by subtracting the generated prediction picture from the layer picture; and residual-coding the obtained picture data.
- A multi-view video coding apparatus for providing a multi-view video service, the multi-view video coding apparatus comprising: a base layer coder which codes a base layer picture using an arbitrary video codec; a view converter which generates a prediction picture using at least one of a reconstructed base layer picture, which is reconstructed from the coded base layer picture, and a reconstructed layer picture corresponding to a view different from a view of the base layer picture; and a residual coder which residual-codes a layer picture corresponding to the different view using the generated prediction picture.
- A multi-view video decoding method for providing a multi-view video service, the multi-view video decoding method comprising: reconstructing a base layer picture using an arbitrary video codec; generating a prediction picture using at least one of the reconstructed base layer picture and a reconstructed layer picture corresponding to a view different from a view of the base layer picture; and reconstructing a layer picture corresponding to the different view using a residual-decoded layer picture and the generated prediction picture.
- The multi-view video coding method of claim 1 or the multi-view video decoding method of claim 6, wherein the generating the prediction picture comprises generating the prediction picture according to flag information indicating which of the reconstructed base layer picture and the reconstructed layer picture is to be used to generate the prediction picture.
- The multi-view video coding method of claim 1 or the multi-view video decoding method of claim 6, wherein the generating the prediction picture comprises: when the reconstructed base layer picture is used to generate the prediction picture, performing Disparity Compensation (DC) from the reconstructed base layer picture.
- The multi-view video coding method of claim 1 or the multi-view video decoding method of claim 6, wherein the generating the prediction picture comprises: when the reconstructed layer picture is used to generate the prediction picture, performing Motion Compensation (MC) from the reconstructed layer picture.
- The multi-view video coding method of claim 1 or the multi-view video decoding method of claim 6, wherein the generating the prediction picture comprises: generating the prediction picture using a disparity vector when a picture type is an intra-picture; and generating the prediction picture using a motion vector when the picture type is an inter-picture.
- A multi-view video decoding apparatus for providing a multi-view video service, the multi-view video decoding apparatus comprising: a base layer decoder which reconstructs a base layer picture using an arbitrary video codec; a view converter which generates a prediction picture using at least one of the reconstructed base layer picture and a reconstructed layer picture corresponding to a view different from a view of the base layer picture; a residual decoder which residual-decodes a layer picture corresponding to the different view; and a combiner which reconstructs the layer picture corresponding to the different view by adding the generated prediction picture to the residual-decoded layer picture.
- The multi-view video coding method of claim 1, the multi-view video coding apparatus of claim 5, the multi-view video decoding method of claim 6, or the multi-view video decoding apparatus of claim 11, wherein the reconstructed layer picture is a previously reconstructed layer picture.
- The multi-view video coding method of claim 1, the multi-view video coding apparatus of claim 5, the multi-view video decoding method of claim 6, or the multi-view video decoding apparatus of claim 11, wherein the reconstructed layer picture is a currently reconstructed layer picture.
- The multi-view video coding apparatus of claim 5 or the multi-view video decoding apparatus of claim 11, wherein the view converter comprises a disparity compensator which performs Disparity Compensation (DC) from the reconstructed base layer picture, when the reconstructed base layer picture is used to generate the prediction picture.
- The multi-view video coding apparatus of claim 5 or the multi-view video decoding apparatus of claim 11, wherein the view converter generates the prediction picture according to flag information indicating which of the reconstructed base layer picture and the reconstructed layer picture is to be used to generate the prediction picture.
- The multi-view video coding apparatus of claim 5 or the multi-view video decoding apparatus of claim 11, wherein the view converter comprises a motion compensator which performs Motion Compensation (MC) from the reconstructed layer picture, when the reconstructed layer picture is used to generate the prediction picture.
- The multi-view video coding method of claim 1, the multi-view video coding apparatus of claim 5, the multi-view video decoding method of claim 6, or the multi-view video decoding apparatus of claim 11, wherein if the multi-view system implements a plurality of layer pictures corresponding to a plurality of different views, a plurality of prediction pictures are generated to correspond to the plurality of layer pictures.
- The multi-view video coding apparatus of claim 5 or the multi-view video decoding apparatus of claim 11, wherein the view converter generates the prediction picture using a disparity vector when a picture type is an intra-picture, and generates the prediction picture using a motion vector when the picture type is an inter-picture.
- A computer readable recording medium having recorded thereon a program executable by a computer for performing the method of claim 1.
- A computer readable recording medium having recorded thereon a program executable by a computer for performing the method of claim 6.
- A multi-view video providing system comprising: a multi-view video coding apparatus, comprising: a base layer coder which codes a base layer picture using an arbitrary video codec, a view converter which generates a prediction picture using at least one of a reconstructed base layer picture, which is reconstructed from the coded base layer picture, and a reconstructed layer picture corresponding to a view different from a view of the base layer picture, a residual coder which residual-codes a layer picture corresponding to the different view using the generated prediction picture, and a multiplexer which multiplexes the coded base layer picture and the residual-coded layer picture into a bitstream, and outputs the bitstream; and a multi-view video decoding apparatus comprising: a demultiplexer which receives and demultiplexes the output bitstream into a base layer bitstream and a layer bitstream, a base layer decoder which reconstructs the base layer picture from the base layer bitstream using a video codec corresponding to the arbitrary video codec, a view converter which generates the prediction picture using at least one of the reconstructed base layer picture and the reconstructed layer picture corresponding to the different view, a residual decoder which residual-decodes the layer bitstream to output a residual-decoded layer picture, and a combiner which reconstructs the layer picture corresponding to the different view by adding the generated prediction picture to the residual-decoded layer picture.
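The relationship between the view converter, the residual coder, and the combiner recited in the claims above can be illustrated with a minimal sketch. It is not the claimed implementation: the names `shift_picture`, `view_convert`, `residual_code`, and `combine` are hypothetical, disparity compensation and motion compensation are both reduced to a single global displacement vector (a real codec would typically use block-wise vectors), and the residual is kept lossless, with no transform or quantization, purely for illustration.

```python
from typing import List, Tuple

Picture = List[List[int]]  # one grayscale picture as rows of samples


def shift_picture(ref: Picture, dx: int, dy: int) -> Picture:
    """Displace `ref` by one global vector (dx, dy), clamping at the edges.

    Assumption: stands in for both disparity and motion compensation.
    """
    h, w = len(ref), len(ref[0])
    return [
        [ref[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
         for x in range(w)]
        for y in range(h)
    ]


def view_convert(recon_base: Picture, recon_layer: Picture,
                 use_base: bool, vector: Tuple[int, int]) -> Picture:
    """Hypothetical view converter: the flag selects the reference picture.

    use_base=True  -> predict from the reconstructed base layer picture
                      (vector acts as a disparity vector).
    use_base=False -> predict from a reconstructed layer picture
                      (vector acts as a motion vector).
    """
    ref = recon_base if use_base else recon_layer
    return shift_picture(ref, vector[0], vector[1])


def residual_code(layer: Picture, prediction: Picture) -> Picture:
    """Coder side: residual = layer picture minus prediction picture."""
    return [[s - p for s, p in zip(row_s, row_p)]
            for row_s, row_p in zip(layer, prediction)]


def combine(prediction: Picture, residual: Picture) -> Picture:
    """Decoder side: reconstruct by adding prediction and residual."""
    return [[p + r for p, r in zip(row_p, row_r)]
            for row_p, row_r in zip(prediction, residual)]


# Toy data: the other view is the base view displaced by one sample,
# with one sample altered so the residual is not all zero.
base = [[10, 20, 30, 40], [50, 60, 70, 80]]
layer = [[21, 30, 40, 40], [60, 70, 80, 80]]

prediction = view_convert(base, layer, use_base=True, vector=(1, 0))
residual = residual_code(layer, prediction)
reconstructed = combine(prediction, residual)
```

Because the decoder forms the same prediction picture from the same reconstructed reference and flag, adding the residual recovers the layer picture; in this lossless toy the recovery is exact, whereas a practical residual coder would introduce quantization error.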
Priority Applications (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201080032420.8A CN102577376B (en) | 2009-07-17 | 2010-07-19 | Method, apparatus and system for multi-view video coding and decoding |
| MX2012000804A MX2012000804A (en) | 2009-07-17 | 2010-07-19 | Method and apparatus for multi-view video coding and decoding. |
| JP2012520550A JP2012533925A (en) | 2009-07-17 | 2010-07-19 | Method and apparatus for multi-view video encoding and decoding |
| EP10800076.1A EP2452491A4 (en) | 2009-07-17 | 2010-07-19 | METHOD AND APPARATUS FOR MULTI-VIEWED VIDEO ENCODING AND DECODING |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| KR10-2009-0065615 | 2009-07-17 | ||
| KR1020090065615A KR20110007928A (en) | 2009-07-17 | 2009-07-17 | Method and apparatus for multiview image encoding and decoding |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| WO2011008065A2 (en) | 2011-01-20 |
| WO2011008065A3 (en) | 2011-05-19 |
Family
ID=43450009
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/KR2010/004717 Ceased WO2011008065A2 (en) | 2009-07-17 | 2010-07-19 | Method and apparatus for multi-view video coding and decoding |
Country Status (7)
| Country | Link |
|---|---|
| US (1) | US20110012994A1 (en) |
| EP (1) | EP2452491A4 (en) |
| JP (1) | JP2012533925A (en) |
| KR (1) | KR20110007928A (en) |
| CN (1) | CN102577376B (en) |
| MX (1) | MX2012000804A (en) |
| WO (1) | WO2011008065A2 (en) |
Cited By (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2013155795A1 (en) * | 2012-04-19 | 2013-10-24 | Lg Electronics(China) R&D Center Co., Ltd | Method and apparatus for predicting residual |
| CN103563387A (en) * | 2011-05-16 | 2014-02-05 | 索尼公司 | Image processing apparatus and image processing method |
| CN103828371A (en) * | 2011-09-22 | 2014-05-28 | 松下电器产业株式会社 | Moving-image encoding method, moving-image encoding device, moving image decoding method, and moving image decoding device |
| US20160130602A1 (en) * | 2013-06-03 | 2016-05-12 | Vib Vzw | Means and methods for yield performance in plants |
Families Citing this family (35)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20130250056A1 (en) * | 2010-10-06 | 2013-09-26 | Nomad3D Sas | Multiview 3d compression format and algorithms |
| WO2012086203A1 (en) * | 2010-12-22 | 2012-06-28 | パナソニック株式会社 | Image encoding apparatus, image decoding apparatus, image encoding method, and image decoding method |
| US9363500B2 (en) * | 2011-03-18 | 2016-06-07 | Sony Corporation | Image processing device, image processing method, and program |
| KR20120118781A (en) * | 2011-04-19 | 2012-10-29 | 삼성전자주식회사 | Method and apparatus for unified scalable video encoding for multi-view video, method and apparatus for unified scalable video decoding for multi-view video |
| US20130003847A1 (en) * | 2011-06-30 | 2013-01-03 | Danny Hong | Motion Prediction in Scalable Video Coding |
| WO2013022281A2 (en) * | 2011-08-09 | 2013-02-14 | 삼성전자 주식회사 | Method for multiview video prediction encoding and device for same, and method for multiview video prediction decoding and device for same |
| US8923403B2 (en) | 2011-09-29 | 2014-12-30 | Dolby Laboratories Licensing Corporation | Dual-layer frame-compatible full-resolution stereoscopic 3D video delivery |
| TWI595770B (en) | 2011-09-29 | 2017-08-11 | 杜比實驗室特許公司 | Frame-compatible full-resolution stereoscopic 3d video delivery with symmetric picture resolution and quality |
| WO2013051896A1 (en) * | 2011-10-05 | 2013-04-11 | 한국전자통신연구원 | Video encoding/decoding method and apparatus for same |
| US9674534B2 (en) | 2012-01-19 | 2017-06-06 | Samsung Electronics Co., Ltd. | Method and apparatus for encoding multi-view video prediction capable of view switching, and method and apparatus for decoding multi-view video prediction capable of view switching |
| US9961323B2 (en) | 2012-01-30 | 2018-05-01 | Samsung Electronics Co., Ltd. | Method and apparatus for multiview video encoding based on prediction structures for viewpoint switching, and method and apparatus for multiview video decoding based on prediction structures for viewpoint switching |
| US9659372B2 (en) | 2012-05-17 | 2017-05-23 | The Regents Of The University Of California | Video disparity estimate space-time refinement method and codec |
| US9219913B2 (en) * | 2012-06-13 | 2015-12-22 | Qualcomm Incorporated | Inferred base layer block for TEXTURE—BL mode in HEVC based single loop scalable video coding |
| KR101356890B1 (en) * | 2012-06-22 | 2014-02-03 | 한국방송공사 | Method and apparatus of inter-view video encoding and decoding in hybrid codecs for multi-view video coding |
| US20150208092A1 (en) * | 2012-06-29 | 2015-07-23 | Samsung Electronics Co., Ltd. | Method and apparatus for encoding scalable video, and method and apparatus for decoding scalable video |
| WO2014038330A1 (en) * | 2012-09-06 | 2014-03-13 | ソニー株式会社 | Image processing device and image processing method |
| KR102238567B1 (en) * | 2012-09-19 | 2021-04-08 | 퀄컴 인코포레이티드 | Selection of pictures for disparity vector derivation |
| US9648318B2 (en) * | 2012-09-30 | 2017-05-09 | Qualcomm Incorporated | Performing residual prediction in video coding |
| US20150245063A1 (en) * | 2012-10-09 | 2015-08-27 | Nokia Technologies Oy | Method and apparatus for video coding |
| US9635357B2 (en) * | 2013-02-26 | 2017-04-25 | Qualcomm Incorporated | Neighboring block disparity vector derivation in 3D video coding |
| US9900576B2 (en) | 2013-03-18 | 2018-02-20 | Qualcomm Incorporated | Simplifications on disparity vector derivation and motion vector prediction in 3D video coding |
| US9762905B2 (en) * | 2013-03-22 | 2017-09-12 | Qualcomm Incorporated | Disparity vector refinement in video coding |
| WO2014163458A1 (en) * | 2013-04-05 | 2014-10-09 | 삼성전자주식회사 | Method for determining inter-prediction candidate for interlayer decoding and encoding method and apparatus |
| KR102186461B1 (en) * | 2013-04-05 | 2020-12-03 | 삼성전자주식회사 | Method and apparatus for incoding and decoding regarding position of integer pixel |
| EP2965523A1 (en) | 2013-04-08 | 2016-01-13 | Arris Technology, Inc. | Signaling for addition or removal of layers in video coding |
| US9667990B2 (en) * | 2013-05-31 | 2017-05-30 | Qualcomm Incorporated | Parallel derived disparity vector for 3D video coding with neighbor-based disparity vector derivation |
| WO2015008464A1 (en) * | 2013-07-14 | 2015-01-22 | Sharp Kabushiki Kaisha | Video parameter set signaling |
| US9628795B2 (en) * | 2013-07-17 | 2017-04-18 | Qualcomm Incorporated | Block identification using disparity vector in video coding |
| MX364550B (en) | 2014-05-21 | 2019-04-30 | Arris Entpr Llc | Signaling and selection for the enhancement of layers in scalable video. |
| MX360655B (en) | 2014-05-21 | 2018-11-12 | Arris Entpr Llc | Individual buffer management in transport of scalable video. |
| CA3003030C (en) | 2015-10-26 | 2023-12-19 | University Of Wyoming | Methods of generating microparticles and porous hydrogels using microfluidics |
| US20180213202A1 (en) * | 2017-01-23 | 2018-07-26 | Jaunt Inc. | Generating a Video Stream from a 360-Degree Video |
| FR3072850B1 (en) | 2017-10-19 | 2021-06-04 | Tdf | CODING AND DECODING METHODS OF A DATA FLOW REPRESENTATIVE OF AN OMNIDIRECTIONAL VIDEO |
| US12495152B2 (en) * | 2022-04-20 | 2025-12-09 | Nokia Technologies Oy | Method and apparatus for encoding, decoding, or progressive rendering of image |
| US20230283789A1 (en) * | 2022-05-16 | 2023-09-07 | Intel Corporation | Efficient hypertext transfer protocol (http) adaptive bitrate (abr) streaming based on scalable video coding (svc) |
Citations (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20070211796A1 (en) | 2006-03-09 | 2007-09-13 | Samsung Electronics Co., Ltd. | Method and apparatus for encoding and decoding multi-view video to provide uniform picture quality |
Family Cites Families (17)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JPH09261653A (en) * | 1996-03-18 | 1997-10-03 | Sharp Corp | Multi-view image coding device |
| JP3519594B2 (en) * | 1998-03-03 | 2004-04-19 | Kddi株式会社 | Encoding device for stereo video |
| US7710462B2 (en) * | 2004-12-17 | 2010-05-04 | Mitsubishi Electric Research Laboratories, Inc. | Method for randomly accessing multiview videos |
| ZA200805337B (en) * | 2006-01-09 | 2009-11-25 | Thomson Licensing | Method and apparatus for providing reduced resolution update mode for multiview video coding |
| US8115804B2 (en) * | 2006-01-12 | 2012-02-14 | Lg Electronics Inc. | Processing multiview video |
| KR100949975B1 (en) * | 2006-03-30 | 2010-03-29 | 엘지전자 주식회사 | A method and apparatus for decoding/encoding a video signal |
| JP2009543514A (en) * | 2006-07-11 | 2009-12-03 | トムソン ライセンシング | Method and apparatus for use in multiview video coding |
| KR100919885B1 (en) * | 2006-10-25 | 2009-09-30 | 한국전자통신연구원 | Multi-view video scalable coding and decoding |
| US8548261B2 (en) * | 2007-04-11 | 2013-10-01 | Samsung Electronics Co., Ltd. | Method and apparatus for encoding and decoding multi-view image |
| KR20100014553A (en) * | 2007-04-25 | 2010-02-10 | 엘지전자 주식회사 | A method and an apparatus for decoding/encoding a video signal |
| EP2168380A2 (en) * | 2007-06-28 | 2010-03-31 | Thomson Licensing | Single loop decoding of multi-view coded video |
| CN101415115B (en) * | 2007-10-15 | 2011-02-02 | 华为技术有限公司 | Method for encoding and decoding video based on movement dancing mode, and encoder and decoder thereof |
| US20090116558A1 (en) * | 2007-10-15 | 2009-05-07 | Nokia Corporation | Motion skip and single-loop encoding for multi-view video content |
| CN101415114B (en) * | 2007-10-17 | 2010-08-25 | 华为终端有限公司 | Method and apparatus for encoding and decoding video, and video encoder and decoder |
| CN101420609B (en) * | 2007-10-24 | 2010-08-25 | 华为终端有限公司 | Video encoding, decoding method and video encoder, decoder |
| KR101560182B1 (en) * | 2008-01-07 | 2015-10-15 | 삼성전자주식회사 | Method and apparatus for multi-view video encoding and method and apparatus for multi-view video decoding |
| KR20100089705A (en) * | 2009-02-04 | 2010-08-12 | 삼성전자주식회사 | Apparatus and method for encoding and decoding 3d video |
2009
- 2009-07-17: KR application KR1020090065615A, publication KR20110007928A (not active, Ceased)
2010
- 2010-07-19: MX application MX2012000804A, publication MX2012000804A (active, IP Right Grant)
- 2010-07-19: EP application EP10800076.1A, publication EP2452491A4 (not active, Withdrawn)
- 2010-07-19: CN application CN201080032420.8A, publication CN102577376B (not active, Expired - Fee Related)
- 2010-07-19: WO application PCT/KR2010/004717, publication WO2011008065A2 (not active, Ceased)
- 2010-07-19: JP application JP2012520550A, publication JP2012533925A (active, Pending)
- 2010-07-19: US application US12/838,957, publication US20110012994A1 (not active, Abandoned)
Cited By (10)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN103563387A (en) * | 2011-05-16 | 2014-02-05 | 索尼公司 | Image processing apparatus and image processing method |
| CN103828371A (en) * | 2011-09-22 | 2014-05-28 | 松下电器产业株式会社 | Moving-image encoding method, moving-image encoding device, moving image decoding method, and moving image decoding device |
| CN103828371B (en) * | 2011-09-22 | 2017-08-22 | 太阳专利托管公司 | Dynamic image encoding method, dynamic image encoding device and dynamic image decoding method and moving image decoding apparatus |
| WO2013155795A1 (en) * | 2012-04-19 | 2013-10-24 | Lg Electronics(China) R&D Center Co., Ltd | Method and apparatus for predicting residual |
| CN103379340A (en) * | 2012-04-19 | 2013-10-30 | 乐金电子(中国)研究开发中心有限公司 | Residual error prediction method and device |
| US20150215642A1 (en) * | 2012-04-19 | 2015-07-30 | Jie Jia | Method and apparatus for predicting residual |
| CN103379340B (en) * | 2012-04-19 | 2017-09-01 | 乐金电子(中国)研究开发中心有限公司 | A kind of residual error prediction method and device |
| US10397609B2 (en) | 2012-04-19 | 2019-08-27 | Lg Electronics (China) R & D Center Co., Ltd | Method and apparatus for predicting residual |
| US20160130602A1 (en) * | 2013-06-03 | 2016-05-12 | Vib Vzw | Means and methods for yield performance in plants |
| US10801032B2 (en) * | 2013-06-03 | 2020-10-13 | Vib Vzw | Means and methods for yield performance in plants |
Also Published As
| Publication number | Publication date |
|---|---|
| WO2011008065A3 (en) | 2011-05-19 |
| EP2452491A2 (en) | 2012-05-16 |
| US20110012994A1 (en) | 2011-01-20 |
| KR20110007928A (en) | 2011-01-25 |
| MX2012000804A (en) | 2012-03-14 |
| CN102577376B (en) | 2015-05-27 |
| CN102577376A (en) | 2012-07-11 |
| JP2012533925A (en) | 2012-12-27 |
| EP2452491A4 (en) | 2014-03-12 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| WO2011008065A2 (en) | | Method and apparatus for multi-view video coding and decoding |
| JP5602192B2 (en) | | Video encoding method, video decoding method and apparatus thereof |
| US7817866B2 (en) | | Processing multiview video |
| US8270482B2 (en) | | Method and apparatus for encoding and decoding multi-view video to provide uniform picture quality |
| US10080029B2 (en) | | Video encoding/decoding method and apparatus |
| WO2012144821A2 (en) | | Method and apparatus for unified scalable video encoding for multi-view video and method and apparatus for unified scalable video decoding for multi-view video |
| WO2012036468A2 (en) | | Method and apparatus for hierarchical picture encoding and decoding |
| WO2010123203A2 (en) | | Reference picture list changing method of multi-view video |
| WO2010068020A9 (en) | | Multi-view video coding/decoding method and apparatus |
| BRPI0616745A2 (en) | | multi-view video encoding / decoding using scalable video encoding / decoding |
| WO2010087589A2 (en) | | Method and apparatus for processing video signals using boundary intra coding |
| WO2014171768A1 (en) | | Video signal processing method and apparatus |
| WO2014107083A1 (en) | | Video signal processing method and device |
| WO2010090462A2 (en) | | Apparatus and method for encoding and decoding multi-view image |
| WO2012081877A2 (en) | | Multi-view video encoding/decoding apparatus and method |
| EP3059968A1 (en) | | Method and apparatus for decoding multi-view video |
| WO2016056822A1 (en) | | 3d video coding method and device |
| WO2015009098A1 (en) | | Method and apparatus for processing video signal |
| JP2011077722A (en) | | Image decoding apparatus, and image decoding method and program for the same |
| WO2009108028A1 (en) | | Method for decoding free viewpoint image, and apparatus for implementing the same |
| US11445206B2 (en) | | Method and apparatus for video coding |
| WO2014054897A1 (en) | | Method and device for processing video signal |
| WO2015072626A1 (en) | | Interlayer reference picture generation method and apparatus for multiple layer video coding |
| WO2012099352A2 (en) | | Device and method for encoding/deciding multi-viewpoint images |
| WO2014088316A2 (en) | | Video encoding and decoding method, and apparatus using same |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | WWE | Wipo information: entry into national phase | Ref document number: 201080032420.8; Country of ref document: CN |
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 10800076; Country of ref document: EP; Kind code of ref document: A2 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | WWE | Wipo information: entry into national phase | Ref document number: 2012520550; Country of ref document: JP. Ref document number: MX/A/2012/000804; Country of ref document: MX |
| | REEP | Request for entry into the european phase | Ref document number: 2010800076; Country of ref document: EP |
| | WWE | Wipo information: entry into national phase | Ref document number: 2010800076; Country of ref document: EP |
| | WWE | Wipo information: entry into national phase | Ref document number: 1267/CHENP/2012; Country of ref document: IN |