WO2014094158A1 - Scalable high throughput video encoder - Google Patents

Scalable high throughput video encoder

Publication number
WO2014094158A1
WO2014094158A1 PCT/CA2013/050979 CA2013050979W WO2014094158A1 WO 2014094158 A1 WO2014094158 A1 WO 2014094158A1 CA 2013050979 W CA2013050979 W CA 2013050979W WO 2014094158 A1 WO2014094158 A1 WO 2014094158A1
Authority
WO
WIPO (PCT)
Prior art keywords
encoder
frame
macroblock rows
encoding
video
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CA2013/050979
Other languages
French (fr)
Inventor
Lei Zhang
Ying Luo
Edward HAROLD
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ATI Technologies ULC
Original Assignee
ATI Technologies ULC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ATI Technologies ULC
Priority to JP2015548125A (publication JP2016506662A)
Priority to CN201380069767.3A (publication CN104904215A)
Priority to EP13864147.7A (publication EP2936810A4)
Priority to KR1020157019322A (publication KR20150099571A)
Publication of WO2014094158A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/42 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H04N19/436 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation using parallelised computational arrangements
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/172 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field

Definitions

  • processors include, by way of example, a general purpose processor, a special purpose processor, a conventional processor, a digital signal processor (DSP), a plurality of microprocessors, one or more microprocessors in association with a DSP core, a controller, a microcontroller, Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs) circuits, any other type of integrated circuit (IC), and/or a state machine.
  • Such processors may be manufactured by configuring a manufacturing process using the results of processed hardware description language (HDL) instructions and other intermediary data including netlists (such instructions capable of being stored on a computer readable media).
  • the results of such processing may be maskworks that are then used in a semiconductor manufacturing process to manufacture a processor which implements aspects of the embodiments.
  • random access memory (RAM), cache memory, semiconductor memory devices, magnetic media such as internal hard disks and removable disks, magneto-optical media, and optical media such as CD-ROM disks and digital versatile disks (DVDs).

Landscapes

  • Engineering & Computer Science
  • Multimedia
  • Signal Processing
  • Computing Systems
  • Theoretical Computer Science
  • Compression Or Coding Systems Of TV Signals

Abstract

A scalable high throughput video encoder is described herein. A plurality of dedicated, hardware video encoders runs in a staggered, parallel architecture, where each video encoder encodes a video frame and the stagger or delay is a programmable number of macroblock rows. In an example method, after a first video encoder finishes encoding the first x macroblock rows of a frame, the first video encoder signals a second video encoder to start encoding a macroblock row of a next unprocessed frame. Both video encoders continue encoding in parallel in a synchronized, staggered manner. At the end of the frame, the first video encoder starts encoding x macroblock rows of another unprocessed frame.

Description

SCALABLE HIGH THROUGHPUT VIDEO ENCODER
CROSS REFERENCE TO RELATED APPLICATIONS
[0001] This application claims the benefit of U.S. Application No. 13/720,546, filed December 19, 2012, the contents of which are hereby incorporated by reference herein.
FIELD
[0002] The present disclosure is generally directed to encoding, and in particular, to video encoding.
BACKGROUND
[0003] The transmission and reception of video data over various media is ever increasing. Typically, video encoders are used to compress the video data and reduce the amount of video data transmitted over the medium. Traditional video encoding applications such as wireless displays or high definition video conferencing require only modest throughput, such as 1080p at 30 frames per second (fps) or 1080p at 60 fps.
[0004] High throughput video encoding is critical for high-performance video transcoding or cloud gaming applications. Often, in video transcoding applications, a two-hour movie needs to be transcoded in a few minutes, or at least in a few tens of minutes. In cloud gaming applications, multiple sessions of game rendering need to be encoded before they can be transmitted across a network, for example, over the Internet or an intranet. High performance video transcoding and cloud gaming applications require a throughput of a few multiples of 1080p at 30 fps or 1080p at 60 fps. This presents a scalability challenge for hardware video encoders, which must support a high throughput. Some implementations have resorted to hybrid approaches where part of the encoding of a video frame is done entirely in a 3D shader (which uses the central processing unit or graphics processing unit), while the rest of the encoding of the frame is done on fixed function hardware.
SUMMARY
[0005] A scalable high throughput video encoder is described herein. A plurality of dedicated, hardware video encoders runs in a staggered, parallel architecture, where each video encoder encodes a video frame and the stagger or delay is a programmable number of macroblock rows. In an example method, after a first video encoder finishes encoding the first x macroblock rows of a frame, the first video encoder signals a second video encoder to start encoding a macroblock row of a next unprocessed frame. Both video encoders continue encoding in parallel in a synchronized, staggered manner. At the end of the frame, the first video encoder starts encoding x macroblock rows of another unprocessed frame.
BRIEF DESCRIPTION OF THE DRAWINGS
[0006] A more detailed understanding may be had from the following description, given by way of example in conjunction with the accompanying drawings wherein:
[0007] Figure 1 is an example system architecture that uses high throughput video encoders, according to some embodiments;
[0008] Figure 2 is an example high throughput video encoder, according to some embodiments;
[0009] Figure 3 is an example diagram of frames and macroblock rows;
[0010] Figure 4 is an example flowchart for encoding video data using high throughput video encoders, according to some embodiments;
[0011] Figure 5 is another example flowchart for encoding video data using high throughput video encoders, according to some embodiments; and
[0012] Figure 6 is a block diagram of an example source or destination device for use with embodiments of the high throughput video encoders, according to some embodiments.
DETAILED DESCRIPTION
[0013] Figure 1 is an example system 100 that uses high throughput video encoders as described herein below to send encoded video data over a network 105 from a source side 110 to a destination side 115, according to some embodiments. The source side 110 includes any device capable of storing, capturing or generating video data that may be transmitted to the destination side 115. The device may include, but is not limited to, a source device 120, a mobile phone 122, an online gaming device 124, a camera 126 or a multimedia server 128. The video data from these devices feeds encoder(s) 130, which in turn encode the video data as described herein below. The encoded video data is processed by decoder(s) 140, which in turn send the decoded video data to destination devices, which may include, but are not limited to, a destination device 142, an online gaming device 144, and a display monitor 146. Although the encoder(s) 130 are shown as separate device(s), they may be implemented as an external device or integrated in any device that may be used in storing, capturing, generating or transmitting video data.
[0014] Figure 2 is a block diagram of an example high throughput video encoder 200, according to some embodiments. The high throughput video encoder 200 may include a plurality of video encoders for receiving video data and outputting encoded video data. Each of the plurality of video encoders is a complete, fixed function, hardware video encoder. For purposes of illustration only, the high throughput video encoder 200 may include video encoder 1 205, video encoder 2 210, video encoder 3 215 through video encoder N 220, where video encoder 1 205 is connected to video encoder 2 210, video encoder 2 210 is connected to video encoder 3 215 and so on until video encoder N 220, which is connected to video encoder 1 205. Video encoder 1 205, video encoder 2 210, video encoder 3 215 through video encoder N 220 each receive source video data 225 and output encoded video data 230. Each of the plurality of video encoders is further connected to a common memory for storing and reading reference data as described herein. For example, video encoder 1 205, video encoder 2 210, video encoder 3 215 through video encoder N 220 are connected to memory 235.
[0015] As described herein, the high throughput video encoder may include 2 to N video encoder instances or circuits. Each video encoder instance encodes a video frame, where video data includes multiple video frames. Figure 3 is an example diagram of a frame 1 300 and a frame 2 305. Each of the frames 300 and 305 contains macroblock rows 1 . . . m, where each macroblock row may have, for example, 8 to 16 raster lines, depending on the video encoding standard or scheme being used.
[0016] In standard encoding schemes, there exists a dependency on a previous frame when encoding a current frame. For example, when encoding the current frame, the video encoder uses the reference generated by the previous video frame. To maximize the video encoding throughput, all of the video encoders need to work in parallel without having to wait for other video encoders to completely finish encoding a video frame. This is achieved by having each video encoder wait for a programmable or predetermined number of macroblock rows. In an embodiment, the predetermined number of macroblock rows is less than the total number of macroblock rows in a frame. In another embodiment, the predetermined number of macroblock rows is small with respect to the total number of macroblock rows in a frame. In another embodiment, the predetermined number of macroblock rows may be on the order of 1-10 macroblock rows. This number can be predetermined but can also be signaled by the video encoder encoding the previous frame. This method ensures that the video encoder that encodes the previous frame (frame N-1) finishes generating the reference that the video encoder encoding the current frame (frame N) needs to use. In this manner, all video encoders are staggered by a few macroblock rows but are working in parallel for maximum throughput.
[0017] Figure 4 is an example high level flowchart 400 for encoding video data using a high throughput video encoder, according to some embodiments. A video encoder encodes the first x macroblock rows of a frame (405). The video encoder signals another video encoder to start encoding a macroblock row of a next unprocessed frame after the first x macroblock rows are complete (410). Both (or all) video encoders continue encoding in parallel (415) in a synchronized, staggered manner. If the frame is completed, the video encoder starts encoding x macroblock rows of another unprocessed frame (420). Otherwise, the video encoders continue encoding the frame (425).
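The signaling in flowchart 400 can be sketched in software with threads standing in for the hardware video encoders. This is only an illustrative model, not the disclosed hardware: the names FrameProgress and run_encoders, the round-robin assignment of frames to encoders, and the release rule (row r of a frame may start once rows 1 through min(r + x - 1, m) of the previous frame are finished) are assumptions inferred from the two-encoder example given for Figure 5.

```python
import threading


class FrameProgress:
    """Shared record of how many macroblock rows of one frame are finished."""

    def __init__(self):
        self.rows_done = 0
        self.cond = threading.Condition()

    def mark_row_done(self):
        # Plays the role of the "signal" sent to the encoder of the next frame.
        with self.cond:
            self.rows_done += 1
            self.cond.notify_all()

    def wait_for_rows(self, needed):
        # Block until at least `needed` rows of this frame are finished.
        with self.cond:
            self.cond.wait_for(lambda: self.rows_done >= needed)


def run_encoders(num_frames, n_encoders, m, x):
    """Encode num_frames frames of m rows each; return the (frame, row) order."""
    progress = [FrameProgress() for _ in range(num_frames)]
    log, log_lock = [], threading.Lock()

    def worker(encoder_id):
        # Assumed round-robin assignment: encoder k takes frames k, k+N, k+2N, ...
        for frame in range(encoder_id, num_frames, n_encoders):
            for row in range(1, m + 1):
                if frame > 0:
                    # Assumed release rule: row r needs rows 1..min(r+x-1, m)
                    # of the previous frame to be finished.
                    progress[frame - 1].wait_for_rows(min(row + x - 1, m))
                with log_lock:
                    log.append((frame, row))  # stand-in for real row encoding
                progress[frame].mark_row_done()

    threads = [threading.Thread(target=worker, args=(k,)) for k in range(n_encoders)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log


order = run_encoders(num_frames=4, n_encoders=2, m=68, x=5)
print(len(order))  # 4 frames * 68 rows = 272 entries
```

The exact interleaving of rows differs from run to run, but every row in the returned order appears only after the rows of the previous frame that it depends on.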
[0018] Figure 5 is an example flowchart 500 for encoding video data using a high throughput video encoder and is also described with reference to Figures 2 and 3, according to some embodiments. For purposes of illustration only, the flowchart 500 is described with reference to two video encoders, encoder 1 205 and encoder 2 210, and assumes that the predetermined number of macroblock rows is 5. This is shown in Figure 3 as macroblock rows 350.
[0019] Initially, encoder 1 205 receives a frame 1 300 from the source video data 225 and starts to encode frame 1 300 (505). Encoder 2 210 waits until encoder 1 205 finishes encoding the programmed or predetermined number of macroblock rows, for example, macroblock rows 350. This constitutes the initial delay. Once encoder 1 205 completes encoding macroblock rows 350, encoder 1 205 generates reference data associated with the macroblock rows 350 and stores the reference data in storage, for example, memory 235 (510). Encoder 1 205 signals encoder 2 210 to start encoding macroblock row 1 of frame 2 305 (515).
[0020] Encoder 2 210 starts encoding macroblock row 1 of frame 2 305 and in parallel, encoder 1 205 continues to encode the next macroblock row, i.e. macroblock row 6 of frame 1 300 (520). When encoder 1 205 finishes encoding macroblock row 6, encoder 1 205 signals encoder 2 210 to start encoding macroblock row 2 of frame 2 305 (525). Due to the dependency relationship between encoder 1 205 and encoder 2 210 (i.e. encoder 2 210 needing the reference data from encoder 1 205), encoder 2 210 is always lagging by the predetermined number of macroblock rows but in-step with encoder 1 205. This results in encoder 1 205 and encoder 2 210 operating in parallel in a synchronized, staggered manner. Assuming for purposes of illustration that the frames have a 1920x1088 frame resolution and that each macroblock has 16x16 pixels, when encoder 1 205 finishes encoding macroblock row 67 of frame 1 300, encoder 1 205 signals encoder 2 210 to encode macroblock row 63 of frame 2 305.
[0021] Once encoder 1 205 finishes encoding macroblock row 68 of frame 1 300, encoder 1 205 signals encoder 2 210 that encoder 2 210 can encode macroblock rows 64-68 of frame 2 305 since encoder 1 205 has finished generating all the references for frame 1 300 (530). Encoder 1 205 starts encoding frame 3 once macroblock row 68 of frame 1 300 is completed (535). However, encoder 2 210 has to wait for encoder 1 205 to finish encoding the first programmed or predetermined number of macroblock rows of frame 3 before encoder 2 210 can start encoding the next frame, i.e. frame 4.
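The row numbers in this example follow from simple arithmetic: a 1088-line frame with 16-line macroblocks has 68 macroblock rows, and with a stagger of 5 rows, finishing row k of frame 1 releases row k - 4 of frame 2. The sketch below reproduces that schedule. The function names macroblock_rows and signal_schedule are hypothetical, and the frame height is assumed to be a multiple of the macroblock size.

```python
def macroblock_rows(frame_height, mb_size=16):
    """Number of macroblock rows in a frame (height assumed a multiple of mb_size)."""
    return frame_height // mb_size


def signal_schedule(m, x):
    """Map each row encoder 1 finishes to the rows of the next frame it releases."""
    schedule = {}
    for finished in range(x, m + 1):
        if finished < m:
            # Steady state: one finished row releases one row of the next frame.
            schedule[finished] = [finished - x + 1]
        else:
            # Last row done: all references exist, so release every remaining row.
            schedule[finished] = list(range(m - x + 1, m + 1))
    return schedule


m = macroblock_rows(1088)
schedule = signal_schedule(m, x=5)
print(m, schedule[5], schedule[6], schedule[67], schedule[68])
# prints: 68 [1] [2] [63] [64, 65, 66, 67, 68]
```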
[0022] This method can scale to a large number of video encoders for maximum throughput. After an initialization delay, the long-term throughput is N times that of a single video encoder if there are, for example, N video encoders. The initialization delay introduces a fixed amount of stagger or delay for each video encoder. For example, given x as the predefined or programmed number of macroblock rows, the stagger or delay for the Nth video encoder will be Nx.
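The long-term throughput claim can be checked with a small timing model. This is a sketch under stated assumptions rather than a model of the actual hardware: every macroblock row is assumed to take one time unit on any encoder, frames are assumed to be assigned to encoders round-robin, memory contention is ignored, and the function name finish_times is hypothetical.

```python
def finish_times(num_frames, n_encoders, m, x):
    """Return the completion time of each frame, at one time unit per row."""
    done = []  # done[f][r - 1] = completion time of row r of frame f
    encoder_free = [0] * n_encoders
    for f in range(num_frames):
        enc = f % n_encoders          # assumed round-robin frame assignment
        t = encoder_free[enc]
        rows = []
        for r in range(1, m + 1):
            if f > 0:
                # Row r waits for row min(r + x - 1, m) of the previous frame.
                t = max(t, done[f - 1][min(r + x - 1, m) - 1])
            t += 1
            rows.append(t)
        done.append(rows)
        encoder_free[enc] = t
    return [rows[-1] for rows in done]


times = finish_times(num_frames=200, n_encoders=2, m=68, x=5)
single_encoder_time = 200 * 68
print(times[:4])                        # [68, 73, 136, 141]
print(single_encoder_time / times[-1])  # close to 2 for two encoders
```

In this model the only cost of the stagger is the short delay at the start and end of the sequence, so the speedup over a single encoder approaches the number of encoders as the number of frames grows.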
[0023] Figure 6 is a block diagram of a device 600 in which the high throughput video encoders described herein may be implemented, according to some embodiments. The device 600 may include, for example, a computer, a gaming device, a handheld device, a set-top box, a television, a mobile phone, or a tablet computer. The device 600 includes a processor 602, a memory 604, a storage 606, one or more input devices 608, and one or more output devices 610. The device 600 may also optionally include an input driver 612 and an output driver 614. It is understood that the device 600 may include additional components not shown in Figure 6.
[0024] The processor 602 may include a central processing unit (CPU), a graphics processing unit (GPU), a CPU and GPU located on the same die, or one or more processor cores, wherein each processor core may be a CPU or a GPU. The memory 604 may be located on the same die as the processor 602, or may be located separately from the processor 602. The memory 604 may include a volatile or non-volatile memory, for example, random access memory (RAM), dynamic RAM, or a cache. In some embodiments, the high throughput video encoders are implemented in the processor 602.
[0025] The storage 606 may include a fixed or removable storage, for example, a hard disk drive, a solid state drive, an optical disk, or a flash drive. The input devices 608 may include a keyboard, a keypad, a touch screen, a touch pad, a detector, a microphone, an accelerometer, a gyroscope, a biometric scanner, or a network connection (e.g., a wireless local area network card for transmission and/or reception of wireless IEEE 802 signals). The output devices 610 may include a display, a speaker, a printer, a haptic feedback device, one or more lights, an antenna, or a network connection (e.g., a wireless local area network card for transmission and/or reception of wireless IEEE 802 signals).
[0026] The input driver 612 communicates with the processor 602 and the input devices 608, and permits the processor 602 to receive input from the input devices 608. The output driver 614 communicates with the processor 602 and the output devices 610, and permits the processor 602 to send output to the output devices 610. It is noted that the input driver 612 and the output driver 614 are optional components, and that the device 600 will operate in the same manner if the input driver 612 and the output driver 614 are not present.
[0027] The video encoders described herein may use a variety of encoding schemes including, but not limited to, Moving Picture Experts Group (MPEG) MPEG-1, MPEG-2, MPEG-4, MPEG-4 Part 10, Windows® *.avi format, Quicktime® *.mov format, H.264 encoding schemes, High Efficiency Video Coding (HEVC) encoding schemes and streaming video formats.
[0028] In general, in accordance with some embodiments, a method for encoding includes encoding a frame using an encoder and encoding a next frame using another encoder after the encoder completes encoding a predetermined number of macroblock rows of the frame. The encoder and the another encoder operate in parallel in a synchronized, staggered manner. In some embodiments, the predetermined number of macroblock rows is less than the number of macroblock rows in the frame. In some embodiments, the predetermined number of macroblock rows is on an order of 1 - 10 macroblock rows.
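As a non-limiting sketch of the synchronized, staggered operation summarized above, the following Python example runs two software "encoders" as threads. It is an illustration under stated assumptions, not the described hardware: the names `RowProgress`, `encode_frame` and `run_pair`, the use of a condition variable as the signaling mechanism, and the values of `ROWS_PER_FRAME` and `STAGGER_ROWS` are all hypothetical, and actual encoding is replaced by appending a record to a log.

```python
import threading

ROWS_PER_FRAME = 68  # assumed number of macroblock rows per frame
STAGGER_ROWS = 4     # assumed predetermined number of macroblock rows


class RowProgress:
    """Shared count of reference rows completed for one frame."""

    def __init__(self):
        self._done = 0
        self._cond = threading.Condition()

    def mark_row_done(self):
        # Signal any waiting encoder that one more reference row is ready.
        with self._cond:
            self._done += 1
            self._cond.notify_all()

    def wait_for(self, rows):
        # Block until at least `rows` reference rows are available.
        with self._cond:
            self._cond.wait_for(lambda: self._done >= rows)


def encode_frame(name, own, reference, log):
    """Encode one frame row by row, staying STAGGER_ROWS behind `reference`."""
    for row in range(1, ROWS_PER_FRAME + 1):
        if reference is not None:
            needed = min(row + STAGGER_ROWS - 1, ROWS_PER_FRAME)
            reference.wait_for(needed)
        log.append((name, row))  # stand-in for the real encoding work
        own.mark_row_done()


def run_pair():
    """Run two encoders in parallel on consecutive frames; return the log."""
    log = []
    frame1, frame2 = RowProgress(), RowProgress()
    t1 = threading.Thread(target=encode_frame,
                          args=("encoder1", frame1, None, log))
    t2 = threading.Thread(target=encode_frame,
                          args=("encoder2", frame2, frame1, log))
    t2.start()
    t1.start()
    t1.join()
    t2.join()
    return log
```

In this sketch the second encoder cannot begin row 1 of the next frame until the first encoder has completed `STAGGER_ROWS` rows, and it keeps that distance until the first encoder reaches the end of its frame.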
[0029] It should be understood that many variations are possible based on the disclosure herein. Although features and elements are described above in particular combinations, each feature or element may be used alone without the other features and elements or in various combinations with or without other features and elements.
[0030] The methods provided, to the extent applicable, may be implemented in a general purpose computer, a processor, or a processor core. Suitable processors include, by way of example, a general purpose processor, a special purpose processor, a conventional processor, a digital signal processor (DSP), a plurality of microprocessors, one or more microprocessors in association with a DSP core, a controller, a microcontroller, Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs) circuits, any other type of integrated circuit (IC), and/or a state machine. Such processors may be manufactured by configuring a manufacturing process using the results of processed hardware description language (HDL) instructions and other intermediary data including netlists (such instructions capable of being stored on a computer readable media). The results of such processing may be maskworks that are then used in a semiconductor manufacturing process to manufacture a processor which implements aspects of the embodiments.
[0031] The methods or flow charts provided herein, to the extent applicable, may be implemented in a computer program, software, or firmware incorporated in a computer-readable storage medium for execution by a general purpose computer or a processor. Examples of computer-readable storage mediums include a read only memory (ROM), a random access memory (RAM), a register, cache memory, semiconductor memory devices, magnetic media such as internal hard disks and removable disks, magneto-optical media, and optical media such as CD-ROM disks, and digital versatile disks (DVDs).
* * *

Claims

What is claimed is:
1. A method for encoding, comprising:
encoding a frame using a first encoder; and
encoding a next frame using a second encoder after the first encoder completes encoding a predetermined number of macroblock rows of the frame, wherein the first encoder and the second encoder operate in parallel in a synchronized, staggered manner.
2. The method of claim 1, wherein the predetermined number of macroblock rows is less than the number of macroblock rows in the frame.
3. The method of claim 1, wherein the predetermined number of macroblock rows is on an order of 1 - 10 macroblock rows.
4. The method of claim 1, wherein the first encoder signals the second encoder when to start encoding the next frame.
5. The method of claim 1, wherein the first encoder generates reference data for the second encoder and stores the reference data in memory for use by the second encoder.
6. A method for encoding, comprising:
encoding a frame using a first encoder; and
encoding a next frame using a second encoder, wherein the first encoder and the second encoder operate in parallel in a synchronized, staggered manner, wherein the stagger is a predetermined number of macroblock rows.
7. The method of claim 6, wherein the predetermined number of macroblock rows is less than the number of macroblock rows in the frame.
8. The method of claim 6, wherein the predetermined number of macroblock rows is on an order of 1 - 10 macroblock rows.
9. The method of claim 6, wherein the first encoder signals the second encoder when to start encoding the next frame.
10. The method of claim 6, wherein the first encoder generates reference data for the second encoder and stores the reference data in memory for use by the second encoder.
11. A device, comprising:
a memory;
at least two encoders;
one encoder of the at least two encoders configured to encode a frame; and
another encoder of the at least two encoders configured to encode a next frame after the one encoder completes encoding a predetermined number of macroblock rows of the frame, wherein the one encoder and the another encoder operate in parallel in a synchronized, staggered manner.
12. The device of claim 11, wherein the predetermined number of macroblock rows is less than the number of macroblock rows in the frame.
13. The device of claim 11, wherein the predetermined number of macroblock rows is on an order of 1 - 10 macroblock rows.
14. The device of claim 11, wherein the one encoder signals the another encoder when to start encoding the next frame.
15. The device of claim 11, wherein the one encoder generates reference data for the another encoder and stores the reference data in the memory for use by the another encoder.
16. A device, comprising:
a memory;
a plurality of encoders;
an encoder of the plurality of encoders configured to encode a frame; and
another encoder of the plurality of encoders configured to encode a next frame, wherein the encoder and the another encoder operate in parallel in a synchronized, staggered manner, wherein the stagger is a predetermined number of macroblock rows.
17. The device of claim 16, wherein the predetermined number of macroblock rows is less than the number of macroblock rows in the frame.
18. The device of claim 16, wherein the predetermined number of macroblock rows is on an order of 1 - 10 macroblock rows.
19. The device of claim 16, wherein the encoder signals the another encoder when to start encoding the next frame.
20. The device of claim 16, wherein the encoder generates reference data for the another encoder and stores the reference data in the memory for use by the another encoder.
21. A system for sending data from a source device to a destination device, comprising:
a memory;
at least two encoders;
one encoder of the at least two encoders configured to encode a frame received from the source device; and
another encoder of the at least two encoders configured to encode a next frame received from the source device after the one encoder completes encoding a predetermined number of macroblock rows of the frame, wherein the one encoder and the another encoder operate in parallel in a synchronized, staggered manner.
22. The system of claim 21, wherein the predetermined number of macroblock rows is less than the number of macroblock rows in the frame.
23. The system of claim 21, wherein the predetermined number of macroblock rows is on an order of 1 - 10 macroblock rows.
24. The system of claim 21, wherein the one encoder signals the another encoder when to start encoding the next frame.
25. The system of claim 21, wherein the one encoder generates reference data for the another encoder and stores the reference data in the memory for use by the another encoder.
26. A system for sending data from a source device to a destination device, comprising:
a memory;
a plurality of encoders;
an encoder of the plurality of encoders configured to encode a frame received from the source device; and
another encoder of the plurality of encoders configured to encode a next frame received from the source device, wherein the encoder and the another encoder operate in parallel in a synchronized, staggered manner, wherein the stagger is a predetermined number of macroblock rows.
27. The system of claim 26, wherein the predetermined number of macroblock rows is less than the number of macroblock rows in the frame.
28. The system of claim 26, wherein the predetermined number of macroblock rows is on an order of 1 - 10 macroblock rows.
29. The system of claim 26, wherein the encoder signals the another encoder when to start encoding the next frame.
30. The system of claim 26, wherein the encoder generates reference data for the another encoder and stores the reference data in the memory for use by the another encoder.
PCT/CA2013/050979 2012-12-19 2013-12-17 Scalable high throughput video encoder Ceased WO2014094158A1 (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
JP2015548125A JP2016506662A (en) 2012-12-19 2013-12-17 Scalable high-throughput video encoder
CN201380069767.3A CN104904215A (en) 2012-12-19 2013-12-17 Scalable high throughput video encoder
EP13864147.7A EP2936810A4 (en) 2012-12-19 2013-12-17 Scalable high throughput video encoder
KR1020157019322A KR20150099571A (en) 2012-12-19 2013-12-17 Scalable high throughput video encoder

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US13/720,546 US20140169481A1 (en) 2012-12-19 2012-12-19 Scalable high throughput video encoder
US13/720,546 2012-12-19

Publications (1)

Publication Number Publication Date
WO2014094158A1 true WO2014094158A1 (en) 2014-06-26

Family

ID=50930870

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CA2013/050979 Ceased WO2014094158A1 (en) 2012-12-19 2013-12-17 Scalable high throughput video encoder

Country Status (6)

Country Link
US (1) US20140169481A1 (en)
EP (1) EP2936810A4 (en)
JP (1) JP2016506662A (en)
KR (1) KR20150099571A (en)
CN (1) CN104904215A (en)
WO (1) WO2014094158A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102273670B1 (en) * 2014-11-28 2021-07-05 삼성전자주식회사 Data processing system modifying a motion compensation information, and method for decoding video data including the same
US11615727B2 (en) * 2021-04-12 2023-03-28 Apple Inc. Preemptive refresh for reduced display judder

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070253491A1 (en) * 2006-04-27 2007-11-01 Yoshiyuki Ito Image data processing apparatus, image data processing method, program for image data processing method, and recording medium recording program for image data processing method
CA2682449A1 (en) * 2007-03-29 2008-10-09 Scientific-Atlanta, Inc. Intra-macroblock video processing
US20120263225A1 (en) * 2011-04-15 2012-10-18 Media Excel Korea Co. Ltd. Apparatus and method for encoding moving picture

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060114995A1 (en) * 2004-12-01 2006-06-01 Joshua Robey Method and system for high speed video encoding using parallel encoders
US7920633B2 (en) * 2005-04-22 2011-04-05 Broadcom Corporation Method and system for parallel processing video data
US20080152014A1 (en) * 2006-12-21 2008-06-26 On Demand Microelectronics Method and apparatus for encoding and decoding of video streams
CN101938643A (en) * 2009-07-03 2011-01-05 哈尔滨工业大学深圳研究生院 Hardware Parallel Realization Structure of Video Compression Intra Prediction 16×16 Mode
US8379718B2 (en) * 2009-09-02 2013-02-19 Sony Computer Entertainment Inc. Parallel digital picture encoding
CA2722993A1 (en) * 2010-12-01 2012-06-01 Ecole De Technologie Superieure Multiframe and multislice parallel video encoding system with simultaneous predicted frame encoding

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP2936810A4 *

Also Published As

Publication number Publication date
US20140169481A1 (en) 2014-06-19
EP2936810A1 (en) 2015-10-28
EP2936810A4 (en) 2016-06-29
JP2016506662A (en) 2016-03-03
KR20150099571A (en) 2015-08-31
CN104904215A (en) 2015-09-09

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 13864147

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2015548125

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

REEP Request for entry into the european phase

Ref document number: 2013864147

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 2013864147

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 20157019322

Country of ref document: KR

Kind code of ref document: A