IP Camera and IP Speed Dome

What is MJPEG Motion JPEG Video Codec?
Why H.264 is The Next Big Thing?
What is H.264 compression?
What is Bit Rate?
What is JPEG2000?
What would the bandwidth be for different Video Media?
MPEG-4 Brief History
What is MJPEG Motion JPEG Video Codec?

A bitstream encoding for video in which each frame (or field, for interlaced video) is compressed independently using the JPEG still-image compression algorithm. Taken as a sequence, the series of frames represents the source video. MJPEG bitstreams are often wrapped in AVI files, where they carry the four-character code (FourCC) "MJPG".
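Since every frame is a standalone JPEG, MJPEG storage grows linearly with the frame count; a quick estimate (the per-frame size is an assumed figure, not from the text):

```python
# Rough MJPEG storage estimate: every frame is an independent JPEG,
# so file size is simply frame count x average frame size.
def mjpeg_storage_mb(fps, seconds, kb_per_frame):
    """Approximate Motion JPEG file size in megabytes (1 MB = 1024 KB)."""
    return fps * seconds * kb_per_frame / 1024

# One minute of 30 fps video at an assumed ~50 KB per JPEG frame:
size_mb = mjpeg_storage_mb(30, 60, 50)   # roughly 88 MB for a single minute
```

This linear growth, with no inter-frame compression, is exactly what codecs like MPEG-2 and H.264 avoid.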
Why H.264 is The Next Big Thing?

Quality and Size (Bit-rate)

Higher resolution:

Lower storage requirements:

Next Generation Digital TV:

It is likely that future delivery of Digital TV signals (both in SD and HD) will use H.264. For SD, the same content at a given quality can be delivered with a lower bit-rate (allowing for more channels to be transmitted on the same medium), or higher quality and/or higher resolution can be delivered at the same bit-rate.

High-Definition Optical Discs

High-definition video is gaining in popularity, aided by the falling cost of HD television sets. A key deployment vehicle for high-definition content is likely to be optical discs carrying this content. Two optical disc formats are currently proposed: Blu-ray Disc and HD-DVD. While these formats differ in several ways, both have chosen to adopt H.264 as one of the key means of storing HD video content. The high bit-rates used to encode the video on these HD discs will be particularly challenging for today's PCs; we will examine this further after we compare MPEG-2 and H.264.

Comparison to MPEG-2

MPEG-2 is today's dominant video compression scheme; it is used to encode video on DVDs, to stream internet video, and is the basis for most worldwide digital television (over-the-air, cable and satellite). While MPEG-2 is a video-only format, MPEG-4 is a more generic media exchange format, with H.264 as one of several video compression schemes offered by MPEG-4.

Smaller block size:

MPEG-2, H.264, and most other codecs treat portions of the video image in blocks, often processed in isolation from one another. Independently of the number of pixels in the image, the number of blocks affects the computational requirements. While MPEG-2 has a fixed block size of 16 pixels on a side (referred to as 16x16), H.264 permits the simultaneous mixing of different block sizes (down to 4x4 pixels). This permits the codec to accurately define fine detail (with more, smaller blocks) while not having to 'waste' small blocks on coarse detail. In this way, for example, patches of blue sky in a video image can use large blocks, while the finer details of a forest in the same frame can be encoded with smaller blocks.
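To make the block-count trade-off concrete, here is a back-of-envelope sketch for an assumed 1920x1080 frame:

```python
from math import ceil

# Number of blocks needed to tile a frame, rounding partial blocks up
# (codecs pad the frame edges out to a whole number of blocks).
def block_count(width, height, block_size):
    return ceil(width / block_size) * ceil(height / block_size)

mb_16 = block_count(1920, 1080, 16)   # 8160 fixed-size 16x16 macroblocks
mb_4  = block_count(1920, 1080, 4)    # 129600 blocks at the 4x4 extreme
```

Mixing block sizes lets H.264 spend the 16x increase in block count only where the image detail warrants it.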

Lower storage requirements will allow for large amounts of content to be delivered on a single disc, and as the video world transitions to High Definition, a mechanism is needed to deliver it. H.264 clearly has a bright future, mostly because it offers much better compression efficiency than previous compression schemes.
What is H.264 compression?

H.264 is a new video compression scheme that is becoming the worldwide digital video standard for consumer electronics and personal computers. In particular, H.264 has already been selected as a key compression scheme (codec) for the next generation of optical disc formats, HD-DVD and Blu-ray Disc (sometimes referred to as BD or BD-ROM). H.264 has been adopted by the Moving Picture Experts Group (MPEG) to be a key video compression scheme in the MPEG-4 format for digital media exchange. H.264 is sometimes referred to as "MPEG-4 Part 10" (part of the MPEG-4 specification), or as "AVC" (MPEG-4's Advanced Video Coding). This new compression scheme has been developed in response to technical factors and the needs of an evolving market:

  • MPEG-2 and other older video codecs are relatively inefficient.
  • Much greater computational resources are available today.
  • High Definition video is becoming pervasive, and there is a strong need to store and transmit more efficiently the higher quantity of data of HD (about six times more than Standard Definition video).
What is Bit Rate?

The amount of compressed video data delivered to the decoding system per unit of time. The higher the bit rate, the higher the quality and/or the resolution of the video. For optical disc formats, this is usually measured in megabits per second (Mbps).
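As a worked example of what a bit rate implies for storage (the 25 Mbps figure is an assumed disc-class rate, not from the text):

```python
# Storage consumed by a constant-bit-rate stream, in decimal gigabytes.
def storage_gb(mbps, seconds):
    return mbps * 1_000_000 * seconds / 8 / 1_000_000_000

two_hours = storage_gb(25, 2 * 3600)   # 22.5 GB for a 2-hour movie at 25 Mbps
```

This is why a lower-bit-rate codec directly translates into more content per disc at the same quality.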
What is JPEG2000?

JPEG2000 is a new image compression standard being developed by the Joint Photographic Experts Group (JPEG), part of the International Organization for Standardization (ISO). It will reach "Committee Draft" (CD) status in December 1999. It is designed for different types of still images (bi-level, gray-level, color, multicomponent), allowing different imaging models (client/server, real-time transmission, image library archival, limited buffer and bandwidth resources, etc.) within a unified system.

JPEG2000 is intended to provide low bit-rate operation with rate-distortion and subjective image quality performance superior to existing standards, without sacrificing performance at other points in the rate-distortion spectrum.

It has been decided to register the file extensions for testing and final version of JPEG2000 as ".j2k".

JPEG2000 addresses areas where current standards fail to produce the best quality of performance, such as:

  • Low bit rate compression performance (rates below 0.25 bpp for highly-detailed gray-level images)
  • Lossless and lossy compression in a single codestream
  • Seamless quality and resolution scalability, without having to download the entire file. The major benefit is the conservation of bandwidth.
  • Large images: JPEG is restricted to 64K x 64K images (without tiling); JPEG2000 will handle image sizes up to (2^32 - 1) pixels on a side
  • Single decompression architecture
  • Error resilience for transmission in noisy environments, such as wireless and the Internet
  • Computer generated imagery
  • Compound documents
  • Region of Interest coding
  • Improved compression techniques to accommodate richer content and higher resolutions
  • Metadata mechanisms for incorporating additional non-image data as part of the file

JPEG2000 will be able to handle up to 256 channels of information, as compared to JPEG, which is limited to only RGB data. Thus, JPEG2000 will be capable of describing complete alternate color models, such as CMYK, and full ICC (International Color Consortium) color profiles.

Compression Efficiency

Early results show a 20% compression efficiency improvement over JPEG, and a 40% improvement over Flashpix.

Important factors taken into account for achieving high compression efficiency:

  • Embedded lossy to lossless
  • Multiple component images
  • Static and dynamic Region-of-interest
  • Error resilience
  • Spatial and quality scalability
  • Rate-control

JPEG2000 has two coding modes:

  • DCT-based coding mode: Currently baseline JPEG
  • Wavelet-based coding mode: Includes non-reversible and reversible transforms

For a complete definition of the existing JPEG2000 compression system, please refer to the Verification Model (see "References" below). The discussion below is adapted from the JPEG2000 VM4.0.


The coder is essentially a bit-plane coder, using the same Layered Zero Coding (LZC) techniques that have been employed in a number of embedded wavelet coders and were originally proposed by Taubman and Zakhor (IEEE Trans. Image Processing, September 1994; see "References" below). In fact, many of the ideas presented in VM4.0, including the use of separate code blocks and post-compression rate-distortion optimization, are taken directly from that work and Dr. Taubman's own doctoral dissertation (UC Berkeley, 1994). The key additions are:
  • The use of fractional bit-planes, in which the quantization symbols for any given quantization layer (or bit-plane) are coded in a succession of separate passes, rather than just one pass
  • A simple embedded quad-tree algorithm is used to identify whether or not each of a collection of "sub-blocks" contains any non-zero (significant) samples at each quantization layer, so that the encoding and decoding algorithms need only visit those samples which lie within sub-blocks which are known to have significant samples.
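The sub-block significance test described above can be sketched as a small recursive check; this is an illustrative simplification, not the actual VM4.0 algorithm:

```python
def significant(block, bitplane):
    """True if any sample in the block is significant at this bit-plane,
    i.e. its magnitude reaches 2**bitplane."""
    threshold = 1 << bitplane
    return any(abs(s) >= threshold for row in block for s in row)

def quadtree_significance(block, bitplane):
    """Recursively split a square block into quadrants; return the
    (row, col, size) triples of the smallest sub-blocks that contain
    significant samples. Insignificant quadrants are pruned whole."""
    def recurse(r, c, size):
        sub = [row[c:c + size] for row in block[r:r + size]]
        if not significant(sub, bitplane):
            return []                    # whole quadrant can be skipped
        if size <= 2:                    # smallest sub-block: visit samples
            return [(r, c, size)]
        half = size // 2
        found = []
        for dr in (0, half):
            for dc in (0, half):
                found += recurse(r + dr, c + dc, half)
        return found
    return recurse(0, 0, len(block))
```

Only the quadrants returned need be visited by the entropy coder at that bit-plane; everything else is skipped at the cost of a few significance bits.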

EBCOT: The Basic Idea

In VM4, the coding algorithm used is known as "Embedded Block Coding with Optimized Truncation" (EBCOT). The coding subsystem in JPEG2000 is responsible for both the low level entropy coding operations associated with the representation of subband sample values, and organizing and packing the resulting codewords into the bit stream.

The basic idea in EBCOT is to divide each subband into blocks of samples which are coded independently. For each block, a separate bitstream is generated without using any information from the other blocks. The bit stream has the property that it can be truncated to a variety of discrete lengths.

Once the entire image has been compressed, a postprocessing operation passes over all the compressed blocks and determines the extent to which each block's embedded bit stream should be truncated in order to achieve a particular target bit rate, distortion bound or other quality metric. More generally, the final bit stream is composed from a collection of so-called "layers", where each layer has an interpretation in terms of overall image quality.

The first, lowest-quality layer is formed from the optimally truncated block bit streams in the manner described above. Each subsequent layer is formed by optimally truncating the block bit streams to achieve successively higher target bit rates, distortion bounds or other quality metrics, as appropriate, and including the additional code words required to augment the information represented in previous layers to the new truncation points.
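The layer-formation step described above can be sketched as a greedy choice by rate-distortion slope. All block names and numbers below are invented for illustration; real EBCOT restricts the candidate points to the convex hull of each block's rate-distortion curve:

```python
# Hedged sketch of EBCOT-style layer formation: repeatedly advance the
# block whose next truncation increment has the steepest
# distortion-reduction-per-byte slope, until the layer's budget is spent.
def form_layer(blocks, budget):
    """blocks maps a block id to its cumulative truncation points
    [(rate_bytes, distortion), ...], starting at (0, D0), with rate
    increasing and distortion decreasing. Returns the chosen rate
    (bytes kept) per block for one quality layer within `budget`."""
    chosen = {b: 0 for b in blocks}   # current truncation-point index
    spent = 0
    while True:
        best, best_slope = None, 0.0
        for b, pts in blocks.items():
            i = chosen[b]
            if i + 1 >= len(pts):
                continue              # block already fully included
            dr = pts[i + 1][0] - pts[i][0]
            dd = pts[i][1] - pts[i + 1][1]
            if spent + dr <= budget and dd / dr > best_slope:
                best, best_slope = b, dd / dr
        if best is None:
            break                     # nothing else fits the budget
        i = chosen[best]
        spent += blocks[best][i + 1][0] - blocks[best][i][0]
        chosen[best] = i + 1
    return {b: blocks[b][i][0] for b, i in chosen.items()}

# Two toy code blocks with convex rate-distortion curves:
toy_blocks = {"A": [(0, 100), (4, 40), (8, 30)],
              "B": [(0, 80), (4, 40), (8, 30)]}
layer1 = form_layer(toy_blocks, 8)    # spends the 8-byte budget by slope
```

The returned truncation points mark where each block's embedded bit stream is cut for this layer; the next layer starts from these points with a larger budget.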

An important aspect of the EBCOT algorithm is the manner by which it forms a final bit stream from the independent embedded bit streams generated for every block. The bit stream formation problem is very much simplified when the coder operates on entire subbands at a time, since the additional spatial organization imposed by independent blocks does not exist.

The fact that blocks are encoded independently enables the "random access" feature. Suppose, however, that the bit stream must also possess the "SNR progressive" feature. These two features appear to work against each other since the random access feature requires that individual blocks be separately decodable, while the SNR progressive feature requires that the embedded bit streams for these blocks be distributed throughout the bit stream so that more important information always precedes less important information, regardless of the spatial location associated with this information. It would seem that the amount of overhead required to identify the individual blocks within this distributed representation would be quite considerable.

As the image becomes larger, the increased overhead required to identify the exact sequence of a larger number of block segments is largely wasted because many of these blocks will have almost identical rate-distortion slopes so that the order in which they appear is largely immaterial. It makes sense, therefore, to identify the block truncation points which are very similar and include the relevant code bytes for each of these blocks in a pre-defined order. This is essentially the bit stream layering idea.

Basically, the bit stream is organized as a succession of layers, where each layer contains the additional contributions from each code block (some contributions may be empty). The block truncation points associated with each layer are optimal in the rate-distortion sense, which means that the bit stream obtained by discarding a whole number of least important layers will always be rate-distortion optimal. If the bit stream is truncated part way through a layer then it will not be strictly optimal, but the departure from optimality can be small if the number of layers is large.

As the number of layers is increased so that the number of code bytes in each layer is decreased, the rate-distortion slopes associated with all block truncation points in the layer will become increasingly similar; however, the number of code blocks which do not contribute to the layer will also increase so that the overhead associated with identifying the code blocks which do contribute to the layer will increase. In practice, we find that optimal compression performance for SNR progressive applications is achieved when the number of layers is approximately twice as large as the number of sub-bit-plane passes made by the entropy coder (that is, the bit stream contains twice as much granularity as that provided by previous verification models).

The boundaries of the sub-bit-plane passes are also the truncation points for each block's embedded bit stream. Consequently, on average each layer contains contributions from approximately half the code blocks so that the cost of identifying whether or not a block contributes to any given layer (about 2 bits per block) is much less than the cost of identifying a strict order on the block contributions. Moreover, the relative contribution of this overhead to the overall bit rate is independent of the size of the image.

The DIG2000 File Format Proposal

The goal of the DIG2000 Initiative is to create a digital file format that embodies a tightly-integrated set of essential features for storing images, and provides the needed mechanisms for images to be used effectively.

Some of the most important features are:

  • Flexible metadata architecture
  • Unambiguous specification of color (default sRGB)
  • Resolution-independent coordinate system
  • Asymmetric storage and delivery - the server holds the original image and all metadata, and the client can select a subset to be delivered
  • Protection of intellectual property - requires encryption, watermarking etc
  • Improved quality and rendition - consistent colour, print capability
  • Some backwards compatibility with original JPEG compression
  • Object oriented functionalities (coding, information embedding, …)
  • File format

Interesting links

From the official JPEG web site:
    • JPEG links
    • JPEG2000 links, including a PDF version of the first Committee Draft
    • Public Relations
  • A set of JPEG2000 tools for coding and decoding, JP2 file parsing and validation, and static ROI setting and displaying.
  • A non-technical article about JPEG2000 from WebReview.com
  • Organization of the JPEG2000 committee, from the SPEAR project web page
  • Watermarking on JPEG2000
  • The Digital Imaging Group (DIG) web site. The DIG2000 working group proposed a file format for use with JPEG2000


References

  • JPEG2000 Verification Model 4.0, ISO/IEC JTC 1/SC 29/WG 1, Charilaos Christopoulos (Ericsson, Sweden), Editor, April 22, 1999. (Note: the latest version of the VM is v5.0.)
  • Requirements Ad Hoc Group, "JPEG2000 requirements and profiles version 6.0," WG1 Vancouver Meeting, July 1999.
  • D. Taubman, "High Performance Scalable Image Compression with EBCOT," to appear in IEEE Transactions on Image Processing. Submitted March 1999; revised August 1999. Available in PDF format.
  • D. Taubman and A. Zakhor, "Multirate 3-D Subband Coding of Video," IEEE Transactions on Image Processing, vol. 3, no. 5, pp. 572-588, September 1994.
  • D. Taubman, "Directionality and Scalability in Image and Video Compression," Ph.D. thesis, Department of Electrical Engineering and Computer Sciences, University of California at Berkeley, December 1994.
  • JPEG2000 Committee Draft Version 1.0, December 1999. Available in PDF format.
  • Tutorial on JPEG2000, by Dr. Charilaos Christopoulos, presented at ICIP '99.
  • Video Technology Branch, Media Technologies Laboratory, DSP Solutions R&D Center, Texas Instruments.
  • Digital Imaging Group, "DIG2000 file format proposal overview," DIG2000 Working Group, October 30, 1998.
  • The Digital Imaging Group's DIG2000 Initiative, "An Overview of JPEG2000 Technology and Benefits."
  • JPEG Public Relations press releases.

What would the bandwidth be for different Video Media?

HDV Video (high resolution video recorded on mini DV tape as MPEG-2)
HDV-1 at 720p: 19 MBit/s
HDV-2 at 1080i: 25 MBit/s

Computer Graphics SXGA (1280 x 1024 @ 60Hz):
pixel clock of 108 MHz, color depth 24 bit => 2.6 Gbps

High Quality Video (not compressed):
Digital Data = 30 frames per second / 640 x 480 pixels / 24-bit color / pixel => 221 Mbps

Reduced Quality Video (not compressed):
Digital Data = 15 frames per second / 320 x 240 pixels / 16-bit color / pixel => 18 Mbps

Reduced Quality Video (16-bit color), 16 frames per second: 320 x 240 pixels x 16 bits x 16 frames = 19,660,800 bits per second => 19.7 Mbps
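The uncompressed figures above follow directly from multiplying resolution, color depth, and frame rate; a minimal sketch:

```python
# Uncompressed video bandwidth: width x height x bits-per-pixel x fps.
def raw_video_mbps(width, height, bits_per_pixel, fps):
    """Bandwidth of uncompressed digital video in megabits per second."""
    return width * height * bits_per_pixel * fps / 1_000_000

hq  = raw_video_mbps(640, 480, 24, 30)   # ~221 Mbps, as quoted above
low = raw_video_mbps(320, 240, 16, 15)   # ~18 Mbps, as quoted above
```

Comparing these raw rates with the compressed figures below makes the 10:1 to 100:1 gain from codecs like MPEG-2 immediately visible.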

Videoconferencing: 64 kbps to 3 Mbps
Corporate Video: 2 Mbps to 3 Mbps
MPEG-1 Video: 1 to 3 Mbps

MPEG-2 Video:
- Compression 35 : 1 - 16 : 1 (Quality for Presentation Purpose) 5 - 10 Mbps
- Compression 16 : 1 - 7 : 1 (Betacam SP Replacement) 10 - 25 Mbps
- Compression 7 : 1 - 2 : 1 (Spectacular Imaging) 25 - 90 Mbps
- Compression 40 : 1 - 16 : 1 (DVD) 3.5 - 10 Mbps

CD Quality Audio:
44.1 kHz sample rate / 16-bit samples / 2 audio channels = 1.4 Mbps

Low Quality Audio:
11.025 kHz sample rate / 8-bit samples / 1 audio channel = 0.1 Mbps
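The audio figures are computed the same way, from sample rate, sample depth, and channel count:

```python
# Uncompressed PCM audio bandwidth: sample rate x bits x channels.
def pcm_kbps(sample_rate_hz, bits_per_sample, channels):
    """Bandwidth of uncompressed PCM audio in kilobits per second."""
    return sample_rate_hz * bits_per_sample * channels / 1000

cd  = pcm_kbps(44100, 16, 2)   # 1411.2 kbps, i.e. about 1.4 Mbps
low = pcm_kbps(11025, 8, 1)    # 88.2 kbps, i.e. about 0.1 Mbps
```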

MPEG-1 Audio, Layer 3 (MP3), "near CD-quality": 96 to 256 kbps

Computer Data:
10BaseT Ethernet: 10 Mbps
100BaseT Ethernet: 100 Mbps
Gigabit Ethernet: 1 Gbps

FireWire: 400 Mbps
FireWire (new): 800 Mbps

USB (1.1): 12 Mbps
USB (2.0): 480 Mbps

Fibre Channel FC: 1 Gbps .. 4 Gbps
Fibre Channel (new developments coming up): 40 Gbps

Computer PCI bus (32-bit/33 MHz): 132 Mbyte/sec
14.4 modem: 1.44 kbyte/sec
28.8 modem: 2.88 kbyte/sec
ISDN: 6.4 kbyte/sec to 15.44 kbyte/sec (std MPEG1)
ATM: 45 to 155 Mbps

(Mbps = Megabits per second)
Mbps stands for millions of bits per second and is a measure of bandwidth. It represents the total information flow over a given time on a communications medium.

A megabit is a million binary pulses or "bits".

Sometimes 1 Mbps is defined as 1,048,576 bits per second, but bits in data communications have historically been counted using the decimal number system. This means 1 Mbps is 1,000,000 bits per second, and 28.8 kilobits per second is 28,800 bits per second.
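The gap between the two conventions is about five percent, which is worth knowing when comparing quoted rates:

```python
decimal_mbit = 1_000_000   # decimal convention, standard for data rates
binary_mbit  = 2 ** 20     # 1,048,576 -- the "binary" megabit

# Relative difference between the binary and decimal definitions:
diff_pct = (binary_mbit - decimal_mbit) / decimal_mbit * 100   # ~4.9%
```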

Bandwidth (the width of a band of electromagnetic frequencies) is used to mean how fast data flows on a given transmission path, and the width of the range of frequencies that an electronic signal occupies on a given transmission medium. In digital systems, bandwidth is expressed as data speed in bits per second.

MPEG-4 Brief History

MPEG-4: ISO/IEC 14496

The MPEG-4 format was approved in different versions:
Version 1 was approved by MPEG in December 1998, and Version 2 in December 1999. Every new version must be backward compatible with the previous versions.

MPEG-4 does not define specific transport layers. The adaptation to existing transport layers has been defined as transport over MPEG-2 Transport Stream and transport over IP.
The MPEG-4 format allows the hybrid coding of pixel-based natural images and video together with computer-generated synthetic scenes. Video can be progressive or interlaced. Resolution can be up to studio resolution of 4000 x 4000 pixels.

Bitrates are typically between 5 kbit/s and more than 1 Gbit/s.

The MPEG-4 compression algorithm is very efficient at all supported bit rates and can deliver much higher image quality than MPEG-2.

MPEG-4 audio can use a wide variety of audio formats, from intelligible speech to high-quality multichannel audio, and from very low to very high bit rates.

The MPEG-4 coding of audio-visual objects provides a large set of tools, grouped into Profiles. These are:

  • Visual Profiles
  • Audio Profiles
  • Graphics Profiles
  • Scene Graph Profiles
  • MPEG-J Profiles
  • Object Descriptor Profiles