
JPEG Color Space Conversion Error

As discussed in the JPEG Compression article, there are several sources of "error" in going from the original raw image data to the compressed JPEG photo. One of the less obvious "losses" is in the color conversion step.

Original Image before Color Space Conversion Error

Why change the color space? (RGB to YCbCr / YCrCb / YCC)

A single 6-megapixel digital photo would actually consume 18 MB if it were stored in its uncompressed form on a PC! Clearly, such huge file sizes would cause countless problems for the average consumer: larger memory cards, longer import times and slower frame-per-second capture rates.

One of the main reasons for the popularity of the JPEG compressed file format for photographic images is its ability to encode high quality photos in a much smaller file size. A 6-megapixel image can typically be compressed down to 2-3 MB while still retaining much of the original image quality. To achieve this compression ratio of roughly 8:1, some detail must be discarded in the process. The unique characteristic of JPEG image compression is the manner in which various details are selected for discard.

The JPEG format has been engineered to take advantage of the limitations of the Human Visual System (HVS), discarding "information" that the human eye has a difficult time discerning. If the average person can't differentiate particular types of image detail, then there is little benefit in retaining them in the resulting image file.

For ease of implementation, most native uncompressed image data is stored as RGB (Red-Green-Blue) tristimulus values. An image is typically composed of three channels, one for each of these components (Red, Green and Blue). JPEG offers some flexibility in how each channel is compressed, both through the specification of quantization tables and through chroma subsampling.

Low frequency image content is made up of the brightness or color differences that change slowly across a large area of the image. For example, a sunset might have low-frequency brightness content that starts bright at the horizon and gradually fades out to dark at the top of the photo.

High frequency information is observed in any fine detail. The highest-frequency content would be a change in value (brightness or color) between every adjacent pixel in an image. Viewed from a distance, much of this high-frequency content can be discarded with little impact on the overall image quality.

The human eye's frequency response shows a strong sensitivity to high-frequency luminance (brightness) and a weak sensitivity to high-frequency chrominance (color information). Realizing this, an optimized compression scheme would apply less compression to the luminance detail (i.e. higher quality, lower loss) than to the chrominance detail.

To accomplish this, the JPEG compression scheme begins with a color space conversion from RGB (Red - Green - Blue) into YCbCr (Luminance - Blue/Yellow - Red/Green) (also called YCrCb by some). This conversion takes the three standard channels (RGB) and maps them into a different representation that is based on a luminance (brightness) channel and two opposing color channels.

RGB color space conversion to YCbCr

Having done this conversion step, the JPEG image compression algorithm can then apply more compression to the color information channels than the luminance information and yet still arrive at an acceptable resulting image quality.

The most significant savings in JPEG compression come from the truncation or elimination of high-frequency detail (through a process called quantization). Low-frequency information is preserved, while detail at higher frequencies is progressively discarded in greater amounts as the frequency increases.
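As a rough sketch of what quantization does, the following C fragment divides each coefficient of an 8x8 frequency-domain (DCT) block by the corresponding entry of a quantization table and rounds the result. In a real file the table values come from the DQT segment; the point is simply that larger divisors at the high-frequency positions round more of that detail away to zero:

#include <math.h>

// Quantize one 8x8 block of DCT coefficients (index 0 is the DC term;
// higher indices represent higher spatial frequencies). Larger quant[]
// divisors at high frequencies send more of that detail to zero, which
// is where JPEG realizes most of its savings.
void quantize_block(const double dct[64], const unsigned short quant[64],
                    short out[64])
{
    int i;
    for (i = 0; i < 64; i++) {
        out[i] = (short)floor(dct[i] / quant[i] + 0.5);
    }
}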

If JPEG compression operated on RGB data instead of YCbCr data, one would not be able to discard as much of the higher frequency content without causing a noticeable loss in image quality. Some studies have compared the suitability of other color spaces (e.g. RGB, HSV, Lab, etc.) for image compression and have demonstrated that YCbCr is a very suitable choice (at least for human observation of natural photos).

Attempt to Recompress Losslessly

This article came to be as I was attempting to answer a reader's question about a way to perform a lossless extend (lossless extension of image data, not rotation) with a source JPEG image. Trying to be clever, I initially figured that I could do the following:

  • Analyze the source Quantization tables
  • Analyze the source chroma subsampling factor
  • Decompress the source JPEG into a lossless format (such as BMP or TIFF)
  • Edit the intermediate image in Photoshop and resave in a lossless format
  • Recompress the intermediate image with the same quantization tables and chroma subsampling
  • Compare the results (see the comparison sketch below)
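For the final comparison step, a minimal sketch of the per-channel difference tally is shown below (it assumes both versions have been decoded into interleaved 8-bit RGB buffers; the function and buffer names are only illustrative):

#include <stdio.h>
#include <stdlib.h>

// Tally how far each pixel of the recompressed decode (b) deviates
// from the original decode (a), per channel. Both buffers are assumed
// to be interleaved 8-bit RGB of identical dimensions.
void compare_rgb(const unsigned char *a, const unsigned char *b,
                 long num_pixels)
{
    long hist[3][256] = {{0}};
    long i;
    int  d;
    for (i = 0; i < num_pixels * 3; i++) {
        int ch = (int)(i % 3);                 // 0=R, 1=G, 2=B
        hist[ch][abs((int)a[i] - (int)b[i])]++;
    }
    for (d = 0; d < 8; d++)                    // differences 0..7
        printf("%d: R=%ld G=%ld B=%ld\n",
               d, hist[0][d], hist[1][d], hist[2][d]);
}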

So, what happened?

 

Image after Color Space Conversion Error
Color Space Conversion Error Differences

The resulting images showed a nearly perfect lossless edit cycle, but there were a small number of MCU blocks (minimum coded units, 8x8, 16x8 or 16x16 pixel tiles) that exhibited some "error". While this error was very small (not more than about 1% deviation in any of the three color channels), and only affecting a fraction (0.72%) of the overall image, it still exists. More interestingly, it only seemed to be apparent in the highlight regions of my original photo.

The "Error Differences" image above shows the RGB differences, enhanced with a threshold cutoff of 1.

Resulting error differences:

Total Pixel Count: 3,145,728
Difference   Red                  Green                Blue
Same         3,127,957 (99.44%)   3,124,464 (99.32%)   3,118,702 (99.14%)
1               15,898  (0.51%)      20,608  (0.66%)      13,991  (0.44%)
2                1,772  (0.06%)         642  (0.02%)      11,101  (0.35%)
3                  101  (0.00%)          14  (0.00%)       1,767  (0.06%)
4                    0                    0                  153  (0.00%)
5                    0                    0                   13  (0.00%)
6                    0                    0                    0
7                    0                    0                    0

Although I had removed the quantization loss step (the primary source of the compressed JPEG file savings), there still remained the error in the color space conversion step.

Color Space Conversion Error

Any time you convert an image from one color space to another, you are almost guaranteed to introduce some error (image degradation) in the resulting output. This is because most images are defined with 8-bit precision per channel (24 bits/pixel); if the conversion is performed in floating point and the output format is also limited to 8 bits per channel, some degree of rounding or truncation is needed to arrive back at an 8-bit integer.

For example, let's assume that a transfer / conversion function of the following form is required (where R is the integer Red value 0-255 and Z is some arbitrary resulting intermediate color space component value 0-255):

Z = R * 0.29900

If the original R value is 132, then:

Z = 39.468 -> Rounding -> 39

The delta of 0.468 is lost in the process. You'll see that inputs of 129...132 all give a rounded integer output of 39. The translation therefore compresses several inputs into a single output, losing information. This is a many-to-one relationship, which is irreversible.

Input R      Output Z
126...128    38
129...132    39
133...135    40

If the process is now reversed, you'll see that:

Input Z      Output R
38           127
39           130
40           134

Therefore, in the reverse direction, there is no way that one can create a resulting R channel output of 128...129, 131...133, etc. This is known as banding or posterization and, if significant enough, it may be visible in the resulting image (blue sky gradients, etc.).

In the above, it's clear to see that round-trip conversion (into and out of the intermediate color space) results in some degree of error:

Input R    Intermediate Z    Output R    Error
129        39                130         +1
130        39                130          0
131        39                130         -1
132        39                130         -2
133        40                134         +1
134        40                134          0
135        40                134         -1
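A few lines of C reproduce the toy round trip above (using the same illustrative 0.299 factor):

#include <stdio.h>
#include <math.h>

// Round-trip R -> Z -> R' through an 8-bit intermediate, using the
// illustrative transfer function Z = R * 0.299.
int main(void)
{
    int r;
    for (r = 126; r <= 135; r++) {
        int z  = (int)floor(r * 0.299 + 0.5);  // forward, with rounding
        int r2 = (int)floor(z / 0.299 + 0.5);  // reverse, with rounding
        printf("R=%3d  Z=%2d  R'=%3d  error=%+d\n", r, z, r2, r2 - r);
    }
    return 0;
}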

Measuring the Error

To get an idea of the magnitude of error that one may observe in the JPEG color resampling stage, I coded up the RGB -> YCbCr conversion algorithm in Excel, then took that output and performed the inverse conversion YCbCr -> RGB, taking into account the 8-bit limits on precision (i.e. rounding / truncation).

Across a random sampling of tristimulus (RGB) inputs, I see a resulting round-trip error in the amount of:

  • Red Error: -3 ... +0
  • Green Error: -1 ... +1
  • Blue Error: -3 ... +0

Each of these deltas measures the difference between a given Red, Green or Blue input value and the final Red, Green or Blue output value after an intermediate conversion to YCbCr. In other words, I observed that in both the Red and Blue channels the round-trip process causes an overall reduction in the resulting value, while the Green channel may experience either a slight reduction or a slight increase.

Although the graph above shows only a random sampling, these limits were shown to hold true across a significant number of input combinations.
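For the curious, the sampling is easy to make exhaustive. The sketch below sweeps all 16.7 million 8-bit RGB inputs through an 8-bit YCbCr round trip using the usual JPEG (ITU-R BT.601) coefficients; note that the exact limits observed will depend on the rounding and clamping conventions chosen:

#include <stdio.h>
#include <math.h>

// Clamp a floating-point value to an 8-bit integer with rounding.
static int clamp8(double v)
{
    int i = (int)floor(v + 0.5);
    return i < 0 ? 0 : (i > 255 ? 255 : i);
}

// Round-trip every RGB value through 8-bit YCbCr and track the
// worst-case error per channel.
int main(void)
{
    int min_err[3] = {255, 255, 255}, max_err[3] = {-255, -255, -255};
    int r, g, b, c;
    for (r = 0; r < 256; r++)
    for (g = 0; g < 256; g++)
    for (b = 0; b < 256; b++) {
        int y  = clamp8( 0.2990*r + 0.5870*g + 0.1140*b);
        int cb = clamp8(-0.1687*r - 0.3313*g + 0.5000*b + 128.0);
        int cr = clamp8( 0.5000*r - 0.4187*g - 0.0813*b + 128.0);
        int out[3], in[3];
        out[0] = clamp8(y + 1.402*(cr - 128));
        out[1] = clamp8(y - 0.3441*(cb - 128) - 0.7141*(cr - 128));
        out[2] = clamp8(y + 1.772*(cb - 128));
        in[0] = r; in[1] = g; in[2] = b;
        for (c = 0; c < 3; c++) {
            int e = out[c] - in[c];
            if (e < min_err[c]) min_err[c] = e;
            if (e > max_err[c]) max_err[c] = e;
        }
    }
    printf("R: %+d...%+d  G: %+d...%+d  B: %+d...%+d\n",
           min_err[0], max_err[0], min_err[1], max_err[1],
           min_err[2], max_err[2]);
    return 0;
}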

Conclusions

Even with the careful application of quantization tables and chroma subsampling that identically match the source image, you cannot resave a JPEG image without some small image degradation! This slight degradation is incurred in the color space conversion step.

Therefore, the only way to resave a JPEG image losslessly is to skip the decompression stage and perform matrix manipulation on the coded MCUs themselves, without re-entering the quantization or color space conversion stages. This process severely limits what you are capable of doing in the way of edits.

Reference Formulae

For the purposes of the JPEG color space conversion (between RGB and YCbCr or YCrCb), the formula is given below.

Note that for the following formulae, the range of each input (R,G,B) is [-128...+127]. As mentioned on the JPEG Huffman Coding page, a level shift of (+128) will be required to get the standard [0...+255] range for each value.

Coefficients:  Cred = 0.299,  Cgreen = 0.587,  Cblue = 0.114

RGB to YCbCr:
Y  = Cred*R + Cgreen*G + Cblue*B     =  0.299*R + 0.587*G + 0.114*B
Cb = (B - Y) / (2*(1 - Cblue))       = -0.1687*R - 0.3313*G + 0.5000*B
Cr = (R - Y) / (2*(1 - Cred))        =  0.5000*R - 0.4187*G - 0.0813*B

YCbCr to RGB:
R = Y + 1.402*Cr
G = Y - 0.3441*Cb - 0.7141*Cr
B = Y + 1.772*Cb
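In C, a direct sketch of these reference formulae might look as follows (floating point throughout; the rounding back to 8-bit integers, which is where the error discussed above creeps in, is left to the caller):

// Forward conversion. With level-shifted inputs (-128...+127), Y comes
// out level-shifted as well, and Cb/Cr come out centered about zero.
void rgb_to_ycbcr(double R, double G, double B,
                  double *Y, double *Cb, double *Cr)
{
    *Y  = 0.299*R + 0.587*G + 0.114*B;
    *Cb = (B - *Y) / 1.772;   // = -0.1687*R - 0.3313*G + 0.5000*B
    *Cr = (R - *Y) / 1.402;   // =  0.5000*R - 0.4187*G - 0.0813*B
}

// Inverse conversion.
void ycbcr_to_rgb(double Y, double Cb, double Cr,
                  double *R, double *G, double *B)
{
    *R = Y + 1.402*Cr;
    *G = Y - 0.3441*Cb - 0.7141*Cr;
    *B = Y + 1.772*Cb;
}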

 


Reader's Comments:

2016-06-22Anderson Lima
 Another doubt: where in the file can I find whether I have a subsampling ratio of 1x1, 1x2, 2x1 or 2x2? Thank you.
 The subsampling ratio can be extracted when you parse the SOF marker. In the SOF marker, you'll find "Horizontal Sampling Factor" and "Vertical Sampling Factor". I create a representation of the subsampling ratio (eg. AxB) by:

A = max(SOF_HorzSampFact_Hi[])/SOF_HorzSampFact_Hi[Comp]
B = max(SOF_VertSampFact_Vi[])/SOF_VertSampFact_Vi[Comp]
Comp is the index of a chrominance component (eg. Cb).
For more details, have a look at ITU-T.81 section A.1.1
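(In C, that computation might be sketched as follows; the array and parameter names are only illustrative:)

// Derive the per-component subsampling ratio "AxB" from the SOF
// sampling factors. h[] and v[] hold the Hi/Vi fields parsed from the
// SOF marker for each of the ncomp components; comp is the index of
// the chrominance component of interest (eg. Cb).
void subsamp_ratio(const int h[], const int v[], int ncomp, int comp,
                   int *A, int *B)
{
    int i, hmax = 0, vmax = 0;
    for (i = 0; i < ncomp; i++) {
        if (h[i] > hmax) hmax = h[i];
        if (v[i] > vmax) vmax = v[i];
    }
    *A = hmax / h[comp];
    *B = vmax / v[comp];
}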
2016-01-20j7n
 1) Assuming we have a 16-bit image, would the quality theoretically increase if the YCC transform was done at a higher bit depth, and the result then dithered to 8 bits for the rest of the encoding? Currently my image editor, Photoshop 7.0, requires that I convert to 8-bit RGB before JPEG becomes available as a saving option, which results in two truncation/rounding errors. (I'm certain PS's behavior hasn't changed for several versions at least.) Why is it that no image editor operates in higher precision?

2) I would leave the dreaded color subsampling out of this experiment, so that you test one aspect at a time. Subsampling introduces massive color bleeding if the viewer does a more sophisticated (and irreversible) interpolation than simple doubling of the chroma values. Photoshop, at least until recently, used pixel doubling, which does minimize generation loss but looks bad. At the support forums, Adobe staff maintained that they've chosen the right method. Of course they have the right to say anything. I figure that Photoshop users would know better than to use JPEG as an editing format.

Here I made a small experiment to demonstrate generation loss from subsampling. Sites like ImgUr usually force it on, and the modern "better formats" like WebP don't even support full resolution.

http://i.imgur.com/lmZSiNG.png
 1) For the moment I'll assume it was a 16-bit image from a raw capture. Most JPEG encoders only support 8-bit mode, though a few may support 12-bit as well. When Photoshop requires that you change the color mode to 8-bit RGB before saving as JPEG, it is quite likely that the double-conversion is done as you point out. However, it may be theoretically possible that only a single conversion step is performed (ie. 16-bit RGB to 8-bit YCC) if the change to "8-bit RGB" was only conceptual in nature... (ie. show 8-bit mode but the actual conversion only happens during an output operation). However, this seems less likely to me since that wouldn't help reduce memory consumption / increase performance when changing to 8-bit.

2) Interesting observations regarding the color subsampling. It might be worth pointing out that the compressed file size is a poor quantitative indicator of whether generational losses occur, but it does give a ballpark hint.

Thanks for sharing!
2011-08-16Vinicius Garcia
 Hey!
Hello, my friend!

I need to convert RGB to YCbCr and I'm trying to do it this way:

/* Author: Vinicius Garcia
* Date  : 09.Aug.2011
*
* Function that converts an RGB pixel to YCbCr
*
* param : int R value of the pixel in the red channel
* param : int G value of the pixel in the green channel
* param : int B value of the pixel in the blue channel
* return: int* integer array with the computed Y, Cb and Cr values - in this order
*/
int* converter_RGB_para_YCbCr(int R, int G, int B){
int* YCbCr = (int*) malloc(3 * sizeof(int));

double delta = 128.0; // Constant needed for the color conversion calculation
double Y = (0.299 * R + 0.587 * G + 0.114 * B);
double Cb = ((B - Y) * 0.564 + delta);
double Cr = ((R - Y) * 0.713 + delta);

YCbCr[0] = (int) Y;
YCbCr[1] = (int) Cb;
YCbCr[2] = (int) Cr;
return YCbCr;
}

But it doesn't work for me!

I was comparing with cvCvtColor (from the OpenCV library) and the results don't match:

R = 88, G = 76, B = 78
cvCvtColor: Y = 80, Cb = 127, Cr = 134
myfunction: Y = 382, Cb = 132, Cr = 132 (Cb and Cr are always equal!)

I really need help with this; I've been trying to do this for a long time and I couldn't find any answer to my doubt.

I appreciate any help!
Thanks!
2009-05-29dbaker
 Thanks for your articles, very well written, clear, informative, technical.
 Thanks!
2009-03-23TSD
 Thanks for this great article
2009-02-16Sankara Narayanan V
 Hi Cal,

Thanks for the nice article. I have a doubt regarding YCbCr to RGB conversion. I have an 8x8 pixel JPEG file. I decoded it and after DCT I have 4 blocks of Y, 1 block of Cb and 1 block of Cr with me (each 8x8 pixels). Y0 is valid and Y1, Y2 and Y3 are zero matrices. I understand that an 8x8 pixel JPEG with 4:2:0 subsampling YCbCr to RGB color conversion will need 1 block (8x8) Y. Here, I have Y0 and that is perfect. But I think, we need only 4x4 values (left most 4x4 submatrix in the 8x8 matrix) for Cb and Cr. Is this understanding correct?

I noticed that there is some error in the JPEGsnoop tool in doing this. Because, if you decode the same image with JPEGsnoop and other picture viewers, you can see the clear difference. And, I feel that this difference is due to the difference in handling the Cb and Cr blocks for 8x8 image.

Can you please look at this and throw some light on this specific case (and in general the 8x8 blocks of Cb and Cr in which only 4x4 will be used (in whatever case - not just 8x8 pixel case) for 4:2:0 subsampling?
 Hello Sankara --

I took a look at your test image and now see where your confusion may have come in. When you use 4:2:0 subsampling, the minimum image size actually becomes 16x16 pixels. Irrespective of the subsampling, each channel is encoded as an 8x8 array. When you have 4:2:0 subsampling (also known as 2x2), you get: Y0 Y1 Y2 Y3 Cb Cr. The chroma subsampling means that the 8x8 arrays within each Cb and Cr are actually representing a 16x16 pixel region in the final image (their resolution has been halved).

Original Image

Therefore, although your rb8.jpg sample image might have been saved as an 8x8 image (with 4:2:0 subsampling), the scan data actually represents a 16x16 pixel region. Most image editors will display your image by truncating the 16x16 pixel region to the 8x8 region that you requested (always measured from the top-left corner).

JPEGsnoop is different in that I always display the entire image region defined by the MCUs, but then draw a dashed line to represent where the image is "supposed" to be cut off. So, you will see that JPEGsnoop does indeed display it correctly (look inside the dashed top-left region), but it gives you a hint as to what the full MCUs would look like.

Enlarged with JPEGsnoop
Note the dashed line that represents the edges of the "desired" image dimensions. The single MCU in your image actually represents the 16x16 pixel region.


So, the short answer to your question is that you still use 8x8 regions, not 4x4 regions, but that the image dimensions will truncate the MCUs to be partial MCUs. I hope that helps, Cal.
2008-03-10subrahmanyam
 I am using a JPEG decoder; the output of the decoder is in RGB format. I need to convert RGB to YCbCr; please provide a C source file for that.

Regards,

Subrahmanyam
2008-02-10Jameel
 Once I get the values for Y, Cr and Cb for each pixel using the referred formulas, what should I do with these YCrCb values to see how the resulting image looks in the YCrCb color space?
 
2008-01-14Antonio Barbra
 Hello,
My question is: should all DHT positions and lengths be shown in JPEGsnoop after I open a file?
 If you have turned on the DHT Expand option under the Options menu, then yes, this information will be shown.
2007-09-22Ken
 This is the best I have read so far about Jpeg anywhere!

However, I would like to take exception to the user who commented about camera pixels. Although he is correct in most cases - typically in "low grade" devices such as consumer cameras - there are situations where 6M pixels really means 6 megapixels per channel (R, G and B).
 Thanks!
2007-08-19Dave_R
 I have continued searching for info on this and concluded that my previous assumptions were slightly off target.

It appears that the encoding is sYCC not PhotoYCC

The Exif 2.2 Annex E references sYCC, and the JPEGs from Canon, Kodak and Samsung include Exif 2.2 in their header.

The encoding definition is on the ICC website:
http://www.color.org/documents/sycc.pdf

It turns out that it uses the standard sRGB encoding, but includes the out of range and negative RGB values in the YCbCr encoding, the same way that PhotoYCC did.

It seems to be promoted as a printing solution, but really it is a scene mapped extension to a device mapped sRGB encoding.

Some published analysis (CIE TC8) shows that sYCC includes 99% of real world surface colours compared with 61% for sRGB.

Now I need to find some software that imports the full sYCC gamut.

I don't think I would have worked this out without JPEGsnoop identifying the out of range values first. Thank you!
2007-08-14Dave_R
 Hello

Thanks for acknowledging my message.

If you do look into this here are a couple of links:

http://www.steves-digicams.com/2007_reviews/c875/samples/100_0178.jpg

This is a jpeg from a Kodak C875 which shows even more extreme out of range RGB values, but none in YCC. It probably means that KODAK as well as Samsung are still encoding to PhotoYCC even though KODAK seem to have removed most of their documentation.

Since KODAK and Samsung are about No.2 and No.3 camera makers I don't understand why no one else has noticed the problem. I seem to be the only one "out of step".

http://graphcomp.com/info/specs/livepicture/fpx.pdf

This is where I found the FlashPix specification. Section 5.3.1 of this defines the conversion from XYZ to CCIR 709 RGB to YCC.

This problem does have some relevance to your original objective of lossless conversion to RGB and back, for editing. Because there are now two requirements that need to be met:

  • a) Convert between YCbCr and RGB with sufficient bit depth. I expect 16bpc TIFF would be enough!

  • b) Use coefficients that relate to a colourspace that is large enough to contain all of the colours in the image. XYZ has to be enough! And XYZ is used as the profile connection space in ICM.
 Great. Thanks for pointing out an example and the specification. Upon a quick glance, I can see the differences in ranging that you've referred to. In the past, I've been looking at the exceeded ranges simply from the perspective of detecting corrupt images, but now you've pointed out a very interesting application for it. Once I get a chance to do a deeper analysis, I'll post back. Thanks Dave!
2007-08-12Dave_R
 Hello

JPEGsnoop has helped me understand a problem that I have with Camera originated JPEGs.

The problem is that the colours, in particular saturation are wrong.

JPEGsnoop shows large amounts of clipping on conversion from YCbCr to RGB colourspaces and, having found this, I have hunted down what may be the sources of the problem.

  • 1) JFIF defines the conversion from JPEG YCbCr to RGB and specifies the coefficients to use. Although JPEG itself does not assume any particular colourspace, every application that I have tried seems to have these coefficients hardwired into the decompression process. It does not seem to be possible to access YCbCr directly or adjust the coefficients for a different colourspace.

    Although a camera has a much larger gamut than sRGB, it is instantly clipped to this lowest common denominator when the JPEG file is opened.

  • 2) KODAK have proposed various specifications over many years, such as Photo CD, FlashPix and RIMM/ROMM. All of these have a common process, as far as I understand it, which takes advantage of the fact that YCbCr can hold a larger gamut than sRGB.

    • a) Starting from camera XYZ, the components are converted to RGB using CCIR 709 colour primaries. Where the XYZ is outside of the RGB triangle the conversion produces RGB values less than 0 or greater than 255. These values are retained, and not clipped.

    • b) The RGB values are converted to the YCbCr space. This retains the full gamut and is called PhotoYCC.

    • c) On decoding for sRGB the process is reversed and the out of range values clipped.

So KODAK have encoded a large gamut in a way that is also directly compatible with JFIF and sRGB. But to access the larger gamut a different decoding process is needed which retains the values outside of sRGB.

Unfortunately I can't find any application which does this. They all seem to follow the JFIF settings blindly.

I expect that Adobe Photoshop with the KODAK Plug-in and stdpyccl.icm handles the files properly, but I can't find any other program that does.

It seems that my Samsung S850 is following the KODAK specification. When checking the camera files with JPEGsnoop there is never any error in YCC, but RGB values before clipping that I have seen range from -67 Blue to 309 Red.

Effectively the YCbCr to sRGB conversion is done outside of the control of ICM and defeats the ability to maintain full gamut in the workspace.

I wonder whether you know of any tools that could do the conversion to a larger gamut than sRGB. It only requires some different coefficients for the conversion from YCbCr to RGB.

Thanks for producing JPEGsnoop, at least I think I understand the problem even if I can't fix it!
2007-07-22oland
 Hi...
I'll get straight to the point, OK?
Thanks for your information about image processing.
This time I have to finish the last project of my graduation, and for that I need the source code for RGB to YCbCr conversion and RGB to HSI conversion. I found it difficult when I was trying to do that... can you help me, friends? Many thanks for your help... see you all
by me
Oland
 Hi Oland --
I am less familiar with the RGB to HSI conversion, but I can help you with the RGB to YCbCr. The following C code should do the trick:
unsigned in_r,  in_g,   in_b;   // Input RGB (0..255)
unsigned out_y, out_cb, out_cr; // Output YCC (0..255)

// Input RGB values (in_*) are integers 0..255
// Output YCC values (out_*) are integers 0..255
// Conversion (typically used for JPEG):
//    Y =  0.2989 R + 0.5866 G + 0.1145 B
//   Cb = -0.1687 R - 0.3312 G + 0.5000 B
//   Cr =  0.5000 R - 0.4183 G - 0.0816 B
// Temporary YCC values (tmp_*) are floats -127..+127

float tmp_y, tmp_cb, tmp_cr;

tmp_y  =  (0.2989*in_r) + (0.5866*in_g) + (0.1145*in_b);
tmp_cb = -(0.1687*in_r) - (0.3312*in_g) + (0.5000*in_b);
tmp_cr =  (0.5000*in_r) - (0.4183*in_g) - (0.0816*in_b);

// The resulting luminance value should be 0..255
// The resulting chrominance values will be -127..+127, 
//   so they need to be shifted by +127.5 each to 
//   bring them into range 0..255
// Finally, we perform explicit integer truncation here
out_y  = (unsigned)(tmp_y);
out_cb = (unsigned)(tmp_cb + 127.5);
out_cr = (unsigned)(tmp_cr + 127.5);
2007-07-04User
 I think you have a mistake in the beginning of the article:

A single 6-megapixel digital photo should actually consume 18.9 megabytes if it were stored in its original, uncompressed form!
It appears that you have multiplied 6M pixels by 3 bytes/pixel (24-bit color RGB) = 18 MB (+0.9 MB, which I don't know why). That is not the correct way to derive the "raw" data size. The actual uncompressed data from the camera is computed as follows:

1. First, you need to be aware that when the manufacturer specifies 6 megapixels, it *really* means there are 6M "sensors". Using the standard Bayer pattern, 50% of them will be G, 25% R and 25% B. There are *not* 6Mx3 pixels, but really 6M pixels!

2. Second, in the number of bits per pixel, there is a difference between CMOS and CCD sensors:

CCD sensors are really linear (1 photon = 1 electron, multiplied by conversion efficiency), which is nice for scientific applications but annoying for photography because of the limited dynamic range. Thus CCD pixels are sampled at the highest feasible resolution, which is usually 14 bits/pixel.

CMOS sensors are somewhat non-linear. In other words, the sensor itself has a non-unitary gamma (>1) that "compresses" the dynamic range (which is good for photography). Therefore CMOS sensors require less resolution, typically 12 bits. Note: the main reason for 12-bit ADC converters is that in CMOS, the converter is usually integrated into the sensor chip, which further limits the ADC resolution. This disadvantage is offset by the fact that the sensor already provides a slightly compressed dynamic-range signal.

To summarize, a 6M-pixel CMOS camera would take 6M * 12 bits = 9 MB per image, and a 6M-pixel CCD camera would take 6M * 14 bits = 10.5 MB per image.

ISO100/200/400 settings don't matter to this discussion - they determine the analog gain (between the sensor and the converter), and don't affect converter *resolution*.
 First off, thanks for your excellent input!

In my text where I referenced the uncompressed size, the intention was to compare against the storage requirements on the PC system (e.g. bitmap), not the camera's RAW storage format. In the case of most 6 megapixel cameras (3072 x 2048 pixels), this works out to be 18.9 million bytes or 18 MB. (I've corrected the MB).

As you rightly point out, one must keep in mind the Color Filter Array mosaic that is typically used in these cameras. The camera's RAW storage will be a function of both the bit depth and what sensor array type is being used (Bayer, Foveon, Kodak's new panchromatic pixel CFA, etc.).

I wasn't aware of the differences you note between the CMOS and CCD sensor technologies, particularly with respect to the photon transfer linearity. This makes perfect sense, and I appreciate you sharing these details.

Thanks!
2007-05-23JackyDong
 Hi, thanks very much for your article.
But I still want to know more about YUV411, such as how JPEG files arrange the YUV components and how to convert them into RGB.
Thanks for your help in advance!
 My understanding is that YUV411 is written in the following sequence:

Byte:       1   2    3    4   5    6
Component:  U   Y00  Y10  V   Y01  Y11

The above sequence represents a single coded unit of 2x2 image elements. You can use the conversion functions I posted in my comment on 2007-04-16.

As an aside, JPEG images generally are coded with YCbCr, not YUV. They are similar in concept, but not the same.
2007-04-17Stacy Waugh
 Hi, thanks very much for your help. I have an RGB value with the following values:
R = 196, G = 173, B = 255

When I use your conversion to YUV, I get

Y=179, Cr = 132, Cb = 161

I have checked this calculation by converting it back to RGB and it is very close, in fact within one of the original values. My problem now is: how is this represented in a DirectDraw memory surface? Each pixel is supposed to be represented by 16 bits; however, above I have to represent this with 24 bits. I have had a look at your array examples and you appear to have pairings
[ Y0(dc), Y0(ac)]
[ Cb(dc), Cb(ac)]
... and so on. I'm not sure what (dc) or (ac) stand for. If you could demonstrate how the YUV values above should be represented in memory, that would be appreciated.
 Hi Stacy --

If DirectDraw requires a 16-bit pixel representation, then it sounds like you will need to write each 32-bit dword macropixel to memory as follows:

Macropixel 1: Y0 U0 Y1 V0
Macropixel 2: Y2 U2 Y3 V2
Macropixel 3: Y4 U4 Y5 V4

So, for each 32-bit word, you are actually encoding 2 luminance (Y) pixels and 1 chrominance (U or V) pixel. When you divide 32 bits by 2 pixels (with 4:2:2 chroma subsampling, 2 pixels per macropixel), you end up with the 16 bits per pixel definition. In each 32-bit macropixel, the left-most byte is the lowest byte.
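(A sketch of that packing in C, with Y0 in the lowest byte as described; the function name is illustrative:)

// Pack two luminance samples and their shared chrominance pair into
// one 32-bit YUY2 macropixel (Y0 in the lowest-address byte).
unsigned long pack_yuy2(unsigned char y0, unsigned char u,
                        unsigned char y1, unsigned char v)
{
    return  (unsigned long)y0
         | ((unsigned long)u  << 8)
         | ((unsigned long)y1 << 16)
         | ((unsigned long)v  << 24);
}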

As for the (dc) and (ac), I was referring to the DCT coefficients for each component that are used to encode the data stream (see JPEG Huffman Coding and the Chroma Subsampling section of JPEG Decoder). If you are not doing the JPEG decoding yourself (with Huffman table lookups), then you can ignore this.

Let me know if this works out.
2007-04-16Stacy Waugh
 Hi, I am fairly new to the different DirectDraw surfaces used in video. One of the surfaces I am currently using is in the YUY2 format.

As far as I know each color is represented in 16 bits. I have an RGB color that I want to color this surface with, but I am not sure how to get the correct value of it. I am not sure if the RGB to YCbCr formulas will help me. Can you please help or point me in the right direction.

Thanks for your help in advance
 From my understanding, the YUY2 format is effectively the same as YCbCr (YUV) with 2x1 (4:2:2) subsampling, but the order of the components is slightly different.

Instead of the usual YCC (JPEG) ordering:
[Y0(dc), Y0(ac)]
[Y1(dc), Y1(ac)],
[Cb(dc), Cb(ac)],
[Cr(dc), Cr(ac)]

You get:
[Y0(dc), Y0(ac)]
[Cb(dc), Cb(ac)],
[Y1(dc), Y1(ac)],
[Cr(dc), Cr(ac)]

While the effective bits per pixel is 16, the chroma subsampling really means that there are 24 bits per pixel (8 bits per channel per pixel) like JPEG. I believe that the same formulae can be used to convert the components as for RGB-YUV after the reordering is accounted for.

NOTE: There is a difference between YUV and YCbCr. The conversions for YUV are based on a range of values that are intended for video display (leaving some headroom): 16-235 for luma and 16-240 for chroma, instead of 0-255. So, the formulae used to do the RGB to YUV (or RGB to YUY2) conversions are slightly different:

From Video Demystified:

RGB to YUV Conversion
Y  =      (0.257 * R) + (0.504 * G) + (0.098 * B) + 16
Cr = V =  (0.439 * R) - (0.368 * G) - (0.071 * B) + 128
Cb = U = -(0.148 * R) - (0.291 * G) + (0.439 * B) + 128
YUV to RGB Conversion
B = 1.164(Y - 16)                  + 2.018(U - 128)
G = 1.164(Y - 16) - 0.813(V - 128) - 0.391(U - 128)
R = 1.164(Y - 16) + 1.596(V - 128)
Hope that helps! Cal.

 

