Why a single 2k sensor? at DVinfo.net

DV Info Net > High Definition Video Acquisition > HD and UHD ( 2K+ ) Digital Cinema > Silicon Imaging SI-2K

Silicon Imaging SI-2K
2/3" 1080p IT-integrated 10-bit digital cinema w/direct-to-disk recording.


Old December 7th, 2006, 11:19 PM   #1
Major Player
 
Join Date: Jul 2003
Location: Warren, NJ
Posts: 398
Why a single 2k sensor?

That requires interpolation, reducing the actual resolution. Why not use either a 3-sensor system, or a sensor large enough that you don't have to use a demosaic algorithm?

Thanks,

David
David Ziegelheim is offline   Reply With Quote
Old December 8th, 2006, 03:56 AM   #2
Trustee
 
Join Date: Nov 2005
Location: Sydney Australia
Posts: 1,570
I'm no expert, but I can think of a few reasons:

1) 3-chip designs introduce problems with optics. These are possibly solvable, but that would mean custom-made optics, and that would be mighty expensive.

2) If you made the chips bigger, then you'd need matching optics, and that costs way more.

3) In any case, unless you want 4:4:4 sampling, there's not much point. Most 3-CCD designs in fact do the same thing; to the best of my knowledge, nothing has, say, 1920x1080x3 pixels to produce a 1920x1080 image. Certainly a 1920x1080 sensor cannot resolve that resolution in, say, the red channel, but nor can our eyes.
Bob Grant is offline   Reply With Quote
Old December 8th, 2006, 07:09 AM   #3
Trustee
 
Join Date: Mar 2003
Location: Virginia Beach, VA
Posts: 1,095
Basically, for a 3-chip design, multiply everything by three, plus the requirement for custom optics . . . so you now have three chips, three sets of support boards (one per chip), etc., etc.

Single-sensor designs are much simpler, plus you can use film-style glass. There's also no limit on the T-stop from a prism design (2/3" 3-chip systems are limited to T1.6). Fringing is not an issue in the same way it is with 3-chip systems, and you really aren't losing that much information with a good debayer algorithm and CineForm RAW's full-raster codec, especially when you consider that HDCAM is only 3:1:1 (1440x1080 in the luma and 480x1080 in the chroma), and DVCProHD, while 4:2:2, is only 1280x1080 in the luma and 640x1080 in the chroma.

So after compression and demosaicing, our single-sensor 2K image comes out quite a bit ahead of those other HD formats in resolution, and its compression is light and efficient (5:1, visually lossless wavelet).
Jason Rodriguez is offline   Reply With Quote
Old December 8th, 2006, 10:40 AM   #4
Major Player
 
Join Date: Jul 2003
Location: Warren, NJ
Posts: 398
Bob, the Canon HD cameras have 3 1440x1080 sensors.

Jason, using a 1920x1080 single sensor with Bayer filter, or a 2k for 2k, basically yields 1/2 the pixels for the final image. While this may not be that different from some other compressed formats, and the interpolation algorithms are sophisticated, there must be some loss of resolution. The use of Cineform, while giving an advantage over other compression algorithms, probably makes the loss of detail more important.

Could a larger sensor have been used to require less interpolation? Could a Foveon sensor have been used? Do you have actual tests comparing a system configured not to need interpolation (presumably an image resolution 1/2 the sensor resolution) against a demosaiced version, to determine the actual losses?

Thanks,

David
David Ziegelheim is offline   Reply With Quote
Old December 8th, 2006, 12:55 PM   #5
Trustee
 
Join Date: Mar 2003
Location: Virginia Beach, VA
Posts: 1,095
Actually, yes, we've had monochrome-only versions of the chip in-house for testing, and on real-world scenes and resolution charts it's approximately a 15% loss in resolution between the monochrome version (no filter mask) and the Bayer color version, depending on the subject. NOT a 50% resolution loss as you mention.

Now if you shot something that was 100% red, and that "real-world" object somehow had a wavelength that didn't register any blue or green pixel values (so those pixels came out pure black), then yes, that would be a larger resolution loss. But with real-world images, along with the required optical low-pass filter, that never happens: there's always information recorded in the other two channels, and as a result we're able to pick up information from the surrounding pixels and get a really good interpolation result without aliasing.
Jason Rodriguez is offline   Reply With Quote
Old December 8th, 2006, 01:10 PM   #6
Major Player
 
Join Date: Jul 2003
Location: Warren, NJ
Posts: 398
That was a 50% loss in pixels, assuming 50% green, 25% red, 25% blue. That works out to a 29% loss along one axis for a 'green' image, and a 50% loss for red and blue. The 15% you are reporting is close.
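As a sanity check on that arithmetic, here is a small pure-Python sketch (the 29% figure falls out of per-axis resolution scaling with the square root of pixel count; the 1920x1080 sensor size is just an example):

```python
import math

total = 1920 * 1080          # total photosites on the sensor
green = total // 2           # Bayer: half the sites are green
red = blue = total // 4      # a quarter each for red and blue

# Per-axis resolution scales roughly with the square root of pixel count,
# so a channel with half the pixels loses 1 - 1/sqrt(2) ~ 29% per axis.
green_axis_loss = 1 - math.sqrt(green / total)
red_axis_loss = 1 - math.sqrt(red / total)

print(f"green per-axis loss: {green_axis_loss:.1%}")   # ~29.3%
print(f"red/blue per-axis loss: {red_axis_loss:.1%}")  # 50.0%
```

This simple counting argument ignores what a demosaic algorithm recovers from neighbouring channels, which is why measured losses come out lower.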

Those measurements were after Cineform compression?

Thanks,

David
David Ziegelheim is offline   Reply With Quote
Old December 8th, 2006, 05:27 PM   #7
Trustee
 
Join Date: Mar 2003
Location: Virginia Beach, VA
Posts: 1,095
Cineform RAW compression is a full-raster codec, meaning it encodes the whole image pixel-for-pixel, and since it's visually lossless, it doesn't "lose" resolution. The only time that resolution is lost is if you are encoding a noisy image.

BTW, if you're worried about resolution loss with CineForm RAW, there's always uncompressed 12-bit RAW. But again, in our testing there's no discernible resolution loss with CineForm RAW, and it maintains high-frequency detail very nicely. The only areas where resolution may be lost versus uncompressed are really noisy areas, where the wavelet compression might smooth over edges that aren't well defined. But the PSNR of CineForm RAW vs. uncompressed is above the visible threshold, meaning that on a clean, well-exposed image (like we would shoot a resolution chart at), there would be no visibly discernible difference between a compressed and uncompressed frame (at our 5:1 compression setting).
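For context, PSNR is just a log-scaled mean-squared error between two frames; a toy sketch with hand-made 8-bit pixel lists (not real CineForm data) shows how it's computed:

```python
import math

def psnr(original, compressed, max_val=255):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    mse = sum((o - c) ** 2 for o, c in zip(original, compressed)) / len(original)
    if mse == 0:
        return float("inf")  # identical images
    return 10 * math.log10(max_val ** 2 / mse)

# Toy 8-bit "frames": compression nudged a few pixels by 1 code value.
orig = [120, 121, 119, 200, 201, 50, 52, 48]
comp = [120, 122, 119, 199, 201, 50, 53, 48]
print(f"{psnr(orig, comp):.1f} dB")  # ~52.4 dB, well into visually-lossless territory
```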

BTW, you need to think about demosaicing in another color space besides RGB . . . that's too simple a model. If you start thinking about alternate color spaces (such as YUV), you'll see that you can extract luminance information from every pixel and interpolate using intelligent "guesstimates" to create a very accurate luma image. I can't tell you exactly how we debayer, but it's much better than the simple RGB rules you are describing, which would be true for something really simple in the RGB domain like a bilinear or nearest-neighbor algorithm.
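For illustration only, the "really simple" bilinear case mentioned above can be sketched like this (pure Python, RGGB layout assumed; this is the naive baseline, not CineForm's actual debayer):

```python
def bilinear_green(raw, x, y, w, h):
    """Estimate green at a red/blue site by averaging the in-bounds green neighbours."""
    neighbours = [(x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)]
    vals = [raw[ny][nx] for nx, ny in neighbours if 0 <= nx < w and 0 <= ny < h]
    return sum(vals) / len(vals)

def is_green(x, y):
    # RGGB tile: green sites are where x + y is odd
    return (x + y) % 2 == 1

# 4x4 mosaic of a flat mid-grey scene: every photosite reads 100,
# so interpolation should reconstruct green = 100 everywhere.
raw = [[100] * 4 for _ in range(4)]
green = [[raw[y][x] if is_green(x, y) else bilinear_green(raw, x, y, 4, 4)
          for x in range(4)] for y in range(4)]
print(green[0][0])  # 100.0
```

On a flat field even this naive method is exact; the differences between algorithms only show up around edges and fine detail.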
Jason Rodriguez is offline   Reply With Quote
Old December 8th, 2006, 07:14 PM   #8
Inner Circle
 
Join Date: Jun 2003
Location: Toronto, Canada
Posts: 4,750
Does CineForm RAW decode to R'G'B' or Y'CbCr? Looking at stills from Red (presumably they are doing something similar and the results are analogous?), taking their compressed RAW footage and converting it to (4:2:2) Y'CbCr degrades quality more than wavelet-compressing the RAW data in the first place.

2- I believe Bayer resolution is much higher than 50% since the better de-mosaic algorithms try to intelligently guess the resolution. You can make guesses based on assumptions that:
- there isn't much color detail / the color detail is not high frequency. i.e. you don't have alternating 1-pixel bands of red and blue.
- on edges, the change in color is located at the same place where there's a change in luminance. When you are shooting objects, this is likely to be true.

This is my understanding of adaptive de-mosaic algorithms, anyway. In real-world situations, the above conditions hold true more often than not. This allows real-world resolution to be significantly higher than 50%. Graeme Nattress stated somewhere that it's about 70%.
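A minimal sketch of that edge-adaptive idea (interpolate along the direction with the smaller gradient so edges aren't smeared across them; hypothetical code, not any vendor's actual algorithm):

```python
def edge_directed_green(raw, x, y):
    """At a non-green site, interpolate green along the lower-gradient axis."""
    left, right = raw[y][x - 1], raw[y][x + 1]
    up, down = raw[y - 1][x], raw[y + 1][x]
    h_grad = abs(left - right)   # change along the row
    v_grad = abs(up - down)      # change along the column
    if h_grad < v_grad:
        return (left + right) / 2        # smoother horizontally: use the row
    if v_grad < h_grad:
        return (up + down) / 2           # smoother vertically: use the column
    return (left + right + up + down) / 4  # no preference: plain bilinear

# A vertical edge: dark columns on the left, bright on the right. The vertical
# gradient is 0, so interpolation stays within the column and the edge survives.
raw = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
]
print(edge_directed_green(raw, 1, 1))  # 10.0 (not smeared toward 200)
```

A naive bilinear average at the same site would mix in the bright pixel and blur the edge, which is exactly the failure mode the adaptive assumptions avoid.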

The other pieces of the puzzle are how much aliasing is acceptable, and what minimum amplitude / MTF (modulation transfer function) is acceptable.

Removing the optical low-pass filter allows for greater resolution, at the expense of aliasing in the form of stair-stepping on diagonals (especially when panning).

On poor lenses, high frequencies / fine detail are there but very low in contrast (i.e. a blurry mess). So if you don't want that, the optics have to be good.

I believe that the aliasing of a Bayer-design sensor may be dependent on picture content and demosaic algorithm (plus sensor design).

3- Not all 3CCD designs are created equal. A logical way of aligning the CCDs is to make them all line up. In pixel-shift designs, the CCDs are slightly moved with respect to each other (by half a pixel or something like that). This allows greater resolution and more aliasing (bad). The DVX100 does this, even though each CCD has slightly more than 720x480 pixels. I'm not sure exactly why the DVX100 does that- it may be to make the auto focus work better (?).

4- To add on: The debayer algorithms will do more guessing + information extraction to obtain better results. If SI offers a real-time codec, it may not give quite as good quality/resolution compared to slower non-real-time algorithms. That being said, this sort of gives you more flexibility.

Last edited by Glenn Chan; December 8th, 2006 at 07:58 PM.
Glenn Chan is offline   Reply With Quote
Old December 8th, 2006, 08:36 PM   #9
Obstreperous Rex
 
Join Date: Jan 2001
Location: San Marcos, TX
Posts: 26,900
Images: 513
Quote:
Originally Posted by Glenn Chan
In pixel-shift designs... this allows... more aliasing (bad).
Incorrect. Pixel Shift does not cause more aliasing. Remember that the CCD output is analog. You should look at Pixel Shift as a way to obtain more sampling points per pixel. If you had to draw a curve on a piece of graph paper based only on a series of points that show where the curve is, you know that the more sampling points you have to go on, the smoother and more accurate the rendered curve will be. Pixel Shift is a useful way to provide the Analog to Digital converter with more information, that's all. And that does not cause more aliasing.
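A toy 1-D model of that sampling argument (a second set of samples offset by half the pixel pitch, interleaved with the first, doubles the sample density; this is a sketch, not a literal camera pipeline):

```python
# Two "sensors" sample the same analog signal along a line; the second is
# offset by half the pixel pitch. Interleaving their outputs yields twice
# as many sampling points to feed the A/D conversion and reconstruction.
def sample(signal, pitch, offset, n):
    return [signal(i * pitch + offset) for i in range(n)]

signal = lambda x: x * x  # stand-in for the analog image along one scanline
pitch = 1.0

a = sample(signal, pitch, 0.0, 4)        # sensor 1 samples at x = 0, 1, 2, 3
b = sample(signal, pitch, pitch / 2, 4)  # sensor 2 samples at x = 0.5, 1.5, ...

interleaved = [v for pair in zip(a, b) for v in pair]
print(len(interleaved), interleaved[:4])  # 8 samples at half-pitch spacing
```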

Quote:
The DVX100... each CCD has slightly more than 720X480 pixels. I'm not sure exactly why the DVX100 does that- it may be to make the auto focus work better (?).
Autofocus measures contrast and an increased number of pixels makes its job harder, not easier. Sometimes people forget that CCD output is analog in nature (and monochrome, to boot) and they look at the number of pixels on the sensor vs. the number of pixels in the format and wonder why they're not equal. They don't have to be. Almost every 3-chip camcorder on the market today uses one or another variant of Pixel Shift... those that do not have it are not as common as those that do.

But the SI camera is a single-chip design, which is where the entire industry is moving (just not quickly enough, in my opinion). The still photography industry got there quite a while ago.
__________________
CH

Search DV Info Net | DV Info Net Sponsors | A Decade (+5) of DVi | ...Tuesday is Soylent Green Day!
Chris Hurd is offline   Reply With Quote
Old December 8th, 2006, 09:09 PM   #10
Trustee
 
Join Date: Mar 2003
Location: Virginia Beach, VA
Posts: 1,095
Hi Glenn,

CineForm RAW is really flexible, in that it allows for a user to playback the AVI with a very fast real-time quadlet playback algorithm (or you can chose bilinear), and then with the flip of a software switch go to a very high-end (but slow) adaptive algorithm that we've created in-house.

So you get the choice of super-high quality or fast real-time playback (even on slower computers) all from the same file, and without having to go through a conversion program.

CineForm RAW is really powerful where it offers you in your editing application, compositing program, etc., all the tricks of wavelet transforms to either get you real-time multistream playback for all your creative decisions, or full-quality high-resolution adaptive demosaicing for final output.
Jason Rodriguez is offline   Reply With Quote
Old December 9th, 2006, 01:02 AM   #11
Inner Circle
 
Join Date: Jun 2003
Location: Toronto, Canada
Posts: 4,750
Quote:
Pixel Shift does not cause more aliasing.
Well, from a practical standpoint, I don't think that form of aliasing would really matter that much. In an extremely bizarre test-pattern situation, you could have bands of different saturated colors with the same luminance values (the same to the camera, that is, if it were looking at big areas of those colors alone). If those two colors were to alternate at a high frequency, they would alias into luminance information.

They don't even have to be the same luminance, although in that case the aliasing/artifacts wouldn't be very objectionable. So for me to say that that form of aliasing is bad may have been sloppy of me and an exaggeration.

Quote:
Remember that the CCD output is analog.
This is a red herring? How the CCDs are offset/positioned doesn't change the nature of the CCD's output. The same concepts would apply if the sensor elements had some sort of theoretical digital output.
Glenn Chan is offline   Reply With Quote
Old December 9th, 2006, 02:09 AM   #12
Obstreperous Rex
 
Join Date: Jan 2001
Location: San Marcos, TX
Posts: 26,900
Images: 513
Hold on, I think there's some confusion here.

You said: "This allows greater resolution and more aliasing (bad)."

I said: "Incorrect. Pixel Shift does not cause more aliasing."

Then you said: "I don't think that form of aliasing would really matter that much."

I have taken the word m-o-r-e "more," that we've both used, to mean "additional." More = additional. Based on your reply about "that form of aliasing," I think what you meant to say instead was Moiré, as in Moiré pattern aliasing.

Quote:
This is a red herring? How the CCDs are offset/positioned doesn't change the nature of the CCD's output.
My statement was entirely relevant. The whole point is that since a CCD is an analog device, what the employment of Pixel Shift does is to increase the amount of information going to the A/D converter. Some people tend to overlook the fact that the image sensors are outputting an analog signal. They tend to get hung up on the number of pixels on the chip, and they can't understand why it's not a 1 for 1 ratio with the resolution of the format. I'm simply pointing out that those numbers don't have to match because there's an entire A/D conversion process that takes place between the image sensors and the recording format. In fact, it's rare that those numbers (CCD pixels vs. recorded resolution) do match, just as it's rare for a 3-chip camera not to use Pixel Shift. You said you were confused as to why the DVX100 has Pixel Shift and I'm trying to answer that for you. The answer is that Pixel Shift provides more information to the A/D converter so that it can do its job with greater accuracy.

But now we're hopelessly off topic again since, as I've pointed out before, the SI-2K is a single-chip camera.
__________________
CH

Search DV Info Net | DV Info Net Sponsors | A Decade (+5) of DVi | ...Tuesday is Soylent Green Day!
Chris Hurd is offline   Reply With Quote
Old December 9th, 2006, 10:04 PM   #13
New Boot
 
Join Date: Nov 2006
Location: Fife Wa.
Posts: 9
Foveon Question

Gentlemen,
Can you tell me if there is some fundamental problem or issue with the Foveon approach to imaging? Why has this imager not been widely adopted?
Thx
Van Cleave is offline   Reply With Quote
Old December 9th, 2006, 10:27 PM   #14
Obstreperous Rex
 
Join Date: Jan 2001
Location: San Marcos, TX
Posts: 26,900
Images: 513
Why has the Foveon imager not been widely adopted for digital video? Probably because its maximum sustainable frame rate is currently something like what... eight or ten frames per second?
__________________
CH

Search DV Info Net | DV Info Net Sponsors | A Decade (+5) of DVi | ...Tuesday is Soylent Green Day!
Chris Hurd is offline   Reply With Quote
Old December 9th, 2006, 10:53 PM   #15
RED Problem Solver
 
Join Date: Sep 2003
Location: Ottawa, Canada
Posts: 1,365
Due to the way that the silicon in Foveon filters the incoming photons, some strong interpolations are needed to convert from the native space of the sensor to RGB, and these large factors lead to noise. And yes, the frame rate at a decent resolution is poor, but that's improving, I think.

Glenn, I think I mentioned > 70%, and I think you'd probably have to factor the AA filter in there too.

Aliasing is related to sensor pixel pitch compared to the frequency of incoming detail. If pixel shift means you can use larger pixels for a given pixel resolution, perhaps it could lead to more aliasing, as that occurs in the system as the light hits the sensor, not after.
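The classic Nyquist demonstration of that point: detail above half the sampling rate produces exactly the same samples as a lower frequency, so the sensor records the false one (pure Python sketch):

```python
import math

fs = 8.0                 # samples per unit distance (think: pixels per mm)
f_detail = 7.0           # detail frequency above Nyquist (fs / 2 = 4)
f_alias = fs - f_detail  # 1.0: the false low frequency it folds down to

for i in range(8):
    t = i / fs
    hi = math.cos(2 * math.pi * f_detail * t)
    lo = math.cos(2 * math.pi * f_alias * t)
    assert abs(hi - lo) < 1e-9  # identical samples: the detail has aliased

print("7 cycles sampled at 8 points is indistinguishable from 1 cycle")
```

Once the samples are identical, no downstream processing can tell the two frequencies apart, which is why the optical low-pass filter has to act before the sensor.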

A Bayer-pattern CFA on a single sensor is much more clever than most people realize, and good interpolation can produce fantastic images. We've had single-chip consumer cameras for ages, but they never employed very good demosaic algorithms or image processing, whereas today we do things differently, and having higher resolutions helps too.

Graeme
Graeme Nattress is offline   Reply



DV Info Net -- Real Names, Real People, Real Info!
1998-2017 The Digital Video Information Network