Abstract

Traditional video coding standards process video frames in the YCbCr color format, with a single luminance channel (Y) and two chroma channels (Cb, Cr). The three channels are strongly correlated, which enables efficient encoding and good compression: decisions such as motion estimation and block partitioning can be made once for the Y plane and then applied to the chroma channels. Subsampling can also be used to reduce the resolution of the chroma planes. However, these standards cannot efficiently encode additional data, such as depth, alpha, or velocity, that immersive applications such as virtual reality (VR), augmented reality (AR), or extended reality (XR) need to provide a high-quality photorealistic experience. This disclosure describes configuring the additional data as a single input that can optionally be combined with the YCbCr input. Regardless of the number of channels, the encoder can be designed such that its complexity is a function of one dominant plane/channel. Current video coding standards can be extended, and future video coding standards can be implemented, to incorporate such input by including a provision to signal the number of input channels and their resolution. The dominant plane can be signaled explicitly, or it can be implied to be the first plane.
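As a rough illustration of the signaling idea, the sketch below models a sequence-level header that lists each input plane with its resolution and optionally names a dominant plane, falling back to the first plane when none is signaled. It also shows how a block partition decided on the dominant plane might be mapped onto the other planes by resolution ratio. The names PlaneInfo, SequenceHeader, dominant_index, and map_block are hypothetical and do not come from the disclosure or from any existing coding standard; the integer-ratio mapping is an assumption made for illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class PlaneInfo:
    """One input plane (hypothetical structure): a label and its resolution."""
    name: str
    width: int
    height: int


@dataclass
class SequenceHeader:
    """Hypothetical header: signals the number of planes and their resolutions."""
    planes: List[PlaneInfo]
    dominant_plane: Optional[int] = None  # None means "implied first plane"

    def dominant_index(self) -> int:
        # Explicit signal wins; otherwise the first plane is dominant.
        return 0 if self.dominant_plane is None else self.dominant_plane


def map_block(header: SequenceHeader, plane_idx: int,
              block: Tuple[int, int, int, int]) -> Tuple[int, int, int, int]:
    """Scale an (x, y, w, h) block from the dominant plane to another plane.

    Assumes plane resolutions are related by simple ratios, as with
    chroma subsampling; this is an illustrative choice, not a normative rule.
    """
    dom = header.planes[header.dominant_index()]
    tgt = header.planes[plane_idx]
    x, y, w, h = block
    return (x * tgt.width // dom.width, y * tgt.height // dom.height,
            w * tgt.width // dom.width, h * tgt.height // dom.height)


# Example: 4:2:0 YCbCr plus full-resolution depth and half-resolution alpha.
header = SequenceHeader(planes=[
    PlaneInfo("Y", 1920, 1080),
    PlaneInfo("Cb", 960, 540),
    PlaneInfo("Cr", 960, 540),
    PlaneInfo("depth", 1920, 1080),
    PlaneInfo("alpha", 960, 540),
])

y_block = (64, 32, 16, 16)               # partition decided once on the Y plane
cb_block = map_block(header, 1, y_block)     # (32, 16, 8, 8)
depth_block = map_block(header, 3, y_block)  # (64, 32, 16, 16)
```

In this sketch the partitioning decision is made once, on the dominant plane, and reused for every other plane, which is one way the encoder's complexity could remain a function of a single plane regardless of how many channels are present.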

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
