Visual masking is the perceptible effect of a signal when another signal exists in the airspace, time domain, or near the spectrum. Generally, the spatial-temporal sensitivity function is used to measure the human visual sensitivity.
Visual Attention is the most specific cognitive process of HVS. Its research mainly involves two aspects: top-down (also called Concept-driven) attention clues and bottom-up (also called stimulus-driven) Attention clues.
The visual visibility effect is measured by just perceiving distortion (just noticeable distortion, JND) as the visibility of the video image for HVS, it can be divided into pixel base JND and Subband base JND (including JND based on discrete cosine transform and JND based on wavelet transform ).
Depth exactly perceptible distortion model (just noticeable distortion in depth, jndd)
Binary exactly perceptible distortion model (binocular just noticeable distortion difference, bjnd)
Jndd indicates that distortion of a deep video smaller than the threshold does not affect the video quality of the virtual viewpoint used for viewing. Therefore, the jndd model can be used to perform lossy compression on Deep videos while ensuring the quality of virtual viewpoint sensing, saving the bit rate required for deep video encoding.
Bjnd reflects the difference between the left and right viewpoints recognized by humans. In a stereo image center, if the distortion of one of the images is less than bjnd, the human eye cannot perceive the distortion of the stereo image pair.
Visual psychology research shows that there is a masking effect in the human eye stereoscopic vision, that is, the quality of viewpoint images with good quality contributes a lot to the overall quality of the stereoscopic image in the two viewpoints that constitute the stereoscopic image.
The closer the video imaging is to the human eye, the larger the depth value. The farther the video imaging is to the human eye, the smaller the depth value.