A Region Covariances-based Visual Attention Model for RGB-D Images

Existing computational models of visual attention generally employ simple image features such as color, intensity, or orientation to generate a saliency map that highlights the image regions likely to attract human attention. Interestingly, most of these models do not process any depth information and operate only on standard two-dimensional RGB images. Depth perception through stereo vision, however, is a key characteristic of the human visual system. In line with this observation, in this study we propose to extend two state-of-the-art static saliency models based on region covariances so that they also process the depth information available in RGB-D images. We evaluate the proposed models on the NUS-3D benchmark dataset under several different evaluation metrics. Our results reveal that using the additional depth information improves saliency prediction in a statistically significant manner, yielding more accurate saliency maps.
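The core ingredients of the region covariance approach can be illustrated with a minimal sketch: each image region is summarized by the covariance matrix of its per-pixel feature maps (with depth simply stacked as an extra channel in the RGB-D setting), and regions are compared with the covariance metric of Förstner and Moonen (1999). The feature choice and region size below are illustrative assumptions, not the exact configuration of the proposed models.

```python
import numpy as np
from scipy.linalg import eigh


def region_covariance(features):
    """d x d covariance descriptor of a region given an (H, W, d) stack
    of per-pixel feature maps (e.g. color, orientation, and depth)."""
    d = features.shape[-1]
    return np.cov(features.reshape(-1, d), rowvar=False)


def covariance_distance(C1, C2):
    """Förstner-Moonen metric between two covariance matrices:
    sqrt(sum_i ln^2 lambda_i), where lambda_i are the generalized
    eigenvalues of the pair (C1, C2)."""
    lam = eigh(C1, C2, eigvals_only=True)
    return np.sqrt(np.sum(np.log(lam) ** 2))


# Toy example: two 16x16 regions with 4 feature channels each
# (here random stand-ins for intensity, two color channels, and depth).
rng = np.random.default_rng(0)
C_a = region_covariance(rng.random((16, 16, 4)))
C_b = region_covariance(rng.random((16, 16, 4)))

print(C_a.shape)                       # (4, 4)
print(covariance_distance(C_a, C_a))   # ~0: a region is not dissimilar to itself
print(covariance_distance(C_a, C_b))   # > 0: dissimilarity between the regions
```

In a saliency model of this kind, a region whose descriptor is far (in this metric) from those of its surrounding regions is assigned a high saliency value; adding depth as a feature channel lets depth discontinuities contribute to that dissimilarity.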

___

  • Y. Benjamini, and Y. Hochberg (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society. Series B (Methodological), pages 289-300.
  • A. Borji, and L. Itti (2013). State-of-the-art in Visual Attention Modeling. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 35(1), pages 185-207.
  • N. D. Bruce, and J. K. Tsotsos (2005). An attentional framework for stereo vision. In Proc. IEEE Canadian Conference on Computer and Robot Vision, pages 88-95.
  • N. Bruce, and J. Tsotsos (2006). Saliency based on information maximization. In Proc. Advances in Neural Information Processing Systems (NIPS), pages 155-162.
  • N. Bruce, and J. Tsotsos (2009). Saliency, attention, and visual search: An information theoretic approach. Journal of Vision, Vol. 9(3):5, pages 1-24.
  • Z. Bylinskii, T. Judd, A. Borji, L. Itti, F. Durand, A. Oliva, and A. Torralba (accessed 2016). MIT Saliency Benchmark, http://saliency.mit.edu.
  • Z. Bylinskii, T. Judd, A. Oliva, A. Torralba, and F. Durand (2016). What do different evaluation metrics tell us about saliency models? arXiv preprint arXiv:1604.03605.
  • E. Erdem, and A. Erdem (2013). Visual saliency estimation by nonlinearly integrating features using region covariances. Journal of Vision, Vol. 13(4):1, pages 1-20.
  • W. Förstner, and B. Moonen (1999). A metric for covariance matrices (Tech. Rep.). Department of Geodesy and Geoinformatics, Stuttgart University, Germany.
  • D. Gao, and N. Vasconcelos (2007). Bottom-up saliency is a discriminant process. In Proc. IEEE International Conference on Computer Vision (ICCV), pages 1-6.
  • S. Goferman, L. Zelnik-Manor, and A. Tal (2010). Context-aware saliency detection. In Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 2376-2383.
  • J. Harel, C. Koch, and P. Perona (2007). Graph-based visual saliency. In Proc. Advances in Neural Information Processing Systems (NIPS), pages 545-552.
  • X. Hong, H. Chang, S. Shan, X. Chen, and W. Gao (2009). Sigma Set: A small second order statistical region descriptor. In Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 1802-1809.
  • X. Hou, and L. Zhang (2007). Saliency detection: A spectral residual approach. In Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 1-8.
  • B. Hu, R. Kane-Jackson, and E. Niebur (2016). A proto-object based saliency model in three-dimensional space. Vision Research, Vol. 119, pages 42-49.
  • H. Hügli, T. Jost, and N. Ouerhani (2005). Model performance for visual attention in real 3D color scenes. In Proc. Artificial Intelligence and Knowledge Engineering Applications: A Bioinspired Approach, pages 469-478.
  • I. Iatsun, M.-C. Larabi, and C. Fernandez-Maloigne (2015). A visual attention model for stereoscopic 3D images using monocular cues. Signal Processing: Image Communication, Vol. 38, pages 70-83.
  • L. Itti, C. Koch, and E. Niebur (1998). A model of saliency-based visual attention for rapid scene analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 20(11), pages 1254-1259.
  • T. Jost, N. Ouerhani, R. von Wartburg, R. Müri, and H. Hügli (2004). Contribution of depth to visual attention: Comparison of a computer model and human. In Proc. Early Cognitive Vision Workshop, pages 1-4.
  • T. Judd, K. Ehinger, F. Durand, and A. Torralba (2009). Learning to predict where humans look. In Proc. IEEE International Conference on Computer Vision (ICCV), pages 2106-2113.
  • S. S. S. Kruthiventi, V. Gudisa, J. H. Dholakiya, and R. V. Babu (2016). Saliency Unified: A Deep Architecture for Simultaneous Eye Fixation Prediction and Salient Object Segmentation. In Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 5781-5790.
  • C. Lang, T. V. Nguyen, H. Katti, K. Yadati, M. Kankanhalli, and S. Yan (2012). Depth matters: Influence of depth cues on visual saliency. In Proc. European Conference on Computer Vision (ECCV), pages 101-115.
  • C.-Y. Ma, and H.-M. Hang (2015). Learning-based saliency model with depth information. Journal of Vision, Vol. 15(6):19, pages 1-22.
  • R. Margolin, A. Tal, and L. Zelnik-Manor (2013). What makes a patch distinct? In Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 1139-1146.
  • N. Ouerhani, and H. Hügli (2000). Computing visual attention from scene depth. In Proc. International Conference on Pattern Recognition, pages 375-378.
  • J. Pan, E. Sayrol, X. Giro-i Nieto, K. McGuinness, and N. O’Connor (2016). Shallow and deep convolutional networks for saliency prediction. In Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 598-606.
  • S. Ramenahalli, and E. Niebur (2013). Computing 3D saliency from a 2D image. In Proc. Annual conference on information sciences and systems (CISS), pages 1-5.
  • A. F. Russell, S. Mihalas, R. von der Heydt, E. Niebur, and R. Etienne-Cummings (2014). A model of proto-object based saliency. Vision Research, Vol. 94, pages 1-15.
  • H. J. Seo, and P. Milanfar (2009). Static and space-time visual saliency detection by self-resemblance. Journal of Vision, Vol. 9(12):15, pages 1-27.
  • O. Tuzel, F. Porikli, and P. Meer (2006). Region covariance: A fast descriptor for detection and classification. In Proc. European Conference on Computer Vision (ECCV), pages 589-600.
  • J. Wang, M. P. DaSilva, P. LeCallet, and V. Ricordel (2013). Computational model of stereoscopic 3D visual saliency. IEEE Transactions on Image Processing, Vol. 22(6), pages 2151-2165.
  • L. Zhang, M. H. Tong, T. K. Marks, H. Shan, and G. W. Cottrell (2008). SUN: A Bayesian framework for saliency using natural statistics. Journal of Vision, Vol. 8(7):32, pages 1-20.
  • Y. Zhang, G. Jiang, M. Yu, and K. Chen (2010). Stereoscopic visual attention model for 3D video. In Proc. Advances in Multimedia Modeling, pages 314-324.