A Survey on Image Super-Resolution with Generative Adversarial Networks

Super-resolution is the process of enlarging an image by a given upscaling factor while trying to preserve the details that match its original high-resolution form. Super-resolution can be performed with many techniques, but the most effective ones rely on neural network designs, and some designs suit particular tasks better than others. This study focuses on super-resolution work that uses Generative Adversarial Networks. This type of neural network has been applied successfully to a variety of problems, such as artificial data generation and making data more meaningful. Its key feature is that it consists of two sub-networks that try to defeat each other, which pushes the results to be more realistic. The main topics of this study are the performance metrics that measure the quality of a generated image, the loss functions used in the networks, and research papers on super-resolution with Generative Adversarial Networks.
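To make the notions of an upscaling factor and a quality metric concrete, the following is a minimal sketch rather than a method from the surveyed papers. It assumes grayscale images stored as nested Python lists, uses plain nearest-neighbour enlargement as a stand-in for a learned generator, and scores the result with peak signal-to-noise ratio (PSNR), one widely used metric of this kind. The function names `upscale_nearest` and `psnr`, and the toy pixel values, are illustrative assumptions.

```python
import math


def upscale_nearest(image, factor):
    """Enlarge a 2-D grayscale image (list of rows) by an integer factor.

    Nearest-neighbour replication is only a baseline stand-in here;
    a GAN generator would replace this step in practice.
    """
    out = []
    for row in image:
        # Repeat every pixel `factor` times horizontally.
        wide = [px for px in row for _ in range(factor)]
        # Repeat the widened row `factor` times vertically.
        out.extend([list(wide) for _ in range(factor)])
    return out


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(max_value^2 / MSE)."""
    total = 0.0
    count = 0
    for ref_row, est_row in zip(reference, estimate):
        for r, e in zip(ref_row, est_row):
            total += (r - e) ** 2
            count += 1
    mse = total / count
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy data (assumed values): a 2x2 low-resolution image and the
# 4x4 high-resolution image it is supposed to approximate.
low_res = [[10, 200],
           [60, 120]]
high_res = [[10, 12, 198, 200],
            [11, 13, 199, 201],
            [60, 62, 118, 120],
            [61, 63, 119, 121]]

upscaled = upscale_nearest(low_res, 2)
score = psnr(high_res, upscaled)
print(f"PSNR of the nearest-neighbour baseline: {score:.2f} dB")
```

A higher PSNR means a smaller pixel-wise error against the reference image. A pixel-wise score of this kind does not necessarily reflect how realistic the result looks to a viewer.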
