Environment Descriptor for Visually Impaired People Implemented on Raspberry Pi Based on Convolutional and Recurrent Artificial Neural Networks

Authors

  • Rafael Chourio. Maestría en Ingeniería Eléctrica, Facultad de Ingeniería, Universidad de Carabobo. Valencia, Venezuela. https://orcid.org/0000-0002-8160-6439
  • Wilmer Sanz. Laboratorio de Robótica y Visión Industrial, Escuela de Ingeniería Eléctrica, Facultad de Ingeniería, Universidad de Carabobo. Valencia, Venezuela. https://orcid.org/0000-0001-7847-2372

DOI:

https://doi.org/10.54139/revinguc.v28i1.15

Keywords:

Artificial Neural Network, Natural Language Processing, Raspberry Pi, Deep Learning, Feature Extraction

Abstract

According to figures from the World Health Organization, vision impairment and blindness affect 217 million people in low-income countries. The quality of life of at least 75 million of them could be improved by systems that guide them safely through their daily tasks, which motivates the search for technological alternatives and is the origin of this research. This work presents an image description system trained with deep learning algorithms based on convolutional and recurrent neural networks and implemented on a single-board computer. The implementation uses a low-cost camera to capture images of the environment and generates a description of each scene. That description can be converted into an audible voice signal through a hearing aid system, so that visually impaired people can improve their standard of living by obtaining real-time information about their surroundings.
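
The pipeline described in the abstract (image, convolutional encoder, recurrent decoder, sentence) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names describe, toy_encode and toy_step are hypothetical, and the two toy functions are hand-written placeholders standing in for a trained convolutional network and a trained recurrent network. Only the control flow is meant to be representative: the image is encoded once, then words are emitted one at a time until an end marker appears. The speech output stage is left out.

```python
from typing import Callable, List, Sequence, Tuple

START, END = "<start>", "<end>"  # assumed sentence boundary tokens

Image = Sequence[Sequence[float]]
State = List[float]


def describe(
    image: Image,
    encode: Callable[[Image], State],
    step: Callable[[str, State], Tuple[str, State]],
    max_len: int = 20,
) -> str:
    """Greedy decoding: encode the image once, then emit one word per step."""
    state = encode(image)  # feature vector from the (stand-in) encoder
    word = START
    caption: List[str] = []
    for _ in range(max_len):
        word, state = step(word, state)  # next word and updated decoder state
        if word == END:
            break
        caption.append(word)
    return " ".join(caption)


def toy_encode(image: Image) -> State:
    # Placeholder for a CNN: reduces the image to its mean brightness.
    pixels = [p for row in image for p in row]
    return [sum(pixels) / len(pixels)]


def toy_step(word: str, state: State) -> Tuple[str, State]:
    # Placeholder for a trained RNN: a fixed word-to-word transition table,
    # selected by the single encoded feature.
    bright = {START: "a", "a": "bright", "bright": "room", "room": END}
    dark = {START: "a", "a": "dark", "dark": "room", "room": END}
    table = bright if state[0] >= 128 else dark
    return table[word], state


if __name__ == "__main__":
    print(describe([[200, 220], [240, 250]], toy_encode, toy_step))
```

In a real system the returned string would be handed to a text-to-speech component. Keeping the encoder and decoder as plain callables means the same loop could wrap actual trained models on the single-board computer.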

Published

2021-05-03

How to Cite

Chourio, R., & Sanz, W. (2021). Environment Descriptor for Visually Impaired People Implemented on Raspberry Pi Based on Convolutional and Recurrent Artificial Neural Networks. Revista Ingeniería UC, 28(1), 152–164. https://doi.org/10.54139/revinguc.v28i1.15

Issue

Section

Research Event. School of Electrical Engineering. "Prof. César R. Ruíz"