Abstract
Vision plays a crucial role in comprehending the world around us, as more than 85% of external information is obtained through the visual system. It largely influences our mobility, cognition, access to information, and interaction with the environment and with other people. Blindness prevents a person from perceiving the surrounding environment, making unassisted navigation, object recognition, obstacle avoidance, and reading major challenges. Many existing assistive systems are limited by cost and complexity. To help the visually challenged overcome these everyday difficulties, we propose VisBuddy, a smart assistant for day-to-day activities. VisBuddy is a voice-based assistant: the user issues voice commands to perform specific tasks. It uses image captioning to describe the user's surroundings, optical character recognition (OCR) to read text in the user's view, object detection to search for and locate objects in a room, and web scraping to deliver the latest news. VisBuddy combines concepts from Deep Learning and the Internet of Things, and thus serves as a cost-efficient, powerful, all-in-one assistant that helps the visually challenged with their day-to-day activities.
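The abstract describes a voice-driven assistant that routes each spoken command to one of four tasks. A minimal sketch of such a command dispatcher is shown below; this is illustrative only, not the authors' implementation, and the four handlers are stubs standing in for the paper's image-captioning, OCR, object-detection, and web-scraping components.

```python
# Hypothetical sketch: route a transcribed voice command to one of the
# four task types VisBuddy's abstract names. All handlers are stubs; a
# real system would invoke a captioning model, an OCR engine, an object
# detector, and a news scraper respectively.

def describe_scene() -> str:
    return "scene description (image captioning stub)"

def read_text() -> str:
    return "recognized text (OCR stub)"

def find_object(name: str) -> str:
    return f"searching for {name} (object detection stub)"

def latest_news() -> str:
    return "today's headlines (web scraping stub)"

def dispatch(command: str) -> str:
    """Match keywords in a transcribed command to the corresponding task."""
    command = command.lower()
    if "describe" in command:
        return describe_scene()
    if "read" in command:
        return read_text()
    if "find" in command:
        # Naive parse: treat the last word as the object to search for.
        return find_object(command.split()[-1])
    if "news" in command:
        return latest_news()
    return "command not recognized"
```

Keyword matching is the simplest possible intent router; a deployed system would more plausibly use an intent classifier over the speech-to-text output.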
URL
https://arxiv.org/abs/2108.07761