The Blind Aide is a computer vision system designed to help people with visual impairments complete tasks that would otherwise be difficult or impossible. The system comprises a microphone, microprocessor, camera, speakers, and the controlling software. These components enable a unique interface that minimizes the need for physical interaction with the device: the user controls the system entirely by voice. System output is entirely audio (voice), which further increases the device's accessibility.
The system is worn on the head or chest with the camera facing outward. Using modern image processing and interpretation techniques, it performs a series of functions that enhance the user's ability to perceive his or her surroundings. The initial (prototype) function interprets text in the real-world images captured by the device. Upon receiving a command such as "Read it to me...", the system captures an image, interprets the text found in the image, converts that text to speech, and plays it through the speakers. Text detection follows a technique proposed by Lukas Neumann and Jiri Matas, improved upon here with a pair of deep neural networks: the first performs character/non-character classification, and the second performs character recognition on the regions that survive the first stage.
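The two-stage pipeline described above can be sketched as follows. This is an illustrative outline only: the function names and the region dictionaries are hypothetical stand-ins for the actual detection output and the two neural networks, not the project's real API.

```python
# Hypothetical sketch of the "Read it to me" pipeline. The two stand-in
# functions below play the roles of the two deep neural networks; the
# region dictionaries stand in for candidate text regions found in the image.

def classify_regions(regions):
    """Stage 1 (character/non-character DNN stand-in):
    keep only regions judged to contain a character."""
    return [r for r in regions if r["is_char"]]

def recognize_characters(regions):
    """Stage 2 (character-recognition DNN stand-in):
    map each surviving region to its character."""
    return "".join(r["label"] for r in regions)

def read_it_to_me(image_regions, speak):
    """Full pipeline: classify -> recognize -> speak the result."""
    char_regions = classify_regions(image_regions)
    text = recognize_characters(char_regions)
    speak(text)  # stand-in for text-to-speech output through the speakers
    return text
```

In the real system, `image_regions` would come from the Neumann–Matas region-extraction step on a captured frame, and `speak` would be a text-to-speech engine rather than a callback.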
The software controlling the device is designed to foster extension. The device operates on a series of 'Ability Modules'. These modules share an interface that specifies their activation command, functional operation, and output format. Adding functionality is as simple as implementing a module and registering it in the list of supported abilities. The Blind Aide has been implemented this way so that the project may be open sourced, improved upon, extended, and distributed freely by a community of developers focused on improving lives and furthering research in computer vision applications for people with disabilities.
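The shared module interface might look something like the sketch below. The class and method names are assumptions for illustration; the source only specifies that each module declares an activation command, its functional operation, and an output format.

```python
from abc import ABC, abstractmethod

class AbilityModule(ABC):
    """Hypothetical shared interface for Ability Modules."""

    # Voice phrase that triggers this module (set by each subclass).
    activation_command: str

    @abstractmethod
    def run(self, image) -> str:
        """Perform the module's function; return text to be spoken."""

class ReadTextModule(AbilityModule):
    """Example module: the prototype text-reading ability."""
    activation_command = "read it to me"

    def run(self, image):
        # Stand-in for the text detection/recognition pipeline.
        return "detected text"

# Adding a new ability means implementing AbilityModule
# and appending an instance to this list.
SUPPORTED_ABILITIES = [ReadTextModule()]

def dispatch(command, image):
    """Route a voice command to the first module whose
    activation command matches, and return its spoken output."""
    for module in SUPPORTED_ABILITIES:
        if command.lower().startswith(module.activation_command):
            return module.run(image)
    return "Sorry, I don't know that command."
```

Dispatching on a simple prefix match keeps the controller ignorant of what any module does internally; the voice front end only needs to pass the recognized command string and the captured image.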