

Enable blind people to know exactly what is in front of them in real time by only using a smartphone

This challenge had a $250 bounty, which was awarded to charlesic on October 31, 2016.

Mobile Camera and Sensor Based Events


First of all, the device (smartphone) must have a decent specification and be feature rich. The following technologies will be used to detect, in real time, what the blind person is facing at that moment:

1. Machine vision (OpenCV, Google Vision API, FastCV, ...)
2. GPS / geolocation detection in real time
3. Sensors (found in some Android phones)
4. A search engine (Google is best)

This will be in the form of an app which auto-starts when the phone is booted. Using the camera, we capture a series of photos, and an algorithm determines what object is ahead of the blind person and whether that object is moving toward them (possibly by estimating the distance between the blind person and the object from each captured image and checking how the distance changes between images, provided the same object appears in each one).
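As a rough sketch of the distance idea above: under a pinhole-camera assumption, an object's apparent size is inversely proportional to its distance, so comparing the pixel height of the same object's bounding box across consecutive frames hints at whether it is approaching. The function names and the 1.05 growth threshold are illustrative choices, not part of the original proposal.

```python
def relative_distance_change(h1_px, h2_px):
    """Distance ratio d2/d1 of the same object seen in two frames.

    Pinhole model: apparent height is inversely proportional to
    distance, so d2/d1 = h1_px / h2_px.
    """
    return h1_px / h2_px


def is_approaching(box_heights_px, growth_threshold=1.05):
    """True if the same object's bounding box grows by more than
    `growth_threshold`x between the first and last frame, i.e. the
    object has moved noticeably closer."""
    if len(box_heights_px) < 2:
        return False
    return box_heights_px[-1] / box_heights_px[0] > growth_threshold
```

For example, a box that grows from 100 px to 125 px suggests the object is now at roughly 0.8x its original distance.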

Machine Vision
With machine vision, we can extract a lot of information from what the camera captures. Note that the camera keeps capturing, and the data is processed and returned as soon as possible. For example, we can detect details about an approaching person, such as gender, race, type and colour of clothing, and whether the person is in uniform and what kind of uniform it is (for example, a police officer).

Moreover, we can detect the mood of the person: whether they have a happy, sad, or angry face. Advanced machine vision can detect whether the approaching person is holding an object (for example, a weapon).

If the approaching object is not a human, we can also detect the type of object, its colour, and more.

Using OCR (Optical Character Recognition) together with QR-code and barcode detection in machine vision, we can also detect in real time whether the object in front of us carries any text, barcode, or QR code, and process it.

After gathering enough information with machine vision through the camera, GPS can add context about the place the object in front of us is in. For example, the longitude and latitude can tell us whether the blind person is in a dangerous or safe place and what he or she is approaching; using Google's geolocation and Maps APIs, we can detect whether the person is heading towards a building, a road, a river, and so on.
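One way to act on those coordinates is to check whether the user's current compass heading points at a known hazard location. The sketch below uses the standard haversine and initial-bearing formulas; the 30-degree tolerance is an assumption for illustration, not a value from the proposal.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points."""
    R = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * R * math.asin(math.sqrt(a))


def heading_toward(lat, lon, heading_deg, tgt_lat, tgt_lon, tolerance_deg=30.0):
    """True if the user's compass heading points at the target
    within `tolerance_deg` degrees."""
    p1, p2 = math.radians(lat), math.radians(tgt_lat)
    dl = math.radians(tgt_lon - lon)
    # Initial bearing from the user to the target, 0..360 clockwise from north.
    y = math.sin(dl) * math.cos(p2)
    x = math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dl)
    bearing = math.degrees(math.atan2(y, x)) % 360
    diff = abs((bearing - heading_deg + 180) % 360 - 180)
    return diff <= tolerance_deg
```

If the user is heading toward a point flagged as a river or road crossing, the app could raise its warning earlier the shorter the haversine distance gets.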

Most phones contain sensors; here I will talk about the heart-rate sensor. Let's say someone shouted at the blind person. Naturally, the blind person's heart will react to the tone of the voice shouting at them, so using the heart-rate sensor, we can also detect whether there is danger approaching.
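A minimal sketch of that startle check: compare the latest heart-rate sample to a rolling baseline of the previous few samples. The window size and the 1.25x jump ratio are illustrative assumptions; a real app would need calibration per user.

```python
def startle_detected(bpm_samples, window=5, jump_ratio=1.25):
    """True if the newest heart-rate sample jumps more than
    `jump_ratio`x above the average of the previous `window`
    samples -- a crude proxy for a startle response."""
    if len(bpm_samples) <= window:
        return False  # not enough history to form a baseline
    baseline = sum(bpm_samples[-window - 1:-1]) / window
    return bpm_samples[-1] > baseline * jump_ratio
```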

As a final resort, we can make the smartphone gather information such as text and pictures, automatically Google it, and return the results.

All the processed results need a way to reach the blind person, and this is where TTS comes in: the app will use text-to-speech technology to explain to the blind person what is really happening.
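Before anything is spoken, the raw detections have to be turned into one short sentence; the string would then be handed to a TTS engine (for example Android's TextToSpeech). The 0.7 confidence cut-off and the phrasing below are illustrative assumptions.

```python
def compose_announcement(detections, min_confidence=0.7):
    """Turn (label, confidence) detections into one short sentence
    for the TTS engine. Low-confidence results are dropped so the
    user is not flooded with guesses."""
    labels = [label for label, conf in detections if conf >= min_confidence]
    if not labels:
        return "Nothing recognised clearly."
    return "I can see " + ", ".join(labels) + "."
```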

Real Time Video or Image Feed Analysis


The idea is simple and straightforward. Visually impaired smartphone users start the application through speech commands, and the application captures an image or a short video of the obstacles ahead. The recorded video or image is then processed through a third-party image or video recognition API such as Clarifai. The recognised information is converted to a voice reply, which is played back on the phone.
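The capture-recognise-speak loop can be sketched as a small pipeline. Here the camera, the recognition API (e.g. a wrapped Clarifai call), and the TTS playback are all injected as plain functions, so the skeleton runs without hardware or network; the names and the 0.6 confidence floor are assumptions for illustration.

```python
def describe_scene(capture_image, recognize, speak, min_confidence=0.6):
    """End-to-end pipeline sketch: capture -> recognise -> speak.

    `capture_image` returns raw image bytes, `recognize` is any
    third-party recognition call returning (label, confidence)
    pairs, and `speak` plays a string back through TTS.
    """
    image = capture_image()
    labels = [l for l, c in recognize(image) if c >= min_confidence]
    message = "I see " + " and ".join(labels) if labels else "I am not sure what this is"
    speak(message)
    return message
```

Because the three stages are injected, each can be swapped (a different recognition API, an on-device model) without touching the loop itself.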

Touch and Vibrate


When a blind person touches the screen of the smartphone, it should vibrate at the places where there is an obstacle. It would be something like Braille letters, but using real-time video and image processing to let people feel the actual thing.

The intensity of the vibration should correspond to the depth of the object in front of them.

If there is an apple in front of them, they can take a picture and feel it on their phone. The phone will vibrate more around the edges and less at the centre, so they will be able to tell it is an apple.
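The "vibrate more at the edges" mapping can be sketched as a per-pixel intensity map: vibration is strong where brightness changes sharply (object outlines) and weak on flat regions. This uses a plain gradient magnitude as the edge measure; a production app would likely use a proper edge detector.

```python
import numpy as np


def vibration_map(gray, strength=255.0):
    """Map a grayscale image to per-pixel vibration intensity,
    0..`strength`: strong at sharp brightness changes (edges),
    zero on flat regions."""
    gy, gx = np.gradient(gray.astype(float))
    mag = np.hypot(gx, gy)          # gradient magnitude per pixel
    if mag.max() == 0:
        return np.zeros_like(mag)   # flat image: nothing to feel
    return mag / mag.max() * strength
```

Touching the screen would then trigger the haptic motor at the intensity stored for the pixel under the finger.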

A similar thing can be done with roads: take a video while walking and feel it on the smartphone.

3D Vibrational Language Array Bracelet


A bracelet with built-in servos, tappers, and buzzers that links to a real-time mapping and data (net) system. A 3D Vibrational Language Array would be a new language that you “feel”. It would stream social network data, allow you to feel terrain and objects around you, sense moving objects, etc.

Within the bracelet circumference will be the navigational mechanics: north, south, east, west. If I’m walking north and I feel a buzz on the outside of my wrist, it’s essentially telling me to turn in the “felt” direction (east). Hard and soft buzzes can indicate distance or proximity to collidable objects. Tappers and servos can be used to stream data straight to the skin; it’s just a new learning curve, which will be picked up quickly enough. This is a new, and silent, way to take in data.
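A minimal sketch of the buzz logic, assuming four motors spaced evenly around the wrist and indexed relative to the direction of travel (front/right/back/left); the names, the quantisation to four motors, and the 10 m range are all illustrative assumptions.

```python
MOTORS = ["front", "right", "back", "left"]  # evenly spaced around the wrist


def motor_for_turn(heading_deg, target_bearing_deg):
    """Pick which of the four motors to buzz: the one closest to
    the direction the wearer should turn toward, relative to the
    current heading."""
    relative = (target_bearing_deg - heading_deg) % 360
    return MOTORS[round(relative / 90) % 4]


def buzz_strength(distance_m, max_range_m=10.0):
    """Harder buzz for closer obstacles: 1.0 at contact, fading to
    0.0 at `max_range_m` and beyond."""
    if distance_m >= max_range_m:
        return 0.0
    return 1.0 - distance_m / max_range_m
```

So, walking north (heading 0) toward a target bearing of 90 degrees buzzes the right-hand motor, matching the "felt direction" example above.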

Virtual White Cane


A virtual white cane can be created using a smartphone by creating an app which uses the camera and vibration motor of the smartphone.

This is done by integrating 3D imaging software from Dibotics, which uses a single camera sensor and movement to generate a 3D view of the sidewalk in front of the user, and making the smartphone vibrate with different intensities depending on the distance to the ground surface. The user can customise the settings to adjust their virtual cane to their required height. This also has the advantage of being able to warn the user of uneven terrain or low-hanging obstructions by vibrating in a certain pattern or playing a tone.

When combined with Google Maps, with the GPS function turned on, the app would be able to assist the user in navigation. Using voice recognition software or a virtual assistant, the user speaks the address into the phone, the route is plotted in Google Maps, and the app directs the user with different tones that vary depending on the direction of travel and the way the user is facing. The user would be warned of traffic crossings, if required, and other potential hazards.

Users would be able to create their own hazard points on the map by saying ‘hazard point’ into their phone and then pointing it at the hazard for the next few seconds. This would upload the hazard point onto a shared database for the app users, allowing other users to be warned of the potential hazards.

If the user needs to read signs for directions, they can activate a sign-reading mode by saying ‘read signs’ into their phone. As they point the phone around, it vibrates whenever it is pointing at text, informing the user where the text is, and then reads the text out if the phone is held on it for a couple of seconds. This can be done through the use of Google’s optical character recognition software.
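The "vibrate when pointing at text" cue reduces to a geometric check: does the centre of the camera frame fall inside any detected text bounding box? The boxes would come from the OCR engine's text detection stage; the (x, y, w, h) box format here is an assumption.

```python
def pointing_at_text(text_boxes, frame_w, frame_h):
    """True when the centre of the camera frame falls inside any
    detected text bounding box (x, y, w, h) -- the cue to vibrate
    and, after a short dwell, start reading the text aloud."""
    cx, cy = frame_w / 2, frame_h / 2
    return any(x <= cx <= x + w and y <= cy <= y + h
               for x, y, w, h in text_boxes)
```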

If the user wants to have an idea of the surroundings, the user can activate an image recognition mode by saying ‘describe to me’ into their phone then pointing their phone around. By using Google’s Cloud Vision API, the app would then describe out loud to the user the object that the phone is pointed at, such as ‘car’, ‘building’, ‘phone booth’, etc.

Other features can be built into the app such as recognition of individuals, informing the user of the incoming bus number, tracking of small objects such as keys, etc.

All the software required to make this app is available on the market; it only needs funding and a software developer to combine it into a single app. This app should enable a blind user to commute around with minimal assistance, allowing them to travel about as a fully functioning individual.

Hologram Website

Waseem Raja

Make a device that can produce a hologram to view a website and study its content. Let me explain: I want to develop a device that projects a holographic website, which could take the future education system to the next level. If a person comes across a word on a website that they don’t know, they normally open a new tab in the browser to search for it. With a hologram website, they can just click on the word, or hold it for a while, and its meaning and details pop up in a window located at the top, right, or left side of the page. The hologram website device will be small, so everyone can carry it with them to a cafe or any place they want. If you open a browser to see the meaning of some content on a website, you need to open a new tab for it, which costs you a lot of time as the number of pages in the browser increases; but my website device will hold all those pages inside the webpage itself. I can explain the details in a conversation, with a presentation for more explanation of this topic.

Accessibility of Every Application on the Smartphone

Waseem Raja

Create an application that makes the smartphone similarly accessible for blind people. I have seen them use a phone just by hearing its voice. My application will help every blind person to use smartphones: I want to make an application that helps them by speaking in their own language, since language is a barrier for every human being. This application will help them understand everything on the smartphone by hearing it in their own language, and will even guide them in how to use the application, including a few games they can play by speaking instructions to the smartphone, like Clash of Clans. If they want to upgrade their town hall, they just say ‘upgrade my town hall’; the application checks whether there is any element labelled ‘town hall’, clicks it, and then looks for the word ‘upgrade’. Put simply, they only need to know how to play, and the application will guide them every step of the way using keywords from the main application.



My first thought was Google’s Cloud Vision API. It would be more effective with an external camera such as a GoPro connected via Bluetooth, but you could get by with just the smartphone’s camera. It’s very simple really: video is processed as single frames at an ideal frequency, Google’s Cloud Vision API is used to detect objects and expressions, and objects or expressions with a high percentage chance and ‘confidence’ are then read out to the user through headphones.
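Since every Cloud Vision call costs money and time, the "ideal frequency" part matters: the camera might run at 30 fps while only a couple of frames per second are worth analysing. A sketch of that sampling, with the function name and rates as illustrative assumptions:

```python
def frames_to_process(total_frames, camera_fps, analysis_rate_hz):
    """Indices of the frames to send to the recognition API when
    the camera runs at `camera_fps` but we only want to analyse
    `analysis_rate_hz` frames per second."""
    step = max(1, round(camera_fps / analysis_rate_hz))
    return list(range(0, total_frames, step))
```

For a 30 fps camera analysed at 2 Hz, only every 15th frame would be uploaded, keeping the spoken feedback close to real time without flooding the API.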
