The Be My Eyes app is set to gain a new feature called the Virtual Volunteer, powered by OpenAI's recently announced GPT-4 model. This new tool has the potential to be a game changer for people with visual impairments, providing a virtual sighted assistant that can generate context and understanding for images in the same way as a human volunteer can.
Be My Eyes has been providing technology for the blind and low-vision community since 2012, connecting users with volunteers for assistance with everyday tasks like navigating airports and identifying products. The new feature, which uses GPT-4’s visual input capability, will allow the app to offer an even greater degree of independence to users and expand the use cases for the app.
Michael Buckley, CEO of Be My Eyes, stated that “in the short time we’ve had access, we have seen unparalleled performance to any image-to-text object recognition tool out there,” and added that “the implications for global accessibility are profound. In the not so distant future, the blind and low vision community will utilize these tools not only for a host of visual interpretation needs, but also to have a greater degree of independence in their lives.”
The difference between GPT-4 and other language and machine learning models is both the ability to have a conversation and the greater degree of analytical prowess offered by the technology. Basic image recognition applications can only identify what’s in front of them, while GPT-4 can extrapolate, analyze, and understand context, allowing it to offer much more comprehensive assistance.
One of the most exciting aspects of the Virtual Volunteer feature is its ability to offer instantaneous visual assistance for a wide variety of tasks. In one example given, users can send an image of the contents of their fridge to the tool, which will then not only identify what’s in the fridge, but also suggest recipes that can be prepared with those ingredients.
The new feature has undergone beta testing with a select group of Be My Eyes employees, yielding overwhelmingly positive feedback. Testers have praised the feature's functionality, particularly in one instance where a user was able to expertly navigate the railway system. The feature provided detailed information about the user's location on a map and offered step-by-step instructions on how to safely reach their desired destination.
In the video accompanying the announcement, a Be My Eyes user shows the app helping her with a number of tasks, including describing the look of a dress, identifying a plant, reading a map, translating a label, directing her to a certain treadmill at the gym and telling her which buttons to push at a vending machine.
Jesper Hvirring Henriksen, CTO of Be My Eyes, explains that traversing the complicated physical world is only half the story. There are challenges faced by visually impaired people when using screen readers to understand web pages, especially when it comes to images. “GPT-4 is able to summarize the search results the way the sighted naturally scan them—not reading every minuscule detail but bouncing between important data points.” This technology could simplify tasks such as reading news online and navigating cluttered e-commerce sites for people with visual impairments.
The Virtual Volunteer will be available to users in the coming months, and it has been hailed as “game changing” by Buckley. “Ultimately, whatever the user wants or needs, they can re-prompt the tool to get more information that is usable, beneficial and helpful, nearly instantly.”
If and when the Virtual Volunteer is unable to answer a question, it will automatically offer users the option to be connected via the app to a sighted volunteer for assistance.
The tool will be free for all blind and low-vision community members using the Be My Eyes app. You can currently register in the app to be placed on a waiting list for access.
We are excited to discover how the Virtual Volunteer will compare to other similar apps on the market, such as Seeing AI and Envision AI. If it lives up to the hype, this should indeed be a game-changing development.
Please share your own thoughts and hopes about this exciting development in the comments section below.
Looking forward to testing it
I applied to become a beta tester via their iOS app. The proof of the pudding is in the eating, and I'd like to experience the potential improvements myself. I am, of course, a bit worried about how they plan to handle our personal information, images, captured video, and so on, so I'm looking forward to hearing from them on this nagging issue.
I'll give this a try.
Equally promising is the notion of replacing the existing Siri system with one based on generative AI. Google, Amazon, and Apple are all looking at this new technology to replace their existing smart assistant products.
Data risks the same as ever
I’m not any more concerned giving my data to Be My Eyes than I was already, so it bothers me a little, but I’ll still do it. They already get my camera and mic access, and I’m OK giving them location permission like I did with Aira. Maybe OpenAI will get access to that too. I don’t love the thought, but the benefits seem like they could be well worth it for me.
Interesting. In some ways, I…
Interesting. In some ways, I actually find the idea of dealing with a computer for tasks like this more comfortable than the idea of working with a volunteer I've never met, so that's a plus for me.
On the other hand, all the stories I've read about these models make me kind of nervous. Given that they're prone to making up things that aren't there, and that there's no way for us to confirm what they're telling us, I'm kind of concerned it'll invent train platforms that aren't there or read me signs for shops that don't exist or something.
I am a fan of Be My Eyes, and this is treading into the realm of science fiction. Never did I believe that I could someday contact an AI for visual assistance.
Thank you Be My Eyes
It struck me, while reading your comment, that people are prone to making up things that aren't there too. And sometimes with malicious intent.
I'd want to watch what others experience with it for a time, and let the AI ripen a bit.
As far as data goes
When I go into town, I make myself available to people, both visually and physically in some cases. People with bad intentions can easily follow me around or find ways to do me harm. That being said, would that not be somewhat similar on the web? How would putting myself on the web be any riskier than say, having a coffee at a coffee shop, going to church or visiting a nightclub? Surely, you don’t go places to play Casper? Surely, you don’t join sites where you can meet people, expecting not to meet people? That’s my take.
This feels important.
Is this a paradigm shift in assistive technology? It feels like it might be. Fingers crossed and register button tapped.
I signed up
I signed up to become a beta tester of this new feature. I am extremely excited about its potential, and so is my wife; she is a volunteer with the Be My Eyes app, whereas I am a user. Of course, I think it will have bugs in the beginning; no technology is perfect, after all. And yes, there will be some risks, but in my opinion the benefits will outweigh them.
I wonder ...
I wonder if my virtual AI assistant has ever been to a strip club.
Instant ban right there, I guess.
On a more serious note, what if you receive a dirty pic and try to have it recognised? Would they fault you for that? It may even be a dirty meme from Twitter or Facebook. The main reason for sending pics in the first place is that we don’t know what they are.
This is wonderful!
This is incredible. Does anyone have some audio that we could hear of this in action? I believe it will work via the share sheet, though I hope we will also be able to call it. Will there be any certain camera requirements?
I'd like it to:
- support URL upload: images on Dropbox or other cloud systems, together with the camera roll.
- have it as a DESKTOP browser extension for Safari, Edge and Chrome, so that you can press a key combination and your screen will instantly be shared with the virtual assistant for as long as it takes to answer your question, be it a captcha or whatever.
- an option to call human assistance to supplement or verify information, yes, even via the web on desktop, with the ability to share the computer screen.
As BeSpecular (an image description app based on human help) is no longer around, I find this a very useful feature and signed up, although I still find the GPT-4 engine not so mature: I tried it with an image found on unsplash.com showing a turtle (that's what the description said, and a human confirmed it), but Bing replied that it was a woman in the mountains. Complete nonsense.
And a friend had a badly formatted PDF, asked Bing about it, and got fake information with an invented address and numbers. Of course, when it's available to the market it will have to deal with sensitive data and give users conditions to accept, as it's a personal aid that a blind person would rely heavily on.
About data spreading, let me offer an extreme scenario: a person in need of a pregnancy test, the kind performed at home, or even an HIV self-test such as OraQuick or similar.
They usually show two lines if positive and one line if negative. Or, reading a credit card with number, expiration date, security code on the back.
Honestly, in this case I wouldn't worry about where my data goes, because if a sighted person reads me that information in person, who can guarantee they won't blackmail me over it? Or, in the credit card case, they could write the numbers down on their hand with a pen or on their smartphone and then do whatever. We are vulnerable people, and this must be accepted after all. So an artificial assistant can in many cases be better than a human, because at least it cannot blackmail us or betray us by telling our information to someone we wouldn't want to talk to.
About "dirty", erotic material: why shouldn't a blind person have access to that material? It wouldn't be fair to have a virtual machine telling you that this material is "not suitable".
So, hopefully, it can manage also that kind of material.
RE: Instant Ban Right There I Guess
You do bring up a good point. Will the company be watching to see what we get assistance with, and then slap us on the wrist for being bad little boys and girls if we unsuspectingly show it something inappropriate? I make jokes, of course, but I could really imagine that happening.
Why would that be a ban?
If it's being sent to a computer, what harm is there in showing something NSFW? It'd be a bit silly for them to give you the boot because you took a picture of a toy or a person wearing a collar or something.
When will I get access?
Guys, that's awesome! I pressed register to join the waitlist and I'm checking hourly to see whether they've accepted me as a beta tester.
I'm looking forward to beta testing this when they open it up!
Me too. I signed up and I keep checking my access every day. Lol.
Exciting, but something to keep in mind.
It will be months before this is widely open for everyone to use, even people on the waiting list. The underlying technology is still in limited beta, and being slammed by requests from other projects and users.
This will blow Seeing AI and things like that out of the water.
Here are some videos; they're short, but damn are they impressive: Be My Eyes: How ChatGPT4 is helping visually impaired people picture what they can't see | ITV News https://www.youtube.com/watch?v=cUSeFnZGIzY
I'm blind, let me show you how AI is going to change your life: https://www.youtube.com/shorts/-fDRYZXR4YM
I'm blind, I Replaced my PT with AI [AD]: https://www.youtube.com/watch?v=RIcuQUthfXc
I think the AD means ad, not audio description, in this case :)
This is going to be game changing but for those refreshing every day, I'd recommend turning on notifications instead.
I've messed with ChatGPT and OpenAI; this will replace all of your blind text programs, I can guarantee it.
I think the new tool for Be My Eyes is amazing, but there is just one thing: nothing can replace the volunteers that make Be My Eyes so brilliant to use. Basically, what I am saying is, you cannot replace people.
Here's one more video.
The speech is quite fast; I couldn't understand it, but it's still cool.
I understand what you mean about the human factor but for living our day to day lives, this is amazing!
There are so many things we won't need a human for; I think you'll be surprised when you try it out.
Re: Here's one more video
So sorry for the rant, but please get ready for it <LOL>! Seriously, what's the point of using such a high speech rate when the not-so-hidden purpose of her demos is promoting the new AI capabilities of Be My Eyes? She gets overly excited in her Be My Eyes videos when, in effect, oftentimes only she herself understands the speech output. I wish Be My Eyes had better folks to demo the new AI capabilities. On a relatively similar topic, getting an app to read a menu at a restaurant isn't so exciting; it shouldn't reach the wow moment, after all, because apps like Seeing AI can already do it.
I disagree about the menu.
Yes, her speed is fast, at least in that video, but I think she just made it on the fly, and hopefully Be My Eyes will come out with other videos.
As for the menus, I disagree. Yes, Seeing AI can read them, depending on the font and things like that, but with this part of Be My Eyes you can ask very specific questions.
For example, it tells Lucy the prices of gin and reads out the headers, like starters, chicken and so on, so I really do think this will be a game changer.
Imagine going into a gym. You find the treadmill and step on it, then take a picture of the screen and ask the assistant what programs are on the screen and where they are located. It tells you something like: the timer is on the left. You press it, take another picture and ask where the buttons are to raise and lower the timer. You find them, take a pic, ask how much the timer has been increased by, and hear: the timer has increased by 5 minutes. You now know something you didn't before, all without sighted help.
Now, I'm not saying this is something we'll definitely be able to do, but honestly, AppleVis community, I'd not be surprised at all if we can.
I'm excited for this and can't wait to hear more podcasts/videos on this feature.
Re: I disagree about the menu.
I'd still want to wait for the AI technology to develop a track record before walking in front of trains and such. However, the menu post makes me wonder if this could be used in those computer startup UEFI menus, or the older BIOS. Could it identify highlighted selections in a list on a computer screen?
Why would you walk in front of a train?
That seems a bit bad. :)
this is definitely an error
I check daily for the Be My Eyes Virtual Volunteer, and it says access pending. This is a serious bug, because I have been on the waitlist for eleven long days and it still doesn't let me test it.
Me too, but I think it's by design.
I understand wanting to get in there and see what they’ve made; it’s like a fun house that doesn’t open till 9am and I’ve been queueing since 2am. I was hopefully one of the first to register as well. It’s been well over a week. I’m guessing they’re just being cautious. I wish they wouldn’t; I understand it’s not going to be finished and might do potentially undesirable things, but I’m okay with that.
It's not a bug.
This is being rolled out slowly.
I'd not be surprised if we get access in a month or so.
Having said that, it could be longer, because they might have to sort out server stuff on their end.
One more thing.
Don't be surprised if they want you to pay for this feature. Perhaps you get X number of pictures for free, but after that you have to pay. Honestly? I'd not mind that at all.
I'd be paying to help with server costs and improving the AI. It would honestly be something I'd actually want to pay for.