Voiceover recognition is the way of the future!

Accessibility Advocacy

I was sitting at Dish Network a couple of years ago waiting for another call to come over the Sling TV customer service line, as that was my job. I had a lot of time between calls. I'm one of those people who is quite creative but doesn't know how to code, so all these thoughts just hang around my mind unless they are music or other audio. I thought of JAWS implementing a new feature that would be, so to speak, scripting for dummies, or for myself, the general JAWS user. I would call it JAWS AI learning. You would go into an inaccessible app and perform a keystroke. JAWS would take some time and label the app's various controls, using OCR to discover the text which corresponded to each control, and let the user know the type of each control. After this machine learning was completed, JAWS would take the end user through a wizard listing the controls, so the end user could check whether the machine learning had been accurate and make any necessary changes. If the end user had a sighted person with them, JAWS would show where the control in question was on the screen for even better scripting accuracy. Once a person had finished, the app could be quite accessible. I never wrote to FS about this because I felt that, as a member of the general public, my thoughts wouldn't gain any traction. Now, a couple of years later, in iOS we have a somewhat similar feature that will only get better in time. I personally couldn't be more excited.
I tried an inaccessible app this morning, Pacemaker, with Voiceover recognition enabled. I have had this app for a few years, and it would not do anything at all before. With this new feature it wasn't totally usable, but I could at least get somewhere.

After iOS got image recognition in Voiceover, other screen readers also picked this up. Now, could my idea become a reality in another mainstream screen reader? Maybe so. We all, as blind people, spend so much time with software that isn't accessible. That is one of the main reasons jobs aren't available to us. If this did come to the mainstream computing world, it would open a lot of doors! Again, Apple, way to go! You guys always innovate, and I am always truly so thankful for it! I hope someone who has some clout in the accessibility world sees this post, or builds off of these machine learning ideas in the future.



Submitted by Remy on Thursday, September 17, 2020

Let me first say I love the idea of Voiceover Recognition. Even if all I'd like to know about a picture is the text displayed, it's a big step forward. I love your idea of having a screen reader use OCR and machine learning to then label elements. It's such a simple-sounding solution. The only thing that would need to happen to make that really work is that the elements would then have to be interactable. It's one thing to label an unlabeled button in an app, but if a part of the screen isn't even selectable by Voiceover, it doesn't really help much. Unless I'm vastly misunderstanding things. Either way it's really cool and I can't wait to see how it improves and evolves. I'd love to, for instance, be able to have Voiceover read any on-screen text in real time on a video I'm watching or game I'm playing. Even if I had to use a command for that or something. Heck, the back tap would be a perfect shortcut to activate such a thing.

That said, I do have one concern, and that is that this takes the responsibility away from developers to make their apps accessible. On one hand, if recognition really did start replacing conventional accessibility, that would mean more apps would simply just work. But on the other hand, we'd then rely on a single service for accessibility. I realize that's highly unlikely, but it does give me pause. Having elements labeled properly is going to make navigating far faster than Recognition will. Recognition is a great backup though, with a whole lot of potential.

Submitted by Holger Fiallo on Thursday, September 17, 2020

It will be interesting when VR becomes a reality in which an empty desk can project a keyboard or a viewer that people can interact with. How would it work for those of us who are blind? How can you use a VR keyboard that is projected on top of your desk? People who can see the hologram can do so, but how does accessibility deal with something that is not really there? Now someone will say that is sci-fi and not real, but that may be the future.

Submitted by blindbossfisher on Thursday, September 17, 2020

My only concern is that this feature will make Apple itself lazy and not properly label its UI items. I hear this has happened with a few widgets and in the new Translate app. I say hear, because I have an older phone, and I, along with several others who can't afford to upgrade, am therefore being left behind with this innovation.

Submitted by Paul H on Thursday, September 17, 2020

I know the phrase game changer is overused but, with Apple, yet again, it seems justified to use it.

I totally agree with you as to how the future could look now that Apple have given us a crystal ball to some extent with this new feature.

Since you imagined this one so well, I’ll mention something else and that is the image recognition feature.

Consider that Facebook have barely developed their image recognition beyond a few basic tags which read something like: image of people standing, water, nature and plants. Compare that to Apple making full and meaningful descriptions, like the one I got a couple of days ago when I was lining my camera up to take a photo along the River Thames in London. I was looking at the London Eye and County Hall was in the background.

For now I got: Ferris wheel against a blue sky next to water with buildings in the background.

Now imagine Apple developing this further for blind users and non-blind users alike. All they’d have to do is leverage actual information about the location and where you are in proximity to those buildings and features, and it could so easily have reported: You are standing on Hungerford Bridge looking downstream over the River Thames. The London Eye is to your right and County Hall is in the background. A Thames Clipper (boat) is moving beneath you on the left bank.

I predict this will be called Apple Tour Guide and it will have multi-layered levels of information, so the tourist could choose from a bunch of recommended walking routes or make one up for themselves. They’d be able to ask the guide to include an artisan coffee shop along the way and so on.

Meanwhile, blind users could be given live information along any route, such as narrow pavements, any steps coming up, which way a path in a park is meandering and so on.

Submitted by Paul H on Thursday, September 17, 2020

In reply to by Holger Fiallo

I think that is where haptics will really come into its own. Its true potential isn’t really public at the moment, but imagine your brain being convinced you are interacting with real buttons, different feeling surfaces and so on. Well, that is what haptics aims to do and more. Remember, none of us thought we would ever be able to interact with touch screens, yet here we are doing just that.

Submitted by kool_turk on Friday, September 18, 2020

In reply to by Holger Fiallo

It may seem like Sci-Fi now, but what used to be a Sci-Fi concept is slowly becoming reality.

Tablets like the iPad are similar to the data pad in Star Trek.

Those mirrors that were used in the Harry Potter series, even though we never saw them in the official books, can be like video calls.

I can think of a way that keyboard example might work, but the technology isn't there yet.

For those of you that remember the old boom box, remember when you would turn it up and you put your hand in front of the speaker and you could feel the air?

That is how I imagine we would be able to feel those VR objects.

It wouldn't be perfect, but at least it's something.

You could even use another concept, this one would be like the reverse magnet effect.

In fact, researchers have been playing with something like you describe involving air, though I forget the details.

I don’t see a projected holographic keyboard being possible in the near future as one person suggested, but I think such a thing would be possible using AR glasses, which technically wouldn’t be a hologram, but more likely given current technology.

Someone else suggested the image recognition feature could be developed into a virtual tour guide. It sounds like an interesting concept, and I think this is something that might be helped along by the rumored Apple Glasses, assuming Apple ever does create them. I think their AR push suggests Apple Glasses are under development since the AR experience is more cumbersome with a device that needs to be held or mounted in a bracket.