The ChatGPT app has found a place in my iPhone's dock. I use it for many things, both serious and fun. Part of me is convinced that it's going to turn into Marvin the Paranoid Android from The Hitchhiker’s Guide to the Galaxy. There it is, brain the size of a planet, and I'm constantly asking it to answer very simple or repetitive queries. Yet, it always remains eager to assist with any question.
AI is a vast topic, and Morgan has already written an excellent post about it for AppleVis. Here, I want to explore how AI models have the potential to increase accessibility, both now and in the future. I'm calling it an exploration because I'm still discovering and experimenting with the capabilities of AI models. It's also a chance for you to explore with me, to tell me in the comments how you're using these models, what has worked well or hasn't, and your hopes and fears for the future of AI accessibility. I'll be mostly talking about ChatGPT, because that's what I'm most familiar with, but feel free to discuss other models and AI apps in the comments.
Describing Images, Real or Imagined
This is probably the most obvious way AI is currently being used to increase accessibility. Many of you will already be using Be My AI, the feature of the Be My Eyes app that provides AI-generated image descriptions. These descriptions are generated by GPT-4, the model that powers the paid version of ChatGPT. Be My AI is a versatile and flexible tool. Its ability to answer follow-up questions is extremely useful for getting detailed descriptions of the aspects of an image you're most interested in, or for requesting more information and context to help you understand its meaning.
The capabilities of models like GPT-4 extend beyond describing existing images; they are also invaluable for generating ideas for new visual content. For instance, if you're a content creator who has been blind since birth or for many years, it can be hard to come up with ideas for the visual aspects of your projects when you have little or no recent exposure to visual content. You could try asking ChatGPT to generate textual descriptions of possible images or designs. I recently asked it for logo ideas for a project, and its suggestions have helped me start thinking about possible designs. It won't entirely replace sighted assistance with design, but it might allow you to have a little more input into the process.
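As an illustration, here's the kind of prompt that gives the model something concrete to work with (the project details are invented): "I'm creating a logo for a podcast about accessible technology. Suggest three logo concepts, and for each one, describe the shapes, colours, and overall mood in enough detail that I could explain the concept to a sighted designer." The more context you give about your project and its audience, the more relevant the descriptions will be.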
Text-Based Educational Content
When you're trying to learn, whether in a formal educational setting or out of your own curiosity, some subjects offer few resources that don't depend on visuals. What if you want to learn about the visual arts? What if you're trying to grasp mathematical concepts, or discover more about science subjects normally taught through images, such as astronomy? ChatGPT can explain just about any subject and can customise its explanations to your needs. If you need explanations of visual concepts, ChatGPT can provide them.
As with any source, think critically about the information an AI provides. ChatGPT has an immense amount of knowledge on just about any topic you might be curious about, but it has biases, doesn't know about very recent events, won't always have detailed information about obscure or highly specialised topics, and can sometimes state things that are simply wrong with complete confidence. However, every source of information has its limitations and weaknesses, and ChatGPT's ability to engage in conversation and tailor its responses makes it an excellent tool for getting exactly the information you need, explained in a way that makes sense to you.
Most readers will already be aware of the limited range of accessible games on Apple devices and other platforms compared to what's available to sighted people. ChatGPT doesn't entirely solve that problem, but it can generate an endless supply of accessible and customisable games and fun activities, from trivia games to text adventures. If you're asking the model to generate any kind of story or fictional world, it'll usually work better if you specify what kind of scenario you want. The old principle "garbage in, garbage out" applies: give it a generic prompt and you'll get a generic response; craft your prompt thoughtfully and the model will be able to build on your idea. Alternatively, you can create fictional worlds in collaboration with it, for example by engaging in role-play or building a story together, taking turns to write one sentence at a time.
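To make the "garbage in, garbage out" point concrete, compare two prompts (the scenario here is just an example). "Make a text adventure for me" will tend to produce a generic dungeon crawl. Something like "Run a text adventure where I'm the engineer on a generation ship whose reactor has just failed. Describe each scene in two or three sentences, then ask me what I do" gives the model a setting, a role, and a format to build on, and the resulting game will reflect that.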
You could also try games and fun experiments that test the AI's capabilities, such as guessing games, or simply seeing how it responds to different prompts. One I've tried is asking ChatGPT to guess at events that happened after its training data was last updated; without browsing the web, it won't know about anything after that point. Begin by asking it when its training data was last updated. I'm suggesting you ask ChatGPT directly, rather than giving the date here, because it might have changed by the time you read this, and because the answer differs depending on whether you're using GPT-3.5, the free model, or GPT-4, the model that's only available to paying subscribers. Next, try asking it to make guesses about events that have happened since then, whether major world events, news about your favourite band or TV show, or anything else.
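If you'd like a starting point, a sequence like this works well (the follow-up topics are only examples): first ask, "When was your training data last updated?" Then try, "Without searching the web, guess three major news stories you think have happened since then, and explain your reasoning." Comparing its guesses with what actually happened is an entertaining way to get a feel for both the model's knowledge and its limits.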
There are lots of games to try, so keep experimenting and let us know what works, or doesn't work, in the comments.
I've been describing what AI models and apps like ChatGPT can do now, but AI has the potential to bring even greater accessibility improvements in the future. Apple is reportedly planning to add AI capabilities to Siri. When Siri launched in 2011, I remember being amazed that you could ask "Do I need an umbrella?" and it would understand that you were asking whether it was going to rain. Now, in 2024, with services like ChatGPT available and Siri having seemingly regressed, Apple's offering looks inadequate. With other companies launching AI devices like the Humane AI Pin and the Rabbit r1, Apple will need to catch up quickly.
I recently told Siri, "I'm not your friend anymore because ChatGPT is better." In response, Siri launched the ChatGPT app. I can't argue with that: I said I liked ChatGPT, so it gave me ChatGPT. Yet the exchange highlights Siri's limited ability to understand language and context. It picked up on the fact that my query included the name of an app on my device and launched that app, but it didn't understand the rest of what I said.
A new and improved Siri that takes advantage of the latest advances in AI could be very powerful if Apple gets it right. If AI language models become more integral to the way we interact with our devices, our screen readers could become much easier to customise. When we want to customise existing screen readers like VoiceOver, we have to navigate a complex selection of settings, and sometimes the available settings don't allow us to customise in quite the way we want. I hope that in the near future, we'll be able to tell our devices, in natural language, exactly how we want them to behave at any given time. I'd like to be able to tell VoiceOver exactly what information I want it to read and exactly what I want it to skip.
I hope we'll soon be able to use AI to interact with technology in more accessible ways. If I need to use an inaccessible app or website, I'd like to be able to ask my AI assistant to interact with it on my behalf, explaining to it exactly what I need it to do. AI like this is already being developed, although it remains to be seen whether anything of this sort will be available on Apple platforms. Apple might be reluctant to allow Siri, or third-party AI assistants, to interact with apps and websites in the way this would require.
There's also the potential for AI to make tasks easier and more blind-friendly, even when the usual ways of doing those tasks are accessible in principle but awkward in practice. Whenever I'm writing a document, I format it with Markdown, because I find it much easier than formatting with a traditional word processor. When I've finished writing, I convert my Markdown to the type of file I need. But Markdown can't do everything, and for a long time I've been looking for an app that can convert my Markdown to a Word document or PDF file while letting me specify exactly how the converted document should be formatted. I hope that this, too, is something AI assistants will be able to do in the future. While it's possible for blind people to format documents with a word processor, being able to control document formatting through text-based interactions would be much easier and would help to reduce formatting mistakes.
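Existing command-line tools hint at what this could look like. As a rough sketch, assuming you have the free Pandoc converter installed, and using invented file names, the conversions I've described look like this:

pandoc notes.md -o notes.docx
pandoc notes.md --reference-doc=my-template.docx -o notes.docx
pandoc notes.md -o notes.pdf

The --reference-doc option tells Pandoc to copy the formatting styles from an existing Word document, which is currently the closest thing to specifying the output formatting in advance; converting to PDF additionally requires a LaTeX engine to be installed. An AI assistant could go further by letting you describe the formatting you want in plain language instead of preparing a template first.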
This blog post has only scratched the surface, so please share your thoughts in the comments. Let us know how AI has made a positive difference in your life, ways your experiences with AI have been less positive, or your hopes and fears for the future of AI accessibility.