Introducing Voice OCR 5.0 – Crafted from scratch with cutting-edge technology
UPDATE: For the next 24 hours, I have put Voice on sale for $1.99. After that, it will return to its original price of $4.99. For those who have already purchased, all updates are free. Tell everyone to download this so they can get it at the sale price!
For those of you who don’t know me, I’m Shalin Shah, an undergraduate student studying computer science at UC Berkeley. I am the developer of the application Voice OCR Document Reader on the iOS App Store (https://apps.apple.com/us/app/voice-ocr-document-reader/id903772588).
My journey with the low-vision community started when I was 15, after creating and launching the first version of Voice. At the time, Voice was the only 100% free OCR application built with VoiceOver. However, I was just learning to code back then, so there were many bugs in the app and it crashed a lot. As time went by, I stopped updating it and sort of lost touch with the community I had worked so hard to build with this product.
Last summer, I decided to rekindle that community by launching a fresh upgrade to the application. I had a lot of fun being able to use the programming I learned in college to create a better version of Voice. The response from this community was overwhelmingly positive and has inspired me to keep going. So I’ve been working hard the past few months to build something better than all the previous versions of Voice combined. I have some exciting news for you guys.
I would like to introduce Voice 5.0! Voice 5.0 was crafted from scratch using a fresh new technical stack that’s built to perform with precision and reliability.
The OCR quality has never been better, even on images with really bad lighting, horrible focus, and messy handwritten text. To put the icing on the cake, our field of view detector is more accurate than ever, giving you feedback whenever it sees a document.
Voice has 180 new text-to-speech reading voices that are simply gorgeous. You won’t find higher quality voices than these on any application on the planet, period. They harness the power of cutting-edge artificial intelligence and contain pitch-perfect intonations for more than 50 languages.
Finally, Voice is entirely hands-free. There is a fresh set of commands that makes it easier than ever to control Voice with just your words! In addition to saying things like “capture” and “read”, you can now say things like “voice pause”, “voice play”, “voice restart”, and “go back” while Voice is reading your document. And when you don’t want to talk to Voice, we’ve included a polished, revolutionary user interface upgrade with a new standard: no more than 4 buttons per screen. Crafted with VoiceOver, you simply cannot find another interface that is this simple to use.
Additionally, in order to get access to the 180 premium quality reading voices, you will need to subscribe to Voice Pro, which is a monthly subscription costing $4.99 a month. I wish I could make this free for everyone, but powering the server to constantly improve the AI voices is super costly, especially as a solo developer. I still wanted to provide this functionality for the power users who might really need this, so I’ve built it as a subscription. But you can still get access to standard quality voices in over 30 languages without the subscription.
With that being said, there are still many improvements I will be making to version 5.0 in the next few days.
Here’s what’s on my To-do list so far:
1. Make the scanning speed significantly faster.
2. Add offline OCR for the first time ever.
3. Add a feature to import PDFs found in Safari into Voice OCR.
4. Make PDF export fully accessible; right now it just exports the raw image.
5. Create an instruction manual and video that is accessible and teaches people how to use Voice, with the full set of voice commands.
If you guys use the application and have any feedback, PLEASE comment below or send it my way at email@example.com. All ideas are super welcome, and I would love to incorporate the community feedback into the app over the next few days.
Here is the link to the app on the iTunes App Store: https://itunes.apple.com/us/app/voice-take-picture-have-it/id903772588?…
Lastly, if the product is useful for you, it would mean the world to me if you could write a quick review on the App Store and share this with your friends. It’s the community feedback that really helps me continue to maintain this.
Anyway, feel free to reach out! I will respond to all emails very quickly. Thank you all for your time.
I got this app ages ago, but haven't had much opportunity to test it out. I will certainly do so now. I'm curious about the $4.99 for the pro voices, though. I liked the voice that did the tutorial quite a lot, but is there a way to hear samples of these voices before subscribing? It's not a huge expense, but I have tons of subscriptions, and they start to add up after a while. Thank you very much for such an informative post, and I look forward to using your app on a more regular basis.
Thanks for sticking with Voice for so long. I truly appreciate the support. To answer your question, yes, you can totally hear samples of the voices before subscribing! Just go to "Settings", then scroll down until you find "Reading Voice", and from there you can preview any of the reading voices. Also, if you want to preview the reading voices for other languages, you can change your language in settings to something like "English (Australian)", for example, and then preview the reading voices in that language too!
Let me know if you have any other questions. Thanks again!
I remember you from the first iteration. I think it was the first competition the abysmal (and expensive) KNFB Reader had. I enjoyed Voice more and more as it improved, and I'm currently using Envision AI and Voice Dream Scanner (for different tasks), and I have found them to be the crowning achievements in iOS OCR convenience and accuracy. But I still have Voice and will absolutely give it a try with the new update. Speaking allows the camera to be a lot steadier.
Thanks for being a supporter of Voice for such a long time! Let me know if you have any feedback that can help me improve Voice. I'm looking forward to making larger improvements very soon.
Are there plans for multilingual support for OCR recognition?
I’m just curious, but are you on Twitter?
Thanks for the comment! The current OCR recognition is already multilingual, supporting over 50 languages in its current state. Try it out on something other than English and let me know how it works for you. :)
I am currently not on Twitter, but as soon as I open a Twitter account I will be sure to post it here.
How can I scan a multilingual document? When I check the settings for language selection, I can only highlight the one language I need to use in my scan. For example, if I choose French from the menu it announces that French is selected, and when I scroll up a bit to English it announces that English is selected. When I check if French is still selected it announces nothing. Am I missing something, or does VoiceOver not announce the actual status of my selection?
Will you be putting it on sale again anytime soon? I just saw this.
You know what, I've put it on sale again for the next 24 hours. Thanks for reaching out. My goal is for all in the community to benefit so I hope you get a chance to test this out now that it's on sale again!
Sorry that it's kind of confusing right now. But basically, the Languages page that you are referencing only lets you select one language at a time, and it's used for the text-to-speech. However, if you leave it at French, then turn off "text to speech reading" in settings, then the OCR will detect the words natively for both languages and you can use VoiceOver to do the reading. Currently, the text-to-speech voices can only read in one language at a time, so you would have to use VoiceOver to do the reading. But the OCR will still correctly detect all the languages. In fact, a document could have more than 40 languages and the OCR will detect the text in all of the languages!
Thank you for continuing to develop this awesome app. I have one suggestion. Please use Apple's speech API to allow the app to speak using any of the voices on the local device that Apple currently makes available to developers. I know this should work for everything except the Siri voices, and this would allow more voices to be used.
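For reference, the suggestion above could look roughly like the following sketch, which uses AVFoundation's `AVSpeechSynthesizer`. It is only an illustration, not code from Voice OCR: the helper names `installedVoices` and `speak` and the default language code "en-GB" are assumptions made for this example.

```swift
import AVFoundation

// Hypothetical helper (not part of Voice OCR): returns the system voices
// that Apple exposes to developers for a given language code.
func installedVoices(for languageCode: String) -> [AVSpeechSynthesisVoice] {
    AVSpeechSynthesisVoice.speechVoices().filter { $0.language == languageCode }
}

// Keep a strong reference to the synthesizer; speech stops if it is deallocated.
let synthesizer = AVSpeechSynthesizer()

// Hypothetical helper: reads recognized text aloud with an on-device voice.
func speak(_ text: String, languageCode: String = "en-GB") {
    let utterance = AVSpeechUtterance(string: text)
    // Prefer the first voice found for the language; otherwise fall back
    // to the system default voice for that language code.
    utterance.voice = installedVoices(for: languageCode).first
        ?? AVSpeechSynthesisVoice(language: languageCode)
    synthesizer.speak(utterance)
}

speak("This text came from a scanned document.")
```

An app could show the result of `installedVoices(for:)` in its voice picker alongside any premium voices, so users can choose between them.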
Thank you very much for your reply to my question about the premium voices. I'm just wondering if at some point in the future, even if it's a long time from now, there might be a way to purchase just an individual voice. For example, I found one male voice and one female voice that I really liked the sound of, but honestly wouldn't have any desire for additional voices besides those two. I guess I'm imagining something like the voice list you have now, but maybe you could purchase a single voice for $9.99 or something like that. Maybe keep the subscription model for people who enjoy having numerous voice options as well. Maybe this isn't a realistic option for you as a developer, but I figured that it couldn't hurt to make the suggestion. Thanks for all of your hard work, and I hope that you're having a really great day!
I'm not sure if I'm doing something wrong on my end, but I can't seem to get the app to do anything. It runs through the tutorial and then VoiceOver announces a "next" button. When I double tap the button, nothing happens and the app seems to freeze up. I flick around the screen and nothing seems to be detected. VoiceOver also seems to detect nothing when I slide my finger around the screen.
I checked the App Store page before purchasing and it said the app was compatible with my device. For reference, I'm running iOS 13.7 on an iPhone SE first generation.
I have both restarted my phone and also deleted and reinstalled the app, but neither made any difference as far as I can tell. I'm pretty much at a loss at this point.
The issue is just that you're running the application on an iPhone SE, which Voice OCR does not support anymore. In fact, it only supports the iPhone 7 and later.
However, good news: I have decided that I will add support for phones older than the iPhone 7, so you can expect an update within the next few days that will make sure the application works well on your device. Sorry about the poor experience, and I hope you can try it out again soon!
Hello, this looks like a really wonderful app, and an excellent update. I do want to respectfully clarify that Voice Dream Scanner does already offer on-device, offline OCR. Not a knock against this app; it's wonderful that this is a feature now available in multiple apps. I'd love to see it become the standard.
I know that the first generation SE is a pretty old model at this point and maybe not many people are still using it, so I really appreciate that you're willing to add support for it. Thank you.
I'm running it on a 6S and so far it's working fine. The only problem, which is to be expected, is that the phone warms up.
I personally removed the app, since I use Seeing AI's short text mode a lot more. But I upgraded before removing the app and am glad I did; paying for those voices was totally worth it.
The UK Male 1 voice is amazing! If I could, I'd buy it for my screen reader, NVDA. I'd use Eloquence for most things but definitely use that voice for articles. The inflection is amazing, along with the tone.
Thank you for putting the app on sale. I was able to get it.
I have had Voice OCR since it was first released. What I can't figure out is why the app appears in the share sheet for some PDF files in some apps but not for PDF files in other apps. The most useful apps would be Files, Google Drive, and Dropbox. Apps like Acrobat Reader would also be helpful, since many apps that use PDF can export to Acrobat Reader. While this is not direct support for many apps, it may prove a workaround for them. Just something to consider. Thank you for the app. It was an amazing feat for a high school student (first release).
This is a great app!! I now have it on my iPhone 11 and it's awesome!! I especially enjoy the premium voices. They sound awesome!! Keep up the amazing work.
Wow! Glad that you were able to rebuild this app from the ground up. Congrats. Looking forward to the updates coming soon. Please continue to keep updating us. :-)
Hello Shalin: This is the only OCR reading app I use. Continues to be a great app! Keep up the good work! I appreciate it more than you know!
Hi, I wish it had Irish English and South African voices that are premium, but I love the other English voices. I wish we could use them on the Mac and/or iOS.
I've not used Prizmo Go although I have it on my iPhone. Does this app work better than that? Is there anything that could help people learn to use the app right now?
Response to Ashley: The app is very easy to use. Just open it, position your phone camera over what you want to read, and say, "Capture". The phone will take a picture. Then say, "Read". Wait a few seconds and your phone will automatically start reading what it saw. I forgot to say that when you say "capture" or "edit" and nothing happens, you can perform these manually on the phone too.