This is Shalin Shah — the creator of Voice. For those of you who don't know me, I’m a senior studying computer science at the University of California, Berkeley. Voice has been a project of mine since I was in high school, and I’ve been working hard to make it better over the last 6 years.
Voice was hand-crafted for people like you. It can help you quickly read items like product labels and magazine pages in your day-to-day. But you can also use it for more advanced reading like mail or books. Here's the link to the app: [https://apps.apple.com/us/app/voice-ocr-document-reader/id903772588](https://apps.apple.com/us/app/voice-ocr-document-reader/id903772588).
My goal with Voice was to create the simplest, most intuitive interface to help you read things. I’ve built some new features in this version that I'm really excited to share with you today.
- Voice’s OCR engine is perhaps one of the best in the world. You don't need to worry about low lighting or bad focus. Voice corrects for them automatically and gives you pixel-perfect accuracy every time. Voice can even read scribbles and handwritten text with incredible accuracy.
- The simplest way to use Voice is to tap the button labeled “Camera. Button.” This will take a picture. Then tap the button labeled “Next. Button.” and Voice will perform OCR on your image and read it aloud.
- You can also control the app using your voice if you find that tapping buttons shakes your camera. Simply say “capture” to snap a picture, and “read” to start processing the image.
- Batch mode is enabled by default. To read more than one page, just keep taking photos using the “Camera. Button.” or by saying the word “capture” for each page. Voice will read all the documents one after another.
- Good OCR detection does not depend on the corners of a document being visible. But if corner detection is important to you, Scan Tone plays a tone when all 4 corners are visible. A louder scan tone means better visibility of your document.
- Voice also supports real-time scanning. Toggle this on, then simply hold your phone in front of any document with text and Voice will read it out loud in real-time. Voice also automatically turns on flash when it detects sub-par lighting and turns it off for objects that would glare.
- Voice supports 47 languages and offers 180 reading voices. 52 voices are the standard iOS voices, and 128 of them are premium AI-generated voices with extremely fluent intonations.
- Photo library picker lets you pick multiple images at a time from any of your albums with full VoiceOver capabilities.
- Voice now works completely without Wi-Fi. If privacy is a concern, you can use Voice in offline mode.
- Once your document has been scanned, it takes one tap to copy your detected text to clipboard, export it as an accessible PDF, or export it as a Text file.
- If you have “save photos to camera roll” toggled on in Settings, then all the photos you snap will be added under the “Voice OCR” album in your phone’s photo library.
- Voice allows you to import both images and PDFs from other apps. It automatically detects the document format and performs OCR.
- The entire app was crafted with VoiceOver in mind, so everything is fully accessible.
- Voice is only 6.9 megabytes, which is 34 times smaller than Seeing AI, and 11 times smaller than Voice Dream Scanner.
- I have made some changes to the pricing model. Previously, the app cost $4.99 on the App Store and there was an additional subscription that cost $4.99 per month and unlocked access to premium reading voices. Now, the app is free on the App Store and the pricing model is a subscription. Basically, you get 20 free scans per month. Once those 20 scans are up, you can purchase the Elite plan for $9.99 per month or the Believer plan for $99.99 per year. You save about $20 a year, or roughly 17%, by upgrading to the Believer plan.
- Feel free to reach out to [email@example.com](mailto:firstname.lastname@example.org) for any questions, feedback, or concerns. You can also text me at any time on my personal phone number: +1 949-939-6619. Critical feedback and ideas are super welcome, so please reach out!
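For anyone who wants to double-check the Believer plan savings mentioned in the pricing note above, here is a minimal Python sketch of the arithmetic. It uses only the two prices quoted in this post. The function name `yearly_savings` is just an illustrative name and has nothing to do with the app's actual code.

```python
# Prices quoted in the post, in US dollars.
ELITE_MONTHLY = 9.99
BELIEVER_YEARLY = 99.99


def yearly_savings(monthly: float, yearly: float) -> tuple[float, float]:
    """Return (dollars saved, percent saved) when paying yearly instead of monthly."""
    full_year_at_monthly_rate = monthly * 12
    saved = full_year_at_monthly_rate - yearly
    percent = 100 * saved / full_year_at_monthly_rate
    return round(saved, 2), round(percent, 1)


saved, percent = yearly_savings(ELITE_MONTHLY, BELIEVER_YEARLY)
print(f"Saved per year: ${saved:.2f} ({percent}%)")
```

Twelve months of Elite comes to $119.88, so the yearly plan saves $19.89, which is where the "about $20, or roughly 17%" figure comes from.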
Here is what's on my To-do list for the next version of Voice:
- The ability to scan a barcode and have Voice read out the product information.
- The ability to do object detection for all objects.
- The ability to play and pause with gestures when Voice is reading a document.
- Gamify the experience, so the app feels more delightful to use.
- Reduce the app size even more so people get a faster app download time and it takes up less storage.
I have had your app on my phone since it was first released. I have enjoyed it. For barcode reading, here's an app that works really well with beeps, similar to what you have in this app for corner recognition. Link below. I hope this is a helpful idea for making barcode scanning easy. It works on major products.
Interested to try the live feature
I'm so glad to see this app being updated. I am very interested to try the quick text feature. I'm actually a gamer, and while standard OCR can be really helpful for big huge blocks of text (such as in-game lore), sometimes all you want to scan is a small text box which constantly changes (like during unvoiced dialogue). So quick text is very useful for this. I'm currently using Envision AI and purchased their 1-time access package for around $60 when it was on sale. It's now my go-to app. But I am always interested in seeing how other apps compare. So I'll certainly give this a try. I do think it would be nice to have a 1-time fee. These monthly subscriptions are usually pretty reasonable, but the more you have, the more they start to add up.
thanks for sharing that!
First of all, thanks for your continued support of Voice over the last few years. This community's continued support is what allows me to continue investing time into making the product better. Also, thanks for sharing the link to the barcode scanning app! I will take a look at it and it will certainly help me when I make the barcode scanning feature!
Thanks for trying it out!
Thanks for checking the app out! It's really fantastic to understand how real-time mode can be helpful to you in gaming. I'd be curious to hear how Voice OCR compares to Envision AI. I'm always looking for different ways to improve the product and make it as good as possible so would love to hear any suggestions for making real-time mode better.
The VOICE update
I've been using your app on and off and it works great!
Unfortunately, at the moment I can't seem to connect to the internet with it, and I did have to close the app and open it again to tap on the Next button after entering my name.
If I were you, I'd download Seeing AI and see what you can do to improve upon that. The barcode feature there beeps when it comes across a barcode. I don't know if anything can be done to improve upon that, but sending unidentified barcodes to a server or something to be identified or improved upon somehow would be nice.
The internet connection seems to work now.
I prefer Seeing AI if I need to read stuff, but am glad this app is still being worked on for those that use it.
Thanks for the suggestions!
Glad the internet seems to be working for you, although that was definitely weird since Voice doesn't actually need wifi in order to work. It's a good idea to think about ways I could make things that other apps do a lot better. I will definitely listen to your suggestions and improve the app even more. Thanks again for trying out the new update and feel free to let me know if you have any more suggestions. Thanks!
Testflight as an IAP benefit
I admire your work on this app and look forward to trying out the update, but have to say that seeing access to the Testflight version of the app as one of the benefits of an in-app purchase really doesn't sit comfortably.
I hope that you will reconsider this choice, and revert to using Testflight with a pool of experienced and committed users to test and report on development snapshots of the app rather than selling access to Testflight releases.
Can’t get past the log on stage.
Just downloaded the app. When it asked me to insert my name, I inserted Michael, but that is as far as it goes. I cannot find an option to click Done or anything else. Pressing Return just removes the keyboard, and I’m stuck on that logon screen.
I had the same problem as Michael...
After I added my name and hit Return, I went back to my home screen... not force closing the app but just swiping up to go home. Then double-tapped on the app again, and the Next option was visible. Hopefully that works for anyone stuck in the same place.
That's what I did.
So hopefully it works for everyone else.
Problem with login screen is fixed
Hi Brooke, Brad, and Michael,
Thanks for highlighting the issue with the button not appearing on the login screen. I've shipped another update which should fix that bug. Thanks for bringing it to my attention.
I never thought about it that way
Thanks for pointing that out, that does seem a bit ridiculous, doesn't it?
In all honesty, I had never looked at it that way. I thought about it more as an added benefit to subscribing. Like early subscribers get access to features earlier than most people.
But you're right, this was my mistake. I will revise this in an update. Thanks for bringing it up!
I have revised the pricing model to make Testflight something that everyone gets access to! Thanks for bringing this to my attention, and I'm looking forward to hearing more feedback from you.