Free OCR App Voice Major Update
UPDATE - Here is a video of a blind user demonstrating Voice: https://www.youtube.com/watch?v=RAqbwxij5xA
I am a high school student and app developer and I posted about Voice earlier in this post: http://www.applevis.com/forum/ios-ios-app-discussion/free-new-ocr-app-c….
For those who don't know, this app is free and I plan to keep it free forever. I welcome any feedback so that I can further improve Voice. I am not blind or low vision, but I really want to help the community. So if you could take the time to give any feedback or suggestions, I would really appreciate it.
Last time I posted, I got some really insightful and helpful feedback and I utilized that feedback to improve Voice.
So here are some of the new features of Voice that were added in the update:
1. Now Voice can detect and read in over 30 languages!
2. Most, if not all, of the VoiceOver incompatibilities have been fixed (including the annoying introduction at the beginning). So, Voice is fully compatible with VoiceOver!
3. Export your detected text! You can now export it as a PDF, PNG, or TXT file, and you can send it to Dropbox, Google Drive, OneDrive, Email, and many more third-party services!
4. Improved field of view report and document detection.
5. iPhone vibrates when four corners are detected.
6. Picking photos from the iPhone photo library is now compatible with VoiceOver.
7. Made OCR processing faster.
8. Better automatic photo capture.
9. And other small improvements here and there.
The supported languages are Arabic (ar-SA), 3 types of Chinese (zh-CN, zh-HK, zh-TW), Czech (cs-CZ), Danish (da-DK), 2 types of Dutch (nl-BE, nl-NL), 5 types of English (en-AU, en-GB, en-IE, en-US, en-ZA), Finnish (fi-FI), 2 types of French (fr-CA, fr-FR), German (de-DE), Greek (el-GR), Hebrew (he-IL), Hindi (hi-IN), Hungarian (hu-HU), Indonesian (id-ID), Italian (it-IT), Japanese (ja-JP), Korean (ko-KR), Norwegian (no-NO), Polish (pl-PL), 2 types of Portuguese (pt-BR, pt-PT), Romanian (ro-RO), Russian (ru-RU), Slovak (sk-SK), 2 types of Spanish (es-ES, es-MX), Swedish (sv-SE), Thai (th-TH), and Turkish (tr-TR).
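For anyone who wants to tally the list above, here is a small Python sketch that groups the locale codes by their language prefix. The variable and function names are illustrative only and are not taken from Voice itself.

```python
from collections import defaultdict

# Locale codes copied from the list of supported languages above.
LOCALES = """ar-SA zh-CN zh-HK zh-TW cs-CZ da-DK nl-BE nl-NL
en-AU en-GB en-IE en-US en-ZA fi-FI fr-CA fr-FR de-DE el-GR
he-IL hi-IN hu-HU id-ID it-IT ja-JP ko-KR no-NO pl-PL pt-BR
pt-PT ro-RO ru-RU sk-SK es-ES es-MX sv-SE th-TH tr-TR""".split()

def group_by_language(locales):
    """Map each base language code (e.g. 'en') to its list of regions."""
    groups = defaultdict(list)
    for code in locales:
        language, region = code.split("-")
        groups[language].append(region)
    return dict(groups)

groups = group_by_language(LOCALES)
print(len(LOCALES), "locales across", len(groups), "base languages")
```

Counting each regional variant separately gives 37 entries, which is where the "over 30" figure comes from; grouped by base language there are 27.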
Here are some of the already existing features of Voice:
1. Fully compatible with VoiceOver.
2. Field of view report where it detects whether the document is in view, and if it is, it will say "Four corners detected".
3. If the automatic capture feature is turned on in settings, the app will find the document using the camera, and automatically take a picture when the document is in view, without any user interaction.
4. If an image is taken at a slight angle, it will fix the perspective distortion and align the image properly.
5. If the image is of a curved surface, it will adjust the image to straighten it.
6. Book mode which allows the user to take multiple photos, and it will read the photos one after another. The great thing with this feature is that while the first photo is being read, the second is being processed and so on, so it does not use additional processing time.
7. If the image is too bright, or not bright enough, the app will correct the brightness of the image before processing.
8. Vertical and horizontal column detection for reading different columns in a newspaper; the app does this automatically.
9. Only a 3 megabyte install size.
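The overlap described in item 6 (reading one page while the next one is being processed) can be sketched in Python as follows. This is only an illustration under assumptions: `recognize` and `speak` are hypothetical stand-ins, not Voice's actual functions, and the real app presumably does this in native iOS code.

```python
from concurrent.futures import ThreadPoolExecutor

def recognize(photo):
    # Hypothetical stand-in for the OCR step.
    return f"text of {photo}"

def speak(text, spoken):
    # Hypothetical stand-in for text-to-speech; records what was "read".
    spoken.append(text)

def read_book(photos):
    """Read pages in order while the next page is processed in the background."""
    spoken = []
    if not photos:
        return spoken
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(recognize, photos[0])
        for next_photo in photos[1:]:
            text = pending.result()                       # wait for the current page
            pending = pool.submit(recognize, next_photo)  # start on the next page
            speak(text, spoken)                           # read while it is processed
        speak(pending.result(), spoken)                   # last page
    return spoken
```

Because recognition of page N+1 is submitted before page N is spoken, the only wait the listener notices is for the first page, as long as recognition is faster than reading.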
Also, since many people were concerned with privacy in the last post, let me clarify it.
The OCR engine uses an internet connection to work. I am not using the offline Tesseract engines for OCR that Prizmo and KNFB Reader use because Tesseract produces much worse OCR results. So an internet connection is required for the app to work. The photos are sent to a server for detection. But the photo is encrypted so no one is able to gain access to the photo, including me. Once the photo has finished processing, it is deleted from the server. Then an offline text-to-speech engine reads the text. Once you decide to go back to the main menu after Voice has finished reading, both the text and the photo are permanently deleted from the device.
So those are some of the features of Voice.
Here is the link to the app on the iTunes App Store: https://itunes.apple.com/us/app/voice-take-picture-have-it/id903772588?…
My email is firstname.lastname@example.org
I hope you find this application helpful. I am open to any suggestions and even though I may not reply to email very quickly, I do read all the emails and suggestions and I do try and bring what you want the most into Voice. Thank you for reading.
I like your app, and I'm wondering if you could create an OCR-BASED screen reader that could tell us through OCR what we're on? There are many apps that just can't be read with VO, and something like this would augment VO a lot in some apps.
Hi there, would it at all be possible to add a video mode to Voice, rather like Goggles? This mode would be used to read displays on modern devices such as coffee machines, washing machines, MIDI keyboards, etc. in real time so that we could work with those. That would be a killer feature in my opinion.
I read your mention with regards to the offline Tesseract engine and the fact KNFB Reader utilises it. Considering you said it delivers worse results, does this mean KNFB Reader delivers inferior results compared to Voice? If so, it seems the premium price is not worth the results, contrary to all the positive feedback in a recent thread. If this delivers more precise OCR output than KNFB Reader, everybody who uses KNFB Reader should theoretically have more success with Voice, provided the device and lighting conditions are identical.
Great work on this app. Very useful for many people.
One suggestion: although the OCR engine on your server might give superior results, one interesting idea is to have the Tesseract engine available offline to process the material on the phone itself for folks who either don't have immediate access to WiFi or are still concerned about privacy. This could be a user setting and would at least give users a fallback in case your server is down or WiFi is not available.
I don't know how hard that would be to implement as an option.
Anyway, keep up the good work.
Currently Voice is not able to do that, but an alternative that you could use would be to take a screenshot of whichever app you want to be read, and select the image using Voice. That way, Voice can use its OCR to read to you all the elements on the screenshot. This is a simple workaround for the time being.
Apps like KNFB Reader and Prizmo use the Tesseract engine. Although the results on two identical raw images may be the same, after a variety of image processing, there can be differences in the OCR quality.
What I mean is... Take Voice for example. Before actually doing the OCR, Voice de-skews the image, fixes the lighting, auto-crops the image, and does various other alterations to the image. This is to achieve optimal recognition even if the user takes a bad photo. Although I do not know exactly what image processing algorithms KNFB Reader and Prizmo use, I am sure they run some fixes on the image before the actual OCR processing. Since I do not know exactly what those apps do, I can make no claims about whether or not these apps would provide better results.
What I am saying, however, is that the offline Tesseract engine provides worse results than the OCR engine that is in Voice. Maybe KNFB Reader uses image processing to make the image better before the Tesseract OCR reads it. But for identical unadulterated images, Tesseract would provide inferior results.
Additionally, I believe the price that KNFB charges is very high compared to a similar availability of features in apps such as Voice. And Voice is free. The reason I think people write such great reviews about KNFB Reader is that anyone who spent such a great amount of money on it is going to be very hesitant to give it a bad review, even if they are extremely unhappy with it, because that would damage their justification for buying the app in the first place.
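As a rough illustration of what a "fixes the lighting" step can look like, here is a minimal Python sketch that rescales a grayscale image so its average brightness lands on a target value. This is an assumed, simplified technique (a plain gain adjustment), not the algorithm Voice actually uses; the function name and the target value of 128 are made up for the example.

```python
def normalize_brightness(pixels, target_mean=128):
    """Scale 0-255 grayscale values so their mean is near target_mean.

    `pixels` is a list of rows, each row a list of integers in 0..255.
    """
    flat = [v for row in pixels for v in row]
    if not flat:
        return [row[:] for row in pixels]  # empty image; nothing to do
    mean = sum(flat) / len(flat)
    if mean == 0:
        return [row[:] for row in pixels]  # all black; a gain cannot help
    gain = target_mean / mean
    # Clamp to 255 so bright pixels do not overflow after scaling.
    return [[min(255, round(v * gain)) for v in row] for row in pixels]
```

A too-dark photo gets a gain above 1 and a too-bright photo gets a gain below 1, so both end up closer to the same brightness before OCR runs.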
I just think everyone should save the hundreds of dollars that all these other apps are exploiting you to buy. Great and helpful technology should be kept free and available in the hands of everyone. Especially when so many people need it. Although I go to school, I will continue to make any updates to Voice whenever I can to try and make your experience using Voice better.
I was actually planning on incorporating that in the next update. As you said, this would provide the best alternative for people that have privacy concerns. Even though Voice is completely secure, some people may still choose to use the offline version. Thanks for your suggestion!
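A setting like this could boil down to a small decision function. The sketch below is in Python with made-up names (`choose_engine`, `prefer_offline`, `network_available`); it is not how Voice is implemented, just one way the fallback could be expressed.

```python
def choose_engine(prefer_offline, network_available):
    """Pick which OCR engine to use for the next photo.

    prefer_offline:    hypothetical user setting for privacy-minded users.
    network_available: whether the server can currently be reached.
    """
    if prefer_offline or not network_available:
        return "offline"  # on-device engine, e.g. Tesseract
    return "online"       # server-side engine
```

With this shape, the offline engine serves both groups mentioned in the suggestion: people who opt out of uploads, and anyone who is temporarily without a connection.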
Thank you so much for your suggestion! I will look into doing what you stated as a possibility for the next update.
Hello, Shalin! First of all, thank you so much for your great job! And I have one question: how can I force Voice to detect a non-English language (in my case it's Russian)? I see Russian in the languages section of settings and check it by double tapping, but when I try to recognize any Russian text, even in screenshots, the app says nothing or just slashes, asterisks and so on. And when I go to settings again, Russian is not checked. Perhaps I am doing something wrong?
Is it going to work if I install it on an iPad running the beta version of iOS 9? Or maybe this is a question you would like answered too.
Hello, I congratulate you for your choice in this app. However, when I run the app, after the tutorial, I see an empty screen. The phone acts like it is freezing and I can't do anything except get out of the app via the app switcher.
I tried Voice a couple of days ago with my iPhone 6+, but I am running into problems. When I have taken a photo of some mail I want to scan and read, even though Voice claims that 4 corners are detected, either nothing gets read at all or there's a message saying something like "No words are detected on this page, I'm moving on". That message is OK, but as I said, sometimes Voice won't speak at all. How should I be holding the phone in order for text to be detected? This is just my opinion and it can very well be wrong, so please correct me if it is, but I think one reason KNFB Reader gets rave reviews is that it actually seems to be very forgiving when you try taking an image. I so want this app to work well for me, so any tips on how best to use it are welcome.
Hi. Congratulations. Let me ask you a question.
Does this app work on the iPad?
Do you need a flash for this app to work on the iPad?
Does it only work on the iPhone, or does it support the iPad?
Hi. I actually got Voice to speak, and I noticed that if you hold the phone too high, you won't get anything that Voice can recognize. Now centimeters, millimeters, inches and stuff is something I'm worthless at translating, so I can't give you a hint on how high you should hold the phone, but it seems to work well if you hold it quite low. One thing I also noticed during my experiments today is that if you have a paper that has been folded, and the line or edge or whatever we might call it is sharp or maybe even not so sharp, Voice will identify it as a corner, so a big document that had once been folded could end up becoming 2 documents. I hope this made sense at all despite my not so good English. :-)
I just updated to the new version of Voice.
I am using an iPhone 6 with the latest iOS release.
Problem: When I double tap to start the Voice app, the app does not seem to open, but I am left on the home screen. Interestingly, the Voice app does appear in my list of running apps.
This did not happen in the last version of Voice. When I double tapped, the app opened as it should.
I deleted the Voice app and re-installed it from the App Store, but I still get the same result, i.e., the app does not open but appears to be running in the list of running apps.
I finally re-booted the phone, but that did not help either.
Has anyone else seen this problem? Any ideas or suggestions?
I have the exact same problem on an iPhone 4s.
I'm not sure if Voice already does this or not, but I think Voice needs a field of view report similar to KNFB Reader. This would tell you if the page was correctly aligned to take the best possible picture. The four corners feature is nice, but what about telling if the page is properly oriented and whether all of the page will be taken with the picture. I think page orientation detection would be great since I do not always know if I'm placing the page in the correct orientation. I really commend you on your efforts and hope this improves as I don't feel like shelling out $100 just to read printed material. It comes back to the idea that I shouldn't have to pay extra to access the same material that you or other sighted people can access. Keep up the great work.
I'm wondering how many other folks are experiencing the problem of Voice starting and appearing in the app switcher but not actually opening so that it can be used. I deleted the app again and re-installed it on my iPhone 6 running iOS 8.4.1, and still, each time I double tap the Voice icon, rather than opening, VoiceOver says "Voice - double tap to open". But the app is running in the app switcher. I never saw this type of behavior before with any other app.
Any update on this problem, if others are seeing it too, and if the developer is aware of the problem and able to reproduce it?
Is there any way of making the device work without an internet connection? Let's say maybe for use at a grocery store.
Also, are there any alternatives on Google Play?