Description of App
Transform your iPhone into a visual assistant with real-time object description and visual understanding powered by on-device AI.
HuggingSnap brings smart vision AI to your iPhone. Using our SmolVLM2 vision model, the app understands what your camera sees in real time without sending any data to the cloud.
Just point your camera, ask a question or request a description, and HuggingSnap will identify objects, explain scenes, read text, and make sense of what you're looking at. It's helpful when shopping, traveling, studying, or just exploring your surroundings.
Key Features:
- Processes everything on your phone - your data stays private
- Works in real time with no delay
- No internet needed - works offline anywhere
- Easy on your battery
- Reads and translates text in images
- Describes scenes for better accessibility
- Search using your camera
- Choose what types of objects to recognize
HuggingSnap turns your iPhone into a helpful visual companion that sees the world with you!
Terms of use & privacy policy:
Terms of use: https://huggingface.co/terms-of-service
Privacy policy: https://huggingface.co/privacy
Comments
I don't need this but...
For those who do try it, or want another tool, I'd recommend contacting the dev and seeing where things go.
Their email is: [email protected]
Has potential…
If the developer can get the accessibility sorted, this could potentially be a great addition to the majority of tools that blind folks use on a daily basis. :-)
My App Store Review Where I Listed Some Suggestions
We should have the ability to:
• Select HuggingSnap from the Share sheet or access previously captured photos/videos within the app to have them described
• Know whether the front or back camera is currently selected
• View the size of the downloaded model, delete it and download other models to strike a balance between performance and quality, taking into account memory and hardware requirements
• Ask follow-up questions
• Take advantage of dictation/voice mode using the system voice, with customizable parameters, and even Siri integration, if applicable
• Have OCR functionality in multiple languages and scan PDFs
• Have HuggingSnap describe the content on the screen without having to take a screenshot and export it to HuggingSnap manually, to supplement VoiceOver’s Screen Recognition
• Teach HuggingSnap faces and objects and have them labeled by name wherever they’re encountered
• Explore images/scenes by touch (i.e., by moving the finger around the screen), and get audio cues in 3-D to get a better sense of the position/distance/depth of each object
• Enter a “system prompt” to be used for every description
A couple of notes:
1. Just copied the whole thing from the text field and I don't feel like adding all the HTML tags to convert it to a list, unless requested.
2. I figured out soon enough that we could already ask follow-up questions. Also, after submitting the review, I thought of suggesting that we be able to capture photos or videos in portrait or landscape mode.
Um, it is fully accessible
So I just downloaded the app and the buttons are labeled and the app seems to be accessible from my end.