Personally, I would like to see the triple tap and hold gesture, which was introduced to describe images, become more accurate, as its behaviour is often inconsistent. Maybe it could also describe .jpg and .png files?
Another thing I’d like to see is better support for webpages that have embedded video content or images, as these can currently cause VoiceOver to freeze up.
One other thing is more accurate dictation. I’m not sure if it’s just my accent, but dictation doesn’t seem to get most of my words right, and sometimes misses words out entirely.
What other features would you like to see if VoiceOver were to be updated?