I agree completely with what you said, but think about why it is blatantly obvious to you. It's obvious because you know the web page is a document in which text is overlaid on images.
Imagine instead that you are viewing an image that contains some text in a photo viewer. Would you think to Ctrl+F that? I wouldn't. The end-user you're referring to doesn't know the difference between the two scenarios.