Another common unintended consequence I've seen is conflating crawling and index...

rafaelm · 2025-07-17T17:36:29 1752773789

There are cases where Google finds a URL that is blocked in robots.txt (through external or internal links), and the page can still be indexed and show up in the search results even though Google can't crawl it [1].

The only way to be sure that it stays out of the results is to use a noindex tag, which, as you mentioned, search engine bots need to "read" in the code. If the URL is blocked, the "noindex" cannot be read.
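
To make the interaction concrete, here is a minimal sketch using Python's standard-library `urllib.robotparser`. The helper name `noindex_is_readable`, the robots.txt rules, and the example.com URLs are all made up for illustration, and this only models the generic robots.txt check, not how Google's own crawler is implemented. The point is simply that a URL the crawler may not fetch is a URL whose noindex tag it never gets to see.

```python
from urllib.robotparser import RobotFileParser


def noindex_is_readable(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a crawler obeying robots.txt may fetch `url`.

    Hypothetical helper: a noindex tag lives in the page itself, so it can
    only be honored when the page is allowed to be crawled.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse rules from text, no network
    return parser.can_fetch(user_agent, url)


# Illustrative rules (assumption): everything under /private/ is disallowed.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

# Blocked from crawling: a noindex tag on this page would never be read,
# yet the URL could still be indexed if other pages link to it.
blocked = noindex_is_readable(
    ROBOTS_TXT, "Googlebot", "https://example.com/private/page.html"
)

# Crawlable: a noindex tag on this page can be read and honored.
allowed = noindex_is_readable(
    ROBOTS_TXT, "Googlebot", "https://example.com/public/page.html"
)
```

So to keep a page out of the results, it would need to be crawlable (not disallowed in robots.txt) and carry the noindex tag.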

[1] https://developers.google.com/search/docs/crawling-indexing/... (refer to the red "Warning" section)

EPendragon · 2025-07-17T16:28:08 1752769688

It is an interesting tidbit. I personally don't need Google to remove it from the index. It is more of an "I don't care if they index it". I mostly care about the scraping, not the indexing. I do understand that these terms could be used interchangeably. In the past I might have conflated them.