Glad to see this on the front page. It’s one of those posts I reread every now and then. Better yet, it’s written by Anna Patterson, who in addition to the search engines mentioned at the bottom wrote chunks of Cuil (interesting even if it failed) and worked on parts of Google’s index before Cuil and, I think, does again now.
Sadly it’s a little out of date. I’d love to see a more modern post by someone. Perhaps the authors of Mojeek, Right Dao or someone else running their own custom index. Heck, I’d pay for one by Matt Wells of Gigablast or those behind Blekko. The whole space is so secretive that, for those really interested in it, only crumbs of information are ever released.
Thanks for the mention @boyter. Maybe we should write something like "Why Writing Your Own Search Engine to Index Billions of Pages is Hard". It would make interesting reading about challenges we have overcome. For us it's not being secretive as much as finding bandwidth. Getting awareness is a massive challenge too, which is why we've written a lot more content in the last two years. I'm sure you are looking for something more meaty than this: https://blog.mojeek.com/2021/05/no-tracking-search-how-does-...
If you are into this space or just curious, the videos about BitFunnel, which forms part of the Bing index, are an excellent watch: https://www.youtube.com/watch?v=1-Xoy5w5ydM and https://www.clsp.jhu.edu/events/mike-hopcroft-microsoft/#.YT...