I don't know how they are doing it, but Google Scholar does not have an API, and...

userbinator · on March 21, 2018

or try to access them using a method other than the interface

Unless this actually exploits something and hacks into Google's servers to get to the content, which would be something quite different, it wouldn't really be distinguishable from someone manually visiting the site in a browser, volume aside.

IMHO the pervasive attitude today of somehow requiring permission or an explicitly sanctioned "API" to access what is otherwise publicly accessible data is rather troubling for the freedom and flexibility of the Web as a whole. It encourages walled-garden content models and centralisation.

shakna · on March 21, 2018

I absolutely agree. From my viewpoint, if something is publicly accessible then the public should be able to use it as they see fit. (An HTTP response has already authorised you to copy the data to a machine. How can it be bound by a TOS that you need to access the original page to find?)

However, Google doesn't agree, and neither does current court precedent. So I tried to address the parent's concern from that viewpoint.

TeMPOraL · on March 21, 2018

Yup. I don't believe web hosts should be entitled to that much control.

My browser is my User Agent. The way it renders or interprets the data is my business.

crispyporkbites · on March 21, 2018

HTTP is an interface with implicit instructions (especially if RESTful), provided by Google.