I don't know how they are doing it, but Google Scholar does not have an API, and scraping is against their TOS.
> Don’t misuse our Services. For example, don’t interfere with our Services or try to access them using a method other than the interface and the instructions that we provide.
Despite this, there is scholar.py [0], which can extract files from Google Scholar, though it explicitly doesn't work around the rate limits.
> or try to access them using a method other than the interface
Unless this actually exploits something and hacks into Google's servers to get to the content, which would be something quite different, it wouldn't really be distinguishable from someone manually visiting the site in a browser, volume aside.
IMHO the pervasive attitude today of somehow requiring permission or an explicitly sanctioned "API" to access what is otherwise publicly accessible data is rather troubling for the freedom and flexibility of the Web as a whole. It encourages walled-garden content models and centralisation.
I absolutely agree. In my view, if something is publicly accessible, the public should be able to use it as they see fit. (An HTTP response has already authorised you to copy the data to a machine. How can it be bound by a TOS that you need to access the original page to find?)
However, Google doesn't agree and the current court precedent doesn't either. So I tried to address the parent's concern from that viewpoint.
[0] https://github.com/ckreibich/scholar.py