Hacker News
matheusmoreira
on March 21, 2018
on:
Sci-Bay: Google Scholar plus Sci-Hub
The page's HTML is the API. It's pretty easy to download a web page, parse the HTML and then extract specific bits of information from it. The browser does the same thing on the user's behalf, which is why it is called the user agent.
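To make the "download, parse, extract" step concrete, here is a minimal sketch using only the Python standard library. The `LinkExtractor` class, the `fetch` helper, the `example-scraper/0.1` User-Agent string, and the sample markup are all made up for illustration; a real page would need its own selectors.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class LinkExtractor(HTMLParser):
    """Collects (href, text) pairs for every <a> element that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> currently being read, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Parse an HTML string and return the links found in it."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


def fetch(url):
    """Download a page as text. The User-Agent value is a placeholder."""
    request = Request(url, headers={"User-Agent": "example-scraper/0.1"})
    with urlopen(request) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


# Hypothetical markup standing in for a downloaded results page.
SAMPLE_HTML = """
<ul class="results">
  <li><a href="/paper/1">First paper</a></li>
  <li><a href="/paper/2">Second <em>paper</em></a></li>
</ul>
"""

links = extract_links(SAMPLE_HTML)
```

With a live site you would pass the result of `fetch(url)` to `extract_links` instead of the sample string.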
hrasyid
on March 21, 2018
An API is a contract. HTML can be tweaked and become incompatible with your parser at the developer's whim.
_pfxa
on March 21, 2018
Oh luckily major APIs never change. /s
hrasyid
on March 24, 2018
Not as easily as an HTML page.
matheusmoreira
on March 21, 2018
That just means your code must be maintained. You can verify that the HTML has a given structure and log a failure if it doesn't.
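A rough sketch of that kind of check, again with only the Python standard library. The `StructureChecker` class, the `check_page` helper, the `REQUIRED` list, and the sample markup are assumptions for illustration, not anything a particular site guarantees.

```python
import logging
from html.parser import HTMLParser

logger = logging.getLogger("scraper")


class StructureChecker(HTMLParser):
    """Records every tag, and every (tag, class) pair, seen in the document."""

    def __init__(self):
        super().__init__()
        self.seen = set()

    def handle_starttag(self, tag, attrs):
        # A valueless class attribute yields None, hence the "or".
        classes = (dict(attrs).get("class") or "").split()
        self.seen.add((tag, None))
        for cls in classes:
            self.seen.add((tag, cls))


def missing_structure(html, required):
    """Return the required (tag, class) pairs that the page does not contain."""
    checker = StructureChecker()
    checker.feed(html)
    checker.close()
    return [item for item in required if item not in checker.seen]


def check_page(html, required):
    """Log an error and return False when the expected layout is absent."""
    missing = missing_structure(html, required)
    if missing:
        logger.error("page layout changed, missing elements: %s", missing)
        return False
    return True


# Hypothetical expectations: a <ul class="results"> and at least one <a>.
REQUIRED = [("ul", "results"), ("a", None)]

GOOD_HTML = '<ul class="results"><li><a href="/paper/1">First</a></li></ul>'
CHANGED_HTML = '<div class="hits"><a href="/paper/1">First</a></div>'

good_ok = check_page(GOOD_HTML, REQUIRED)
changed_ok = check_page(CHANGED_HTML, REQUIRED)
```

Running the check before extraction means a redesign shows up as a logged error instead of silently empty results.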
amelius
on March 21, 2018
Use Deep Learning to circumvent that.