
The founder of 3Taps posted this on Quora last year, in response to a question about disrupting Craigslist:

... the postings in question are public facts about exchanges. Just as any price/supply/demand in a marketplace is open for any and all to notate and republish, so too is the entire set of Craigslist data -- as these offers between seekers and providers are clearly in the public domain. Historically, Craigslist has attempted to block access by others to the comprehensive use of this data. They block many 3rd parties who try to gain access to the data, and sometimes threaten to sue and bankrupt others as if they themselves created the underlying data and hold copyright like property rights over the same.

But public facts are public property. And while some think that predatory Terms of Use demanding that you hand over the Brooklyn Bridge in liquidated damages if you don't comply with some obscure (and potentially constitutionally void) constraint will stand in court -- such absurdities will break if exposed.

A fear, uncertainty, and dread approach over access to data breaks down in a world where Google already indexes all of Craigslist data and caches that information all over the internet (for search performance results). If it's possible and legal for Google, then why not for any and everyone else to also index and offer access to the same data? In short, Google doesn't get special secondary property rights to privatize public data to the exclusion of anyone else. Equal access to exchange data and search data is a principle in parallel to the notions of net neutrality.

The points above are not a theoretical discourse. Look at 3taps.com/developers to see the execution of this concept. And look at what a 3rd party application (craiggers.com) can do in recreating the whole of Craigslist in a format that gives access to data in a way that is not remotely possible in the legacy Craigslist offering. Craiggers is a perfect example that the function of displaying Craigslist data (rather than gathering it) is a totally distinct (and competitive) marketplace, even if there are still huge network effects in the gathering of Craigslist postings.

Note, Craiggers does NOT disrupt the existing Craigslist revenue model for Craigslist. It simply opens up the field (along with any other developer building on 3taps assisted access to Craigslist data) that wants to build on top of (rather than compete with) the network effects of Craigslist. Think Kayak and Indeed, but now for the whole body of data covered by Craigslist accessible, rather than just a single vertical.

http://www.quora.com/Craigslist/Why-hasnt-another-product-di...



His post is chock full of BS:

> If it's possible and legal for Google, then why not for any and everyone else to also index and offer access to the same data?

Because Craigslist isn't suing Google. If they didn't want Google indexing their stuff, Google would comply.

> Google doesn't get special secondary property rights to privatize public data to the exclusion of anyone else.

They can and do. That's why it's beneficial to set your user agent to Googlebot when browsing news sites.

> Equal access to exchange data and search data is a principle in parallel to the notions of net neutrality.

Full of shit. It takes effort to provide the service that Craigslist does. Claiming that you have a right to that data is wrong, and moreover has already been decided by case law (you can find links in other comments on this page).

> And look at what a 3rd party application (craiggers.com) can do in recreating the whole of Craigslist in a format that gives access to data in a way that is not remotely possible in the legacy Craigslist offering.

There is a reason Craigslist doesn't make that data available - it would reduce the utility for sellers of their website, reducing their revenue and potentially drying up their business. Trying to squeeze it out of Google's cache is still copyright infringement.

> Note, Craiggers does NOT disrupt the existing Craigslist revenue model for Craigslist.

Yes, it does. By scraping craigslist in an attempt to undermine their platform you are eliminating their site's relative utility for buyers, which eliminates the impetus for sellers to list there.


All of your points rely upon the assumption that Craigslist "owns" all of the posts submitted. I'm not saying that's right or wrong, but if that is true then wouldn't that extend to Facebook owning all content submitted to their service, Twitter owning all tweets, Flickr owning all hosted photos, and Stack Overflow owning all submitted answers?


Craigslist owns the unique compilation of their listings. That's what is at stake here.


And there's no copyright infringement as long as your compilation, based on their publicly available data, is also unique, which PadMapper's is. This has lots of precedents dating back to services derived from phonebooks.


I am interested in this. Can you point me to one or more of these precedents? Thanks in advance.



Is it?

In my interpretation it is the content of the listings, not the compilation. He could compile the data from several different sources and present the same result.


What constitutes a "unique compilation" is subject to interpretation in a court of law.

Adding/removing/modifying and changing the arrangement or display of items in a dataset sufficiently constitutes a dataset unique from the one Craigslist offers even if it is largely derivative from the Craigslist dataset.

Craigslist could argue a trespass to chattels tort or file a ToS civil suit, but there isn't much they can do to protect a dataset.

The reason Facebook makes Facebook content largely available only to those who are logged in is to hide behind their ToS and prevent scraping whether centralized or distributed.


As far as I know, in the US they don't.


So what you're suggesting is that if a service put "noindex" in its robots/metatags, they would somehow be overstepping the bounds of what they can do with their users' content?


Not at all... that's a bit of a strawman since robots.txt/metatags are used by search engines and not the public in general.

Not wanting your data on Google != Denying access to said content
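
For what it's worth, the distinction shows up in how robots.txt works: it is a per-crawler request, not an access control. Here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the URL, and the crawler name `SomeOtherCrawler` are all made up for illustration; none of this is Craigslist's actual file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt (an assumption, not any real site's file):
# it asks Googlebot to stay out and says nothing to anyone else.
HYPOTHETICAL_ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /
"""


def allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Placeholder URL for the example.
URL = "http://example.com/listing/123.html"

googlebot_ok = allowed(HYPOTHETICAL_ROBOTS_TXT, "Googlebot", URL)
other_ok = allowed(HYPOTHETICAL_ROBOTS_TXT, "SomeOtherCrawler", URL)

print("Googlebot allowed:", googlebot_ok)
print("SomeOtherCrawler allowed:", other_ok)
```

Nothing in such a file stops a browser or a scraper from fetching the page. It only tells well-behaved crawlers what the site would like them to do, which is why opting out of Google is not the same thing as denying access.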


> By scraping craigslist in an attempt to undermine their platform you are eliminating their site's relative utility for buyers, which eliminates the impetus for sellers to list there.

This is exactly what Craigslist will argue, but it's hard to imagine how that could be true if the eventual downstream destination of the data always sends the user back to Craigslist to finish the process.


I really don't get the whole argument and all this Craigslist bashing. It looks like someone built his house on somebody else's land and now tries to rally the internet because the owner of the land found out about the illegal house.

Is it not just like asking "why doesn't Google, eBay etc. let me scrape their database"? Just because someone found a nicer way to display, sort, and relate the data does not give them either a legal or a moral right to use the data.


The startups who are Craigslist bashing just have a vested interest in getting access to Craigslist content without any ToS. It is more like demanding that a property open up because there is gold on it and they are demanding that it be made available to them because they can serve gold to the market better than the land owner. From what I am reading though, the businesses are offering nothing in the way of compensation and this is more of a property grab.


I'm not disagreeing with you. I'm disagreeing with the foregone conclusion that a more liberal ToS would spell doom for Craigslist. A nicer way to display, sort, relate the data doesn't confer rights, but--as long as the end destination is Craigslist--I don't think providing access is necessarily a losing proposition for either party.


It doesn't matter. It's a copyright issue. Craigslist has the right to protect their copyright, it can be proven that they have a unique database, and it can also very likely be proven that the scrapers knew they were doing something illegal (e.g. willful infringement) when the scraping took place.


I agree, it doesn't matter, but I think the reasoning is probably wrong.

It would have been better for everyone if Craigslist gave away the data minus whatever is needed to finish the transaction, and included in the TOS that you must send the user back to Craigslist to actually finish whatever they're trying to do.


> It doesn't matter. It's a copyright issue. Craigslist has the right to protect their copyright, it can be proven that they have a unique database, and it can also very likely be proven that the scrapers knew they were doing something illegal (e.g. willful infringement) when the scraping took place.

As far as I know, there is no database copyright in the US.


And you'd be proven wrong by a simple Google search on "database copyright"... first hit [1] shows a compilation of laws and court rulings that support copyrighting collections of records... and this stuff has been around for a decade or more.

[1] http://www.bitlaw.com/copyright/database.html


Like Amazon's and Yelp's reviews, Craigslist would probably claim copyright on the posts themselves, not just the aggregation thereof.


Yes, but as far as I can tell, that gives them exactly bupkis on PadMapper.


I think you mean "bupkis." Dick Butkus is a hall-of-fame NFL linebacker.


Thanks. I want to get stuff like that right.


> They can and do. That's why it's beneficial to set your user agent to Googlebot when browsing news sites.

Jonathan, you can't be serious about this one. Why on Earth would you go on public record and suggest that someone illegally impersonate another company? Don't you know it may be a jailable offense if Google (or any other company, for that matter) decides to go after you?

> It takes effort to provide the service that Craigslist does. Claiming that you have a right to that data is wrong [...]

It is equally wrong to claim you have no right, just because it takes effort to provide Craigslist-level service.

> There is a reason Craigslist doesn't make that data available - it would reduce the utility for sellers of their website, reducing their revenue and potentially drying up their business. Trying to squeeze it out of Google's cache is still copyright infringement.

Uhm, providing one company's data in a different template would reduce said company's revenue? Excuse me, but under which rock have you been living for the last 20 years? Have you ever heard of "social sharing"? The "like" button? Does "API" ring a bell? Why do you think most companies provide those? Most would die for other developers to actually spend their time and effort to build front gates based on their data. I hope you are not working on any startup bro, because you seriously have a shitty point of view!

> Yes, it does. By scraping craigslist in an attempt to undermine their platform [...]

What do you mean by "undermine their platform"? Why would you automatically assume that all the OP wants to do is to kill or undermine someone's business? Further, he would have to be retarded to do so! Why would he work on a project whose core is based on external data and at the same time want to... kill that data source? You lack logic here, again.

> By scraping craigslist in an attempt to undermine their platform you are eliminating their site's relative utility for buyers, which eliminates the impetus for sellers to list there

Yes definitely -- A website where I could see all the pictures of furniture posted on Craigslist on one page, which gives me easy access to scroll down in less than 2 minutes (to actually see what I want to buy), versus spending 45 minutes clicking on each and every post on the original Craigslist website -- yes, definitely because that evil website stealing Craigslist data made me save 40 minutes I will never use Craigslist as either buyer or seller ever again.

What bullshit!


Holy shit, is this satire? If so, this is a perfect imitation of some of the commenters who go around here pretending that the entire world agrees with their minority viewpoint that neither property nor effort have any established value.

If this isn't satire, well, all I can say is I hope you build something awesome one day. I presume your above manifesto can be construed as an invitation to insert myself between you and the userbase you invested so much to accumulate?


If I have a product and a set of users, and you have a way of increasing both my revenue and my users' satisfaction by inserting yourself between us, please do. The OP's point is that undermining Craigslist makes no sense for Padmapper, since that's where the data comes from. If people who list apartments get them rented faster because of Padmapper, they like it. If people looking for apartments find them faster because of Padmapper, they like it. If it increases Craigslist's revenue, they like it. If all of these points are positive, it's hard to understand why Craigslist would be against it.

Please recognize the distinction. I'm not saying they're outside their rights to deny this access or that Padmapper is somehow within theirs by violating the ToS, just that if it's a win for everyone, shutting it down doesn't make sense.


Disintermediation -- getting your users used to coming to my site to look at your data is the first step to making you irrelevant and forgettable. PM knows who the renters and the listers are; it's a small step to convince some listers to list with PM first. Maybe PM will agree to repost on CL as well, but as long as listers come to PM first, the relationship between PM and CL is flipped.

Eric might protest that all he wants is to save apartment hunters from minutes of work. The reality is that PM's long-term viability depends on him getting access to some of that listing fee cash, which almost definitely means reducing CL's revenue. Unless you think he's going to convince agents to pay twice to get in front of an overlapping set of viewers.


> PM knows who the renters and the listers are; it's a small step to convince some listers to list with PM first.

You need to learn a little bit about Craigslist. Every single section of the site has been copied over and over again, including by big names like Angie's List, eBay, etc. Yet it's been so many years and Craigslist traffic and revenue continue to grow.

CL has such an incredibly solid network effect that most users, me included, couldn't care less how awesome a website for quickly browsing CL photos becomes. The moment they go solo and decide to disconnect is the moment I come back to CL to post/browse. Simple.


I still think it would be possible to work this out with a more creative ToS. Problem one could be managed by making it part of the ToS that you won't aggregate from other sites. Problem two could be managed either by making it part of the ToS that you don't post your own listings, or by making you pay to have them listed at Craigslist at the same time.


It sounds great on paper, but it's not going to stop Craigslist from C&Ding all the developers using the 3Taps API. IANAL and not sure they even could indemnify their customers, but I know I'd want more than a reasoned argument and bold words from my upstream provider if I were to build a business on the 3Taps-provided Craigslist API.


I know the 3Taps people pretty well. They're seasoned entrepreneurs and they definitely know what they're doing with the 3Taps API. One of the founders is a grey-haired attorney who's thought this through pretty carefully. They're also prepared to handle any legal issues that arise, both because they need to be and because it would be the best PR ever.


As gray-haired as they might be, they come across as rather morally bankrupt and I can't imagine this ending remotely well for anyone in this industry.

Either Google will have to begin applying constraints to caching, or the websites will have to apply constraints to Google indexing, or the websites will have to enter into an arms race with the likes of 3taps.

The end result will be a great deal of wasted money, and if 3taps is actually successful, the decimation of the businesses whose data they rely on.

Then what? Is 3taps going to build another Craigslist, for free, and open up their data silo to the internet?


I think this is great. Ownership of user contributed content is a big open question. Now everything is becoming social and crowd-sourced. If this pushes the question to the forefront so an actual judgment is made it will put businesses on some sort of solid ground. Right now a lot of people are building on potential quicksand.

I like Flickr's approach. Users own the content and define the copyright. Flickr provides an open API and a TOU which makes it tricky and/or illegal to source the content directly from their servers.

If a billion dollar business relies on people giving them content for free and them keeping the copyright and re-selling the content, then I think maybe they're the morally bankrupt ones.

APIs are everywhere now and API use is becoming quite a big sticking point. I think content API platforms, including a lot of the big names (Facebook, Yelp, Twitter, Foursquare, etc.), are the future of the Internet. But yes, someone needs to figure out how to monetize being a content platform (in a non-Wikipedia way). Assuming you have the right to take the content, resell it and own the copyright seems like a weak assumption that has only worked so far while people were less savvy about it.


Yes, but are they prepared to handle everyone else's legal issues too? Is doing so in the agreement between them and Padmapper? I'm sure they're great guys, but that doesn't obligate them to protect Padmapper from Craigslist.


Or more likely CL will just tell Google to stop caching their pages.


The argument hinges on Google getting access to Craigslist listings while others are denied. However, that doesn't seem to be true:

http://www.craigslist.org/robots.txt

https://www.google.com/search?q=site%3Aseattle.craigslist.co...

Update: The comment you quoted is a year old and Craigslist only recently blocked Google. http://www.seomoz.org/blog/craigslist-blocks-search-engines Perhaps they decided to block everyone to bolster a legal case against aggregators like Padmapper.


> ... the postings in question are public facts about exchanges. Just as any price/supply/demand in a marketplace is open for any and all to notate and republish, so too is the entire set of Craigslist data -- as these offers between seekers and providers are clearly in the public domain.

> But public facts are public property.

Are they though? In the absence of any terms that specify otherwise, I would expect to retain copyright over anything I post to Craigslist. To play devil's advocate, maybe I wrote a literary masterpiece or uploaded some great photos trying to sell my [random item] - just because I put them on Craigslist doesn't mean I'm releasing them into the public domain. For whatever reason, I may want to retain control over my post, and only give permission to Craigslist to use it.

To take this "public domain" argument to the extreme, does this mean any pictures I put up on Craigslist can be used royalty-free by anyone for any purpose, forever? That's absurd.

Edit: most of the arguments here seem to be around whether Craigslist owns the listings, but I think that's missing the point. The user owns the listing, and none of the aggregators have any way to get the users' consent before reusing their listings. In most cases, yes, of course I'm happy to have my listing show up on aggregators and increase its exposure, but that's not really the point - the point is end users have no control over it.


> does this mean any pictures I put up on Craigslist can be used royalty-free by anyone for any purpose

I think what he's saying is that the fact that you are publicly selling your picture is in the public domain, a public fact about the economy, and unmonopolizable by any one private subject.


They can be linked to royalty-free forever, because the "publisher" is the server hosting the data.

Even if they did copy and host the image themselves, they would be liable but still be okay because the burden is on the user who owns the data to file a DMCA takedown notice for that content. The worst Craigslist could do here is inform the users (who own the content) that their listing is being shown on Site X and provide a link to the page where the user can file a takedown notice. I don't know about you, but I probably would be happy that the listing is getting more exposure on more sites. I think most users would.

Markets are natural monopolies. Liquidity begets liquidity. The breaking down of these silos is a great thing for price discovery and a better functioning market. The creation of decentralized markets where all listings are available and discoverable is really only a matter of time. When that happens buy and sell offers will be fully free and the idea of a centralized site with free listings like Craigslist will be a quaint notion and we'll all be better off because of it.


Information wants to be free, brother!



