I constantly find myself in debates about indexing vs. scraping of real estate listings from websites. I have argued until I am blue in the face, and the debate goes on. To make my case that indexing is scraping, I have put together this summary and video. The discussion matters because it exposes two conflicting requirements in MLS rules and regulations. Specifically, the rules for IDX websites say that the website must try to prevent scraping, but that indexing by recognized search engines is OK. I say potato, potahto. They are the same thing, and you cannot allow indexing without allowing scraping. From a technology perspective they are, in essence, identical, although they differ in intent.
To see if your IDX listings are being “indexed” by a recognized search engine, go to google.com and type in a search term of the form site:domain.com, for example site:mlslistings.com. If the results run to thousands of pages, your listings are being indexed. Go to page 5 or 6 of the search results and you will begin to see listing information appear.
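If you want to check several domains, the same search can be written as a URL. The snippet below is a minimal sketch in Python; the function name `site_search_url` is made up for illustration, and it assumes Google's standard `/search?q=` address format. It only builds the link, which you then open in a browser.

```python
from urllib.parse import urlencode


def site_search_url(domain: str) -> str:
    """Build the Google search URL for the query site:<domain>."""
    # urlencode percent-encodes the colon, so "site:x" becomes "site%3Ax".
    return "https://www.google.com/search?" + urlencode({"q": f"site:{domain}"})


# Example from the text above.
print(site_search_url("mlslistings.com"))
```

Swap in your own domain to see what has been indexed from your site.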
To prove that the data is being scraped, that is, taken from your website and stored on the server of a third party (in this case Google), click the blue link that says Cached. “Cached” is fancy language for “stored on our servers.”
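To show why the two are the same thing technically, here is a rough sketch in Python. It is not how Google or any particular scraper is actually built; every name in it (`fetch`, `copy_page`, `build_index`, the example.com URL, the sample listing) is hypothetical. The point is that both an indexer and a scraper have to run the same first steps: download the page and keep a copy. Building a search index is just something extra done with that copy.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class TextExtractor(HTMLParser):
    """Collect the text found between the tags of an HTML page."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        if data.strip():
            self.chunks.append(data.strip())


def fetch(url, user_agent):
    """Download a page. The User-Agent header is the only self-identification."""
    request = Request(url, headers={"User-Agent": user_agent})
    with urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def copy_page(url, html, store):
    """Store a full copy of the page. Indexers and scrapers both do this."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    store[url] = {"html": html, "text": " ".join(parser.chunks)}
    return store[url]


def build_index(store):
    """The extra step an indexer takes: map each word to the pages containing it."""
    index = {}
    for url, page in store.items():
        for word in page["text"].lower().split():
            index.setdefault(word, set()).add(url)
    return index


# Offline demo with a made-up listing page, so nothing is downloaded here.
# A live run would get the HTML from fetch(url, "SomeBot/1.0") instead.
sample_url = "https://example.com/listing/1"
sample_html = "<html><body><h1>123 Main St</h1><p>3 bed, 2 bath</p></body></html>"

store = {}
copy_page(sample_url, sample_html, store)  # the "cached" copy
index = build_index(store)                 # the searchable index

print(store[sample_url]["text"])
print(sorted(index))
```

Nothing in `fetch` or `copy_page` knows whether the caller is a search engine or a scraper. The only difference is what is done with the stored copy afterwards, which is a matter of intent, not technology.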
Conclusion: Indexing is scraping. Listings are being copied from websites without a license or agreement.