Data Mining with Node.js

Posted on 16 Apr 2013 in Data Mining, Mapping, Scraping, Location, Node.js, Google Maps API, Mustache

If you have ever done any car shopping online, you've probably seen the badge and slogan "Website by". What you may not know is that this company is based right here in Vermont. About six months ago I was experimenting with scraping web pages with Node.js. It is pretty neat because you can use jQuery-style selectors to grab data from pages. I decided to put this to use and scrape for any website containing the text "Website by" to get an idea of how many dealerships use them. The code is pretty brutal and woefully out of date (Node v0.6), so to avoid confusion I am not going to post it. The wiki on GitHub is done well, so check that out for up-to-date info.

The basic workflow:

  1. Scrape Google, Bing, and Yahoo for search results containing "Website by" to get each dealer's URL
  2. Scrape the dealership sites to get each physical address
  3. Geocode each address to get the latitude and longitude to plot on the map below
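
Since the original code is not posted, here is a rough sketch of what the three steps above could look like in current Node.js. It is not the code that produced the map. To keep it dependency-free it uses regular expressions instead of jQuery-style selectors, and it leaves out the actual HTTP requests. The function names are made up for illustration, the assumption that dealer sites put their address in an `<address>` element is a guess, and the geocoding URL and response shape follow the Google Maps Geocoding web service as I understand it today, so check the current docs before relying on them.

```javascript
// Step 1: collect the distinct site origins linked from a search results page.
// Assumes plain <a href="http..."> links; real result pages are messier.
function extractResultUrls(html) {
  const origins = new Set();
  const linkPattern = /<a\s[^>]*href="(https?:\/\/[^"]+)"/gi;
  let match;
  while ((match = linkPattern.exec(html)) !== null) {
    try {
      origins.add(new URL(match[1]).origin);
    } catch (err) {
      // Skip anything that does not parse as a URL.
    }
  }
  return [...origins];
}

// Step 2: pull a physical address out of a dealership page.
// Hypothetical markup: the address lives inside an <address> element.
function extractAddress(html) {
  const match = /<address[^>]*>([\s\S]*?)<\/address>/i.exec(html);
  if (!match) return null;
  // Replace inner tags such as <br> with spaces, then collapse whitespace.
  return match[1].replace(/<[^>]+>/g, ' ').replace(/\s+/g, ' ').trim();
}

// Step 3a: build a geocoding request URL for an address.
// apiKey is a placeholder for your own key.
function buildGeocodeUrl(address, apiKey) {
  const url = new URL('https://maps.googleapis.com/maps/api/geocode/json');
  url.searchParams.set('address', address);
  url.searchParams.set('key', apiKey);
  return url.toString();
}

// Step 3b: read latitude and longitude from a parsed geocoding response.
// Returns null when the lookup did not succeed.
function parseGeocodeResponse(body) {
  if (!body || body.status !== 'OK' || !body.results || body.results.length === 0) {
    return null;
  }
  const { lat, lng } = body.results[0].geometry.location;
  return { lat, lng };
}
```

In a real run you would fetch each search results page, pass its HTML to `extractResultUrls`, fetch each returned site and pass it to `extractAddress`, then request `buildGeocodeUrl(address, apiKey)` and hand the parsed JSON to `parseGeocodeResponse` to get a point for the map.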

Again, this was six months ago, so this data is probably out of date.