September 2, 2006
MarkBernstein.org
 

Screen Scraping

What is the state of the art in screen-scraping APIs -- tools for grabbing nuggets out of Web pages? If you know, email me. Thanks.

Update: a new Web service, Dappit, is widely mentioned. Screen scraping bothers some people, just as linking does. Specious republication is a problem, but it's naive to suppose that spammers can't write their own Perl scripts. Meanwhile, people need ways to have their agents (human or automatic) go out and get the information they require.

Anthracite is an application-level screen scraper. Perl and Python have interesting scraping libraries as well.
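
For a sense of what "grabbing nuggets" looks like at the library level, here is a minimal sketch that uses only Python's standard html.parser module. It is not how Anthracite or Dappit works. The names LinkScraper, scrape_links, and SAMPLE are made up for this illustration, and the sample page is hypothetical; a real agent would fetch its page over HTTP first.

```python
from html.parser import HTMLParser


class LinkScraper(HTMLParser):
    """Collect (href, link text) pairs from an HTML page.

    Hypothetical helper for illustration; not part of any library.
    """

    def __init__(self):
        super().__init__()
        self.links = []      # finished (href, text) pairs
        self._href = None    # href of the <a> we are inside, if any
        self._text = []      # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        # Start collecting text when we enter an <a> that has an href.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href = href
                self._text = []

    def handle_data(self, data):
        # Text inside nested tags (<b>, <i>, ...) arrives here as well.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        # Closing </a>: store the link and stop collecting.
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def scrape_links(html):
    """Return every (href, text) pair found in the given HTML string."""
    scraper = LinkScraper()
    scraper.feed(html)
    scraper.close()
    return scraper.links


# Hypothetical sample page, standing in for a fetched document.
SAMPLE = """
<html><body>
  <h1>Notes</h1>
  <p>See <a href="http://example.com/one">the <b>first</b> note</a> and
     <a href="http://example.com/two">the second</a>.</p>
</body></html>
"""

for href, text in scrape_links(SAMPLE):
    print(href, "->", text)
```

The parser is event-driven: it never builds a tree, it just reacts to tags as they stream past. That keeps the sketch short, but anything more ambitious than pulling out links probably calls for one of the dedicated scraping libraries.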