[ajug-members] html / website screen scraper API?
Curt Smith
csmith at javadepot.com
Mon Feb 19 14:30:57 EST 2007
Greetings ajug'ers,
I need to scrape information from a dozen different public websites and
from incoming email in HTML format. The data is typically a table or
single values next to labels, but I'm sure it will get more complex. Some
of it will require logging in via custom login pages, handling cookies, etc.
There are two SourceForge projects: HtmlUnit and HttpUnit. Both look
suitable for scraping simple values and tables.
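To make the simple table case concrete, here is a minimal sketch that uses
only the HTML parser bundled with the JDK (javax.swing.text.html.parser),
not the HtmlUnit or HttpUnit APIs. The names TableScraper and extractRows
are made up for illustration, and the sketch assumes the page is already in
hand as a String and contains no nested tables.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper: collects the text of every table cell, row by row.
final class TableScraper {

    static List<List<String>> extractRows(String html) throws IOException {
        final List<List<String>> rows = new ArrayList<>();

        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            private List<String> currentRow;   // non-null while inside a <tr>
            private StringBuilder currentCell; // non-null while inside a <td>/<th>

            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                if (tag == HTML.Tag.TR) {
                    currentRow = new ArrayList<>();
                } else if ((tag == HTML.Tag.TD || tag == HTML.Tag.TH) && currentRow != null) {
                    currentCell = new StringBuilder();
                }
            }

            @Override
            public void handleText(char[] data, int pos) {
                // Only keep text that appears inside a cell.
                if (currentCell != null) {
                    currentCell.append(data);
                }
            }

            @Override
            public void handleEndTag(HTML.Tag tag, int pos) {
                if ((tag == HTML.Tag.TD || tag == HTML.Tag.TH) && currentCell != null) {
                    currentRow.add(currentCell.toString().trim());
                    currentCell = null;
                } else if (tag == HTML.Tag.TR && currentRow != null) {
                    rows.add(currentRow);
                    currentRow = null;
                }
            }
        };

        // The third argument tells the parser to ignore any charset declaration,
        // which is fine here because the input is already a decoded String.
        new ParserDelegator().parse(new StringReader(html), callback, true);
        return rows;
    }
}
```

Calling TableScraper.extractRows(html) should return one list of cell
strings per table row. Login pages and cookies are outside this sketch and
would need an HTTP client (or one of the libraries above) on top.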
Googling "scraping public websites" also turns up commercial APIs (the
links on the right side of the Google results page).
Does anyone have experience with, or opinions on, this technology or these APIs?
Thanks, Curt Smith