Screen scraping of 6 websites using open source

Completed. Posted Dec 26, 2008. Paid on delivery.

I require a very simple application that scrapes data from 6 different sites and creates XML output.

The program should use an open source scraping tool called Web Harvest (you can find it at: [url removed, login to view])

What I need from you is a set of Web Harvest script files that create a variable containing the XML, and a small Java application that executes a script and prints the XML (Example: [url removed, login to view]).

There should not be any code in the Java main method except running the script, passing parameter values, and printing the XML (all the logic and the creation of the XML will reside in the scripts).
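To illustrate how thin the Java side is meant to be, here is a minimal sketch. It is not the required deliverable: `ScriptRunner`, `ThinLauncher`, the `name=value` argument format, and the default `site1.xml` / `state=CA` values are all hypothetical names chosen for this example, and the placeholder runner only echoes its inputs so the sketch runs without the Web Harvest jar. A real runner would wrap the Web Harvest library call that executes a script and reads back the XML variable.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical abstraction over the scraping engine; not part of Web Harvest. */
interface ScriptRunner {
    /** Runs the script at scriptPath with the given parameters and returns the XML it produced. */
    String run(String scriptPath, Map<String, String> params);
}

final class ThinLauncher {
    /** Parses arguments of the form name=value, starting at index 'from', into an ordered map. */
    static Map<String, String> parseParams(String[] args, int from) {
        Map<String, String> params = new LinkedHashMap<>();
        for (int i = from; i < args.length; i++) {
            int eq = args[i].indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("Expected name=value but got: " + args[i]);
            }
            params.put(args[i].substring(0, eq), args[i].substring(eq + 1));
        }
        return params;
    }

    /** The only work done in Java: pass parameters in, get the XML out. */
    static String launch(ScriptRunner runner, String[] args) {
        if (args.length < 1) {
            throw new IllegalArgumentException("Usage: <script.xml> [name=value ...]");
        }
        return runner.run(args[0], parseParams(args, 1));
    }

    public static void main(String[] args) {
        // Assumed demo defaults so the sketch runs with no arguments.
        String[] effective = args.length > 0 ? args : new String[] {"site1.xml", "state=CA"};
        // Placeholder runner: echoes its inputs instead of scraping anything.
        ScriptRunner echo = (script, params) ->
                "<run script=\"" + script + "\" params=\"" + params + "\"/>";
        System.out.println(launch(echo, effective));
    }
}
```

Keeping the engine behind a single interface means the main method contains no scraping or XML-building logic, which matches the requirement that everything else lives in the scripts.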

There are a total of 6 URLs that require web scraping. Here they are, along with their requirements. Each site requires its own script:

[url removed, login to view]

Takes a state as the search criterion. Returns pages of results. Each result (across all pages) should be converted into an XML file called [url removed, login to view] when the run is complete.

[url removed, login to view]

Takes a state as the search criterion. Returns results in a Flash-rendered view. Each result (across all pages) should be converted into an XML file called [url removed, login to view] when the run is complete.

[url removed, login to view]

Takes a state as the search criterion. Each result (across all pages) should be converted into an XML file called [url removed, login to view] when the run is complete.

[url removed, login to view]

[url removed, login to view] (list view)

Takes a zip code AND a price range. Each result (across all pages) should be converted into an XML file called [url removed, login to view] when the run is complete.

[url removed, login to view] with the real estate plugin

Takes a zip code AND a price range. Each result (across all pages) should be converted into an XML file called [url removed, login to view] when the run is complete.

Because all of these are real estate websites, you will first need to submit a POST search on each of them in order to scrape the results. The POST search query typically requires a zip code, state, and/or city.
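To show what such a POST search amounts to at the HTTP level, here is a small sketch that encodes search fields as a form body. It assumes Java 10 or later (for `URLEncoder.encode` with a `Charset`), and the class name `SearchForm` and the field names `zip`, `city`, and `min_price` are made up for illustration; each real site will have its own form fields. In the actual deliverable this request would be issued from the Web Harvest script, not from Java.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

final class SearchForm {
    /** Encodes search fields as an application/x-www-form-urlencoded body. */
    static String encode(Map<String, String> fields) {
        StringJoiner body = new StringJoiner("&");
        for (Map.Entry<String, String> f : fields.entrySet()) {
            body.add(URLEncoder.encode(f.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(f.getValue(), StandardCharsets.UTF_8));
        }
        return body.toString();
    }

    public static void main(String[] args) {
        // Hypothetical field names; a real site defines its own.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("zip", "90210");
        fields.put("city", "Beverly Hills");
        fields.put("min_price", "100000");
        // prints: zip=90210&city=Beverly+Hills&min_price=100000
        System.out.println(encode(fields));
    }
}
```

The insertion-ordered map keeps the fields in the order they were added, which makes the resulting body easy to compare against what a browser sends for the same search.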

The scripts should be callable from Java code. You will provide both the scripts and the Java code.

Engineering, Java, MySQL, PHP, Software Architecture, Software Testing, Web Hosting, Website Management, Website Testing

Project ID: #3498408

About the project

5 proposals. Remote project. Active Dec 27, 2008

Awarded to:

abhay78

See private message.

$212.5 USD in 14 days
(94 reviews)
6.4

5 freelancers bid an average of $315 for this job

smartsallar

See private message.

$425 USD in 14 days
(26 reviews)
5.4
brainwithstorm

See private message.

$425 USD in 14 days
(30 reviews)
4.9
ashwinc

See private message.

$212.5 USD in 14 days
(4 reviews)
3.6
cicdev

See private message.

$297.5 USD in 14 days
(3 reviews)
0.0