data crawler to login & spider inventory data from distributor website to csv file

$30-100 USD

In progress
Posted more than 18 years ago

Paid on completion
We need an automated crawler that logs into a distributor's warehouse website and downloads inventory data from tables into a delimited file. The website we will be crawling is the login/search catalog section of www.electrograph.com. I have saved copies of their site locally to demonstrate what needs to be done. After the project closes, we will provide actual login details for the live site so the job can be completed.

Walkthrough of what needs to be done:

Login Home Page [login to view URL]
Go to the main website and log in using the form in the upper left-hand corner of the page. The user name and password should be configurable.

Successfully Logged In [login to view URL]
After the login has been processed successfully, the page refreshes and now includes a "My Account" section in the upper left-hand corner. Additionally, the "keyword/ Item# search" form is now enabled for our specific account. When submitted, it displays the pricing and inventory quantities available to our account.

Currently their website lets you browse the inventory by category and then paginate through the results (it cannot show all products at once). We need to follow each category in the select menu "ddlCategory" individually, download all the data on the page in the specified format, and continue to the next page of results if one exists.

Crawling first result page of the first category searched "Accessories" [login to view URL]
This page displays the information we want to store in a delimited file. We need to trim and store the Model #, Manufacturer, Description, Availability, and Reseller Price columns. Each table row becomes a new line in the delimited file. Take note of the Availability column: it gives the total quantity in stock followed by an "I" icon. When you hover over this "I" icon, it shows the breakdown of which warehouse locations the product is stored in.
For example: 18 (the "I" icon says: 14 - NY, 4 - NV, meaning 14 units in stock in New York and 4 units in stock in Nevada). We need to store both the total quantity available and the individual location quantities, with a column for each warehouse location.

Crawling second/additional result page(s) of the first category searched "Accessories" (page 2+) [login to view URL]
Perform the same process as Step 2, downloading and storing all the inventory data, and continue to the next page if it exists. (Note: in the saved version of this page that I provided, the JavaScript that shows the individual warehouse breakdown is not working; it will of course work on the live site.)

Crawling first result page of additional LARGE category searched "Plasma Displays" [login to view URL] (interim refine page) [login to view URL] (actual results page)
For some categories that contain a substantial number of products, clicking "SEARCH" does not display results right away. It takes you to another "search plasma displays" form where you can refine your results and search by attributes. We do not need this; we simply want to press the "GO" button, which displays all the products in that category in the same manner as Step 2.

Crawling second/additional result page(s) of additional LARGE category searched "Plasma Displays" [login to view URL]
Perform the same process as Step 2, downloading and storing all the inventory data, and continue to the next page if it exists.
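The category and pagination flow described in the walkthrough can be sketched roughly as follows. This is only an outline under assumptions: the site's markup is not shown in this posting, so the page fetching is left to three caller-supplied functions (`search`, `press_go`, `follow`), and the `Page` structure, its fields, and the function names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, Iterator, List, Optional

@dataclass
class Page:
    """What the loop needs to know about one fetched page (hypothetical shape)."""
    rows: List[list] = field(default_factory=list)  # one list of cell texts per table row
    is_refine_form: bool = False                    # True for the interim "refine" page
    next_page: Optional[str] = None                 # link to the next results page, if any

def crawl_category(
    category: str,
    search: Callable[[str], Page],   # submit the category search
    press_go: Callable[[], Page],    # submit "GO" on the interim refine form
    follow: Callable[[str], Page],   # fetch a further results page
) -> Iterator[list]:
    """Yield every table row of one category, across all of its result pages."""
    page = search(category)
    if page.is_refine_form:
        # Large categories show a refine form first; skip it by pressing "GO".
        page = press_go()
    while True:
        yield from page.rows
        if page.next_page is None:
            break
        page = follow(page.next_page)

def crawl_all(categories: Iterable[str], search, press_go, follow) -> Iterator[list]:
    """Walk each entry of the category menu (e.g. the options of "ddlCategory")."""
    for category in categories:
        yield from crawl_category(category, search, press_go, follow)
```

Keeping the loop separate from the HTTP and HTML handling means the same traversal can be exercised against the locally saved pages before it is pointed at the live site.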
The end result must be a comma-delimited file.

Example result for parsing of example link [login to view URL]

Model Number, Manufacturer, Description, Reseller Price, Total Available Qty, Location NY Qty, Location NV Qty, Location XX Qty
ACE615, ADCOM, ACE-615 ILS SURGE (120V), 315.00, 12, 12, 0, 0
TRAVEL CS/42"PANASON, CALZONE CASE CO, TRAVEL CASE 42" PANASONIC, 345.33, 0, 0, 0, 0
FSD-4100, CHIEF MANUFACTURING, FSD-4100, 97.39, 0, 0, 0, 0
CMA-0608, CHIEF MANUFACTURING, 6'-8' ADJUSTABLE PLATE, 93.39, 0, 0, 0, 0
RC-1PXL, ELECTROGRAPH SYSTEMS, 24-BUTTON SWITCH PANEL FOR VS-1XL, 104.76, 0, 0, 0, 0
RC-1XL, ELECTROGRAPH SYSTEMS, NEW MODEL NUMBER (WAS VS-1XL) REMO, 104.76, 0, 0, 0, 0
FRAME-O, ELECTROGRAPH SYSTEMS, SINGLE GANG FRAME TO HOLD UP TO 3 W, 245.35, 0, 0, 0, 0
FRAME-W, ELECTROGRAPH SYSTEMS, SINGLE GANG FRAME TO HOLD UP TO 3 W, 14.89, 5, 5, 0, 0

Notice that on the website some products show a quantity while others say "call for availability". We need to be able to map whatever text is in that field to a text/numerical equivalent. For example, in this implementation we define "Call for availability" as 0.

Also, because they are always adding and changing warehouse locations, we need to leave room at the end of the delimited file for new locations. When text is found in the quantity-available field, we look up its equivalent and apply that value to all the location columns as well. For example, "call for availability" will result in 0, 0, 0, 0 (Total Quantity Available, Location 1 Qty, Location 2 Qty, Location 3 Qty). We should make room for up to 10 warehouse locations (0, 0, 0, 0, 0, 0, 0, 0, 0, 0). When a quantity is not defined for an indexed warehouse, we replace it with zero. In this example, "Call for availability" means the product is not in stock, so we mark it and all subsequent warehouse locations as 0.
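A minimal sketch of the availability mapping and column padding described above, assuming the hover text lists pairs in the form "quantity - location code" (for example "14 - NY") and that the column order of locations is fixed in advance. The names `TEXT_TO_QTY`, `KNOWN_LOCATIONS`, `availability_columns`, and `format_row` are illustrative, not part of the site or of any existing code.

```python
import csv
import io
import re

MAX_LOCATIONS = 10                          # room for up to 10 warehouse columns
TEXT_TO_QTY = {"call for availability": 0}  # text in the quantity field -> equivalent
KNOWN_LOCATIONS = ["NY", "NV"]              # assumed column order for locations

# Assumed tooltip format: "<qty> - <location code>", comma-separated.
_PAIR = re.compile(r"(\d+)\s*-\s*([A-Za-z]+)")

def availability_columns(cell_text, tooltip_text="", locations=KNOWN_LOCATIONS):
    """Return [total, loc 1 qty, ..., loc 10 qty] for one Availability cell."""
    text = cell_text.strip()
    if text.isdigit():
        total = int(text)
        found = {loc.upper(): int(qty) for qty, loc in _PAIR.findall(tooltip_text)}
        # Locations with no quantity in the tooltip become zero.
        quantities = [found.get(loc, 0) for loc in locations]
        quantities += [0] * (MAX_LOCATIONS - len(quantities))
    else:
        # Text such as "Call for availability": apply its equivalent to every column.
        total = TEXT_TO_QTY[text.lower()]   # raises KeyError for unmapped text
        quantities = [total] * MAX_LOCATIONS
    return [total] + quantities

def format_row(model, manufacturer, description, price,
               cell_text, tooltip_text="", delimiter=","):
    """Format one table row as a single delimited line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    fields = [model.strip(), manufacturer.strip(), description.strip(), price.strip()]
    writer.writerow(fields + availability_columns(cell_text, tooltip_text))
    return buffer.getvalue()
```

One design point to be aware of: the standard `csv` writer quotes any field that contains the delimiter or a double quote (such as TRAVEL CS/42"PANASON), so those lines will look slightly different from the unquoted illustration above while still parsing back to the same values.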
I also need to be able to control the delimiter used in the output file (I have used a comma in this illustration for ease). I also need to be able to control the delay between page navigations (in milliseconds). A database should not be necessary; a simple config file is fine.

We need to get this project completed ASAP. We have several data crawlers that need to be created, so the winner of this project can expect future work developing similar crawlers.
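As one possible shape for the "simple config file", here is a sketch using an INI-style file read with Python's standard `configparser`. The section name `crawler` and the keys `username`, `password`, `delimiter`, and `page_delay_ms` are assumptions made for illustration.

```python
import configparser
import time

EXAMPLE_CONFIG = """
[crawler]
username = your-login
password = your-password
delimiter = ,
page_delay_ms = 1500
"""

def load_settings(text):
    """Parse INI-style text into a plain dict of crawler settings."""
    # interpolation=None so a "%" in the password is taken literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    section = parser["crawler"]
    delimiter = section.get("delimiter", fallback=",")
    return {
        "username": section["username"],
        "password": section["password"],
        # Values are whitespace-stripped, so a tab is written as \t in the file.
        "delimiter": delimiter.replace("\\t", "\t"),
        "page_delay_ms": section.getint("page_delay_ms", fallback=0),
    }

def pause(settings):
    """Wait for the configured delay between page navigations."""
    time.sleep(settings["page_delay_ms"] / 1000.0)
```

Calling `pause(settings)` before each page request would apply the configured delay, and `settings["delimiter"]` can be passed to whatever writes the output file.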
Project ID: 29093

About the project

1 proposal
Remote project
Active 19 years ago

1 freelancer has bid an average of $95 USD for this job
I have developed site crawlers in the past. These crawlers can handle cookie-based sessions, JavaScript URLs, and HTTP/HTML redirects. I can use my existing codebase to complete this project. This project can be implemented in Java. With Java, you can run it on your desktop and move it to a server if you want to automate it in the future. Please let me know if I can provide more information.
$95 USD in 5 days
5.0 (2 reviews)

About the client

Brooklyn, United States
5.0
19
Payment method verified
Member since Jun 23, 2005

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)