Project brief:
Scrape a supplier's website for all available product details, including images, with the output in both text and HTML format. The data then had to be organised into a format suitable for upload to the client's database.
Process:
- The supplier's website served product details from behind a login, so a scraping system was created to log in automatically and access this data.
- All applicable fields (title, description, price, SKU code, image URLs, options, delivery details) were extracted in both text and HTML format. Only one format was selected for upload.
- Images were then downloaded in bulk using their location URLs, and the files were uploaded to the cloud for easy access.
- The final data was organised into the required upload format.
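
The extraction and formatting steps above can be sketched roughly as follows. This is a minimal illustration, not the code used on the project. It assumes the upload format is a CSV file, that each scraped product is held as a dictionary, and that multiple images are joined with a pipe character. The column names, the dictionary keys and the helper names (html_to_text, image_filename, to_upload_row, write_upload_csv) are all hypothetical. It uses only the Python standard library and leaves out the login and download steps.

```python
import csv
import posixpath
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical upload columns; the client's real format is not given in the brief.
UPLOAD_COLUMNS = ["sku", "title", "description", "price", "images"]


class _TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML fragment and ignores the tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(fragment):
    """Return a plain-text version of an HTML fragment."""
    parser = _TextExtractor()
    parser.feed(fragment)
    parser.close()
    # Join the text nodes with spaces, then collapse repeated whitespace.
    return " ".join(" ".join(parser.parts).split())


def image_filename(url):
    """Return the file name part of an image URL, ignoring any query string."""
    return posixpath.basename(urlparse(url).path)


def to_upload_row(product, use_html=True):
    """Map one scraped product (assumed dict layout) to one upload row."""
    description_html = product["description_html"]
    # Both formats are available; only one is written to the upload file.
    description = description_html if use_html else html_to_text(description_html)
    return {
        "sku": product["sku"],
        "title": product["title"],
        "description": description,
        "price": product["price"],
        # Assumed convention: image file names separated by "|".
        "images": "|".join(image_filename(u) for u in product["image_urls"]),
    }


def write_upload_csv(products, out, use_html=True):
    """Write all products to an open text file object as CSV."""
    writer = csv.DictWriter(out, fieldnames=UPLOAD_COLUMNS)
    writer.writeheader()
    for product in products:
        writer.writerow(to_upload_row(product, use_html))
```

Under these assumptions, calling write_upload_csv(products, open("upload.csv", "w", newline=""), use_html=False) would produce a file with plain-text descriptions, while use_html=True keeps the original markup. Storing the HTML and deriving the text from it on demand is one way to keep both formats available without scraping each page twice.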