Scraping Amazon.co.uk using PowerShell

Scrape Amazon with PowerShell

Hello! One of my colleagues asked me whether I could build a web-scraping tool in PowerShell, the way you would in Python with BeautifulSoup. I told him that you don't need to add any extra modules to PowerShell to scrape a website for data. (For the record, I use Python as much as PowerShell.)

The beauty of PowerShell is that you don't need additional tools (Python) or modules (BeautifulSoup) installed; you can use the built-in Invoke-WebRequest cmdlet instead. Believe me, it's powerful. No wonder it's called PowerShell 😛

By checking the page source of https://www.amazon.co.uk with Inspect Element, I was able to write a script that pulls the key details of every product returned for a given search term and presents them as a table. If you ran it every few hours and fed the results into a database, you could build up a product dataset for data mining.
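When hunting for the right class names, it can help to count how often each class value occurs on a page, since values that repeat many times often mark per-product elements. The following is only a minimal sketch: the $elements sample is made up for illustration, and on Windows PowerShell 5.1 you would feed in $Response.AllElements instead.

```powershell
# Hypothetical stand-in for $Response.AllElements (Windows PowerShell 5.1).
$elements = @(
    [pscustomobject]@{ tagName = 'SPAN'; class = 'a-price-whole' }
    [pscustomobject]@{ tagName = 'SPAN'; class = 'a-price-whole' }
    [pscustomobject]@{ tagName = 'SPAN'; class = 'a-price-fraction' }
    [pscustomobject]@{ tagName = 'IMG';  class = 's-image' }
    [pscustomobject]@{ tagName = 'DIV';  class = '' }
)

# Count elements per class value, most frequent first.
# Elements without a class are skipped.
$classCounts = $elements |
    Where-Object { $_.class } |
    Group-Object -Property class |
    Sort-Object -Property Count -Descending |
    Select-Object -Property Count, Name

$classCounts | Format-Table -AutoSize
```

The classes at the top of the list are good candidates to check in Inspect Element before wiring them into a filter.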

The challenge is that the Amazon results page is very dynamic: the HTML structure can even differ from one product to the next. That is what made the web scraper a fun weekend project.
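One way to cope with markup that varies per product is to read each product's fields from its own block of HTML, so that a missing price cannot shift the values of later rows. This is a rough sketch, not how the script in this post works: the markup is a simplified, made-up sample rather than Amazon's real HTML, and Get-FirstMatch is a hypothetical helper.

```powershell
# Simplified, made-up markup: the second product has no price.
$html = @'
<div class="result"><h2>Camera A</h2><span class="a-price-whole">199.</span><span class="a-price-fraction">99</span></div>
<div class="result"><h2>Camera B</h2></div>
<div class="result"><h2>Camera C</h2><span class="a-price-whole">49.</span><span class="a-price-fraction">00</span></div>
'@

# Hypothetical helper: return the first capture group, or $null when nothing matches.
function Get-FirstMatch {
    param([string]$Text, [string]$Pattern)
    $m = [regex]::Match($Text, $Pattern)
    if ($m.Success) { $m.Groups[1].Value } else { $null }
}

# Split the page into one chunk per product, then read every field from that chunk only.
$products = [regex]::Matches($html, '<div class="result">.*?</div>') | ForEach-Object {
    $block    = $_.Value
    $whole    = Get-FirstMatch $block 'class="a-price-whole">([^<]*)<'
    $fraction = Get-FirstMatch $block 'class="a-price-fraction">([^<]*)<'
    [pscustomobject]@{
        Name  = Get-FirstMatch $block '<h2>([^<]*)</h2>'
        Price = if ($null -ne $whole) { "$whole$fraction" } else { $null }
    }
}

$products | Format-Table -AutoSize
```

Regular expressions are fragile on real-world HTML, so treat this as an illustration of the per-product idea rather than a robust parser.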

Since I haven't seen many people pay much attention to PowerShell's built-in tools such as Invoke-WebRequest, I decided to share this fun weekend project with you all. It took less than an hour to write the script, which is only about 40 lines of code.

Let me show you the result first this time, before getting to the script.

In the report you can see the following details, pulled from Amazon using PowerShell:
1. Name
2. Link
3. Price
4. Delivery
5. Rating

Using these fields, you can build your own product data repository for further analysis, or as the basis for a start-up if you can think of a viable way to use it.
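A simple way to start such a repository is to append every run to a CSV file with a timestamp. The sketch below assumes made-up sample rows in place of real scraper output, and the file name and temp-folder location are arbitrary choices.

```powershell
# Made-up rows standing in for the scraper's output.
$result = @(
    [pscustomobject]@{ Name = 'Camera A'; Price = '199.99'; Rating = '4.5 out of 5 stars' }
    [pscustomobject]@{ Name = 'Camera B'; Price = '49.00';  Rating = '4.1 out of 5 stars' }
)

# Arbitrary location for the history file; change it to suit.
$historyFile = Join-Path ([System.IO.Path]::GetTempPath()) 'amazon-history.csv'

# Stamp every row with the collection time so runs can be told apart later.
$collectedAt = Get-Date -Format 's'
$result |
    Select-Object @{ Name = 'CollectedAt'; Expression = { $collectedAt } }, Name, Price, Rating |
    Export-Csv -Path $historyFile -Append -NoTypeInformation -Encoding UTF8

# Read the history back to confirm that rows accumulate across runs.
$history = Import-Csv -Path $historyFile
"History now holds {0} rows" -f @($history).Length
```

Running this on a schedule (for example with Windows Task Scheduler) would grow the file over time; a real database could replace the CSV once the volume justifies it.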

Now for the fun part: the script.

$Search = "camera"
$BaseURL = "https://www.amazon.co.uk"
# Escape the search term so spaces and special characters are safe in the URL.
$SearchLink = "$BaseURL/s?k=$([uri]::EscapeDataString($Search))&ref=nb_sb_noss_2"
$result = @()

# Note: AllElements is available in Windows PowerShell 5.1; PowerShell 6 and later return a response object without it.
$Response = Invoke-WebRequest -Uri $SearchLink

# Pick out each field by the CSS class seen in Inspect Element.
$productNames = $Response.AllElements | Where-Object {$_.class -eq "a-link-normal a-text-normal"} | Select-Object outertext,href # Product name and link
$images = $Response.AllElements | Where-Object {$_.class -eq "s-image"} | Select-Object src # Product image
$prices = $Response.AllElements | Where-Object {$_.class -eq "a-price-whole"} | Select-Object outertext # Price, whole part
$cents = $Response.AllElements | Where-Object {$_.class -eq "a-price-fraction"} | Select-Object outertext # Price, fractional part
$deliveries = $Response.AllElements | Where-Object {$_.class -eq "a-row a-size-base a-color-secondary s-align-children-center"} | Select-Object outertext # Delivery
$ratings = $Response.AllElements | Where-Object {$_.class -eq "a-row a-size-small"} | Select-Object outertext # Rating

$count = $productNames.Count

# Build one object per product. Use -lt rather than -le: indexes run from 0 to Count - 1,
# so -le would add an empty extra row at the end.
# Caveat: the lists are matched by position, so a product with a missing field shifts later rows.
for ($i = 0; $i -lt $count; $i++) {

    $result += [pscustomobject]@{
        Name     = $productNames[$i].outertext
        Link     = $BaseURL + $productNames[$i].href
        Price    = $prices[$i].outertext + $cents[$i].outertext
        Image    = $images[$i].src
        Rating   = $ratings[$i].outertext
        Delivery = $deliveries[$i].outertext
    }
}

# CSS for the HTML report.
$Style = @"
<style>
BODY{font-family:Calibri;font-size:10pt;}
TABLE{border-width: 1px;border-style: solid;border-color: black;border-collapse: collapse; padding-right:5px}
TH{border-width: 1px;padding: 5px;border-style: solid;border-color: black;color:black;background-color:green}
TD{border-width: 1px;padding: 5px;border-style: solid;border-color: black}
</style>
"@

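# Write the report. The C:\temp folder must already exist.
# Price is text here, so Sort-Object orders it alphabetically rather than numerically.
$result |
    Select-Object Name, Link, Price, Delivery, Rating |
    Sort-Object Price |
    ConvertTo-Html -Head $Style -Body "<H2>Amazon Product result for $Search</H2>" |
    Out-File 'C:\temp\Report.html'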
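Because the script stores Price as text, sorting on it compares characters rather than amounts, so "1,299.00" lands before "89.99". The sketch below shows one possible way to sort by value instead; the sample rows are made up and ConvertTo-PriceNumber is a hypothetical helper, not part of the original script.

```powershell
# Made-up price strings in the style the scraper produces.
$rows = @(
    [pscustomobject]@{ Name = 'Camera A'; Price = '1,299.00' }
    [pscustomobject]@{ Name = 'Camera B'; Price = '89.99' }
    [pscustomobject]@{ Name = 'Camera C'; Price = '249.50' }
    [pscustomobject]@{ Name = 'Camera D'; Price = '' }
)

# Hypothetical helper: keep only digits and the decimal point, then parse
# with the invariant culture. Values that cannot be parsed sort last.
function ConvertTo-PriceNumber {
    param([string]$Text)
    $clean  = $Text -replace '[^\d.]', ''
    $value  = [decimal]0
    $style  = [System.Globalization.NumberStyles]::Number
    $format = [System.Globalization.CultureInfo]::InvariantCulture
    if ([decimal]::TryParse($clean, $style, $format, [ref]$value)) { $value } else { [decimal]::MaxValue }
}

# Sort on the parsed amount instead of the raw text.
$byNumber = $rows | Sort-Object -Property { ConvertTo-PriceNumber $_.Price }
$byNumber | Format-Table -AutoSize
```

In the report pipeline, the same script block could take the place of the plain Price argument to Sort-Object.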

I searched for cameras as an example; you can obviously use any product you like. The best thing is that you get a report in less than 12 seconds, as highlighted below.
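If you want to time a run yourself, Measure-Command is one option. In this sketch a short sleep stands in for the scraper so that it runs anywhere; the script path mentioned in the comment is hypothetical.

```powershell
# Stand-in workload. Replace the body with the scraper itself,
# for example: & 'C:\temp\Scrape-Amazon.ps1' (a hypothetical path).
$elapsed = Measure-Command {
    Start-Sleep -Milliseconds 200
}

'Run took {0:N2} seconds' -f $elapsed.TotalSeconds
```

Most of the time in a real run is likely to be the web request itself, so results will vary with the network.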

You might need a bit of HTML understanding to modify the script, but it's not that hard and it's never too late to learn. Like I say, it's all about learning. Happy scraping 🙂

Note: Be cautious about using this script for any other purpose or on other sites. This demonstration is for educational purposes only, to show what PowerShell can do beyond systems management in an organization.

Published by iamfazul

Author of the site
