1
votes

I'm trying to write an AutoIT script that would automatically navigate to a URL (https://www.macrotrends.net/1476/copper-prices-historical-chart-data) via Internet Explorer and click on a button (red coloured, labeled as 'Download Historical Data') that would download an excel data file. By inspecting this button, I've found that it has an id ("dataDownload"), but my attempts of using the _IEGetObjById function in AutoIT have been unsuccessful. Here is my script:

#include <IE.au3>

Local $oIE = _IECreate("https://www.macrotrends.net/1476/copper-prices-historical-chart-data", 1)
Local $oButton = _IEGetObjById($oIE, "dataDownload")
_IEAction($oButton, "click")

The script is opening Internet Explorer, but I think it's not detecting the button as intended. My suspicion is that it could be because the button is located within a frame or container or whatever you might call it?

Also, my intention is to have this AutoIT script called by a batch file on Windows, to be run on a schedule. Please feel free to enlighten if you have any other recommended methods for such automated web-scraping.

Would appreciate all help I could get here please.

2
Are you familiar with any other proper scripting language? Python, Javascript, PHP, Powershell etc.?abdusco
I do know Python and R (which I've attempted with RSelenium package but unsuccessful). I'm just thinking it might not be due to a choice of programming language, since I sense the issue here is that the download button is not being detected. Please do enlighten if I'm wrong. Meanwhile, guess I'll be trying out Powershell, since it should be easiest to learn and attempt.John
Oh to add on, I also know VBS scripting, which has given me the same outcome as my AutoIT script. It simply opened up Internet Explorer, navigated to the URL, but didn't perform the required click on the download button.John
The problem is that Download historical data button is not present in HTML, it's added later via Javascript and it generates CSV file on-the-fly (there's no network requests). This mean you cannot "click" it without loading the page in a browser. The easiest way I know is using Python to drive an actual (headless - i.e. without a visible window) browser and let it load the page and click the button for me.abdusco
If you're willing to go a bit out of your comfort zone, I can provide you a python script to download the fileabdusco

2 Answers

0
votes

The Download historical data button is inside an iframe that points to this URL.

https://www.macrotrends.net/assets/php/chart_iframe_comp.php?id=1476&url=copper-prices-historical-chart-data

But it needs Javascript to work, meaning, you have to load the page in a browser (that executes javascript) to click that button [^1].

Luckily, there are tools to automate browser interactions. Playwright is a library available in many languages, but we'll use it with Python.

  1. Install the latest version of Python.
  2. Update the pip (Python package manager) to the latest version:
    python -m pip install --user --upgrade pip
    
  3. Install playright
    python -m pip install --user playwright
    
  4. Use playwright to install a browser. It will be used by playwright only.
    python -m playwright install chromium
    
  5. Open the URL with playwright inspector:
    python -m playwright open 'https://www.macrotrends.net/1476/copper-prices-historical-chart-data'
    

A browser window should pop up. Change the Target to python and click Record button in Playwright inspector window.

playwright inspector window

Then do what you need to do to click & download the file from the page.

Playwright will generate some code for us.

generated script

Stop recording and copy the script.

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    with p.chromium.launch(headless=False) as browser:
        page = browser.new_page(accept_downloads=True)
        page.goto('https://www.macrotrends.net/1476/copper-prices-historical-chart-data')
        
        # paste the script here

In my runs, playwright generated this script for me:

# Click button:has-text("Download Historical Data")
with page.expect_download() as download_info:
    page.frame(name="chart_iframe").click("button:has-text(\"Download Historical Data\")")

Combining both and adding the code to save the file, we have:

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    print('launching the browser')
    with p.chromium.launch(headless=False) as browser:
        page = browser.new_page(accept_downloads=True)
        print('visiting the page')
        page.goto('https://www.macrotrends.net/1476/copper-prices-historical-chart-data')
        
        print('clicking download button')
        # Click button:has-text("Download Historical Data")
        with page.expect_download() as download_info:
            page.frame(name="chart_iframe").click("button:has-text(\"Download Historical Data\")")
        download = download_info.value
        
        print('saving the file')
        download.save_as('prices.csv')

One important point you should look out for is that Python is a whitespace-sensitive language, meaning you have to align the code indentation properly to make it run.

Save this file as download_copper_prices.py, and when we run the script,

python download_copper_prices.py

A browser should pop up, visit the page, download and save the prices as prices.csv in the working directory.


[^1]: Actually, we can scrape the prices right off the page, as they're serialized as JSON. With a bit of regex, we can extract the JSON from the HTML. But that solution seemed to me a bit too job-specific, so I opted for a more hands-on solution that he can apply for other occasions. Scripting browser interactions is an important tool every power user should learn.

That said, I'm not knowledgeable in AutoIt, but assuming it has capable Regex tools, it's entirely possible to achieve the same result in AutoIt by downloading the HTML of the URL that the iframe points to (https://www.macrotrends.net/assets/php/chart_iframe_comp.php?id=1476&url=copper-prices-historical-chart-data) then using this regex to extract the JSON:

var originalData = (.*);

Running this code on the browser console returns an array of 15k+ entries:

JSON.parse(document.body.innerHTML.match('var originalData = (.*);')[1])

consisting of date and closing prices:

[{
    "date": "1959-07-02",
    "close": "0.2930"
}, {...}, ...]

Then it should be easy to convert this JSON to CSV.

References

0
votes

Here is a working sample using https://www.autoitscript.com/forum/topic/153520-iuiautomation-ms-framework-automate-chrome-ff-ie/ You can ignore most of the Global variables. I use this as a template for another script I wrote.

```
#include <GUIConstantsEx.au3>
#include <TabConstants.au3>
#include <WindowsConstants.au3>
#include <MsgBoxConstants.au3>
#include <EditConstants.au3>
#include <WinAPIGdi.au3>
#include <WinAPIGdiDC.au3>
#include <WinAPIHObj.au3>
#include <WinAPISysWin.au3>
#include <AD.au3>
#include <Array.au3>
#include "UIAWrappers.au3"
#include <FileConstants.au3>
#include <WinAPIFiles.au3>
#include <GuiListBox.au3>
#include <Word.au3>
#include <File.au3>
#include <Excel.au3>


Opt("WinWaitDelay", 150)
Opt("WinTitleMatchMode", 2)
Opt("WinDetectHiddenText", 1)
Opt("MouseCoordMode", 2)
Opt("SendKeyDelay", 10)
Opt("GUIResizeMode", 1)
HotKeySet("^!x", "MyExit")
Global $ver = "1.0.2.7"
Global $__goError = ObjEvent("AutoIt.Error", "__Excel_COMErrFunc") ; At the top of your script
Global $oCOMErrorHandler = ObjEvent("AutoIt.Error", "_User_ErrFunc")
Global $Guih, $MAINTab, $MAIN, $button1, $button2, $button3, $button4, $idListBox1, $inputEmployeeUsername, $inputSAName, $ADusername, $iLastTab, $editOU
Global $inputSAPassword, $editEmail, $editEmployeeID, $editLastName, $editCity, $editRegion, $editTitle, $editManager, $ADpassword, $editState, $DF
Global $C9, $hWnd, $g_szVersion, $msg, $employeeName, $aResult, $aManager, $aProperties1, $aProperties2, $aString, $aString1, $aString2, $iCurrTab, $editFirstName
Global $ManagerUN, $OU, $Line_Split, $iIndex, $GulfCoast, $SouthEast, $MountainWest, $MidAtlantic, $SouthTexas, $NorthTexas, $MidWest, $MidEast, $editZip
Global $file1, $OU_Name_Return_R, $OU_Name_Trim_L, $OU_Name_Split, $OU_Name_Trim_R, $Search, $State, $TD, $hWnd2, $oWord1, $button11, $editStreet, $CSVName, $Web

$Web = "https://www.macrotrends.net/1476/copper-prices-historical-chart-data"

ShellExecute("C:\Program Files (x86)\Google\Chrome\Application\chrome.exe", "--new-window --force-renderer-accessibility " & $Web, "", "")
WinWait("Copper Prices - 45 Year Historical Chart | MacroTrends - Google Chrome", "", 4)
$hWnd = WinGetHandle("Copper Prices - 45 Year Historical Chart | MacroTrends - Google Chrome")
WinActivate($hWnd)
WinSetState($hWnd, "", @SW_MAXIMIZE)

Local $oP8=_UIA_getObjectByFindAll($UIA_oDesktop, "Title:=Copper Prices - 45 Year Historical Chart | MacroTrends - Google Chrome;controltype:=UIA_PaneControlTypeId;class:=Chrome_WidgetWin_1", $treescope_children)
_UIA_Action($oP8,"setfocus")
Local $oP7=_UIA_getObjectByFindAll($oP8, "Title:=Copper Prices - 45 Year Historical Chart | MacroTrends;controltype:=UIA_DocumentControlTypeId;class:=Chrome_RenderWidgetHostHWND", $treescope_children)
_UIA_Action($oP7,"setfocus")

;~ First find the object in the parent before you can do something
Local $oUIElement=_UIA_getObjectByFindAll("  DownloadHistoricalData.mainwindow", "title:=  Download Historical Data;ControlType:=UIA_ButtonControlTypeId", $treescope_subtree)
Local $oUIElement=_UIA_getObjectByFindAll($oP7, "title:=  Download Historical Data;ControlType:=UIA_ButtonControlTypeId", $treescope_subtree)
_UIA_Action($oUIElement, "setfocus")
_UIA_Action($oUIElement, "highlight")
_UIA_Action($oUIElement, "activate")
_UIA_Action($oUIElement, "leftclick")

Func MyExit()
    Exit
EndFunc   ;==>MyExit

````