I recently moved from Excel VBA automation to AutoHotkey, following the tutorial at http://the-automator.com/web-scraping-intro-with-autohotkey/, but I don't fully understand the code. Could someone please point me in the right direction?
I am trying to make my F1 key scrape some data from the currently active browser window.
F1::
    pwb := ComObjCreate("InternetExplorer.Application") ;create IE Object
    pwb.visible:=true ; Set the IE object to visible
    pwb := WBGet()
;************Pointer to Open IE Window******************
WBGet(WinTitle="ahk_class IEFrame", Svr#=1) { ;// based on ComObjQuery docs
    static msg := DllCall("RegisterWindowMessage", "str", "WM_HTML_GETOBJECT")
        , IID := "{0002DF05-0000-0000-C000-000000000046}" ;// IID_IWebBrowserApp
        ;// , IID := "{332C4427-26CB-11D0-B483-00C04FD90119}" ;// IID_IHTMLWindow2
    SendMessage msg, 0, 0, Internet Explorer_Server%Svr#%, %WinTitle%
    if (ErrorLevel != "FAIL") {
        lResult:=ErrorLevel, VarSetCapacity(GUID,16,0)
        if DllCall("ole32\CLSIDFromString", "wstr","{332C4425-26CB-11D0-B483-00C04FD90119}", "ptr",&GUID) >= 0 {
            DllCall("oleacc\ObjectFromLresult", "ptr",lResult, "ptr",&GUID, "ptr",0, "ptr*",pdoc)
            return ComObj(9,ComObjQuery(pdoc,IID,IID),1), ObjRelease(pdoc)
        }
    }
}
I understand that this code creates a new IE application, but what if I don't want to create one and only want to attach to the currently active window? I have seen a few snippets that get the URL of the currently active browser, but I can't work out how to get the elements of the active page.
This is what I have tried so far. Can someone tell me how to get it to point to the active page and read some of its data?
F1::
    wb := WBGet()
    if !instr(wb.LocationURL, "https://www.google.com/")
    {
        wb := ""
        return
    }
    doc := wb.document
    h2name := rows[0].getElementsByTagName("h2")
    FileAppend, %h2name%, Somefile.txt
    Run Somefile.txt
return
WBGet(WinTitle="ahk_class IEFrame", Svr#=1) { ;// based on ComObjQuery docs
    static msg := DllCall("RegisterWindowMessage", "str", "WM_HTML_GETOBJECT")
        , IID := "{0002DF05-0000-0000-C000-000000000046}" ;// IID_IWebBrowserApp
        ;// , IID := "{332C4427-26CB-11D0-B483-00C04FD90119}" ;// IID_IHTMLWindow2
    SendMessage msg, 0, 0, Internet Explorer_Server%Svr#%, %WinTitle%
    if (ErrorLevel != "FAIL") {
        lResult:=ErrorLevel, VarSetCapacity(GUID,16,0)
        if DllCall("ole32\CLSIDFromString", "wstr","{332C4425-26CB-11D0-B483-00C04FD90119}", "ptr",&GUID) >= 0 {
            DllCall("oleacc\ObjectFromLresult", "ptr",lResult, "ptr",&GUID, "ptr",0, "ptr*",pdoc)
            return ComObj(9,ComObjQuery(pdoc,IID,IID),1), ObjRelease(pdoc)
        }
    }
}
I tried to test whether the variable would be written to Somefile.txt, because I am not sure how to test it with MsgBox. It kept writing the whole script to the file instead of showing the result.
You could use UrlDownloadToFile, http://www.example.com, sourcecode.html to save the (whole) page source to your PC, then parse the text outside of the tags <> (excluding the text between <style></style> and <script></script>) to get the page's innerText. – Le____
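The approach in this comment could be sketched roughly as follows in AutoHotkey v1. This is only a minimal illustration, not a tested solution: the URL and file name are the placeholders from the comment, the regular expressions are an assumed, simplified way of stripping tags (they do not decode HTML entities such as &amp; and can be fooled by unusual markup), and the result is only an approximation of what innerText would return in a browser.

```autohotkey
; Sketch only (AutoHotkey v1). URL and file name are the placeholders from the comment.
UrlDownloadToFile, http://www.example.com, sourcecode.html
if (ErrorLevel) {
    MsgBox, Download failed.
    return
}
FileRead, html, sourcecode.html

; Drop <script> and <style> elements together with their contents.
; Options: i = case-insensitive, s = dot also matches newlines.
html := RegExReplace(html, "is)<(script|style)\b.*?</\1\s*>")

; Replace every remaining tag with a space so adjacent words do not run together.
text := RegExReplace(html, "s)<[^>]*>", " ")

; Collapse runs of whitespace to make the result easier to read.
text := Trim(RegExReplace(text, "\s+", " "))

MsgBox, % text
```

Note that this reads the page source as the server delivers it, so text that a page inserts later with JavaScript will not appear in the result.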