
I want to download a website in the background with the WPF WebBrowser control and parse it afterwards. The downloads should run in a tight loop, and I only need the page source as a string. I tried the following code, but it did not give me the results I expected. If I don't use StaTaskScheduler, the program seems to freeze during the loop. Any ideas?

Thank you

StaTaskScheduler sta = new StaTaskScheduler(numberOfThreads: 1);

private void Button1_Click(object sender, RoutedEventArgs e)
{
    for (int i = 0; i < 2; i++)
    {
        Task.Factory.StartNew(() =>
        {
            WebBrowser wb3 = new WebBrowser();
            wb3.Source = new Uri("MyURL");
            n++;
            wb3.LoadCompleted += new LoadCompletedEventHandler(wb_LoadCompleted);
        }, CancellationToken.None, TaskCreationOptions.None, sta);
    }
}

void wb_LoadCompleted(object sender, NavigationEventArgs e)
{
    WebBrowser w = sender as WebBrowser;
    HtmlDocument document = new HtmlDocument(w.Document);

    blockingCollection.Add(document.Body.OuterHtml);

    Task.Factory.StartNew(() =>
    {
        while (!blockingCollection.IsCompleted)
        {
            string dlcode;
            Thread.Sleep(500);
            if (blockingCollection.TryTake(out dlcode))
            {
                // tb is a TextBox
                Dispatcher.BeginInvoke(new Action(() => { tb.Text = dlcode; }));
            }
        }
    }, CancellationToken.None, TaskCreationOptions.None, TaskScheduler.Default);
}


1 Answer


I would recommend not using the WebBrowser control for this; use a WebClient directly instead. The easiest way is to write a routine that wraps the download in a Task:

Task<string> DownloadStringAsync(Uri address)
{
    TaskCompletionSource<string> tcs = new TaskCompletionSource<string>();
    WebClient client = new WebClient();

    // Forward cancellation, failure, or the downloaded text to the task.
    // Reading e.Result directly would throw if the download had failed.
    client.DownloadStringCompleted += (o, e) =>
    {
        if (e.Cancelled) tcs.TrySetCanceled();
        else if (e.Error != null) tcs.TrySetException(e.Error);
        else tcs.TrySetResult(e.Result);
        client.Dispose();
    };
    client.DownloadStringAsync(address);

    return tcs.Task;
}

With this, you should be able to use these tasks directly and add their results to the BlockingCollection on completion. This would be far simpler than trying to spin up a WebBrowser control, which is intended for visual use.
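
As a rough illustration of that last step, here is a minimal sketch of feeding the download tasks into a BlockingCollection. It assumes .NET 4.5 or later (for async/await and Task.WhenAll). The names PageCollector and CollectAsync are made up for this example, and the download parameter stands in for a method like DownloadStringAsync above.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical helper, not part of any library.
static class PageCollector
{
    // Starts one download per address and adds each page source to `pages`
    // as soon as that download finishes. `download` is any method with the
    // shape of DownloadStringAsync(Uri) from the answer.
    public static async Task CollectAsync(
        IEnumerable<Uri> addresses,
        Func<Uri, Task<string>> download,
        BlockingCollection<string> pages)
    {
        // ToList() forces every download to start now instead of lazily.
        List<Task> pending = addresses
            .Select(async address => pages.Add(await download(address)))
            .ToList();

        try
        {
            // Completes when all downloads are done; rethrows a failure if any.
            await Task.WhenAll(pending);
        }
        finally
        {
            // Lets consumers see IsCompleted == true once the queue is drained.
            pages.CompleteAdding();
        }
    }
}
```

In the button handler this could be called as PageCollector.CollectAsync(urls, DownloadStringAsync, blockingCollection), where urls is whatever list of addresses you want to fetch. Because CompleteAdding is called at the end, a consumer loop that checks IsCompleted (like the one in the question) will terminate once every page has been taken.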