How to Get Dynamically Generated HTML Code from a Web Browser Control?

DDD
Released: 2024-10-18 08:35:03

How to Retrieve Dynamically Generated HTML Code Using .NET's WebBrowser or mshtml.HTMLDocument?

Problem:

Retrieving the HTML of a page whose content is generated dynamically by script is awkward with both the WebBrowser class and the mshtml.HTMLDocument interface. The WebBrowser class offers no direct method that returns the rendered HTML, and the raw HTML loaded into mshtml.HTMLDocument differs from what the page actually shows once its scripts have run.

Solution:

Using WebBrowser Class:

Although the WebBrowser class does not provide a direct method for obtaining the rendered HTML, it is possible to implement a workaround. Add a WebBrowser control to a form, have it navigate to the desired URL, and then use the following steps to retrieve the HTML:

  1. Send the Ctrl+A key combination to the control to select all content.
  2. Copy the selection to the clipboard, for example with the document's "Copy" command.
  3. Read the HTML back from the clipboard and parse it as needed.
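
The three steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: a Windows Forms WebBrowser that has already finished loading, and code running on the STA UI thread. It swaps the Ctrl+A keystroke for the document's "SelectAll" command, and the names ClipboardHtml, CopyRenderedHtml and ExtractFragment are hypothetical, not part of the original answer.

```csharp
using System;
using System.Windows.Forms;

public static class ClipboardHtml
{
    // Hypothetical helper: selects everything in the loaded page, copies it,
    // and returns the HTML fragment found on the clipboard (or null).
    // Assumes it runs on the STA UI thread after the page has loaded.
    public static string CopyRenderedHtml(WebBrowser browser)
    {
        browser.Document.ExecCommand("SelectAll", false, null); // step 1
        browser.Document.ExecCommand("Copy", false, null);      // step 2
        string raw = Clipboard.GetText(TextDataFormat.Html);    // step 3
        return string.IsNullOrEmpty(raw) ? null : ExtractFragment(raw);
    }

    // The clipboard "HTML Format" wraps the copied markup between
    // <!--StartFragment--> and <!--EndFragment--> comments.
    public static string ExtractFragment(string clipboardHtml)
    {
        const string startTag = "<!--StartFragment-->";
        const string endTag = "<!--EndFragment-->";
        int start = clipboardHtml.IndexOf(startTag, StringComparison.Ordinal);
        int end = clipboardHtml.LastIndexOf(endTag, StringComparison.Ordinal);
        if (start < 0 || end < start)
            return clipboardHtml; // markers missing: return the text unchanged
        start += startTag.Length;
        return clipboardHtml.Substring(start, end - start);
    }
}
```

Note that this approach overwrites whatever the user had on the clipboard, and the clipboard holds the copied selection rather than a complete document, so it is best treated as a last resort.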

Using mshtml.HTMLDocument Interface:

  1. Create an instance of mshtml.HTMLDocument and pass it the downloaded HTML using write.
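  2. Check for changes by periodically taking a snapshot of the document's HTML (documentElement.outerHTML in the example code) and, when the page is hosted in a WebBrowser control, by checking the control's IsBusy property.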
  3. Once the IsBusy property becomes false and there are no changes in the HTML snapshot, consider the page to be fully rendered and retrieve the HTML.
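
The polling rule in steps 2 and 3 can be isolated in a small helper. The sketch below is an illustration only: HtmlPolling and WaitUntilStableAsync are hypothetical names, and the two delegates stand in for whatever supplies the HTML snapshot and the busy flag in a real application.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class HtmlPolling
{
    // Hypothetical helper: polls until the page reports "not busy" and two
    // consecutive snapshots are identical, then returns the stable snapshot.
    public static async Task<string> WaitUntilStableAsync(
        Func<string> snapshot,   // e.g. () => doc.documentElement.outerHTML
        Func<bool> isBusy,       // e.g. () => webBrowser.IsBusy
        TimeSpan interval,
        CancellationToken token)
    {
        string previous = snapshot();
        while (true)
        {
            // Throws OperationCanceledException if the token is cancelled
            await Task.Delay(interval, token);

            if (isBusy())
                continue; // still loading: keep waiting

            string current = snapshot();
            if (current == previous)
                return current; // unchanged and idle: treat as rendered

            previous = current;
        }
    }
}
```

Passing the snapshot and busy check as delegates keeps the loop independent of any particular browser type. A page that changes at intervals longer than the polling interval can still be reported as stable too early, so the interval is a trade-off between speed and reliability.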

Additional Considerations:

  • Ensure HTML5 rendering is enabled using Browser Feature Control (the FEATURE_BROWSER_EMULATION registry setting).
  • Use a timeout so that polling stops if the page never stops changing.
  • Async/await can simplify the implementation of the polling logic.
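
A minimal sketch of the first two points, assuming a Windows desktop application with registry access. BrowserSetup, EnableModernRendering and CreateTimeout are hypothetical names, and the value 11000 (IE11 document mode) is one possible choice rather than the article's stated setting.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Threading;
using Microsoft.Win32;

public static class BrowserSetup
{
    // Registry path of the per-user browser emulation feature control.
    const string EmulationKey =
        @"Software\Microsoft\Internet Explorer\Main\FeatureControl\FEATURE_BROWSER_EMULATION";

    // Opts the current executable into IE11 document mode (value 11000).
    // Assumption: call this before the first WebBrowser control is created.
    public static void EnableModernRendering()
    {
        string exe = Path.GetFileName(Process.GetCurrentProcess().MainModule.FileName);
        using (RegistryKey key = Registry.CurrentUser.CreateSubKey(EmulationKey))
            key.SetValue(exe, 11000, RegistryValueKind.DWord);
    }

    // Returns a token source that cancels itself once the timeout elapses;
    // pass its Token to the polling method so the loop cannot run forever.
    public static CancellationTokenSource CreateTimeout(TimeSpan timeout)
    {
        return new CancellationTokenSource(timeout);
    }
}
```

The token from CreateTimeout(TimeSpan.FromSeconds(10)) could then be passed as the token argument of LoadDynamicPage, so that Task.Delay throws and ends the loop once ten seconds have elapsed.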

Example Code:

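<code class="C#">using System;
using System.Net;
using System.Threading;
using System.Threading.Tasks;
using mshtml; // requires a reference to the Microsoft.mshtml interop assembly

public static class DynamicPageLoader
{
    // Assumes the caller runs on an STA thread with a SynchronizationContext
    // (e.g. a WinForms UI thread), so continuations resume on the same thread.
    public static async Task<string> LoadDynamicPage(string url, CancellationToken token)
    {
        // Download the raw markup without blocking the calling thread
        string source;
        using (var client = new WebClient())
            source = await client.DownloadStringTaskAsync(url);

        // Feed the markup to an MSHTML document; close() ends the write stream
        var doc = new HTMLDocument();
        var writer = (IHTMLDocument2)doc;
        writer.write(source);
        writer.close();

        // Poll the HTML snapshot until two consecutive snapshots match
        var html = doc.documentElement.outerHTML;
        while (true)
        {
            // Throws OperationCanceledException when the token is cancelled
            await Task.Delay(500, token);
            var htmlNow = doc.documentElement.outerHTML;
            if (html == htmlNow)
                break; // no change since the last snapshot

            html = htmlNow;
        }

        return html;
    }
}</code>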