Download the Bitcoin price using Html Agility Pack in C#
P粉156532706 2023-09-05 17:17:03

I need to get the Bitcoin price from https://coinmarketcap.com/currencies/bitcoin/ using Html Agility Pack. I'm using this example, and it works fine:

    var html = @"http://html-agility-pack.net/";
    HtmlWeb web = new HtmlWeb();
    var htmlDoc = web.Load(html);
    var node = htmlDoc.DocumentNode.SelectSingleNode("//head/title");
    Console.WriteLine("Node Name: " + node.Name + "\n" + node.OuterHtml);

The XPath is: //*[@id="__next"]/div/div[1]/div[2]/div/div[1]/div[2]/div/div[2]/div[1]/div

HTML code:

    <div class="priceValue "><span>$17,162.42</span></div>

I tried the following code, but it fails with "Object reference not set to an instance of an object":

    var html = @"https://coinmarketcap.com/currencies/bitcoin/";
    HtmlWeb web = new HtmlWeb();
    var htmlDoc = web.Load(html);
    var node = htmlDoc.DocumentNode.SelectSingleNode("//div[@class='priceValue']/span");
    Console.WriteLine("Node Name: " + node.Name + "\n" + node.InnerText);


All replies (1)
P粉729518806

TLDR:

  1. You need to tell HtmlWeb to decompress the response (or use a suitable HTTP client)
  2. You need to fix the XPath selector

Apparently, the SelectSingleNode() call returns null because it cannot find the node.

In cases like this, it helps to inspect the HTML that was actually loaded. You can do that by reading the value of htmlDoc.DocumentNode.InnerHtml. I tried this, and the resulting "HTML" was gibberish.
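The inspection step can be sketched roughly like this. HtmlPreview and Preview are hypothetical names chosen for illustration, and the 500-character limit is an arbitrary assumption; the only Html Agility Pack member used is DocumentNode.InnerHtml.

```csharp
using System;
using HtmlAgilityPack;

public static class HtmlPreview
{
    // Hypothetical helper: returns at most maxChars characters of the
    // loaded markup, so a huge page does not flood the console.
    public static string Preview(HtmlDocument doc, int maxChars = 500)
    {
        string html = doc.DocumentNode.InnerHtml;
        return html.Length <= maxChars ? html : html.Substring(0, maxChars);
    }
}
```

Calling Console.WriteLine(HtmlPreview.Preview(htmlDoc)) right after web.Load(...) should make it obvious whether you received readable markup or compressed bytes.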

The reason is that HtmlWeb does not decompress the response it receives by default. See this GitHub issue for details. If you used a proper HTTP client (like this one), or if the HtmlAgilityPack developers were more proactive, I don't think you would have run into this problem.
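For the "proper HTTP client" route, here is a minimal sketch that assumes .NET's built-in HttpClient is an acceptable choice (the client linked above may well be a different one). PriceFetcher, Parse and LoadAsync are made-up names for illustration. The idea is to let the handler decompress the response and then hand the resulting string to HtmlDocument.LoadHtml().

```csharp
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;
using HtmlAgilityPack;

public static class PriceFetcher
{
    // One shared client whose handler transparently decompresses
    // gzip- and deflate-encoded responses.
    private static readonly HttpClient Client = new HttpClient(new HttpClientHandler
    {
        AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate
    });

    // Parses an HTML string into an Html Agility Pack document.
    public static HtmlDocument Parse(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        return doc;
    }

    // Downloads the page body as text, then parses it.
    public static async Task<HtmlDocument> LoadAsync(string url)
    {
        string body = await Client.GetStringAsync(url);
        return Parse(body);
    }
}
```

You would call it as HtmlDocument doc = await PriceFetcher.LoadAsync("https://coinmarketcap.com/currencies/bitcoin/"); and then run the XPath query on doc as before. I have not verified what the site returns to this client, so treat it as a starting point only.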

If you insist on using HtmlWeb, your code should look like this:

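    // Requires: using System.Net; using HtmlAgilityPack;
    const string html = @"https://coinmarketcap.com/currencies/bitcoin/";
    // Ask HtmlWeb to decompress gzip-encoded responses.
    var web = new HtmlWeb { AutomaticDecompression = DecompressionMethods.GZip };
    HtmlDocument doc = web.Load(html);
    // Note the trailing space in the class name.
    HtmlNode node = doc.DocumentNode.SelectSingleNode("//div[@class='priceValue ']/span");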

Please note that the class of the element you are looking for is actually "priceValue " (with a space character at the end), and there is another div on the page with the class "priceValue". That's a separate problem, though, and you should eventually be able to work out a more robust selector. Maybe try this:

    HtmlNode node = doc.DocumentNode.SelectSingleNode("//div[contains(@class, 'priceSection')]//div[contains(@class, 'priceValue')]/span");
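Whichever selector you end up with, SelectSingleNode() returns null when nothing matches, which is exactly what caused the exception in the question. A small guard along these lines avoids it. PriceReader and TryGetPriceText are hypothetical names, and the XPath is simply the suggestion above; the page markup may have changed since.

```csharp
using HtmlAgilityPack;

public static class PriceReader
{
    // XPath suggested above; an assumption about the page's markup.
    private const string PriceXPath =
        "//div[contains(@class, 'priceSection')]//div[contains(@class, 'priceValue')]/span";

    // Returns the trimmed price text, or null when the node is not found,
    // instead of throwing a NullReferenceException.
    public static string TryGetPriceText(HtmlDocument doc)
    {
        HtmlNode node = doc.DocumentNode.SelectSingleNode(PriceXPath);
        return node == null ? null : node.InnerText.Trim();
    }
}
```

A null result then tells you the selector (or the download) needs another look, rather than crashing on node.InnerText.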