I need to get Bitcoin price from https://coinmarketcap.com/currencies/bitcoin/ using Html Agility Pack. I'm using this example and it works fine:
    var html = @"http://html-agility-pack.net/";
    HtmlWeb web = new HtmlWeb();
    var htmlDoc = web.Load(html);
    var node = htmlDoc.DocumentNode.SelectSingleNode("//head/title");
    Console.WriteLine("Node Name: " + node.Name + "\n" + node.OuterHtml);
XPath is: //*[@id="__next"]/div/div[1]/div[2]/div/div[1]/div[2]/div/div[2]/div[1]/div
HTML code:
    <div class="priceValue "><span>$17,162.42</span></div>
I tried the following code, but it throws "Object reference not set to an instance of an object":
    var html = @"https://coinmarketcap.com/currencies/bitcoin/";
    HtmlWeb web = new HtmlWeb();
    var htmlDoc = web.Load(html);
    var node = htmlDoc.DocumentNode.SelectSingleNode("//div[@class='priceValue']/span");
    Console.WriteLine("Node Name: " + node.Name + "\n" + node.InnerText);
TLDR: Tell HtmlWeb to decompress the response (or use a suitable HTTP client).

Apparently, the SelectSingleNode() call returns null because it cannot find the node.

In this case, it is helpful to inspect the loaded HTML. You can do this by getting the value of htmlDoc.DocumentNode.InnerHtml. I've tried doing this and the "HTML" generated is meaningless.

The reason is that HtmlWeb does not decompress the response it receives by default. See this GitHub issue for details. If you used a proper HTTP client (like this), or if the HtmlAgilityPack developers were more proactive, I don't think you would run into this problem.

If you insist on using HtmlWeb, your code has to turn decompression on explicitly.

Please note that the class of the element you are looking for is actually "priceValue " (with a space character at the end); there is another div on the page with class "priceValue". That's another question, though, and you should eventually be able to find a more robust selector.
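
Here is a minimal sketch of both fixes together, not a definitive implementation. It assumes your HtmlAgilityPack version exposes the HtmlWeb.PreRequest hook and that the request it receives is an HttpWebRequest with an AutomaticDecompression property. The class name BitcoinPriceSketch and the helper ExtractPrice are made up for illustration. The selector simply matches the class with its trailing space, so it is brittle and will break whenever the site changes its markup.

```csharp
using System;
using System.Net;
using HtmlAgilityPack;

public static class BitcoinPriceSketch
{
    // Hypothetical helper: returns the price text, or null if the node is missing.
    public static string ExtractPrice(HtmlDocument doc)
    {
        // Exact match on the class attribute, including the trailing space.
        // Assumption: the page still uses this markup.
        var node = doc.DocumentNode.SelectSingleNode("//div[@class='priceValue ']/span");
        return node?.InnerText.Trim();
    }

    public static void Main()
    {
        const string url = "https://coinmarketcap.com/currencies/bitcoin/";

        var web = new HtmlWeb
        {
            // PreRequest runs before the request is sent; returning true lets it proceed.
            PreRequest = request =>
            {
                // Ask the HTTP stack to transparently decompress gzip/deflate responses.
                request.AutomaticDecompression =
                    DecompressionMethods.GZip | DecompressionMethods.Deflate;
                return true;
            }
        };

        var htmlDoc = web.Load(url);
        var price = ExtractPrice(htmlDoc);

        // Guard against null instead of dereferencing the node directly.
        Console.WriteLine(price ?? "Price node not found; inspect htmlDoc.DocumentNode.InnerHtml.");
    }
}
```

Checking for null before using the result avoids the "Object reference not set to an instance of an object" error and tells you when the selector no longer matches.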