List of HTML elements from a string
-
Before I start reinventing the wheel and come out with some Codethulu monstrosity, is there a standard way, or a built-in standard library way, to do this (in .NET)?
Given a string that's HTML formatted (possibly including some text outside of tags), can I break it down into a collection of the top-level tag elements?
So something like:
Some text<br/> <table> <tr> <th>Firstname</th> <th>Lastname</th> <th>Age</th> </tr> <tr> <td>Jill</td> <td>Smith</td> <td>50</td> </tr> <tr> <td>Eve</td> <td>Jackson</td> <td>94</td> </tr> </table> <span>some more text</span>
I'd like a collection of 4 strings: the plain text, the <br/>, the table and the span. Is it possible without going a decent amount down the path of creating an HTML parser?
-
@jaloopa don't reinvent the wheel. Just run it through an HTML parser and loop through the top-level nodes.
https://code.msdn.microsoft.com/How-to-parse-html-in-NET-2660026c
-
@anotherusername said in List of HTML elements from a string:
https://code.msdn.microsoft.com/How-to-parse-html-in-NET-2660026c
Did they really need to make a 300KB-zipped full ASP.Net solution for demonstrating what basically amounts to:
using HtmlAgilityPack;
// ...
var doc = new HtmlDocument();
doc.LoadHtml(htmlString);
// ...error checking left as an exercise...
var childNodes = doc.DocumentNode.ChildNodes;
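For the original question (a collection of strings, one per top-level piece), the child nodes still need a little filtering, since the whitespace between tags shows up as text nodes of its own. A rough sketch, assuming the HtmlAgilityPack NuGet package is referenced; the class and method names (`TopLevelHtml`, `Split`) are made up for illustration, and dropping comments and whitespace-only text is my assumption about what's wanted:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack; // third-party NuGet package, not part of the .NET base class library

// Hypothetical helper, not part of HtmlAgilityPack itself.
public static class TopLevelHtml
{
    // Returns the markup of each top-level node in the given HTML fragment.
    public static List<string> Split(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        return doc.DocumentNode.ChildNodes
            // Assumption: comments are not wanted in the result.
            .Where(n => n.NodeType != HtmlNodeType.Comment)
            // Skip text nodes that only hold the whitespace between tags.
            .Where(n => n.NodeType != HtmlNodeType.Text
                        || !string.IsNullOrWhiteSpace(n.InnerText))
            // OuterHtml is the node's own markup (for text nodes, the text itself).
            .Select(n => n.OuterHtml.Trim())
            .ToList();
    }
}
```

Calling `TopLevelHtml.Split(htmlString)` on the sample from the first post should give four entries: the plain text, the line break, the table and the span. Parse problems, if any, are collected in `doc.ParseErrors` rather than thrown, so check that if malformed input matters.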
-
@zecc *shrug* I don't know, all I know is that "terget" made me cringe but I was too lazy to find anything else.