>>105846510
Any actual HTML parsing library. It doesn't need to be a full-featured one that would weigh several megabytes as the other anon suggested.
The problem with using regular expressions for HTML is that HTML is not a regular language; its nested tag structure needs at least a context-free grammar. A regular expression cannot, for instance, tell which opening tag a closing tag belongs to, or whether tags are evenly matched at arbitrary nesting depth. That sharply limits what you can reliably pull out of an HTML document with regular expressions alone.
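To make the matching problem concrete, here is a rough sketch using Python's standard-library `html.parser`, which is about as lightweight as a parser gets. The class name `TagBalanceChecker` and the helper `is_balanced` are made up for illustration. The check is also stricter than real HTML, which lets you omit some end tags such as `</p>` and `</li>`. The point is only that matching a close tag to its open tag takes a stack, which is exactly what a regular expression doesn't have.

```python
from html.parser import HTMLParser

# Void elements never take a closing tag, so they must stay off the stack.
VOID_ELEMENTS = {
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "source", "track", "wbr",
}


class TagBalanceChecker(HTMLParser):
    """Hypothetical helper: tracks open tags on a stack, records mismatches."""

    def __init__(self):
        super().__init__()
        self.stack = []   # currently open tags, innermost last
        self.errors = []  # close tags that did not match the innermost open tag

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_ELEMENTS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_ELEMENTS:
            return
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
        else:
            self.errors.append(tag)


def is_balanced(html):
    """Return True if every non-void tag is closed in strict nesting order."""
    checker = TagBalanceChecker()
    checker.feed(html)
    checker.close()
    return not checker.stack and not checker.errors


if __name__ == "__main__":
    print(is_balanced("<div><p>hi</p></div>"))  # properly nested
    print(is_balanced("<div><div>a</div>"))     # outer div never closed
    print(is_balanced("<b><i>x</b></i>"))       # tags closed in the wrong order
```

The stack can grow as deep as the document nests. A regex has no unbounded memory like that, so it can only handle nesting up to whatever fixed depth you hardcode into the pattern.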