C# - How to extract the form tag using HtmlAgilityPack?
I'm using HtmlAgilityPack in one of my C# projects for scraping. I need to scrape the <form> tag from a web page. I've searched for how to extract the form tag using HtmlAgilityPack but couldn't find an answer. Can anyone tell me how to extract the <form> tag using HtmlAgilityPack?
private void Testing()
{
    var getHtmlWeb = new HtmlWeb();
    var document = getHtmlWeb.Load(@"http://localhost/final_project/index.php");
    HtmlNode.ElementsFlags.Remove("form");
    var aTags = document.DocumentNode.SelectNodes("//form");
    int counter = 1;
    StringBuilder buffer = new StringBuilder();
    if (aTags != null)
    {
        foreach (var aTag in aTags)
        {
            buffer.Append(counter + ". " + aTag.InnerHtml + " - " + "\t" + "<br />");
            counter++;
        }
    }
}
Here is my code sample. I'm scraping a page on localhost. The count of aTags is 1 because there is 1 form on the page. But when I use the StringBuilder object, it doesn't contain the InnerHtml of the form. Where's the error? :(
Here is the HTML source of the page I want to scrape the form from:
<!doctype html>
<html>
<head>
    <!-- stylesheet section -->
    <link rel="stylesheet" type="text/css" media="all" href="./_include/style.css">
    <!-- title of page -->
    <title>login</title>
    <!-- php section -->
    <!-- creating connection database-->
    <!-- end of php section -->
</head>
<body>
    <!-- we'll check error variable print warning -->
    <!-- we'll submit data same page avoid excessive pages -->
    <form action="/final_project/index.php" method="post">
        <!-- ============================== fieldset 1 ============================== -->
        <fieldset>
            <legend>log in credentials:</legend>
            <hr class="hrzntlrow" />
            <label for="input-one"><strong>user name:</strong></label><br />
            <input autofocus name="username" type="text" size="20" id="input-one" class="text" placeholder="user name" required /><br />
            <label for="input-two"><strong>password:</strong></label><br />
            <input name="password" type="password" size="20" id="input-two" class="text" placeholder="password" required />
        </fieldset>
        <!-- ============================== fieldset 1 end ============================== -->
        <p><input type="submit" alt="submit" name="submit" value="submit" class="submit-text" /></p>
    </form>
</body>
</html>
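Since form tags are allowed to overlap, HAP handles them differently. To treat form tags like any other element, remove the form flag by calling:
HtmlAgilityPack.HtmlNode.ElementsFlags.Remove("form");
Make this call before loading the document, because the flags are applied while the HTML is parsed. In the question's code it runs after Load, so the form has already been parsed as an empty node. Once the flag is removed before parsing, form tags are handled as you expect, and you can work with them the way you work with other tags.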
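As a minimal sketch of that fix, the snippet below removes the form flag before parsing and then collects the InnerHtml of each form. It assumes the flag has to be removed before the document is parsed. It also parses an inline HTML string with HtmlDocument.LoadHtml rather than fetching the localhost URL, so it can run without a web server. The class name FormExtractor, the method ExtractForms, and the sample markup are hypothetical and not part of the original post.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack;

public static class FormExtractor
{
    // Hypothetical helper: returns the InnerHtml of every <form> in the given markup.
    public static List<string> ExtractForms(string html)
    {
        // Remove the special handling for <form> BEFORE parsing,
        // on the assumption that element flags are consulted while the document is parsed.
        HtmlNode.ElementsFlags.Remove("form");

        var document = new HtmlDocument();
        document.LoadHtml(html);

        var forms = new List<string>();
        var nodes = document.DocumentNode.SelectNodes("//form");
        if (nodes == null)
        {
            return forms; // SelectNodes returns null when nothing matches
        }

        foreach (var node in nodes)
        {
            forms.Add(node.InnerHtml);
        }
        return forms;
    }

    public static void Main()
    {
        // Illustrative markup only, loosely based on the login form in the question.
        const string html =
            "<html><body><form action=\"/final_project/index.php\" method=\"post\">" +
            "<input name=\"username\" type=\"text\" /></form></body></html>";

        int counter = 1;
        foreach (var form in ExtractForms(html))
        {
            Console.WriteLine(counter + ". " + form);
            counter++;
        }
    }
}
```

With HtmlWeb the same ordering applies: call HtmlNode.ElementsFlags.Remove("form") first, then Load the page.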