Using Html Agility Pack, I need to scrape the InnerText of every dd element that sits between two h2 elements (in this case, between the h2 headings "Applicant" and "Agent"). How can this be done?
The following is a fragment of the HTML I need to scrape:
<!-- Applicants section -->
<h2 class="GridTitle">Applicant</h2>
<h3 class="DataTitle">1</h3>
<dl class="Grid LeftCol">
<dt>Name:</dt>
<dd>Some name here</dd>
<dt>Legal Form:</dt>
<dd></dd>
<dt>From:</dt>
<dd>06/08/2020</dd>
</dl>
<dl class="Grid RightCol">
<dt>Address:</dt>
<dd>Some address here</dd>
<dt>To:</dt>
<dd></dd>
</dl>
<h3 class="DataTitle">2</h3>
<dl class="Grid LeftCol">
<dt>Name:</dt>
<dd>Some name here1</dd>
<dt>Legal Form:</dt>
<dd></dd>
<dt>From:</dt>
<dd>04/08/2010</dd>
</dl>
<dl class="Grid RightCol">
<dt>Address:</dt>
<dd>Some address here1</dd>
<dt>To:</dt>
<dd>06/08/2020</dd>
</dl>
<!-- Agents section -->
<h2 class="GridTitle">Agent</h2>
This is what I have tried, but it only picks up the first //dd above //h2 (Agent):
var h2Tags = doc.DocumentNode.SelectNodes("//h2[text() = 'Applicant']");
var h2Tags1 = doc.DocumentNode.SelectNodes("//h2[text() = 'Agent']");
var lineNum = h2Tags[0].Line;
var lineNum1 = h2Tags1[0].Line;
var Applicants = doc.DocumentNode.SelectNodes("//dd").Where(x => x.Line > lineNum).Where(x => x.Line < lineNum1);
foreach (HtmlNode g in Applicants)
{
    TMOwner = g.InnerText;
}
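One thing visible in the attempt is that TMOwner is overwritten on every iteration, so only the last matching dd survives the loop. Comparing Line values can also be fragile if several tags share a line. A possible alternative, sketched below, is to walk the siblings that follow the "Applicant" h2 and stop at the next h2. This is only a sketch under assumptions: the h2, h3 and dl elements are siblings under the same parent, as in the fragment above, and the names SectionScraper and GetDdTexts are hypothetical helpers, not part of Html Agility Pack.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper, not part of Html Agility Pack.
public static class SectionScraper
{
    // Returns the text of every <dd> found between the <h2> whose text is
    // startTitle and the next <h2> sibling (or the end of the parent if none).
    // Assumes the <h2> and the <dl> blocks share the same parent node.
    public static List<string> GetDdTexts(HtmlDocument doc, string startTitle)
    {
        var results = new List<string>();

        // Find the heading that opens the section; Trim() tolerates stray whitespace.
        var start = doc.DocumentNode
            .Descendants("h2")
            .FirstOrDefault(h => h.InnerText.Trim() == startTitle);
        if (start == null)
            return results; // heading not found, nothing to collect

        // Walk forward through the siblings until the next <h2> is reached.
        for (var node = start.NextSibling; node != null; node = node.NextSibling)
        {
            if (node.Name == "h2")
                break;

            // Text and comment siblings contain no <dd>, so they add nothing.
            foreach (var dd in node.DescendantsAndSelf("dd"))
                results.Add(HtmlEntity.DeEntitize(dd.InnerText).Trim());
        }

        return results;
    }
}
```

Calling SectionScraper.GetDdTexts(doc, "Applicant") on the fragment above should collect all ten dd values of the Applicant section in document order, with empty dd elements coming back as empty strings. If those are not wanted, they can be dropped afterwards with .Where(t => t.Length > 0).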