ruby - Parsing HTML from a defined start point to a defined end point?


I have this HTML:

<hr noshade> <p><a href="#1">some text here</a></p> <p style="margin-top:0pt;margin-bottom:0pt;line-height:120%;"><span style="color:#000000;font-weight:bold;">this description</span></p> <hr noshade> <!-- <hr noshade> delimiter me --> <p><a href="#2">some more text here</a></p> <p style="margin-top:0pt;margin-bottom:0pt;line-height:120%;"><span style="color:#000000;font-weight:bold;">this description more text</span></p> <hr noshade> 

While parsing with Nokogiri, I want to print the information between each set of tags separated by my own delimiter, <hr noshade>. So the first block should print the information from the "p" tags that lie between the first 2 hr noshade tags, and so on.

I'm using the accepted answer to "xpath select elements between 2 specific elements".

I have a semi-satisfactory solution.

You can use this XPath expression:

.//hr[1][@noshade]/following-sibling::*[not(self::hr[@noshade])][count(preceding-sibling::hr[@noshade])=1]

for the first group, between <hr noshade> 1 and 2,

then,

.//hr[2][@noshade]/following-sibling::*[not(self::hr[@noshade])][count(preceding-sibling::hr[@noshade])=2]

for the elements between <hr noshade> 2 and 3, etc.

What these expressions select:

  1. all following siblings of the <hr noshade> at the specified position n
  2. that have n <hr noshade> preceding siblings, i.e. that are positioned in the n-th group
  3. and that are not <hr noshade> themselves

As this selects several elements between 2 <hr noshade>, you may have to loop over the results and extract the data from each sibling element.

Does anyone have a more generic solution?

