XPATH (Scrapy): select text between 2 certain keywords

Question

I'm trying to extract the text between two keywords, 商品詳細 and 支払詳細, in this HTML:

        <TR>
            <TD BGCOLOR=#336600><BR></TD>
            <TD COLSPAN=3 BGCOLOR=#FFFFCC><FONT COLOR=#336600 SIZE=4><B>　商品詳細 </B></FONT></TD>
        </TR>
        <TR>
            <TD COLSPAN=4 HEIGHT=10>
                <LI STYLE=><SPAN STYLE=>鍵付きで盗難を防止できます。</SPAN>
                <LI STYLE=><SPAN STYLE=>商品サイズ：約28*36*12cm</SPAN>
                <LI STYLE=><SPAN STYLE=>素材：鉄製</SPAN>
                <LI STYLE=><SPAN STYLE=>※柄は、ランダムにて発送なります</SPAN>
                <LI STYLE=><SPAN STYLE=></SPAN>
                <LI STYLE=>
                    <SPAN STYLE=></SPAN>
            </TD>
        </TR>
        <TR>
            <TD><BR></TD>
            <TD COLSPAN=2 ALIGN=left><BR></TD>
            <TD><BR></TD>
        </TR>
        <TR>
            <TD COLSPAN=4 HEIGHT=25><BR></TD>
        </TR>
        <TR>
            <TD BGCOLOR=#336600><BR></TD>
            <TD COLSPAN=3 BGCOLOR=#FFFFCC>
                <FONT COLOR=#336600 SIZE=4><B>　支払詳細 </B></FONT>
            </TD>
        </TR>

I tried the solutions in these two links, but they didn't work for me:

Scrapy xpath between 2 keywords

Xpath text extraction between 2 keywords

This is the result I get when I run it in the Scrapy shell:

In [21]: response.xpath("//text()[preceding-sibling::*[text()='商品詳細'] and following-sibling::*[text()='支払詳細']]").extract()
Out[21]: []

Granitosaurus · Accepted Answer · 2017-04-25T07:20:21

With XPath you can navigate the document in any direction, so in this case you want to find a key node that you know something about (here, the td holding the keyword) and navigate from it to the related nodes.

//td[contains(.//text(),'商品詳')]            # find the td that contains the keyword text
/../following-sibling::tr//li/span/text()   # go up to its parent tr, then take the li/span text in the following sibling rows

I've tried this in a shell:

>[1]: sel.xpath("//td[contains(.//text(),'商品詳')]/../following-sibling::tr//li/span/text()").extract()
<[1]: ['鍵付きで盗難を防止できます。', '商品サイズ：約28*36*12cm', '素材：鉄製', '※柄は、ランダムにて発送なります']
