python - XPath: get data if a condition is satisfied in Scrapy


I am using Scrapy to extract data. There are thousands of products to scrape, and the problem is that the data on these pages is not consistent, i.e.:

<table class="c999 fs12 mt10 f-bold">
    <tbody>
        <tr>
            <td width="16%">type</td>
            <td class="c222">kurta</td>
        </tr>
        <tr>
            <td>fabric</td>
            <td class="c222">cotton</td>
        </tr>
        <tr>
            <td>sleeves</td>
            <td class="c222">3/4th sleeves</td>
        </tr>
        <tr>
            <td>neck</td>
            <td class="c222">mandarin collar</td>
        </tr>
        <tr>
            <td>wash care</td>
            <td class="c222">gentle wash</td>
        </tr>
        <tr>
            <td>fit</td>
            <td class="c222">regular</td>
        </tr>
        <tr>
            <td>length</td>
            <td class="c222">knee length</td>
        </tr>
        <tr>
            <td>color</td>
            <td class="c222">brown</td>
        </tr>
        <tr>
            <td>fabric details</td>
            <td class="c222">cotton</td>
        </tr>
        <tr>
            <td>
                style
            </td>
            <td class="c222"> printed</td>
        </tr>
        <tr>
            <td>
                sku
            </td>
            <td id="qa-sku" class="c222"> sr227wa70rojindfas</td>
        </tr>
        <tr>
            <td></td>
        </tr>
    </tbody>
</table>

So these rows are not consistent: "type" is sometimes at the first position and sometimes at the second. I wrote code to loop through the rows and compare the value of the first td; if it is "type", I want the value of the corresponding td. It is not working. Here is the code:

table_data = response.xpath('//*[@id="productinfo"]/table/tr')
for data in table_data:
    name = data.xpath('td/text()').extract()

What should I do?

You can try using the following XPath:

name = data.xpath("td[position()=(count(../../tr/td[.='type']/preceding-sibling::td)+1)]/text()").extract() 

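The above XPath filters <td> elements by position, returning the <td> whose position equals the position of <td>type</td>. The position of <td>type</td> is obtained by counting its preceding sibling <td> elements and adding one. For the table shown, <td>type</td> has no preceding sibling, so the count is 0 and the expression selects the first <td> of each row.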
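If every row is a label cell followed by a value cell, as in the sample table, another option is to match on the label text and take the cell next to it, so the order of the rows stops mattering. Below is a minimal sketch of that idea, not the method from the answer above. It uses `xml.etree.ElementTree` from the standard library so it runs without Scrapy; `SAMPLE` and `table_to_dict` are hypothetical names, and the markup is a trimmed, reordered copy of the table in the question. Real product pages are often not well-formed XML, so in a spider you would likely keep Scrapy's selectors and use something like `response.xpath("//tr[normalize-space(td[1])='type']/td[2]/text()")` instead.

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed copy of the question's table (hypothetical sample).
# The "type" row is deliberately not first, to mimic the inconsistent pages.
SAMPLE = """
<table class="c999 fs12 mt10 f-bold">
  <tbody>
    <tr><td>fabric</td><td class="c222">cotton</td></tr>
    <tr><td width="16%">type</td><td class="c222">kurta</td></tr>
    <tr><td> sku </td><td id="qa-sku" class="c222"> sr227wa70rojindfas</td></tr>
    <tr><td></td></tr>
  </tbody>
</table>
"""


def table_to_dict(table):
    """Map each row's label cell to its value cell, ignoring row order."""
    specs = {}
    for row in table.iter("tr"):
        cells = row.findall("td")
        if len(cells) < 2:
            continue  # skip filler rows such as <tr><td></td></tr>
        label = (cells[0].text or "").strip()
        value = (cells[1].text or "").strip()
        if label:
            specs[label] = value
    return specs


specs = table_to_dict(ET.fromstring(SAMPLE.strip()))
print(specs.get("type"))
```

Building a dictionary once per page means "type", "fabric", "sku" and the rest can all be read by name, and a missing label simply yields `None` from `specs.get(...)`.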

