
How to combine XPath and regex in the same string for field items?

Is there a way to combine an XPath search string with a regular expression in the same string? From the Scrapy manual, I know that you can chain xpath and re calls on a selector. In my case, a JSON object inside a script tag may or may not contain the item I need.

At present, I'm loading the JSON object and iterating over that, then iterating over the XPath strings. This works but seems kludgy, with two for loops.
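
For context, here is a minimal sketch of the two ways of pulling the name out of the script body, using only the standard library. The sample JSON and the helper names `name_via_json` and `name_via_regex` are made up for illustration; in the spider the text would come from the response instead.

```python
import json
import re

# Hypothetical contents of a <script type="application/ld+json"> tag.
SCRIPT_TEXT = '{"@type": "Organization", "name": "Example Co", "url": "https://example.com"}'

# Same pattern as in the field table in this post.
NAME_RE = re.compile(r'(?:"name":\s+")(.*?)(?:")')


def name_via_json(script_text):
    """Current approach: parse the JSON, then look up the key (None if absent)."""
    try:
        data = json.loads(script_text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return None
    return data.get("name") if isinstance(data, dict) else None


def name_via_regex(script_text):
    """Desired approach: apply the regex directly to the raw script text."""
    match = NAME_RE.search(script_text)
    return match.group(1) if match else None


print(name_via_json(SCRIPT_TEXT))
print(name_via_regex(SCRIPT_TEXT))
```

Both helpers return None when the key is missing, which is the "may or may not have the item" case. Note that `\s+` requires at least one whitespace character after the colon, so compact JSON such as `{"name":"X"}` would not match the regex; `\s*` would accept both forms.
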
If I could include everything in company_item_fields, it would be more elegant:


company_item_fields = {
    'name': {
        # combining XPath with a regex on the JSON is what I would like to do
        # (not valid as written)
        '//script[@type="application/ld+json"].re(r"(?:"name":\s+")(.*?)(?:"))',

        # standard/working XPaths
        '//div[@class="profile-title"]//a/text()',
        '//div[@class="profile-full-name"]/text()'
    },
}


selector = Selector(response)
loader = ItemLoader(CompanyItem(), response=response)

for field, xpath in self.company_item_fields.items():
    loader.add_xpath(field, xpath)

yield loader.load_item()
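
One shape this could take, as a sketch only: pair each XPath with an optional regex in the table and pass the regex through the `re` keyword argument that `ItemLoader.add_xpath` accepts. The names `COMPANY_ITEM_FIELDS` and `add_fields` are hypothetical, and I have not run this against the real page. It also assumes the item's output processor picks the value I want when several sources match.

```python
# Each field maps to an ordered list of (xpath, regex) pairs.
# regex is None when the XPath result should be used as-is.
COMPANY_ITEM_FIELDS = {
    "name": [
        # Regex applied to the text of the JSON-LD script tag.
        ('//script[@type="application/ld+json"]/text()', r'"name":\s+"(.*?)"'),
        # Plain XPaths, no regex.
        ('//div[@class="profile-title"]//a/text()', None),
        ('//div[@class="profile-full-name"]/text()', None),
    ],
}


def add_fields(loader, fields):
    """Feed every (xpath, regex) pair to the loader in table order.

    `loader` is expected to behave like scrapy.loader.ItemLoader, i.e. to
    provide add_xpath(field_name, xpath, re=...).
    """
    for field, sources in fields.items():
        for xpath, regex in sources:
            loader.add_xpath(field, xpath, re=regex)
```

In the spider this would replace the loop above with `add_fields(loader, COMPANY_ITEM_FIELDS)` before `yield loader.load_item()`. If the script tag has no "name" key, the regex simply yields nothing and the plain XPaths still contribute.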
