Extract meta tags from a website using Portia (Scrapy)
I want to use Portia to extract the meta tags from a website, but it does not show the head tag; the element tree starts from the body tag only.
As a result, I am only able to extract data from the body tag.
You need to annotate an element within the body and then navigate to the element in the head that you want to map.
Navigate up to the html element. You will get a warning that you will lose any mapped attributes to the annotation; click OK.
Select the head element.
Select the desired element within head.
Click the + Field button to create a new field, and then map the desired attribute value to the target field.

You can use this for meta names:
meta_name = hxs.select('//meta/@name').extract()
and this for meta contents:
meta_content = hxs.select('//meta/@content').extract()
and this for content of a meta with a particular name like description:
meta = hxs.select("//meta[@name='description']/@content").extract()
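
One thing to watch with the first two expressions: //meta/@name and //meta/@content return two separate lists, and they will not line up if some meta tags have a content attribute but no name (for example http-equiv tags). Below is a minimal sketch of collecting name and content as pairs instead. It deliberately swaps in Python's standard-library html.parser rather than Scrapy, so it runs without a spider; MetaTagCollector, extract_meta and SAMPLE_HTML are hypothetical names made up for this illustration. In more recent Scrapy versions the same XPath expressions are normally run through response.xpath(...) with .getall() or .get() instead of hxs.select(...).extract().

```python
from html.parser import HTMLParser


class MetaTagCollector(HTMLParser):
    """Collect (name, content) pairs from <meta> tags (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.meta = []  # (name, content) tuples in document order

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        # Skip tags such as <meta charset="..."> that have no name attribute.
        if "name" in attr_map:
            self.meta.append((attr_map["name"], attr_map.get("content")))


def extract_meta(html):
    """Return a list of (name, content) pairs for every named meta tag."""
    parser = MetaTagCollector()
    parser.feed(html)
    parser.close()
    return parser.meta


# Made-up sample page, used only to exercise the helper.
SAMPLE_HTML = """
<html>
  <head>
    <meta charset="utf-8">
    <meta name="description" content="An example page">
    <meta name="keywords" content="scrapy, portia">
  </head>
  <body><p>Hello</p></body>
</html>
"""

meta = extract_meta(SAMPLE_HTML)
meta_names = [name for name, _ in meta]       # roughly //meta/@name
description = dict(meta).get("description")   # roughly //meta[@name='description']/@content
```

Keeping each name together with its content avoids having to zip two lists that may differ in length.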