4
votes

How do I extract each result's _source from the generator returned by the scan API in the elasticsearch-dsl Python client?

For example, I'm using the following (taken from the elasticsearch-dsl scan example):

for hit in s.scan():
    print(hit)

I get the following:

<Hit(beacon/INDEX/_Mwt9mABoXXeYV0uwSC-): {'client_number': '3570', 'cl...}>

How do I extract the dictionary from each hit the generator yields?

2 Answers

6
votes

Every Hit has a to_dict() method, so you can just call hit.to_dict():

for hit in s.scan():
    print(hit.to_dict())

Note: hit.to_dict() doesn't include the hit's meta info. You can get that from the meta object, i.e.:

hit_dict = hit.to_dict()
hit_dict['meta'] = hit.meta.to_dict()

3
votes

In addition to @ami-hollander's answer: .to_dict() does not convert meta info (the id, for example). If you need this info, you can do something like:

hit_dict = hit.to_dict()
hit_dict['meta'] = hit.meta.to_dict()
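
Putting the two snippets together, here is a minimal sketch of a helper that turns any iterable of hits into plain dicts. The name hits_to_dicts and the include_meta flag are my own, not part of elasticsearch-dsl, and the stand-in hit object only mimics the two calls used above (to_dict() and meta.to_dict()) so the example runs without a cluster:

```python
from types import SimpleNamespace


def hits_to_dicts(hits, include_meta=False):
    """Yield one plain dict per hit; optionally attach metadata under 'meta'."""
    for hit in hits:
        doc = hit.to_dict()  # the document fields, without meta info
        if include_meta:
            doc["meta"] = hit.meta.to_dict()  # meta info such as the id
        yield doc


# Hypothetical stand-in for a Hit, with made-up values, used only so this
# sketch is runnable on its own.
fake_hit = SimpleNamespace(
    to_dict=lambda: {"client_number": "3570"},
    meta=SimpleNamespace(to_dict=lambda: {"id": "abc", "index": "beacon"}),
)

records = list(hits_to_dicts([fake_hit], include_meta=True))
print(records)
```

With a real Search object you would pass s.scan() instead of the stand-in list, e.g. hits_to_dicts(s.scan(), include_meta=True). Because it is a generator, hits are still processed one at a time rather than loaded into memory all at once.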