We're using Django MarkupField to store Markdown text and it works quite well.
However, when we try to index these fields in Wagtail we get serialization errors from Elasticsearch, like this:
File "/usr/local/lib/python3.5/dist-packages/wagtail/wagtailsearch/management/commands/update_index.py", line 120, in handle
self.update_backend(backend_name, schema_only=options.get('schema_only', False))
File "/usr/local/lib/python3.5/dist-packages/wagtail/wagtailsearch/management/commands/update_index.py", line 87, in update_backend
index.add_items(model, chunk)
File "/usr/local/lib/python3.5/dist-packages/wagtail/wagtailsearch/backends/elasticsearch.py", line 579, in add_items
bulk(self.es, actions)
File "/usr/local/lib/python3.5/dist-packages/elasticsearch/helpers/__init__.py", line 195, in bulk
for ok, item in streaming_bulk(client, actions, **kwargs):
File "/usr/local/lib/python3.5/dist-packages/elasticsearch/helpers/__init__.py", line 162, in streaming_bulk
for bulk_actions in _chunk_actions(actions, chunk_size, max_chunk_bytes, client.transport.serializer):
File "/usr/local/lib/python3.5/dist-packages/elasticsearch/helpers/__init__.py", line 61, in _chunk_actions
data = serializer.dumps(data)
File "/usr/local/lib/python3.5/dist-packages/elasticsearch/serializer.py", line 50, in dumps
raise SerializationError(data, e)
elasticsearch.exceptions.SerializationError: ({'_partials': [<markupfield.fields.Markup object at 0x7faa6e238e80>, <markupfield.fields.Markup object at 0x7faa6dbc4da0>], 'pk': '1', 'research_interests': <markupfield.fields.Markup object at 0x7faa6e238e80>, 'bio': <markupfield.fields.Markup object at 0x7faa6dbc4da0>}, TypeError("Unable to serialize <markupfield.fields.Markup object at 0x7faa6e238e80> (type: <class 'markupfield.fields.Markup'>)",))
One workaround is to index callables that return field.raw
but then we'd have to write one such callable for each and every Markdown field property we have in our models. I thought we could get around this by extending the field property (i.e., the django-markupfield Markup
class that replaces the MarkupField
) with a get_searchable_content(value)
method but the serialization errors persist.
Does anyone have any tips for indexing custom Django fields in Wagtail + elasticsearch?