Search (searchv2/)
Module located in zds/searchv2/.
Documented files:
Models (models.py)
-
class zds.searchv2.models.AbstractESDjangoIndexable(*args, **kwargs)
Version of AbstractESIndexable for a Django object, with some improvements:
- Already includes pk in the mapping;
- Matches the ES _id field and pk;
- Overrides es_already_indexed to a database field;
- Defines an es_flagged field to restrict the number of objects to be indexed;
- Overrides save() to manage the field;
- Defines a get_es_django_indexable() method that can be overridden to change the queryset used to fetch objects.
-
classmethod get_es_django_indexable(force_reindexing=False)
Method that can be overridden to filter Django objects from the database based on any criterion.
Parameters: force_reindexing (bool) – force all objects to be returned, even if they may already be indexed.
Returns: query
Return type: django.db.models.query.QuerySet
-
classmethod get_es_indexable(force_reindexing=False)
Overrides get_es_indexable() in order to use Django querysets and batch objects.
Returns: a queryset
Return type: django.db.models.query.QuerySet
-
classmethod get_es_mapping()
Overridden to add pk to the mapping.
Returns: mapping object
Return type: elasticsearch_dsl.Mapping
-
save(*args, **kwargs)
Overrides the save() method to flag the object when it is saved (a save is assumed to mean the object was modified and therefore needs reindexing).
Note
Flagging can be prevented using save(es_flagged=False).
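The flagging behaviour described for save() can be illustrated with a minimal sketch. It uses a plain Python class rather than a Django model, so FlaggedOnSave and objects_to_index are hypothetical names, and filtering on es_flagged when force_reindexing is False is an assumption about how the flag is used.

```python
class FlaggedOnSave:
    """Plain-Python stand-in (not a Django model) for the flag-on-save idea."""

    def __init__(self, pk):
        self.pk = pk
        self.es_flagged = False  # True means "needs (re)indexing"

    def save(self, *args, **kwargs):
        # Pop the extra keyword so it would not be forwarded to a parent save().
        self.es_flagged = kwargs.pop("es_flagged", True)
        # A real Django model would call super().save(*args, **kwargs) here.


def objects_to_index(objects, force_reindexing=False):
    """Return every object when forced, otherwise only the flagged ones (assumption)."""
    if force_reindexing:
        return list(objects)
    return [obj for obj in objects if obj.es_flagged]


modified = FlaggedOnSave(pk=1)
modified.save()  # flagged by default

untouched = FlaggedOnSave(pk=2)
untouched.save(es_flagged=False)  # flagging explicitly prevented

pending = objects_to_index([modified, untouched])
```

Here pending contains only the object saved without es_flagged=False, while force_reindexing=True would return both.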
-
class zds.searchv2.models.AbstractESIndexable
Mixin for indexable objects.
Defines a number of functions that can be overridden to tune how objects are indexed into Elasticsearch.
You (may) need to override:
- get_es_indexable();
- get_es_mapping() (not mandatory, but otherwise ES will choose the mapping by itself);
- get_es_document_source() (not mandatory, but may be useful if the data differ from the mapping or if extra processing is needed).
You also need to maintain es_id and es_already_indexed for bulk indexing/updating (if any).
-
get_es_document_as_bulk_action(index, action='index')
Create a document formatted for a _bulk operation. Formatting depends on the action.
See https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-bulk.html.
Parameters:
Returns: the document
Return type:
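As an illustration of action-dependent formatting, the sketch below builds one action in the dict layout read by the bulk helpers of elasticsearch-py (_op_type, _index, _type, _id, then _source or doc). The function name as_bulk_action and its parameters are hypothetical, and whether the project produces exactly this layout is an assumption.

```python
def as_bulk_action(index, doc_type, es_id, source, action="index"):
    """Build one action dict for the elasticsearch-py bulk helpers."""
    if action not in ("index", "update"):
        raise ValueError("unsupported bulk action: {!r}".format(action))

    document = {
        "_op_type": action,
        "_index": index,
        "_type": doc_type,
        "_id": es_id,
    }
    if action == "index":
        # A full document is sent under "_source".
        document["_source"] = source
    else:
        # A partial update sends only the changed fields under "doc".
        document["doc"] = source
    return document


index_action = as_bulk_action("zds", "topic", 3, {"title": "Hello"})
update_action = as_bulk_action("zds", "topic", 3, {"title": "Bye"}, action="update")
```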
-
get_es_document_source(excluded_fields=None)
Create a document from the variables of the class, based on the mapping.
Attention
You may need to override this method if the data differ from the mapping for some reason.
Parameters: excluded_fields (list) – fields to exclude from the default method
Returns: document
Return type: dict
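A minimal sketch of the idea, assuming the mapping can be reduced to a list of field names: each mapped field is read from the object unless it is excluded. Topic and document_source are hypothetical names used only for this illustration.

```python
class Topic:
    """Hypothetical indexable object used only for this illustration."""

    def __init__(self, pk, title, subtitle):
        self.pk = pk
        self.title = title
        self.subtitle = subtitle


def document_source(obj, mapped_fields, excluded_fields=None):
    """Copy each mapped field from the object, skipping the excluded ones."""
    excluded = set(excluded_fields or [])
    return {
        field: getattr(obj, field, None)
        for field in mapped_fields
        if field not in excluded
    }


topic = Topic(pk=7, title="Elasticsearch", subtitle="An introduction")
document = document_source(
    topic, ["pk", "title", "subtitle"], excluded_fields=["subtitle"]
)
```

Excluding a field this way leaves the caller free to compute it separately, which is the situation where overriding the method becomes useful.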
-
classmethod get_es_document_type()
Value of the _type field in the index.
-
classmethod get_es_indexable(force_reindexing=False)
Return the objects to index.
Attention
You need to override this method (otherwise nothing will be indexed).
Parameters: force_reindexing (bool) – force all objects to be returned, even if they may already be indexed.
Return type: list
-
classmethod get_es_mapping()
Set up the mapping (data schema).
Note
You will probably want to change the analyzer and boost value. Also consider the index='not_analyzed' option to improve performance.
See https://elasticsearch-dsl.readthedocs.io/en/latest/persistence.html#mappings
Attention
You may want to override this method (otherwise ES chooses the mapping by itself).
Returns: mapping object
Return type: elasticsearch_dsl.Mapping
-
class zds.searchv2.models.ESIndexManager(name, shards=5, replicas=0, connection_alias='default')
Manage a given index with different tailor-made functions.
-
analyze_sentence(request)
Use the analyzer on a given sentence and get back the list of tokens.
See http://www.elastic.co/guide/en/elasticsearch/reference/current/indices-analyze.html.
This is useful to perform “terms” queries instead of full-text queries.
Parameters: request (str) – a sentence from user input
Returns: the tokens
Return type: list
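The path from a sentence to a “terms” query can be sketched with plain dictionaries: a request body for the _analyze API, the extraction of tokens from its response, and a terms query built from them. The helper names, the analyzer name custom_analyzer, the field tags, and the sample response are all assumptions made for illustration.

```python
def analyze_body(sentence, analyzer="custom_analyzer"):
    """Request body for the _analyze API (the analyzer name is hypothetical)."""
    return {"analyzer": analyzer, "text": sentence}


def extract_tokens(response):
    """Pull the token strings out of an _analyze response."""
    return [entry["token"] for entry in response.get("tokens", [])]


def terms_query(field, tokens):
    """Exact-match query on already analyzed tokens."""
    return {"query": {"terms": {field: list(tokens)}}}


# Hand-written sample standing in for a real _analyze response.
sample_response = {
    "tokens": [
        {"token": "python", "position": 0},
        {"token": "c++", "position": 1},
    ]
}

body = analyze_body("Python et C++")
tokens = extract_tokens(sample_response)
query = terms_query("tags", tokens)
```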
-
clear_es_index()
Clear the index.
-
clear_indexing_of_model(model)
Nullify the indexing of a given model by setting es_already_indexed=False on all objects.
For AbstractESDjangoIndexable, a full update is used instead of saving each object.
Parameters: model (class) – the model
-
delete_by_query(doc_type='', query=MatchAll())
Perform a deletion through the _delete_by_query API.
See https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-delete-by-query.html
Attention
Calls to this function must be made with great care!
Parameters:
- doc_type (str) – the document type
- query (elasticsearch_dsl.query.Query) – the query matching all documents to be deleted
-
delete_document(document)
Delete a given document, based on its es_id.
Parameters: document (AbstractESIndexable) – the document
-
es_bulk_indexing_of_model(model, force_reindexing=False)
Perform a bulk action on the documents of a given model. Uses the objects_per_batch property to index.
See http://elasticsearch-py.readthedocs.io/en/master/api.html#elasticsearch.Elasticsearch.bulk and http://elasticsearch-py.readthedocs.io/en/master/helpers.html#elasticsearch.helpers.parallel_bulk
Attention
- Currently only implemented with “index” and “update”!
- Currently only works with AbstractESDjangoIndexable.
Parameters:
- model (class) – a model
- force_reindexing (bool) – force all documents to be returned
Returns: the number of documents indexed
Return type:
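Indexing by batches of objects_per_batch can be pictured with a small generator that slices any iterable into fixed-size chunks. This is only a sketch of the batching idea, not the project's implementation; the function name batches is hypothetical.

```python
from itertools import islice


def batches(iterable, objects_per_batch):
    """Yield successive lists of at most objects_per_batch items."""
    if objects_per_batch < 1:
        raise ValueError("objects_per_batch must be at least 1")
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, objects_per_batch))
        if not batch:
            return
        yield batch


# Count how many items would be sent, batch by batch.
indexed = 0
for batch in batches(range(5), objects_per_batch=2):
    indexed += len(batch)
```

Each batch would then be turned into bulk actions and sent in one request, which keeps memory use bounded for large querysets.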
-
refresh_index()
Force a refresh of the index. This task is normally done periodically, but can be forced with this method.
See https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-refresh.html.
Note
Calling this function is mandatory if you want to search right after indexing.
-
reset_es_index(models)
Delete the old index and create a new one (with the same name). Set up the number of shards and replicas. Then, set the mappings for the different models.
Parameters:
-
setup_custom_analyzer()
Override the default analyzer.
See https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis.html.
Our custom analyzer is based on the “french” analyzer (https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-lang-analyzer.html#french-analyzer), with some differences:
- “custom_tokenizer”, to deal with punctuation and all kinds of (non-breaking) spaces, while keeping dashes and other characters intact (in order to keep “c++” or “c#”, for example).
- “protect_c_language”, a pattern replace filter to prevent “c” from being wiped out by the stopper.
- “french_keywords”, a keyword marker filter that prevents some programming language names from being stemmed.
Warning
You need to run manage.py es_manager index_all if you modify this!
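To make the three components more concrete, here is a hedged sketch of index analysis settings in the spirit of the description above, written as a Python dictionary. Only the component names custom_tokenizer, protect_c_language and french_keywords come from the text; the regular expressions, the replacement token, the protected word list and the analyzer name are assumptions, and the project's real settings may differ.

```python
import re

# Split on spaces (including non-breaking ones) and common punctuation,
# leaving "+", "#" and "-" untouched (hypothetical pattern).
TOKEN_SPLIT_PATTERN = r"[ \u00a0\u202f.,;:!?]+"


def custom_analysis_settings(protected_words):
    """Return an "analysis" settings block modelled on the french analyzer."""
    return {
        "analysis": {
            "tokenizer": {
                "custom_tokenizer": {
                    "type": "pattern",
                    "pattern": TOKEN_SPLIT_PATTERN,
                }
            },
            "filter": {
                # Rewrite a lone "c" so the stop filter cannot remove it.
                "protect_c_language": {
                    "type": "pattern_replace",
                    "pattern": "^c$",
                    "replacement": "langage_c",
                },
                "french_stop": {"type": "stop", "stopwords": "_french_"},
                # Words marked as keywords are skipped by the stemmer.
                "french_keywords": {
                    "type": "keyword_marker",
                    "keywords": list(protected_words),
                },
                "french_stemmer": {"type": "stemmer", "language": "light_french"},
            },
            "analyzer": {
                "custom_analyzer": {
                    "type": "custom",
                    "tokenizer": "custom_tokenizer",
                    "filter": [
                        "lowercase",
                        "protect_c_language",
                        "french_stop",
                        "french_keywords",
                        "french_stemmer",
                    ],
                }
            },
        }
    }


settings = custom_analysis_settings(["python", "java"])
# Local check of the split pattern with Python's re module.
pieces = re.split(TOKEN_SPLIT_PATTERN, "le c++, c# et c")
```

The filter order matters in this sketch: protect_c_language runs before french_stop, and french_keywords runs before french_stemmer.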
-
setup_search(request)
Set up the search on the right index.
Parameters: request (elasticsearch_dsl.Search) – the search request
Returns: formatted search
Return type: elasticsearch_dsl.Search
-
update_single_document(document, doc)
Update given fields of a single document.
See https://www.elastic.co/guide/en/elasticsearch/guide/current/partial-updates.html.
Parameters:
- document (AbstractESIndexable) – the document
- doc (dict) – fields to update
-
-
exception zds.searchv2.models.NeedIndex
Raised when an action requires an index, but it is not created (yet).
-
zds.searchv2.models.delete_document_in_elasticsearch(instance)
Delete an ESDjangoIndexable from the ES database. Must be implemented by all classes that derive from AbstractESDjangoIndexable.
Parameters: instance (AbstractESIndexable) – the document to delete
-
zds.searchv2.models.get_django_indexable_objects()
Return all indexable objects registered in Django.
Views (views.py)
-
class zds.searchv2.views.SearchView(**kwargs)
Search view.
-
get(request, *args, **kwargs)
Overridden to catch the request and fill the form.
-
get_queryset_chapters()
Search in chapters.
-
get_queryset_posts()
Search in posts, and remove results if the forum is not allowed for the user or if the message is invisible.
The score is modified if:
- the post is the first one in a topic;
- the post is marked as “useful”;
- the post has a like/dislike ratio above 1.0 (more likes than dislikes) or below 1.0 (more dislikes than likes).
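One way to express such score adjustments in the Elasticsearch query DSL is a function_score query with one weighted filter per criterion. The sketch below is an assumption about the technique, not a description of what the view actually does: the field names (position, is_useful, like_dislike_ratio, text), the weights and the function name are hypothetical.

```python
def weighted_post_query(text_query, weights):
    """Wrap a text query in a function_score with one weight per criterion."""
    functions = [
        {"filter": {"term": {"position": 1}}, "weight": weights["first_post"]},
        {"filter": {"term": {"is_useful": True}}, "weight": weights["useful"]},
        {
            "filter": {"range": {"like_dislike_ratio": {"gt": 1.0}}},
            "weight": weights["ratio_above_1"],
        },
        {
            "filter": {"range": {"like_dislike_ratio": {"lt": 1.0}}},
            "weight": weights["ratio_below_1"],
        },
    ]
    return {
        "function_score": {
            "query": text_query,
            "functions": functions,
            # Weights of all matching filters are multiplied together.
            "score_mode": "multiply",
        }
    }


query = weighted_post_query(
    {"match": {"text": "django"}},
    {"first_post": 1.2, "useful": 1.5, "ratio_above_1": 1.1, "ratio_below_1": 0.9},
)
```

A weight above 1 promotes matching posts and a weight below 1 demotes them, which mirrors the two directions of the like/dislike criterion.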
-
get_queryset_publishedcontents()
Search in PublishedContents.
-
get_queryset_topics()
Search in topics, and remove results if the forum is not allowed for the user.
The score is modified if:
- the topic is solved;
- the topic is sticky;
- the topic is locked.
-
-
zds.searchv2.views.opensearch(request)
Generate the OpenSearch Description file.