- Story
- Resolution: Done
- Medium
- None
- None
- None
- 40
- RHD-Sprint 30
The Stack Overflow river setup has critical issues that make the data in the index unreliable.
It needs to be revisited ASAP.
Initial PR: https://github.com/searchisko/configuration/pull/40
The most critical part: https://github.com/searchisko/configuration/blob/master/mappings/data_stackoverflow_question/stackoverflow_question.json is missing the mapping for space_key, as described here:
https://github.com/searchisko/elasticsearch-river-remote#notes-for-index-and-document-type-mapping-creation
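For illustration only, a minimal sketch of what the missing entry might look like. It assumes an Elasticsearch 1.x style string mapping, that the document type is named stackoverflow_question (taken from the mapping file name), and that the space key is stored in a field literally called space_key. The river may be configured to use a different field name, so check the exact form against the notes linked above before applying it.

```json
{
  "stackoverflow_question": {
    "properties": {
      "space_key": { "type": "string", "analyzer": "keyword" }
    }
  }
}
```

The point of the keyword analyzer is that the space key is indexed as a single untokenized term, so the river can match documents by their exact space key.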
The result of the missing mapping is visible in the river statistics:
https://dcp2.jboss.org/v2/rest/sys/es/search/stats_stackoverflow_question_river/_search
{ "query" : { "bool" : { "must" : [ { "range" : { "documents_deleted" : { "gte" : 1 } } } ] } }, "sort" : [ { "start_date" : {"order" : "desc" } } ] }
... { "_index": "stats_stackoverflow_question_river", "_type": "remote_river_indexupdate", "_id": "AVpGxbUZDRkKLW-QapWc", "_score": null, "_source": { "river_name": "stackoverflow_question", "space_key": "jboss", "update_type": "FULL", "start_date": "2017-02-16T11:51:17.409Z", "documents_updated": "8049", "documents_deleted": "2964", "comments_deleted": "0", "documents_with_error": "0", "result": "OK", "time_elapsed": "169074ms" }, "sort": [ "1487245877409" ] }, ...
"documents_deleted": "2964" against "documents_updated": "8049" in a single FULL update: the river is simply broken.