-
Story
-
Resolution: Done
-
Medium
-
None
-
None
-
None
-
40
-
RHD-Sprint 30
The stackoverflow river setup has critical issues that make the data in the index unreliable.
It needs to be revisited ASAP.
Initial PR: https://github.com/searchisko/configuration/pull/40
The most critical part: https://github.com/searchisko/configuration/blob/master/mappings/data_stackoverflow_question/stackoverflow_question.json is missing the mapping for space_key, as described here:
https://github.com/searchisko/elasticsearch-river-remote#notes-for-index-and-document-type-mapping-creation
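A minimal sketch of how a mapping file could be checked for a declared space_key field. It assumes the mapping JSON nests field definitions under "properties" keys; the helper name find_field_mapping and the example mapping content are hypothetical and are not taken from the actual stackoverflow_question.json.

```python
import json


def find_field_mapping(mapping, field):
    """Return the definition of `field` if any "properties" block declares it, else None."""
    if not isinstance(mapping, dict):
        return None
    properties = mapping.get("properties")
    if isinstance(properties, dict) and field in properties:
        return properties[field]
    # Recurse into nested objects (document type level, nested object fields).
    for value in mapping.values():
        found = find_field_mapping(value, field)
        if found is not None:
            return found
    return None


# Hypothetical mapping content, standing in for the real mapping file.
example_mapping = json.loads("""
{
  "stackoverflow_question": {
    "properties": {
      "title": {"type": "string"}
    }
  }
}
""")

if find_field_mapping(example_mapping, "space_key") is None:
    print("space_key has no explicit mapping")
```

To check the real file, the string passed to json.loads would be replaced with the file contents; which mapping options space_key needs is described in the river documentation linked above.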
Result is:
https://dcp2.jboss.org/v2/rest/sys/es/search/stats_stackoverflow_question_river/_search
{
  "query" : {
    "bool" : {
      "must" : [
        { "range" : { "documents_deleted" : { "gte" : 1 } } }
      ]
    }
  },
  "sort" : [
    { "start_date" : { "order" : "desc" } }
  ]
}
...
{
  "_index": "stats_stackoverflow_question_river",
  "_type": "remote_river_indexupdate",
  "_id": "AVpGxbUZDRkKLW-QapWc",
  "_score": null,
  "_source": {
    "river_name": "stackoverflow_question",
    "space_key": "jboss",
    "update_type": "FULL",
    "start_date": "2017-02-16T11:51:17.409Z",
    "documents_updated": "8049",
    "documents_deleted": "2964",
    "comments_deleted": "0",
    "documents_with_error": "0",
    "result": "OK",
    "time_elapsed": "169074ms"
  },
  "sort": [
    "1487245877409"
  ]
},
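A minimal sketch of how the stats query could be built and a returned hit summarized without reading the raw JSON by hand. The function names build_deletion_query and summarize_hit are hypothetical; the sketch only assumes hits of the shape shown in this issue, where the counters are stored as strings. It does not send any request itself.

```python
def build_deletion_query(min_deleted=1):
    """Build the search body: runs with at least `min_deleted` deletions, newest first."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"range": {"documents_deleted": {"gte": min_deleted}}}
                ]
            }
        },
        "sort": [{"start_date": {"order": "desc"}}],
    }


def summarize_hit(hit):
    """Extract the counters from one stats hit and convert them from strings to ints."""
    source = hit["_source"]
    updated = int(source["documents_updated"])
    deleted = int(source["documents_deleted"])
    return {
        "river_name": source["river_name"],
        "space_key": source["space_key"],
        "update_type": source["update_type"],
        "start_date": source["start_date"],
        "documents_updated": updated,
        "documents_deleted": deleted,
        # Deletions relative to updates in the same run; 0.0 when nothing was updated.
        "deleted_per_updated": deleted / updated if updated else 0.0,
    }


# The hit from this issue, reduced to the fields the summary reads.
sample_hit = {
    "_source": {
        "river_name": "stackoverflow_question",
        "space_key": "jboss",
        "update_type": "FULL",
        "start_date": "2017-02-16T11:51:17.409Z",
        "documents_updated": "8049",
        "documents_deleted": "2964",
    }
}

summary = summarize_hit(sample_hit)
print(summary["documents_deleted"], "deleted,", summary["documents_updated"], "updated")
```

The body returned by build_deletion_query can be posted to the _search URL above with any HTTP client.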
...
"documents_deleted": "2964" in a single FULL update: the river is simply broken.