java - How to search millions of documents with an average of 8000 words each?


I have an odd problem and need help with this.

I have a dataset of 6 million documents, each composed of close to 8000 words. Each word is a number, and the words/numbers are separated by whitespace.

After indexing the data (I'm indexing a subset for now), I need to perform queries with a list of numbers, and I want the documents that contain all of those numbers (an AND condition).
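To make the AND condition concrete, here is a minimal plain-Java sketch, independent of Elasticsearch, of what counts as a match: a document qualifies only if its whitespace-separated tokens include every requested number. The class name `AllCodesMatch` and the method `containsAllCodes` are hypothetical, made up for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class AllCodesMatch {
    // Hypothetical helper: true when the document contains every code in `codes`.
    static boolean containsAllCodes(String document, String[] codes) {
        // The documents are whitespace-separated numbers, so split on runs of whitespace.
        Set<String> tokens = new HashSet<>(Arrays.asList(document.trim().split("\\s+")));
        return tokens.containsAll(Arrays.asList(codes));
    }

    public static void main(String[] args) {
        String doc = "17 42 8000 311";
        System.out.println(containsAllCodes(doc, new String[] {"42", "311"})); // true
        System.out.println(containsAllCodes(doc, new String[] {"42", "999"})); // false
    }
}
```

Scanning 6 million documents this way would be far too slow, which is why the lookup is delegated to an index.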

I came up with this:

    String[] codes_vec = array_with_500_strings_all_numbers;
    BoolQueryBuilder qBuilder = QueryBuilders.boolQuery();

    // Require every code: each must() clause is ANDed with the others
    for (int i = 0; i < codes_vec.length; ++i) {
        qBuilder = qBuilder.must(QueryBuilders.matchQuery("code", codes_vec[i]));
    }

The problem is that this seems inefficient. How can I speed up the search? Is there a better way of querying Elasticsearch that would be faster in this case?

Kind regards, Zé Maria

Using a filter is faster than a match query, since filters skip relevance scoring and their results can be cached. Here's the documentation on boolean filters: http://www.elasticsearch.org/guide/reference/query-dsl/bool-filter/

Here's how to create and use one:

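    import org.elasticsearch.action.search.SearchResponse;
    import org.elasticsearch.index.query.BoolFilterBuilder;
    import org.elasticsearch.index.query.FilterBuilders;

    // Create the filter
    // To cache the results, add .cache(true);
    BoolFilterBuilder filterBuilder = FilterBuilders.boolFilter();

    // Did you mean to skip the first one?
    for (int i = 0; i < codes_vec.length; ++i)
        filterBuilder.must(FilterBuilders.termFilter("code", codes_vec[i]));

    // Add the filter to the search
    SearchResponse response = client.prepareSearch("index1")
            .setFilter(filterBuilder)
            .setFrom(0).setSize(10)
            .execute()
            .actionGet();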
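For reference, the filter built above corresponds roughly to a JSON search body with one term clause per code under bool/must. The stdlib-only sketch below builds such a body as a string. The class name `BoolFilterBody`, the method `build`, and the exact body layout (in particular the top-level "filter" key, as I recall it from Elasticsearch versions of that era) are assumptions, so check them against the version you run.

```java
public class BoolFilterBody {
    // Hypothetical helper: builds a search body with one term filter per code.
    // Assumes the field name and codes are plain alphanumeric strings,
    // so no JSON escaping is performed.
    static String build(String field, String[] codes, int from, int size) {
        StringBuilder sb = new StringBuilder();
        sb.append("{\"from\":").append(from)
          .append(",\"size\":").append(size)
          .append(",\"filter\":{\"bool\":{\"must\":[");
        for (int i = 0; i < codes.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            // One term clause per code; all clauses under "must" are ANDed.
            sb.append("{\"term\":{\"").append(field).append("\":\"")
              .append(codes[i]).append("\"}}");
        }
        sb.append("]}}}");
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(build("code", new String[] {"42", "311"}, 0, 10));
    }
}
```

Seeing the body spelled out makes it easier to test the same request by hand (for example over HTTP) before wiring it into the Java client.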
