Solr - LangId UpdateRequestProcessor only mapping first field
I am trying to use Solr's LangId UpdateRequestProcessor. Here is my config:
<updateRequestProcessorChain name="languages">
  <processor class="solr.LangDetectLanguageIdentifierUpdateProcessorFactory">
    <lst name="invariants">
      <str name="langid.fl">focus, expertise, platforms, partners, participation, additional</str>
      <str name="langid.whitelist">en,fr</str>
      <str name="langid.fallback">en</str>
      <str name="langid.langField">detectedlang</str>
      <bool name="langid.map">true</bool>
      <bool name="langid.map.keepOrig">false</bool>
    </lst>
  </processor>
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
My fields look like this:
<fields>
  <field name="_root_" type="string" indexed="true" stored="false"/>
  <field name="_version_" type="long" indexed="true" stored="true" multiValued="false"/>
  <field name="id" type="string" indexed="true" stored="true" required="true" />
  <!-- raw fields from the SQL db -->
  <field name="expertise_id" type="int" indexed="true" stored="true" />
  <field name="person_id" type="int" indexed="true" stored="true" />
  <field name="mod_date" type="date" indexed="true" stored="true" />
  <field name="lang" type="string" indexed="true" stored="true" />
  <field name="focus" type="text_general" indexed="true" stored="true" />
  <field name="expertise" type="text_general" indexed="true" stored="true" />
  <field name="platforms" type="text_general" indexed="true" stored="true" />
  <field name="partners" type="text_general" indexed="true" stored="true" />
  <field name="participation" type="text_general" indexed="true" stored="true" />
  <field name="additional" type="text_general" indexed="true" stored="true" />
  <field name="tag" type="text_general" termVectors="true" multiValued="true" />
  <field name="facet_tag" type="string" stored="false" indexed="false" docValues="true" multiValued="true" default=""/>
  <!-- language detected by Solr -->
  <field name="detectedlang" type="string" indexed="true" stored="true" />
  <!-- defined locale fields -->
  <dynamicField name="*_en" type="text_en" indexed="true" stored="true" />
  <dynamicField name="*_fr" type="text_fr" indexed="true" stored="true" />
  <copyField source="tag" target="facet_tag"/>
</fields>
When I run an update or a dataimport, I know that the "languages" update chain is being used, because focus is mapped to focus_en and detectedlang is set. However, none of the other fields in langid.fl are mapped. Why?
An example update query:
{
  "additional": "here other information me.",
  "expertise_id": "10000",
  "id": "foo_10000",
  "focus": "this new focus. exciting. when done expect super experienced."
}
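For anyone who wants to reproduce this outside the dataimport handler, here is a minimal Python sketch of how such a document could be posted through the chain. The host, port and core name (`collection1`) are assumptions, not taken from my setup, and `build_update_request` is a hypothetical helper. It relies on the `update.chain` request parameter to select the "languages" chain; if your request handler already sets that chain as a default, the parameter is not needed.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_update_request(doc,
                         base_url="http://localhost:8983/solr/collection1",  # assumed URL
                         chain="languages"):
    """Build (but do not send) a JSON update request routed through `chain`."""
    params = urlencode({"update.chain": chain, "commit": "true"})
    body = json.dumps([doc]).encode("utf-8")
    return Request(
        f"{base_url}/update?{params}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


doc = {
    "additional": "here other information me.",
    "expertise_id": "10000",
    "id": "foo_10000",
    "focus": "this new focus. exciting. when done expect super experienced.",
}
request = build_update_request(doc)
# To actually send it: urllib.request.urlopen(request)
```

Nothing is sent until the request is passed to `urllib.request.urlopen`, so the URL and body can be inspected first.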
And here is the result of a query for expertise_id=10000. Note that additional has not been moved to additional_en:
"response":{"numFound":1,"start":0,"docs":[
  {
    "additional":"here other information me.",
    "expertise_id":10000,
    "id":"foo_10000",
    "detectedlang":"en",
    "focus_en":"this new focus. exciting. when done expect super experienced.",
    "_version_":1447088846110982144}]
}
It turns out the problem was a syntax error. The line:
<str name="langid.fl">focus, expertise, platforms, partners, participation, additional</str>
must be
<str name="langid.fl">focus,expertise,platforms,partners,participation,additional</str>
The docs state that the field list should be comma- or space-separated values. Evidently, a comma followed by a space screws things up (though that works fine in other Solr contexts, such as the fl parameter of a requestHandler, which langid.fl is supposedly modelled on). I tried the space-separated syntax as well, and it did not fix the issue.
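I have not checked the Solr source, so treat the following as a guess at the mechanism rather than a description of the real implementation. If the processor splits langid.fl on commas only and does not trim whitespace, every name after the first keeps a leading space and no longer matches a document field. This small Python sketch (all function names are hypothetical) reproduces the symptom under that assumption:

```python
import re


def naive_split(value):
    # Assumed behaviour: split on commas only, no trimming.
    return value.split(",")


def tolerant_split(value):
    # Split on any run of commas and/or whitespace, dropping empty names.
    return [name for name in re.split(r"[,\s]+", value) if name]


def map_fields(doc, field_names, lang):
    """Rename each listed field that exists in doc to <field>_<lang>."""
    mapped = dict(doc)
    for name in field_names:
        if name in mapped:
            mapped[name + "_" + lang] = mapped.pop(name)
    return mapped


doc = {"focus": "some focus text", "additional": "some additional text"}
field_list = "focus, expertise, platforms, partners, participation, additional"

# " additional" (with a leading space) is not a key of doc, so only focus is renamed.
broken = map_fields(doc, naive_split(field_list), "en")

# With trimmed names, both fields present in the document are renamed.
fixed = map_fields(doc, tolerant_split(field_list), "en")
```

Under this assumption `broken` keeps `additional` under its original name while `fixed` contains `additional_en`, which matches what I saw before and after removing the spaces.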
I hope this helps someone.