Open solrconfig.xml and find the following block:
<requestHandler name="/update/extract"
                startup="lazy"
                class="solr.extraction.ExtractingRequestHandler">
  <lst name="defaults">
    <str name="lowernames">true</str>
    <str name="uprefix">ignored_</str>
    <!-- capture link hrefs but ignore div attributes -->
    <str name="captureAttr">true</str>
    <str name="fmap.a">links</str>
    <str name="fmap.div">ignored_</str>
  </lst>
</requestHandler>
Comment it out and replace it with:
<requestHandler name="/update/extract" class="org.apache.solr.handler.extraction.ExtractingRequestHandler">
  <lst name="defaults">
    <str name="fmap.Last-Modified">last_modified</str>
    <str name="uprefix">ignored_</str>
  </lst>
  <!-- Optional. Specify a path to a tika configuration file. See the Tika docs for details. -->
  <!-- <str name="tika.config">/my/path/to/tika.config</str> -->
  <!-- Optional. Specify one or more date formats to parse. See DateUtil.DEFAULT_DATE_FORMATS for default date formats -->
  <!-- <lst name="date.formats">
    <str>yyyy-MM-dd</str>
  </lst> -->
</requestHandler>
Of these,
<str name="tika.config">/my/path/to/tika.config</str>
and
<lst name="date.formats">
  <str>yyyy-MM-dd</str>
</lst>
are both optional. Leave them unset for now; they can be added to the configuration later if needed.
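
One pitfall when commenting out the original handler: XML comments cannot be nested, and the original block already contains a comment of its own (the "capture link hrefs" line). Wrapping the block unchanged in <!-- ... --> would end the comment early at that inner "-->" and leave the file malformed. A minimal sketch of one way around this, assuming the inner comment is simply rewritten as plain text (the wording of the two note lines is illustrative, not part of the original file):

```xml
<!-- Old /update/extract handler, disabled and kept for reference.
<requestHandler name="/update/extract"
                startup="lazy"
                class="solr.extraction.ExtractingRequestHandler">
  <lst name="defaults">
    <str name="lowernames">true</str>
    <str name="uprefix">ignored_</str>
    (inner comment rewritten as text: capture link hrefs but ignore div attributes)
    <str name="captureAttr">true</str>
    <str name="fmap.a">links</str>
    <str name="fmap.div">ignored_</str>
  </lst>
</requestHandler>
-->
```

Deleting the old block outright works equally well; keeping it commented out only makes it easier to restore the previous settings later.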