Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
44 changes: 44 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -95,6 +95,50 @@ After the model has been trained, you can submit questions to it.

If the command to ask questions to either Solr or NLC fails you can rerun it and it will pick up where it left off.


To ask questions of RnR we must first train a model using the truth file downloaded from XMGR as training data.

create corpus.json:
themis answer rnr corpus_json CORPUS.CSV-FILE

add commit:{} at the end of the json file created to index solr

create cluster:
themis answer rnr cluster RNR-URL USERNAME PASSWORD
(note the cluster id from response)

check cluster status:
themis answer rnr cluster_status RNR-URL USERNAME PASSWORD CLUSTER-ID

upload solr schema:
themis answer rnr schema RNR-URL USERNAME PASSWORD CLUSTER-ID SCHEMA-ZIP-FILE

associate config:
themis answer rnr config RNR-URL USERNAME PASSWORD CLUSTER-ID

upload corpus.json :
themis answer rnr corpus_upload RNR-URL USERNAME PASSWORD CLUSTER-ID CORPUS.JSON-FILE

test corpus:
themis answer rnr corpus_test RNR-URL USERNAME PASSWORD CLUSTER-ID

modify truth file to add relevance:
themis answer rnr truth TRUTH-FILE

upload truth file: (the train.py script is given by RnR team and recommended not be modified)
python train.py -u USERNAME:PASSWORD -i TRUTH-FILE -c CLUSTER-ID -x "example_collection" -n "themis-ranker"
(note the ranker id from the response)

check ranker status:
themis answer rnr ranker_status RNR-URL USERNAME PASSWORD RANKER-ID

query trained ranker:
themis answer rnr ranker_query RNR-URL USERNAME PASSWORD CLUSTER-ID RANKER-ID SAMPLE-QUESTIONS-FILE

untrained:
themis answer rnr untrained_ranker_query RNR-URL USERNAME PASSWORD CLUSTER-ID SAMPLE-QUESTIONS-FILE


### Submit Answers to Annotation Assist

A human annotator needs to judge whether the answers to the questions returned by the various systems are correct.
Expand Down
Binary file added rnr/example_schema.zip
Binary file not shown.
67 changes: 67 additions & 0 deletions rnr/example_schema/currency.xml
Original file line number Diff line number Diff line change
@@ -0,0 +1,67 @@
<?xml version="1.0" ?>
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->

<!-- Example exchange rates file for CurrencyField type named "currency" in example schema -->

<currencyConfig version="1.0">
<rates>
<!-- Updated from http://www.exchangerate.com/ at 2011-09-27 -->
<rate from="USD" to="ARS" rate="4.333871" comment="ARGENTINA Peso" />
<rate from="USD" to="AUD" rate="1.025768" comment="AUSTRALIA Dollar" />
<rate from="USD" to="EUR" rate="0.743676" comment="European Euro" />
<rate from="USD" to="BRL" rate="1.881093" comment="BRAZIL Real" />
<rate from="USD" to="CAD" rate="1.030815" comment="CANADA Dollar" />
<rate from="USD" to="CLP" rate="519.0996" comment="CHILE Peso" />
<rate from="USD" to="CNY" rate="6.387310" comment="CHINA Yuan" />
<rate from="USD" to="CZK" rate="18.47134" comment="CZECH REP. Koruna" />
<rate from="USD" to="DKK" rate="5.515436" comment="DENMARK Krone" />
<rate from="USD" to="HKD" rate="7.801922" comment="HONG KONG Dollar" />
<rate from="USD" to="HUF" rate="215.6169" comment="HUNGARY Forint" />
<rate from="USD" to="ISK" rate="118.1280" comment="ICELAND Krona" />
<rate from="USD" to="INR" rate="49.49088" comment="INDIA Rupee" />
<rate from="USD" to="XDR" rate="0.641358" comment="INTNL MON. FUND SDR" />
<rate from="USD" to="ILS" rate="3.709739" comment="ISRAEL Sheqel" />
<rate from="USD" to="JPY" rate="76.32419" comment="JAPAN Yen" />
<rate from="USD" to="KRW" rate="1169.173" comment="KOREA (SOUTH) Won" />
<rate from="USD" to="KWD" rate="0.275142" comment="KUWAIT Dinar" />
<rate from="USD" to="MXN" rate="13.85895" comment="MEXICO Peso" />
<rate from="USD" to="NZD" rate="1.285159" comment="NEW ZEALAND Dollar" />
<rate from="USD" to="NOK" rate="5.859035" comment="NORWAY Krone" />
<rate from="USD" to="PKR" rate="87.57007" comment="PAKISTAN Rupee" />
<rate from="USD" to="PEN" rate="2.730683" comment="PERU Sol" />
<rate from="USD" to="PHP" rate="43.62039" comment="PHILIPPINES Peso" />
<rate from="USD" to="PLN" rate="3.310139" comment="POLAND Zloty" />
<rate from="USD" to="RON" rate="3.100932" comment="ROMANIA Leu" />
<rate from="USD" to="RUB" rate="32.14663" comment="RUSSIA Ruble" />
<rate from="USD" to="SAR" rate="3.750465" comment="SAUDI ARABIA Riyal" />
<rate from="USD" to="SGD" rate="1.299352" comment="SINGAPORE Dollar" />
<rate from="USD" to="ZAR" rate="8.329761" comment="SOUTH AFRICA Rand" />
<rate from="USD" to="SEK" rate="6.883442" comment="SWEDEN Krona" />
<rate from="USD" to="CHF" rate="0.906035" comment="SWITZERLAND Franc" />
<rate from="USD" to="TWD" rate="30.40283" comment="TAIWAN Dollar" />
<rate from="USD" to="THB" rate="30.89487" comment="THAILAND Baht" />
<rate from="USD" to="AED" rate="3.672955" comment="U.A.E. Dirham" />
<rate from="USD" to="UAH" rate="7.988582" comment="UKRAINE Hryvnia" />
<rate from="USD" to="GBP" rate="0.647910" comment="UNITED KINGDOM Pound" />

<!-- Cross-rates for some common currencies -->
<rate from="EUR" to="GBP" rate="0.869914" />
<rate from="EUR" to="NOK" rate="7.800095" />
<rate from="GBP" to="NOK" rate="8.966508" />
</rates>
</currencyConfig>
54 changes: 54 additions & 0 deletions rnr/example_schema/lang/stopwords_en.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,54 @@
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# a couple of test stopwords to test that the words are really being
# configured from this file:
stopworda
stopwordb

# Standard english stop words taken from Lucene's StopAnalyzer
a
an
and
are
as
at
be
but
by
for
if
in
into
is
it
no
not
of
on
or
such
that
the
their
then
there
these
they
this
to
was
will
with
21 changes: 21 additions & 0 deletions rnr/example_schema/protwords.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,21 @@
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

#-----------------------------------------------------------------------
# Use a protected word file to protect against the stemmer reducing two
# unrelated words to the same base word.

# Some non-words that normally won't be encountered,
# just to test that they won't be stemmed.
dontstems
zwhacky

Loading