mirror of https://github.com/apache/lucene.git
LUCENE-9354: Sync French stop words with latest version from Snowball. (#1474)
* Sync French stop words with latest version from Snowball. This new version removed some French homonyms from the list
* Use latest master commit from snowball-website
* LUCENE-9354: regenerate with 'gradle snowball'
* LUCENE-9354: add CHANGES.txt entry
parent 242f48a1ca
commit 7a849f6943
@@ -31,7 +31,7 @@ configure(project(":lucene:analysis:common")) {
 // git commit hash of source code https://github.com/snowballstem/snowball/
 snowballStemmerCommit = "53739a805cfa6c77ff8496dc711dc1c106d987c1"
 // git commit hash of stopwords https://github.com/snowballstem/snowball-website
-snowballWebsiteCommit = "ff891e74f08e7315523ee3c0cad55bb1b7831b9d"
+snowballWebsiteCommit = "5a8cf2451d108217585d8e32d744f8b8fd20c711"
 // git commit hash of test data https://github.com/snowballstem/snowball-data
 snowballDataCommit = "9145f8732ec952c8a3d1066be251da198a8bc792"
 

@@ -93,6 +93,9 @@ Improvements
 Nepali, Serbian, and Tamil. New stoplist: Indonesian. Adds gradle 'snowball'
 task to regenerate and ease future upgrades. (Robert Muir, Dawid Weiss)
 
+* LUCENE-9354: Improvements to snowball french stopwords list, so that it is less
+aggressive. (Philippe Ouellet)
+
 * LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation (Atri Sharma, David Smiley)
 
 * LUCENE-9074: Introduce Slice Executor For Dynamic Runtime Execution Of Slices (Atri Sharma)

@@ -51,7 +51,7 @@ qui | who
 sa | his, her (fem)
 se | oneself
 ses | his (pl)
-son | his, her (masc)
+| son | his, her (masc). Omitted because it is homonym of "sound"
 sur | on
 ta | thy (fem)
 te | thee
@@ -79,15 +79,15 @@ t | t'
 y | there
 
 | forms of être (not including the infinitive):
-été
+| été - Omitted because it is homonym of "summer"
 étée
 étées
-étés
+| étés - Omitted because it is homonym of "summers"
 étant
 suis
 es
-est
-sommes
+| est - Omitted because it is homonym of "east"
+| sommes - Omitted because it is homonym of "sums"
 êtes
 sont
 serai
@@ -118,7 +118,7 @@ soyez
 soient
 fusse
 fusses
-fût
+| fût - Omitted because it is homonym of "tap", like in "beer on tap"
 fussions
 fussiez
 fussent
@@ -130,13 +130,13 @@ eue
 eues
 eus
 ai
-as
+| as - Omitted because it is homonym of "ace"
 avons
 avez
 ont
 aurai
-auras
-aura
+| auras - Omitted because it is also the name of a kind of wind
+| aura - Omitted because it is also the name of a kind of wind and homonym of "aura"
 aurons
 aurez
 auront
@@ -147,7 +147,7 @@ auriez
 auraient
 avais
 avait
-avions
+| avions - Omitted because it is homonym of "planes"
 aviez
 avaient
 eut
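
Note on the stop word hunks: in the Snowball stop word format, a vertical bar starts a comment that runs to the end of the line, so prefixing an entry with `| ` removes it from the effective stop set while keeping the rationale in the file. The sketch below illustrates that effect on a small excerpt of the change. It is a minimal illustration, not Lucene's loader: `parse_snowball_stopwords` and `remove_stopwords` are hypothetical helper names, and the token list is made up for the example.

```python
def parse_snowball_stopwords(text):
    """Parse Snowball-format stop word text (hypothetical helper, not Lucene's API).

    Words are whitespace-separated; everything from the first '|' to the end
    of a line is treated as a comment.
    """
    words = set()
    for line in text.splitlines():
        content = line.split("|", 1)[0]  # drop the trailing (or whole-line) comment
        words.update(content.split())
    return words


def remove_stopwords(tokens, stopwords):
    """Keep only the tokens that are not in the stop set."""
    return [t for t in tokens if t not in stopwords]


# Excerpt of the list before the change.
OLD = """\
ses | his (pl)
son | his, her (masc)
sur | on
"""

# Same excerpt after the change: 'son' is commented out.
NEW = """\
ses | his (pl)
| son | his, her (masc). Omitted because it is homonym of "sound"
sur | on
"""

old_words = parse_snowball_stopwords(OLD)
new_words = parse_snowball_stopwords(NEW)
removed = old_words - new_words  # entries that no longer act as stop words

# Illustrative token stream (an assumption, not taken from the commit).
tokens = ["ses", "son", "sur", "scène"]
kept_before = remove_stopwords(tokens, old_words)
kept_after = remove_stopwords(tokens, new_words)
```

With the updated excerpt, `son` survives filtering, which is the sense in which the CHANGES entry calls the list less aggressive: a search for the noun "son" (sound) can now match.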