mirror of https://github.com/apache/lucene.git
LUCENE-9354: Sync French stop words with latest version from Snowball. (#1474)
* Sync French stop words with latest version from Snowball. This new version removed some French homonyms from the list.
* Use latest master commit from snowball-website.
* LUCENE-9354: regenerate with 'gradle snowball'.
* LUCENE-9354: add CHANGES.txt entry.
parent 242f48a1ca
commit 7a849f6943
@@ -31,7 +31,7 @@ configure(project(":lucene:analysis:common")) {
 // git commit hash of source code https://github.com/snowballstem/snowball/
 snowballStemmerCommit = "53739a805cfa6c77ff8496dc711dc1c106d987c1"
 // git commit hash of stopwords https://github.com/snowballstem/snowball-website
-snowballWebsiteCommit = "ff891e74f08e7315523ee3c0cad55bb1b7831b9d"
+snowballWebsiteCommit = "5a8cf2451d108217585d8e32d744f8b8fd20c711"
 // git commit hash of test data https://github.com/snowballstem/snowball-data
 snowballDataCommit = "9145f8732ec952c8a3d1066be251da198a8bc792"
 
@@ -93,6 +93,9 @@ Improvements
 Nepali, Serbian, and Tamil. New stoplist: Indonesian. Adds gradle 'snowball'
 task to regenerate and ease future upgrades. (Robert Muir, Dawid Weiss)
 
+* LUCENE-9354: Improvements to snowball french stopwords list, so that it is less
+aggressive. (Philippe Ouellet)
+
 * LUCENE-9114: Improve ValueSourceScorer's Default Cost Implementation (Atri Sharma, David Smiley)
 
 * LUCENE-9074: Introduce Slice Executor For Dynamic Runtime Execution Of Slices (Atri Sharma)
@@ -51,7 +51,7 @@ qui | who
 sa | his, her (fem)
 se | oneself
 ses | his (pl)
-son | his, her (masc)
+| son | his, her (masc). Omitted because it is homonym of "sound"
 sur | on
 ta | thy (fem)
 te | thee
@@ -79,15 +79,15 @@ t | t'
 y | there
 
 | forms of être (not including the infinitive):
-été
+| été - Omitted because it is homonym of "summer"
 étée
 étées
-étés
+| étés - Omitted because it is homonym of "summers"
 étant
 suis
 es
-est
-sommes
+| est - Omitted because it is homonym of "east"
+| sommes - Omitted because it is homonym of "sums"
 êtes
 sont
 serai
@@ -118,7 +118,7 @@ soyez
 soient
 fusse
 fusses
-fût
+| fût - Omitted because it is homonym of "tap", like in "beer on tap"
 fussions
 fussiez
 fussent
@@ -130,13 +130,13 @@ eue
 eues
 eus
 ai
-as
+| as - Omitted because it is homonym of "ace"
 avons
 avez
 ont
 aurai
-auras
-aura
+| auras - Omitted because it is also the name of a kind of wind
+| aura - Omitted because it is also the name of a kind of wind and homonym of "aura"
 aurons
 aurez
 auront
@@ -147,7 +147,7 @@ auriez
 auraient
 avais
 avait
-avions
+| avions - Omitted because it is homonym of "planes"
 aviez
 avaient
 eut
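
The stop word hunks disable entries by prefixing them with `|`, which relies on the Snowball stop word file convention that everything from a `|` to the end of the line is a comment. The following is a minimal Python sketch of that convention, not Lucene's actual loader; the function name `parse_snowball_stopwords` and the sample text are hypothetical and exist only for illustration.

```python
def parse_snowball_stopwords(text):
    """Collect stop words from Snowball-style text.

    Assumption: everything from the first '|' on a line is a comment,
    and the remaining part may hold whitespace-separated words.
    """
    words = set()
    for line in text.splitlines():
        entry = line.split("|", 1)[0]  # drop the comment part, if any
        words.update(entry.split())    # empty for comment-only lines
    return words


# Sample lines modeled on the updated French list in this commit.
sample = """\
ses | his (pl)
| son | his, her (masc). Omitted because it is homonym of "sound"
sur | on
| été - Omitted because it is homonym of "summer"
étée
"""

stopwords = parse_snowball_stopwords(sample)
```

Under this reading, `ses`, `sur`, and `étée` are still treated as stop words, while `son` and `été` are not, so a search for the noun "son" (sound) or "été" (summer) is no longer filtered out.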