Last modified: 2014-11-12 20:34:21 UTC
Hi. The search http://fr.wiktionary.org/w/api.php?action=opensearch&format=json&namespace=0&search=gr%C3%AAlo doesn't return "grelotter", whereas http://en.wiktionary.org/w/api.php?action=opensearch&format=json&namespace=0&search=gr%C3%AAlo does. Why?
It's a reasonably easy thing to turn on, but right now it's only on for English. We can turn it on if it's the right thing to do.
Another thing: if I turn on accent ignoring, the behavior is wider than just autocomplete; it applies to search as well. When that is done, perfect accent matches are ranked higher in search, but results with mismatched accents still show up.
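To illustrate the ranking described in the previous comment, here is a minimal Python sketch, not CirrusSearch's actual scoring. The function names `fold` and `rank_results` are hypothetical, and accent folding is approximated by stripping Unicode combining marks. Titles that match the query with its accents intact sort first; titles that match only after folding still appear, but lower.

```python
import unicodedata


def fold(text):
    """Strip combining marks (an approximation of accent folding)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def rank_results(query, titles):
    """Return titles containing the query once accents are folded.

    Titles that contain the query with its exact accents come first;
    accent-mismatched titles are kept but ranked after them.
    """
    folded_query = fold(query)
    hits = [t for t in titles if folded_query in fold(t)]
    # sorted() is stable, so the original order is kept within each group.
    return sorted(hits, key=lambda t: 0 if query in t else 1)


if __name__ == "__main__":
    print(rank_results("grêl", ["grelotter", "grêlon", "grue"]))
```

With the sample titles, "grêlon" is listed before "grelotter", and "grue" is dropped because it does not match even after folding.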
In search, this behavior seems to be already turned on: https://fr.wiktionary.org/w/index.php?title=Sp%C3%A9cial%3ARecherche&profile=default&search=gr%C3%AAlotter&fulltext=Search&searchengineselect=mediawiki
Hmmm... I'll investigate that. Since no one has complained about that behavior, I imagine it's correct or at least OK. Either way, if prefix search should have it, I'll file this bug and see about getting it in there. It won't be super soon, but I'll get to it.
I'm not sure what you mean by "no one complained" but I opened this bug after this point: https://fr.wiktionary.org/wiki/Wiktionnaire:Wikid%C3%A9mie/septembre_2014#A_propos_du_moteur_de_recherche
Sorry, I mean search flattening the accents didn't receive any complaints when we turned CirrusSearch on for frwiktionary a few months ago. At least I don't think anyone did. Anyway - I'll have a look at turning on accent squashing for frwiktionary soon.
Change 160990 had a related patch set uploaded by Manybubbles: Add asciifolding to some French analyzers https://gerrit.wikimedia.org/r/160990
I've added a proposal to flatten all accented characters into non-accented ones for prefix search and exact title matches. It'll require rebuilding the index but that is no big deal. Note: I found out where the other normalization comes from. The French stemmer we use for inexact matches performs the following mappings:
  'à', 'á', 'â' -> 'a'
  'ô' -> 'o'
  'è', 'é', 'ê' -> 'e'
  'ù', 'û' -> 'u'
  'î' -> 'i'
  'ç' -> 'c'
I could, if you believe it is more correct, only perform those mappings for the prefix and exact title matching.
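The two options in the comment above can be compared with a small Python sketch. It is an illustration under stated assumptions, not the analyzer configuration from the patch: `fold_french_subset` applies only the stemmer mappings listed above, `fold_all_accents` approximates "flatten all accented characters" by stripping Unicode combining marks (a real ASCII-folding filter may handle more cases), and `prefix_matches` is a hypothetical stand-in for prefix search. All three names are made up for this example.

```python
import unicodedata

# Only the mappings listed for the French stemmer in the comment above.
FRENCH_STEMMER_MAP = str.maketrans({
    "à": "a", "á": "a", "â": "a",
    "ô": "o",
    "è": "e", "é": "e", "ê": "e",
    "ù": "u", "û": "u",
    "î": "i",
    "ç": "c",
})


def fold_french_subset(text):
    """Apply only the French stemmer mappings."""
    return text.translate(FRENCH_STEMMER_MAP)


def fold_all_accents(text):
    """Approximate full accent flattening by dropping combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def prefix_matches(query, titles, fold):
    """Return titles whose folded form starts with the folded query."""
    folded_query = fold(query)
    return [t for t in titles if fold(t).startswith(folded_query)]


if __name__ == "__main__":
    titles = ["grelotter", "grêlon", "grue"]
    print(prefix_matches("grêlo", titles, fold_french_subset))
    # Characters outside the subset, such as "ï", are only folded by the full variant.
    print(fold_french_subset("naïf"), fold_all_accents("naïf"))
```

For the query from the original report, both variants let "grêlo" match "grelotter". They differ only on characters outside the listed subset, such as "ï", which the subset leaves untouched.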
I can't be sure it's more correct. Could you tell me what is done for fr.wikipedia, please? Accents are flattened for this site too.
Change 160990 merged by jenkins-bot: Add asciifolding to some French analyzers https://gerrit.wikimedia.org/r/160990
All patches mentioned in this report were merged or abandoned - is there more work left to do here (if yes: please reset the bug report status to NEW or ASSIGNED), or can you close this ticket as RESOLVED FIXED?