Last modified: 2012-03-09 15:55:31 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T37083, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 35083 - OpenSearchXml first sentences extraction produces bad results
Status: RESOLVED FIXED
Product: MediaWiki extensions
Classification: Unclassified
Component: OpenSearchXml
Version: unspecified
Hardware: All All
Importance: Unprioritized normal
Target Milestone: ---
Assigned To: Brion Vibber
Depends on:
Blocks:
Reported: 2012-03-09 15:05 UTC by Max Semenik
Modified: 2012-03-09 15:55 UTC (History)
CC List: 0 users

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments

Description Max Semenik 2012-03-09 15:05:11 UTC
The current regex, roughly "end capture after the second dot followed by whitespace", produces wildly inaccurate results for sentences with dots in the middle, for example if the article title contains dots:

https://en.wikipedia.org/w/api.php?action=opensearch&format=xmlfm&search=.s.p.%20v&limit=10

    <Item>
      <Text xml:space="preserve">S. P. Venkatesh</Text>
      <Description xml:space="preserve">S. P. </Description>
      <Url xml:space="preserve">https://en.wikipedia.org/wiki/S._P._Venkatesh</Url>
    </Item>
    <Item>
      <Text xml:space="preserve">S. P. Velumani</Text>
      <Description xml:space="preserve">S. P. </Description>
      <Url xml:space="preserve">https://en.wikipedia.org/wiki/S._P._Velumani</Url>
    </Item>

It should be something like "first dot followed by whitespace after a certain number of characters".
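
To make the difference concrete, here is a minimal Python sketch of both heuristics. It is not the extension's actual code (OpenSearchXml is written in PHP, and the real regex is only paraphrased above); the function names, the 40-character threshold, and the sample text are all assumptions made for illustration.

```python
import re

# Hypothetical approximation of the current behaviour as described above:
# stop capturing right after the second dot that is followed by whitespace.
NAIVE = re.compile(r"^(?:.*?\.\s){2}", re.DOTALL)


def naive_extract(text: str) -> str:
    """Return the text up to the second 'dot + whitespace', or all of it."""
    m = NAIVE.match(text)
    return m.group(0) if m else text


def min_length_extract(text: str, min_chars: int = 40) -> str:
    """Return the text up to the first dot followed by whitespace that
    occurs after at least min_chars characters (assumed threshold)."""
    pattern = re.compile(r"^.{%d,}?\.(?=\s)" % min_chars, re.DOTALL)
    m = pattern.match(text)
    # Fall back to the whole text when no suitable dot is found.
    return m.group(0) if m else text


if __name__ == "__main__":
    # Invented sample text, not taken from any real article.
    sample = ("S. P. Example is a placeholder article about a person. "
              "It has a second sentence.")
    print(repr(naive_extract(sample)))       # 'S. P. '
    print(repr(min_length_extract(sample)))  # first full sentence
```

With the invented sample, the naive pattern stops at "S. P. ", which matches the bad descriptions shown above, while the minimum-length variant skips the initials and returns the whole first sentence. A fixed threshold is still only a heuristic: abbreviations that appear later in the sentence would cut it short.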
Comment 1 Max Semenik 2012-03-09 15:55:31 UTC
Fixed (well, improved) in r113475.
