Last modified: 2013-12-26 14:39:54 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T57329, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 55329 - showDiff() highlighting limitation due to difflib design
Status: NEW
Product: Pywikibot
Classification: Unclassified
Component: General
Version: unspecified
Hardware: All All
Importance: Unprioritized normal
Target Milestone: ---
Assigned To: Pywikipedia bugs
Depends on:
Blocks:
Reported: 2013-10-05 05:09 UTC by Kunal Mehta (Legoktm)
Modified: 2013-12-26 14:39 UTC (History)

Description Kunal Mehta (Legoktm) 2013-10-05 05:09:08 UTC
Originally from: http://sourceforge.net/p/pywikipediabot/bugs/509/
Reported by: cosoleto
Created on: 2007-09-28 07:35:32
Subject: showDiff() highlighting limitation due to difflib design
Assigned to: cosoleto
Original description:
showDiff() can fail to highlight a char-by-char difference because Python's difflib does not seem to fully support char-by-char comparison.

Please see the Python tracker:

* issue #1528074: "difflib.SequenceMatcher.find_longest_match() wrong result" (http://bugs.python.org/issue1528074)

* issue #1678345: "A fix for the bug #1528074 [warning: quite slow]" (http://bugs.python.org/issue1678345)
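
For illustration, char-by-char highlighting built on difflib could look roughly like the sketch below. The helper name `changed_segments` is hypothetical and this is not the actual showDiff() code; it only shows how `difflib.SequenceMatcher.get_opcodes()` exposes the changed character ranges that a highlighter would rely on.

```python
import difflib


def changed_segments(old, new):
    """Return (tag, old_text, new_text) for every non-equal opcode.

    Hypothetical helper for illustration only; not part of pywikibot.
    """
    matcher = difflib.SequenceMatcher(None, old, new)
    return [
        (tag, old[i1:i2], new[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Each tuple marks a span that a highlighter would colour.
print(changed_segments("colour", "color"))
```

If the matcher misses a long common run, the spans returned here become coarser than the real change, which is the kind of highlighting failure this report describes.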
Comment 1 Kunal Mehta (Legoktm) 2013-10-05 05:09:12 UTC
Logged In: YES 
user_id=181280
Originator: YES

File Added: difflib_test.py
Comment 2 Kunal Mehta (Legoktm) 2013-10-05 05:09:14 UTC
- **priority**: 5 --> 6
Comment 3 Kunal Mehta (Legoktm) 2013-10-05 05:09:15 UTC
Logged In: NO 

Guess this is an example
http://bildr.no/view/146822
Comment 4 Kunal Mehta (Legoktm) 2013-10-05 05:09:17 UTC
Assigning this to myself before somebody steals the issue from me. I am going to add a modified difflib version, unless the missing feature is fixed in recent Python builds or, of course, anyone objects. I am not sure about a config option to enable or disable line-by-line/char-by-char comparison.
Comment 5 Kunal Mehta (Legoktm) 2013-10-05 05:09:19 UTC
- **priority**: 6 --> 7
- **assigned_to**: nobody --> cosoleto
Comment 6 Kunal Mehta (Legoktm) 2013-10-05 05:09:20 UTC
Actually, I'd very much like to see better diff support for pywikipedia. I don't know why I missed that bug =)

I see in those bugs several comments about complexity changes, saying that a patch could change complexity from O(n*m) to O(n+m), which certainly looks interesting. If char-by-char comparison provides better diffs at a lower cost, what exactly is the reason for not supporting it in Python? :s

Two things to look at during implementation:
* Would it provide interesting diffs for all cases? (If one case is improved while other matches get worse, it's not so interesting anymore.)
* Performance changes for big diffs.

Good luck =)
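
A rough way to look at both points is to record the similarity ratio and the run time of a matcher on the same input, once char-by-char and once line-by-line. This is only a sketch: `time_matcher` is a hypothetical helper, the sample text is made up, and `time.perf_counter()` assumes Python 3.3 or later.

```python
import difflib
import time


def time_matcher(seq_a, seq_b):
    """Return (similarity ratio, elapsed seconds) for one comparison.

    Hypothetical helper; works on strings (char-by-char) or on lists
    of lines (line-by-line), since SequenceMatcher accepts both.
    """
    start = time.perf_counter()
    ratio = difflib.SequenceMatcher(None, seq_a, seq_b).ratio()
    return ratio, time.perf_counter() - start


# Made-up sample input: 100 lines with one word changed.
old = "first line\nsecond line\n" * 50
new = old.replace("second", "2nd", 1)

print("chars:", time_matcher(old, new))
print("lines:", time_matcher(old.splitlines(), new.splitlines()))
```

Running this on real page texts of different sizes would show whether a modified matcher gives worse results anywhere and how much slower it gets on big diffs.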
Comment 7 Kunal Mehta (Legoktm) 2013-10-05 05:09:22 UTC
I don't need luck because I am not going to do any big work, just a silly adaptation of already written code (with a loss of performance). If you are interested in working on this problem in a different way, you are welcome (and not only in this open project). Anyway, it's nice to see you have analysed the situation a bit.

The changed version should be safe, without regressions. I will try to document the performance loss.
Comment 8 Kunal Mehta (Legoktm) 2013-10-05 05:09:24 UTC
- **Group**:  --> confirmed
Comment 9 Strainu 2013-10-18 16:58:19 UTC
This appears to have been fixed upstream, right?
Comment 10 Andre Klapper 2013-11-19 17:02:11 UTC
Indeed, both issues linked in comment 0 (on http://bugs.python.org) have been fixed.
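
As far as I can tell, the relevant upstream change is the `autojunk` parameter of `difflib.SequenceMatcher`, available since Python 2.7.1 and 3.2. A minimal sketch to check the behaviour locally follows; the sample strings and the helper `longest_match` are made up for illustration, and it is an assumption that this is the case showDiff() ran into.

```python
import difflib

# Made-up input: in a 241-character string, 'a' and 'b' occur often
# enough for the default "popular element" heuristic to skip them.
old = "ab" * 120 + "c"
new = "c" + "ab" * 120


def longest_match(autojunk):
    """Longest matching block between old and new (hypothetical helper)."""
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=autojunk)
    return matcher.find_longest_match(0, len(old), 0, len(new))


print("autojunk=True: ", longest_match(True))
print("autojunk=False:", longest_match(False))
```

With `autojunk=False` the 240-character common run is found; with the default setting the reported match should be much shorter, so code that needs exact char-by-char highlighting would probably want to pass `autojunk=False`.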
