Last modified: 2014-11-13 22:07:43 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T72743, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 70743 - Parsoid base URL should be independent of page
Status: RESOLVED FIXED
Product: Parsoid
Classification: Unclassified
Component: DOM
Version: unspecified
Hardware: All All
Importance: Highest normal
Target Milestone: ---
Assigned To: ssastry
Reported: 2014-09-11 22:13 UTC by Roan Kattouw
Modified: 2014-11-13 22:07 UTC (History)

Description Roan Kattouw 2014-09-11 22:13:57 UTC
Currently Parsoid sets the URL of the page itself as the base URL. For example, [[OS/2]] has <base href="//en.wikipedia.org/wiki/OS/2">, so links on that page have to look like <a href="../Unix"> in order to point to the right place.

This practice is evil and should die in a fire. Instead, the base URL should be set to the base URL of the wiki, e.g. <base href="//en.wikipedia.org/wiki/">.
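
To make the difference concrete, here is a minimal sketch of how relative hrefs resolve against each kind of base, using Python's standard urljoin. It is an illustration only, not Parsoid code. The explicit https: scheme (instead of the protocol-relative // form) and the names PAGE_BASE, WIKI_BASE, and resolve are assumptions made for the example.

```python
from urllib.parse import urljoin

# Assumption: an explicit https: scheme stands in for the protocol-relative "//" form.
PAGE_BASE = "https://en.wikipedia.org/wiki/OS/2"  # page-dependent base (current behaviour)
WIKI_BASE = "https://en.wikipedia.org/wiki/"      # page-independent base (proposed)


def resolve(base: str, href: str) -> str:
    """Resolve a relative href against a base URL, as a browser would with <base>."""
    return urljoin(base, href)


# With the page as base, the link needs "../" to climb out of the "OS/" segment.
print(resolve(PAGE_BASE, "../Unix"))  # https://en.wikipedia.org/wiki/Unix
# A bare title resolves under "OS/" instead, which is the wrong page.
print(resolve(PAGE_BASE, "Unix"))     # https://en.wikipedia.org/wiki/OS/Unix
# With the wiki as base, a bare title resolves correctly on every page.
print(resolve(WIKI_BASE, "Unix"))     # https://en.wikipedia.org/wiki/Unix
```

Note that the trailing slash on the wiki base matters in this sketch: resolving "Unix" against "https://en.wikipedia.org/wiki" (no slash) replaces the last path segment and yields "https://en.wikipedia.org/Unix".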

The fact that the base URL currently depends on the page name causes lots of problems. 

Mixing content from multiple pages (like in Flow) is hard, because you have to normalize away all the <base> differences. Even embedding content from one page standalone is difficult because Parsoid's (variable) choice for the base URL is not a reasonable choice for your entire UI's base URL.

Creating new content (like in VE) would be hard if it weren't for the fact that Parsoid tolerates <a href="Foo"> where it would really expect <a href="../Foo">. If VE had to actually produce correct hrefs in its output, it would have to do some pretty evil analysis of the base URL.

Copying content from one page to another is hit by both issues: you have to process the hrefs of copied links based on both the source's base URL and the destination's base URL, which is quite error-prone.
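
As a rough sketch of the processing a client is forced to do when copying links between pages with page-dependent bases: resolve the href against the source base, then re-express it relative to the destination base. The function rebase_href is hypothetical (it is not part of Parsoid, VE, or Flow), the https: scheme is assumed for illustration, and query strings and fragments are ignored.

```python
import posixpath
from urllib.parse import urljoin, urlsplit


def rebase_href(href: str, src_base: str, dst_base: str) -> str:
    """Hypothetical helper: rewrite href so it resolves to the same
    target under dst_base as it did under src_base."""
    target = urlsplit(urljoin(src_base, href))
    dst = urlsplit(dst_base)
    # Links to a different host cannot be made relative; keep them absolute.
    if (target.scheme, target.netloc) != (dst.scheme, dst.netloc):
        return target.geturl()
    # Relative hrefs resolve against the "directory" part of the base path.
    dst_dir = posixpath.dirname(dst.path) or "/"
    return posixpath.relpath(target.path, dst_dir)


os2 = "https://en.wikipedia.org/wiki/OS/2"
linux = "https://en.wikipedia.org/wiki/Linux"

print(rebase_href("../Unix", os2, linux))  # Unix
print(rebase_href("Unix", linux, os2))     # ../Unix
```

With a single wiki-wide base, the source and destination bases are identical, so copied hrefs could be left untouched and a step like this would not be needed.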

This change has been discussed before and everyone seems to agree that it should happen, but it hasn't happened yet, so let's start tracking it here.
Comment 1 ssastry 2014-09-24 21:42:25 UTC
Let us tackle this soon since this seems to get in the way of Flow dropping data-parsoid usage.
Comment 2 Gerrit Notification Bot 2014-10-31 17:42:14 UTC
Change 170359 had a related patch set uploaded by Subramanya Sastry:
WIP: (Bug 70743): Point base href to the wiki + fix wikilink hrefs

https://gerrit.wikimedia.org/r/170359
Comment 3 Gerrit Notification Bot 2014-11-12 18:29:42 UTC
Change 170359 merged by jenkins-bot:
(Bug 70743): Point base href to wiki base; update link, img, tpl hrefs

https://gerrit.wikimedia.org/r/170359
