Last modified: 2013-07-12 17:19:45 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T49518, the corresponding Phabricator task for complete and up-to-date bug report information.

Bug 47518 - Thumbnails that are "inline" should break paragraphs


Summary:	Thumbnails that are "inline" should break paragraphs

Status:	RESOLVED WONTFIX

Product:	Parsoid
Classification:	Unclassified
Component:	General (Other open bugs)
Version:	unspecified
Hardware:	All All

Importance:	Low normal
Target Milestone:	---
Assigned To:	C. Scott Ananian

URL:
Whiteboard:
Keywords:

Depends on:
Blocks:
	Show dependency tree / graph

Reported:	2013-04-22 21:49 UTC by Mark Holmquist
Modified:	2013-07-12 17:19 UTC (History)
CC List:	6 users (show)

See Also:
Web browser:	---
Mobile Platform:	---
Assignee Huggle Beta Tester:	---

Attachments
Add an attachment (proposed patch, testcase, etc.)

Description Mark Holmquist 2013-04-22 21:49:11 UTC

e.g. "Lorem [[File:Blargh|thumb]] ipsum" should come out as

<p>Lorem</p>
<!--Thumbnail goes here-->
<p>ipsum</p>

We currently output

Lorem <!--Thumbnail--> ipsum

which is useless.

Comment 1 Gerrit Notification Bot 2013-04-23 09:54:43 UTC

Related URL: https://gerrit.wikimedia.org/r/60378 (Gerrit Change I47ac2c0e73331bb6829427d58138984337df14e0)

Comment 2 ssastry 2013-04-23 09:56:14 UTC

Ignore this -- I used the wrong bug id in the commit message.

Comment 3 Gabriel Wicke 2013-04-23 14:49:56 UTC

There is a trade-off between round-tripping and ease of editing / conformance to the PHP parser behavior here. You are right that the PHP parser creates two paragraphs, so it might make sense to emulate that behavior. We will however need to find a way to still cleanly round-trip content like this.

To me it is also not self-evident why the current behavior is 'useless'. Could you elaborate on that?

Comment 4 Mark Holmquist 2013-05-01 21:47:03 UTC

It's useless because the figure will break out of the inline context and do unexpected things. Sorry for being less than clear about it.

Adding InezK because he was the original reporter IIRC.

Comment 5 Inez Korczyński 2013-05-01 22:06:37 UTC

I agree with what Mark said in the first comment:

"Lorem [[File:Blargh|thumb]] ipsum"

should produce

<p>Lorem</p>
<!--Thumbnail goes here-->
<p>ipsum</p>

Actually the most common usage that I noticed is putting thumb image at the beginning or the end of the paragraph, for instance:

"[[File:Wiki.png|thumb]] Lorem ipsum"

which in my opinion definitely should produce two block elements (as siblings).

Comment 6 Gabriel Wicke 2013-05-01 22:12:38 UTC

I tried ignoring figure and figcaption elements in the ParagraphWrapper:
		var isBlockToken = Util.isBlockToken(token) &&
			token.name !== 'figure' &&
			token.name !== 'figcaption';

This wraps the entire line into a single paragraph, but the treebuilder (sadly not our own, but FF's) then breaks it into <p>foo </p><figure></figure> bar<p></p>.

Comment 7 C. Scott Ananian 2013-05-01 22:20:31 UTC

Doesn't the tree builder do this automagically if we use <figure> elements for the image?

Comment 8 Gabriel Wicke 2013-05-01 22:35:54 UTC

See my comment above: Ours doesn't (HTML5 lib), FF's does.

Comment 9 ssastry 2013-05-02 04:07:48 UTC

This should be possible to fix -- but requires additional tweaking of the p-wrapper. 

Tweaking the 'isBlockToken' by itself is insufficient since the output is wrapping everything in a single para whereas we effectively need p-breaks before and after the figure.

We will probably need a special case in the p-wrapper for <figure>.

But, the question is if we need <p>Lorem</p> <figure> .. </figure> <p>Ipsum</p> really if browsers treat the <figure> tag as a block element.  Doesn't "Lorem <figure> .. </figure> Ipsum"? give you the same thing since the browser should break the flow before/after figure?  Or is figure a flow element?

Comment 10 Inez Korczyński 2013-05-02 06:33:02 UTC

In VE we create new DOMDocument from htmldom that we receive from Parsoid and we don't want browser to modify it (break the flow) cause based on that DOMDocument we create our data model - then we create view nodes and render them.

Btw. as far as I know breaking the flow before/after figure will vary in different browser and also based on CSS (display: inline, etc.).

Comment 11 Gabriel Wicke 2013-05-02 16:53:31 UTC

(In reply to comment #10)
> In VE we create new DOMDocument from htmldom that we receive from Parsoid and
> we don't want browser to modify it (break the flow) cause based on that
> DOMDocument we create our data model - then we create view nodes and render
> them.
> 
> Btw. as far as I know breaking the flow before/after figure will vary in
> different browser and also based on CSS (display: inline, etc.).

Our current output for this example is plain text (not inside a paragraph) interspersed with the figure. This is not broken up in the DOM by any browser.

Can you describe why this result is not acceptable for VE?

Comment 12 Inez Korczyński 2013-05-03 00:28:31 UTC

I so no reason to keep Parsoid and Parser outputs inconsistent for that case.

Output "123<figure/>456" and "<p>123</p><figure/><p>456</p>" will render differently in browser - and we want to avoid that.

Comment 13 Gabriel Wicke 2013-05-03 04:13:29 UTC

(In reply to comment #12)
> I so no reason to keep Parsoid and Parser outputs inconsistent for that case.
> 
> Output "123<figure/>456" and "<p>123</p><figure/><p>456</p>" will render
> differently in browser - and we want to avoid that.

The difference is hardly noticeable unless there is custom styling though. Without the paragraphs, this round-trips already and is valid HTML.

Comment 14 ssastry 2013-05-03 07:51:02 UTC

To further clarify:

123 <figure/> 456 and <p>123</p> <figure/> <p>456</p> render identically in the browser modulo any CSS styles set on p-tags.  We verified this on en.wp.org and the difference is in a small bottom margin on p-tags.  However, this can fixed in two ways:

1. gwicke indicated that VE wraps all plain text into p-tags anyway, so would this take care of it?  Can we verify this on the VE end?

2. Since figure only shows up in Parsoid's output, not PHP parser output, could this be equivalently fixed by adding some appropriate top/bottom margin on figure as required?

We could fix this in the p-wrapper, but it seems unnecessary and we are keen to avoid adding special cases in addition to what already exists in the code to deal with PHP parser oddities and old hacks.  Special cases add longer-term maintenance headaches.  So, if that could be avoided by (1) and/or (2) above, that would be the best solution all around given that current Parsoid output RTs correctly and is valid HTML as gwicke indicated.

Comment 15 ssastry 2013-06-05 18:14:00 UTC

Please reopen if this is still an issue.

Comment 16 Andre Klapper 2013-07-04 10:34:24 UTC

[Parsoid component reorg by merging JS/General and General. See bug 50685 for more information. Filter bugmail on this comment. parsoidreorg20130704]

Comment 17 C. Scott Ananian 2013-07-12 17:19:45 UTC

Adopting this bug, just in case it is re-opened.  I am king of all images!

Wikimedia Bugzilla is closed!

Search

Personal tools

Navigation

Links