▹ 000062
`</li>\n</ul>` is wrapped in `p` tags, producing invalid HTML
This is a _very_ strange bug.

Markdown:

    <p>Paragraph.</p>
    <ul>
    <li>
    <p>Paragraph.</p>
    </li>
    </ul>

Desired HTML output:

    <p>Paragraph.</p>
    <ul>
    <li>
    <p>Paragraph.</p>
    </li>
    </ul>

Actual HTML output:

    <p>Paragraph.</p>
    <ul>
    <li>
    <p>Paragraph.</p>
    <p></li>
    </ul></p>

I've simplified the input text as much as possible – correct behaviour results if either paragraph is removed, or if the first paragraph is replaced by its raw Markdown equivalent (i.e. "Paragraph.\n" rather than "<p>Paragraph.</p>").

**The input text is valid HTML and should be left as is.**

###Comments

__By Waylan on 7/14/2010__

First, let's run the given sample through the parser:

    >>> import markdown
    >>> md = markdown.Markdown()
    >>> md.convert("<p>foo</p>\n<ul>\n<li>\n<p>bar</p>\n</li>\n</ul>")
    u'<p>foo</p>\n<ul>\n<li>\n<p>bar</p>\n\n<p></li>\n</ul></p>'
    >>> md.htmlStash.rawHtmlBlocks
    [(u'<p>foo</p>\n<ul>\n<li>\n<p>bar</p>', False), (u'</li>', False), (u'</ul>', False)]

Notice that the raw HTML is seen as three separate blocks, and the second and third contain closing tags only. That's why they are getting wrapped in a `<p>` tag.

Now let's remove that first paragraph from the input text:

    >>> md.reset()
    >>> md.convert("<ul>\n<li>\n<p>bar</p>\n</li>\n</ul>")
    u'<ul>\n<li>\n<p>bar</p>\n</li>\n</ul>'
    >>> md.htmlStash.rawHtmlBlocks
    [(u'<ul>\n<li>\n<p>bar</p>\n</li>\n</ul>', False)]

This time everything is seen as one block, so it works correctly. So the trick is to figure out why the raw HTML parser is thrown off by the first paragraph so that it doesn't include the closing tags on the last two lines in the same block -- and then to fix it.

Now one final test:

    >>> md.reset()
    >>> md.convert("<p>foo</p>\n\n<ul>\n<li>\n<p>bar</p>\n</li>\n</ul>")
    u'<p>foo</p>\n\n<ul>\n<li>\n<p>bar</p>\n</li>\n</ul>'
    >>> md.htmlStash.rawHtmlBlocks
    [(u'<p>foo</p>', False), (u'<ul>\n<li>\n<p>bar</p>\n</li>\n</ul>', False)]

Here I added a blank line between the first paragraph and the list, and everything works as it should.
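The broken and correct outputs above can be told apart mechanically, which may be handy as a regression check. The following is a minimal sketch using only the standard library; `wrapped_closing_tags` is a hypothetical helper, not part of Python-Markdown. It flags any closing tag that directly follows an opening `<p>`, which is the symptom reported in this ticket.

```python
import re

# Hypothetical helper, not part of Python-Markdown: matches a closing tag
# that appears immediately after an opening <p> (optionally with whitespace).
_WRAPPED_CLOSE = re.compile(r"<p>\s*(</[a-zA-Z][a-zA-Z0-9]*>)")


def wrapped_closing_tags(html):
    """Return the closing tags that directly follow an opening <p>."""
    return _WRAPPED_CLOSE.findall(html)


# Output strings copied from the interpreter sessions in this ticket.
broken = "<p>foo</p>\n<ul>\n<li>\n<p>bar</p>\n\n<p></li>\n</ul></p>"
correct = "<p>foo</p>\n\n<ul>\n<li>\n<p>bar</p>\n</li>\n</ul>"

print(wrapped_closing_tags(broken))   # ['</li>']
print(wrapped_closing_tags(correct))  # []
```

Note that this check would also flag a genuinely empty `<p></p>`, so it is only a heuristic for this particular failure.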
In the input that breaks, I suspect that the parser is matching the first `<p>` with the last `</p>` and completely ignoring the `</p><ul><li><p>` in the middle. So to fix it we need the parser to stop on the first `</p>`, then restart on the list. It doesn't really matter whether everything ends up in one block or two, as long as each block contains a valid block of HTML.
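The "stop on the first matching close tag, then restart" idea can be sketched as follows. This is only an illustration using the standard library, not Python-Markdown's actual raw-HTML parser; `split_balanced_blocks`, `TAG`, and `VOID` are hypothetical names, and the sketch assumes well-nested input (it counts nesting depth but does not check that tag names match).

```python
import re

# Hypothetical sketch, not Python-Markdown code.
# Matches an opening, closing, or self-closing tag.
TAG = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*?(/?)>")
# A few elements that never have a closing tag (assumed list, not exhaustive).
VOID = {"br", "hr", "img", "input", "meta", "link"}


def split_balanced_blocks(text):
    """Split text into top-level HTML blocks, each ending where the
    nesting depth returns to zero. Text outside any block is ignored."""
    blocks = []
    depth = 0
    start = None
    for match in TAG.finditer(text):
        closing = match.group(1)
        name = match.group(2).lower()
        self_closing = match.group(3)
        if self_closing or name in VOID:
            continue  # does not change the nesting depth
        if closing:
            if depth == 0:
                continue  # stray closing tag outside any block
            depth -= 1
            if depth == 0:
                # The block that opened at `start` is now balanced.
                blocks.append(text[start:match.end()])
                start = None
        else:
            if depth == 0:
                start = match.start()  # a new top-level block begins
            depth += 1
    return blocks


source = "<p>foo</p>\n<ul>\n<li>\n<p>bar</p>\n</li>\n</ul>"
for block in split_balanced_blocks(source):
    print(repr(block))
```

With the failing input from this ticket, the sketch yields two blocks, `<p>foo</p>` and the complete list, which is the same split the parser produced when a blank line separated the paragraph from the list.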