Bug Fix to Preserve Encoded HTML Characters in Post Body Between Edits

November 2nd, 2019

A bug was introduced in the previous mp-wp patch that caused encoded ("escaped") html characters to be converted back to their symbols after being loaded into the post editor and re-saved.

#ossasepia 2019-10-31 18:33 whaack: lol til after hitting 'save draft' on a mpwp post with html characters escaped, the escaped characters are evaluated to their html characters, thus hitting 'save draft' twice in succession can alter the content of a post.

It was a bug I had run into myself, and apparently others as well. Diana Coman encouraged further investigation and helped to isolate the problem.

#ossasepia 2019-11-01 20:26 billymg: diana_coman: steps to reproduce: 1) create a new post, include some encoded html character, e.g. &lt; 2) save post, 3) reload post edit page, 4) the &lt; you previously saved will render as a < in the editor text area, so a subsequent save inserts the < into the db, not the &lt; you had before
#ossasepia 2019-11-01 20:26 diana_coman: billymg: did that and...nope?

[...]

#ossasepia 2019-11-01 20:32 diana_coman: billymg: I start suspecting it's something you broke with the latest /trimming patches really; iirc on ossasepia I have the original mpwp, ie not pressed to last vpatch

And indeed it was!

#trilema 2019-11-01 20:49 billymg: yeah, when originally ripping it out, my brain parsed a "if (! $richedit ) ..." as "if ( $richedit ) ..." and so removed the line

So let's see exactly the lines of the patch that caused this bug and also what may have caused my original confusion.


diff -uNr a/mp-wp/wp-includes/formatting.php b/mp-wp/wp-includes/formatting.php
--- a/mp-wp/wp-includes/formatting.php 10526d3d84bbe766511af25287f01bbe1125a06d18babb93e1b7c33488c1f1df15ea3e706848a67d36c30edac25d3a4dd360bc563ac21093f6bfe0183f9c4388
+++ b/mp-wp/wp-includes/formatting.php 96c9b9b0b8da6f255e10fd47fd4053958417127524a5b8be961e126cce7fde0476bd5c2b20c4580517a75b01e9751636c8dc0c6f250ab8f94dbbe8097057cf67
@@ -775,20 +775,16 @@
 /**
  * Acts on text which is about to be edited.
  *
- * Unless $richedit is set, it is simply a holder for the 'format_to_edit'
- * filter. If $richedit is set true htmlspecialchars() will be run on the
- * content, converting special characters to HTMl entities.
+ * Holder for the 'format_to_edit'
+ * filter.
  *
  * @since 0.71
  *
  * @param string $content The text about to be edited.
- * @param bool $richedit Whether or not the $content should pass through htmlspecialchars(). Default false.
  * @return string The text after the filter (and possibly htmlspecialchars()) has been run.
  */
-function format_to_edit($content, $richedit = false) {
+function format_to_edit($content) {
 	$content = apply_filters('format_to_edit', $content);
-	if (! $richedit )
-		$content = htmlspecialchars($content);
 	return $content;
 }

Notice the contradiction in the comment:


If $richedit is set true htmlspecialchars() will be run on the content, converting special characters to HTMl entities.

And the code it describes:


if (! $richedit )
	$content = htmlspecialchars($content);

Which does precisely the opposite of what the comment describes. Now, the introduction of the bug was still 100% my fault, and I honestly can't say if I even bothered reading the comment at the time. Perhaps it wasn't the comment that misled me but the "!" wedged in there, which, after reading so many lines after however many days, simply slipped past me. Either way, having a test in place would've caught the bug before it went into production and so along with a patch to fix the bug I also added a test to assert the correct behavior.

MP-WP Patch and Signature

mp-wp_apply-htmlspecialchars-to-post-edit-content.vpatch
mp-wp_apply-htmlspecialchars-to-post-edit-content.vpatch.billymg.sig

MP-WP-Tests Patch and Signature

mp-wp-tests_add-test-for-encoded-html-characters-in-edit-ui.vpatch
mp-wp-tests_add-test-for-encoded-html-characters-in-edit-ui.vpatch.billymg.sig

« Additional Tests for MP-WP, Now According to Spec
Travelog: Costa Rica »

7 Comments

  1. whaack says:

    Welcome back billymg.com!

    Nice catch. Hopefully the future time spent writing about html will be ~0.

    And lol re the comment for $richedit.

  2. billymg says:

    Feels good to be live again (ty BingoBoingo)!

    Yeah, I thought that was pretty funny too. I imagine there are many such cases throughout the codebase. Looking forward to being able to give this more of my attention in the next few months, I'm starting to get jealous of the fun that's being had over in #ossasepia.

  3. spyked says:

    Thanks for pushing this, billymg!

    The decision to call htmlspecialchars there seems pretty arbitrary to me, I don't know why the WP developers made that choice. In my own code I decided to do this right before the call to the_editor (in wp-admin/edit-form-advanced.php), which admittedly, is also arbitrary, but at least it was relatively easy to spot by poking at the code.

  4. billymg says:

    @spyked No problem, and my apologies that it happened in the first place. Hopefully when the rest of the tests are in place future snips will come with more confidence.

  5. [...] the separation between the main MP-WP and this stray branch, this V patch also includes billymg's latest changes. Furthermore, I have mirrored billymg's V patch and seal, to which I have added my [...]

  6. [...] .vpatch sits on top of billymg's current v-tree head and hopefully doesn't break anything but I'm sure he'll let me know if it does2 [...]

  7. [...] mp-wp_apply-htmlspecialchars-to-post-edit-content.vpatch (billymg, spyked) (sauce) [...]

Leave a Reply

*
*

You can use the following HTML tags in your comment: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>