Hey, I’m not even using WordPress on my website, so how could it annoy me enough to make me write an entry about it? Maybe you have guessed by now: it’s related to PHP Markdown, the text-to-HTML converter I maintain, which also happens to be a WordPress plugin. As a plugin, it applies a filter to each entry and comment of your weblog so you can write your text using Markdown instead of plain HTML. WordPress allows plugins to add filters for posts and comments that are applied automatically. Simple… in theory.
It looks like, while it supports filters, WordPress still expects you to write valid HTML when you post an entry. If you don’t, it corrects it. That’s great, except that Markdown isn’t HTML, and the autolink construct in Markdown is seen as a tag to be corrected (an autolink looks like this: <http://example.com>). Once WordPress has “corrected” it, you don’t see any link in the post: it has been converted to something like <http ://example.com></http> by WordPress, which is not a link and is invisible in the browser. The same issue will arise if you write sample HTML code (inside a Markdown code block or span) and the sample happens to contain invalid XHTML.
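To make the failure concrete, here is a minimal Python sketch of a tag balancer that mistakes an autolink for an unclosed tag. It is only a toy stand-in, not WordPress’s actual correction code; the function name `naive_balance` and the regular expression are my own assumptions, and the real filter produces slightly different output (as shown above).

```python
import re

# Toy stand-in for an HTML tag balancer. This is NOT WordPress's real code;
# it only illustrates why an autolink looks like an unclosed tag.
TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)([^>]*)>")

def naive_balance(text):
    """Append a closing tag for every opening 'tag' that is never closed."""
    open_tags = []
    for match in TAG.finditer(text):
        is_closing = bool(match.group(1))
        name = match.group(2).lower()
        if not is_closing:
            open_tags.append(name)
        elif name in open_tags:
            # Discard everything opened after the matching opener, and the opener.
            while open_tags.pop() != name:
                pass
    return text + "".join("</%s>" % name for name in reversed(open_tags))

# The autolink is parsed as an opening <http> tag, so a </http> gets appended.
mangled = naive_balance("See <http://example.com> for details.")
# Properly nested HTML is left alone.
untouched = naive_balance("<em>ok</em>")
```

In this sketch the autolink survives with a stray `</http>` tacked on at the end; either way, Markdown no longer sees the text that was typed.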
Fortunately, the user can fix this by unchecking the “WordPress should correct invalidly nested XHTML automatically” box in the WordPress admin interface, under Tools > Writing. But even if you do this, you have to go back and edit any entry that was already “corrected” by this filter, because the HTML correction is applied before the entry is saved in the database.
(By the way, this is exactly why I can’t make Markdown apply its filter before the HTML correction filter: Markdown runs at rendering time. If it ran prior to saving the entry, going back to edit your entry afterward would reveal the HTML tags generated by Markdown instead of the text you typed in. Adding a filter prior to saving the entry is not possible anyway, at least not until WordPress 1.5 is out.)
Sadly, that’s not all: there is the RSS excerpt issue. WordPress default templates remove any HTML tag from the description or the Atom summary, so it strips any tag from the excerpt prior to publishing the feed. I have no problem with this, except for one thing: it strips the tags before passing the text through the filters. So, even if I apply Markdown to the excerpts, autolinks and any HTML sample code included in a code block or span will be stripped before Markdown has a chance to look at them. Worse: Markdown will generate tags when it is called to build the excerpt. These tags won’t be stripped even if the template says so. :-(
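The ordering problem can be sketched in a few lines of Python. Everything here is a simplified assumption made for illustration: `strip_tags` and `toy_markdown` are hypothetical stand-ins that handle only autolinks and `*emphasis*`, not the real WordPress or PHP Markdown code.

```python
import re

def strip_tags(text):
    """Toy tag stripper: remove anything that looks like <...>."""
    return re.sub(r"<[^>]*>", "", text)

def toy_markdown(text):
    """Toy converter: handles only autolinks and *emphasis*."""
    text = re.sub(r"<(https?://[^\s>]+)>", r'<a href="\1">\1</a>', text)
    return re.sub(r"\*([^*]+)\*", r"<em>\1</em>", text)

excerpt = "Visit <http://example.com> *today*."

# Order described above: tags are stripped first, then the filter runs.
# The autolink is lost and the generated <em> tag is never stripped.
stripped_first = toy_markdown(strip_tags(excerpt))

# Order that would work: filter first, then strip the generated tags.
filtered_first = strip_tags(toy_markdown(excerpt))

print(stripped_first)  # Visit  <em>today</em>.
print(filtered_first)  # Visit http://example.com today.
```

The first result shows both symptoms at once: the link has vanished, and a tag has leaked into an excerpt that was supposed to be tag-free.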
To make things clear, PHP Markdown as a WordPress plugin currently does not filter the RSS excerpt at all. There is just no way to ensure it will produce the correct result and not make the feeds invalid. I’m thinking of some sort of tradeoff for the next version of PHP Markdown. And I hope that the next release of WordPress does something to help with the situation.
Until then, I suggest that those who care about their feed manually write an excerpt for each entry, without any Markdown formatting. To do that, you will need to use Advanced Editing for your post.
Another solution would be to enable full content in the feeds. The excerpt is problematic, but the full content is properly filtered. You can enable it from Options > Reading > Syndication Feeds > For each article, show full text.