I’ve been contacted many times by people asking how to disable HTML within PHP Markdown. Up until a few months ago, I was opposed to offering that possibility on the grounds that HTML is part of the Markdown syntax. After all, Markdown was designed so that if the syntax doesn’t have what you want, or if you just don’t know the syntax, you can fall back to HTML.
Removing HTML support in that context would mean more pressure to implement a Markdown-specific syntax for things that Markdown (with HTML) doesn’t really need. It would also force people to learn the syntax because they couldn’t use HTML anymore. The better option, it seems to me, is to simply restrict which HTML tags and attributes can be used within Markdown.
But I kept receiving questions about how best to disable HTML within PHP Markdown, and for various reasons many weren’t much impressed by my arguments. And the technical answer wasn’t very straightforward: unless you want to sacrifice code blocks, code spans, and automatic links, you just can’t escape in advance the less-than character (<) used to open a tag.
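To make the problem concrete, here is a minimal sketch of the naive pre-escaping approach. The sample text is made up for illustration, and only PHP’s built-in str_replace() is used, so nothing here depends on PHP Markdown itself.

```php
<?php
// Hypothetical input: an automatic link, a code span, and a raw HTML tag.
$text = "See <http://example.com/>, use `a < b`, and <b>no bold</b>.";

// The naive approach: escape every "<" before handing the text to Markdown.
$escaped = str_replace('<', '&lt;', $text);

echo $escaped, "\n";
// See &lt;http://example.com/>, use `a &lt; b`, and &lt;b>no bold&lt;/b>.
```

The raw tag is neutralized as intended, but the automatic link has lost its opening delimiter, so Markdown no longer recognizes it. The code span now contains the text &lt;, and since Markdown encodes ampersands inside code spans, readers end up seeing a literal "&lt;" where the author typed "<".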
Basically, many people implemented it wrong without even noticing (because they rarely use automatic links, code blocks, or code spans). It appeared to me that this was more harmful to users trying to learn Markdown than the lack of an HTML fallback. So I changed my stance on the problem and decided to help those who want to disable HTML completely.
If you want to disable HTML in PHP Markdown, please don’t hack.
PHP Markdown has a (hidden) setting in its latest version to do exactly that. Just instantiate a parser yourself and set the no_markup property this way:
    $parser = new Markdown_Parser; // or MarkdownExtra_Parser
    $parser->no_markup = true;
    $html = $parser->transform($text);
There’s also a no_entities property you can set the same way if you want to disallow character entities.
Note that by forbidding HTML markup you’re denying the users of your script, CMS, or web application the necessary fallback for elements Markdown does not provide, such as <object> (required to embed video), and many others. A better idea may be to filter unwanted HTML out of the output instead, using something such as kses, but I’ll let you be the judge of what’s best for you and your users.
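As a rough illustration of filtering the output rather than the input, the sketch below uses PHP’s built-in strip_tags() as a stand-in for kses. This is a substitution made only to keep the example self-contained: the tag whitelist and the sample HTML are hypothetical, and unlike kses, strip_tags() does not check attributes at all, so it is not a safe replacement for a real filter.

```php
<?php
// Hypothetical whitelist; adjust it to what your users actually need.
$allowed_tags = '<p><br><em><strong><a><code><pre><blockquote><ul><ol><li>';

// Stand-in for the HTML returned by the parser's transform() call.
$html = "<p>Hello <em>world</em></p>\n<script>alert(1)</script>";

// Remove every tag that is not on the whitelist; the text inside is kept.
$filtered = strip_tags($html, $allowed_tags);

echo $filtered, "\n";
```

The point is the order of operations: Markdown runs first with HTML enabled, so code spans and automatic links work normally, and the unwanted tags are removed afterwards. An attribute-aware filter such as kses would take the place of the strip_tags() call.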
With power comes responsibility: please make sure your users have the best Markdown experience you can offer them. Thanks.