---
title: "HTML Sanitization In Rails That Actually Works"
date: 2009-11-23T00:00:00+00:00
draft: false
canonical_url: https://www.viget.com/articles/html-sanitization-in-rails-that-actually-works/
---

Assuming you don't want to simply escape everything, sanitizing user input is one of the relative weak points of the Rails framework. On [SpeakerRate](http://speakerrate.com/), where users can use [Markdown](http://daringfireball.net/projects/markdown/) to format comments and descriptions, we've run up against some of the limitations of Rails' built-in sanitization features, so we decided to dig in and fix it ourselves.

In creating our own sanitizer, our goals were threefold: we want to **let a subset of HTML in**. As the [Markdown documentation](http://daringfireball.net/projects/markdown/syntax#html) clearly states, "for any markup that is not covered by Markdown's syntax, you simply use HTML itself." In keeping with the Markdown philosophy, we can't simply strip all HTML from incoming comments, so the included [HTML::WhiteListSanitizer](https://github.com/rails/rails/blob/master/actionpack/lib/action_controller/vendor/html-scanner/html/sanitizer.rb#LID60) is the obvious starting point. Additionally, we want to **escape, rather than remove, non-approved tags**, since some commenters want to discuss the merits of, say, [`
[**HTML::StathamSanitizer**](https://gist.github.com/241114).
User-generated markup, you're on notice: this sanitizer will take its
shirt off and use it to kick your ass. At this point, I've written more
about the code than code itself, so without further ado:
```ruby
module HTML
  class StathamSanitizer < WhiteListSanitizer

    protected

    # Escape any tag that re-opens an element we are already inside of.
    def tokenize(text, options)
      super.map do |token|
        if token.is_a?(HTML::Tag) && options[:parent].include?(token.name)
          token.to_s.gsub(/</, "&lt;")
        else
          token
        end
      end
    end

    def process_node(node, result, options)
      result << case node
      when HTML::Tag
        # Keep options[:parent] as a stack of currently open tag names.
        if node.closing == :close && options[:parent].first == node.name
          options[:parent].shift
        elsif node.closing != :self
          options[:parent].unshift node.name
        end

        process_attributes_for node, options

        if options[:tags].include?(node.name)
          node
        else
          # Drop bad tags entirely; escape every other non-approved tag.
          bad_tags.include?(node.name) ? nil : node.to_s.gsub(/</, "&lt;")
        end
      else
        # Text nodes: drop the contents of bad tags, escape everything else.
        bad_tags.include?(options[:parent].first) ? nil : node.to_s.gsub(/</, "&lt;")
      end
    end
  end
end
```
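
If you'd like to poke at the "escape, don't remove" idea without a Rails app handy, here's a rough standalone sketch in plain Ruby. To be clear, this is not the sanitizer above: `ALLOWED_TAGS`, `ALLOWED_TAG_PATTERN` and `escape_unapproved_tags` are names made up for illustration, the whitelist is hypothetical, and a regex stands in for the real tokenizer. It escapes every `<` with the same substitution the sanitizer uses, then restores only bare, attribute-free whitelisted tags. It does no parent tracking, so it won't catch unclosed tags.

```ruby
# Hypothetical whitelist for this sketch; the sanitizer above reads options[:tags].
ALLOWED_TAGS = %w[em strong code].freeze

# Matches an escaped, attribute-free opening or closing tag from the whitelist.
ALLOWED_TAG_PATTERN = /&lt;(\/?)(#{ALLOWED_TAGS.join("|")})>/i

def escape_unapproved_tags(text)
  # Escape every "<" first, the same substitution the sanitizer applies.
  escaped = text.gsub("<", "&lt;")
  # Then restore only whitelisted tags that carry no attributes.
  escaped.gsub(ALLOWED_TAG_PATTERN) { "<#{$1}#{$2.downcase}>" }
end

puts escape_unapproved_tags("Use <em>emphasis</em>, not <blink>blink</blink>.")
# prints: Use <em>emphasis</em>, not &lt;blink>blink&lt;/blink>.
```

Escaping first and un-escaping second means anything the pattern doesn't recognize stays visible as text, which is what lets commenters talk about a tag without it being rendered.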
As always, download and fork [at the
'hub](https://gist.github.com/241114).