Clean User Generated HTML In Ruby

Discussion in 'Ruby on Rails' started by pradeep, Jul 29, 2013.

  1. pradeep

    Web applications always have to deal with user input, and these days that input is often HTML, which brings the risk of malicious markup such as XSS. The best way to handle it is to sanitize it, i.e. remove unwanted HTML tags and attributes. For example, if we don't want links or scripts in the user's HTML, we have to strip the a and script tags. In another case we might want to allow anchor tags, but only with absolute URLs. The possible cases are numerous; in this article we'll try to get the basics right.

    In this article we'll look at the Ruby gem Sanitize, a whitelist-based sanitizing module. That means you have to list the allowed tags, attributes, etc.; you cannot instead specify a list of disallowed tags or attributes.
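
To make the whitelist idea concrete, here is a toy model in plain Ruby. It is only a sketch of the decision rule, not how Sanitize is implemented: the `ALLOWED` table and the `allowed?` helper are hypothetical names made up for this illustration, and no HTML parsing is done.

```ruby
# Toy model of a whitelist: anything not listed is rejected by default.
# ALLOWED and allowed? are hypothetical names, not part of the Sanitize gem.
ALLOWED = {
  'b' => [],                 # tag allowed, no attributes
  'a' => ['href', 'title']   # tag allowed with these attributes only
}.freeze

# Returns true when the tag (and the attribute, if given) is on the whitelist.
def allowed?(tag, attribute = nil)
  return false unless ALLOWED.key?(tag)
  attribute.nil? || ALLOWED[tag].include?(attribute)
end

puts allowed?('a', 'href')     # listed tag, listed attribute
puts allowed?('a', 'onclick')  # listed tag, unlisted attribute
puts allowed?('script')        # unlisted tag
```

The point of the design is that a new or unexpected tag is rejected without anyone having to think of it in advance, which is what a list of disallowed tags cannot guarantee.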

    Installing Sanitize



    It's pretty easy to install the Sanitize gem. Just issue the following command and wait:

    Code:
    $ gem install sanitize
    

    Understanding & Using Sanitize



    Sanitize comes with a few built-in modes that cover common configurations without much effort. Here are the built-in modes:

    Sanitize::Config::RESTRICTED - Only allows very simple inline formatting. No links, images, or block elements.
    Sanitize::Config::BASIC - Allows a wider range of formatting tags, plus links & lists. Does not allow tables & images, and a rel="nofollow" attribute is added to all links.
    Sanitize::Config::RELAXED - Like BASIC, but allows images & tables and does not add the rel="nofollow" attribute to links.

    Example:
    Code:
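    require 'rubygems'   # only needed on Ruby 1.8
    require 'sanitize'
    
    # Sample input: a link followed by an image
    html = '<a href="http://www.go4expert.com">G4E</a>, <img src="http://www.go4expert.com/logo.jpg">'
    
    # RESTRICTED allows neither links nor images, so only the text is kept.
    # Note: Sanitize 3.0 and later renamed Sanitize.clean to Sanitize.fragment.
    print Sanitize.clean(html, Sanitize::Config::RESTRICTED)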
    
    You can also customize it to your needs. Here's how to go about it:

    Code:
    require 'rubygems'
    require 'sanitize'
    
    html = '<a href="http://www.go4expert.com" id="MyLink" title="Go4expert">G4E</a>, <img src="http://www.go4expert.com/logo.jpg"> <div id="myIdDiv" class="myClass">Test for div tag</div> Send email to <a href="mailto:info@go4expert.com">info@go4expert.com</a> or Download from <a href="ftp://www.go4expert.com">FTP Downloads</a>'
    
    ## Now I'll specify the list of allowed tags, tag-specific attributes, and tag-specific protocols
    print Sanitize.clean(html, :elements => ['a', 'span', 'div'],
        :attributes => {'a' => ['href', 'title'], 'div' => ['class'] },
        :protocols => {'a' => {'href' => ['http']} }
    )
    
    ## the above config is pretty easy to understand
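
To see what the :protocols setting above implies for the three links in the sample HTML, the check can be approximated with Ruby's standard URI library. This is a simplified sketch, not Sanitize's internal logic; `ALLOWED_SCHEMES` and `href_allowed?` are hypothetical names used only here.

```ruby
require 'uri'

# Hypothetical stand-in for :protocols => {'a' => {'href' => ['http']}}
ALLOWED_SCHEMES = ['http'].freeze

# True when the URL is absolute and its scheme is on the whitelist.
def href_allowed?(url)
  scheme = URI.parse(url).scheme
  !scheme.nil? && ALLOWED_SCHEMES.include?(scheme.downcase)
rescue URI::InvalidURIError
  false # unparseable URLs are rejected
end

# The three href values from the sample HTML above
[
  'http://www.go4expert.com',
  'mailto:info@go4expert.com',
  'ftp://www.go4expert.com'
].each do |url|
  puts "#{url} => #{href_allowed?(url)}"
end
```

Only the http link passes this check. As far as I know, Sanitize drops an href that fails the protocol test but keeps the a element and its text, so add 'mailto' or 'ftp' to the list if you want those links to survive.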
    
    Try the code for yourself and tweak it to improve your understanding. I hope this was helpful.
     
  2. jyotimiss123

    It's really nice.
     
