Using HTML Purifier in CodeIgniter to clean user-generated content
Accepting user-generated content into your web application is one of the riskiest parts of web application development.
Luckily for PHP users, HTML Purifier is a library which goes a long way toward solving this issue.
HTML Purifier is a mature, open-source PHP class library designed to clean up and sanitize HTML input. It is actively developed and tested so that newly discovered exploits and vulnerabilities are also guarded against.
It uses a whitelist approach: it both removes XSS vulnerabilities and returns standards-compliant, ‘safe’ HTML output. It also has a large set of configuration options, so it can be used in a variety of situations.
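To make the whitelist idea concrete, here is a minimal standalone sketch (outside CodeIgniter) that runs a fragment through HTML Purifier with its default settings. The install path in PURIFIER_AUTOLOAD and the helper name purify_with_defaults are assumptions made for illustration; HTMLPurifier.auto.php is the autoloading entry point that ships in the library folder.

```php
<?php
// Assumed location of the unpacked library; adjust to your own layout.
const PURIFIER_AUTOLOAD = __DIR__ . '/htmlpurifier/library/HTMLPurifier.auto.php';

/**
 * Hypothetical helper: purify a fragment using HTML Purifier's default whitelist.
 */
function purify_with_defaults($dirtyHtml)
{
    // Registers HTML Purifier's autoloader on first use.
    require_once PURIFIER_AUTOLOAD;

    $config   = HTMLPurifier_Config::createDefault();
    $purifier = new HTMLPurifier($config);

    return $purifier->purify($dirtyHtml);
}

// Run the demo only when the library is present at the assumed path.
if (is_file(PURIFIER_AUTOLOAD)) {
    $dirty = '<p onclick="steal()">Hi <script>alert(1)</script><strong>there</strong></p>';
    echo purify_with_defaults($dirty), "\n";
}
```

With the default configuration, the script element and the onclick attribute are not on the whitelist, so they should be absent from the output while the paragraph and strong tags survive.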
As many of you already know, CodeIgniter is a high-performance, flexible PHP framework which aids rapid application development. To keep things simple, CodeIgniter requires class libraries to be in a specific format. This means HTML Purifier must be modified to work with CI.
Part 1: Adding HTML Purifier as a library to CodeIgniter
This section follows the blog post by Ortz on making HTML Purifier work in CodeIgniter.
- Download the latest version of the HTML Purifier library. Put the contents of the HTML Purifier library folder into the libraries folder in your CodeIgniter application folder.
- Now go to HTMLPurifier.includes.php and comment out the line:
So that it now reads:
- Then go to the file called HTMLPurifier.php and add this snippet on line 2, just under the ‘<?php’ tag:
- The library can then be loaded with $this->load->library('HTMLPurifier');
Now that we have the HTMLPurifier library working in our CodeIgniter installation we can implement it in our application.
Part 2: Using HTML Purifier in a CodeIgniter project
We then follow the guidance of Rdjs to implement comment sanitizing in CodeIgniter.
Rdjs’s article on validating comments in CodeIgniter is excellent. He shows how to add the code to CodeIgniter and use it to clean comments submitted by users.
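// a method of the controller (or model) that loaded the library
function cleanComment($dirtyHtml)
{
    // load the config and override defaults as necessary
    // (dot-separated keys are the form used by HTML Purifier 4.x;
    // older releases took three arguments: 'HTML', 'Doctype', value)
    $config = HTMLPurifier_Config::createDefault();
    $config->set('HTML.Doctype', 'XHTML 1.0 Transitional');
    $config->set('HTML.AllowedElements', 'a,em,blockquote,p,strong,pre,code');
    $config->set('HTML.AllowedAttributes', 'a.href,a.title');
    $config->set('HTML.TidyLevel', 'light');
    // run the submitted HTML through the purifier
    $cleanHtml = $this->htmlpurifier->purify($dirtyHtml, $config);
    return $cleanHtml;
}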
The cleanComment function returns clean user input in a string. The tags which match the items passed in the ‘AllowedElements’ config option remain while all other tags are removed.
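The allow-list settings can also be kept in one place and applied in a loop, which makes it easy to see exactly what a comment may contain. The sketch below is one way to do that and is not taken from Rdjs's article: the helper names comment_directives and clean_comment are hypothetical, and the dot-separated directive keys assume HTML Purifier 4.x. The purifier and config objects are passed in as arguments, so the helper does not depend on how the library was loaded.

```php
<?php
/**
 * Hypothetical helper: the directives used for comments, in one place.
 */
function comment_directives()
{
    return array(
        'HTML.Doctype'           => 'XHTML 1.0 Transitional',
        'HTML.AllowedElements'   => 'a,em,blockquote,p,strong,pre,code',
        'HTML.AllowedAttributes' => 'a.href,a.title',
        'HTML.TidyLevel'         => 'light',
    );
}

/**
 * Hypothetical helper: apply the directives to $config, then purify.
 * $purifier needs a purify($html, $config) method and $config a
 * set($key, $value) method, which is what HTMLPurifier and
 * HTMLPurifier_Config provide.
 */
function clean_comment($purifier, $config, $dirtyHtml)
{
    foreach (comment_directives() as $key => $value) {
        $config->set($key, $value);
    }

    return $purifier->purify($dirtyHtml, $config);
}
```

Inside a controller this might be called as clean_comment($this->htmlpurifier, HTMLPurifier_Config::createDefault(), $this->input->post('comment')), where 'comment' is an assumed form field name. With these settings, a submitted img or script tag, or an onclick attribute, would be removed because none of them appears on the allowed lists.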
This entry was posted on Thursday, May 21st, 2009 at 6:36 pm and is filed under Programming, Tutorials.