XML external entity (XXE) attacks have featured in the OWASP Top Ten security risks (as category A4 in the 2017 list). XXE attacks are caused by XML parsers that have entity processing enabled. Here’s an example of a simple Ruby program that has entity processing enabled in Nokogiri, its XML parser:
This allows our XML parser to read the contents of our local filesystem; the key point is that this happens because the NOENT flag is enabled. When we run the program, we see the contents of /etc/passwd (limited to the first 10 lines for brevity):
$ ruby xxe.rb | head -n 10
<?xml version="1.0"?>
<!DOCTYPE root [
<!ENTITY users SYSTEM "file:///etc/passwd">
]>
<root>
<child>##
# User Database
#
# Note that this file is consulted directly only when the system is running
# in single-user mode. At other times this information is provided by
If we ran the program with NOENT disabled, we’d see the following:
$ ruby xxe.rb | head -n 10
<?xml version="1.0"?>
<!DOCTYPE root [
<!ENTITY users SYSTEM "file:///etc/passwd">
]>
<root>
<child>&users;</child>
</root>
In this case, we see that there’s still a reference to the users entity, and we haven’t read the contents of our local filesystem.
This raises a question: what does NOENT actually mean?
At first glance, the naming is a bit counterintuitive. NOENT looks like it means something like “no entities,” but we are processing our users external entity when that flag is enabled.
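Luckily, we don’t have to search far in Nokogiri’s source code to see how NOENT is used. Nokogiri is partially implemented in Java (for its JRuby build), and the parser setup code in its XmlDomParserContext.java source file sets FEATURE_NOT_EXPAND_ENTITY to false when NOENT is enabled and to true otherwise.
In the same file, FEATURE_NOT_EXPAND_ENTITY is defined as the feature identifier http://apache.org/xml/features/dom/create-entity-ref-nodes.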
To summarize what we’ve discovered so far: when NOENT is enabled, the FEATURE_NOT_EXPAND_ENTITY feature is turned off, and this is when we see our entity expanded with contents from the local filesystem. When NOENT is disabled, the FEATURE_NOT_EXPAND_ENTITY feature is turned on, and we don’t read contents from the local filesystem.
That’s a lot of consecutive negatives! Let’s reword it for clarity: when our flag is enabled, the feature which expands entities is turned on. Put this way, the behaviour is a bit clearer: we see the contents of the local filesystem because our entity-expanding feature is enabled.
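The double negative can be modelled in a few lines of plain Ruby. This is only an illustrative sketch, not Nokogiri's code: both helper methods are hypothetical, and NOENT here is a local constant set to the bit value libxml2 uses for this option (1 << 1).

```ruby
# Local stand-in for the NOENT parse option bit (1 << 1 in libxml2).
NOENT = 1 << 1

# Hypothetical helper mirroring the summary above: the "do not expand" feature
# is on exactly when the NOENT bit is absent from the options.
def not_expand_entity_feature(options)
  (options & NOENT).zero?
end

# Hypothetical helper: the same fact stated without the negatives.
def expands_entities?(options)
  !not_expand_entity_feature(options)
end

# Print both views for the flag being off (0) and on (NOENT).
[0, NOENT].each do |options|
  flag = (options & NOENT).zero? ? "disabled" : "enabled"
  puts "NOENT #{flag}: FEATURE_NOT_EXPAND_ENTITY=#{not_expand_entity_feature(options)}, " \
       "entities expanded=#{expands_entities?(options)}"
end
```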
Still, this doesn’t answer our original question – why the name NOENT? To answer that, we can look at the Apache Xerces documentation for the feature named in the FEATURE_NOT_EXPAND_ENTITY definition shown previously. Under the definition of the http://apache.org/xml/features/dom/create-entity-ref-nodes feature, we expect the following behaviour when FEATURE_NOT_EXPAND_ENTITY is set to true:
Create EntityReference nodes in the DOM tree. The EntityReference nodes and their child nodes will be read-only.
And when it’s set to false:
Do not create EntityReference nodes in the DOM tree. No EntityReference nodes will be created, only the nodes corresponding to their fully expanded sustitution [sic] text will be created.
In other words, when NOENT is enabled, it means that we don’t expect to see an EntityReference node in our parsed content, and our parser should replace an entity reference with its definition (in the case of our example, replace the reference to the users entity with the contents of /etc/passwd). If NOENT is disabled, it means we do expect to see our entity reference in our parsed content, and so we still see a reference to users in the output of our parser.
In conclusion: the NOENT flag does mean “no entities”, as in, “no references to entities should exist in our parsed XML.” This is why our parser replaces it with the contents of /etc/passwd. This naming convention leaves plenty of room for confusion, which is why fixing the names of parser flags is actually on the Nokogiri roadmap!