shift or die

security. photography. foobar.

SMTP over XXE − how to send emails using Java's XML parser

I regularly find XML eXternal Entity (XXE) vulnerabilities while performing penetration tests. These are particularly often present in Java-based systems, where the default for most XML parsers still is parsing and acting upon inline DTDs, even though I have not seen a single use case where this was really neceassary. While the vulnerability is useful for file disclosures (and Java is nice enough to also provide directory listings) or even process listings (via /proc/pid/cmdline), recently I stumbled over another interesting attack vector when using a Java XML parser.

Out of curiosity, I looked at what protocols would be supported in external entities. In addition to the usual such as http and https, Java also supports ftp. The actual connection to the FTP server is implemented in It supports authentication, so we can put usernames and passwords in the URL such as in ftp://user:password@host:port/file.ext and the FTP client will send the corresponding USER command in the connection.

The (presumably ancient) code has a bug, though: it does not verify the syntax of the user name. RFC 959 specifies that a username may consist of a sequence of any of the 128 ASCII characters except <CR> and <LF>. Guess what the JRE implementers forgot? Exactly − to check for the presence of <CR> or <LF>. This means that if we put %0D%0A anywhere in the user part of the URL (or the password part for that matter), we can terminate the USER (or PASS) command and inject a new command into the FTP session.

While this may be interesting on its own, it allows us to do something else: to speak SMTP instead of FTP. Note that for historical reasons, the two protocols are structurally very similar. For example, on connecting, they both send a reply with a 220 code and text:

$ nc 21
220 Welcome to
$ nc 25
220 ESMTP Postfix

So, if we send a USER command to a mail server instead of a FTP server, it will answer with an error code (since USER is not a valid SMTP command), but let us continue with our session. Combined with the bug mentioned above, this allows us to send arbitrary SMTP commands, which allows us to send emails. For example, let’s set the URL to the following (newlines added for readability):


When connects using this URL, the following commands will be sent to the mail server at

Subject: test<LF>

From Java’s perspective, the “FTP” connection fails with a Invalid username/password, but the mail is already sent.

This attack is particularly interesting in a scenario where you can reach an (unrestricted, maybe not even spam- or malware-filtering) internal mail server from the machine doing the XML parsing. It even allows for sending attachments, since the URL length seems to be unrestricted and only limited by available RAM (parsing a 400MB long URL did take more than 32 GBs of RAM for some reason, though ;-)).

A portscan by email − HTTP over X.509 revisited

Disclaimer: This was originally posted on Since n.runs went bankrupt, the blog is defunct now. I reposted this here in July 2015 to preserve it for posteriority.

The history

Design bugs are my favourite bugs. About six years ago, while I was working in the Public Key Infrastructure area, I identified such a bug in the X.509 certificate chain validation process (RFC 5280). By abusing the authority information access id-ad-caissuers extension, it allowed for triggering (blind) HTTP requests when (untrusted, attacker-controlled) certificates were validated. Microsoft was one of the few vendors who actually implemented that part of the standard and Microsoft CryptoAPI was vulnerable against it. Corresponding advisories (Office 2007, Windows Live Mail and Outlook) and a whitepaper were released in April 2008.

This issue was particularly interesting because it could be triggered by an S/MIME-signed email when opened in Microsoft Outlook (or other Microsoft mail clients using the CryptoAPI functionality). This allowed attackers to trigger arbitrary HTTP requests (also to internal networks) but not gaining any information about the result of the request. Also, because the request was done using CryptoAPI and not in a browser, it was impossible to exploit any kind of Cross Site Request Forgery issues in web applications, so the impact of the vulnerability was quite limited. In fact, I would consider this mostly privacy issue because the most interesting application was to find out that an email had been opened (and from which IP address and with which version of CryptoAPI), something that was otherwise (to my knowledge) pretty much impossible in Outlook (, a very interesting service with many tests for email privacy issues seems to confirm that).

Revisiting the issue

In May 2012, I revisited the issue to see if something that I had been thinking about previously could be implemented – leveraging the issue to do port scanning on internal hosts by alternating between internal and external HTTP requests and measuring the timing distance on the (attacker-controlled) external host. It turned out that in a specific combination of nested S/MIME signatures with particularly long URLs (about 3500 characters, don’t ask my why exactly they are needed), one can actually observe a difference in timing between an open port or a closed port.

To test this, URLs that are triggered by the email would for example look similar to the following:

  1. http://[attacker_server]/record_start?port=1&[3500*A]
  2. http://[internal_target_ip]:1/[3500*A]
  3. http://[attacker_server]/record_stop?port=1&[3500*A]
The scripts »record_start« and »record_stop« on the server are used to measure the time difference between the two external requests (1 and 3), with which we can tell (roughly) how long the internal request to port 1 on the internal target IP took.

Testing showed that in case the port is open, the time difference measured between the two external requests was significantly below one second, while if the port was closed, it was a bit above one second.

Unfortunately, we are not able to observe this for all possible ports. The timing difference for some HTTP request to a list of well-known ports was short regardless of whether they are open or closed, making it impossible to determine their state. My current assumption is that this is because the HTTP client library used by CryptoAPI does not allow connections on those ports to avoid speaking HTTP(S) on them (similar to browsers which typically make it impossible to speak HTTP on port 25).

A single email can be used to scan the 50 most-used (as determined by nmap) ports on a single host. A proof-of-concept which scans has been implemented and can be tried out by sending an empty email to You will receive an automatic reply with an S/MIME-signed message which when opened will trigger a number of HTTP requests to ports on local host and a data logger running on my webserver. After a few minutes, you can check on a web interface to see which ports are open and which ones are closed. Sometimes, your Exchange mail server might prevent the test email from being delivered though because it contains a lot of nested MIME parts (try again with a more relaxed mailserver then ;-)).

Problem solved

After repeatedly bugging the Microsoft Security Response team about the issue (and accidentally discovering an exploitable WriteAV issue when too many S/MIME signatures were used – MS13-068, fixed in the October 2013 patch day), this has now been fixed with the November 2013 patch day release (CVE-2013-3870). In case the id-ad-caissuers functionality is actually needed in an organization, the functionality can be turned on again, though – with the risk of still being vulnerable to this issue.

Geohashing with GPX files and QLandkarte GT

Because of some scientists near the south pole, I recently re-discovered geohashing. As I wanted an easy way to see the most recent hash points (and the upcoming one(s), since I live east of W30), I did some automation.

The different online services are pretty nice but they did not have all the features I wanted to have. Also, I have grown quite fond of the ability to have an OpenStreetMap available offline (not necessarily because I am offline that much, but because it makes looking at the map so much faster). I use QLandkarte GT and the map, as it shows cycling routes quite nicely.

QLandkarte GT supports loading GPX files, so the first thing I needed was something to produce GPX files for a given graticule (and date, or if no date is specified, for all upcoming ones). I wanted something similar to the Small Hash Inquiry Tool, as it shows you the hash points of the surrounding graticules as well. I took an evening to hack something together using Ruby, Sinatra and relet’s JSON web service. I decided to host it on Heroku, as it was easy and free. You can find out how to use it at It should also work quite nicely with GPS devices with GPX support (or with gpsbabel, for that matter). The source is available at git://, if you are curious.

But back from devices to the desktop, I wanted an easy way to view this in QLandkarte GT and keep it updated. Luckily, QLandkarte GT offers a sort of reload feature with the “-m” command line option. I’ve written a small wrapper which makes this available using a signal handler:

$ cat bin/qlandkartegt_reloadable.rb 
#!/usr/bin/env ruby

f = IO.popen("qlandkartegt -m0 #{ARGV.join(' ')}", 'w')

trap('USR1') do
  f.write 'A'
So now I can do something along the lines of:
qlandkartegt_reloadable.rb ~/gps/geohash.gpx
wget -O ~/gps/geohash.gpx && pkill -USR1 -f qlandkartegt_reloadable.rb
to keep the data shown in my (always open) QLandkarte GT up to date.

The only things on my TODO list for this are timezones (it works using UTC at the moment, which is fine for me since I am pretty close to UTC, but may be annoying if you are not that close) and the possible addition of business holidays to figure out if tomorrow will have new DJIA opening value or not (if anyone has a good, free source, please let me know). I might work on this or I might not, chances are higher if someone bothers me to do so.

Shell injection without whitespace

In a recent penetration test, I was in the situation where I could inject code into a Perl system call, but whitespace (\s+) was filtered beforehand (probably not for security but rather for functionality reasons).

In looking for a way to still execute more than a parameterless binary (which of course would be a possible solution if I had had a way to put a custom binary on the system), I stumbled over the $IFS variable, which is the “Internal Field Seperator” with default value “<space><tab><newline>”. It also works fine as a separator for commands, so you can inject something like:

without using a single whitespace character. May it come in handy for you one day.

Moving to Octopress

My blog has been running on Angerwhale, a Catalyst-based Perl blog framework. Although it worked fine for me from a usability point of view (plain text files plus some meta-data), it was way to slow to deal with a few concurrent hits. I never noticed until @chaosupdates linked to my blog post about @cryptofax2tweet and the server more or less exploded in (virtual) flames.

I made an attempt to change from the standalone server (well, no surprise that it did not deal well with load there but that was fine for a long time for my little blog here) to a real modperl-based installation, but that did not help much.

As I am more confident with Ruby then Perl nowadays I started looking for something to change to. A static solution would of course be nice to have because of the speed factor, so after some searching and #followerpower, I stumbled over Octopress, which uses the Jekyll framework.

I converted all my Angerwhale posts to Octopress using a small Ruby script (ping me if you are interested). The comments from the old site are still missing, but I am considering converting them with Jekyll::StaticComments plugin.

The plan for now is to try to blog a bit more than before, maybe it should be on my list of new year's resolutions (or maybe not as not to jinx it ;-).

Introducing CryptoFax2Tweet

Meet @cryptofax2tweet, a new Twitter account I run. So, what is so special about this account? As the name suggests, it can be used to tweet by sending an encrypted QR code using a fax, for example when your government decides to turn off the internet. In case you are not interested in the technical details on how it works and just want to use it, you can download the cryptofax.pdf file and open it in Adobe Reader. A one page user's guide is also available.

So, how does this work? Recent PDF versions support the XML Forms Architecture (XFA), which I've been playing with lately. It includes all kind of funny things, such as its own language (because having both Javascript and Flash in Reader is not enough, apparently), FormCalc. It is apparently not more useful than Javascript except if you want to generate arbitrary HTTP requests.

But I am digressing. One of the more interesting features of XFA is the possibility to create all kinds of barcodes, both one- and two-dimensional. The list of different types you can create in the specification is about five pages long(!). Also, the specification claims that the content of the barcode can be encrypted before creating it using an RC4/RSA hybrid encryption.

I had recently read about Google's @speak2tweet account and liked the idea but not the Medienbruch — the change from one medium (voice) to another (text). So I thought about implementing something using XFA which would allow people to send tweets via fax.

One obstacle on the way was finding out that Adobe does not want you to create dynamic 2D barcodes if you do not have the license for it. Unluckily, if you do not know this and modify the rawValue attribute of the barcode field after the form has rendered, you just get to see a grey block instead of the barcode and keep wondering whether Adobe just broke the functionality. Also, debugging Javascript code if you only have Adobe Reader is less funny than you think. Once I figured that out, I realised that you can ask for the dynamic information in the initialize event handler using app.response() and create the barcode at that point (not sure whether Adobe would consider this a bug or a feature).

After that was solved, I looked into encrypting the content of the tweet. Note that the encryption just helps against an attacker who only monitors the phone lines and not the @cryptofax2tweet account. Still, it might help people who have printed out the fax and it gets intercepted before it has been faxed. Unluckily, it looks like this particular functionality from the specification has not been implemented in Reader (the fact that the LiveCycle® Designer ES Scripting Reference does not talk about it at all points in this direction, too).

Luckily, there was no need to implement the cryptography myself, as there is already a pretty nice BSD-licensed RSA implementation for Javascript. A few patches later to fix some Reader-specific oddities, I was able to RSA-encrypt a tweet. As a tweet can only be 140 characters (thus at most 560 bytes in UTF-8), I just used a 4096 bit RSA key (not for security, just for convenience reasons :-). This would enable us to encode only at most 128 characters of four-byte UTF-8 characters (e.g. Klingon in the private use area). I accepted this trade-off and in the end it turned out that inputting four-byte UTF-8 characters using app.response() was impossible anyways.

The other end of the service needs to decode the QR code, decrypt the content and tweet it. This part was a lot easier than the PDF part, as it could be implemented in less than 50 lines using Ruby and the ZBar barcode library. A fax number was thankfully provided by so that I only needed to deal with emails from the fax2mail gateway.

If you managed to read this far, you might be interested in the code, which is available in a Git repository (or see the Gitweb interface).

Evading AVs using the XML Data Package (XDP) format

At work, I recently obtained a copy of iText in Action, 2nd Edition because I have been playing with PDF a bit lately and the book not only offers advice on how to use the Java PDF library iText but also some background on PDF internals and the new features in PDF 1.7.

One thing I stumbled about in the book was that there is a format called XML Data Package (XDP) which can be used to represent a PDF as XML. So of course I downloaded the specification and went to play with it a bit.

Acrobat Reader opens XDP files just fine if they have an .xdp file extension or if they are sent by a webserver with the application/vnd.adobe.xdp+xml MIME type. It was an easy exercise to write a small script to convert a given PDF to an XDP file (basically it's just an XML header, the Base64-encoded PDF and an XML footer).

I was wondering how Antivirus products would react to a malware PDF file in disguise as a XDP. Thus, I generated a PDF containing an exploit using Metasploit and uploaded it to VirusTotal. 13 out of 43 products classified the PDF as malware. Interestingly enough, 0/43 recognized the equivalent XDP file as malware (and neither did a few mail gateways I tested).

I've just submitted a feature request for Metasploit to add XDP support and added my pdf2xdp.rb script as a starting point. Let's see where this is heading.

Language-dependant spellchecking within sup

As we all know, all mail clients suck. In the eternal search for the one that sucks the least, I have come across Sup. I tried it a while ago and it wouldn't import my mailbox, so I gave it up pretty quickly. Apparently, it has become better in that regard so that I now have a full-text searchable index over roughly 200.000 mails and am seriously considering moving away from mutt.

One feature that I like particularly is the extensibility by using Ruby code in so-called hooks. One quick hack I did tonight allows me to set the spellchecking language of my editor (vim) based on the language of the message I am replying to by using the whatlanguage gem. This code in ~/.sup/hooks/reply-from.rb replaces the spelllang parameter on the (default) vim commandline in the configuration:

require 'whatlanguage'

case message.quotable_body_lines.join("\n").language
when :german then
    $config[:editor].gsub!(/spelllang=[^']+'/, %q|spelllang=de_de'|)
    $config[:editor].gsub!(/spelllang=[^']+'/, %q|spelllang=en_us'|)