HTTPEncode - Request For Comments
Last Revised August 1, 2005
One example of protocol encoding is HTTPEncode (v, iv).
HTTPEncode uses the same idea as UUEncode with a slight difference.
Whereas UUEncode takes binary data and encodes it into "plain text",
HTTPEncode takes that binary data one step further. The binary data is
not only encoded; HTTPEncode also creates HTTP wrappers that add HTTP
tags to the beginning and end of the data, and scatters random HTML tags inside
the data. The encoding should produce "normal" HTML pages that cannot be
distinguished from ordinary pages when analyzed by a Hidden Markov Model (HMM)
or Shamir’s test for entropy (i). This is to avoid firewalls, from the
transport layer up to the application layer, that look for tunneling over
port 80.
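A minimal sketch of the wrapper idea described above, in Python. It is an illustration under stated assumptions, not the HTTPEncode implementation: the function names, the tag pool and the chunk size are all hypothetical, and base64 is swapped in for UUEncode because the UUEncode alphabet contains "<" and ">", which would collide with the naive tag stripping used here. Output this simple would not pass an HMM or entropy test; it only shows the wrap-and-unwrap round trip.

```python
import base64
import random
import re

# Hypothetical pool of harmless inline tags to scatter through the payload.
TAGS = ["b", "i", "em", "span", "p"]


def http_encode(data, seed=0, chunk=16):
    """Encode binary data as text and wrap it in an HTML page."""
    rng = random.Random(seed)
    # Assumption: base64 stands in for UUEncode (its alphabet has no < or >).
    text = base64.b64encode(data).decode("ascii")
    parts = []
    for i in range(0, len(text), chunk):
        tag = rng.choice(TAGS)
        parts.append("<%s>%s</%s>" % (tag, text[i:i + chunk], tag))
    return "<html><body>" + "".join(parts) + "</body></html>"


def http_decode(page):
    """Strip every tag, then decode the remaining text back to bytes."""
    payload = re.sub(r"<[^>]*>", "", page)
    return base64.b64decode(payload)
```

Decoding does not need the seed, since the receiver simply discards anything that looks like a tag.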
For the highest security level of HTTPEncode encoding, the data would be sorted
to find the "highest" (most frequently) recurring numbers. Random words would be
chosen by the destination machine in its native language and sent to the end
point machine along with the security setting. Random pages would be
fetched via Google by the end point machine using the random words. As
per the above referenced paper (i), word substitution and JPEG steganography
would occur via a dictionary that is contained within the software. The higher
the security setting, the less data would be embedded per block of
data (to try to avoid steganography detection). A random number would be
chosen and the dictionary of nouns, verbs, adverbs, etc. would be shuffled using
the number chosen, with some words at the end of the dictionary becoming the
"escape" and "escape plus" words. The destination
machine would make the HTTP request for <random word(s)>.html.
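The dictionary randomization step can be sketched as follows. This is an assumption-laden illustration: the toy word list, the function name build_codebook, and the choice of exactly the last two shuffled words as "escape" and "escape plus" are hypothetical, not taken from any HTTPEncode software. Both machines must use the same PRNG algorithm for the shuffle to match.

```python
import random

# Hypothetical toy dictionary; real software would ship a far larger one.
DICTIONARY = ["apple", "river", "quickly", "runs", "green",
              "stone", "sings", "softly", "however", "indeed"]


def build_codebook(seed, words=DICTIONARY):
    """Shuffle the dictionary with the shared seed.

    Returns (code_words, escape, escape_plus), where the two words at the
    end of the shuffled dictionary become the escape words.
    """
    shuffled = list(words)
    random.Random(seed).shuffle(shuffled)
    escape, escape_plus = shuffled[-2], shuffled[-1]
    return shuffled[:-2], escape, escape_plus
```

Calling build_codebook with the same seed on both ends yields the same ordering, so only the seed has to travel over the wire.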
The random function would, of course, be a pseudo-random number generator with
the seed sent as the first part of the "conversation". Word
substitution would then follow, with each data value replaced by the dictionary
word assigned to that value. Certain words would be "escape" and "escape
plus" words, i.e. that word, or that word and the next word, should be
ignored. This would allow "correct" grammar that should escape
casual observation of the encoded data. When images are present in the
page that was fetched via Google, steganography can be used on the image to
increase the data sent. The page would be sent back as <word>.html
(per the request of the destination machine). Additionally, other elements
of HTTP (ActiveX, JavaScript, cookies, etc.) could be used to hide data
transmissions.
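The word substitution and escape-word handling described above might look like the sketch below. Everything named here is an assumption for illustration: the toy dictionary, the helper names, and the scheme of spelling each byte as a fixed number of base-N digits (N being the number of code words). The source does not specify how values map to words.

```python
import random

# Hypothetical toy dictionary shared by both machines.
WORDS = ["apple", "river", "quickly", "runs", "green",
         "stone", "sings", "softly", "however", "indeed"]


def codebook(seed):
    """Seeded shuffle; the last two words are escape and escape plus."""
    words = list(WORDS)
    random.Random(seed).shuffle(words)
    return words[:-2], words[-2], words[-1]


def digits_per_byte(base):
    """Smallest number of base-`base` digits that can hold one byte."""
    n, span = 1, base
    while span < 256:
        span *= base
        n += 1
    return n


def encode(data, seed):
    """Replace each byte with the code words for its digits."""
    code, _, _ = codebook(seed)
    base = len(code)
    n = digits_per_byte(base)
    tokens = []
    for byte in data:
        digits = []
        for _ in range(n):
            byte, d = divmod(byte, base)
            digits.append(d)
        tokens.extend(code[d] for d in reversed(digits))
    return tokens


def decode(tokens, seed):
    """Recover bytes, ignoring escape words and escaped filler."""
    code, escape, escape_plus = codebook(seed)
    index = {w: i for i, w in enumerate(code)}
    base = len(code)
    n = digits_per_byte(base)
    digits = []
    it = iter(tokens)
    for tok in it:
        if tok == escape:
            continue              # ignore this word only
        if tok == escape_plus:
            next(it, None)        # ignore this word and the next one
            continue
        digits.append(index[tok])
    out = bytearray()
    for i in range(0, len(digits) - n + 1, n):
        value = 0
        for d in digits[i:i + n]:
            value = value * base + d
        out.append(value)
    return bytes(out)
```

A sender can drop an escape word, or an escape plus word followed by any filler word, anywhere in the token stream to smooth the grammar; the decoder skips them without affecting the recovered bytes.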
This same idea was discussed in a talk by Mystic at Defcon XI (iii); however,
his software randomly generates text rather than using an input source.
The idea was originally described in 1996 in a book named "Disappearing
Cryptography" by Peter Wayner. Mystic wrote a tool, ircMimic (iv), to
perform this function.
When using FTPEncode for port 20, the data port, UUEncoding will suffice.
When using port 21, again the data will need to be massaged to look like FTP
commands. Traffic on port 443 is already encrypted, so the data conversion
should be minimal.
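As a rough sketch of making data look like FTP commands on port 21, the Python below hides each chunk of data in the file name argument of a RETR command. The function names, the ".dat" suffix and the hex file names are hypothetical choices made for illustration; hex names are easy to spot, so a real encoder would need a far less conspicuous mapping.

```python
def to_ftp_commands(data, chunk=8):
    """Split data into chunks and emit one RETR line per chunk."""
    lines = []
    for i in range(0, len(data), chunk):
        name = data[i:i + chunk].hex()   # assumption: hex as the file name
        lines.append("RETR %s.dat\r\n" % name)
    return lines


def from_ftp_commands(lines):
    """Rebuild the data from the RETR arguments, ignoring other lines."""
    out = bytearray()
    for line in lines:
        verb, _, arg = line.strip().partition(" ")
        if verb == "RETR" and arg.endswith(".dat"):
            out += bytes.fromhex(arg[:-4])
    return bytes(out)
```

Lines that are not RETR commands are ignored on decode, so genuine control traffic such as NOOP can be interleaved with the carrier commands.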
i) Simova, Martina; Pollett, Chris; and Stamp, Mark "STEALTHY
CIPHERTEXT". URL: http://www.cs.sjsu.edu/faculty/stamp/papers/stealthy.pdf,
March 2005 (Accessed July 23, 2005)
ii) Krista Bennett "LINGUISTIC STEGANOGRAPHY: SURVEY, ANALYSIS, AND
ROBUSTNESS CONCERNS FOR HIDING INFORMATION IN TEXT". URL: https://www.cerias.purdue.edu/tools_and_resources/bibtex_archive/archive/2004-13.pdf
(Accessed July 23, 2005)
iii) Mystic "Mimicry". URLs: http://www.defcon.org/html/defcon-11/defcon-11-speakers.html#Mystic
http://www.inventati.info/pub/defcon11/Mimic-Mimicry/Mimicry.ppt
(Accessed July 23, 2005).
iv) Mystic "Mimicry" software. URL: http://www.inventati.info/pub/defcon11/Mimic-Mimicry/
(Accessed July 23, 2005)
v) Matthias Bauer "New Covert Channels in HTTP". URL: http://www.freehaven.net/anonbib/cache/bauer:wpes2003.ps
October 30, 2003, (Accessed July 30, 2005)
vi) H. Balakrishnan, D. Karger, N. Feamster, M. Balazinska, W. Wang, G. Harfst
"Infranet". URL: http://nms.csail.mit.edu/projects/infranet/
(Accessed July 30, 2005)