HTTPEncode - Request For Comments

 


Last Revised August 1, 2005

An example of protocol encoding is HTTPEncode encoding (v, iv).  HTTPEncode uses the same idea as UUEncode, with one difference.  Whereas UUEncode takes binary data and encodes it as "plain text", HTTPEncode takes that data one step further.  The binary data is not only encoded: HTTPEncode also creates HTTP wrappers that add HTTP tags to the beginning and end of the data, and scatters random HTML tags inside the data.  The encoding will produce "normal" HTML pages that cannot be distinguished from ordinary pages when analyzed with a Hidden Markov Model (HMM) or Shamir's test for entropy (i).  The goal is to evade firewalls that inspect traffic from the transport layer up to the application layer and look for tunneling over port 80.
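The wrapping step alone can be sketched as follows.  This is a minimal illustration under stated assumptions, not the HTTPEncode implementation: the function names, the filler tag list, and the chunk size are hypothetical, and base64 is swapped in for UUEncode because uuencoded text can itself contain "<" and ">" characters.  A page produced this way would not pass the statistical tests mentioned above; it only shows how tags can be added and stripped again.

```python
import base64
import random

# Hypothetical filler tags; the source does not say which tags are inserted.
FILLER_TAGS = ["<b>", "</b>", "<i>", "</i>", "<br>"]
HEAD, TAIL = "<html><body>", "</body></html>"

def http_encode(data: bytes, seed: int, chunk: int = 16) -> str:
    """Encode data as text, wrap it in a page, and scatter filler tags inside."""
    rng = random.Random(seed)
    text = base64.b64encode(data).decode("ascii")
    pieces = []
    for i in range(0, len(text), chunk):
        pieces.append(text[i:i + chunk])
        pieces.append(rng.choice(FILLER_TAGS))
    return HEAD + "".join(pieces) + TAIL

def http_decode(page: str) -> bytes:
    """Strip the wrapper and the filler tags, then reverse the text encoding."""
    body = page[len(HEAD):len(page) - len(TAIL)]
    # base64 text never contains '<' or '>', so removing tags is unambiguous.
    for tag in FILLER_TAGS:
        body = body.replace(tag, "")
    return base64.b64decode(body)
```

The decoder needs no seed here because every tag is simply discarded; the seed only makes the tag placement repeatable.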

For the highest security level of HTTPEncode encoding, the data would be sorted for the "highest" recurring numbers.  Random words would be chosen by the destination machine in its native language and sent to the end point machine along with the security setting.  The end point machine would use those random words to fetch random pages via Google.  As per the paper referenced above (i), word substitution and JPEG steganography would be performed using a dictionary contained within the software.  The higher the security setting, the less data would be carried per block (to try to avoid steganography detection).

A random number would be chosen, and the dictionary of nouns, verbs, adverbs, etc. would be randomized using that number, with some words at the end of the dictionary becoming the "escape" and "escape plus" words.  The random function would, of course, be a pseudo-random number generator, with the seed sent as the first part of the "conversation".  Word substitution would then replace each data value with the dictionary word assigned to that value.  The "escape" and "escape plus" words mean, respectively, that that word, or that word and the next word, should be ignored.  This allows "correct" grammar, so that the encoded data should escape casual observation.

The destination machine would make the HTTP request for <random word(s)>.html, and the page would be sent back as <word>.html (per the request of the destination machine).  When images are present in the page that was fetched via Google, steganography can be applied to the images to increase the amount of data sent.  Additionally, other elements of HTTP (ActiveX, Java scripts, cookies, etc.) could be used to hide data transmissions.
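The seeded dictionary shuffle and the two escape words can be sketched as follows.  This is a minimal sketch under assumptions that are not in the text above: the 18-word dictionary, the function names, and the choice of one word per 4-bit value are all hypothetical.  It shows only the substitution and escape handling, not the grammar generation, the page fetching, or the image steganography.

```python
import random

# Hypothetical 18-word dictionary; the real one would hold many nouns,
# verbs, adverbs, etc. and ship with the software.
WORDS = ["apple", "river", "quickly", "stone", "runs", "blue", "garden",
         "sings", "window", "slowly", "bread", "jumps", "cloud", "green",
         "paper", "walks", "the", "very"]

def build_tables(seed):
    """Shuffle the dictionary with the shared seed; the last two words become escapes."""
    words = WORDS[:]
    random.Random(seed).shuffle(words)
    return words[:16], words[16], words[17]

def encode(data, seed):
    """Replace each 4-bit value of the data with its dictionary word."""
    code_words, _, _ = build_tables(seed)
    out = []
    for byte in data:
        out.append(code_words[byte >> 4])
        out.append(code_words[byte & 0x0F])
    return out

def decode(words, seed):
    """Map words back to values, honoring "escape" and "escape plus"."""
    code_words, escape, escape_plus = build_tables(seed)
    index = {w: i for i, w in enumerate(code_words)}
    nibbles = []
    skip_next = False
    for w in words:
        if skip_next:            # word following "escape plus": ignore it
            skip_next = False
        elif w == escape:        # "escape": ignore this word only
            continue
        elif w == escape_plus:   # "escape plus": ignore this word and the next
            skip_next = True
        else:
            nibbles.append(index[w])
    return bytes((hi << 4) | lo for hi, lo in zip(nibbles[0::2], nibbles[1::2]))
```

A sender could insert the escape words wherever filler is needed to keep the sentence grammatical; the word after "escape plus" can be any word at all, since the decoder never looks it up.  Both ends must use the same pseudo-random number generator implementation for the shuffle to match.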

This same idea was discussed in a talk by Mystic at Defcon XI (iii); however, his software randomly generates text rather than using an input source.  The idea was originally described in 1996 in the book "Disappearing Cryptography" by Peter Wayner.  Mystic wrote a tool, ircMimic (iv), to perform this function.

When using FTPEncode for port 20, the data port, UUEncoding will suffice.  When using port 21, the data will again need to be massaged, this time to look like FTP commands.  Port 443 is encoded already, so the data conversion should be minimal.
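For the port 20 case, plain UUEncoding can be sketched with Python's standard binascii module.  The helper names uu_lines and uu_join are hypothetical; only binascii.b2a_uu and binascii.a2b_uu are library calls, and b2a_uu accepts at most 45 bytes per call, which is why the data is chunked.

```python
import binascii

def uu_lines(data):
    """Yield uuencoded lines, 45 input bytes at a time (the b2a_uu limit)."""
    for i in range(0, len(data), 45):
        yield binascii.b2a_uu(data[i:i + 45])

def uu_join(lines):
    """Decode uuencoded lines back into the original bytes."""
    return b"".join(binascii.a2b_uu(line) for line in lines)
```

Each yielded line is printable ASCII ending in a newline, so it can be sent as-is over a text-mode data connection.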

i) Simova, Martina; Pollett, Chris; and Stamp, Mark "STEALTHY CIPHERTEXT".  URL: http://www.cs.sjsu.edu/faculty/stamp/papers/stealthy.pdf, March 2005 (Accessed July 23, 2005)

ii) Krista Bennett "LINGUISTIC STEGANOGRAPHY: SURVEY, ANALYSIS, AND ROBUSTNESS CONCERNS FOR HIDING INFORMATION IN TEXT".  URL: https://www.cerias.purdue.edu/tools_and_resources/bibtex_archive/archive/2004-13.pdf (Accessed July 23, 2005)

iii) Mystic "Mimicry".  URLs: http://www.defcon.org/html/defcon-11/defcon-11-speakers.html#Mystic http://www.inventati.info/pub/defcon11/Mimic-Mimicry/Mimicry.ppt (Accessed July 23, 2005)

iv) Mystic "Mimicry" software.  URL: http://www.inventati.info/pub/defcon11/Mimic-Mimicry/ (Accessed July 23, 2005)

v) Matthias Bauer "New Covert Channels in HTTP".  URL: http://www.freehaven.net/anonbib/cache/bauer:wpes2003.ps, October 30, 2003 (Accessed July 30, 2005)

vi) H. Balakrishnan, D. Karger, N. Feamster, M. Balazinska, W. Wang, G. Harfst "Infranet".  URL: http://nms.csail.mit.edu/projects/infranet/ (Accessed July 30, 2005)
