OBFUSCATION BY HEX ENCODING
Hex, short for hexadecimal, is a base 16 numbering system. In your everyday life, you're used to a base 10 numbering system, whose digits run from 0 to 9. You're probably aware of binary, which is a base 2 numbering system that has only two digits: 0 and 1. Hexadecimal has 16 digits, going from 0 to 9, and then from A to F. Mathematical operations work exactly the same in hex as they do in decimal; they just use a different set of digits. For example, using decimal, you know that 9 + 1 equals 10. Using hex, 9 + 1 equals A, because hex doesn't run out of single digits until it reaches F (if you're wondering, F + 1 equals 10). Here's a quick table to help you understand:
| Dec | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
| Hex | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
| Dec | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 |
| Hex | A | B | C | D | E | F | 10 | 11 | 12 | 13 |
Table 3-1: Decimal and hexadecimal values
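If you'd like to check the arithmetic yourself, the short Python sketch below reproduces the two sums from the text and one entry from Table 3-1. It relies only on Python's built-in hex literals (the 0x prefix), format(), and int(); the variable names are just for illustration.

```python
# Hex literals in Python start with 0x; the sums are ordinary integer addition.
nine_plus_one = 0x9 + 0x1
f_plus_one = 0xF + 0x1

# format(value, "X") writes an integer using uppercase hex digits.
print(format(nine_plus_one, "X"))  # A
print(format(f_plus_one, "X"))     # 10

# int(text, 16) reads a hex string back into a decimal integer.
print(int("13", 16))               # 19, matching the last column of Table 3-1
```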
Don't worry if you're not completely sure of how this works; it's only important that you understand computers can do their math in hex. The next piece of the puzzle is ASCII (American Standard Code for Information Interchange) character codes. The plain English letters, numbers, and common symbols stored in a file or displayed on screen are ASCII characters; even the text you're reading right now is made up of ASCII characters. Each ASCII character has an ASCII code; for example, the letter "A" has a decimal ASCII code of 65, the letter "B" has a code of 66, and so on.
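As a quick illustration, Python's built-in ord() returns a character's code and chr() goes the other way. This is only a sketch to make the idea concrete; the variable names are made up for the example.

```python
# ord() gives the numeric code of a character; for ASCII characters
# this is the ASCII code described above.
code_a = ord("A")
code_b = ord("B")
print(code_a, code_b)          # 65 66

# The same code written in hexadecimal.
print(format(code_a, "X"))     # 41

# chr() converts a code back into its character.
print(chr(66))                 # B
```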
Input Options
URLs allow any character to be written as its hexadecimal ASCII code, so programs such as Outlook and Internet Explorer can accept a Web address as either plain text or hexadecimal ASCII codes. This means it's perfectly possible to access a Web site by taking the URL, converting each character to its decimal ASCII code, converting that decimal code into a hexadecimal value, and then supplying the result to Internet Explorer! This example might require a small leap of faith, but it does work. Let's do the conversion with www.cnet.com:
- The first character of the URL is w.
- The decimal ASCII code of w is 119.
- 119 converted to hexadecimal is 77.
- The URL begins with www, so the first three hex codes are 77, 77, 77.
- The next character is a full stop.
- The decimal ASCII code of . is 46.
- 46 converted to hexadecimal is 2E.
- Our hex codes are now 77, 77, 77, 2E.
The process is repeated for all the characters in the URL until the entire string is converted. The final step is to remove the commas and put a percent sign in front of each hex code, and the conversion is complete. The URL www.cnet.com becomes:
%77%77%77%2E%63%6E%65%74%2E%63%6F%6D
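The manual steps above can be sketched in a few lines of Python. The function name hex_encode_url is hypothetical, chosen for this example, and the sketch assumes the input contains only ASCII characters (anything else raises an error rather than being silently mangled).

```python
def hex_encode_url(url: str) -> str:
    """Write every character of an ASCII string as % plus its two-digit hex code."""
    # encode("ascii") yields one byte per character, i.e. its ASCII code,
    # and raises UnicodeEncodeError if a non-ASCII character is present.
    return "".join(f"%{code:02X}" for code in url.encode("ascii"))


encoded = hex_encode_url("www.cnet.com")
print(encoded)  # %77%77%77%2E%63%6E%65%74%2E%63%6F%6D
```

Note that w is converted straight from 119 to 77 here; the decimal step in the walkthrough is only there to make the process easier to follow by hand.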
If you type this string into Internet Explorer's address bar, you're taken to the CNET Web site, as shown in Figure 3-3. Look closely at the URL in the address bar:
Returning to the URL in the phishing e-mail, the point of all this conversion and messing about is nothing other than to confuse the reader and hide the link's true destination. Obfuscating URLs in this manner is a common trick used by malware and spyware programmers, too. You'll often see this type of text in the output of HijackThis or Spybot if you're unlucky enough to be infected.
The URL in the example phishing e-mail is valid and reachable; however, it isn't included in this lesson. If you do decide to decode it, you're very strongly warned not to visit the Web site under any circumstances. It's a live, malicious Web site.
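If you want to see what a hex-encoded string says without pasting it into a browser, one option is to decode it as plain text. The sketch below uses unquote() from Python's standard urllib.parse module on the harmless CNET string from earlier; the variable names are illustrative. Decoding only produces a string and does not contact any Web site.

```python
from urllib.parse import unquote

# The hex-encoded form of www.cnet.com from the walkthrough above.
obfuscated = "%77%77%77%2E%63%6E%65%74%2E%63%6F%6D"

# unquote() replaces each %XX sequence with the character it stands for.
decoded = unquote(obfuscated)
print(decoded)  # www.cnet.com
```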
