Encoding the Entire Message

The next step down this path is more involved. Rather than encoding just a URL, the sender encodes the entire content of the message in a way that your email reader can display but that is undecipherable to the casual user who wants to look at the message source.

Here is an example of a phishing attempt:

    Dear e-gold user !

    Our system has undergone to serious preventive maintenance,
    please, check up functioning your e-gold account.

    The e-gold site is at:

    http://www.e-gold.com

    This is automatic email.
    Do not reply to this email.

The message is simple enough, but in order to check on that URL you have to view the message source. Here it is, with some of the headers removed:

    Date: Tue, 30 Mar 2004 20:15:29 -0500
    To: XYZ@craic.com
    From: =?windows-1251?B?QWNjb3VudFJvYm90X2Rvbm90cmVwbHlAZS1nb2
    xkLmNvbQ==?= <AccountRobot_donotreply@e-gold.com>
    Subject: =?windows-1251?B?QXR0ZW50aW9uIGUtZ29sZCB1c2VyICE=?=
    MIME-Version: 1.0
    Content-Transfer-Encoding: Base64
    Content-Type: text/html; charset="windows-1251"

    PGh0bWw+CjxoZWFkPgo8dGl0bGU+VW50aXRsZWQgRG9jdW1lbnQ8L3RpdGxl
    Pgo8bWV0YSBodHRwLWVxdWl2PSJDb250ZW50LVR5cGUiIGNvbnRlbnQ9InRl
    eHQvaHRtbDsgY2hhcnNldD13aW5kb3dzLTEyNTEiPgo8L2hlYWQ+Cgo8Ym9k
    eSBiZ2NvbG9yPSIjRkZGRkZGIiB0ZXh0PSIjMDAwMDAwIj4KPHA+RGVhciBl
    LWdvbGQgdXNlciAhPC9wPgo8cD5PdXIgc3lzdGVtIGhhcyB1bmRlcmdvbmUg
    dG8gc2VyaW91cyBwcmV2ZW50aXZlIG1haW50ZW5hbmNlLCBwbGVhc2UsIGNo
    ZWNrIHVwIAogIGZ1bmN0aW9uaW5nIHlvdXIgZS1nb2xkIGFjY291bnQuPC9w
    Pgo8cD5UaGUgZS1nb2xkIHNpdGUgaXMgYXQ6IDwvcD4KPHA+PGEgaHJlZj0i
    aHR0cDovL3d3dy5lLWdvbGQuY29tAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEB
    AQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEB
    AQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEB
    AQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQFAcmV5bnNhbi5uZXRmaXJtcy5j
    b20iPmh0dHA6Ly93d3cuZS1nb2xkLmNvbTwvYT48L3A+CjxwPlRoaXMgaXMg
    YXV0b21hdGljIGVtYWlsLjxicj4KICBEbyBub3QgcmVwbHkgdG8gdGhpcyBl
    bWFpbC48L3A+CjwvYm9keT4KPC9odG1sPgo=

That doesn’t look anything like the displayed text. This header line tells us what is going on:

    Content-Transfer-Encoding: Base64

Base64 is perhaps the most widely used method to encode binary data, such as images, into a set of ASCII characters so they can be transferred via email. Although intended for encoding binary data, it works just fine with regular text.

It uses a subset of 65 characters from the US-ASCII alphabet: 64 characters that actually encode data, plus =, which is used as padding at the end of a Base64 block, as shown in Table 4-1.

Table 4-1. Base64 character set

    Value  Encoding    Value  Encoding    Value  Encoding    Value  Encoding
    0      A           17     R           34     i           51     z
    1      B           18     S           35     j           52     0
    2      C           19     T           36     k           53     1
    3      D           20     U           37     l           54     2
    4      E           21     V           38     m           55     3
    5      F           22     W           39     n           56     4
    6      G           23     X           40     o           57     5
    7      H           24     Y           41     p           58     6
    8      I           25     Z           42     q           59     7
    9      J           26     a           43     r           60     8
    10     K           27     b           44     s           61     9
    11     L           28     c           45     t           62     +
    12     M           29     d           46     u           63     /
    13     N           30     e           47     v
    14     O           31     f           48     w           Pad    =
    15     P           32     g           49     x
    16     Q           33     h           50     y

The encoding allows 6 bits of input data to be represented by a single ASCII character. A byte has 8 bits, so the encoding takes 3-byte chunks of data, which is 24 bits, and encodes each chunk as 4 ASCII characters. The encoded output is therefore about a third larger than the input, so this is clearly not a compression scheme. You commonly compress a file first and then encode its binary data using Base64.
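To make the arithmetic concrete, here is a minimal Python sketch of the 3-bytes-to-4-characters step. The function name encode_chunk is my own, and the sketch ignores padding entirely; it only handles a complete 3-byte chunk and checks its result against Python's standard base64 module.

```python
import base64

# Standard Base64 alphabet, indexed by 6-bit value (compare Table 4-1).
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def encode_chunk(chunk: bytes) -> str:
    """Encode exactly 3 bytes as 4 Base64 characters (no padding support)."""
    if len(chunk) != 3:
        raise ValueError("this sketch handles only complete 3-byte chunks")
    # Pack the three bytes into a single 24-bit integer.
    bits = (chunk[0] << 16) | (chunk[1] << 8) | chunk[2]
    # Slice the 24 bits into four 6-bit values, most significant first.
    values = [(bits >> shift) & 0x3F for shift in (18, 12, 6, 0)]
    return "".join(ALPHABET[v] for v in values)


# "<ht" is the first 3 bytes of the decoded message body.
encoded = encode_chunk(b"<ht")
print(encoded)
# Cross-check against the standard library encoder.
print(encoded == base64.b64encode(b"<ht").decode("ascii"))
```

The three bytes of "<ht" map to the values 15, 6, 33, and 52, which Table 4-1 translates to PGh0, the first four characters of the encoded message body shown earlier.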

Manually decoding the output would be extremely tedious. One way to handle this is to copy the encoded text, and nothing else, into a file and pass it to the Unix program openssl.

    % openssl enc -d -a -in your_file

An alternative is to install the MIME::Base64 Perl module on your system and then use this Perl one-liner to decode it.

    % perl -MMIME::Base64 -ne 'print decode_base64($_)' < your_file
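If you prefer Python, its standard base64 module can do the same job. The following is only a sketch: the function name decode_body is hypothetical, the default charset is taken from the Content-Type header of the example message, and SAMPLE holds just the first line of the encoded body rather than the whole thing.

```python
import base64


def decode_body(encoded_text: str, charset: str = "windows-1251") -> str:
    """Decode a Base64 message body and interpret the bytes in `charset`."""
    # Remove line breaks and indentation so the lines form one Base64 string.
    compact = "".join(encoded_text.split())
    raw_bytes = base64.b64decode(compact)
    return raw_bytes.decode(charset)


# First line of the encoded body from the example message.
SAMPLE = "PGh0bWw+CjxoZWFkPgo8dGl0bGU+VW50aXRsZWQgRG9jdW1lbnQ8L3RpdGxl"

print(decode_body(SAMPLE))
```

To decode a complete message, you would read the encoded text, and nothing else, from your file and pass it to the same function.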

The example given previously decodes to this simple web page:

    <html>
    <head>
    <title>Untitled Document</title>
    <meta http-equiv="Content-Type" content="text/html;
    charset=windows-1251">
    </head>
    <body bgcolor="#FFFFFF" text="#000000">
    <p>Dear e-gold user !</p>
    <p>Our system has undergone to serious preventive maintenance,
    please, check up functioning your e-gold account.</p>
    <p>The e-gold site is at: </p>
    <p><a href="http://www.e-gold.com@reynsan.netfirms.com">
    http://www.e-gold.com</a></p>
    <p>This is automatic email.<br>
      Do not reply to this email.</p>
    </body>
    </html>

This is simply the HTML code for the text that was displayed in the mail client. The difference is that you can see the real target for the URL in that message: http://www.e-gold.com@reynsan.netfirms.com. Well, not quite. This URL is actually the one I used as an example in the earlier section Usernames in URLs, with 140 %01 padding characters between the fake hostname and the @ sign. This is a non-printing ASCII character, so when you view the decoded output in more, you don’t see the padding. Open up the output in emacs and it is visible as 140 Ctrl-A characters.
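Another way to expose hidden padding like this is to have a script print non-printing characters as visible escapes. The sketch below is illustrative only: reveal_control_chars is a hypothetical helper, and href is rebuilt by hand from the pieces described above rather than read from the decoded message.

```python
def reveal_control_chars(text: str) -> str:
    """Return text with non-printing characters shown as \\xNN escapes."""
    return "".join(
        # Keep printable characters and newlines; escape everything else.
        ch if ch.isprintable() or ch == "\n" else f"\\x{ord(ch):02x}"
        for ch in text
    )


# Hand-built stand-in for the decoded link target: the fake hostname,
# 140 Ctrl-A (0x01) padding bytes, then the real host after the @ sign.
href = "http://www.e-gold.com" + "\x01" * 140 + "@reynsan.netfirms.com"

print(href.count("\x01"))                 # number of hidden padding bytes
print(reveal_control_chars(href)[:45])    # start of the URL, padding visible
```

Run against real decoded output, a helper like this makes the padding impossible to miss, whichever pager or editor you use.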

If that all seems unnecessarily complex, remember that one reason for the disguise is to defeat spam-filtering software. Unless that software can decode Base64 to get at the real text, it can’t tell whether this is a legitimate message. The same motivation leads some spammers to make images containing the text of the message, perhaps captured from a screen dump. The email messages may contain a URL to an image on a remote server or may include the image as a block of encoded text within the message. The images are placed within an anchor tag, so that you can click anywhere on the image and go to the target URL.

The following target URL was taken from a message that supposedly came from a bank. Just to make things more of a challenge, that URL was encoded:

http://%32%32%31%2E%31%38%34%2E%39%32%2E%31%36%39:%34%39%30%33/%63%69%74/%69%6E%64%65%78%2E%68%74%6D

Translating that yields a numeric IP address and a nonstandard port number:

http://221.184.92.169:4903/cit/index.htm
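You don't have to translate the hex codes by hand. As a small sketch, Python's standard urllib.parse.unquote function reverses this kind of percent-encoding; the variable names here are my own.

```python
from urllib.parse import unquote

# The encoded URL from the message, split across two lines for readability.
encoded_url = (
    "http://%32%32%31%2E%31%38%34%2E%39%32%2E%31%36%39:"
    "%34%39%30%33/%63%69%74/%69%6E%64%65%78%2E%68%74%6D"
)

# Each %XX sequence is replaced by the character with that hex code.
decoded_url = unquote(encoded_url)
print(decoded_url)
```

Every character the scammer encoded is an ordinary digit, letter, or period that needed no encoding at all, which is a strong hint that the only purpose was to obscure the address.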

This is a good example of the multiple layers of deception that professional scammers will employ to make things difficult for spam filters and for people like us who want to expose them.
