aboutsummaryrefslogtreecommitdiff
path: root/doc/base64.txt
diff options
context:
space:
mode:
Diffstat (limited to 'doc/base64.txt')
-rw-r--r--doc/base64.txt107
1 files changed, 0 insertions, 107 deletions
diff --git a/doc/base64.txt b/doc/base64.txt
deleted file mode 100644
index 48eafabe..00000000
--- a/doc/base64.txt
+++ /dev/null
@@ -1,107 +0,0 @@
- The Base64 Content-Transfer-Encoding is designed to represent
- arbitrary sequences of octets in a form that need not be humanly
- readable. The encoding and decoding algorithms are simple, but the
- encoded data are consistently only about 33 percent larger than the
- unencoded data. This encoding is virtually identical to the one used
- in Privacy Enhanced Mail (PEM) applications, as defined in RFC 1421.
- The base64 encoding is adapted from RFC 1421, with one change: base64
- eliminates the "*" mechanism for embedded clear text.
-
- A 65-character subset of US-ASCII is used, enabling 6 bits to be
- represented per printable character. (The extra 65th character, "=",
- is used to signify a special processing function.)
-
- NOTE: This subset has the important property that it is
- represented identically in all versions of ISO 646, including US
- ASCII, and all characters in the subset are also represented
- identically in all versions of EBCDIC. Other popular encodings,
- such as the encoding used by the uuencode utility and the base85
- encoding specified as part of Level 2 PostScript, do not share
- these properties, and thus do not fulfill the portability
- requirements a binary transport encoding for mail must meet.
-
- The encoding process represents 24-bit groups of input bits as output
- strings of 4 encoded characters. Proceeding from left to right, a
- 24-bit input group is formed by concatenating 3 8-bit input groups.
- These 24 bits are then treated as 4 concatenated 6-bit groups, each
- of which is translated into a single digit in the base64 alphabet.
- When encoding a bit stream via the base64 encoding, the bit stream
- must be presumed to be ordered with the most-significant-bit first.
-
- That is, the first bit in the stream will be the high-order bit in
- the first byte, and the eighth bit will be the low-order bit in the
- first byte, and so on.
-
- Each 6-bit group is used as an index into an array of 64 printable
- characters. The character referenced by the index is placed in the
- output string. These characters, identified in Table 1, below, are
- selected so as to be universally representable, and the set excludes
- characters with particular significance to SMTP (e.g., ".", CR, LF)
- and to the encapsulation boundaries defined in this document (e.g.,
- "-").
-
- Table 1: The Base64 Alphabet
-
- Value Encoding Value Encoding Value Encoding Value Encoding
- 0 A 17 R 34 i 51 z
- 1 B 18 S 35 j 52 0
- 2 C 19 T 36 k 53 1
- 3 D 20 U 37 l 54 2
- 4 E 21 V 38 m 55 3
- 5 F 22 W 39 n 56 4
- 6 G 23 X 40 o 57 5
- 7 H 24 Y 41 p 58 6
- 8 I 25 Z 42 q 59 7
- 9 J 26 a 43 r 60 8
- 10 K 27 b 44 s 61 9
- 11 L 28 c 45 t 62 +
- 12 M 29 d 46 u 63 /
- 13 N 30 e 47 v
- 14 O 31 f 48 w (pad) =
- 15 P 32 g 49 x
- 16 Q 33 h 50 y
- The output stream (encoded bytes) must be represented in lines of no
- more than 76 characters each. All line breaks or other characters
- not found in Table 1 must be ignored by decoding software. In base64
- data, characters other than those in Table 1, line breaks, and other
- white space probably indicate a transmission error, about which a
- warning message or even a message rejection might be appropriate
- under some circumstances.
-
- Special processing is performed if fewer than 24 bits are available
- at the end of the data being encoded. A full encoding quantum is
- always completed at the end of a body. When fewer than 24 input bits
- are available in an input group, zero bits are added (on the right)
- to form an integral number of 6-bit groups. Padding at the end of
- the data is performed using the '=' character. Since all base64
- input is an integral number of octets, only the following cases can
-arise: (1) the final quantum of encoding input is an integral
- multiple of 24 bits; here, the final unit of encoded output will be
- an integral multiple of 4 characters with no "=" padding, (2) the
- final quantum of encoding input is exactly 8 bits; here, the final
- unit of encoded output will be two characters followed by two "="
- padding characters, or (3) the final quantum of encoding input is
- exactly 16 bits; here, the final unit of encoded output will be three
- characters followed by one "=" padding character.
-
- Because it is used only for padding at the end of the data, the
- occurrence of any '=' characters may be taken as evidence that the
- end of the data has been reached (without truncation in transit). No
- such assurance is possible, however, when the number of octets
- transmitted was a multiple of three.
-
- Any characters outside of the base64 alphabet are to be ignored in
- base64-encoded data. The same applies to any illegal sequence of
- characters in the base64 encoding, such as "====="
-
- Care must be taken to use the proper octets for line breaks if base64
- encoding is applied directly to text material that has not been
- converted to canonical form. In particular, text line breaks must be
- converted into CRLF sequences prior to base64 encoding. The important
- thing to note is that this may be done directly by the encoder rather
- than in a prior canonicalization step in some implementations.
-
- NOTE: There is no need to worry about quoting apparent
- encapsulation boundaries within base64-encoded parts of multipart
- entities because no hyphen characters are used in the base64
- encoding.