blob: 0267afe2ef3bc81dd6cbdab6e7d2fbed8bf4b2b4 [file] [log] [blame]
<html lang="en">
<head>
<title>Encode Binary Data - The GNU C Library</title>
<meta http-equiv="Content-Type" content="text/html">
<meta name="description" content="The GNU C Library">
<meta name="generator" content="makeinfo 4.13">
<link title="Top" rel="start" href="index.html#Top">
<link rel="up" href="String-and-Array-Utilities.html#String-and-Array-Utilities" title="String and Array Utilities">
<link rel="prev" href="Trivial-Encryption.html#Trivial-Encryption" title="Trivial Encryption">
<link rel="next" href="Argz-and-Envz-Vectors.html#Argz-and-Envz-Vectors" title="Argz and Envz Vectors">
<link href="http://www.gnu.org/software/texinfo/" rel="generator-home" title="Texinfo Homepage">
<!--
This file documents the GNU C library.
This is Edition 0.12, last updated 2007-10-27,
of `The GNU C Library Reference Manual', for version
2.8 (Sourcery G++ Lite 2011.03-41).
Copyright (C) 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2001, 2002,
2003, 2007, 2008, 2010 Free Software Foundation, Inc.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with the
Invariant Sections being ``Free Software Needs Free Documentation''
and ``GNU Lesser General Public License'', the Front-Cover texts being
``A GNU Manual'', and with the Back-Cover Texts as in (a) below. A
copy of the license is included in the section entitled "GNU Free
Documentation License".
(a) The FSF's Back-Cover Text is: ``You have the freedom to
copy and modify this GNU manual. Buying copies from the FSF
supports it in developing GNU and promoting software freedom.''-->
<meta http-equiv="Content-Style-Type" content="text/css">
<style type="text/css"><!--
pre.display { font-family:inherit }
pre.format { font-family:inherit }
pre.smalldisplay { font-family:inherit; font-size:smaller }
pre.smallformat { font-family:inherit; font-size:smaller }
pre.smallexample { font-size:smaller }
pre.smalllisp { font-size:smaller }
span.sc { font-variant:small-caps }
span.roman { font-family:serif; font-weight:normal; }
span.sansserif { font-family:sans-serif; font-weight:normal; }
--></style>
<link rel="stylesheet" type="text/css" href="../cs.css">
</head>
<body>
<div class="node">
<a name="Encode-Binary-Data"></a>
<p>
Next:&nbsp;<a rel="next" accesskey="n" href="Argz-and-Envz-Vectors.html#Argz-and-Envz-Vectors">Argz and Envz Vectors</a>,
Previous:&nbsp;<a rel="previous" accesskey="p" href="Trivial-Encryption.html#Trivial-Encryption">Trivial Encryption</a>,
Up:&nbsp;<a rel="up" accesskey="u" href="String-and-Array-Utilities.html#String-and-Array-Utilities">String and Array Utilities</a>
<hr>
</div>
<h3 class="section">5.11 Encode Binary Data</h3>
<p>To store or transfer binary data in environments which only support text
one has to encode the binary data by mapping the input bytes to
characters in the range allowed for storing or transfering. SVID
systems (and nowadays XPG compliant systems) provide minimal support for
this task.
<!-- stdlib.h -->
<!-- XPG -->
<div class="defun">
&mdash; Function: char * <b>l64a</b> (<var>long int n</var>)<var><a name="index-l64a-584"></a></var><br>
<blockquote><p>This function encodes a 32-bit input value using characters from the
basic character set. It returns a pointer to a 7 character buffer which
contains an encoded version of <var>n</var>. To encode a series of bytes the
user must copy the returned string to a destination buffer. It returns
the empty string if <var>n</var> is zero, which is somewhat bizarre but
mandated by the standard.<br>
<strong>Warning:</strong> Since a static buffer is used this function should not
be used in multi-threaded programs. There is no thread-safe alternative
to this function in the C library.<br>
<strong>Compatibility Note:</strong> The XPG standard states that the return
value of <code>l64a</code> is undefined if <var>n</var> is negative. In the GNU
implementation, <code>l64a</code> treats its argument as unsigned, so it will
return a sensible encoding for any nonzero <var>n</var>; however, portable
programs should not rely on this.
<p>To encode a large buffer <code>l64a</code> must be called in a loop, once for
each 32-bit word of the buffer. For example, one could do something
like this:
<pre class="smallexample"> char *
encode (const void *buf, size_t len)
{
/* <span class="roman">We know in advance how long the buffer has to be.</span> */
unsigned char *in = (unsigned char *) buf;
char *out = malloc (6 + ((len + 3) / 4) * 6 + 1);
char *cp = out, *p;
/* <span class="roman">Encode the length.</span> */
/* <span class="roman">Using `htonl' is necessary so that the data can be</span>
<span class="roman">decoded even on machines with different byte order.</span>
<span class="roman">`l64a' can return a string shorter than 6 bytes, so </span>
<span class="roman">we pad it with encoding of 0 (</span>'.'<span class="roman">) at the end by </span>
<span class="roman">hand.</span> */
p = stpcpy (cp, l64a (htonl (len)));
cp = mempcpy (p, "......", 6 - (p - cp));
while (len &gt; 3)
{
unsigned long int n = *in++;
n = (n &lt;&lt; 8) | *in++;
n = (n &lt;&lt; 8) | *in++;
n = (n &lt;&lt; 8) | *in++;
len -= 4;
p = stpcpy (cp, l64a (htonl (n)));
cp = mempcpy (p, "......", 6 - (p - cp));
}
if (len &gt; 0)
{
unsigned long int n = *in++;
if (--len &gt; 0)
{
n = (n &lt;&lt; 8) | *in++;
if (--len &gt; 0)
n = (n &lt;&lt; 8) | *in;
}
cp = stpcpy (cp, l64a (htonl (n)));
}
*cp = '\0';
return out;
}
</pre>
<p>It is strange that the library does not provide the complete
functionality needed but so be it.
</blockquote></div>
<p>To decode data produced with <code>l64a</code> the following function should be
used.
<!-- stdlib.h -->
<!-- XPG -->
<div class="defun">
&mdash; Function: long int <b>a64l</b> (<var>const char *string</var>)<var><a name="index-a64l-585"></a></var><br>
<blockquote><p>The parameter <var>string</var> should contain a string which was produced by
a call to <code>l64a</code>. The function processes at least 6 characters of
this string, and decodes the characters it finds according to the table
below. It stops decoding when it finds a character not in the table,
rather like <code>atoi</code>; if you have a buffer which has been broken into
lines, you must be careful to skip over the end-of-line characters.
<p>The decoded number is returned as a <code>long int</code> value.
</p></blockquote></div>
<p>The <code>l64a</code> and <code>a64l</code> functions use a base 64 encoding, in
which each character of an encoded string represents six bits of an
input word. These symbols are used for the base 64 digits:
<p><table summary=""><tr align="left"><td valign="top"></td><td valign="top">0 </td><td valign="top">1 </td><td valign="top">2 </td><td valign="top">3 </td><td valign="top">4 </td><td valign="top">5 </td><td valign="top">6 </td><td valign="top">7
<br></td></tr><tr align="left"><td valign="top">0 </td><td valign="top"><code>.</code> </td><td valign="top"><code>/</code> </td><td valign="top"><code>0</code> </td><td valign="top"><code>1</code>
</td><td valign="top"><code>2</code> </td><td valign="top"><code>3</code> </td><td valign="top"><code>4</code> </td><td valign="top"><code>5</code>
<br></td></tr><tr align="left"><td valign="top">8 </td><td valign="top"><code>6</code> </td><td valign="top"><code>7</code> </td><td valign="top"><code>8</code> </td><td valign="top"><code>9</code>
</td><td valign="top"><code>A</code> </td><td valign="top"><code>B</code> </td><td valign="top"><code>C</code> </td><td valign="top"><code>D</code>
<br></td></tr><tr align="left"><td valign="top">16 </td><td valign="top"><code>E</code> </td><td valign="top"><code>F</code> </td><td valign="top"><code>G</code> </td><td valign="top"><code>H</code>
</td><td valign="top"><code>I</code> </td><td valign="top"><code>J</code> </td><td valign="top"><code>K</code> </td><td valign="top"><code>L</code>
<br></td></tr><tr align="left"><td valign="top">24 </td><td valign="top"><code>M</code> </td><td valign="top"><code>N</code> </td><td valign="top"><code>O</code> </td><td valign="top"><code>P</code>
</td><td valign="top"><code>Q</code> </td><td valign="top"><code>R</code> </td><td valign="top"><code>S</code> </td><td valign="top"><code>T</code>
<br></td></tr><tr align="left"><td valign="top">32 </td><td valign="top"><code>U</code> </td><td valign="top"><code>V</code> </td><td valign="top"><code>W</code> </td><td valign="top"><code>X</code>
</td><td valign="top"><code>Y</code> </td><td valign="top"><code>Z</code> </td><td valign="top"><code>a</code> </td><td valign="top"><code>b</code>
<br></td></tr><tr align="left"><td valign="top">40 </td><td valign="top"><code>c</code> </td><td valign="top"><code>d</code> </td><td valign="top"><code>e</code> </td><td valign="top"><code>f</code>
</td><td valign="top"><code>g</code> </td><td valign="top"><code>h</code> </td><td valign="top"><code>i</code> </td><td valign="top"><code>j</code>
<br></td></tr><tr align="left"><td valign="top">48 </td><td valign="top"><code>k</code> </td><td valign="top"><code>l</code> </td><td valign="top"><code>m</code> </td><td valign="top"><code>n</code>
</td><td valign="top"><code>o</code> </td><td valign="top"><code>p</code> </td><td valign="top"><code>q</code> </td><td valign="top"><code>r</code>
<br></td></tr><tr align="left"><td valign="top">56 </td><td valign="top"><code>s</code> </td><td valign="top"><code>t</code> </td><td valign="top"><code>u</code> </td><td valign="top"><code>v</code>
</td><td valign="top"><code>w</code> </td><td valign="top"><code>x</code> </td><td valign="top"><code>y</code> </td><td valign="top"><code>z</code>
<br></td></tr></table>
<p>This encoding scheme is not standard. There are some other encoding
methods which are much more widely used (UU encoding, MIME encoding).
Generally, it is better to use one of these encodings.
</body></html>