This is the mail archive of the
guile@sourceware.cygnus.com
mailing list for the Guile project.
engineering an opposable thumb for guile?
- To: guile at sourceware dot cygnus dot com
- Subject: engineering an opposable thumb for guile?
- From: Daniel Ortmann <ortmann at vnet dot ibm dot com>
- Date: Mon, 24 Jan 2000 17:21:15 -0600
- Reply-To: "Daniel Ortmann" <ortmann at vnet dot ibm dot com>
Sirs,
I think the following missing guile feature is SO IMPORTANT that its lack
should be considered a bug of the highest order. Portability is good, but
practical engineering use requires that the following "opposable thumb" be
added to guile. Its lack certainly stops my guile usage in its tracks. :-(
Thank you so much for your consideration!
SUMMARY:
I believe Guile needs flexible versions of perl's "pack" and "unpack"
functions. Built in the spirit of Scheme, of course.
CONTEXT:
My work is in the messy world of practical seat-of-the-pants electrical
engineering. Often I cannot adapt problems to myself but must adapt myself to
those problems.
The current problem, practically solvable only with perl, consists of reading
and writing 4-byte or 8-byte binary floating point numbers from a file. Here
is some perl code which does this:
perl -e 'undef $/; $f = <>; print join( "\n", unpack "f*", $f ), "\n"' file-of-4-byte-binary-numbers
Guile appears to be missing a similar type of "opposable thumb". :-(
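For comparison only, here is a minimal sketch of the same job in Python, whose standard "struct" module plays roughly the role of perl's pack/unpack. The function name read_floats and its parameters are made up for illustration; this is not a proposed Guile API. It assumes the input is nothing but back-to-back floats and silently ignores any trailing partial value.

```python
import struct

def read_floats(data: bytes, fmt: str = "f", byte_order: str = "=") -> list:
    """Decode a byte string of back-to-back binary floats.

    fmt is "f" (4-byte) or "d" (8-byte); byte_order is a struct prefix:
    "=" for the host's byte order, "<" for little-endian, ">" for big-endian.
    """
    size = struct.calcsize(byte_order + fmt)
    count = len(data) // size            # trailing partial value is ignored
    return list(struct.unpack(f"{byte_order}{count}{fmt}", data[:count * size]))

# Round trip on values that are exactly representable in 4 bytes.
sample = struct.pack("=3f", 1.0, 2.5, -4.0)
values = read_floats(sample)
```

Reading a real file would amount to passing the result of open(path, "rb").read() as the data argument.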
PROPOSED SOLUTION:
Scheme should be able to generate clean and efficient code. I don't believe
that this is only possible in C and perl.
- Should support general "endianness".
- Should support *all* IEEE floats that the hardware platform supports
(4-byte, 8-byte, extended, etc).
- Should support at least the perl-like functionality marked below ...
- ... but should do it with a clean Scheme design.
- Should support *all* hardware integer values. But saying "long" is not
enough because one machine's "long" is 4 bytes while another's is 8 bytes. A
lot of issues need to be considered. And don't forget "long long".
- Don't forget the sizes of various pointers, including function pointers.
- Compiler alignment issues need to be considered because of padding. This is
the "padding you get when you specify XYZ option."
- Explicit padding (mentioned below with the "x" template format character).
- Perhaps should issue "warning: machine dependent code!" messages when used.
... I don't know. Bigger brains than mine need to think about this.
:-)
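To make a few of the points above concrete, here is a hedged sketch using Python's "struct" module as a stand-in. It only illustrates explicit endianness, fixed versus platform-dependent integer sizes, and alignment padding; it says nothing about how a Scheme design should look. Note that "struct" does not cover extended-precision floats, so that requirement is not shown.

```python
import struct

# Explicit byte order with a fixed 4-byte integer, independent of the host.
little = struct.pack("<i", 1)          # bytes 01 00 00 00
big = struct.pack(">i", 1)             # bytes 00 00 00 01

# An 8-byte IEEE double written in big-endian order.
double_be = struct.pack(">d", 1.0)     # bytes 3f f0 00 00 00 00 00 00

# "long" depends on the platform in native mode, but not in standard mode.
native_long = struct.calcsize("@l")    # 4 or 8, depending on the machine
standard_long = struct.calcsize("<l")  # always 4
standard_long_long = struct.calcsize("<q")  # always 8
pointer_size = struct.calcsize("P")    # native pointer size

# Native mode inserts compiler-style alignment padding; standard mode does not.
aligned_size = struct.calcsize("@bi")  # typically 8 (3 pad bytes after the char)
packed_size = struct.calcsize("=bi")   # 5, no padding

# Explicit padding, written out with "x" pad bytes.
explicit = struct.pack("<b3xi", 1, 2)
```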
ADDITIONAL DOCUMENTATION:
The following documentation is from the "perlfunc" manpage.
pack TEMPLATE,LIST
Takes an array or list of values and packs it into a binary
structure, returning the string containing the structure. The
TEMPLATE is a sequence of characters that give the order and type of
values, as follows:
A An ascii string, will be space padded.
a An ascii string, will be null padded.
b A bit string (ascending bit order, like vec()).
B A bit string (descending bit order).
h A hex string (low nybble first).
H A hex string (high nybble first).
c A signed char value.
C An unsigned char value.
s A signed short value.
S An unsigned short value.
i A signed integer value.
I An unsigned integer value.
l A signed long value.
L An unsigned long value.
n A short in "network" order.
N A long in "network" order.
v A short in "VAX" (little-endian) order.
V A long in "VAX" (little-endian) order.
f A single-precision float in the native format.
d A double-precision float in the native format.
p A pointer to a null-terminated string.
P A pointer to a structure (fixed-length string).
u A uuencoded string.
x A null byte.
X Back up a byte.
@ Null fill to absolute position.
Each letter may optionally be followed by a number which gives a
repeat count. With all types except "a", "A", "b", "B", "h", "H",
and "P", the pack function will gobble up that many values from the
LIST. A * for the repeat count means to use however many items are
left. The "a" and "A" types gobble just one value, but pack it as a
string of length count, padding with nulls or spaces as necessary.
(When unpacking, "A" strips trailing spaces and nulls, but "a" does
not.) Likewise, the "b" and "B" fields pack a string that many bits
long. The "h" and "H" fields pack a string that many nybbles long.
The "P" packs a pointer to a structure of the size indicated by the
length. Real numbers (floats and doubles) are in the native machine
format only; due to the multiplicity of floating formats around, and
the lack of a standard "network" representation, no facility for
interchange has been made. This means that packed floating point
data written on one machine may not be readable on another - even if
both use IEEE floating point arithmetic (as the endian-ness of the
memory representation is not part of the IEEE spec). Note that Perl
uses doubles internally for all numeric calculation, and converting
from double into float and thence back to double again will lose
precision (i.e. `unpack("f", pack("f", $foo))' will not in general
equal $foo).
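The precision loss described above is easy to reproduce. The following is a small sketch in Python (which, like Perl, uses doubles for its floats); the helper name roundtrip_single is hypothetical.

```python
import struct

def roundtrip_single(x: float) -> float:
    """Squeeze a double through a 4-byte float and back again."""
    return struct.unpack("f", struct.pack("f", x))[0]

exact = roundtrip_single(0.5)    # 0.5 is a power of two, so it survives
lossy = roundtrip_single(0.1)    # 0.1 is not exactly representable
changed = lossy != 0.1           # the round trip altered the value
```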
Examples:
$foo = pack("cccc",65,66,67,68);
# foo eq "ABCD"
$foo = pack("c4",65,66,67,68);
# same thing
$foo = pack("ccxxcc",65,66,67,68);
# foo eq "AB\0\0CD"
$foo = pack("s2",1,2);
# "\1\0\2\0" on little-endian
# "\0\1\0\2" on big-endian
$foo = pack("a4","abcd","x","y","z");
# "abcd"
$foo = pack("aaaa","abcd","x","y","z");
# "axyz"
$foo = pack("a14","abcdefg");
# "abcdefg\0\0\0\0\0\0\0"
$foo = pack("i9pl", gmtime);
# a real struct tm (on my system anyway)
sub bintodec {
    unpack("N", pack("B32", substr("0" x 32 . shift, -32)));
}
The same template may generally also be used in the unpack function.
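As a cross-check of several of the pack examples above, here is a rough Python rendering with the "struct" module. The format letters differ from perl's ("b" for a signed char, "h" for a short, "s" for a null-padded string), and the byte order is spelled out explicitly rather than left to the machine. Treat it as an approximate analog, not an exact translation.

```python
import struct

abcd = struct.pack("4b", 65, 66, 67, 68)          # like pack("c4", ...)
with_pad = struct.pack("bbxxbb", 65, 66, 67, 68)  # like pack("ccxxcc", ...)
shorts_le = struct.pack("<2h", 1, 2)              # pack("s2", 1, 2), little-endian
shorts_be = struct.pack(">2h", 1, 2)              # pack("s2", 1, 2), big-endian
null_padded = struct.pack("14s", b"abcdefg")      # like pack("a14", "abcdefg")

def bintodec(bits: str) -> int:
    """Convert a string of 0s and 1s to a number, keeping the low 32 bits."""
    padded = ("0" * 32 + bits)[-32:]   # same left-padding trick as the perl sub
    return int(padded, 2)
```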
unpack TEMPLATE,EXPR
Unpack does the reverse of pack: it takes a string representing a
structure and expands it out into a list value, returning the array
value. (In a scalar context, it merely returns the first value
produced.) The TEMPLATE has the same format as in the pack function.
Here's a subroutine that does substring:
sub substr {
    local($what,$where,$howmuch) = @_;
    unpack("x$where a$howmuch", $what);
}
and then there's
sub ordinal { unpack("c",$_[0]); } # same as ord()
In addition, you may prefix a field with a %<number> to indicate that
you want a <number>-bit checksum of the items instead of the items
themselves. Default is a 16-bit checksum. For example, the
following computes the same number as the System V sum program:
while (<>) {
    $checksum += unpack("%16C*", $_);
}
$checksum %= 65536;
The following efficiently counts the number of set bits in a bit
vector:
$setbits = unpack("%32b*", $selectmask);
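For readers unfamiliar with the % prefix, the two checksum idioms above amount to the following, sketched here in Python. The function names are made up for illustration, and the bit count assumes the total stays below 2^32, where perl's 32-bit checksum would wrap.

```python
def checksum16(data: bytes) -> int:
    """Sum of all unsigned byte values, modulo 65536 (like "%16C*")."""
    return sum(data) % 65536

def count_set_bits(data: bytes) -> int:
    """Number of 1 bits across all bytes (like "%32b*" on a bit vector)."""
    return sum(bin(byte).count("1") for byte in data)
```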
Thanks!
--
Daniel Ortmann, IBM Circuit Technology, Rochester, MN 55901-7829
ortmann@vnet.ibm.com or ortmann@us.ibm.com and 507.253.6795 (external)
ortmann@rchland.ibm.com and tieline 8.553.6795 (internal)
ortmann@isl.net and 507.288.7732 (home)
"The answers are so simple, and we all know where to look,
but it's easier just to avoid the question." -- Kansas