muuh-gnu wrote:Such simple tasks shouldnt be so non-obvious to do in Lisp, from a beginners point of view, or there should be a "batteries included" implementation delivering all those "standard" functionalities out of the box like python does.
Reading, identifying and interpreting floating-point numbers is not a "simple task" [see the examples below], and I seriously doubt that Python can do this, at least not with the "batteries included" libraries.
The main "problem" in this case is that Common Lisp is defined as a hardware-independent programming language, while the data type "double" is strongly hardware-dependent (and software-dependent, too).
So the question must be:
What is wrong with "doubles"?
The trouble begins with the fact that there are several specifications for data types called "double":
- A "double" can be a 16-bit (double byte), 32-bit (double 16-bit register) or 64-bit (double 32-bit register) integer. All these formats have been used in the past on various hardware platforms. Probably somewhere in the future there will be double 64-bit, double 128-bit, and so on integers.
- IEEE 754 defines a floating-point data type named "double" that, from a mathematical point of view, is not even a number; it is an inexact approximation of a number, owing to several hardware and software limitations.
Summary: the data type "double" tells virtually nothing about the data. It tells neither whether the number is an integer or a floating-point number, nor whether the data is a number at all. IEEE 754 defines e.g. "NaN" as "not a number".
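To make the summary concrete, here is a minimal Python sketch (my own illustration, using only the standard `struct`, `math` and `decimal` modules). It assumes the "double" in question is an IEEE 754 binary64 in little-endian byte order, which is an assumption and not something the bytes themselves can tell you.

```python
import math
import struct
from decimal import Decimal

# Eight bytes; nothing in them says how they are meant to be interpreted.
raw = struct.pack("<d", 100.0)

as_float = struct.unpack("<d", raw)[0]  # read as IEEE 754 binary64
as_int = struct.unpack("<q", raw)[0]    # read as a signed 64-bit integer

print(as_float)     # 100.0
print(hex(as_int))  # 0x4059000000000000

# NaN has the type "float" but is, by definition, not a number:
nan = float("nan")
print(isinstance(nan, float), math.isnan(nan), nan == nan)  # True True False

# A double only approximates most decimal fractions:
print(0.1 + 0.2 == 0.3)                # False
print(Decimal(0.1) == Decimal("0.1"))  # False
```

The same eight bytes give 100.0 or a huge integer, depending only on what the reader assumes about them.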
For the rest of this discussion I assume that we are talking about IEEE 754 "double" floating-point numbers.
The problem with IEEE 754 floating-point numbers is that only the memory format is defined but not the exact printed representation, with the consequence that several locale-dependent print-formats exist for IEEE 754 floating-point numbers:
100.0 = "one hundred dot zero" in English/American notation
100,0 = "one hundred comma zero", often used e.g. in Europe
In America the comma is often used as a "thousands" separator, while in Europe the dot is used for this purpose. This leads to an even worse "one million" mess:
1,000,000.0 = "one million dot zero" using commas as "thousands" separator
1.000.000,0 = "one million comma zero" using dots as "thousands" separator
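A small Python sketch of why this is a mess: the same printed text gives different values depending on which convention the reader assumes. The helper `parse_number` and its parameters are hypothetical names made up for this illustration. It is deliberately naive (no validation of digit grouping, no exponents) and only works because the print-format is stated explicitly by the caller.

```python
def parse_number(text, decimal_sep, thousands_sep):
    """Parse `text` under an explicitly stated print-format (hypothetical helper)."""
    # Drop the grouping character, then normalise the decimal separator to "."
    cleaned = text.replace(thousands_sep, "").replace(decimal_sep, ".")
    return float(cleaned)

# Both "one million" notations give the same value once the format is known:
print(parse_number("1,000,000.0", decimal_sep=".", thousands_sep=","))  # 1000000.0
print(parse_number("1.000.000,0", decimal_sep=",", thousands_sep="."))  # 1000000.0

# But the same characters read under the two conventions disagree by a factor of 1000:
american = parse_number("1.000", decimal_sep=".", thousands_sep=",")
european = parse_number("1.000", decimal_sep=",", thousands_sep=".")
print(american, european)  # 1.0 1000.0
```

Without knowing which convention produced "1.000", no parser can decide between one and one thousand.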
And there are also various "e" formats for "standard", "scientific" and whatever exponentiation.
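As a rough illustration of the "e" formats, the Python sketch below feeds a few spellings of one million to the built-in `float()`. The list of spellings is my own selection, not an exhaustive catalogue, and `accepts` is a hypothetical helper for this example. Note that the "D" exponent marker, which the Common Lisp reader understands as a double-float (e.g. `1.0d6`), is rejected by Python's `float()`.

```python
def accepts(text):
    """Return True if Python's built-in float() can parse `text` (hypothetical helper)."""
    try:
        float(text)
        return True
    except ValueError:
        return False

# Spellings that float() treats as the same value:
variants = ["1e6", "1E6", "1.0e+06", "1000000.0"]
values = [float(v) for v in variants]
print(values)  # [1000000.0, 1000000.0, 1000000.0, 1000000.0]

# Spellings of the same quantity that float() rejects:
for text in ["1.0D+06", "1,0e6", "1.0 x 10^6"]:
    print(text, accepts(text))  # each prints False
```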
Now the contest: Whoever can tell me a library in any programming language that can reliably read, identify and interpret all the myriads of "double" printed integer and floating-point formats wins a big piece of pie.
I myself don't know a single library in any programming language that can do this.
The "rft" in "edgar-rft" is the German abbreviation of "Radio/Television Broadcast Technician". My job is to work with all sorts of hardware and software measurement equipment every day, and I can tell you that printed measurement data in floating-point "double" format can be considered completely useless as long as nothing is known about the exact print-format of the "doubles" and the floating-point errors of the machine that produced the data.
To me the question is: what do you want to achieve with measurement data represented in an unreliable print-format?
- edgar
P.S.: I will try to write a Common Lisp function to read your data into an array if you can tell me more about the print-format of the "doubles".