Reading formatted data

Discussion of Common Lisp

Re: Reading formatted data

Post by edgar-rft » Sat Jun 25, 2011 3:58 am

Paul wrote: The OP said scanf(3) can do it, so he's talking about ASCII decimal representations, not binary format. In which case, READ will read them (just set *read-default-float-format* to 'double).
'double is not a valid Common Lisp type specifier; I assume you meant 'double-float. Here is what happens:

CL-USER> (setf *read-default-float-format* 'double-float)
DOUBLE-FLOAT

CL-USER> 100,0
error: comma is not inside a backquote

CL-USER> 1,000,000.0
error: comma is not inside a backquote

CL-USER> 1.000.000,0
error: The variable |1.000.000| is unbound.
Yes, it's true that C's scanf can't read these numbers either, but all of them are valid ASCII representations of floating-point numbers. That means nothing other than that C's scanf is also incapable of reading floating-point numbers in general. This is a reality I face every day.
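For input that does follow Lisp number syntax, the READ suggestion can also be tried on a string instead of at the REPL. The following is only a minimal sketch and was not posted in this thread: read-lisp-number is a hypothetical helper name, and it assumes that anything the reader does not consume completely as a real number should be rejected.

```lisp
;; Hypothetical helper (not from the thread): read one number in Lisp syntax
;; from STRING, treating decimals without an exponent marker as double-floats.
(defun read-lisp-number (string)
  (let ((trimmed (string-trim '(#\Space #\Tab #\Newline) string)))
    (with-standard-io-syntax
      (let ((*read-default-float-format* 'double-float)
            (*read-eval* nil))          ; no #. evaluation on untrusted input
        (multiple-value-bind (value end)
            (ignore-errors (read-from-string trimmed))
          ;; READ also returns symbols and stops at a comma, so accept the
          ;; result only if it is a real number and the whole string was used.
          (and (realp value)
               (= end (length trimmed))
               value))))))

(read-lisp-number "100.0")        ; a double-float
(read-lisp-number "100,0")        ; NIL, the reader stops before the comma
(read-lisp-number "1.000.000,0")  ; NIL, the reader sees a symbol
```

Checking the end position matters here: without it, "100,0" would quietly come back as the integer 100.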
Paul wrote: C requires that a "double" have at least 37 bits of mantissa equivalent, so there are no 16- or 32-bit "double"s.
DSP machines (producing measurement data) usually work with assembly code, not with C, and there are still many measurement machines using 8-bit and 16-bit architectures. In an 8-bit architecture a "double" has 16 bits.
Paul wrote: Can you give an example of an "inexact approximation of a number"?
What I meant was of course:

CL-USER> (float 1/3)
0.33333334f0
Okay, this is an approximation of the value of a number, and not an approximation of the number itself.
marcoxa wrote: Guys... we (should) all have read Goldberg's paper, but that has little import to the OP question...
That's true - [David Goldberg: What Every Computer Scientist Should Know About Floating-Point Arithmetic]

But that unfortunately doesn't change the fact that the question can't be answered as long as nothing is known about the floating-point print format. C's scanf cannot serve as a reference because it can't read floating-point numbers correctly.

The main topic of my rant was (and still is):

I have to deal every day with floating-point measurement data in printed form, either in ASCII or Unicode data files or printed on paper, where nothing is known about the tolerances involved. These come from hardware limitations such as unknown quantisation depth, or from software limitations introduced by sample-frequency interpolation or floating-point rounding errors, so the data is virtually useless. This is made even worse by the fact that most English/American C programs cannot even interpret printed floating-point data correctly.

C's scanf has only very limited functionality and in most cases can only read numbers printed by C's printf, not those printed by other programming languages. The same goes for Lisp's READ, which can only read numbers in Lisp format. [The same goes for the scan or read functions of any other programming language.]

So the original question still remains: Is there any solution known how to read floating-point numbers in any print-format correctly?

- edgar

Post by marcoxa » Sat Jun 25, 2011 1:14 pm

edgar-rft wrote:
So the original question still remains: Is there any solution known how to read floating-point numbers in any print-format correctly?

- edgar
Yes. Even in Common Lisp, provided that you actually define the "floating point" syntax. Common Lisp defines one for you. I am sure that www.kli.org has another idea of what constitutes a valid typographical representation of a floating number :ugeek:
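To illustrate what "defining the syntax" could look like, here is a minimal sketch in which the caller states the decimal and grouping characters explicitly. It was not posted in this thread; parse-decimal and its keyword arguments are hypothetical names. It assumes plain decimal notation only (no exponent part) and does not check that grouping separators come in blocks of three digits.

```lisp
;; Hypothetical helper: names and keyword arguments are illustrative only.
(defun parse-decimal (string &key (decimal-point #\.) (grouping nil))
  "Parse STRING as a decimal number with the given separator characters.
Returns an exact rational, or NIL if STRING does not match the syntax."
  (let* ((text (string-trim " " string))
         (signed (and (plusp (length text))
                      (find (char text 0) "+-")))
         (negative (eql signed #\-))
         (mantissa 0)
         (scale 1)
         (seen-point nil)
         (seen-digit nil))
    (loop for i from (if signed 1 0) below (length text)
          for ch = (char text i)
          do (cond ((digit-char-p ch)
                    (setf mantissa (+ (* 10 mantissa) (digit-char-p ch))
                          seen-digit t)
                    ;; Each digit after the decimal point divides by ten.
                    (when seen-point (setf scale (* 10 scale))))
                   ((and (char= ch decimal-point) (not seen-point))
                    (setf seen-point t))
                   ((and grouping (char= ch grouping)
                         seen-digit (not seen-point))
                    nil)                ; skip a grouping separator
                   (t (return-from parse-decimal nil))))
    (when seen-digit
      (let ((value (/ mantissa scale)))
        (if negative (- value) value)))))

(parse-decimal "1.000.000,0" :decimal-point #\, :grouping #\.)  ; => 1000000
(parse-decimal "1,000,000.0" :grouping #\,)                     ; => 1000000
(float (parse-decimal "0,5" :decimal-point #\,) 1d0)            ; => 0.5d0
```

Returning a rational keeps the parsed value exact, so the choice of float format, and therefore the rounding, is left to the caller.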

Cheers
Marco Antoniotti

Post by edgar-rft » Sat Jun 25, 2011 2:02 pm

marcoxa: I am sure that www.kli.org has another idea ... - LOL :lol: :lol: :lol:

In case there was any doubt: I am actually writing a parser that converts "special"-formatted float strings into Common Lisp syntax. I'm already halfway through, but please let me first find the wackiest bugs on my own, then I will show you the results...

The original parser, from which I have stolen a lot of code, is here: cs.cmu.edu/user/ai/lang/lisp/code/math/atof/atof.cl

Hopefully the Klingons will like it, too.

- edgar

Post by Paul » Sat Jun 25, 2011 5:55 pm

edgar-rft wrote: DSP machines (producing measurement data) usually work with assembly code, not with C, and there are still many measurement machines using 8-bit and 16-bit architectures. In an 8-bit architecture a "double" has 16 bits.
"Double", in this sense, doesn't mean "two hardware words", it means "more bits than 'float'" (presumably "double", but I suspect there are probably implementations out there where a "double" has fewer than that) -- do you know of a machine with 8-bit floating point hardware? What can you represent with that?

(I've used 8 bit machines with no hardware float support, and 48 bit floating point numbers implemented in software...the one and only float type it had, though, so that's a "float", not a "double").
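How many mantissa digits each float type really has can be checked from within Lisp. This is a minimal sketch, not part of the original post; float-format-summary is a hypothetical name, and the result depends entirely on the implementation it runs on.

```lisp
;; Hypothetical helper: list radix and mantissa digits for each of the four
;; Common Lisp float formats as this implementation provides them.
(defun float-format-summary ()
  (loop for (name sample) in '((short-float 1.0s0) (single-float 1.0f0)
                               (double-float 1.0d0) (long-float 1.0l0))
        collect (list name (float-radix sample) (float-digits sample))))

(float-format-summary)
```

On an implementation that maps single-float and double-float to IEEE 754 binary32 and binary64, one would expect 24 and 53 digits in radix 2 for those two entries; several formats may share one representation.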
Paul wrote: Can you give an example of an "inexact approximation of a number"?
What I meant was of course:

CL-USER> (float 1/3)
0.33333334f0
Okay, this is an approximation of the value of a number, and not an approximation of the number itself.
Depends what you think it represents. It's an exact number, it's just not the same number as 1/3.
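A quick way to see which exact number the float is: RATIONAL converts a float to the rational it represents without any rounding. The following is a minimal sketch; exact-value-and-error is a hypothetical helper name, and the values in the comments assume that single-float is IEEE 754 binary32.

```lisp
;; Hypothetical helper: return the exact rational a float stands for, plus its
;; exact difference from the number it was meant to represent.
(defun exact-value-and-error (float intended)
  (let ((exact (rational float)))
    (values exact (- exact intended))))

;; Force a single-float regardless of *read-default-float-format*.
(exact-value-and-error (float 1/3 1.0f0) 1/3)
;; With IEEE 754 binary32 single-floats:
;; => 11184811/33554432, 1/100663296

(= (float 1/3 1.0f0) 1/3)   ; => NIL, the comparison is done exactly
```

The denominator 33554432 is 2^25, so the float is a perfectly well-defined binary fraction that lies slightly above 1/3.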
