Low-Level File Access
Posted: Mon Sep 13, 2010 3:24 pm
by schoppenhauer
Is there any way (portable or not) to get fast low-level access to files in some Common Lisp implementation? Something like mmap, or mapping a file into a vector?
I have found http://blog.viridian-project.de/2009/07 ... with-mmap/ but I don't know whether its deref call is efficient in any way.
Is there a portable library?
Re: Low-Level File Access
Posted: Tue Sep 14, 2010 4:40 am
by ramarren
You can bind mmap using CFFI. This is essentially what the blog post you link to describes, except that the post uses the SBCL-specific layer, which CFFI maps onto anyway. I doubt there is any more efficient way to access anything "low-level" than the FFI. I guess you could use array pinning (see, for example, how FFA is implemented for SBCL) and map to that address, but I have no idea whether there is any actual gain in that.
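A rough sketch of what binding mmap through CFFI might look like. This is not the code from the linked blog post. It assumes Quicklisp and CFFI are available and a 64-bit Linux system: the numeric constants are the Linux values, the size_t and off_t arguments are declared as long-sized integers, and the variadic open is called as if it took two fixed arguments. The names call-with-mapped-file, %mmap, %munmap and %close are made up for illustration.

```lisp
;; Assumption: Quicklisp is installed; any other way of loading CFFI works too.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (ql:quickload :cffi))

;; Assumption: Linux values. Other platforms use different numbers.
(defconstant +o-rdonly+ 0)
(defconstant +prot-read+ 1)
(defconstant +map-private+ 2)

;; Assumption: size_t and off_t are 64 bits wide (64-bit Linux).
(cffi:defcfun ("mmap" %mmap) :pointer
  (addr :pointer) (size :unsigned-long) (prot :int)
  (flags :int) (fd :int) (offset :long))

(cffi:defcfun ("munmap" %munmap) :int
  (addr :pointer) (size :unsigned-long))

(cffi:defcfun ("close" %close) :int
  (fd :int))

(defun call-with-mapped-file (path size function)
  "Map the first SIZE bytes (SIZE > 0) of PATH read-only, call FUNCTION
with the foreign pointer, and unmap afterwards."
  (let ((fd (cffi:foreign-funcall "open"
                                  :string (namestring path)
                                  :int +o-rdonly+
                                  :int)))
    (when (minusp fd)
      (error "open failed for ~A" path))
    (unwind-protect
         (let ((ptr (%mmap (cffi:null-pointer) size
                           +prot-read+ +map-private+ fd 0)))
           ;; MAP_FAILED is (void *) -1, i.e. all bits set.
           (when (= (cffi:pointer-address ptr) (ldb (byte 64 0) -1))
             (error "mmap failed for ~A" path))
           (unwind-protect (funcall function ptr)
             (%munmap ptr size)))
      (%close fd))))
```

Inside FUNCTION, individual bytes can be read with (cffi:mem-aref ptr :uint8 index). The pointer must not be used after the call returns, because the mapping is gone by then.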
Re: Low-Level File Access
Posted: Wed Sep 15, 2010 8:53 am
by nuntius
Also, you can simply specify an unsigned-byte element type when opening the file. There may be some residual stream overhead, but it cuts out the major cost of character interpretation/conversion.
What prompted this question?
Re: Low-Level File Access
Posted: Wed Sep 15, 2010 8:12 pm
by schoppenhauer
nuntius wrote:Also, you can simply specify an unsigned-byte element type when opening the file. There may be some residual stream overhead, but it cuts out the major cost of character interpretation/conversion.
What prompted this question?
I am currently working with some large image files and wanted to do as much as possible directly in Lisp, so I was interested in this.
Re: Low-Level File Access
Posted: Wed Sep 15, 2010 9:25 pm
by nuntius
For reasonably large image files, ":element-type 'unsigned-byte" should be sufficient. Premature optimization and all that...
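As a minimal sketch of the binary-stream approach: the function name read-leading-bytes is made up, and the element type is spelled out as '(unsigned-byte 8) on the assumption that octets are wanted (a bare 'unsigned-byte leaves the width up to the implementation).

```lisp
(defun read-leading-bytes (path count)
  "Return a list of up to COUNT bytes from the start of the file at PATH."
  (with-open-file (in path :element-type '(unsigned-byte 8))
    (loop for i below count
          ;; READ-BYTE returns NIL at end of file instead of signalling.
          for byte = (read-byte in nil nil)
          while byte
          collect byte)))
```

This kind of thing is handy for checking a magic number at the start of an image file before deciding how to read the rest.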
Re: Low-Level File Access
Posted: Thu Sep 16, 2010 10:16 am
by audwinc
Just to clarify for my learning, with :element-type of 'unsigned-byte in place, would I make-array something the size of the file and then read-sequence that block size for the fastest I/O for this kind of binary file?
Re: Low-Level File Access
Posted: Thu Sep 16, 2010 2:32 pm
by nuntius
audwinc wrote:Just to clarify for my learning, with :element-type of 'unsigned-byte in place, would I make-array something the size of the file and then read-sequence that block size for the fastest I/O for this kind of binary file?
Yes, make a vector matching the file's :element-type and size; then read-sequence should be nearly optimal for portable code. I wouldn't worry about doing better until you have something working. (Numerous cool projects die because people try squeezing speed before getting something working. "fast*0" is much less than "slow*something")
Note that CL libraries already exist for reading most image formats.
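The make-array plus read-sequence approach described above could be sketched roughly like this. The function name slurp-bytes is made up, and '(unsigned-byte 8) is assumed as the element type.

```lisp
(defun slurp-bytes (path)
  "Read the whole file at PATH into a vector of octets."
  (with-open-file (in path :element-type '(unsigned-byte 8))
    (let* ((buffer (make-array (file-length in)
                               :element-type '(unsigned-byte 8)))
           ;; READ-SEQUENCE returns the index of the first element
           ;; that was not filled in.
           (end (read-sequence buffer in)))
      (if (= end (length buffer))
          buffer
          (subseq buffer 0 end)))))
```

The whole file ends up in memory at once, so this only suits files that comfortably fit.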
Re: Low-Level File Access
Posted: Thu Sep 16, 2010 2:39 pm
by schoppenhauer
Reading into a sequence needs a lot of memory. Using mmap is faster and normally simpler; otherwise one has to implement random access by hand. I wonder why mmapped files accessible through the normal sequence operations are not already a common extension to CL. It seems obvious and should be easy to implement: most platforms have something similar to mmap, and for the others one can simply write a wrapper around random access.
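For comparison, random access without mmap can be done portably with file-position. A small sketch follows; the name read-byte-range is made up, and behaviour when seeking past the end of the file is implementation-dependent.

```lisp
(defun read-byte-range (path offset count)
  "Return up to COUNT octets of the file at PATH, starting at byte OFFSET."
  (with-open-file (in path :element-type '(unsigned-byte 8))
    (let ((buffer (make-array count :element-type '(unsigned-byte 8))))
      ;; On an octet stream the file position is measured in bytes.
      (unless (file-position in offset)
        (error "Cannot seek to ~D in ~A" offset path))
      (let ((end (read-sequence buffer in)))
        ;; Trim the result if the file ended before COUNT bytes were read.
        (subseq buffer 0 end)))))
```

Only COUNT bytes are held in memory at a time, at the cost of a seek and a read per access.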
Re: Low-Level File Access
Posted: Thu Sep 16, 2010 3:14 pm
by gugamilare
schoppenhauer wrote:Reading into a sequence needs a lot of memory. Using mmap is faster and normally simpler; otherwise one has to implement random access by hand. I wonder why mmapped files accessible through the normal sequence operations are not already a common extension to CL. It seems obvious and should be easy to implement: most platforms have something similar to mmap, and for the others one can simply write a wrapper around random access.
It should be up to the implementation to abstract low-level optimizations away. I think some implementations, like SBCL, already have file access that is not much slower than mmap, or possibly even use it. Are you actually having trouble with file access speed, or are you just worried in advance? If the latter, write your program and worry about that later; otherwise, at least run some benchmarks before concluding that mmap actually gives you a performance gain over CL's own functions.
Then, if you really need to, you could implement the file reading in C, compile it to a dynamic library and wrap it with CFFI, as Ramarren suggested. And, as he said, you can even pass a Lisp array as a pointer and manipulate it in C. If you don't mind using vectors of '(unsigned-byte 8), there are portable functions available in CFFI that allow you to do that.
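A small sketch of passing a Lisp byte vector to C through CFFI, using cffi:make-shareable-byte-vector and cffi:with-pointer-to-vector-data. It assumes Quicklisp and CFFI are available and that size_t is as wide as unsigned long. The libc function memset stands in for whatever C routine would actually fill the buffer, and the name fill-bytes-via-c is made up.

```lisp
;; Assumption: Quicklisp is installed; any other way of loading CFFI works too.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (ql:quickload :cffi))

(defun fill-bytes-via-c (size value)
  "Return a fresh octet vector of SIZE elements, each set to VALUE by memset."
  (let ((vector (cffi:make-shareable-byte-vector size)))
    ;; The vector's storage is only guaranteed to stay put inside this form.
    (cffi:with-pointer-to-vector-data (ptr vector)
      (cffi:foreign-funcall "memset"
                            :pointer ptr
                            :int value
                            :unsigned-long size ; assumption: size_t width
                            :pointer))
    vector))
```

A real file-reading routine written in C would be called the same way, with the pointer and the buffer size as arguments.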