Low-Level File Access

schoppenhauer
Posts: 99
Joined: Sat Jul 26, 2008 2:30 pm
Location: Germany

Post by schoppenhauer » Mon Sep 13, 2010 3:24 pm

Is there any way (portable or not) to get fast low-level access to files in some Common Lisp implementation, for example via mmap or by mapping a file into a vector?

I have found http://blog.viridian-project.de/2009/07 ... with-mmap/ but I don't know whether this deref call is efficient in any way.

Is there any portable library?

ramarren
Posts: 613
Joined: Sun Jun 29, 2008 4:02 am
Location: Warsaw, Poland

Re: Low-Level File Access

Post by ramarren » Tue Sep 14, 2010 4:40 am

You can bind mmap using CFFI. This will be essentially the same as what the blog you link to describes, except that the blog uses an SBCL-specific layer, to which CFFI maps anyway. I doubt there is any more efficient way to access anything "low-level" than the FFI. I guess you could use array pinning (see for example how FFA is implemented for SBCL) and map to that address, but I have no idea if there is any actual gain in that.
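As a rough illustration of binding mmap through CFFI, here is a minimal sketch rather than a definitive implementation. It assumes a POSIX system, that CFFI is installed where ASDF can find it, and that the numeric constants have their usual Linux/macOS values. The helper name call-with-mapped-file and the %-prefixed wrapper names are made up for this example.

```lisp
(require "asdf")
(asdf:load-system "cffi")   ; assumption: CFFI is installed and visible to ASDF

;; Thin wrappers around the POSIX calls. open(2) is variadic, hence &rest.
(cffi:defcfun ("open" %open) :int (path :string) (flags :int) &rest)
(cffi:defcfun ("close" %close) :int (fd :int))
(cffi:defcfun ("mmap" %mmap) :pointer
  (addr :pointer) (len :unsigned-long) (prot :int) (flags :int)
  (fd :int) (offset :long))
(cffi:defcfun ("munmap" %munmap) :int (addr :pointer) (len :unsigned-long))

;; Assumed values, as found on Linux and macOS; check your system headers.
(defconstant +o-rdonly+ 0)
(defconstant +prot-read+ 1)
(defconstant +map-private+ 2)

(defun call-with-mapped-file (path function)
  "Map the non-empty file at PATH read-only and call FUNCTION with the
foreign pointer and the file size in bytes. Unmaps the file on exit."
  (let* ((name (namestring (truename path)))
         (size (with-open-file (in name :element-type '(unsigned-byte 8))
                 (file-length in)))
         ;; MAP_FAILED is (void *) -1, i.e. an address with all bits set.
         (map-failed (1- (expt 2 (* 8 (cffi:foreign-type-size :pointer)))))
         (fd (%open name +o-rdonly+)))
    (when (minusp fd)
      (error "open(2) failed for ~A" name))
    (unwind-protect
         (let ((ptr (%mmap (cffi:null-pointer) size
                           +prot-read+ +map-private+ fd 0)))
           (when (= (cffi:pointer-address ptr) map-failed)
             (error "mmap(2) failed for ~A" name))
           (unwind-protect (funcall function ptr size)
             (%munmap ptr size)))
      (%close fd))))
```

Inside FUNCTION, individual bytes can be read with (cffi:mem-aref ptr :uint8 index). Whether this is faster than a stream for a given workload is something to measure, not assume.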

nuntius
Posts: 538
Joined: Sat Aug 09, 2008 10:44 am
Location: Newton, MA

Re: Low-Level File Access

Post by nuntius » Wed Sep 15, 2010 8:53 am

Also, you can simply specify an unsigned-byte element type when opening the file. There may be some residual stream overhead, but it cuts out the major cost of character interpretation/conversion.
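A minimal sketch of what opening a file with a byte element type looks like. The function name first-bytes is hypothetical, and the example spells the type as '(unsigned-byte 8) to make the byte width explicit.

```lisp
(defun first-bytes (path n)
  "Return a list of up to N leading bytes of the file at PATH."
  ;; A binary stream: READ-BYTE returns integers, no character decoding.
  (with-open-file (in path :element-type '(unsigned-byte 8))
    (loop for i below n
          for byte = (read-byte in nil nil) ; NIL signals end of file
          while byte
          collect byte)))
```

Something like (first-bytes "image.png" 8) would be enough to inspect a file's magic number, for example.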

What prompted this question?

schoppenhauer
Posts: 99
Joined: Sat Jul 26, 2008 2:30 pm
Location: Germany

Re: Low-Level File Access

Post by schoppenhauer » Wed Sep 15, 2010 8:12 pm

nuntius wrote: Also, you can simply specify an unsigned-byte element type when opening the file. There may be some residual stream overhead, but it cuts out the major cost of character interpretation/conversion.

What prompted this question?
I am currently working with some large image files, and I wanted to do as much as possible directly in Lisp, so I was interested in that.
Sorry for my bad english.
Visit my blog http://blog.uxul.de/

nuntius
Posts: 538
Joined: Sat Aug 09, 2008 10:44 am
Location: Newton, MA

Re: Low-Level File Access

Post by nuntius » Wed Sep 15, 2010 9:25 pm

For reasonably large image files, ":element-type 'unsigned-byte" should be sufficient. Premature optimization and all that...

audwinc
Posts: 12
Joined: Thu Sep 02, 2010 11:46 am

Re: Low-Level File Access

Post by audwinc » Thu Sep 16, 2010 10:16 am

Just to clarify for my learning: with an :element-type of 'unsigned-byte in place, would I use make-array to allocate a vector the size of the file and then read-sequence into it in one go to get the fastest I/O for this kind of binary file?

nuntius
Posts: 538
Joined: Sat Aug 09, 2008 10:44 am
Location: Newton, MA

Re: Low-Level File Access

Post by nuntius » Thu Sep 16, 2010 2:32 pm

audwinc wrote: Just to clarify for my learning: with an :element-type of 'unsigned-byte in place, would I use make-array to allocate a vector the size of the file and then read-sequence into it in one go to get the fastest I/O for this kind of binary file?
Yes, make a vector matching the file's :element-type and size; then read-sequence should be nearly optimal for portable code. I wouldn't worry about doing better until you have something working. (Numerous cool projects die because people try squeezing speed before getting something working. "fast*0" is much less than "slow*something")
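The approach just described could be sketched as follows, using only standard Common Lisp. The function name slurp-bytes is made up for this example.

```lisp
(defun slurp-bytes (path)
  "Read the whole file at PATH into a vector of (unsigned-byte 8)."
  (with-open-file (in path :element-type '(unsigned-byte 8))
    ;; For a binary stream, FILE-LENGTH is measured in stream elements,
    ;; which are bytes here.
    (let* ((buffer (make-array (file-length in)
                               :element-type '(unsigned-byte 8)))
           ;; READ-SEQUENCE returns the index of the first element
           ;; that was not updated.
           (end (read-sequence buffer in)))
      (if (= end (length buffer))
          buffer
          (subseq buffer 0 end)))))
```

The element type of the vector matches the element type of the stream, which gives the implementation a chance to do a block copy instead of reading element by element.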

Note that CL libraries already exist for reading most image formats.

schoppenhauer
Posts: 99
Joined: Sat Jul 26, 2008 2:30 pm
Location: Germany

Re: Low-Level File Access

Post by schoppenhauer » Thu Sep 16, 2010 2:39 pm

Reading into a sequence needs a lot of memory. Using mmap is faster and normally simpler; otherwise one would have to implement random access oneself. I wonder why mmapped files accessible via normal sequence operations are not already a common extension to CL. I mean, it's something obvious and should be easy to implement, and most platforms do have something similar to mmap; for the others, one can simply write a wrapper for random access.
Sorry for my bad english.
Visit my blog http://blog.uxul.de/
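For comparison, random access without mmap can be sketched in portable Common Lisp with file-position, so that only the requested slice is held in memory. This is an illustration under the assumption that the stream supports repositioning (ordinary files do); the name read-byte-range is hypothetical.

```lisp
(defun read-byte-range (path start count)
  "Return up to COUNT bytes of the file at PATH, beginning at offset START."
  (with-open-file (in path :element-type '(unsigned-byte 8))
    (let* ((size (file-length in))
           (start (min start size))            ; clamp to the end of the file
           (buffer (make-array (min count (- size start))
                               :element-type '(unsigned-byte 8))))
      ;; Reposition the stream instead of reading everything before START.
      (file-position in start)
      (let ((end (read-sequence buffer in)))
        (subseq buffer 0 end)))))
```

This does not give sequence-style access to the whole file the way a mapping would, but it keeps memory use proportional to the slice rather than to the file.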

gugamilare
Posts: 406
Joined: Sat Mar 07, 2009 6:17 pm
Location: Brazil

Re: Low-Level File Access

Post by gugamilare » Thu Sep 16, 2010 3:14 pm

schoppenhauer wrote: Reading into a sequence needs a lot of memory. Using mmap is faster and normally simpler; otherwise one would have to implement random access oneself. I wonder why mmapped files accessible via normal sequence operations are not already a common extension to CL. I mean, it's something obvious and should be easy to implement, and most platforms do have something similar to mmap; for the others, one can simply write a wrapper for random access.
It should be up to the implementation to abstract low-level optimizations away. I think some implementations, like SBCL, already have file access that is not much slower than mmap, or possibly even use it. Are you actually having trouble with file access speed, or are you just worried in advance? If it is the latter, write your program and worry about that later; otherwise, at least run a benchmark before concluding that mmap actually gives you a performance gain over CL's standard functions.

Then, if you really need to, you should try to implement file reading in C, compile it to a dynamic library and wrap it with CFFI, as suggested by Ramarren. And, like he said, you can even pass a Lisp array as a pointer and manipulate it in C. If you don't mind using vectors of '(unsigned-byte 8), there are portable functions available in CFFI that allow you to do that.
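A small sketch of passing a Lisp byte vector to C through CFFI. It assumes CFFI is installed where ASDF can find it, and it uses the C library's memset purely as a stand-in for a custom C reading routine; fill-bytes-in-c is a hypothetical name.

```lisp
(require "asdf")
(asdf:load-system "cffi")   ; assumption: CFFI is installed and visible to ASDF

(defun fill-bytes-in-c (vector value)
  "Let C's memset overwrite every element of VECTOR with VALUE (0-255).
VECTOR should come from CFFI:MAKE-SHAREABLE-BYTE-VECTOR."
  ;; PTR is only valid inside the body of this macro.
  (cffi:with-pointer-to-vector-data (ptr vector)
    (cffi:foreign-funcall "memset"
                          :pointer ptr
                          :int value
                          :unsigned-long (length vector)
                          :pointer))
  vector)

;; Usage: allocate a vector CFFI can share with C, then fill it from C.
(defparameter *example-bytes*
  (fill-bytes-in-c (cffi:make-shareable-byte-vector 16) 255))
```

In a real program the memset call would be replaced by a call to your own C function taking a buffer pointer and a length.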
