Passing values by reference

Discussion of Common Lisp
vityok
Posts: 20
Joined: Fri Jul 11, 2008 6:20 am
Location: Kyiv, Ukraine
Contact:

Passing values by reference

Post by vityok » Mon Jul 14, 2008 4:30 am

Hello,

I would like to ask whether it is possible to pass parameters to functions by reference?

As far as I understand, function call parameters (even vectors, sequences, etc.) are passed by value (the form is evaluated, yielding the vector, sequence, etc. and the value is passed as a function parameter).

It seems to me, that by passing vectors and sequences by reference, it would be possible to reduce cons'ing and memory usage.

What are the general approaches to reducing memory usage by Lisp applications?

With best regards

ramarren
Posts: 613
Joined: Sun Jun 29, 2008 4:02 am
Location: Warsaw, Poland
Contact:

Re: Passing values by reference

Post by ramarren » Mon Jul 14, 2008 5:28 am

vityok wrote:As far as I understand, function call parameters (even vectors, sequences, etc.) are passed by value (the form is evaluated, yielding the vector, sequence, etc. and the value is passed as a function parameter).
I'm afraid you understand wrong (unless it's me). There is no implicit copying in Common Lisp. Simplifying (hopefully), there are two kinds of objects: immutable (numbers, characters), which are sort of passed by value, and everything else, which is passed by reference. One must remember that CL has no lists, only conses, which could lead to problems at some point, but vectors are fine.

On the other hand, places cannot be passed to functions, since they are not first class objects. But I am not sure if this has anything to do with your question.

vityok
Posts: 20
Joined: Fri Jul 11, 2008 6:20 am
Location: Kyiv, Ukraine
Contact:

Re: Passing values by reference

Post by vityok » Mon Jul 14, 2008 5:47 am

Ramarren wrote:But I am not sure if this has anything to do with your question.
Actually, I am trying to parse CSV files stored in GZIP archives, calculate simple statistics and output results.

The calculations are simple (like computing mean value over some samples) and, generally speaking, could be made in constant space:
  • one vector to hold the sum, and
  • one vector to hold the read sample
  • some helper variables
However, processing a 515K GZIP archive with approx 4600 lines of CSV records cons'es approx. 40M of memory. This is not yet critical as the program was already run with 30M GZIP files, but there is a 300M file on the way and I would like to reduce memory consumption as much as it is reasonable before processing this archive.

ramarren
Posts: 613
Joined: Sun Jun 29, 2008 4:02 am
Location: Warsaw, Poland
Contact:

Re: Passing values by reference

Post by ramarren » Mon Jul 14, 2008 7:01 am

vityok wrote:I would like to reduce memory consumption as much as it is reasonable before processing this archive.
Well, it is hard to say anything specific without looking at the code, but note that with generational garbage collection allocating and the throwing away object is pretty fast, so usually you only need to care about things which remain allocated thorough the run. Obviously, decompressing a 300 MB file in memory might not be a good idea, but if it is streamed, I don't see how actual memory footprint would depend on size of the file.

findinglisp
Posts: 447
Joined: Sat Jun 28, 2008 7:49 am
Location: Austin, TX
Contact:

Re: Passing values by reference

Post by findinglisp » Mon Jul 14, 2008 8:18 am

vityok wrote:I would like to ask whether it is possible to pass parameters to functions by reference?

As far as I understand, function call parameters (even vectors, sequences, etc.) are passed by value (the form is evaluated, yielding the vector, sequence, etc. and the value is passed as a function parameter).

It seems to me, that by passing vectors and sequences by reference, it would be possible to reduce cons'ing and memory usage.
The best way to think about this is that all Lisp objects are allocated in the heap when they are created (even integers and characters), and they are always referenced by a pointer. The pointer is always passed to each function, so there isn't any copying involved (except for perhaps the pointer onto a stack, etc.). Now, putting everything into the heap creates garbage needlessly and is inefficient for some small data types (notably integers). Thus, most implementations actually fuse the small data types with the pointer for the sake of efficiency. A few of the low-order bits (called tag bits) are typically used to determine whether the pointer value is really a pointer value to something on the heap or a small immediate value. When Ramarren says that there are two types of objects, that's what he's referring to.

Now, because objects are always passed by reference, if you're working with a larger, aggregate object (list, vector, CLOS object, etc.), you can actually modify that larger object using mutable functions. Thus, it's quite common to pass vectors into functions where the data values in the vector are written with new values. With lists, because there is a lot of underlying structure there (linked cons cells), you have to be a little more careful about modifying them than with vectors. If you change the head of the list for any reason, the reference that the calling function had passed in is no longer valid and so you typically have to return a reference to the new list. Look at NCONC, for instance. It destructively joins the two lists, but it returns a reference to the new list.

Code: Select all

(setf foo '(a b c))
(setf bar '(d e f))
(nconc foo bar)   ; <= error!!! The new head of the list may not be FOO
This code might work sometimes, on some implementations, but the right way to use this is:

Code: Select all

(setf foo '(a b c))
(setf bar '(d e f))
(setf foo (nconc foo bar))  ; <= correct
What are the general approaches to reducing memory usage by Lisp applications?
Generally, you can reduce memory consumption in similar ways as you would any GC'd language such as Java--by reusing data structures and destructively modifying them. You'll find a lot of duality in the Lisp library, where there are both pure functional library functions that create new lists (e.g. APPEND) as well as a destructive versions of the same operation (e.g. NCONC). As Ramarren said, though, you'd be better off not succumbing to premature optimization. Garbage does take time to collect, but depending on what your program is doing, that may or may not be very significant. Don't try to program Lisp like you would C, for instance. You'll end up creating a lot of bugs for yourself (accidentally modify something that you shouldn't) and it might not bring as much benefit as you'd like.

The SBCL implementation of Common Lisp has a good compiler, a good optimizer, and a generational GC. I would write your implementation in the most natural style and then use the profiler to figure out where the slow parts are after you have determined that the natural implementation style doesn't meet your performance goals.
Cheers, Dave
Slowly but surely the world is finding Lisp. http://www.findinglisp.com/blog/

tayloj

Re: Passing values by reference

Post by tayloj » Mon Jul 14, 2008 9:44 am

findinglisp wrote:
vityok wrote:With lists, because there is a lot of underlying structure there (linked cons cells), you have to be a little more careful about modifying them than with vectors. If you change the head of the list for any reason, the reference that the calling function had passed in is no longer valid and so you typically have to return a reference to the new list. Look at NCONC, for instance. It destructively joins the two lists, but it returns a reference to the new list.

Code: Select all

(setf foo '(a b c))
(setf bar '(d e f))
(nconc foo bar)   ; <= error!!! The new head of the list may not be FOO
This code might work sometimes, on some implementations, but the right way to use this is:

Code: Select all

(setf foo '(a b c))
(setf bar '(d e f))
(setf foo (nconc foo bar))  ; <= correct
Not to be too pedantic, for it is, in general, good style to make sure that the result of a destructive function is preserved, e.g., saving the results of SORT, or DELETE, whose output may not be eq to their input. From the HyperSpec entry on NCONC, however,
nconc is defined using the following recursive relationship:

Code: Select all

(nconc) =>  ()
(nconc nil . lists) ==  (nconc . lists)
(nconc list) =>  list
(nconc list-1 list-2) ==  (progn (rplacd (last list-1) list-2) list-1)
(nconc list-1 list-2 . lists) ==  (nconc (nconc list-1 list-2) . lists)
which means that the result of (nconc l1 l2 ... ln) will be eq to the first non-nil li (and, of course, nil if they are all empty lists, and nil is eq to nil). E.g.,

Code: Select all

(setf a '())
(setf b (list 1 2 3))
(setf c (nconc a b))
(eq a c) => false
But it's always well defined which of nconc's arguments are eq to its result.

death
Posts: 17
Joined: Sat Jun 28, 2008 1:44 am

Re: Passing values by reference

Post by death » Mon Jul 14, 2008 10:36 am

Just a quick note: arguments are always passed by value, i.e. copied, in Common Lisp, but here "arguments" denotes "implicit pointers to objects" (disregarding optimization tricks). That is why the function below will never change whatever object it is called with.

Code: Select all

(defun foo (x)
  (setf x 42))

findinglisp
Posts: 447
Joined: Sat Jun 28, 2008 7:49 am
Location: Austin, TX
Contact:

Re: Passing values by reference

Post by findinglisp » Mon Jul 14, 2008 12:17 pm

death wrote:Just a quick note: arguments are always passed by value, i.e. copied, in Common Lisp, but here "arguments" denotes "implicit pointers to objects" (disregarding optimization tricks). That is why the function below will never change whatever object it is called with.

Code: Select all

(defun foo (x)
  (setf x 42))
Yes, you're right. I said that wrong. The references are always copied, so it's call by value, but if the reference is for a whole list, the list itself is not copied. So, as you say in your example, x is a binding to a value. But if you did this:

Code: Select all

(defun foo (x)
    (setf (car x) 42))
Then it would modify the car of the first cons cell referenced by x.
Cheers, Dave
Slowly but surely the world is finding Lisp. http://www.findinglisp.com/blog/

findinglisp
Posts: 447
Joined: Sat Jun 28, 2008 7:49 am
Location: Austin, TX
Contact:

Re: Passing values by reference

Post by findinglisp » Mon Jul 14, 2008 12:20 pm

tayloj wrote: Not to be too pedantic, for it is, in general, good style to make sure that the result of a destructive function is preserved, e.g., saving the results of SORT, or DELETE, whose output may not be eq to their input.
Yea, that was a bad example to grab out of thin air. I just reached for the first destructive function that came to mind. Something like NREVERSE would have been a better example, or SORT and DELETE as you mention.
Cheers, Dave
Slowly but surely the world is finding Lisp. http://www.findinglisp.com/blog/

vityok
Posts: 20
Joined: Fri Jul 11, 2008 6:20 am
Location: Kyiv, Ukraine
Contact:

Re: Passing values by reference

Post by vityok » Tue Jul 15, 2008 5:34 am

Thanks everybody for helping me.

It seems that the issue becomes more clear to me. As far as I understood, the vectors I pass to functions are not copied, instead, references are passed.

I have also conducted a very simple test:

Code: Select all

CL-USER> (setq a (make-sequence '(vector float) 3 :initial-element 0))
#(0 0 0)

CL-USER> (defun dumb (d)
	   (setf (elt d 0) 1))
DUMB
CL-USER> (dumb a)
1
CL-USER> a
#(1 0 0)
With best regards,

Post Reply