a comparison sort function..

Discussion of Common Lisp
Post Reply
vimsical
Posts: 6
Joined: Wed Oct 27, 2010 10:06 pm

a comparison sort function..

Post by vimsical » Thu Dec 01, 2011 11:45 pm

As an exercise to LISP, I would like to write an in place sorting function to sort a simple vector of number, but I would like to make it general enough that it can also sort a vector of a list like:

Code: Select all

(vector '(1 . a) '(2 . b) '(1.5 . c))
So, I am thinking writing something like this:

Code: Select all

(defun my-sort (vec &key (idl 0) (idr (1- (length vec))) (less-fn #'<))
   " sort vec from range index idl to idr using less-fn as comparison function "
 ...
So that user of this function can pass a comparison function as argument like this:

Code: Select all

(my-sort vec :less-fn (lambda (x y) (< (car x) (car y))))
As a beginner, I would like to ask is passing a comparison function as an argument the way to do it? Are there more LISPy ways? Thanks.

smithzv
Posts: 94
Joined: Wed Jul 23, 2008 11:36 am

Re: a comparison sort function..

Post by smithzv » Fri Dec 02, 2011 1:19 am

vimsical wrote: As a beginner, I would like to ask is passing a comparison function as an argument the way to do it? Are there more LISPy ways? Thanks.
Yes, it is. Also there are more Lispy ways to do it. The SORT function in CL also includes a "key" function, which simplifies the comparison predicate. This is clear if you consider how you would sort your example using CL:SORT.

Only other Lispy advice I can think of is that Lisp functions, by convention, try to avoid the "modify argument in place" style in preference of the "the return value is the result but the argument can no longer be considered valid" style. This doesn't mean much for your application (sorting a vector). All this means is that the sorted array needs to be the return value.

vimsical
Posts: 6
Joined: Wed Oct 27, 2010 10:06 pm

Re: a comparison sort function..

Post by vimsical » Fri Dec 02, 2011 3:58 am

smithzv wrote:Only other Lispy advice I can think of is that Lisp functions, by convention, try to avoid the "modify argument in place" style in preference of the "the return value is the result but the argument can no longer be considered valid" style. This doesn't mean much for your application (sorting a vector). All this means is that the sorted array needs to be the return value.
Pondering what you mean. (my-sort vec ...) shouldn't sort in place? And my-sort looks like this:

Code: Select all

(defun my-sort ( vec &key less-fn ...)
  ;; quicksort vec
  vec)
So it sorts the array in-place, then return it.

I just check that (sort vec ...) also modifies vec.

smithzv
Posts: 94
Joined: Wed Jul 23, 2008 11:36 am

Re: a comparison sort function..

Post by smithzv » Fri Dec 02, 2011 4:32 am

vimsical wrote:Pondering what you mean. (my-sort vec ...) shouldn't sort in place? And my-sort looks like this:
That depends on what you mean. If, by sort in place, you mean it moved elements around inside the vector, then, it is fine if it sorts in place (it is often times a nice thing memory usage wise to guarantee this). If you mean it sorts the value that is held in the vec symbol and after you execute (my-sort vec ...), vec is sorted, that is generally considered bad style (though often times quite possible). It is better to return the result. This latter style is common in the C language.

Functions like SORT and a bunch that start with the letter "N" are destructive functions in that they mangle their input in order to create their output. This is an point of often confusion for new Lisp users.

Code: Select all

(let ((data (generate-data)))
  ;; data is valid at this point
  (print (sort #'< data))
  ;; data is no longer considered valid.  In general you should not do the following
  (print data))
Using DATA after the sort is dangerous. If you are sorting a list with SORT it will probably be corrupted. When you are sorting a vector "in place", you will almost certainly use the same vector for your result, and the user might get away with using data after it has been sorted (even when using CL:SORT). However, for consistency's sake, it's nice if functions return their results.

vimsical
Posts: 6
Joined: Wed Oct 27, 2010 10:06 pm

Re: a comparison sort function..

Post by vimsical » Fri Dec 02, 2011 5:55 am

smithzv wrote:Using DATA after the sort is dangerous. If you are sorting a list with SORT it will probably be corrupted. When you are sorting a vector "in place", you will almost certainly use the same vector for your result, and the user might get away with using data after it has been sorted (even when using CL:SORT). However, for consistency's sake, it's nice if functions return their results.
So after

Code: Select all

(setf y (destructive-function x))
one really have to assume that x is no longer valid. The fact that for sort and nreverse y is the same as x should be considered a "lucky" situation.

smithzv
Posts: 94
Joined: Wed Jul 23, 2008 11:36 am

Re: a comparison sort function..

Post by smithzv » Fri Dec 02, 2011 9:13 am

Yes, I believe this is the truth both in the wording and the spirit of CL.

To go a bit further, but perhaps not pertinent. As I alluded to, CL is flexible enough that you don't need to do this (as say, a purely functional language wouldn't be) so there might be cases where you decide you want to write a function that modifies data "in place" (in the sense where we don't have to use the result). But even if you wanted to write this way, you should still return some meaningful result since, well, why not?

vimsical
Posts: 6
Joined: Wed Oct 27, 2010 10:06 pm

Re: a comparison sort function..

Post by vimsical » Sat Dec 03, 2011 1:09 pm

smithzv wrote:Yes, I believe this is the truth both in the wording and the spirit of CL.

To go a bit further, but perhaps not pertinent. As I alluded to, CL is flexible enough that you don't need to do this (as say, a purely functional language wouldn't be) so there might be cases where you decide you want to write a function that modifies data "in place" (in the sense where we don't have to use the result). But even if you wanted to write this way, you should still return some meaningful result since, well, why not?
I can think of at least one reason why not, and it could be an invalid one, given my limited knowledge of CL. It boils down to whether return is by-value or by-reference. If a copy is returned, making copy cost computing time. If a reference is returned, there is the danger of unsuspecting user modifying the original object by modifying the returned reference. In language like C, the syntax suggests (but not guarantee) behavior.

Code: Select all

type f1(type *obj) { ... }
type* f2(type *obj) {... }
The first one suggests returning a copy some type object. The second suggests returning a reference to a type object, it could points to the same one obj points to and one should use with caution.

In CL, who knows when I do:

Code: Select all

(let (x (mystery-function obj))
  (things-I-do-to x))
whether I will be also modifying obj.

smithzv
Posts: 94
Joined: Wed Jul 23, 2008 11:36 am

Re: a comparison sort function..

Post by smithzv » Sat Dec 03, 2011 1:46 pm

This is a good point and something Lispers need to constantly keep in mind. CL is a pass by-reference language. I believe most Lisps, including Scheme, are pass by reference for various reasons inherent in basically all computer science, actually (the only exception I can think of is NewLisp). Take a look at discussions regarding equality and generic copying of objects for some insight as to why pass-by-value can be a hairy issue. (Note that I'm not talking about the extreme pass by reference from languages like Fortran/C++ where your variables inside the function are not even scoped to the body of the function. In CL, setting a variable to a value doesn't change the value of the variable in the call frame, in fact you have to use some underhanded tricks to get that to work. By pass by reference, I mean basically the C sense of the phrase. Nothing is copied and modifying constituent parts will change things in other stack frames. Edit: I now remember that C does allow you to pass structures by value (shallow copy) but it is rarely used. I'm thinking of how programmers choose to pass everything that isn't a base type in C by address)

This means, as you have correctly noted, that the user must be wary of return values. Particularly, it is the case that if you start mutating bits of returned values, you may cause very odd behavior. For instance, when I was first starting out, it was utterly amazing to me that:

Code: Select all

* (defun return-list () '(1 2 3))
RETURN-LIST

* (return-list)
(1 2 3)

* (let ((list (return-list)))
    (setf (second list) -2))
-2
* (return-list)
(1 -2 3)
In this case, SBCL is closing over the list '(1 2 3) and when I mutate it, I effectively alter the functions behavior, permanently. I even believe that this is not a predictable feature as a different Lisp implementation might interpret '(1 2 3) to build a new list each time it is executed (though you can get this behavior predictably by explicitly closing over the list). Also, I am basically altering a constant piece of data, which is a no-no and SBCL would have warned me about if it was able to detect it.

A good rule to follow is that unless a function API documentation explicitly tells you it is returning freshly consed (Lisp lingo for allocated) data, you should assume that it is not yours to mutate. An even better rule is, treat lists as immutable data types (don't change values in them, build new ones) and everything else as a mutable type. Really the whole thing is a bit of a hornets' nest but in practice it doesn't bother anyone basically because of these two rules that basically everybody follows.

Post Reply