Lesson of the Day

Post by **gugamilare** » Sat Sep 11, 2010 12:03 pm

My code is at Github, there is a simple version of quicksort there (which only works for simple vectors). I still did not test it against SBCL's stable-sort, I have some optimizations to do before that. There is also a quicksort which uses the "mean of means" as a pivot and is guaranteed it to be n*log(n), but makes it become slower than (my version of) merge sort. I'll be working on optimizations and keep you posted

Regards,
Gustavo.

Post by **gugamilare** » Sat Sep 11, 2010 12:46 pm

By the way, you get even better results if you use

Code: Select all

(defun fixnum< (a b)
  (< (the fixnum a) (the fixnum b)))

(stable-sort vector #'fixnum<)

Because passing the #'< to sort allocates a list (because it uses &rest). This is a known limitation with SBCL which is posted on SBCL's launchpad.

Tom · Post by **Tom** » Sun Sep 19, 2010 10:07 am

Warren Wilkinson wrote:You know your problem domain better than I do; if you say that you've got objects that are best described with optional slots, I'll believe it. In anycase, this thread seemed an appropriate place for some function timing information I gleaned from SBCL. If you disagree, I'll move it elsewhere.

I think what we are having here is a failure to communicate. :-)

Anyway, the specific sorting algorithm of an implementation is a fine topic for this thread. I never bother worrying about the specific implementation of SORT and friends, so it was interesting to see some details.

Thanks,

~ Tom

Warren Wilkinson wrote: Sorting In SBCL

In my application I store a lot of data presorted on disk; once there are enough updates, the updates and data are merged and rewritten to disk. In the mean time, however, I must still merge the data I've loaded from the disk and the in memory patches together everytime users request any range of that data.

I had considered storing a lazily computed 'final location' with each update, so I avoid needlessly recomputing that location (which I initially would have done the simplist way: linear search). Then I had a better idea: Since the data will be largely in order anyway, why not just cons it then sort?

How well this works depends on the sorting algorithm and (perhaps) the format of the data (list vs vector). Before taking this route, I checked out sort in the SBCL git repository, and ran some timing tests. SBCL uses merge sort which is brilliant because its optimal for linked list, and as far as I can tell it works great when data is mostly sorted. Its fast for both lists and vectors, whichever I prefer to use:
Code: Select all
(defun worsen (thing) (let ((a (random 800)) (b (random 800)))
			(let ((temp (elt thing b)))
			  (setf (elt thing b) (elt thing a)
				(elt thing a) temp))))

(defvar *badness0* (iota 800))
(defvar *badness10* (iota 800))
(defvar *badness100* (iota 800))
(defvar *badness1000* (iota 800))
(loop repeat 10 do (worsen *badness10*))
(loop repeat 100 do (worsen *badness100*))
(loop repeat 1000 do (worsen *badness1000*))

(defvar *vbadness0* (make-array 800 :initial-contents (iota 800)))
(defvar *vbadness10* (make-array 800 :initial-contents (iota 800)))
(defvar *vbadness100* (make-array 800 :initial-contents (iota 800)))
(defvar *vbadness1000* (make-array 800 :initial-contents (iota 800)))
(loop repeat 10 do (worsen *vbadness10*))
(loop repeat 100 do (worsen *vbadness100*))
(loop repeat 1000 do (worsen *vbadness1000*))

(defun time-sort (array &aux (copies (loop repeat 10000 collect (copy-seq array))))
  (time (dolist (copy copies) (sort copy #'<))))

(time-sort *badness0*) ;; 5.048 seconds
(time-sort *badness10*) ;; 6.248 seconds
(time-sort *badness100*) ;; 7.354 seconds
(time-sort *badness1000*) ;; 7.803 seconds

(time-sort *vbadness0*) ;; 3.308 seconds
(time-sort *vbadness10*) ;; 3.388 seconds
(time-sort *vbadness100*) ;; 3.325 seconds
(time-sort *vbadness1000*) ;; 3.535 seconds

LispForum

Lesson of the Day

Re: Lesson of the Day

Re: Lesson of the Day

Re: Lesson of the Day