Lesson of the Day

Discussion of Common Lisp
Tom
Posts: 22
Joined: Sat Jun 28, 2008 12:52 pm
Location: Wichita, KS
Contact:

Lesson of the Day

Post by Tom » Thu Sep 09, 2010 10:09 am

It seems that I learn something new about Common Lisp just about every day. As I learn more about Common Lisp, I often cringe at some hack that I've employed in my code as a result of my ignorance. I'd like this topic to be a place for people to post their lesson of the day.

Today's lesson relates to using the CLOS and constructor functions. As recommended in Keene[1], I like to define constructor functions for objects of the form make-foo. Then I can enforce mandatory and optional INITARGs for making an instance of the object. Plus, I can hide construction of the object behind an API. I've struggled in the past with how to properly pass the optional INITARGS to MAKE-INSTANCE. This week I had a requirement for using the &ALLOW-OTHER-KEYS lambda list argument. While reading through Section 3.4.1 of the Hyperspec, it suddenly struck me how to properly pass the optional INITARGS to MAKE-INSTANCE.

Code: Select all

(defclass scratch-object ()
  ((mandatory1
    :initarg :mandatory1
    :accessor mandatory1)
   (mandatory2
    :initarg :mandatory2
    :accessor mandatory2)
   (option1
    :initarg :option1
    :accessor option1)
   (option2
    :initarg :option2
    :accessor option2)
   (option3
    :initarg :option3
    :accessor option3))
  (:default-initargs
   :option1 "Option 1"
   :option2 "Option 2"
   :option3 "Option 3")
  (:documentation
   "A scratch object for testing concepts."))

(defun make-scratch-object (mandatory1 mandatory2
                            &rest all-keys
                            &key option1 option2 option3)
  "Return a new instance of a scratch object."
  (apply #'make-instance
         'scratch-object
         :mandatory1 mandatory1
         :mandatory2 mandatory2
         all-keys))
This constructor will properly handle the mandatory and optional INITARGS and will also check the keywords. That was my Common Lisp lesson of the day.

Cheers,

~ Tom

[1] Sonja E. Keen, "Object-Oriented Programming in Common Lisp", Addison-Wesley, 1989.

Warren Wilkinson
Posts: 117
Joined: Tue Aug 10, 2010 11:24 pm
Location: Calgary, Alberta
Contact:

Re: Lesson of the Day

Post by Warren Wilkinson » Thu Sep 09, 2010 5:10 pm

I'm not entirely sure what your trying to do.. it sounds like you've decided to always do something (in this case, initialize certain slots), but rather than just doing it, you've written extra code to look over your shoulder.

If you want lisp to bug you if you fail to provide a value for an CLOS slot, try:

Code: Select all

(defclass my-obj ()
   ((mandatory :initarg :mandatory :accessor mandatory :initform (error "Mandatory argument not provided."))
     (optional :initarg :optional :accessor optional)))
This will signal an error if the slot value isn't specified. You could also use a macro:

Code: Select all

(defmacro make-my-obj (mandatory &rest args) `(make-instance 'my-obj :mandatory ,mandatory ,@args))
But personally I wouldn't bother, it's the callers responsibility to make sure they instantiate objects properly: 'garbage input, garbage output'.
Need an online wiki database? My Lisp startup http://www.formlis.com combines a wiki with forms and reports.

Tom
Posts: 22
Joined: Sat Jun 28, 2008 12:52 pm
Location: Wichita, KS
Contact:

Re: Lesson of the Day

Post by Tom » Thu Sep 09, 2010 8:05 pm

The primary goal of that constructor interface I outlined is to clearly delineate the necessary and optional arguments. The secondary goal of the constructor interface is to work with your IDE to provide feedback when you are using the interface. The 2 IDE's that I am familiar with are SLIME and LispWorks. Both provide argument list hints while you are typing. Relying on MAKE-INSTANCE directly does not accomplish either of these goals.

In the past I've tried both approaches you suggest. There are 2 problems with the :INITARG + :INITFORM approach. The first is that you don't get any feedback that the slot is required at initialization until you actually try to make an instance of the object. The second problem is that :INITARG and :INITFORM have separate use cases. The intent of :INITARG is to declare a symbol to use to initialize the slot combined with :DEFAULT-INITARGS for specifying a default value. The intent of :INITFORM is to define a default value for a slot. It is bad form to use both since :INITARG combined with :DEFAULT-INITARGS will always override the value specified with :INITFORM. See Keene pg. 158.

I've not had good experience with the macro approach in the past. It usual takes me a couple tries to get it correct. Even when I think it's correct, I end up finding other errors. It has just reinforced, for me, the lesson that the first rule of using macros is not to use macros. Besides, the Hyperspec specifically states that the point of combining &REST and &KEY keywords in a lambda list is to facilitate passing arguments to other functions.

Hopefully that clarifies my intent.

Thanks for the reply,

~ Tom

EDIT: Disable smilies.

Warren Wilkinson
Posts: 117
Joined: Tue Aug 10, 2010 11:24 pm
Location: Calgary, Alberta
Contact:

Re: Lesson of the Day

Post by Warren Wilkinson » Fri Sep 10, 2010 10:01 am

The intent of :INITARG is to declare a symbol to use to initialize the slot combined with :DEFAULT-INITARGS for specifying a default value. The intent of :INITFORM is to define a default value for a slot. It is bad form to use both since :INITARG combined with :DEFAULT-INITARGS will always override the value specified with :INITFORM. See Keene pg. 158.
Yes, that is why it works =).

Why does your object have optional slots? Don't you end up with dozens of test to determine if an optional slots is bound before you use it? Could you instead provide appropriate dummy values to the optional slots and use algorithms that requires less decisions? This would make every slot have equal importance, and you lose the arbitrary optional/mandatory distinction.
Need an online wiki database? My Lisp startup http://www.formlis.com combines a wiki with forms and reports.

gugamilare
Posts: 406
Joined: Sat Mar 07, 2009 6:17 pm
Location: Brazil
Contact:

Re: Lesson of the Day

Post by gugamilare » Fri Sep 10, 2010 10:40 am

I think you two are fighting over nothing. It is obvious that every case has its proper tool and vice versa. Sometimes a constructor makes things clearer and easier, and that's why structs provide constructors which can be customized to use optional or required parameters. Sometimes they are almost useless (for instance when your objects are created by internal functions or implicitly by macros).

Plus, everyone has its own style and preferences. I personally never use :default-initargs simply because :initform does the same thing and it is more straightforward, but, if I intend to create objects from different places or allow someone to create objects externally from a package, I definitely create a constructor.

And, yes, some classes make more sense with some optional slots (not necessarily unbound but having default values if the use does not supply them).

Tom
Posts: 22
Joined: Sat Jun 28, 2008 12:52 pm
Location: Wichita, KS
Contact:

Re: Lesson of the Day

Post by Tom » Fri Sep 10, 2010 7:59 pm

Warren Wilkinson wrote:Why does your object have optional slots? Don't you end up with dozens of test to determine if an optional slots is bound before you use it? Could you instead provide appropriate dummy values to the optional slots and use algorithms that requires less decisions? This would make every slot have equal importance, and you lose the arbitrary optional/mandatory distinction.
I always initialize slots as you suggest. The distinction between the mandatory and optional slot is whether there is sensible initial value. Some slots in an object don't have a sensible initial value, so it is mandatory that the initial value be provided by the constructor. Some slots do have a sensible initial value, so then the question is whether the constructor should have the option to provide an initial value. If the initial value can be overridden, I use :INITARG combined with :DEFAULT-INITARGS. If the initial value should never be overridden, I use :INITFORM. The problem I've always had is designing the constructor appropriately.

Hope that makes it clear what I am trying to accomplish.

Thanks,

~ Tom

Tom
Posts: 22
Joined: Sat Jun 28, 2008 12:52 pm
Location: Wichita, KS
Contact:

Re: Lesson of the Day

Post by Tom » Fri Sep 10, 2010 8:20 pm

gugamilare wrote:I think you two are fighting over nothing. It is obvious that every case has its proper tool and vice versa. Sometimes a constructor makes things clearer and easier, and that's why structs provide constructors which can be customized to use optional or required parameters. Sometimes they are almost useless (for instance when your objects are created by internal functions or implicitly by macros).
We are not having an argument.
gugamilare wrote:Plus, everyone has its own style and preferences. I personally never use :default-initargs simply because :initform does the same thing and it is more straightforward, but, if I intend to create objects from different places or allow someone to create objects externally from a package, I definitely create a constructor.
Be careful substituting :INITFORM for :DEFAULT-INITARGS, they are distinctly different. The difference lies in how they are handled by SHARED-INITIALIZE and mainly causes problems, at least in my experience, when using REINITIALIZE-INSTANCE.
REINITIALIZE-INSTANCE wrote:The system-supplied primary method for reinitialize-instance checks the validity of initargs and signals an error if an initarg is supplied that is not declared as valid. The method then calls the generic function shared-initialize with the following arguments: the instance, nil (which means no slots should be initialized according to their initforms), and the initargs it received.
There are probably other areas where you can run into problems using :INITFORM and :DEFAULT-INITARGS inter-changeably, so I try to be careful about how I use them.

Thanks,

~ Tom

Warren Wilkinson
Posts: 117
Joined: Tue Aug 10, 2010 11:24 pm
Location: Calgary, Alberta
Contact:

Re: Lesson of the Day

Post by Warren Wilkinson » Fri Sep 10, 2010 10:40 pm

You know your problem domain better than I do; if you say that you've got objects that are best described with optional slots, I'll believe it. In anycase, this thread seemed an appropriate place for some function timing information I gleaned from SBCL. If you disagree, I'll move it elsewhere.

Sorting In SBCL

In my application I store a lot of data presorted on disk; once there are enough updates, the updates and data are merged and rewritten to disk. In the mean time, however, I must still merge the data I've loaded from the disk and the in memory patches together everytime users request any range of that data.

I had considered storing a lazily computed 'final location' with each update, so I avoid needlessly recomputing that location (which I initially would have done the simplist way: linear search). Then I had a better idea: Since the data will be largely in order anyway, why not just cons it then sort?

How well this works depends on the sorting algorithm and (perhaps) the format of the data (list vs vector). Before taking this route, I checked out sort in the SBCL git repository, and ran some timing tests. SBCL uses merge sort which is brilliant because its optimal for linked list, and as far as I can tell it works great when data is mostly sorted. Its fast for both lists and vectors, whichever I prefer to use:

Code: Select all

(defun worsen (thing) (let ((a (random 800)) (b (random 800)))
			(let ((temp (elt thing b)))
			  (setf (elt thing b) (elt thing a)
				(elt thing a) temp))))

(defvar *badness0* (iota 800))
(defvar *badness10* (iota 800))
(defvar *badness100* (iota 800))
(defvar *badness1000* (iota 800))
(loop repeat 10 do (worsen *badness10*))
(loop repeat 100 do (worsen *badness100*))
(loop repeat 1000 do (worsen *badness1000*))

(defvar *vbadness0* (make-array 800 :initial-contents (iota 800)))
(defvar *vbadness10* (make-array 800 :initial-contents (iota 800)))
(defvar *vbadness100* (make-array 800 :initial-contents (iota 800)))
(defvar *vbadness1000* (make-array 800 :initial-contents (iota 800)))
(loop repeat 10 do (worsen *vbadness10*))
(loop repeat 100 do (worsen *vbadness100*))
(loop repeat 1000 do (worsen *vbadness1000*))

(defun time-sort (array &aux (copies (loop repeat 10000 collect (copy-seq array))))
  (time (dolist (copy copies) (sort copy #'<))))

(time-sort *badness0*) ;; 5.048 seconds
(time-sort *badness10*) ;; 6.248 seconds
(time-sort *badness100*) ;; 7.354 seconds
(time-sort *badness1000*) ;; 7.803 seconds

(time-sort *vbadness0*) ;; 3.308 seconds
(time-sort *vbadness10*) ;; 3.388 seconds
(time-sort *vbadness100*) ;; 3.325 seconds
(time-sort *vbadness1000*) ;; 3.535 seconds
Need an online wiki database? My Lisp startup http://www.formlis.com combines a wiki with forms and reports.

gugamilare
Posts: 406
Joined: Sat Mar 07, 2009 6:17 pm
Location: Brazil
Contact:

Re: Lesson of the Day

Post by gugamilare » Sat Sep 11, 2010 6:02 am

Ok, sorry, I misinterpreted.
Tom wrote:Be careful substituting :INITFORM for :DEFAULT-INITARGS, they are distinctly different. The difference lies in how they are handled by SHARED-INITIALIZE and mainly causes problems, at least in my experience, when using REINITIALIZE-INSTANCE.
REINITIALIZE-INSTANCE wrote:The system-supplied primary method for reinitialize-instance checks the validity of initargs and signals an error if an initarg is supplied that is not declared as valid. The method then calls the generic function shared-initialize with the following arguments: the instance, nil (which means no slots should be initialized according to their initforms), and the initargs it received.
There are probably other areas where you can run into problems using :INITFORM and :DEFAULT-INITARGS inter-changeably, so I try to be careful about how I use them.
I didn't say I use them inter-changeably, I always use :initarg and :initform. I never needed to use shared-initialize of reinitialize-instance, maybe that is the reason. But thank you for the information, I'll keep that in mind.

Warren Wilkinson wrote:Sorting In SBCL

In my application I store a lot of data presorted on disk; once there are enough updates, the updates and data are merged and rewritten to disk. In the mean time, however, I must still merge the data I've loaded from the disk and the in memory patches together everytime users request any range of that data.

I had considered storing a lazily computed 'final location' with each update, so I avoid needlessly recomputing that location (which I initially would have done the simplist way: linear search). Then I had a better idea: Since the data will be largely in order anyway, why not just cons it then sort?

How well this works depends on the sorting algorithm and (perhaps) the format of the data (list vs vector). Before taking this route, I checked out sort in the SBCL git repository, and ran some timing tests. SBCL uses merge sort which is brilliant because its optimal for linked list, and as far as I can tell it works great when data is mostly sorted.
Call it coincidence or not, I was checking SBCL's sorting algorithms performance-wise as well. I'm programing various sorting methods and see how fast they are.

Just as a note, SBCL's sort uses heapsort for vectors, while stable-sort uses merge sort. I actually found out that, in at least two systems (one x86 with DDR RAM and another x86-64 with DDR2 RAM), stable-sort is faster than sort except for cases with a lot of collisions (and by a lot of collisions I mean the vector having at most 2 different elements). I already posted on SBCL's mailing list and, once I finish my tests, I said I would commit a patch.

SBCL's merge sort is interesting, though. Instead of doing the standard recursion method, it uses a loop that sorts progressively larger subvectors or sublists across the sequence.
Warren Wilkinson wrote:Its fast for both lists and vectors, whichever I prefer to use:

Code: Select all

(defun worsen (thing) (let ((a (random 800)) (b (random 800)))
			(let ((temp (elt thing b)))
			  (setf (elt thing b) (elt thing a)
				(elt thing a) temp))))

(defvar *badness0* (iota 800))
(defvar *badness10* (iota 800))
(defvar *badness100* (iota 800))
(defvar *badness1000* (iota 800))
(loop repeat 10 do (worsen *badness10*))
(loop repeat 100 do (worsen *badness100*))
(loop repeat 1000 do (worsen *badness1000*))

(defvar *vbadness0* (make-array 800 :initial-contents (iota 800)))
(defvar *vbadness10* (make-array 800 :initial-contents (iota 800)))
(defvar *vbadness100* (make-array 800 :initial-contents (iota 800)))
(defvar *vbadness1000* (make-array 800 :initial-contents (iota 800)))
(loop repeat 10 do (worsen *vbadness10*))
(loop repeat 100 do (worsen *vbadness100*))
(loop repeat 1000 do (worsen *vbadness1000*))

(defun time-sort (array &aux (copies (loop repeat 10000 collect (copy-seq array))))
  (time (dolist (copy copies) (sort copy #'<))))

(time-sort *badness0*) ;; 5.048 seconds
(time-sort *badness10*) ;; 6.248 seconds
(time-sort *badness100*) ;; 7.354 seconds
(time-sort *badness1000*) ;; 7.803 seconds

(time-sort *vbadness0*) ;; 3.308 seconds
(time-sort *vbadness10*) ;; 3.388 seconds
(time-sort *vbadness100*) ;; 3.325 seconds
(time-sort *vbadness1000*) ;; 3.535 seconds
Just a small tip:

Code: Select all

(defun worsen (thing) (let ((a (random 800)) (b (random 800)))
  (rotatef (elt thing b) (elt thing a)))

Warren Wilkinson
Posts: 117
Joined: Tue Aug 10, 2010 11:24 pm
Location: Calgary, Alberta
Contact:

Re: Lesson of the Day

Post by Warren Wilkinson » Sat Sep 11, 2010 10:41 am

On your advice I extended and reran these test.
  • I decreased the number of repetitions and increased the size of the vectors.
  • I ran it with sort and stable sort
  • I added tests for (vector fixnum length) to go along with the list and vector tests.

There is a bug in this: I intended each test run to recieve the same input, instead each test run recieves equally worsened input --- but not exactly the same input. It would be best of all if each test had exactly the same inputs, but oh well.

Code: Select all

(defconstant +length+ 80000)
(defconstant +intervals+ '(0 10 100 1000 10000 100000 1000000))
(defconstant +repeat+ 10)
(flet ((worsen (thing) (rotatef (elt thing (random +length+)) (elt thing (random +length+)))))
  (defun init (w)
    (let ((arr (make-array +length+ :initial-contents (loop for i from 0 upto (1- +length+) collecting i))))
      (loop repeat w do (worsen arr))
      arr)))

(defun time-sort (type)
  (loop for worsen in +intervals+
        as copies = (loop repeat +repeat+ collect (copy-seq (coerce (init worsen) type)))
        do (format t "~%~a RUN (~d)" type worsen)
        do (time (dolist (copy copies) (sort copy #'<)))))

(defun time-ssort (type)
  (loop for worsen in +intervals+
        as copies = (loop repeat +repeat+ collect (copy-seq (coerce (init worsen) type)))
        do (format t "~%~a RUN (~d)" type worsen)
        do (time (dolist (copy copies) (stable-sort copy #'<)))))
The results were interesting. For an 80000 element vector SORT is a slower than STABLE-SORT regardless of how out of order the input started. Vectors use a heap-sort for sort, merge sort for stable-sort. Adding type declarations to the vector also made things a lot slower.

The very fastest sort was sorting a simple vector using STABLE-SORT, next up was sorting lists using either sort method (because for lists they both use the same merge sort algorithm).

I'm actually a bit suprised the 0 worsened cases aren't faster; with 80000 elements it took a minimum of .06 seconds to sort even if no changes were required. I guess this is time is spent allocating temporary space, and circling through loops. In anycase, for my own problem I'll probably stable-sort on vectors. If it ever becomes a bottle neck I can custom implement quicksort and go there.

Code: Select all

(time-sort 'list)
;;      0 = 0.828
;;     10 = 0.936
;;    100 = 1.041
;;   1000 = 1.177
;;  10000 = 1.346
;; 100000 = 1.678
;;1000000 = 1.603

(time-sort 'vector)
;;      0 = 1.941
;;     10 = 1.946
;;    100 = 1.968
;;   1000 = 2.015
;;  10000 = 1.970
;; 100000 = 1.958
;;1000000 = 1.936

(time-sort `(vector fixnum ,+length+))
;;      0 = 3.483
;;     10 = 3.442
;;    100 = 3.520
;;   1000 = 3.544
;;  10000 = 3.520
;; 100000 = 3.509
;;1000000 = 3.463

(time-ssort 'list)
;;      0 = 0.828
;;     10 = 0.929
;;    100 = 1.060
;;   1000 = 1.195
;;  10000 = 1.384
;; 100000 = 1.623
;;1000000 = 1.620

(time-ssort 'vector)
;;      0 = 0.656
;;     10 = 0.755
;;    100 = 0.843
;;   1000 = 0.946
;;  10000 = 1.040
;; 100000 = 1.124
;;1000000 = 1.110

(time-ssort `(vector fixnum ,+length+))
;;      0 = 1.045
;;     10 = 1.130
;;    100 = 1.247
;;   1000 = 1.387
;;  10000 = 1.492
;; 100000 = 1.612
;;1000000 = 1.578
Need an online wiki database? My Lisp startup http://www.formlis.com combines a wiki with forms and reports.

Post Reply