Page 1 of 1
Is this legal Lisp?
Posted: Mon Jun 14, 2010 2:12 pm
by lispamour
The following comes again from Genlt introduction to Symbolic Computation p. 257, Exercise 4. Given a list of symbols (A T G C), representing a strand of DNA, count the number of DNA bases in the strand in a function called COUNT-BASES. The catch is that the DNA strand could be either a single strand or a double strand (since DNA is double-stranded, except during replication). Thus the function could be called as
Code: Select all
(COUNT-BASES '((G C) (A T) (T A) (T A) (C G)))
which would return
or it could be called as
which would return
I wrote it up as
Code: Select all
(defun count-bases (dna)
(let ((num-a 0)
(num-t 0)
(num-g 0)
(num-c 0))
(defun count-nucleotides (x)
(cond ((eq x 'a) (incf num-a))
((eq x 't) (incf num-t))
((eq x 'g) (incf num-g))
((eq x 'c) (incf num-c))))
(dolist (base-pair dna
(list (list 'a num-a) (list 't num-t) (list 'g num-g) (list 'c num-c)))
(cond ((listp base-pair) (count-nucleotides (car base-pair))
(count-nucleotides (cadr base-pair)))
(t (count-nucleotides base-pair))))))
I defined the function COUNT-NUCLEOTIDES within the body of the function COUNT-BASES in order to take advantage of the lexical environment of the LET block; the alternative would have been to either use as global variables NUM-A. NUM-T, NUM-G, NUM-C; or to define a
Code: Select all
(defstruct base-count
(num-a 0)
(num-t 0)
(num-g 0)
(num-c 0))
and pass the structure by value along with the variable X to update the count of the components in each iteration. For simplicity's sake I chose to define the sub-function COUNT-NUCLEOTIDES within COUNT-BASES.
The function works properly under both Clozure and LispWorks, though I get a warning from LispWorks when I compile it:
Code: Select all
The following function is undefined:
COUNT-NUCLEOTIDES which is referenced by COUNT-BASES
Is the warning legitimate? Are nested function definitions allowed in Common Lisp?
Moreover, is it good Lisp style to do something like this?
Re: Is this legal Lisp?
Posted: Mon Jun 14, 2010 3:03 pm
by ramarren
lispamour wrote:Are nested function definitions allowed in Common Lisp?
There are, but they are not doing what you most likely think they are doing. What is happening when done like that is that the inner function is defined as global function when the outer one is run. Which is why Lispworks is complaining, since when the outer function is compiled they inner one is not yet defined, and will not be until the outer one is run for the first time.
While allowed, this is a very bad style and it stops many compiler optimizations. The way to define a local function is through
FLET or LABELS special operators.
Also, your COND form in this would be better expressed as
CASE. Or, even better, as symbol keyed hash-table (or even for this sizes an alist), since those value have identical behaviour.
Code: Select all
(defun count-bases (dna)
(let ((counts (list (list 'a 0) (list 't 0) (list 'g 0) (list 'c 0))))
(dolist (base-pair dna counts)
(labels ((count-nucleotide (x)
(incf (second (assoc x counts)))))
(cond ((listp base-pair)
(mapc #'count-nucleotide base-pair))
(t (count-nucleotide base-pair)))))))
Re: Is this legal Lisp?
Posted: Mon Jun 14, 2010 4:14 pm
by Suroy
I couldnt resist
My totally unhelpful code for this (it uses other libraries, but i just wanted to show that you dont have to stick with vanilla lisp, make your own 'cl' library named something else which contains extremely convenient functions for everything. For this, i like to use the conduit-packages library which will not only import symbols but will also export them, so you could export a subset of the :cl or any otherpackage)
Code: Select all
(defun count-bases (dna)
(let ((counts (hash (a 0) (t 0) (g 0) (c 0))))
(iter
(for base in dna)
(mapc #(incf (@ % counts)) base))
counts))
Re: Is this legal Lisp?
Posted: Tue Jun 15, 2010 1:14 pm
by lispamour
Ramarren wrote:lispamour wrote:Are nested function definitions allowed in Common Lisp?
There are, but they are not doing what you most likely think they are doing. What is happening when done like that is that the inner function is defined as global function when the outer one is run. Which is why Lispworks is complaining, since when the outer function is compiled they inner one is not yet defined, and will not be until the outer one is run for the first time.
While allowed, this is a very bad style and it stops many compiler optimizations. The way to define a local function is through
FLET or LABELS special operators.
Thank you for that very complete and lucid explanation. Your code for COUNT-BASES was great: very succinct and idiomatic Lisp. The book GISC did cover association lists, and I'm a little chagrined that I did not think of it as it presents the solution in a clear and compact manner. I'm starting to see how experienced Lispers find and factor out certain patterns in organizing their code.
Suroy wrote:For this, i like to use the conduit-packages library which will not only import symbols but will also export them, so you could export a subset of the :cl or any otherpackage)
Thank you also for this example. I'll have to come back to this a little later, after I'm a little more experienced, and have learned how to use code from other Lisp packages. It's remarkable how malleable the Lisp language can be, that Lisp programmers can invent and re-use syntactic forms to apply to common types of problems.
Re: Is this legal Lisp?
Posted: Tue Jun 15, 2010 4:57 pm
by death
I don't know why proposed solutions used an alist/hash-table. I think lispamour got it basically right. My proposed solution:
Code: Select all
(defun count-bases (tree)
(let ((as 0) (ts 0) (gs 0) (cs 0))
(labels ((rec (x)
(etypecase x
((eql a) (incf as))
((eql t) (incf ts))
((eql g) (incf gs))
((eql c) (incf cs))
(null)
(cons (rec (car x))
(rec (cdr x))))))
(rec tree))
`((a ,as) (t ,ts) (g ,gs) (c ,cs))))
Re: Is this legal Lisp?
Posted: Tue Jun 15, 2010 10:34 pm
by ramarren
death wrote:I don't know why proposed solutions used an alist/hash-table.
Because if you doing the exact same operation on multiple variables then you almost always was a datastructure. It might not matter in this example, where the operation is trivial, but it is an exercise anyway, and data structures are a core computer science concept, so it is also good to practice them.
Re: Is this legal Lisp?
Posted: Wed Jun 16, 2010 7:41 am
by death
Ramarren wrote:Because if you doing the exact same operation on multiple variables then you almost always was a datastructure.
If you're doing the exact same operation on multiple variables/values you want an operator, not a data structure. A data structure might make sense when there is a connection between different data. It is true that COUNT-BASES can be thought of as doing rule-based dispatch. For example, matching the type of the object X in my code is the rule, and the action depends on satisfaction of the rule. In another model, X is considered a member of {A, T, C, G} and the action is to increment the corresponding count cell. If there was an intention to generalize COUNT-BASES beyond these cases, or if there were many of them, I'd see the value in defining a more sophisticated data structure. In programming it is also important to recognize when a solution might be an overkill, and to choose a different one: knowing when to use a data structure is important as well.
Re: Is this legal Lisp?
Posted: Wed Jun 16, 2010 8:48 am
by ramarren
death wrote:If you're doing the exact same operation on multiple variables/values you want an operator, not a data structure.
And you would still have to apply the operator to every variable manually, which is missing my point. If you are using the same operator on multiple variables then it is almost always better, except in cases of some microoptimizations which obviously are not relevant to an exercise of this kind, to map the operator over a datastructure rather than write out the mapping by hand.
death wrote:For example, matching the type of the object X in my code is the rule, and the action depends on satisfaction of the rule.
But it doesn't really depend so, since in cases where X is a symbol you are performing the exact same action, just on a different place which has one-on-one relationship with the symbol in question. That is a perfect place to use a key-value map of some sort. It is also a simpler solution in the sense that there is a smaller number of entities to consider, and it pushes an, in this case admittedly small, amount of accidental complexity into built-in functionality of the language.
Re: Is this legal Lisp?
Posted: Wed Jun 16, 2010 9:39 am
by death
If generality and avoiding accidental complexity is the aim, we can decompose the problem like this:
Mapping the structure:
Code: Select all
(defun map-tree (function tree)
(typecase tree
(null)
(cons (map-tree function (car tree))
(map-tree function (cdr tree)))
(t (funcall function tree))))
Counting objects:
Code: Select all
(defun histogram (structure &key (test #'eql) (key #'identity) (mapper #'map-tree))
(let ((histogram (make-hash-table :test test)))
(funcall mapper
(lambda (x)
(incf (gethash (funcall key x) histogram 0)))
structure)
histogram))
Converting to the output structure:
Code: Select all
(defun histogram-list (histogram)
(loop for object being the hash-key of histogram
using (hash-value count)
collect (list object count)))
Now COUNT-BASES can be defined:
Code: Select all
(defun count-bases (dna)
(histogram-list (histogram dna)))