Compile-time information across macros

Discussion of Common Lisp
slowcoder
Posts: 4
Joined: Sat Aug 15, 2009 4:20 pm

Compile-time information across macros

Post by slowcoder » Sat Aug 15, 2009 5:00 pm

Hi, I am trying to code a set of algebraic algorithms parameterised on structures such as rings and fields. My primary concern is efficiency.

Since algorithms that take higher-order functions representing the operations on the structure would incur a function call overhead, I have been trying to use macros to achieve a similar effect.

Having some experience with C++, I decided to perform something similar to templates in Lisp, but I have been having a problem. In C++, the code generated by templates goes to some sort of "cache" so that duplication in minimized. But I have not found a way to implement this "cache", and that is exactly because I do not know of any mechanism to keep a "compile-time data structure" to be shared among macros.

Put another way, in Paul Graham's book "On Lisp", there is a chapter about shifting computation to compile-time. In that chapter, he teaches how to implement a macro "nth". That macro receives an integer n and a list and returns the n-th smallest element in the list. If the datum passed to the macro as the integer n is in fact an integer and is less than 20, the macro expands into a (huge) specialized code suitable only to the integer n. The problem that is not addressed in the book is that sucessive invocations of the macro with the same integer argument will produce different copies of the same (huge) code. That makes the program larger and can make the instruction cache miss a lot.

If possible, I would like the code to be shared even among different compilation units, just like the C++ compiler does.

What do you recommend that I do?

lnostdal
Posts: 20
Joined: Thu Jul 03, 2008 2:01 pm
Location: Skien, Norway
Contact:

Re: Compile-time information across macros

Post by lnostdal » Mon Aug 17, 2009 12:45 pm

I have been pondering something similar. I don't mean to hijack your thread, but I wonder if LOAD-TIME-VALUE make sense in this context(?)

edit: ..maybe not; can't dump this to fasls etc..
edit2/update: ..or maybe if i rearrange things a bit..

Code: Select all

(sb-ext:defglobal -nth-storage-
    (make-hash-table))


(dotimes (i 20)
  (setf (gethash i -nth-storage-)
        (labels ((recur (n)
                   (if (zerop n)
                       'list
                       `(cdr ,(recur (1- n))))))
          (compile nil `(lambda (list) (car ,(recur i)))))))


(defmacro my-nth (n list)
  (if (and (constantp n) (< n 20))
      `(funcall (sb-ext:truly-the function (load-time-value (gethash ,n -nth-storage-) t)) ,list)
      `(nth ,n ,list)))
Last edited by lnostdal on Mon Aug 17, 2009 2:41 pm, edited 2 times in total.

ramarren
Posts: 613
Joined: Sun Jun 29, 2008 4:02 am
Location: Warsaw, Poland
Contact:

Re: Compile-time information across macros

Post by ramarren » Mon Aug 17, 2009 2:12 pm

slowcoder wrote:Since algorithms that take higher-order functions representing the operations on the structure would incur a function call overhead, I have been trying to use macros to achieve a similar effect.
Honestly, this sounds like a premature optimization to me. In most cases either the functions are big enough for actual functionality to dominate the overhead, or small enough to be cheaply inlined. Also, C++ templates are much more constrained than macros. From what I remember a template instantiation is just a function or class anyway, so if you limit yourself to creating only those constructs caching becomes trivial. Remember, during macro-expansion time you have a whole Lisp system available, so you "compile-time data structures" are the same as every other data structure, just requiring some application of EVAL-WHEN and being aware of importance of macro idempotence.

nuntius
Posts: 538
Joined: Sat Aug 09, 2008 10:44 am
Location: Newton, MA

Re: Compile-time information across macros

Post by nuntius » Mon Aug 17, 2009 7:25 pm

slowcoder wrote:Hi, I am trying to code a set of algebraic algorithms parameterised on structures such as rings and fields. My primary concern is efficiency.
...
Having some experience with C++, I decided to perform something similar to templates in Lisp, but I have been having a problem. In C++, the code generated by templates goes to some sort of "cache" so that duplication in minimized. But I have not found a way to implement this "cache", and that is exactly because I do not know of any mechanism to keep a "compile-time data structure" to be shared among macros.
...
If possible, I would like the code to be shared even among different compilation units, just like the C++ compiler does.
A few points:
- CL doesn't have "compilation units" the way C/C++ does; one lisp image generally compiles and runs multiple source files.
- C++ generally has to instantiate templates for each compilation unit; these multiple instantiations are resolved at link time.
- Templates occurring within a class/struct definition are blessed with a hidden "inline" modifier; on modern compilers the only effect of "inline" is to suppress linker errors due to multiple instantiation.
- C++ uses name mangling for function "overloading", including template instantiations.
- Your macros could declare inline functions (optional), instantiate them using name mangling, and call them, just like C++ template instantiation. If the symbol is already FBOUNDP, the macro doesn't need to define it.

Another approach is "partial evaluation". IIRC, that's the approach taken by SciCL; the first time a code path is hit, it looks at the input data and generates a highly efficient function tailored to the datatypes. Its something like JIT compilation, but directly controlled by the programmer.

FWIW, modern C++ template math libs are using "template metaprogramming". Much like partial evaluation, they look at a whole expression when generating code. This helps eliminate temporary variables and arrange for other optimizations that aren't obvious when looking at a single operation/function call at a time.

slowcoder
Posts: 4
Joined: Sat Aug 15, 2009 4:20 pm

Re: Compile-time information across macros

Post by slowcoder » Sat Aug 22, 2009 6:27 am

Sorry for the long reply time, but apparently I did not receive any email from the forum.

Ramarren:
Well, I admit this is some kind of premature optimization, but I really want to do it now because I am not tolerating a suboptimal final system and when profiling tells me there is an overhead intrinsic to the choices I have made in the past and that I would need to change the whole system because of that, then I will be sad.

Yes, I can perfectly see C++ templates are much more limited than lisp macros, and that is exactly why I want to know how to implement them in Lisp. I mean, since Lisp macros are more general and powerful than C++ templates, I want to know how to implement the latter in the former, for I have had no idea of how to do so.

By the way, when you said "... or small enough to be cheaply inlined.", did you mean the Lisp compiler can inline higher-order functions? If so how is this possible? I assumed that for each function taking a function as an argument, the compiler only emmited a single code for it.

About EVAL-WHEN, if I declare a data structure with (eval-when :compile-toplevel ...) do I get a guarantee it will not appear at runtime (so no space overhead)?

About macro idempotence, I googled for it, but found nothing.


Nuntius:
You mean that if I define a function in a source file, compile it and load it, then fboundp will tell me of its existence when compiling another file. What if the first file was not loaded? I will look for deeper information about fboundp.

:-) And I can see why doing that kind of expression analysis is much easier in Lisp than it is in C++, at least if I could "cache" the shared functions.


Thanks for the replies!!!

gugamilare
Posts: 406
Joined: Sat Mar 07, 2009 6:17 pm
Location: Brazil
Contact:

Re: Compile-time information across macros

Post by gugamilare » Sat Aug 22, 2009 8:55 am

slowcoder wrote:About EVAL-WHEN, if I declare a data structure with (eval-when :compile-toplevel ...) do I get a guarantee it will not appear at runtime (so no space overhead)?
Actually, it's (eval-when (:compile-toplevel) ...). Answering your question, the space overhead remains if you compile and load the file in the same session (which is most likely what you are going to do when developing), but if you start a new session and load the code, then yes, no space overhead.
slowcoder wrote:About macro idempotence, I googled for it, but found nothing.
Macro idempotence means that, no matter how much times a single reference of your macro is expanded, it is expanded into equivalent code.

For instance, this macro is not idempotent:

Code: Select all

(defvar *counter* 0)

(defmacro foo (x)
  `(+ x ,(incf *counter*)))
This is not a very good choice, because, if you compile this function more than once:

Code: Select all

(defun bar (x)
  (foo x))
the function will not remain the same. This is just a matter of paying attention to what you are doing specially if your macro do some side effect (like the macro foo, which increases the counter every time it is expanded). It is like avoiding infinite loops.

One last thing: don't worry about using macros right away. You can use functions (which are easier to debug, because, if you change a function, you don't need to recompile every function that uses it, but that isn't true with a macro). Later, when your code is stable, you can create compiler macros (which are esssentially macros that expand in the place of every call of your function). For instance:

Code: Select all

(defun my-1+ (x)
  ;; This function has a call overhead
  (+ x 1))

(define-compiler-macro my-1+ (x)
  ;; This avoids a call overhead
  ;; (unless possibly when you pass #'my-1+ to other functions,
  ;; which is not allowed by macros anyway)
  `(+ ,x 1))

(defun foo (x)
  ;; with the compiler-macro, this code is equivalent to
  ;; (do-something-with (+ x 1)) - no call overhead
  (do-something-with (my-1+ x)))


;;; Another example

(defun my-mapcar (fun list)
  (if list
      (cons (funcall fun (car list))
            (my-mapcar fun (cdr list)))))

(defun is-function-or-lambda-p (arg)
  (and (consp fun)
       (membar (car fun) '(quote function lambda))))

(define-compiler-macro my-mapcar (&whole whole fun list)
  ;; optimize calls of the form (my-mapcar #'some-fun some-list) or (my-mapcar 'some-fun some-list)
  ;; and also calls of the form (my-mapcar (lambda (arg) ...) some-list)
  (if (is-function-or-lambda-p fun)
      ;; In a efficient implementation, (SBCL or CMUCL, for instance, don't know about others),
      ;; this will avoid a call overhead, because, for instance,
      ;; (funcall #'some-fun some-arg) is replaced with (some-fun some-arg)
      ;; in these implementations
      (let ((elt (gensym)))
        `(loop for ,elt in ,list collect (funcall ,fun ,elt)))
      ;; This is how you give up a compiler-macro expansion: just return the &whole argument.
      whole))

ramarren
Posts: 613
Joined: Sun Jun 29, 2008 4:02 am
Location: Warsaw, Poland
Contact:

Re: Compile-time information across macros

Post by ramarren » Sat Aug 22, 2009 9:14 am

slowcoder wrote:Ramarren:
Well, I admit this is some kind of premature optimization, but I really want to do it now because I am not tolerating a suboptimal final system and when profiling tells me there is an overhead intrinsic to the choices I have made in the past and that I would need to change the whole system because of that, then I will be sad.
But function call overhead is very low. Moreover, you said in the original post:
slowcoder wrote:If possible, I would like the code to be shared even among different compilation units, just like the C++ compiler does.
And only after posting my reply I realized that the question is rather strange, because shared code is, by definition, a function.
slowcoder wrote:Yes, I can perfectly see C++ templates are much more limited than lisp macros, and that is exactly why I want to know how to implement them in Lisp. I mean, since Lisp macros are more general and powerful than C++ templates, I want to know how to implement the latter in the former, for I have had no idea of how to do so.
Well, I used C++ in a pretty limited way, but from what I remember templates were mostly used to create functions/methods with different types from a common template. This is not really needed in Lisp, because the type system is dynamic. I suppose if one wanted to avoid runtime type dispatch it would be possible to create something similar with macros expanding to specialized function invocations, with the specialized forms possibly generated ad-hoc with (compile nil ...) and cached.

I am just not sure what it is that you are trying to achieve. Can you give an example of use of C++ templates in a way you want to use macros?
slowcoder wrote:By the way, when you said "... or small enough to be cheaply inlined.", did you mean the Lisp compiler can inline higher-order functions? If so how is this possible? I assumed that for each function taking a function as an argument, the compiler only emmited a single code for it.
Well, a higher order function can be inlined just as any other function, although I don't think this is what you are asking. The argument can be inlined if it is compile-time constant perhaps with a compiler macro. But all inlining obviously leads to code duplication.
slowcoder wrote:About EVAL-WHEN, if I declare a data structure with (eval-when :compile-toplevel ...) do I get a guarantee it will not appear at runtime (so no space overhead)?
As far as I can tell, yes. It might depends on the implementation.
slowcoder wrote:About macro idempotence, I googled for it, but found nothing.
I meant that there is no guarantee that any given macro will be expanded once per time it appears. It might be repeatedly expanded or not at all if a code section is optimized away, which means that any side effects it has must be idempotent, ie. multiple effect must be equal to single effect. This is usually not a problem with caching, as long as the things cached are consistent.

nuntius
Posts: 538
Joined: Sat Aug 09, 2008 10:44 am
Location: Newton, MA

Re: Compile-time information across macros

Post by nuntius » Sat Aug 22, 2009 8:29 pm

slowcoder wrote: You mean that if I define a function in a source file, compile it and load it, then fboundp will tell me of its existence when compiling another file. What if the first file was not loaded? I will look for deeper information about fboundp.
I was suggesting that you could write code something like the following.

Code: Select all

;; "template definition"
(defmacro template-f (n x)
  "roughly equivalent to the C++ template
template int f<int n>(int x){ return n+x; }"
  (let ((mangled-name (intern (format nil "f_~A" n))))
    (format t "instantiating template-f: ~S~%" mangled-name)
    (unless (fboundp mangled-name)
      (declaim (inline mangled-name))
      (setf (symbol-function mangled-name)
	    (lambda (y)
	      (+ n y))))
    (list mangled-name x)))

;; "template instantiation"
(defun do-f3 (x)
  "roughly equivalent to the C++ instantiation
int do_f3(int x) { return f<3>(x); }"
  (template-f 3 x))
However, a quick (disassemble #'|f_3|) (dissassemble #'do-f3) shows that |f_3| isn't being inlined for me on SBCL 1.0.24.47. Not sure why not.

nuntius
Posts: 538
Joined: Sat Aug 09, 2008 10:44 am
Location: Newton, MA

Re: Compile-time information across macros

Post by nuntius » Sat Aug 22, 2009 10:16 pm

nuntius wrote:However, a quick (disassemble #'|f_3|) (dissassemble #'do-f3) shows that |f_3| isn't being inlined for me on SBCL 1.0.24.47. Not sure why not.
But if you really wanted things inline, you'd simply have the macro return the function expansion, rather than declaiming things to be inline and changing the symbol-function. Duh!

gugamilare
Posts: 406
Joined: Sat Mar 07, 2009 6:17 pm
Location: Brazil
Contact:

Re: Compile-time information across macros

Post by gugamilare » Sat Aug 22, 2009 10:52 pm

@ slowcoder

Whatever your idea for the code will be, I'm 90% sure that compiler macros and other CL tools will be enough. Lisp is more flexible than C++, and, with compiler macros, using high order functions does not get in your way of optimizing your code. If you still have doubts about if it is possible to optimize your code, why don't you make some example code and let us help you with whatever you need to do to optimize it?

Trying to make CL macros work "exactly like" C++ templates seems not like the right thing anyway: Lisp code is Lisp code, C++ code is C++ code. Optimization works differently from each other.

Post Reply