Let me take a crack at this and see if I can manage to not botch it up. I'll make use of some possible implementation details which are typically opaque to a programmer but which might help ground you a bit. I'll also simplify and cut corners, so don't be too hard on me if I'm not strictly accurate all the way through.
Okay, first you have to understand the Lisp reader. Whenever a symbol is read into the system (at the REPL, for instance), the symbol must be interned. Roughly, this corresponds to entering the symbol into a lookup table with the string that names the symbol as the key. Whenever the reader reads that same symbol again, it first checks whether the symbol is already interned by looking up the symbol's name in the lookup table. If it's already there, the reader returns the previous symbol. If the symbol hasn't yet been interned, the reader creates a symbol object with the name that was read and enters the name -> symbol mapping into the lookup table so subsequent readings will return the same symbol.
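You can poke at this lookup table yourself with FIND-SYMBOL and INTERN. Here's a rough sketch; the package name INTERN-DEMO and the symbol names are made up for illustration, and the commented results assume a fresh image where nothing has been interned there yet.

```lisp
;; INTERN-DEMO is a hypothetical scratch package; (:use) means it inherits nothing.
(defpackage :intern-demo (:use))

(let ((pkg (find-package :intern-demo)))
  ;; Nothing named "WIDGET" is in the table yet: both values are NIL.
  (print (multiple-value-list (find-symbol "WIDGET" pkg)))   ; (NIL NIL)
  ;; The first INTERN creates the symbol and records it under its name.
  (let ((sym1 (intern "WIDGET" pkg)))
    ;; The second INTERN finds the existing entry and returns the same object.
    (multiple-value-bind (sym2 status) (intern "WIDGET" pkg)
      (print (eq sym1 sym2))                                 ; T
      (print status)))                                       ; :INTERNAL
  ;; The reader does the same lookup for every symbol it reads.
  (flet ((read-in-demo (text)
           (let ((*package* pkg))
             (read-from-string text))))
    (print (eq (read-in-demo "gadget") (read-in-demo "gadget")))))  ; T
```

The last form is the interesting one: two separate reads of the same text give back the very same symbol object, not just two symbols with equal names.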
Now, given that background, the trick with packages is that rather than have a single global lookup table for symbols read into the system, we have many. Each package is basically a data structure that contains a lookup table, a list of the symbols it exports, and a list of the symbols and packages it imports from. This package data structure is created with the DEFPACKAGE form. All that DEFPACKAGE does is create this data structure. Nothing more. One of the key points, however, is that this is a first-class data structure in the system. It's not a figment of the compiler's imagination that disappears when code is compiled. It persists through the life of the running Lisp image or until the programmer removes it.
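To see that a package really is an ordinary object you can inspect at run time, here's a small sketch. The package name SHAPES-DEMO and its exported names are invented for the example; the functions used are all standard.

```lisp
;; SHAPES-DEMO, AREA and PERIMETER are hypothetical names.
(defpackage :shapes-demo
  (:use :common-lisp)
  (:export #:area #:perimeter))

(let ((pkg (find-package :shapes-demo)))
  (print (packagep pkg))            ; T, it is a real object
  (print (package-name pkg))        ; "SHAPES-DEMO"
  (print (package-use-list pkg))    ; a list holding the COMMON-LISP package
  ;; Collect the names of the exported symbols.
  (let ((exports '()))
    (do-external-symbols (s pkg)
      (push (symbol-name s) exports))
    (print (sort exports #'string<))))   ; ("AREA" "PERIMETER")

;; The package stays around until somebody removes it explicitly.
(delete-package :shapes-demo)
(print (find-package :shapes-demo))      ; NIL
```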
Now, when the reader reads in a source file, the question is, where should those symbols be interned? The reader has to intern them someplace. The answer is that it interns them in the package that is the current value of the variable *PACKAGE*. The IN-PACKAGE form basically just sets *PACKAGE* to the specified package. The rest of the symbols read by the reader after that point are then interned in that package until another IN-PACKAGE form is encountered or until end-of-file. A logical consequence of this is that a source file can contain symbols and definitions that are interned in more than one package. This isn't possible in a language like Java where a source file has a 1-to-1 correspondence with a class name and its position in the file system identifies which package it belongs to. If the Lisp behavior seems odd, try to envision the reader simply operating on the REPL, incrementally, rather than compilers and all the rest of it. Another consequence is that the various forms that make up the package can be scattered across multiple source files. All we need to do is execute an IN-PACKAGE form and keep reading.
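Here's a sketch of one source file that switches packages partway through. The package names ALPHA-DEMO and BETA-DEMO and the function GREET are made up for the example.

```lisp
;; Hypothetical packages, created first so IN-PACKAGE can find them.
(defpackage :alpha-demo (:use :common-lisp))
(defpackage :beta-demo  (:use :common-lisp))

(in-package :alpha-demo)
(defun greet () "hello from alpha")   ; GREET is interned in ALPHA-DEMO

(in-package :beta-demo)
(defun greet () "hello from beta")    ; a different symbol, BETA-DEMO::GREET

(in-package :common-lisp-user)
;; Same name, two distinct symbols, each with its own function.
(print (eq 'alpha-demo::greet 'beta-demo::greet))             ; NIL
(print (alpha-demo::greet))                                   ; "hello from alpha"
(print (beta-demo::greet))                                    ; "hello from beta"
(print (package-name (symbol-package 'alpha-demo::greet)))    ; "ALPHA-DEMO"
```

The double colon is only there because the example never exports GREET; in real code you would export it and write a single colon.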
An important point is that when a package uses another package, the reader first checks all the used packages to see if the symbol is an exported (public) symbol of one of them before interning it in the current package. If so, the reader returns that symbol. That's why you can use symbols like CONS in every package that has used the COMMON-LISP package without having to write it out as COMMON-LISP:CONS all the time. If the reader doesn't find the symbol in any of the used packages, it interns it in the current package.
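A quick way to watch this happen is to ask where a symbol lives after the reader has handed it back. This is just a sketch; GEOMETRY-DEMO, APP-DEMO, AREA and HELPER are invented names.

```lisp
;; Hypothetical library package exporting one symbol.
(defpackage :geometry-demo
  (:use :common-lisp)
  (:export #:area))

;; Hypothetical client package that uses it.
(defpackage :app-demo
  (:use :common-lisp :geometry-demo))

(in-package :app-demo)
;; CONS and AREA are found through the used packages, so nothing new is interned.
(print (eq 'cons 'common-lisp:cons))               ; T
(print (eq 'area 'geometry-demo:area))             ; T
(print (package-name (symbol-package 'area)))      ; "GEOMETRY-DEMO"
(print (nth-value 1 (find-symbol "AREA")))         ; :INHERITED
;; HELPER is not exported by any used package, so it is interned right here.
(print (package-name (symbol-package 'helper)))    ; "APP-DEMO"
(print (nth-value 1 (find-symbol "HELPER")))       ; :INTERNAL
(in-package :common-lisp-user)
```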
Now, there are several consequences of all this:
- We need to create the packages with the DEFPACKAGE form before we can use IN-PACKAGE to then start interning symbols into the package. That has to happen first, otherwise IN-PACKAGE can't find the package. The typical way to handle this is to put the DEFPACKAGE form into a separate file of its own and to make sure that file is loaded before any others in the package. By doing that, we can be sure we only enter the DEFPACKAGE once and that everything works correctly. If your package is small, you can also put the DEFPACKAGE form right at the front of the single source file for that package.
- Second, we need to load all the packages that a given package uses before we start reading code that depends on symbols from those packages. Otherwise the reader won't find them in the used packages, and the symbols will get interned in the current package instead.
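Both consequences can be provoked on purpose. The sketch below is only one way the mix-up can show itself, and all the names (NOT-DEFINED-YET-DEMO, LATE-LIB-DEMO, CLIENT-DEMO, FROB) are hypothetical. I catch a plain ERROR because the exact condition class for a name conflict varies between implementations.

```lisp
;; First consequence: IN-PACKAGE needs a package that already exists.
(print (find-package :not-defined-yet-demo))          ; NIL
(handler-case (in-package :not-defined-yet-demo)
  (error () (print :no-such-package)))

;; Second consequence: code read before the "uses" relationship is in place.
(defpackage :late-lib-demo
  (:use :common-lisp)
  (:export #:frob))
(defpackage :client-demo
  (:use :common-lisp))          ; LATE-LIB-DEMO is not used yet

(in-package :client-demo)
;; The reader could not find FROB in a used package, so it interned a
;; fresh CLIENT-DEMO::FROB, which is not the library's symbol.
(print (eq 'frob 'late-lib-demo:frob))                ; NIL
(in-package :common-lisp-user)

;; Trying to patch things up afterwards runs into a name conflict,
;; because CLIENT-DEMO already has its own FROB.
(handler-case (use-package :late-lib-demo :client-demo)
  (error () (print :name-conflict)))
```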
So, now imagine a large system that is composed of tens of different packages, possibly with hundreds of source files. Managing all those dependencies can be complicated. When you load everything, it all has to be done in the right order. If it's done in the wrong order, symbols will get interned into the wrong packages and lots of bad stuff will happen.
To manage all this complexity automatically, we use a system definition tool, or a defsystem. ASDF is one example of a defsystem, but there are others, too. The defsystem manages all the dependencies and ensures that packages are loaded before the packages that use them, all the way up to the top of the application.
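For a feel of what that looks like with ASDF, here's a minimal sketch of a system definition file. The system name, the dependency, and the file names are all hypothetical; the point is only that the file holding the DEFPACKAGE forms comes first.

```lisp
;;;; my-app.asd  (hypothetical system and file names)
(asdf:defsystem "my-app"
  :description "Illustrative system definition."
  :depends-on ("my-utils")            ; hypothetical system, loaded before this one
  :serial t                           ; load the files below in the order listed
  :components ((:file "packages")     ; the DEFPACKAGE forms live here
               (:file "helpers")
               (:file "main")))
```

With that file somewhere ASDF can find it, (asdf:load-system "my-app") would load the dependency first and then the files in order.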
I hope that helps.