Escaping a string

Discussion of Common Lisp
jstoddard
Posts: 20
Joined: Fri Jan 28, 2011 6:13 pm

Post by jstoddard » Fri Mar 11, 2011 5:02 pm

I've put together a function using the cl-ppcre package that escapes a string for insertion into a MySQL database. The function works perfectly, but I'd like to figure out why -- I had to put in more backslashes than I would have intuitively guessed. Since my head starts spinning when I try to track all the backslashes through escaping and de-escaping strings, I may easily have missed where some of them are disappearing to...

Code:

(defun escape-string (string)
  "Escape a string for MySQL, replacing a NUL byte with \0, a \ with \\, a '
with \' and a double quote with backslash double quote."
  (let ((+null+ (code-char 0)))
    (regex-replace-all
     "\""
     (regex-replace-all
      "'"
      (regex-replace-all
       (string +null+)
       (regex-replace-all "\\" string "\\\\\\")
       "\\\\0")
      "\\\\'")
     "\\\"")))

Intuitively, I would have thought that "\\0" would give me \0, that only four backslashes would be necessary to escape the backslash, and that "\\'" would work for the single quote. However, it takes the above code to give the expected results:

Code:

CL-USER> (format t (escape-string (string (code-char 0))))
\0
NIL
CL-USER> (format t (escape-string "\\"))
\\
NIL
CL-USER> (format t "~a~%" (escape-string "\"Hello, 'World!'\" \\"))
\"Hello, \'World!\'\" \\
NIL

Is the regex-replace-all function possibly eating some of my backslashes, or where are they going? And why doesn't the double quote need an extra pair of backslashes when everything else did?

Thanks,
Jeremiah

ramarren
Posts: 613
Joined: Sun Jun 29, 2008 4:02 am
Location: Warsaw, Poland

Re: Escaping a string

Post by ramarren » Sat Mar 12, 2011 1:52 am

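The replacement argument of regex-replace-all gives the backslash its own special meaning, as in Perl, on top of the escaping the Lisp reader already applies to string literals. This double layer of escaping is a general issue with regular expressions written as string literals. You should either use cl-interpol or the structured replacement form, a list whose strings are inserted verbatim: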

Code:

(defun escape-string (string)
  "Escape a string for MySQL, replacing a NUL byte with \0, a \ with \\, a '
    with \' and a double quote with backslash double quote."
  (let ((+null+ (code-char 0)))
    (regex-replace-all
     "\""
     (regex-replace-all
      "'"
      (regex-replace-all
       (string +null+)
       (regex-replace-all "\\" string '("\\\\"))
       '("\\0"))
      '("\\'"))
     '("\\\""))))
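
To make the two layers easier to follow, here is a small sketch that shows only the first one: the characters the Lisp reader leaves in each replacement string from the original version of escape-string. The helper show-literal is a made-up name for illustration, and nothing here calls cl-ppcre.

```lisp
;; Layer 1 only: what the Lisp reader leaves in each replacement string
;; used by the first version of escape-string.
(defun show-literal (description string)
  "Hypothetical helper: print the characters STRING holds and its length."
  (format t "~a: ~a (~d characters)~%" description string (length string)))

(show-literal "backslash replacement" "\\\\\\")    ; holds \\\
(show-literal "NUL replacement" "\\\\0")           ; holds \\0
(show-literal "single-quote replacement" "\\\\'")  ; holds \\'
(show-literal "double-quote replacement" "\\\"")   ; holds \"
```

As far as I understand cl-ppcre's replacement syntax, the second layer then collapses each remaining pair of backslashes into one, which would give \\, \0, \' and \" respectively. The double-quote case holds only a single backslash followed by a double quote, and that pair is not one of the special sequences, which would explain why it needed no extra backslashes.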

jstoddard
Posts: 20
Joined: Fri Jan 28, 2011 6:13 pm

Re: Escaping a string

Post by jstoddard » Sat Mar 12, 2011 8:25 am

Thanks; I guess I just need to read the documentation more carefully. The "weird" behaviors are explained for each case, or at least more or less (it doesn't entirely make sense that "\\0" is picked up, since 0 is not a positive integer). Anyway, as long as I don't need any of the special substrings, I guess putting the replacement string in a list, as in your example, is the way to avoid these surprises.
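
Another way to sidestep the replacement-string rules entirely is to skip regular expressions and walk the string once. The following is only a sketch under that assumption, not the approach used in this thread: escape-string-single-pass is a hypothetical name, and it covers just the four characters discussed above.

```lisp
(defun escape-string-single-pass (string)
  "Hypothetical helper: backslash-escape NUL, backslash and both quote
characters in STRING in one pass, without regular expressions."
  (with-output-to-string (out)
    (loop for char across string
          do (cond ((char= char (code-char 0))
                    ;; NUL becomes the two characters backslash and 0.
                    (write-string "\\0" out))
                   ((find char "\\'\"")
                    ;; Backslash and both quotes get a backslash prefix.
                    (write-char #\\ out)
                    (write-char char out))
                   (t
                    ;; Everything else is copied unchanged.
                    (write-char char out))))))

(format t "~a~%" (escape-string-single-pass "\"Hello, 'World!'\" \\"))
```

Because each input character is examined exactly once, only the Lisp reader's escaping applies to the literals, and an already inserted backslash can never be escaped a second time.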
