Page 1 of 1

Lisp and Regex

Posted: Wed Oct 29, 2008 4:35 pm
by laivindur
Hi!

Im using a programme to get word/char-frequencies from a txt file, and im wondering if there is an easy way of searching through a string looking for a
regex pattern?

I would for example like to do something like this:

Code: Select all


(defun check-word (word)
  (let ((s word))
    (setf s (string s))
    (search "[^A-Za-z]" s)))

to check that a word does not contain any other chars than A-Z or a-z, where "[^A-Za-z]" is the regular expression.

Anyone know if such a thing is possible to achive? :)

Re: Lisp and Regex

Posted: Wed Oct 29, 2008 4:56 pm
by kroger
you can use cl-ppcre, a portable regexp library for lisp:

http://www.weitz.de/cl-ppcre/

Re: Lisp and Regex

Posted: Wed Oct 29, 2008 9:42 pm
by lnostdal
cl-user.net is a good resource for finding things:

http://www.cl-user.net/asp/search?search=regex

..in this case, Google ain't that bad either:

http://www.google.com/search?q=lisp+regex

Re: Lisp and Regex

Posted: Fri Oct 31, 2008 6:31 am
by metageek
kroger wrote:you can use cl-ppcre, a portable regexp library for lisp:

http://www.weitz.de/cl-ppcre/
It probably won't matter for this usage, but it's important to note that PCRE stands for Perl-Compatible Regular Expression, and Perl's regular expressions are not regular expressions in the formal sense. This means that they cannot be implemented be implemented by finite state machines, and cannot offer the linear performance guarantees of true regular expressions. If that's what you need, don't use PCREs.

Re: Lisp and Regex

Posted: Fri Oct 31, 2008 1:28 pm
by Exolon
metageek wrote:Perl's regular expressions are not regular expressions in the formal sense. This means that they cannot be implemented be implemented by finite state machines
Does that mean they're more flexible than normal regular expressions - they can classify context-free/sensitive languages?

Re: Lisp and Regex

Posted: Fri Oct 31, 2008 3:21 pm
by Kompottkin
Exolon wrote:
metageek wrote:Perl's regular expressions are not regular expressions in the formal sense. This means that they cannot be implemented be implemented by finite state machines
Does that mean they're more flexible than normal regular expressions - they can classify context-free/sensitive languages?
Yes. For example, if I am not mistaken, the Perl regex

Code: Select all

(a*)b\1
matches {a^n b a^n | n ∈ ℕ}, which is not regular.

Re: Lisp and Regex

Posted: Fri Oct 31, 2008 3:48 pm
by qbg
Exolon wrote:
metageek wrote:Perl's regular expressions are not regular expressions in the formal sense. This means that they cannot be implemented be implemented by finite state machines
Does that mean they're more flexible than normal regular expressions - they can classify context-free/sensitive languages?
Wikipedia wrote: Many features found in modern regular expression libraries provide an expressive power that far exceeds the regular languages. For example, the ability to group subexpressions with parentheses and recall the value they match in the same expression means that a pattern can match strings of repeated words like "papa" or "WikiWiki", called squares in formal language theory. The pattern for these strings is (.*)\1. However, the language of squares is not regular, nor is it context-free. Pattern matching with an unbounded number of back references, as supported by numerous modern tools, is NP-hard.

Re: Lisp and Regex

Posted: Sat Nov 01, 2008 7:59 am
by Exolon
Thanks guys, this makes sense :)