Discussion of Common Lisp
-
laivindur
Post
by laivindur » Wed Oct 29, 2008 4:35 pm
Hi!
Im using a programme to get word/char-frequencies from a txt file, and im wondering if there is an easy way of searching through a string looking for a
regex pattern?
I would for example like to do something like this:
Code: Select all
(defun check-word (word)
(let ((s word))
(setf s (string s))
(search "[^A-Za-z]" s)))
to check that a word does not contain any other chars than A-Z or a-z, where "[^A-Za-z]" is the regular expression.
Anyone know if such a thing is possible to achive?
-
metageek
- Posts: 10
- Joined: Fri Jul 25, 2008 8:01 am
Post
by metageek » Fri Oct 31, 2008 6:31 am
It probably won't matter for this usage, but it's important to note that PCRE stands for Perl-Compatible Regular Expression, and Perl's regular expressions are
not regular expressions in the formal sense. This means that they cannot be implemented be implemented by finite state machines, and cannot offer the linear performance guarantees of true regular expressions. If that's what you need, don't use PCREs.
-
Exolon
- Posts: 49
- Joined: Sat Jun 28, 2008 12:53 pm
- Location: Ireland
-
Contact:
Post
by Exolon » Fri Oct 31, 2008 1:28 pm
metageek wrote:Perl's regular expressions are not regular expressions in the formal sense. This means that they cannot be implemented be implemented by finite state machines
Does that mean they're more flexible than normal regular expressions - they can classify context-free/sensitive languages?
-
Kompottkin
- Posts: 94
- Joined: Mon Jul 21, 2008 7:26 am
- Location: München, Germany
-
Contact:
Post
by Kompottkin » Fri Oct 31, 2008 3:21 pm
Exolon wrote:metageek wrote:Perl's regular expressions are not regular expressions in the formal sense. This means that they cannot be implemented be implemented by finite state machines
Does that mean they're more flexible than normal regular expressions - they can classify context-free/sensitive languages?
Yes. For example, if I am not mistaken, the Perl regex
matches {a^n b a^n | n ∈ ℕ}, which is not regular.
-
qbg
- Posts: 64
- Joined: Mon Jun 30, 2008 1:05 pm
- Location: Minnesota
Post
by qbg » Fri Oct 31, 2008 3:48 pm
Exolon wrote:metageek wrote:Perl's regular expressions are not regular expressions in the formal sense. This means that they cannot be implemented be implemented by finite state machines
Does that mean they're more flexible than normal regular expressions - they can classify context-free/sensitive languages?
Wikipedia wrote:
Many features found in modern regular expression libraries provide an expressive power that far exceeds the regular languages. For example, the ability to group subexpressions with parentheses and recall the value they match in the same expression means that a pattern can match strings of repeated words like "papa" or "WikiWiki", called squares in formal language theory. The pattern for these strings is (.*)\1. However, the language of squares is not regular, nor is it context-free. Pattern matching with an unbounded number of back references, as supported by numerous modern tools, is NP-hard.
-
Exolon
- Posts: 49
- Joined: Sat Jun 28, 2008 12:53 pm
- Location: Ireland
-
Contact:
Post
by Exolon » Sat Nov 01, 2008 7:59 am
Thanks guys, this makes sense