0
votes

I am trying to implement a pattern matching "syntax" and language. I know of regular expressions but these aren't enough for my scopes. I have individuated some "mathematical" operators. In the examples that follow I will suppose that the subject of pattern mathing are character strings but it isn't necessary.

Having read the description bellow: The question is, does any body knows of a mathematical theory explicitating that or any language that takes the same approach implementing it ? I would like to look at it in order to have ideas !

Descprition of approach:

At first we have characters. Characters may be aggregated to form strings.

A pattern is: a) a single character b) an ordered group of patterns with the operator matchAny c) an ordered group of patterns with the operator matchAll d) other various operators to see later on.

Explanation: We have a subject character string and a starting position.

If we check for a match of a single character, then if it matches it moves the current position forward by one position.

If we check for a match of an ordered group of patterns with the operator matchAny then it will check each element of the group in sequence and we will have a proliferation of starting positions that will get multiplied by the number of possible matches being advanced by the length of the match.

E.G suppose the group of patterns is { "a" "aba" "ab" "x" "dd" } and the string under examination is: "Dabaxddc" with current position 2 ( counting from 1 ). Then applying matchAny with the previous group we have that "a" mathces "aba" matches and "ab" matches while "x" and "dd" do not match. After having those matches there are 3 starting positions 3 4 5 ( corresponding to "a" "ab" "aba" ).

We may continue our pattern matching by accepting to have more then one starting positions. So now we may continue to the next case under examination and check for a matchAll. matchAll means that all patterns must match sequentially and are applied sequentially. subcases of matchAll are match0+ match1+ etc.

I have to add that the same fact to try to ask the question has already helped me and cleared me out some things. But I would like to know of similar approaches in order to study them. Please only languages used by you and not bibliography !!!

2
The branch of mathematics (computer science?) you're looking for is formal language theory and automata theory. Your description sounds like a regular language indeed, would you mind to elaborate why you think they are not enough? If they really are, you might want to look into "parser generators".Bergi
I am working alone, and I have no knowledge more than what I had 30 years ago ( I am resuming some old knowledge ). Things have changed a lot and surelly somebody knows more. I know neither what means formal language theory neither automata theory. At my time, computer science wasn't a university discipline. Some practical example of a language that has a flexible pattern matchig would suffice, but I need no just theoretical knowledge, but practical and used one.George Kourtis
You've asked for mathematical theories that help you with your task, and I've given you their names - I won't explain them to you :-) Look them up in your favourite encyclopedia and come back with a more explicit question, please. Btw, the Chomsky hierarchy is way older than 30 years, only the field wasn't called "theoretical computer science" back then.Bergi
I looked in wikipedia, and formal theory is what I try to implement for pattern recognition, while I use automata for the interpretation of input - not in patterns but generally. But that doesn't helps me a lot in pattern matching. I need to know if there is a language that has the possibility given a composition of patterns to tell me if that is matched or no ! Thanks any way for the links !George Kourtis
As I said, I think regular languages (represented by regular expressions) seem to fit your problem - and you can compose them in various ways. If you don't want to reinvent the wheel and look for existing implementations, you would need to give more details about your actual problem, your environment, your programming language.Bergi

2 Answers

1
votes

I suggest you have a look at the paper "Parsing Permutation Phrases". It deals with recognizing a set of things in any order where the "things" can be recognizers themselves. The presentation in the paper might be a little different than what you expect; they don't compile to finite automaton. However, they do give an implementation in a functional language and that should be helpful to you.

1
votes

Your description of matching strings against patterns is exactly what a compiler does. In particular, your description of multiple potential matches is highly reminiscent of the way an LR parser works.

If the patterns are static and can be described by an EBNF, then you could use an LR parser generator (such as YACC) to generate a recogniser.

If the patterns are dynamic but can still be formulated as EBNF there are other tools that can be applied. It just gets a bit more complicated.

[In Australia at least, Computer Science was a University course in 1975, when I did mine. YACC dates from around 1970 in its original form. EBNF is even older.]