6
votes

I am trying my hands on SWI-Prolog in win xp. I am trying to understand how to split a sentence in Prolog into separate atoms.

Ex : Say I have a sentence like this :

"this is a string"
Is there any way to get individual words to get stored in a variable?

like :

X = this
Y = is
....
and so forth.

Can anyone please explain how this works?

Thanks.

3

3 Answers

13
votes

I would use atomic_list_concat/3. See

http://www.swi-prolog.org/pldoc/man?predicate=atomic_list_concat%2F3

Normally it is meant to insert a separator but because of Prolog's bidirectionality of unification, it can also be used to split a string given the separator:

atomic_list_concat(L,' ', 'This is a string').
L = ['This',is,a,string]

Of course once the split is done you can play with the elements of the list L.

5
votes

I like the answer of 'pat fats', but you have to convert your string to atom before:

..., atom_codes(Atom, String), atomic_list_concat(L, ' ', Atom), ...

If you need to work directly with strings, I have this code in my 'arsenal':

%%  split input on Sep
%
%   minimal implementation
%
splitter(Sep, [Chunk|R]) -->
    string(Chunk),
    (   Sep -> !, splitter(Sep, R)
    ;   [], {R = []}
    ).

being a DCG, must be called in this way:

?- phrase(splitter(" ", L), "this is a string"), maplist(atom_codes, As, L).
L = [[116, 104, 105, 115], [105, 115], [97], [115, 116, 114, 105, 110|...]],
As = [this, is, a, string] .

edit: more explanation

I forgot to explain how that works: DCG are well explained by @larsman, in this other answer. I cite him

-->, which actually adds two hidden arguments to it. The first of these is a list to be parsed by the grammar rule; the second is "what's left" after the parse. c(F,X,[]) calls c on the list X to obtain a result F, expecting [] to be left, i.e. the parser should consume the entire list X.

Here I have 2 arguments, the first it's the separator, the second the list being built. The builtin string//1 come from SWI-Prolog library(http/dcg_basics). It's a very handy building block, that match literally anything on backtracking. Here it's 'eating' each char before the separator or the end-of-string. Having done that, we can recurse...

-5
votes

?-split("this is a string"," ", Out).

Out=["this","is","a"," string"]