3
votes

I have been following the tutorial regarding the Google Search API at https://developers.google.com/appengine/docs/java/search/overview. The information I have found is very clear on how to build the document and load it into an index. What I am not sure of is how to load the datastore data into the document.

What I am trying to achieve is a simple %LIKE% query on a few fields. For example, I am working on a music library. If the user types in "glory", then I would like to use the Search API to return all entities with "glory" somewhere in the title. I have implemented the "starts with" workaround by appending "\uFFFD" to the search text; however, I find this insufficient. My users will be very novice, and it would also be helpful if they didn't have to pick a field as in a traditional search. So full text search seems to be the solution.

Here are my questions:

  1. Should each record in my datastore be a document? Or should all the records go into one document? My datastore has a fairly fixed size of only 1000 records. Could anyone provide an example of the correct method?

  2. I would like to return the entire datastore entity (it's only 8 fields) as an Iterable of the type of my entity. Do we specify each field we need to return? The example just says:

    for (ScoredDocument scoredDocument : results) {
        // process scoredDocument
    }

Does anybody have an example of what comes out of the stored document? Is it exactly what we put in, or must you identify each field again? Or an example of processing a ScoredDocument to return datastore entities?

If anybody could help fill in these blanks for me, I would appreciate it.

Thank you for looking at this with me.

1
Hi @user2353180, I am trying to do something similar and looking for an example. If you could share your findings or solution, in case you have figured it out already, it would be very helpful for me and others looking for a similar solution. Thanks in advance. - MemoryLeak

1 Answer

0
votes

What I am trying to achieve is a simple %LIKE% query on a few fields

To achieve this you need to "tokenize" each record's name. GAE provides full text search, which matches whole words, so to get partial matches you need to add the partial strings for every record yourself:

If your record's name is "Glory" you should add the tokens for "G","Gl","Glo","Glor","y","ry","ory","lory"

Here's a very basic implementation I use to provide partial search results (only for "starts with"; it does not implement "ends with"):

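// Assumes the enclosing class declares int fields startTokenIndex and
// lastTokenIndex, plus an addField(Field.Builder) overload.
public void addField(String name, String value, boolean tokenize) {
    // Index the full value under the requested field name.
    addField(Field.newBuilder().setName(name).setText(value));
    if (tokenize) {
        // Add each proper prefix of the value as its own uniquely named field.
        // startTokenIndex should be at least 1, otherwise the first token is empty.
        for (int i = startTokenIndex; i < value.length(); i++) {
            addField(Field.newBuilder().setName("token" + (lastTokenIndex++))
                    .setText(value.substring(0, i)));
        }
    }
}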
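For reference, here is a minimal standalone sketch of the token list described above for "Glory", covering both the "starts with" prefixes and the "ends with" suffixes. It uses only the Java standard library; the class and method names (PartialMatchTokens, prefixAndSuffixTokens) are hypothetical and not part of the Search API. Each returned token would still have to be added to the document as a text field.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper, not part of the App Engine Search API.
final class PartialMatchTokens {
    private PartialMatchTokens() {}

    /**
     * Returns every proper prefix and proper suffix of value.
     * The full value itself is excluded because it is indexed separately.
     */
    static Set<String> prefixAndSuffixTokens(String value) {
        Set<String> tokens = new LinkedHashSet<>();
        // Proper prefixes: "G", "Gl", "Glo", "Glor" for "Glory".
        for (int end = 1; end < value.length(); end++) {
            tokens.add(value.substring(0, end));
        }
        // Proper suffixes: "lory", "ory", "ry", "y" for "Glory".
        for (int start = 1; start < value.length(); start++) {
            tokens.add(value.substring(start));
        }
        return tokens;
    }
}
```

Calling PartialMatchTokens.prefixAndSuffixTokens("Glory") yields the eight tokens listed earlier. Note that prefixes and suffixes alone do not cover a fragment from the middle of a word such as "lor"; matching that would require indexing every substring.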

Should each record in my datastore be a document?

Yes. You could even use the entity's datastore ID as the document ID for quick matching (or you can just add it as a separate field).

I would like to return the entire datastore entity (it's only 8 fields) as an Iterable of the type of my entity. Do we specify each field we need to return?

You need to store your entity's ID in your document, that way when your search returns a set of documents you just retrieve all entities with those IDs.

Does anybody have an example of what comes out of the stored document? Is it exactly what we put in or must you identify each field again? Or an example of processes a ScoredDocument returning a datastore entities?

Documents return all the fields you stored in them, plus metadata such as the score and the document ID. The "processing" in your case would consist of getting the entity ID from the Document.
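The lookup step can be sketched as follows. To keep the example self-contained it makes no App Engine calls: the list of document IDs stands in for the IDs read from each ScoredDocument in the search results, and the map stands in for a batch fetch from the datastore. The class and method names are hypothetical, and the sketch assumes the document ID is the entity's numeric datastore ID stored as a string.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper; the Map is a stand-in for the datastore.
final class SearchHitResolver {
    private SearchHitResolver() {}

    /**
     * Maps document IDs (in search-result order) to entities.
     * IDs with no matching entity are skipped, which can happen
     * if the index is out of date with the datastore.
     */
    static <T> List<T> resolve(List<String> documentIds, Map<Long, T> entitiesById) {
        List<T> entities = new ArrayList<>();
        for (String documentId : documentIds) {
            // Assumption: the document ID is the numeric datastore ID as text.
            T entity = entitiesById.get(Long.parseLong(documentId));
            if (entity != null) {
                entities.add(entity);
            }
        }
        return entities;
    }
}
```

The returned List is an Iterable of your entity type, ordered the same way the search results were.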

If you are certain your records won't grow above 1000, you could store virtually everything within your search index. Just bear in mind that the index is not designed for that and will face some serious limitations when scaling, which the datastore obviously doesn't.