0
votes

I'm using Alfresco 5.1 Community Edition with Solr4 configured as the search service and transaction queries configured as hybrid (Solr & DB).

When I run a search in the Solr GUI at the URL below,

Solr Query GUI: https://localhost:8443/solr4/#/alfresco/query

I get the search results in the format below, with an ID and some other info.

Solr Search Result (Results JSON truncated for readability)

{
  "responseHeader": {
    "status": 0,
    "QTime": 25,
    "params": {
      "q": "testing",
      "defType": "dismax",
      "qt": "",
      "indent": "true",
      "wt": "json",
      "_": "1476349027637"
    }
  },
  ...
    "docs": [
      {
        "id": "_DEFAULT_!8000000000000040!80000000000008e3",
        "_version_": 0,
        "DBID": 2275
      },
      {
        "id": "_DEFAULT_!8000000000000072!8000000000000902",
        "_version_": 0,
        "DBID": 2306
      },
      {
        "id": "_DEFAULT_!8000000000000040!80000000000008ea",
        "_version_": 0,
        "DBID": 2282
      },
      {
        "id": "_DEFAULT_!800000000000000b!80000000000008ef",
        "_version_": 0,
        "DBID": 2287
      },
      {
        "id": "_DEFAULT_!8000000000000071!80000000000008f0",
        "_version_": 0,
        "DBID": 2288
      },
      {
        "id": "_DEFAULT_!8000000000000025!80000000000008eb",
        "_version_": 0,
        "DBID": 2283
      }
    ]
  },
  "processedDenies": false
}

I'm trying to build a UI that displays these search results and lets a user click through to retrieve the corresponding document from Alfresco. Below is the Alfresco API I use to retrieve content from Alfresco.

Alfresco API URL to open a Document : http://localhost:8080/alfresco/api/-default-/public/cmis/versions/1.1/atom/content?id=

A sample Alfresco document ID looks like the one shown below. No such ID is returned in the Solr4 search results.

Sample Document Id:

7edf97f4-43cf-4fe5-8099-85608776d159
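
For illustration, here is how the content URL above combines with a document ID. This is only a minimal Python sketch: the constant and function names are hypothetical, and only the URL and the sample ID come from this post.

```python
from urllib.parse import urlencode

# Base URL copied from the post; host and port are those of a local install.
CMIS_CONTENT_URL = (
    "http://localhost:8080/alfresco/api/-default-/public/cmis/versions/1.1/atom/content"
)


def content_url(node_id):
    """Return the CMIS content URL for a document ID (hypothetical helper)."""
    query = urlencode({"id": node_id})  # URL-encodes the ID as the "id" parameter
    return CMIS_CONTENT_URL + "?" + query


print(content_url("7edf97f4-43cf-4fe5-8099-85608776d159"))
```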

Questions:

1) What is the ID returned by Solr4?
2) How do I get the relevant Alfresco document ID from the search result, so that I can retrieve the document?

EDIT:

Some background on why I need to use Solr directly:

Alfresco will be used by internal users (typically business content administrators on the intranet) to create documents based on templates. We have a customer-facing front-end web app with a search section. When users search with some keywords (typically a full-text search), we would invoke the Solr API to search the content of the documents created by the business admins and display the results in the web app. When a user clicks on a search result, the document content would be retrieved from Alfresco and displayed in the web app.

Thanks in advance.

3
Why are you calling SOLR directly? Why not write a Share page (quite possibly using Aikau) and have Alfresco do the talking to SOLR + mapping to documents for you? - Gagravarr
Thanks for the update @Gagravarr. I'm new to Alfresco & I'm learning things here. Any pointers on how to do that? Preferably documentation or an example? - Venkat

3 Answers

3
votes

It would be much easier to implement this as an Alfresco Web Script.

With Web Scripts, you can build your own RESTful interface using lightweight scripting technologies such as JavaScript and FreeMarker.

In a web script you can access the search root object:

search - org.alfresco.repo.jscript.Search - Root object providing access to the various Alfresco search interfaces such as FTS-Alfresco, Lucene, XPath, and Saved Search results

Your REST web script may be available to every user but run as admin:

    <webscript>
      <shortname>My Rest Query</shortname>
      <url>/api/my/query</url>
      <format default="json">argument</format>
      <authentication runas="admin">guest</authentication>
      <transaction allow="readonly">required</transaction>
    </webscript>

There are many tutorials...

2
votes

1) The ID returned by Solr is probably the ID of the indexed document in Solr. You can't use it with Alfresco.

2) It seems that Solr returns the DBID of the nodes. DBID is the property sys:node-dbid from the aspect sys:referenceable, defined in the file systemModel.xml; it holds the database ID of the node. You can build an Alfresco repo web script which takes this DBID as a parameter and returns the document.

But as imagine said, you would be better off asking Alfresco to execute your Solr query directly. It would return a list of documents with all the metadata you need, including the download URL of each document.
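
If you do work from the raw Solr response, pulling the DBIDs out of it takes only a few lines. The following is a rough Python sketch, not a tested integration: the function name extract_dbids is made up, the sample is trimmed from the JSON in the question, and it assumes the docs array sits under a "response" key as in standard Solr JSON output.

```python
import json


def extract_dbids(solr_json):
    """Return the DBID of every doc in a Solr JSON response (hypothetical helper)."""
    payload = json.loads(solr_json)
    # Assumption: docs live under response.docs, as in standard Solr output.
    docs = payload.get("response", {}).get("docs", [])
    return [doc["DBID"] for doc in docs if "DBID" in doc]


# Sample trimmed from the search result shown in the question.
sample = """
{
  "responseHeader": {"status": 0, "QTime": 25},
  "response": {
    "docs": [
      {"id": "_DEFAULT_!8000000000000040!80000000000008e3", "_version_": 0, "DBID": 2275},
      {"id": "_DEFAULT_!8000000000000072!8000000000000902", "_version_": 0, "DBID": 2306}
    ]
  },
  "processedDenies": false
}
"""

print(extract_dbids(sample))
```

Each DBID would then be passed to whatever repo web script you write to resolve it to a node.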

0
votes

Adding a partial answer to your 2nd question because locating this info was hard and took quite some time. (2. How do I get the relevant Alfresco document ID to be able to retrieve the same from the search result ?)

To find the document associated with that DBID, you can use the following search syntax:

  1. Go to Admin Tools -> Node Browser
  2. Change query type to lucene
  3. Enter the following search term: @sys\:node-dbid:THE_DBID_YOU_WANT_TO_FIND

For example, looking at our local solr4 error report:

{
  "responseHeader":{
    "status":0,
    "QTime":0,
    "params":{
      "q":"ERROR*"}},
  "response":{"numFound":2,"start":0,"docs":[
      {
        "id":"_DEFAULT_!800000000000008c!8000000000002289",
        "_version_":0,
        "DBID":4499},
...

To find that document, search for: @sys\:node-dbid:4499

You can add quotes around the numeric DBID - it works with and without them.

The '@' and the first backslash '\' (escaping the first colon) are REQUIRED - the query breaks if these are removed and an error will be logged in catalina.out.

The second colon MUST NOT include a backslash escape - it is NOT an error (nothing in the log) but no result will be found.

If necessary change the search scope from workspace://SpacesStore to archive://SpacesStore to locate docs that have been deleted.

You can join the DBIDs as shown below to find them all at once (at least those in the same spaces store):

@sys\:node-dbid:1234 OR @sys\:node-dbid:2345 OR @sys\:node-dbid:...
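
If you have many DBIDs, the joined query can be generated rather than typed. Here is a small Python sketch that follows the escaping rules described above (leading '@', backslash before the first colon only); the function name build_dbid_query is hypothetical.

```python
def build_dbid_query(dbids):
    """Join DBIDs into one Lucene query for the Node Browser (hypothetical helper)."""
    if not dbids:
        raise ValueError("at least one DBID is required")
    # Keep the leading '@' and escape only the first colon, as noted above.
    terms = ["@sys\\:node-dbid:%d" % int(dbid) for dbid in dbids]
    return " OR ".join(terms)


print(build_dbid_query([1234, 2345]))
```

Converting each value with int() also rejects anything that is not a plain number before it reaches the query string.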