I have a choice between two ways of scanning through a key level in a large global array, and I'm trying to figure out whether one method is more efficient than the other.
This is a vendor-supplied application and database on the InterSystems Caché database platform. It is written in the old MUMPS style and does not use any of Caché's object persistence features: all data is stored directly in globals, and any indexes are application-maintained.
There is a common convention for repeating data elements attached to entities: the first record contains a count of the child records, and each child record is then numbered sequentially at the next key level. For example:
^GBDATA(12345,100)="3"
^GBDATA(12345,100,1)="A^Record"
^GBDATA(12345,100,2)="B^Record"
^GBDATA(12345,100,3)="C^Record"
Where "12345" is the entity key, and "100" is one of the attached detail types. Note that the first "100" record with no other keys has the count of subrecords. There could be anywhere between 0 and hundreds of subrecords attached. The entities are often very wide and there is a lot of other data besides this subrecord type (not shown in example).
Given an entity key, I want to scan through all the subrecords of one type. Would it be faster to use $ORDER to walk the subkeys, or to use a counted FOR loop to generate the key values from the stored count? Does it matter?
$ORDER method:
SET EKEY=12345
SET SEQ=""
FOR
{
    ; three-argument $ORDER returns the next subscript in SEQ and, when one
    ; exists, that node's value in ROWDATA, avoiding a second global reference
    SET SEQ=$ORDER(^GBDATA(EKEY,100,SEQ),1,ROWDATA)
    QUIT:SEQ=""
    WRITE ROWDATA,!
}
FOR count method:
SET EKEY=12345
; the unsubscripted count node drives the loop; $GET guards against it being missing
SET LIM=+$GET(^GBDATA(EKEY,100))
FOR SEQ=1:1:LIM
{
    ; assumes every sequence number 1..LIM exists; a gap would throw <UNDEFINED>
    WRITE ^GBDATA(EKEY,100,SEQ),!
}
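One behavioral difference worth noting before the performance question: $ORDER visits whatever child nodes actually exist, while the counted FOR trusts the application-maintained count node. Since nothing in the database enforces that count, a quick cross-check (a sketch against the same structure) can confirm the two loops would even visit the same nodes:

SET EKEY=12345
SET LIM=+$GET(^GBDATA(EKEY,100))
SET SEQ="",FOUND=0
FOR
{
    SET SEQ=$ORDER(^GBDATA(EKEY,100,SEQ))
    QUIT:SEQ=""
    SET FOUND=FOUND+1
}
IF FOUND'=LIM WRITE "count node says ",LIM," but found ",FOUND," children",!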
Does anyone know how $ORDER and $GET are implemented internally in Caché?
I'm having trouble testing this empirically: we have only one production instance with appropriate data, and I can't take it offline to clear the cache. I'm most interested in from-disk performance, as opposed to from-cache performance.
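For what it's worth, a rough wall-clock comparison is easy to sketch with $ZHOROLOG, with the caveat that after the first pass the blocks will sit in the global buffer pool, so repeated runs measure exactly the from-cache case rather than the from-disk case I'm asking about:

SET EKEY=12345
; time the $ORDER traversal
SET START=$ZHOROLOG
SET SEQ=""
FOR
{
    SET SEQ=$ORDER(^GBDATA(EKEY,100,SEQ),1,ROWDATA)
    QUIT:SEQ=""
}
WRITE "$ORDER loop: ",$ZHOROLOG-START," s",!
; time the counted FOR traversal
SET START=$ZHOROLOG
SET LIM=+$GET(^GBDATA(EKEY,100))
FOR SEQ=1:1:LIM
{
    SET ROWDATA=$GET(^GBDATA(EKEY,100,SEQ))
}
WRITE "counted FOR: ",$ZHOROLOG-START," s",!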