0
votes

I would like to know which statement (see below) will be more efficient for determining the size of a Cluster Table. Or at least determine, whether the table size reaches a certain threshhold {n}. Efficiency meaning using less PSAPTEMP tablespace.

The problem with Cluster Tables is, that in order to get an entry for a table the fields of one entry need to be looked up in several tables of the Cluster where they are dispersed. Thus, more than just the counted table need to be looked at. So for every entry several entries need to be looked up. This makes it inefficient for reads and this can make it dump because the COUNT uses an INT datatype that can overflow.

SELECT COUNT(*) 
  ... 
  UP TO {n} rows.


SELECT *
  ...
  UP TO {n} ROWS.
ENDSELECT.             `and then determine the size of the result. `

To me they seem equivalent, but maybe they are not when using a threshold. Maybe the limitation makes a difference depending how the data is read. EDIT: Of course, SELECT .. ENDSELECT is a loop and thus less efficient principally.

But I would like to know how it actually works under the hood and understand the difference better. So far it seems like I will have to try it out. I assume the database will differ but will most often be Oracle.

3
I edited your question to highlight that it concerns the Cluster Table, that's very special in SAP world. Please correct me if I understood incorrectly. - Sandra Rossi
I'd say that concerning Cluster Tables, you're correct, but if it's just about ABAP, you should use COUNT just because you should use the statement that corresponds to what you need - I don't see the interest of writing some extra code just to obfuscate your intention. Note that cluster tables have been deprecated since ABAP 7.53. - Sandra Rossi
you need table size in rows or in megabytes? if the latter it cannot be determined by ABAP, basis tools needed, if the former, only COUNT should be used. Your task is not clear - Suncatcher
Both would be ok. But why should only COUNT be used? - Ela
If you just need to count, why would you use something else than COUNT? If you want to paint a wall, why would you use a toothbrush rather than a paintbrush? - Sandra Rossi

3 Answers

0
votes

Basically Select Endselect will run a loop and there will be multiple trips to DB Server.

Technically select SELECT COUNT(*) will perform all the data on the DB server itself and in one shot.

After which you can simply put the data in an internal table and work on the same.

As per the standards, this is not at all recommended even for normal transparent tables leave aside Cluster tables.

Access to Cluster tables is very expensive. Also, to make the matter worse you cannot use any indexes on Cluster tables. Its always better to provide as much data in the where clause as possible.

The priority is always given to fetch the data in one shot from the Database Server using

select * from table into table where ....

and then loop on it on the local server.

Specifically in your use case It will be fastest if you will be using count(*) and not select endselect.

Certified SAP ABAP Consultant

0
votes

We could not really create the test environment we needed. So no final answer. But some learnings:

  • Reading the data from cluster tables should be done based on a full primary key sequence (Should be accessed via primary key - very fast retrieval otherwise very slow)
  • There are no secondary indexes
  • Select * is Ok because all columns are retrieved anyways. Performing an operation on multiple rows is more efficient than single row operations. -> Therefore you still want to select into an internal table.
  • If many rows are being selected into the internal table, you might still like to retrieve specific columns to cut down on the memory required.
  • There is a way to convert cluster to transparent but with downtime and this no way for us
  • Aggreate SQL functions (SUM, AVG, MIN, MAX, etc) are not supported
0
votes

Using native SQL with COUNT BIG instead of COUNT can make it not memory efficient but prevent it from dumping due to a counter overflow.