We've got an Oracle 11g installation that is starting to get big. This database is the backend to a parallel optimization system running on a cluster. Input to the process is contained in the database along with output from the optimization steps. The input includes rote configuration data and some binary files (using 11g's SecureFiles). The output includes 1D, 2D, 3D, and 4D data currently stored in the DB.
DB Structure:
/* Metadata tables */
Case(CaseId, DeleteFlag, ...) On Delete Cascade CaseId
OptimizationRun(OptId, CaseId, ...) On Delete Cascade OptId
OptimizationStep(StepId, OptId, ...) On Delete Cascade StepId
/* Data tables */
Files(FileId, CaseId, Blob) /* deletes are near instantaneous here */
/* Data per run */
OnedDataX(OptId, ...)
TwoDDataY1(OptId, ...) /* packed representation of a 1D slice */
/* Data not only per run, but per step */
TwoDDataY2(StepId, ...) /* packed representation of a 1D slice */
ThreeDDataZ(StepId, ...) /* packed representation of a 2D slice */
FourDDataZ(StepId, ...) /* packed representation of a 3D slice */
/* ... About 10 or so of these tables exist */
A reaper script runs daily, looks for cases with DeleteFlag = 1, and issues DELETE FROM Case WHERE DeleteFlag = 1, letting the cascades do the rest.
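To make the reaper concrete, here is a minimal sketch of it as a PL/SQL block. The only statement taken from the description above is the DELETE against Case; committing once per case is my assumption, shown because it bounds the size of each transaction. It does not reduce the total redo generated, so it would not by itself fix the archiver problem.

```sql
-- Sketch only: a per-case variant of the daily reaper.
-- Assumption: one transaction per case; the real script may issue a single
-- DELETE FROM Case WHERE DeleteFlag = 1 instead.
DECLARE
  TYPE id_list IS TABLE OF Case.CaseId%TYPE;
  ids id_list;
BEGIN
  -- Collect the flagged cases first so no cursor is fetched across commits.
  SELECT CaseId BULK COLLECT INTO ids
  FROM   Case
  WHERE  DeleteFlag = 1;

  FOR i IN 1 .. ids.COUNT LOOP
    -- ON DELETE CASCADE removes the dependent run, step, and data rows.
    DELETE FROM Case WHERE CaseId = ids(i);
    COMMIT;
  END LOOP;
END;
/
```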
This strategy works great for read/write, but is now outstripping our capabilities when we want to purge data! The rub is that deleting a Case takes ~20-40 minutes, depending on its size, and often overloads our archiver space. The next major version of the product will take a "from the ground up" approach to solving the problem. The next minor release needs to stay within the confines of data stored in the database.
So, for the minor release we need an approach that can improve delete performance and at most require moderate changes to the database.
- REF Partitioning, but the question is HOW? I would love to do INTERVAL on Case and REF on the rest, but that isn't supported. Is there some way to manually partition OptimizationRun by CaseId through a trigger?
- Disable archiving/redo logs for deletes? Couldn't find a HINT to go with this one. Not sure it is even feasible.
- Truncate? This likely would need some sorta complicated table setup. But maybe I'm not considering all of my options. (per answer, stricken)
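To show what the REF partitioning option might look like without INTERVAL, here is a rough sketch that LIST-partitions Case with one partition per case and REF-partitions the children. The column types, constraint names, and partition names are all hypothetical, and the FK columns are assumed to be NOT NULL, which reference partitioning requires. My understanding is that partition maintenance on the parent carries over to reference-partitioned children in 11g, but I have not verified the DROP PARTITION behaviour against this schema, so treat it as something to test rather than a confirmed answer.

```sql
-- Sketch only: names and types are illustrative, not the real schema.
CREATE TABLE Case (
  CaseId     NUMBER    NOT NULL,
  DeleteFlag NUMBER(1) DEFAULT 0 NOT NULL,
  CONSTRAINT Case_pk PRIMARY KEY (CaseId)
)
PARTITION BY LIST (CaseId) (
  PARTITION case_1 VALUES (1)
);

-- Child partitioned by its FK to Case: rows land in the partition of their parent row.
CREATE TABLE OptimizationRun (
  OptId  NUMBER NOT NULL,
  CaseId NUMBER NOT NULL,
  CONSTRAINT OptRun_pk PRIMARY KEY (OptId),
  CONSTRAINT OptRun_Case_fk FOREIGN KEY (CaseId)
    REFERENCES Case (CaseId) ON DELETE CASCADE
)
PARTITION BY REFERENCE (OptRun_Case_fk);

-- Grandchild partitioned by its FK to the reference-partitioned child.
CREATE TABLE OptimizationStep (
  StepId NUMBER NOT NULL,
  OptId  NUMBER NOT NULL,
  CONSTRAINT OptStep_pk PRIMARY KEY (StepId),
  CONSTRAINT OptStep_Run_fk FOREIGN KEY (OptId)
    REFERENCES OptimizationRun (OptId) ON DELETE CASCADE
)
PARTITION BY REFERENCE (OptStep_Run_fk);

-- Creating a case: add its partition explicitly before inserting the row
-- (this is the manual step that INTERVAL would otherwise have handled).
ALTER TABLE Case ADD PARTITION case_2 VALUES (2);

-- Purging a case: drop its partition instead of deleting row by row.
-- Assumption: the drop carries over to the children's matching partitions.
ALTER TABLE Case DROP PARTITION case_2 UPDATE GLOBAL INDEXES;
```

The data tables (TwoDDataY2, ThreeDDataZ, and so on) would follow the same pattern, each partitioned by reference on its FK to OptimizationRun or OptimizationStep.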
To help illustrate the issue, the data in question per case ranges from 15MiB to 1.5GiB with anywhere from 20k to 2M rows.
Update: Current size of the DB is ~1.5TB.
If the hierarchy is Case -> OptimizationRun -> OptimizationStep as it appears, why do some tables carry both? If using DELETE CASCADE, you're running the delete against those 12/13 tables twice; once for the OptID key, and again for the StepID key! If one of those foreign keys isn't indexed, the performance would be even worse! - Adam Musch
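One way to check the indexing point in the comment above is to query the data dictionary for foreign key columns that have no index column in the matching position. This is a simplified sketch: it is adequate for single-column foreign keys like the ones in this schema, but for composite keys it does not confirm that all columns sit in the same index. The index name in the final statement is hypothetical.

```sql
-- Sketch only: list FK columns in the current schema with no index column
-- at the same position. Simplified; see the caveat about composite keys.
SELECT c.table_name, c.constraint_name, cc.column_name
FROM   user_constraints c
JOIN   user_cons_columns cc
       ON cc.constraint_name = c.constraint_name
WHERE  c.constraint_type = 'R'
AND    NOT EXISTS (
         SELECT 1
         FROM   user_ind_columns ic
         WHERE  ic.table_name      = cc.table_name
         AND    ic.column_name     = cc.column_name
         AND    ic.column_position = cc.position
       )
ORDER  BY c.table_name, c.constraint_name, cc.position;

-- If a data table shows up, an index on its FK column is the usual remedy.
-- Hypothetical index name:
CREATE INDEX TwoDDataY2_StepId_ix ON TwoDDataY2 (StepId);
```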