
Automated Test: db-cleanup-optimized #333

Closed
wants to merge 1 commit into from

Conversation

admin-coderabbit
Owner

@admin-coderabbit admin-coderabbit commented Feb 4, 2026

This pull request was automatically created by @coderabbitai/e2e-reviewer.

Batch created pull request.

Summary by CodeRabbit

Release Notes

  • Bug Fixes

    • Fixed potential database deadlocks that could occur during annotation cleanup operations on certain database systems.
  • Performance

    • Increased annotation cleanup frequency by shortening the interval from 10 minutes to 1 minute for more responsive database management.
    • Optimized cleanup operations with efficient batching to reduce performance impact.

… deadlocks on MySQL (#80329)

* Split subquery when cleaning annotations

* update comment

* Raise batch size, now that we pay attention to it

* Iterate in batches

* Separate cancellable batch implementation to allow for multi-statement callbacks, add overload for single-statement use

* Use split-out utility in outer batching loop so it respects context cancellation

* guard against empty queries

* Use SQL parameters

* Use same approach for tags

* drop unused function

* Work around parameter limit on sqlite for large batches

* Bulk insert test data in DB

* Refactor test to customise test data creation

* Add test for catching SQLITE_MAX_VARIABLE_NUMBER limit

* Turn annotation cleanup test to integration tests

* lint

---------

Co-authored-by: Sofia Papagiannaki <1632407+papagian@users.noreply.github.com>
@coderabbit-eval

coderabbit-eval bot commented Feb 4, 2026

📝 Walkthrough

Walkthrough

The annotation cleanup system transitions from direct batch deletion to incremental batched deletion with retry logic to prevent MySQL deadlocks. Integration tests are restructured with new test table fields and batched database operations. The cleanup service ticker frequency increases from 10 to 1 minute.
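As a rough illustration of the pattern described here, the sketch below shows a batch-until-done loop in Go. The name `untilDoneOrCancelled` and the `func() (int64, error)` callback shape come from the diff quoted later in this review; everything else (the stand-in batch body, the 1,000-row batch size) is an assumption for illustration, not the PR's implementation.

```go
package main

import (
	"context"
	"fmt"
)

// untilDoneOrCancelled repeatedly runs one batch of work until the batch
// affects no rows, returns an error, or the context is cancelled.
func untilDoneOrCancelled(ctx context.Context, batch func() (int64, error)) (int64, error) {
	var total int64
	for {
		select {
		case <-ctx.Done():
			return total, ctx.Err()
		default:
		}

		affected, err := batch()
		total += affected
		if err != nil {
			return total, err
		}
		if affected == 0 {
			// Nothing left to delete; each batch ran as its own statement,
			// so locks could be released between iterations.
			return total, nil
		}
	}
}

func main() {
	remaining := int64(2500) // pretend 2500 stale annotations exist

	total, err := untilDoneOrCancelled(context.Background(), func() (int64, error) {
		// Stand-in for fetchIDs + deleteByIDs with a batch size of 1000.
		n := remaining
		if n > 1000 {
			n = 1000
		}
		remaining -= n
		return n, nil
	})
	fmt.Println(total, err) // 2500 <nil>
}
```

Keeping each delete to a bounded batch is what lets concurrent inserts proceed between iterations instead of contending with one long-running statement.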

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Annotation Cleanup Logic<br>`pkg/services/annotations/annotationsimpl/xorm_store.go` | Replaced single-statement batched deletes with an incremental batch-based deletion loop (`untilDoneOrCancelled`). Introduced helper functions `fetchIDs`, `deleteByIDs`, `asAny` to load and delete by ID batches. Added SQLite parameter limit handling via inline ID insertion. Added `migrator` import for database type detection. |
| Integration Tests<br>`pkg/services/annotations/annotationsimpl/cleanup_test.go` | Renamed tests to integration variants and added short-mode guards. Expanded the test table with `createAnnotationsNum`, `createOldAnnotationsNum`, `annotationCleanupJobBatchSize` fields. Replaced single-record inserts with batched `InsertMulti` operations for annotations and tags. Added per-test cleanup via a `t.Cleanup` block for table deletion. |
| Cleanup Service Configuration<br>`pkg/services/cleanup/cleanup.go` | Reduced the cleanup ticker interval from 10 minutes to 1 minute, increasing cleanup invocation frequency. |
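For orientation, here is a hedged sketch of what load-then-delete batching of the kind summarized above can look like. The helper names `fetchIDs`, `deleteByIDs`, and the `asAny` conversion come from the summary, but the signatures, the plain `database/sql` handle, and the `?` placeholders (MySQL/SQLite style) are illustrative assumptions rather than the PR's code.

```go
package cleanup

import (
	"context"
	"database/sql"
	"fmt"
	"strings"
)

// fetchIDs loads one batch of IDs matching cond, which is expected to embed
// its own ORDER BY / LIMIT clause.
func fetchIDs(ctx context.Context, db *sql.DB, table, cond string) ([]int64, error) {
	rows, err := db.QueryContext(ctx, fmt.Sprintf("SELECT id FROM %s WHERE %s", table, cond))
	if err != nil {
		return nil, err
	}
	defer rows.Close()

	var ids []int64
	for rows.Next() {
		var id int64
		if err := rows.Scan(&id); err != nil {
			return nil, err
		}
		ids = append(ids, id)
	}
	return ids, rows.Err()
}

// deleteByIDs deletes the given IDs with a parameterised IN clause and returns
// the number of rows affected. Guarding against empty input avoids "IN ()".
func deleteByIDs(ctx context.Context, db *sql.DB, table string, ids []int64) (int64, error) {
	if len(ids) == 0 {
		return 0, nil
	}
	placeholders := strings.TrimSuffix(strings.Repeat("?,", len(ids)), ",")
	args := make([]any, len(ids))
	for i, id := range ids {
		args[i] = id // mirrors the asAny conversion mentioned in the summary
	}
	res, err := db.ExecContext(ctx,
		fmt.Sprintf("DELETE FROM %s WHERE id IN (%s)", table, placeholders), args...)
	if err != nil {
		return 0, err
	}
	return res.RowsAffected()
}
```

For SQLite, the PR instead inlines the IDs into the statement, since binding one parameter per ID would run into SQLITE_MAX_VARIABLE_NUMBER for large batches; the parameterised version above is the portable baseline.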

Sequence Diagram

sequenceDiagram
    participant CleanAnnotations
    participant untilDoneOrCancelled
    participant fetchIDs
    participant deleteByIDs
    participant Database
    
    CleanAnnotations->>untilDoneOrCancelled: Start batch cleanup loop
    loop Until no rows affected or error
        untilDoneOrCancelled->>fetchIDs: Load next batch of IDs
        fetchIDs->>Database: Query IDs with condition limit
        Database-->>fetchIDs: Return ID batch
        fetchIDs-->>untilDoneOrCancelled: IDs loaded
        
        alt IDs found
            untilDoneOrCancelled->>deleteByIDs: Delete batch by IDs
            deleteByIDs->>Database: Execute delete with ID list
            Database-->>deleteByIDs: Rows deleted
            deleteByIDs-->>untilDoneOrCancelled: Affected count
            untilDoneOrCancelled->>untilDoneOrCancelled: Check if more rows remain
        else No IDs found
            untilDoneOrCancelled->>untilDoneOrCancelled: Exit loop
        end
    end
    untilDoneOrCancelled-->>CleanAnnotations: Cleanup complete

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 Hop along, dear cleanup crew,
Batches small, deletions true,
No more locks to cause dismay,
Annotations swept away!

🚥 Pre-merge checks | ✅ 1 | ❌ 1
❌ Failed checks (1 inconclusive)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Title check | ❓ Inconclusive | The title 'Automated Test: db-cleanup-optimized' is vague and does not clearly describe the substantive changes made. It uses generic phrasing that obscures the actual implementation details of the cleanup optimization. | Consider a more specific title that captures the key changes, such as 'Refactor annotation cleanup to use batched incremental deletes' or 'Convert annotation cleanup tests to integration tests and optimize delete operations'. |
✅ Passed checks (1 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch db-cleanup-optimized

Comment @coderabbit eval help to get the list of available commands and usage tips.

@coderabbit-eval coderabbit-eval bot closed this Feb 4, 2026

@coderabbit-eval coderabbit-eval bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@pkg/services/annotations/annotationsimpl/xorm_store.go`:
- Around line 523-559: The cleanup loops call r.log.Error for every batch even
when err==nil and include full ID lists; update the logging in the
untilDoneOrCancelled lambdas (the blocks using fetchIDs and deleteByIDs) to only
use r.log.Error when an actual error occurs (err or y != nil), and use
r.log.Info or r.log.Debug for successful batches (e.g., log count and affected
but omit or redact the full ids list); apply the same change to the orphaned tag
cleanup logs so success paths don’t emit error-level logs or full ID arrays.
🧹 Nitpick comments (1)
pkg/services/annotations/annotationsimpl/xorm_store.go (1)

597-629: Avoid quadratic string concatenation for large SQLite batches.

fmt.Sprintf in a loop is O(n²) and can be costly for large batches (e.g., 32k IDs). A builder avoids repeated allocations.

♻️ Suggested builder-based construction
-		values := fmt.Sprint(ids[0])
-		for _, v := range ids[1:] {
-			values = fmt.Sprintf("%s, %d", values, v)
-		}
+		var b strings.Builder
+		b.Grow(len(ids) * 11) // rough capacity
+		b.WriteString(strconv.FormatInt(ids[0], 10))
+		for _, v := range ids[1:] {
+			b.WriteByte(',')
+			b.WriteString(strconv.FormatInt(v, 10))
+		}
+		values := b.String()
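An equally linear alternative (a sketch under the same assumption that `ids` is a non-empty `[]int64`) formats each ID once and joins the parts; like the builder version, it needs the `strconv` and `strings` imports:

```go
// Build the inline ID list without quadratic reallocation.
parts := make([]string, len(ids))
for i, id := range ids {
	parts[i] = strconv.FormatInt(id, 10)
}
values := strings.Join(parts, ", ")
```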

Comment on lines +523 to +559
		// Single-statement approaches, specifically ones using batched sub-queries, seem to deadlock with concurrent inserts on MySQL.
		// We have a bounded batch size, so work around this by first loading the IDs into memory and allowing any locks to flush inside each batch.
		// This may under-delete when concurrent inserts happen, but any such annotations will simply be cleaned on the next cycle.
		//
		// We execute the following batched operation repeatedly until either we run out of objects, the context is cancelled, or there is an error.
		affected, err := untilDoneOrCancelled(ctx, func() (int64, error) {
			cond := fmt.Sprintf(`%s AND created < %v ORDER BY id DESC %s`, annotationType, cutoffDate, r.db.GetDialect().Limit(r.cfg.AnnotationCleanupJobBatchSize))
			ids, err := r.fetchIDs(ctx, "annotation", cond)
			if err != nil {
				return 0, err
			}
			r.log.Error("Annotations to clean by time", "count", len(ids), "ids", ids, "cond", cond, "err", err)

-			affected, err := r.executeUntilDoneOrCancelled(ctx, sql)
			x, y := r.deleteByIDs(ctx, "annotation", ids)
			r.log.Error("cleaned annotations by time", "count", len(ids), "affected", x, "err", y)
			return x, y
		})
		totalAffected += affected
		if err != nil {
			return totalAffected, err
		}
	}

	if cfg.MaxCount > 0 {
-		deleteQuery := `DELETE FROM annotation WHERE id IN (SELECT id FROM (SELECT id FROM annotation WHERE %s ORDER BY id DESC %s) a)`
-		sql := fmt.Sprintf(deleteQuery, annotationType, r.db.GetDialect().LimitOffset(r.cfg.AnnotationCleanupJobBatchSize, cfg.MaxCount))
-		affected, err := r.executeUntilDoneOrCancelled(ctx, sql)
		// Similar strategy as the above cleanup process, to avoid deadlocks.
		affected, err := untilDoneOrCancelled(ctx, func() (int64, error) {
			cond := fmt.Sprintf(`%s ORDER BY id DESC %s`, annotationType, r.db.GetDialect().LimitOffset(r.cfg.AnnotationCleanupJobBatchSize, cfg.MaxCount))
			ids, err := r.fetchIDs(ctx, "annotation", cond)
			if err != nil {
				return 0, err
			}
			r.log.Error("Annotations to clean by count", "count", len(ids), "ids", ids, "cond", cond, "err", err)

			x, y := r.deleteByIDs(ctx, "annotation", ids)
			r.log.Error("cleaned annotations by count", "count", len(ids), "affected", x, "err", y)
			return x, y
		})


⚠️ Potential issue | 🟡 Minor

Use debug/info logs for successful cleanup batches.

These paths log at Error level even when err is nil and include full ID lists, which can spam error logs and inflate log volume. Consider Debug/Info for success and only emit Error on failures (apply the same pattern to the orphaned tag cleanup logs).

💡 Suggested logging adjustments
-			r.log.Error("Annotations to clean by time", "count", len(ids), "ids", ids, "cond", cond, "err", err)
+			r.log.Debug("Annotations to clean by time", "count", len(ids), "cond", cond)

-			r.log.Error("cleaned annotations by time", "count", len(ids), "affected", x, "err", y)
+			if y != nil {
+				r.log.Error("Failed to clean annotations by time", "error", y)
+			} else {
+				r.log.Debug("Cleaned annotations by time", "affected", x)
+			}

Also applies to: 569-580
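One way to keep the three cleanup loops consistent would be a small logging helper along these lines. This is only a sketch: the receiver type, method name, and field names are hypothetical, not taken from the PR.

```go
// logBatch is a hypothetical helper: error level only on failure, debug on
// success, and never the full ID slice. Assumes r.log follows Grafana's
// key/value logger interface (Debug/Error taking alternating keys and values).
func (r *xormRepositoryImpl) logBatch(msg string, count int, affected int64, err error) {
	if err != nil {
		r.log.Error("failed to "+msg, "count", count, "error", err)
		return
	}
	r.log.Debug(msg, "count", count, "affected", affected)
}
```

Each lambda could then end with `x, y := r.deleteByIDs(ctx, "annotation", ids); r.logBatch("clean annotations by time", len(ids), x, y); return x, y`, and the orphaned-tag loop would reuse the same helper.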

🤖 Prompt for AI Agents
In `@pkg/services/annotations/annotationsimpl/xorm_store.go` around lines 523 -
559, The cleanup loops call r.log.Error for every batch even when err==nil and
include full ID lists; update the logging in the untilDoneOrCancelled lambdas
(the blocks using fetchIDs and deleteByIDs) to only use r.log.Error when an
actual error occurs (err or y != nil), and use r.log.Info or r.log.Debug for
successful batches (e.g., log count and affected but omit or redact the full ids
list); apply the same change to the orphaned tag cleanup logs so success paths
don’t emit error-level logs or full ID arrays.


