… deadlocks on MySQL (#80329)

* Split subquery when cleaning annotations
* update comment
* Raise batch size, now that we pay attention to it
* Iterate in batches
* Separate cancellable batch implementation to allow for multi-statement callbacks, add overload for single-statement use
* Use split-out utility in outer batching loop so it respects context cancellation
* guard against empty queries
* Use SQL parameters
* Use same approach for tags
* drop unused function
* Work around parameter limit on sqlite for large batches
* Bulk insert test data in DB
* Refactor test to customise test data creation
* Add test for catching SQLITE_MAX_VARIABLE_NUMBER limit
* Turn annotation cleanup test to integration tests
* lint

Co-authored-by: Sofia Papagiannaki <1632407+papagian@users.noreply.github.com>
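The commit list above mentions working around SQLite's bound-parameter limit for large batches. As a hypothetical sketch only (the `deleteStatements` helper, the chunk-by-parameter-count strategy, and the hard-coded limit are illustrative assumptions, not the PR's actual code), one way to keep each parameterized DELETE under the limit is to split the ID list into chunks:

```go
package main

import (
	"fmt"
	"strings"
)

// sqliteMaxVariableNumber is SQLite's historical default cap on bound
// parameters per statement (999 before 3.32.0; newer builds allow 32766).
const sqliteMaxVariableNumber = 999

// deleteStatements splits a large ID list into DELETE statements whose
// placeholder count stays under the limit, returning one (SQL, args) pair
// per chunk.
func deleteStatements(table string, ids []int64) (queries []string, args [][]any) {
	for len(ids) > 0 {
		n := len(ids)
		if n > sqliteMaxVariableNumber {
			n = sqliteMaxVariableNumber
		}
		chunk, rest := ids[:n], ids[n:]
		// Build "?,?,...,?" with exactly n placeholders.
		placeholders := strings.TrimSuffix(strings.Repeat("?,", n), ",")
		queries = append(queries, fmt.Sprintf("DELETE FROM %s WHERE id IN (%s)", table, placeholders))
		chunkArgs := make([]any, n)
		for i, id := range chunk {
			chunkArgs[i] = id
		}
		args = append(args, chunkArgs)
		ids = rest
	}
	return queries, args
}

func main() {
	q, a := deleteStatements("annotation", []int64{1, 2, 3})
	fmt.Println(q[0], len(a[0])) // DELETE FROM annotation WHERE id IN (?,?,?) 3
}
```

Each (query, args) pair can then be handed to `ExecContext` in turn, so no single statement exceeds the parameter budget.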
📝 Walkthrough

The annotation cleanup system transitions from direct batch deletion to incremental batched deletion with retry logic to prevent MySQL deadlocks. Integration tests are restructured with new test table fields and batched database operations. The cleanup service ticker interval is reduced from 10 minutes to 1 minute.
Sequence Diagram

```mermaid
sequenceDiagram
    participant CleanAnnotations
    participant untilDoneOrCancelled
    participant fetchIDs
    participant deleteByIDs
    participant Database

    CleanAnnotations->>untilDoneOrCancelled: Start batch cleanup loop
    loop Until no rows affected or error
        untilDoneOrCancelled->>fetchIDs: Load next batch of IDs
        fetchIDs->>Database: Query IDs with condition limit
        Database-->>fetchIDs: Return ID batch
        fetchIDs-->>untilDoneOrCancelled: IDs loaded
        alt IDs found
            untilDoneOrCancelled->>deleteByIDs: Delete batch by IDs
            deleteByIDs->>Database: Execute delete with ID list
            Database-->>deleteByIDs: Rows deleted
            deleteByIDs-->>untilDoneOrCancelled: Affected count
            untilDoneOrCancelled->>untilDoneOrCancelled: Check if more rows remain
        else No IDs found
            untilDoneOrCancelled->>untilDoneOrCancelled: Exit loop
        end
    end
    untilDoneOrCancelled-->>CleanAnnotations: Cleanup complete
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 1 passed | ❌ 1 failed (inconclusive)
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@pkg/services/annotations/annotationsimpl/xorm_store.go`:
- Around lines 523-559: The cleanup loops call r.log.Error for every batch even when err == nil, and include full ID lists. Update the logging in the untilDoneOrCancelled lambdas (the blocks using fetchIDs and deleteByIDs) to use r.log.Error only when an actual error occurs (err or y != nil), and use r.log.Info or r.log.Debug for successful batches (e.g., log count and affected but omit or redact the full ids list). Apply the same change to the orphaned tag cleanup logs so success paths don't emit error-level logs or full ID arrays.
🧹 Nitpick comments (1)
pkg/services/annotations/annotationsimpl/xorm_store.go (1)
597-629: Avoid quadratic string concatenation for large SQLite batches.
`fmt.Sprintf` in a loop is O(n²) and can be costly for large batches (e.g., 32k IDs). A builder avoids repeated allocations.

♻️ Suggested builder-based construction:

```diff
-	values := fmt.Sprint(ids[0])
-	for _, v := range ids[1:] {
-		values = fmt.Sprintf("%s, %d", values, v)
-	}
+	var b strings.Builder
+	b.Grow(len(ids) * 11) // rough capacity
+	b.WriteString(strconv.FormatInt(ids[0], 10))
+	for _, v := range ids[1:] {
+		b.WriteByte(',')
+		b.WriteString(strconv.FormatInt(v, 10))
+	}
+	values := b.String()
```
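For context, here is a self-contained, runnable version of the builder approach the suggestion sketches (the `joinIDs` helper name is illustrative, not from the PR):

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// joinIDs renders a comma-separated ID list for an SQL IN clause using a
// strings.Builder, avoiding the O(n²) cost of rebuilding the whole string
// with fmt.Sprintf on every iteration.
func joinIDs(ids []int64) string {
	if len(ids) == 0 {
		return ""
	}
	var b strings.Builder
	b.Grow(len(ids) * 11) // rough capacity: up to ~10 digits plus a comma per ID
	b.WriteString(strconv.FormatInt(ids[0], 10))
	for _, v := range ids[1:] {
		b.WriteByte(',')
		b.WriteString(strconv.FormatInt(v, 10))
	}
	return b.String()
}

func main() {
	fmt.Println(joinIDs([]int64{3, 1, 4, 1, 5})) // 3,1,4,1,5
}
```

Each append to the builder is amortized O(1), so the whole join is linear in the number of IDs.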
```diff
-		affected, err := r.executeUntilDoneOrCancelled(ctx, sql)
+		// Single-statement approaches, specifically ones using batched sub-queries, seem to deadlock with concurrent inserts on MySQL.
+		// We have a bounded batch size, so work around this by first loading the IDs into memory and allowing any locks to flush inside each batch.
+		// This may under-delete when concurrent inserts happen, but any such annotations will simply be cleaned on the next cycle.
+		//
+		// We execute the following batched operation repeatedly until either we run out of objects, the context is cancelled, or there is an error.
+		affected, err := untilDoneOrCancelled(ctx, func() (int64, error) {
+			cond := fmt.Sprintf(`%s AND created < %v ORDER BY id DESC %s`, annotationType, cutoffDate, r.db.GetDialect().Limit(r.cfg.AnnotationCleanupJobBatchSize))
+			ids, err := r.fetchIDs(ctx, "annotation", cond)
+			if err != nil {
+				return 0, err
+			}
+			r.log.Error("Annotations to clean by time", "count", len(ids), "ids", ids, "cond", cond, "err", err)
+
+			x, y := r.deleteByIDs(ctx, "annotation", ids)
+			r.log.Error("cleaned annotations by time", "count", len(ids), "affected", x, "err", y)
+			return x, y
+		})
 		totalAffected += affected
 		if err != nil {
 			return totalAffected, err
 		}
 	}

 	if cfg.MaxCount > 0 {
-		deleteQuery := `DELETE FROM annotation WHERE id IN (SELECT id FROM (SELECT id FROM annotation WHERE %s ORDER BY id DESC %s) a)`
-		sql := fmt.Sprintf(deleteQuery, annotationType, r.db.GetDialect().LimitOffset(r.cfg.AnnotationCleanupJobBatchSize, cfg.MaxCount))
-		affected, err := r.executeUntilDoneOrCancelled(ctx, sql)
+		// Similar strategy as the above cleanup process, to avoid deadlocks.
+		affected, err := untilDoneOrCancelled(ctx, func() (int64, error) {
+			cond := fmt.Sprintf(`%s ORDER BY id DESC %s`, annotationType, r.db.GetDialect().LimitOffset(r.cfg.AnnotationCleanupJobBatchSize, cfg.MaxCount))
+			ids, err := r.fetchIDs(ctx, "annotation", cond)
+			if err != nil {
+				return 0, err
+			}
+			r.log.Error("Annotations to clean by count", "count", len(ids), "ids", ids, "cond", cond, "err", err)
+
+			x, y := r.deleteByIDs(ctx, "annotation", ids)
+			r.log.Error("cleaned annotations by count", "count", len(ids), "affected", x, "err", y)
+			return x, y
+		})
```
Use debug/info logs for successful cleanup batches.
These paths log at Error level even when err is nil and include full ID lists, which can spam error logs and inflate log volume. Consider Debug/Info for success and only emit Error on failures (apply the same pattern to the orphaned tag cleanup logs).
💡 Suggested logging adjustments

```diff
-			r.log.Error("Annotations to clean by time", "count", len(ids), "ids", ids, "cond", cond, "err", err)
+			r.log.Debug("Annotations to clean by time", "count", len(ids), "cond", cond)

-			r.log.Error("cleaned annotations by time", "count", len(ids), "affected", x, "err", y)
+			if y != nil {
+				r.log.Error("Failed to clean annotations by time", "error", y)
+			} else {
+				r.log.Debug("Cleaned annotations by time", "affected", x)
+			}
```

Also applies to: 569-580
This pull request was automatically created by @coderabbitai/e2e-reviewer.
Summary by CodeRabbit

Release Notes

- Bug Fixes: Annotation cleanup no longer deadlocks on MySQL under concurrent inserts.
- Performance: Cleanup now deletes in bounded batches that respect context cancellation.