
Automated Test: db-cleanup-optimized #342

Closed
wants to merge 1 commit into from

Conversation


@admin-coderabbit admin-coderabbit commented Feb 4, 2026

This pull request was automatically created by @coderabbitai/e2e-reviewer.

Batch created pull request.

Summary by CodeRabbit

  • Bug Fixes

    • Improved annotation cleanup to prevent deadlocks during concurrent operations
    • Increased annotation cleanup frequency from every 10 minutes to every 1 minute for more responsive maintenance
  • Tests

    • Enhanced annotation cleanup test suite with integration-style testing and expanded scenarios, including large batch size handling

… deadlocks on MySQL (#80329)

* Split subquery when cleaning annotations

* update comment

* Raise batch size, now that we pay attention to it

* Iterate in batches

* Separate cancellable batch implementation to allow for multi-statement callbacks, add overload for single-statement use

* Use split-out utility in outer batching loop so it respects context cancellation

* guard against empty queries

* Use SQL parameters

* Use same approach for tags

* drop unused function

* Work around parameter limit on sqlite for large batches

* Bulk insert test data in DB

* Refactor test to customise test data creation

* Add test for catching SQLITE_MAX_VARIABLE_NUMBER limit

* Turn annotation cleanup test to integration tests

* lint

---------

Co-authored-by: Sofia Papagiannaki <1632407+papagian@users.noreply.github.com>

coderabbit-eval bot commented Feb 4, 2026

📝 Walkthrough

The changes refactor the annotation cleanup logic from single-statement batched deletes to an in-memory fetch-then-delete batching pattern to mitigate deadlocks. Test coverage is expanded with integration-style tests, larger data scenarios, and batch size handling. The cleanup service's tick interval is reduced from 10 minutes to 1 minute, increasing how often cleanup runs.

Changes

Cohort / File(s) | Summary

Test Refactoring & Expansion (pkg/services/annotations/annotationsimpl/cleanup_test.go)
Converts unit tests to integration-style tests with expanded data models. Introduces per-test batch size configuration, handles large datasets (40003 annotations), and adds concurrent deletion via t.Cleanup. Renames TestAnnotationCleanUp → TestIntegrationAnnotationCleanUp and TestOldAnnotationsAreDeletedFirst → TestIntegrationOldAnnotationsAreDeletedFirst.

Core Cleanup Implementation (pkg/services/annotations/annotationsimpl/xorm_store.go)
Replaces batched single-statement deletes with a memory-batched fetch-then-delete pattern driven by an untilDoneOrCancelled loop. Adds helpers: fetchIDs, deleteByIDs (with SQLite 999-parameter limit handling), and asAny. Updates CleanAnnotations and CleanOrphanedAnnotationTags to use the batched workflow with context cancellation support and per-batch logging (see the sketch after this table).

Cleanup Service Configuration (pkg/services/cleanup/cleanup.go)
Reduces the cleanup ticker interval from 10 minutes to 1 minute, increasing cleanup job frequency.
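
The batched workflow is only summarized above; the sketch below illustrates the fetch-then-delete loop it describes. The loop structure and helper signatures are assumptions based on this walkthrough, not the actual Grafana implementation.

```go
package cleanupsketch

import "context"

// cleanInBatches is a minimal sketch of the fetch-then-delete batching pattern
// described in the walkthrough. The real code uses helpers named fetchIDs,
// deleteByIDs, and untilDoneOrCancelled; the signatures here are assumptions.
func cleanInBatches(
	ctx context.Context,
	fetchIDs func(context.Context) ([]int64, error),
	deleteByIDs func(context.Context, []int64) (int64, error),
) (int64, error) {
	var total int64
	for {
		// Stop as soon as the caller cancels the context.
		select {
		case <-ctx.Done():
			return total, ctx.Err()
		default:
		}

		// Fetch one batch of candidate IDs; the batch size limit lives in the query.
		ids, err := fetchIDs(ctx)
		if err != nil {
			return total, err
		}
		if len(ids) == 0 {
			return total, nil // nothing left to clean
		}

		// Delete only the IDs just fetched, instead of running a long
		// DELETE ... WHERE id IN (subquery) statement that can deadlock.
		affected, err := deleteByIDs(ctx, ids)
		total += affected
		if err != nil {
			return total, err
		}
	}
}
```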

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Poem

🐰 From single sweeps to batches small,
Our annotations dance through it all,
No deadlocks here, just hop and clear,
With kindness toward the database dear,
More frequent now, we're swift and fleet!

🚥 Pre-merge checks | ✅ 1 | ❌ 1
❌ Failed checks (1 inconclusive)
Title check: ❓ Inconclusive
Explanation: The PR title 'Automated Test: db-cleanup-optimized' is vague and generic. While it references 'db-cleanup-optimized', it does not clearly describe the main changes: refactoring cleanup tests to integration-style tests, implementing batched deletion logic to prevent deadlocks, and reducing the cleanup tick interval.
Resolution: Consider a more descriptive title that captures the primary changes, such as 'Refactor annotation cleanup to use batched deletion and integration tests' or 'Optimize database cleanup with batched operations and faster tick interval'.
✅ Passed checks (1 passed)
Description Check: ✅ Passed
Explanation: Check skipped because CodeRabbit's high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbit eval help to get the list of available commands and usage tips.

@coderabbit-eval coderabbit-eval bot closed this Feb 4, 2026

@coderabbit-eval coderabbit-eval bot left a comment


Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@pkg/services/annotations/annotationsimpl/xorm_store.go`:
- Around line 534-538: Change the routine-use r.log.Error calls (e.g., where
message is "Annotations to clean by time" and "cleaned annotations by time") to
non-error levels like Info or Debug, stop logging the full ids slice (log its
length and at most the first N IDs or a summarized sample), and replace terse
variables x,y returned from deleteByIDs with descriptive names (e.g.,
deletedCount, deleteErr or affected, err) so the call and subsequent r.log.*
usage reads clearly; apply the same pattern to the MaxCount and orphaned-tags
blocks and any other similar r.log.Error usages.
- Around line 569-582: The annotationTag struct is missing the primary key field
causing xorm mapping failures for fetchIDs and deleteByIDs on the annotation_tag
table; update the annotationTag type to include the ID field with the xorm tag
for primary/autoincrement (e.g., add ID int64 with `xorm:"pk autoincr 'id'"`) so
the struct matches the v3 migration schema and allows fetchIDs/deleteByIDs to
map and delete rows correctly.
🧹 Nitpick comments (3)
pkg/services/cleanup/cleanup.go (1)

77-77: Ticker interval vs. clean timeout mismatch.

The ticker is now 1 minute, but clean still has a 9-minute context timeout (Line 89). If a cleanup cycle runs long (e.g., large batch), subsequent ticks are silently dropped until clean returns. While not a correctness issue (the loop is sequential), consider reducing the timeout to something closer to the new cadence (e.g., 50–55 seconds) so a slow cycle doesn't block the next several windows.
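
A rough sketch of aligning the two values follows. The names cleanupInterval, cleanupTimeout, run, and clean are illustrative placeholders, not the actual identifiers in cleanup.go.

```go
package cleanupsketch

import (
	"context"
	"time"
)

// Illustrative values only: keep each cleanup cycle's timeout shorter than the
// ticker interval so one slow cycle cannot swallow several subsequent ticks.
const (
	cleanupInterval = time.Minute      // the new 1-minute cadence from this PR
	cleanupTimeout  = 50 * time.Second // hypothetical per-cycle budget < interval
)

// run drives cleanup on a fixed cadence; clean is whatever work the service
// performs per cycle. Both names are placeholders for this sketch.
func run(ctx context.Context, clean func(context.Context)) {
	ticker := time.NewTicker(cleanupInterval)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			cycleCtx, cancel := context.WithTimeout(ctx, cleanupTimeout)
			clean(cycleCtx) // a long cycle is cut off before blocking many ticks
			cancel()
		case <-ctx.Done():
			return
		}
	}
}
```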

pkg/services/annotations/annotationsimpl/xorm_store.go (1)

605-618: O(n²) string concatenation for large SQLite batches; condition checks config instead of actual slice length.

  1. Lines 609–612 build the inline ID list via repeated fmt.Sprintf("%s, %d", values, v), which copies the growing string on every iteration — O(n²) for up to 32 767 IDs. Use strings.Builder or strconv + byte-slice appending (similar to the pattern already used in getAccessControlFilter).

  2. Line 608 compares AnnotationCleanupJobBatchSize (the configured max) against the SQLite limit, but ids may be shorter than the configured batch size. Comparing against len(ids) would be more precise and avoid the unnecessary inline path for small tail batches.

Proposed fix
-	const sqliteParameterLimit = 999
-	if r.db.GetDBType() == migrator.SQLite && r.cfg.AnnotationCleanupJobBatchSize > sqliteParameterLimit {
-		values := fmt.Sprint(ids[0])
-		for _, v := range ids[1:] {
-			values = fmt.Sprintf("%s, %d", values, v)
-		}
-		sql = fmt.Sprintf(`DELETE FROM %s WHERE id IN (%s)`, table, values)
+	const sqliteParameterLimit = 999
+	if r.db.GetDBType() == migrator.SQLite && len(ids) > sqliteParameterLimit {
+		var b strings.Builder
+		b.Grow(len(ids) * 6) // rough pre-allocation
+		b.WriteString(strconv.FormatInt(ids[0], 10))
+		for _, v := range ids[1:] {
+			b.WriteByte(',')
+			b.WriteString(strconv.FormatInt(v, 10))
+		}
+		sql = fmt.Sprintf(`DELETE FROM %s WHERE id IN (%s)`, table, b.String())
pkg/services/annotations/annotationsimpl/cleanup_test.go (1)

237-271: Hardcoded IDs may collide with auto-increment on non-SQLite engines.

ID: int64(i + 1) explicitly sets annotation IDs starting from 1. This works for SQLite (the test DB) because it allows explicit primary-key inserts. However, if InitTestDB ever switches to MySQL/Postgres (or if these tests are ported), the hardcoded IDs may conflict with the auto-increment sequence — especially since t.Cleanup only DELETEs rows without resetting the sequence.

Not blocking since the tests currently target SQLite only.
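
A small sketch of the alternative this comment alludes to: leave the ID zero so the engine's auto-increment assigns it. The struct and helper below are hypothetical stand-ins, not the actual types in cleanup_test.go.

```go
package cleanuptest

// testAnnotation is a hypothetical test row; ID carries the autoincrement tag
// and is left zero so the database assigns it on insert.
type testAnnotation struct {
	ID    int64 `xorm:"pk autoincr 'id'"`
	Text  string
	Epoch int64
}

// buildTestAnnotations prepares n rows without explicit IDs, so whatever bulk
// insert the test uses relies on the engine's sequence and avoids collisions
// with auto-increment on MySQL/Postgres.
func buildTestAnnotations(n int) []testAnnotation {
	rows := make([]testAnnotation, 0, n)
	for i := 0; i < n; i++ {
		rows = append(rows, testAnnotation{
			Text:  "test annotation",
			Epoch: int64(i),
		})
	}
	return rows
}
```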

Comment on lines +534 to +538
r.log.Error("Annotations to clean by time", "count", len(ids), "ids", ids, "cond", cond, "err", err)

affected, err := r.executeUntilDoneOrCancelled(ctx, sql)
x, y := r.deleteByIDs(ctx, "annotation", ids)
r.log.Error("cleaned annotations by time", "count", len(ids), "affected", x, "err", y)
return x, y


⚠️ Potential issue | 🟠 Major

Error-level logs used for routine operational messages; terse variable names.

Lines 534, 537, 554, 557, 576, 579 all use r.log.Error(...) for what are clearly informational/debug messages (e.g., "Annotations to clean by time"). In production this will fire alerts and pollute error dashboards. Additionally, logging the full ids slice at any level will be extremely verbose for large batches (up to 32 767 entries).

The variables x, y on Lines 536–538 (and 556–558, 578–579) should use descriptive names.

Proposed fix (showing the MaxAge block; apply the same pattern to MaxCount and orphaned tags)
-			r.log.Error("Annotations to clean by time", "count", len(ids), "ids", ids, "cond", cond, "err", err)
+			r.log.Debug("Annotations to clean by time", "count", len(ids))

-			x, y := r.deleteByIDs(ctx, "annotation", ids)
-			r.log.Error("cleaned annotations by time", "count", len(ids), "affected", x, "err", y)
-			return x, y
+			affected, err := r.deleteByIDs(ctx, "annotation", ids)
+			r.log.Debug("cleaned annotations by time", "count", len(ids), "affected", affected, "err", err)
+			return affected, err
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
r.log.Error("Annotations to clean by time", "count", len(ids), "ids", ids, "cond", cond, "err", err)
affected, err := r.executeUntilDoneOrCancelled(ctx, sql)
x, y := r.deleteByIDs(ctx, "annotation", ids)
r.log.Error("cleaned annotations by time", "count", len(ids), "affected", x, "err", y)
return x, y
r.log.Debug("Annotations to clean by time", "count", len(ids))
affected, err := r.deleteByIDs(ctx, "annotation", ids)
r.log.Debug("cleaned annotations by time", "count", len(ids), "affected", affected, "err", err)
return affected, err

Comment on lines 569 to +582
func (r *xormRepositoryImpl) CleanOrphanedAnnotationTags(ctx context.Context) (int64, error) {
deleteQuery := `DELETE FROM annotation_tag WHERE id IN ( SELECT id FROM (SELECT id FROM annotation_tag WHERE NOT EXISTS (SELECT 1 FROM annotation a WHERE annotation_id = a.id) %s) a)`
sql := fmt.Sprintf(deleteQuery, r.db.GetDialect().Limit(r.cfg.AnnotationCleanupJobBatchSize))
return r.executeUntilDoneOrCancelled(ctx, sql)
return untilDoneOrCancelled(ctx, func() (int64, error) {
cond := fmt.Sprintf(`NOT EXISTS (SELECT 1 FROM annotation a WHERE annotation_id = a.id) %s`, r.db.GetDialect().Limit(r.cfg.AnnotationCleanupJobBatchSize))
ids, err := r.fetchIDs(ctx, "annotation_tag", cond)
if err != nil {
return 0, err
}
r.log.Error("Tags to clean", "count", len(ids), "ids", ids, "cond", cond, "err", err)

x, y := r.deleteByIDs(ctx, "annotation_tag", ids)
r.log.Error("cleaned tags", "count", len(ids), "affected", x, "err", y)
return x, y
})
}


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# First, find migration files related to annotation_tag
find . -type f \( -name "*.sql" -o -name "*.go" \) | grep -i migr | head -20

Repository: admin-coderabbit/coderabbit-grafana-mirror

Length of output: 1194


🏁 Script executed:

#!/bin/bash
# Search for annotationTag struct definition
rg -n "type.*annotationTag" --type=go

Repository: admin-coderabbit/coderabbit-grafana-mirror

Length of output: 265


🏁 Script executed:

#!/bin/bash
# Look for annotation_tag table creation in migrations
rg -n "annotation_tag" -A5 -B5 --type=go | grep -A5 -B5 "CREATE\|AddTable\|table" | head -50

Repository: admin-coderabbit/coderabbit-grafana-mirror

Length of output: 4561


🏁 Script executed:

#!/bin/bash
# Search for the specific struct definition around line 663-666 mentioned in the review
rg -n "annotation_id|tag_id" --type=go -A2 -B2 | grep -A5 -B5 "type"

Repository: admin-coderabbit/coderabbit-grafana-mirror

Length of output: 1856


🏁 Script executed:

#!/bin/bash
# Get full annotation_tag table definition from migration
rg -A20 "annotationTagTableV3 := Table{" pkg/services/sqlstore/migrations/annotation_mig.go

Repository: admin-coderabbit/coderabbit-grafana-mirror

Length of output: 209


🏁 Script executed:

#!/bin/bash
# Also check the v2 definition
rg -B5 -A15 "Create annotation_tag table v2" pkg/services/sqlstore/migrations/annotation_mig.go

Repository: admin-coderabbit/coderabbit-grafana-mirror

Length of output: 901


🏁 Script executed:

#!/bin/bash
# Look for Column definitions in the migration
rg -A30 'annotationTagTable := Table' pkg/services/sqlstore/migrations/annotation_mig.go | head -60

Repository: admin-coderabbit/coderabbit-grafana-mirror

Length of output: 1197


Add missing id field to annotationTag struct.

The annotation_tag table schema (v3 migration) includes an id column as the primary key with autoincrement, but the production struct at lines 663–666 omits it. The test struct at lines 591–595 correctly includes ID int64 with the tag `xorm:"pk autoincr 'id'"`. Without this field, calls to fetchIDs() (line 572) and deleteByIDs() (line 578) will fail at runtime because xorm cannot map the id column from query results.

Current struct definition
type annotationTag struct {
	AnnotationID int64 `xorm:"annotation_id"`
	TagID        int64 `xorm:"tag_id"`
}

Update the struct to match the schema:

type annotationTag struct {
	ID           int64 `xorm:"pk autoincr 'id'"`
	AnnotationID int64 `xorm:"annotation_id"`
	TagID        int64 `xorm:"tag_id"`
}
