-
Notifications
You must be signed in to change notification settings - Fork 0
Conversation
… deadlocks on MySQL (#80329) * Split subquery when cleaning annotations * update comment * Raise batch size, now that we pay attention to it * Iterate in batches * Separate cancellable batch implementation to allow for multi-statement callbacks, add overload for single-statement use * Use split-out utility in outer batching loop so it respects context cancellation * guard against empty queries * Use SQL parameters * Use same approach for tags * drop unused function * Work around parameter limit on sqlite for large batches * Bulk insert test data in DB * Refactor test to customise test data creation * Add test for catching SQLITE_MAX_VARIABLE_NUMBER limit * Turn annotation cleanup test to integration tests * lint --------- Co-authored-by: Sofia Papagiannaki <1632407+papagian@users.noreply.github.com>
📝 WalkthroughWalkthroughThe changes refactor annotation cleanup logic from single-statement batched deletes to an in-memory fetch-then-delete batching pattern to mitigate deadlocks. Test coverage is expanded with integration-style tests, larger data scenarios, and batch size handling. Cleanup service frequency is increased from 10 to 1 minute. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~22 minutes Poem
🚥 Pre-merge checks | ✅ 1 | ❌ 1❌ Failed checks (1 inconclusive)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@pkg/services/annotations/annotationsimpl/xorm_store.go`:
- Around line 534-538: Change the routine-use r.log.Error calls (e.g., where
message is "Annotations to clean by time" and "cleaned annotations by time") to
non-error levels like Info or Debug, stop logging the full ids slice (log its
length and at most the first N IDs or a summarized sample), and replace terse
variables x,y returned from deleteByIDs with descriptive names (e.g.,
deletedCount, deleteErr or affected, err) so the call and subsequent r.log.*
usage reads clearly; apply the same pattern to the MaxCount and orphaned-tags
blocks and any other similar r.log.Error usages.
- Around line 569-582: The annotationTag struct is missing the primary key field
causing xorm mapping failures for fetchIDs and deleteByIDs on the annotation_tag
table; update the annotationTag type to include the ID field with the xorm tag
for primary/autoincrement (e.g., add ID int64 with `xorm:"pk autoincr 'id'"`) so
the struct matches the v3 migration schema and allows fetchIDs/deleteByIDs to
map and delete rows correctly.
🧹 Nitpick comments (3)
pkg/services/cleanup/cleanup.go (1)
77-77: Ticker interval vs.cleantimeout mismatch.The ticker is now 1 minute, but
cleanstill has a 9-minute context timeout (Line 89). If a cleanup cycle runs long (e.g., large batch), subsequent ticks are silently dropped untilcleanreturns. While not a correctness issue (the loop is sequential), consider reducing the timeout to something closer to the new cadence (e.g., 50–55 seconds) so a slow cycle doesn't block the next several windows.pkg/services/annotations/annotationsimpl/xorm_store.go (1)
605-618: O(n²) string concatenation for large SQLite batches; condition checks config instead of actual slice length.
Lines 609–612 build the inline ID list via repeated
fmt.Sprintf("%s, %d", values, v), which copies the growing string on every iteration — O(n²) for up to 32 767 IDs. Usestrings.Builderorstrconv+ byte-slice appending (similar to the pattern already used ingetAccessControlFilter).Line 608 compares
AnnotationCleanupJobBatchSize(the configured max) against the SQLite limit, butidsmay be shorter than the configured batch size. Comparing againstlen(ids)would be more precise and avoid the unnecessary inline path for small tail batches.Proposed fix
- const sqliteParameterLimit = 999 - if r.db.GetDBType() == migrator.SQLite && r.cfg.AnnotationCleanupJobBatchSize > sqliteParameterLimit { - values := fmt.Sprint(ids[0]) - for _, v := range ids[1:] { - values = fmt.Sprintf("%s, %d", values, v) - } - sql = fmt.Sprintf(`DELETE FROM %s WHERE id IN (%s)`, table, values) + const sqliteParameterLimit = 999 + if r.db.GetDBType() == migrator.SQLite && len(ids) > sqliteParameterLimit { + var b strings.Builder + b.Grow(len(ids) * 6) // rough pre-allocation + b.WriteString(strconv.FormatInt(ids[0], 10)) + for _, v := range ids[1:] { + b.WriteByte(',') + b.WriteString(strconv.FormatInt(v, 10)) + } + sql = fmt.Sprintf(`DELETE FROM %s WHERE id IN (%s)`, table, b.String())pkg/services/annotations/annotationsimpl/cleanup_test.go (1)
237-271: Hardcoded IDs may collide with auto-increment on non-SQLite engines.
ID: int64(i + 1)explicitly sets annotation IDs starting from 1. This works for SQLite (the test DB) because it allows explicit primary-key inserts. However, ifInitTestDBever switches to MySQL/Postgres (or if these tests are ported), the hardcoded IDs may conflict with the auto-increment sequence — especially sincet.CleanuponlyDELETEs rows without resetting the sequence.Not blocking since the tests currently target SQLite only.
| r.log.Error("Annotations to clean by time", "count", len(ids), "ids", ids, "cond", cond, "err", err) | ||
|
|
||
| affected, err := r.executeUntilDoneOrCancelled(ctx, sql) | ||
| x, y := r.deleteByIDs(ctx, "annotation", ids) | ||
| r.log.Error("cleaned annotations by time", "count", len(ids), "affected", x, "err", y) | ||
| return x, y |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Error-level logs used for routine operational messages; terse variable names.
Lines 534, 537, 554, 557, 576, 579 all use r.log.Error(...) for what are clearly informational/debug messages (e.g., "Annotations to clean by time"). In production this will fire alerts and pollute error dashboards. Additionally, logging the full ids slice at any level will be extremely verbose for large batches (up to 32 767 entries).
The variables x, y on Lines 536–538 (and 556–558, 578–579) should use descriptive names.
Proposed fix (showing the MaxAge block; apply the same pattern to MaxCount and orphaned tags)
- r.log.Error("Annotations to clean by time", "count", len(ids), "ids", ids, "cond", cond, "err", err)
+ r.log.Debug("Annotations to clean by time", "count", len(ids))
- x, y := r.deleteByIDs(ctx, "annotation", ids)
- r.log.Error("cleaned annotations by time", "count", len(ids), "affected", x, "err", y)
- return x, y
+ affected, err := r.deleteByIDs(ctx, "annotation", ids)
+ r.log.Debug("cleaned annotations by time", "count", len(ids), "affected", affected, "err", err)
+ return affected, err📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| r.log.Error("Annotations to clean by time", "count", len(ids), "ids", ids, "cond", cond, "err", err) | |
| affected, err := r.executeUntilDoneOrCancelled(ctx, sql) | |
| x, y := r.deleteByIDs(ctx, "annotation", ids) | |
| r.log.Error("cleaned annotations by time", "count", len(ids), "affected", x, "err", y) | |
| return x, y | |
| r.log.Debug("Annotations to clean by time", "count", len(ids)) | |
| affected, err := r.deleteByIDs(ctx, "annotation", ids) | |
| r.log.Debug("cleaned annotations by time", "count", len(ids), "affected", affected, "err", err) | |
| return affected, err |
🤖 Prompt for AI Agents
In `@pkg/services/annotations/annotationsimpl/xorm_store.go` around lines 534 -
538, Change the routine-use r.log.Error calls (e.g., where message is
"Annotations to clean by time" and "cleaned annotations by time") to non-error
levels like Info or Debug, stop logging the full ids slice (log its length and
at most the first N IDs or a summarized sample), and replace terse variables x,y
returned from deleteByIDs with descriptive names (e.g., deletedCount, deleteErr
or affected, err) so the call and subsequent r.log.* usage reads clearly; apply
the same pattern to the MaxCount and orphaned-tags blocks and any other similar
r.log.Error usages.
| func (r *xormRepositoryImpl) CleanOrphanedAnnotationTags(ctx context.Context) (int64, error) { | ||
| deleteQuery := `DELETE FROM annotation_tag WHERE id IN ( SELECT id FROM (SELECT id FROM annotation_tag WHERE NOT EXISTS (SELECT 1 FROM annotation a WHERE annotation_id = a.id) %s) a)` | ||
| sql := fmt.Sprintf(deleteQuery, r.db.GetDialect().Limit(r.cfg.AnnotationCleanupJobBatchSize)) | ||
| return r.executeUntilDoneOrCancelled(ctx, sql) | ||
| return untilDoneOrCancelled(ctx, func() (int64, error) { | ||
| cond := fmt.Sprintf(`NOT EXISTS (SELECT 1 FROM annotation a WHERE annotation_id = a.id) %s`, r.db.GetDialect().Limit(r.cfg.AnnotationCleanupJobBatchSize)) | ||
| ids, err := r.fetchIDs(ctx, "annotation_tag", cond) | ||
| if err != nil { | ||
| return 0, err | ||
| } | ||
| r.log.Error("Tags to clean", "count", len(ids), "ids", ids, "cond", cond, "err", err) | ||
|
|
||
| x, y := r.deleteByIDs(ctx, "annotation_tag", ids) | ||
| r.log.Error("cleaned tags", "count", len(ids), "affected", x, "err", y) | ||
| return x, y | ||
| }) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# First, find migration files related to annotation_tag
find . -type f \( -name "*.sql" -o -name "*.go" \) | grep -i migr | head -20Repository: admin-coderabbit/coderabbit-grafana-mirror
Length of output: 1194
🏁 Script executed:
#!/bin/bash
# Search for annotationTag struct definition
rg -n "type.*annotationTag" --type=goRepository: admin-coderabbit/coderabbit-grafana-mirror
Length of output: 265
🏁 Script executed:
#!/bin/bash
# Look for annotation_tag table creation in migrations
rg -n "annotation_tag" -A5 -B5 --type=go | grep -A5 -B5 "CREATE\|AddTable\|table" | head -50Repository: admin-coderabbit/coderabbit-grafana-mirror
Length of output: 4561
🏁 Script executed:
#!/bin/bash
# Search for the specific struct definition around line 663-666 mentioned in the review
rg -n "annotation_id|tag_id" --type=go -A2 -B2 | grep -A5 -B5 "type"Repository: admin-coderabbit/coderabbit-grafana-mirror
Length of output: 1856
🏁 Script executed:
#!/bin/bash
# Get full annotation_tag table definition from migration
rg -A20 "annotationTagTableV3 := Table{" pkg/services/sqlstore/migrations/annotation_mig.goRepository: admin-coderabbit/coderabbit-grafana-mirror
Length of output: 209
🏁 Script executed:
#!/bin/bash
# Also check the v2 definition
rg -B5 -A15 "Create annotation_tag table v2" pkg/services/sqlstore/migrations/annotation_mig.goRepository: admin-coderabbit/coderabbit-grafana-mirror
Length of output: 901
🏁 Script executed:
#!/bin/bash
# Look for Column definitions in the migration
rg -A30 'annotationTagTable := Table' pkg/services/sqlstore/migrations/annotation_mig.go | head -60Repository: admin-coderabbit/coderabbit-grafana-mirror
Length of output: 1197
Add missing id field to annotationTag struct.
The annotation_tag table schema (v3 migration) includes an id column as the primary key with autoincrement, but the production struct at lines 663–666 omits it. The test struct at lines 591–595 correctly includes ID int64 \xorm:"pk autoincr 'id'"`. Without this field, calls to fetchIDs()(line 572) anddeleteByIDs()(line 578) will fail at runtime because xorm cannot map theid` column from query results.
Current struct definition
type annotationTag struct {
AnnotationID int64 `xorm:"annotation_id"`
TagID int64 `xorm:"tag_id"`
}
Update the struct to match the schema:
type annotationTag struct {
ID int64 `xorm:"pk autoincr 'id'"`
AnnotationID int64 `xorm:"annotation_id"`
TagID int64 `xorm:"tag_id"`
}
🤖 Prompt for AI Agents
In `@pkg/services/annotations/annotationsimpl/xorm_store.go` around lines 569 -
582, The annotationTag struct is missing the primary key field causing xorm
mapping failures for fetchIDs and deleteByIDs on the annotation_tag table;
update the annotationTag type to include the ID field with the xorm tag for
primary/autoincrement (e.g., add ID int64 with `xorm:"pk autoincr 'id'"`) so the
struct matches the v3 migration schema and allows fetchIDs/deleteByIDs to map
and delete rows correctly.
This pull request was automatically created by
@coderabbitai/e2e-reviewer.Batch created pull request.
Summary by CodeRabbit
Bug Fixes
Tests