… deadlocks on MySQL (#80329)

* Split subquery when cleaning annotations
* update comment
* Raise batch size, now that we pay attention to it
* Iterate in batches
* Separate cancellable batch implementation to allow for multi-statement callbacks, add overload for single-statement use
* Use split-out utility in outer batching loop so it respects context cancellation
* guard against empty queries
* Use SQL parameters
* Use same approach for tags
* drop unused function
* Work around parameter limit on sqlite for large batches
* Bulk insert test data in DB
* Refactor test to customise test data creation
* Add test for catching SQLITE_MAX_VARIABLE_NUMBER limit
* Turn annotation cleanup test to integration tests
* lint

Co-authored-by: Sofia Papagiannaki <1632407+papagian@users.noreply.github.com>
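The commit list above mentions working around SQLite's bound-parameter limit for large batches. As a hypothetical sketch only (the `deleteStatements` helper, the chunk-by-parameter-count strategy, and the hard-coded limit are illustrative assumptions, not the PR's actual code), one way to keep each parameterized DELETE under the limit is to split the ID list into chunks:

```go
package main

import (
	"fmt"
	"strings"
)

// sqliteMaxVariableNumber is SQLite's historical default cap on bound
// parameters per statement (999 before 3.32.0; newer builds allow 32766).
const sqliteMaxVariableNumber = 999

// deleteStatements splits a large ID list into DELETE statements whose
// placeholder count stays under the limit, returning one (SQL, args) pair
// per chunk.
func deleteStatements(table string, ids []int64) (queries []string, args [][]any) {
	for len(ids) > 0 {
		n := len(ids)
		if n > sqliteMaxVariableNumber {
			n = sqliteMaxVariableNumber
		}
		chunk, rest := ids[:n], ids[n:]
		// Build "?,?,...,?" with exactly n placeholders.
		placeholders := strings.TrimSuffix(strings.Repeat("?,", n), ",")
		queries = append(queries, fmt.Sprintf("DELETE FROM %s WHERE id IN (%s)", table, placeholders))
		chunkArgs := make([]any, n)
		for i, id := range chunk {
			chunkArgs[i] = id
		}
		args = append(args, chunkArgs)
		ids = rest
	}
	return queries, args
}

func main() {
	q, a := deleteStatements("annotation", []int64{1, 2, 3})
	fmt.Println(q[0], len(a[0])) // DELETE FROM annotation WHERE id IN (?,?,?) 3
}
```

Each (query, args) pair can then be handed to `ExecContext` in turn, so no single statement exceeds the parameter budget.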
📝 Walkthrough

The annotation cleanup system transitions from direct batch deletion to incremental batched deletion with retry logic to prevent MySQL deadlocks. Integration tests are restructured with new test table fields and batched database operations. The cleanup service ticker interval is reduced from 10 minutes to 1 minute.
Sequence Diagram

```mermaid
sequenceDiagram
    participant CleanAnnotations
    participant untilDoneOrCancelled
    participant fetchIDs
    participant deleteByIDs
    participant Database

    CleanAnnotations->>untilDoneOrCancelled: Start batch cleanup loop
    loop Until no rows affected or error
        untilDoneOrCancelled->>fetchIDs: Load next batch of IDs
        fetchIDs->>Database: Query IDs with condition limit
        Database-->>fetchIDs: Return ID batch
        fetchIDs-->>untilDoneOrCancelled: IDs loaded
        alt IDs found
            untilDoneOrCancelled->>deleteByIDs: Delete batch by IDs
            deleteByIDs->>Database: Execute delete with ID list
            Database-->>deleteByIDs: Rows deleted
            deleteByIDs-->>untilDoneOrCancelled: Affected count
            untilDoneOrCancelled->>untilDoneOrCancelled: Check if more rows remain
        else No IDs found
            untilDoneOrCancelled->>untilDoneOrCancelled: Exit loop
        end
    end
    untilDoneOrCancelled-->>CleanAnnotations: Cleanup complete
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 1 passed | ❌ 1 failed (inconclusive)
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@pkg/services/annotations/annotationsimpl/xorm_store.go`:
- Around lines 523-559: The cleanup loops call r.log.Error for every batch even when err == nil, and include full ID lists. Update the logging in the untilDoneOrCancelled lambdas (the blocks using fetchIDs and deleteByIDs) to use r.log.Error only when an actual error occurs (err or y != nil), and use r.log.Info or r.log.Debug for successful batches (e.g., log count and affected but omit or redact the full ids list). Apply the same change to the orphaned tag cleanup logs so success paths don't emit error-level logs or full ID arrays.
🧹 Nitpick comments (1)
pkg/services/annotations/annotationsimpl/xorm_store.go (1)
597-629: Avoid quadratic string concatenation for large SQLite batches.
`fmt.Sprintf` in a loop is O(n²) and can be costly for large batches (e.g., 32k IDs). A builder avoids repeated allocations.

♻️ Suggested builder-based construction:

```diff
-	values := fmt.Sprint(ids[0])
-	for _, v := range ids[1:] {
-		values = fmt.Sprintf("%s, %d", values, v)
-	}
+	var b strings.Builder
+	b.Grow(len(ids) * 11) // rough capacity
+	b.WriteString(strconv.FormatInt(ids[0], 10))
+	for _, v := range ids[1:] {
+		b.WriteByte(',')
+		b.WriteString(strconv.FormatInt(v, 10))
+	}
+	values := b.String()
```
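For context, here is a self-contained, runnable version of the builder approach the suggestion sketches (the `joinIDs` helper name is illustrative, not from the PR):

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// joinIDs renders a comma-separated ID list for an SQL IN clause using a
// strings.Builder, avoiding the O(n²) cost of rebuilding the whole string
// with fmt.Sprintf on every iteration.
func joinIDs(ids []int64) string {
	if len(ids) == 0 {
		return ""
	}
	var b strings.Builder
	b.Grow(len(ids) * 11) // rough capacity: up to ~10 digits plus a comma per ID
	b.WriteString(strconv.FormatInt(ids[0], 10))
	for _, v := range ids[1:] {
		b.WriteByte(',')
		b.WriteString(strconv.FormatInt(v, 10))
	}
	return b.String()
}

func main() {
	fmt.Println(joinIDs([]int64{3, 1, 4, 1, 5})) // 3,1,4,1,5
}
```

Each append to the builder is amortized O(1), so the whole join is linear in the number of IDs.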
```diff
-		affected, err := r.executeUntilDoneOrCancelled(ctx, sql)
+		// Single-statement approaches, specifically ones using batched sub-queries, seem to deadlock with concurrent inserts on MySQL.
+		// We have a bounded batch size, so work around this by first loading the IDs into memory and allowing any locks to flush inside each batch.
+		// This may under-delete when concurrent inserts happen, but any such annotations will simply be cleaned on the next cycle.
+		//
+		// We execute the following batched operation repeatedly until either we run out of objects, the context is cancelled, or there is an error.
+		affected, err := untilDoneOrCancelled(ctx, func() (int64, error) {
+			cond := fmt.Sprintf(`%s AND created < %v ORDER BY id DESC %s`, annotationType, cutoffDate, r.db.GetDialect().Limit(r.cfg.AnnotationCleanupJobBatchSize))
+			ids, err := r.fetchIDs(ctx, "annotation", cond)
+			if err != nil {
+				return 0, err
+			}
+			r.log.Error("Annotations to clean by time", "count", len(ids), "ids", ids, "cond", cond, "err", err)
+
+			x, y := r.deleteByIDs(ctx, "annotation", ids)
+			r.log.Error("cleaned annotations by time", "count", len(ids), "affected", x, "err", y)
+			return x, y
+		})
 		totalAffected += affected
 		if err != nil {
 			return totalAffected, err
 		}
 	}

 	if cfg.MaxCount > 0 {
-		deleteQuery := `DELETE FROM annotation WHERE id IN (SELECT id FROM (SELECT id FROM annotation WHERE %s ORDER BY id DESC %s) a)`
-		sql := fmt.Sprintf(deleteQuery, annotationType, r.db.GetDialect().LimitOffset(r.cfg.AnnotationCleanupJobBatchSize, cfg.MaxCount))
-		affected, err := r.executeUntilDoneOrCancelled(ctx, sql)
+		// Similar strategy as the above cleanup process, to avoid deadlocks.
+		affected, err := untilDoneOrCancelled(ctx, func() (int64, error) {
+			cond := fmt.Sprintf(`%s ORDER BY id DESC %s`, annotationType, r.db.GetDialect().LimitOffset(r.cfg.AnnotationCleanupJobBatchSize, cfg.MaxCount))
+			ids, err := r.fetchIDs(ctx, "annotation", cond)
+			if err != nil {
+				return 0, err
+			}
+			r.log.Error("Annotations to clean by count", "count", len(ids), "ids", ids, "cond", cond, "err", err)
+
+			x, y := r.deleteByIDs(ctx, "annotation", ids)
+			r.log.Error("cleaned annotations by count", "count", len(ids), "affected", x, "err", y)
+			return x, y
+		})
```
Use debug/info logs for successful cleanup batches.
These paths log at Error level even when err is nil and include full ID lists, which can spam error logs and inflate log volume. Consider Debug/Info for success and only emit Error on failures (apply the same pattern to the orphaned tag cleanup logs).
💡 Suggested logging adjustments

```diff
-			r.log.Error("Annotations to clean by time", "count", len(ids), "ids", ids, "cond", cond, "err", err)
+			r.log.Debug("Annotations to clean by time", "count", len(ids), "cond", cond)

-			r.log.Error("cleaned annotations by time", "count", len(ids), "affected", x, "err", y)
+			if y != nil {
+				r.log.Error("Failed to clean annotations by time", "error", y)
+			} else {
+				r.log.Debug("Cleaned annotations by time", "affected", x)
+			}
```

Also applies to: 569-580
This pull request was automatically created by @coderabbitai/e2e-reviewer.
Summary by CodeRabbit

Release Notes

- Bug Fixes: Annotation cleanup no longer deadlocks on MySQL under concurrent inserts.
- Performance: Cleanup now deletes in bounded batches that respect context cancellation.