
Conversation


@cherkanovart cherkanovart commented Jan 23, 2026

Summary

Automatically deduplicates `i18n.lock` when the YAML contains duplicate keys (e.g. after Git merges). The lockfile and delta code paths now share one implementation; the last occurrence of each key wins, and when duplicates are removed the cleaned file is written back and the removal is logged.

Changes

  • deduplicateLockfileYaml() in lockfile.ts — New exported helper: YAML.parseDocument → collect key positions in checksums → last occurrence wins → remove in reverse order → return { deduplicatedContent, duplicatesRemoved }. Idempotent and deterministic.
  • _loadLockfile() (lockfile.ts) — Uses deduplicateLockfileYaml; when duplicatesRemoved > 0, overwrites i18n.lock and logs: Removed N duplicate entry/entries from i18n.lock.
  • loadLock() (delta.ts) — Replaced inline deduplication with deduplicateLockfileYaml; same overwrite and console.log when duplicatesRemoved > 0. Keeps behavior the same for status, run, i18n, frozen.
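The last-occurrence-wins contract described above can be sketched with plain data (a toy model only — names like `dedupeLastWins` and the `Entry` shape are illustrative, not the actual `lockfile.ts` implementation, which operates on a `YAML.parseDocument` tree):

```typescript
// Toy model of last-occurrence-wins dedup: `entries` is an ordered list of
// (key, checksum) pairs as they appear inside one path-pattern block.
type Entry = { key: string; checksum: string };

function dedupeLastWins(entries: Entry[]): {
  deduped: Entry[];
  duplicatesRemoved: number;
} {
  // Record the index of the last occurrence of each key.
  const lastIndex = new Map<string, number>();
  entries.forEach((e, i) => lastIndex.set(e.key, i));
  // Keep only entries sitting at their key's last position.
  const deduped = entries.filter((e, i) => lastIndex.get(e.key) === i);
  return { deduped, duplicatesRemoved: entries.length - deduped.length };
}

const { deduped, duplicatesRemoved } = dedupeLastWins([
  { key: "greeting", checksum: "aaa" },
  { key: "farewell", checksum: "bbb" },
  { key: "greeting", checksum: "ccc" }, // duplicate from a Git merge — this one wins
]);
console.log(duplicatesRemoved); // 1
console.log(deduped.map((e) => e.checksum)); // ["bbb", "ccc"]
```

Running the function a second time on its own output removes nothing, which is the idempotence property the PR claims.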

Testing

Business logic tests added:

  • deduplicateLockfileYaml — no duplicates — Unchanged content and duplicatesRemoved === 0 when lockfile is clean.
  • Single/multiple duplicates, last wins — Duplicate keys in one path pattern; last value is kept and earlier ones removed.
  • Conflicting checksums (same key, different values) — Last occurrence wins; duplicatesRemoved and parsed structure are correct.
  • Multiple path patterns — Deduplication is applied per path pattern; counts and last-wins are correct across patterns.
  • Empty lockfile and empty checksums — No errors; duplicatesRemoved === 0 and structure preserved.
  • Idempotent — Second run on deduplicated content yields duplicatesRemoved === 0 and identical output.
  • Git-merge-style scenario — Several duplicate keys in one section; last wins, correct duplicatesRemoved, and final keys/checksums.
  • All tests pass locally

Visuals

Required for UI/UX changes:

  • Before/after screenshots attached
  • Video demo for interactions (< 30s)

N/A — no UI changes.

Checklist

  • Changeset added (if version bump needed)
  • Tests cover business logic (not just happy path)
  • No breaking changes (or documented below)

Closes #[issue-number]

Summary by CodeRabbit

  • New Features

    • Lockfiles now automatically detect and remove duplicate entries, preserving the last occurrence.
    • Handles Git merge-conflict markers in lockfiles and resolves duplicates automatically.
  • Refactor

    • Consolidated deduplication into a single, unified implementation to improve maintainability and consistency.
  • Tests

    • Added extensive test coverage for lockfile deduplication, including merge-conflict scenarios.



coderabbitai bot commented Jan 23, 2026

📝 Walkthrough

Walkthrough

A universal lockfile deduplication function replaces three duplicate implementations, removing duplicate entries from i18n.lock files while preserving the last occurrence of each key. Integration into the load flow ensures automatic deduplication and persistence of cleaned content. Comprehensive test coverage validates behavior across edge cases including Git merge conflicts.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Lockfile deduplication implementation<br>`packages/cli/src/cli/utils/lockfile.ts` | Adds exported `deduplicateLockfileYaml()` that removes duplicate YAML entries in lockfile checksums (last occurrence wins). `_loadLockfile()` now uses it, parses the deduplicated content, and writes it back when duplicates are removed, logging the change. |
| Deduplication integration<br>`packages/cli/src/cli/utils/delta.ts` | Integrates `deduplicateLockfileYaml` into `loadLock()` to deduplicate `i18n.lock` before parsing; persists deduplicated content when modified. Also adjusts type casting for `loadChecksums` and `saveChecksums`. |
| Test coverage<br>`packages/cli/src/cli/utils/lockfile.test.ts` | Adds comprehensive tests for `deduplicateLockfileYaml()` covering no duplicates, duplicate keys, conflicting values, multiple duplicates, empty files, Git merge conflicts, idempotence, and cross-pattern counting. |
| Changeset<br>`.changeset/lockfile-deduplication.md` | Documents the refactor of lockfile deduplication into a single function with Git merge conflict handling support. |

Sequence Diagram

```mermaid
sequenceDiagram
    actor Caller
    participant LoadLock as loadLock()
    participant Dedup as deduplicateLockfileYaml()
    participant FS as Filesystem
    participant Parser as YAML Parser

    Caller->>LoadLock: request load lockfile
    LoadLock->>FS: read `i18n.lock`
    FS-->>LoadLock: raw YAML content
    LoadLock->>Dedup: deduplicateLockfileYaml(content)
    Dedup-->>LoadLock: deduplicated content, duplicates count

    alt duplicates removed
        LoadLock->>FS: write deduplicated content
        FS-->>LoadLock: write complete
        LoadLock->>LoadLock: log removal message
    end

    LoadLock->>Parser: parse deduplicated YAML
    Parser-->>LoadLock: LockSchema object
    LoadLock-->>Caller: return lock object
```

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 I nibbled through YAML, hopped over keys,
Pulled duplicates out with elegant ease.
The last one stands in the tidy lock yard,
Clean files now travel light and unscarred. ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00%, which is insufficient; the required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately and concisely describes the main change: unifying lockfile deduplication logic across multiple code paths. |
| Description check | ✅ Passed | The description is comprehensive and well-structured, covering all required template sections: summary, detailed changes, extensive test coverage with checkboxes, visuals section marked N/A, and checklist items completed. |




@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@packages/cli/src/cli/utils/lockfile.ts`:
- Around line 131-168: The code currently only deduplicates translation keys
within each pathItem but doesn’t merge multiple path-hash blocks, causing
earlier blocks to be dropped by doc.toJSON(); update the logic that iterates
checksums.items to first coalesce/merge path-hash blocks by key (use the
pathItem.key value as the identifier) into a single YAML.Map per path-hash,
concatenating their .items arrays, then run the existing per-block dedup of
translation keys on the merged map; ensure you update duplicatesRemoved to count
both merged duplicate blocks and removed duplicate translation keys so the file
rewrite still occurs, and keep references to YAML.isMap(pathItem.value),
pathItem.value.items, checksums, duplicatesRemoved and doc.toJSON() to locate
where to change the code.

Comment on lines +131 to +168

```typescript
if (doc.contents && YAML.isMap(doc.contents)) {
  const checksums = doc.contents.get('checksums');
  if (checksums && YAML.isMap(checksums)) {
    // Iterate through each path pattern hash
    for (const pathItem of checksums.items) {
      if (YAML.isMap(pathItem.value)) {
        // Track key positions - last occurrence wins
        const keyPositions = new Map<string, number[]>();

        // First pass: collect all positions for each key
        for (let i = 0; i < pathItem.value.items.length; i++) {
          const translationItem = pathItem.value.items[i];
          const key = String(
            YAML.isScalar(translationItem.key) ? translationItem.key.value : translationItem.key,
          );

          if (!keyPositions.has(key)) {
            keyPositions.set(key, []);
          }
          keyPositions.get(key)!.push(i);
        }

        // Second pass: identify duplicates to remove (all but the last occurrence)
        const indicesToRemove: number[] = [];
        for (const positions of keyPositions.values()) {
          if (positions.length > 1) {
            // Keep the last occurrence, remove all earlier ones
            indicesToRemove.push(...positions.slice(0, -1));
            duplicatesRemoved += positions.length - 1;
          }
        }

        // Remove items in reverse order to maintain correct indices
        indicesToRemove.sort((a, b) => b - a);
        for (const index of indicesToRemove) {
          pathItem.value.items.splice(index, 1);
        }
      }
    }
  }
}
```

@coderabbitai coderabbitai bot Jan 23, 2026


⚠️ Potential issue | 🟠 Major

Handle duplicate path-hash blocks to avoid silent key loss.

If the YAML contains the same path-hash key multiple times, the current loop only deduplicates within each block. When doc.toJSON() runs, earlier blocks are discarded entirely, which drops unique translation keys and violates the “last occurrence wins per translationKey” contract. Please merge duplicate path-hash blocks before deduping translation keys (and count those removals so the file is rewritten). Consider adding a regression test for duplicate path-hash blocks.
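The failure mode can be reproduced with plain objects (hypothetical shapes; the `key1`/`key2` names are illustrative): converting duplicate top-level keys with last-block-wins semantics discards unique nested keys from earlier blocks, while merging the blocks first preserves them.

```typescript
// Two blocks under the same path-hash key, as a merged lockfile might contain.
const blocks: Record<string, string>[] = [
  { key1: "aaa", key2: "bbb" }, // earlier block: key2 only lives here
  { key1: "ccc" },              // later block
];

// Naive last-block-wins (what collapsing duplicate map keys effectively does):
const naive = blocks[blocks.length - 1];
console.log(naive); // { key1: "ccc" } — key2 silently lost

// Merge blocks first, then let later entries win per translation key:
const merged = Object.assign({}, ...blocks);
console.log(merged); // { key1: "ccc", key2: "bbb" }
```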

🔧 Possible fix (merge duplicate path-hash blocks before per-key dedup)

```diff
-      // Iterate through each path pattern hash
-      for (const pathItem of checksums.items) {
-        if (YAML.isMap(pathItem.value)) {
+      // Group duplicate path pattern hashes so we can merge them (last occurrence wins)
+      const pathGroups = new Map<string, { indices: number[]; items: typeof checksums.items }>();
+      checksums.items.forEach((pathItem, index) => {
+        const pathKey = String(
+          YAML.isScalar(pathItem.key) ? pathItem.key.value : pathItem.key,
+        );
+        const group = pathGroups.get(pathKey) ?? { indices: [], items: [] };
+        group.indices.push(index);
+        group.items.push(pathItem);
+        pathGroups.set(pathKey, group);
+      });
+
+      const pathIndicesToRemove: number[] = [];
+      for (const { indices, items } of pathGroups.values()) {
+        const target = items[items.length - 1];
+        if (YAML.isMap(target.value)) {
+          // Merge all items in original order so later entries still win
+          target.value.items = items.flatMap((item) =>
+            YAML.isMap(item.value) ? item.value.items : [],
+          );
+          if (indices.length > 1) {
+            pathIndicesToRemove.push(...indices.slice(0, -1));
+            duplicatesRemoved += indices.length - 1;
+          }
+
           // Track key positions - last occurrence wins
           const keyPositions = new Map<string, number[]>();
           ...
         }
       }
+
+      pathIndicesToRemove.sort((a, b) => b - a);
+      for (const index of pathIndicesToRemove) {
+        checksums.items.splice(index, 1);
+      }
```

Contributor


I think "the current loop only deduplicates within each block" is expected behavior.

@cherkanovart Let’s add a test for this case.

The test case “should deduplicate across multiple path patterns independently” explicitly verifies this: the same key (key1) in both pathHash1 and pathHash2 is preserved.



Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!
