feat: unify lockfile deduplication logic #1873
base: main
Conversation
📝 Walkthrough

A universal lockfile deduplication function replaces three duplicate implementations, removing duplicate entries from i18n.lock files while preserving the last occurrence of each key. Integration into the load flow ensures automatic deduplication and persistence of the cleaned content. Comprehensive test coverage validates behavior across edge cases, including Git merge conflicts.
Sequence Diagram

```mermaid
sequenceDiagram
    actor Caller
    participant LoadLock as loadLock()
    participant Dedup as deduplicateLockfileYaml()
    participant FS as Filesystem
    participant Parser as YAML Parser
    Caller->>LoadLock: request load lockfile
    LoadLock->>FS: read `i18n.lock`
    FS-->>LoadLock: raw YAML content
    LoadLock->>Dedup: deduplicateLockfileYaml(content)
    Dedup-->>LoadLock: deduplicated content, duplicates count
    alt duplicates removed
        LoadLock->>FS: write deduplicated content
        FS-->>LoadLock: write complete
        LoadLock->>LoadLock: log removal message
    end
    LoadLock->>Parser: parse deduplicated YAML
    Parser-->>LoadLock: LockSchema object
    LoadLock-->>Caller: return lock object
```
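For orientation, here is a minimal sketch of how the shared helper could be wired into the load path shown above. The file-IO details and schema validation in the real `loadLock()` / `_loadLockfile()` may differ, and `loadLockSketch` is a hypothetical name used only for illustration:

```ts
import fs from "node:fs";
import * as YAML from "yaml";
import { deduplicateLockfileYaml } from "./lockfile"; // exported helper introduced by this PR

// Hypothetical wiring that mirrors the sequence diagram: read -> dedup -> persist -> parse.
function loadLockSketch(lockfilePath: string): unknown {
  const raw = fs.readFileSync(lockfilePath, "utf8");

  const { deduplicatedContent, duplicatesRemoved } = deduplicateLockfileYaml(raw);

  if (duplicatesRemoved > 0) {
    // Persist the cleaned content and log what happened, as the PR describes.
    fs.writeFileSync(lockfilePath, deduplicatedContent, "utf8");
    console.log(
      `Removed ${duplicatesRemoved} duplicate ${duplicatesRemoved === 1 ? "entry" : "entries"} from i18n.lock`,
    );
  }

  // The real implementation validates the result against LockSchema; plain YAML.parse stands in here.
  return YAML.parse(deduplicatedContent);
}
```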
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@packages/cli/src/cli/utils/lockfile.ts`:
- Around line 131-168: The code currently only deduplicates translation keys
within each pathItem but doesn’t merge multiple path-hash blocks, causing
earlier blocks to be dropped by doc.toJSON(); update the logic that iterates
checksums.items to first coalesce/merge path-hash blocks by key (use the
pathItem.key value as the identifier) into a single YAML.Map per path-hash,
concatenating their .items arrays, then run the existing per-block dedup of
translation keys on the merged map; ensure you update duplicatesRemoved to count
both merged duplicate blocks and removed duplicate translation keys so the file
rewrite still occurs, and keep references to YAML.isMap(pathItem.value),
pathItem.value.items, checksums, duplicatesRemoved and doc.toJSON() to locate
where to change the code.
```ts
if (doc.contents && YAML.isMap(doc.contents)) {
  const checksums = doc.contents.get('checksums');
  if (checksums && YAML.isMap(checksums)) {
    // Iterate through each path pattern hash
    for (const pathItem of checksums.items) {
      if (YAML.isMap(pathItem.value)) {
        // Track key positions - last occurrence wins
        const keyPositions = new Map<string, number[]>();

        // First pass: collect all positions for each key
        for (let i = 0; i < pathItem.value.items.length; i++) {
          const translationItem = pathItem.value.items[i];
          const key = String(YAML.isScalar(translationItem.key) ? translationItem.key.value : translationItem.key);

          if (!keyPositions.has(key)) {
            keyPositions.set(key, []);
          }
          keyPositions.get(key)!.push(i);
        }

        // Second pass: identify duplicates to remove (all but the last occurrence)
        const indicesToRemove: number[] = [];
        for (const positions of keyPositions.values()) {
          if (positions.length > 1) {
            // Keep the last occurrence, remove all earlier ones
            indicesToRemove.push(...positions.slice(0, -1));
            duplicatesRemoved += positions.length - 1;
          }
        }

        // Remove items in reverse order to maintain correct indices
        indicesToRemove.sort((a, b) => b - a);
        for (const index of indicesToRemove) {
          pathItem.value.items.splice(index, 1);
        }
      }
    }
  }
}
```
Handle duplicate path-hash blocks to avoid silent key loss.
If the YAML contains the same path-hash key multiple times, the current loop only deduplicates within each block. When doc.toJSON() runs, earlier blocks are discarded entirely, which drops unique translation keys and violates the “last occurrence wins per translationKey” contract. Please merge duplicate path-hash blocks before deduping translation keys (and count those removals so the file is rewritten). Consider adding a regression test for duplicate path-hash blocks.
🔧 Possible fix (merge duplicate path-hash blocks before per-key dedup)
```diff
-    // Iterate through each path pattern hash
-    for (const pathItem of checksums.items) {
-      if (YAML.isMap(pathItem.value)) {
+    // Group duplicate path pattern hashes so we can merge them (last occurrence wins)
+    const pathGroups = new Map<string, { indices: number[]; items: typeof checksums.items }>();
+    checksums.items.forEach((pathItem, index) => {
+      const pathKey = String(
+        YAML.isScalar(pathItem.key) ? pathItem.key.value : pathItem.key,
+      );
+      const group = pathGroups.get(pathKey) ?? { indices: [], items: [] };
+      group.indices.push(index);
+      group.items.push(pathItem);
+      pathGroups.set(pathKey, group);
+    });
+
+    const pathIndicesToRemove: number[] = [];
+    for (const { indices, items } of pathGroups.values()) {
+      const target = items[items.length - 1];
+      if (YAML.isMap(target.value)) {
+        // Merge all items in original order so later entries still win
+        target.value.items = items.flatMap((item) =>
+          YAML.isMap(item.value) ? item.value.items : [],
+        );
+        if (indices.length > 1) {
+          pathIndicesToRemove.push(...indices.slice(0, -1));
+          duplicatesRemoved += indices.length - 1;
+        }
+
         // Track key positions - last occurrence wins
         const keyPositions = new Map<string, number[]>();
         ...
       }
     }
+
+    pathIndicesToRemove.sort((a, b) => b - a);
+    for (const index of pathIndicesToRemove) {
+      checksums.items.splice(index, 1);
+    }
```
I think this, “the current loop only deduplicates within each block”, is expected behavior.
@cherkanovart Let’s add a test for this case.
The test case “should deduplicate across multiple path patterns independently” explicitly verifies this: the same key (key1) in both pathHash1 and pathHash2 is preserved.
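For reference, a condensed sketch of what that existing test presumably asserts; the checksum values here are placeholders and the actual fixture and runner in the repo may differ:

```ts
import { expect, it } from "vitest";
import * as YAML from "yaml";
import { deduplicateLockfileYaml } from "./lockfile";

it("should deduplicate across multiple path patterns independently", () => {
  // key1 appears under two different path-hashes, so neither occurrence is a duplicate.
  const raw = [
    "checksums:",
    "  pathHash1:",
    "    key1: checksumA",
    "  pathHash2:",
    "    key1: checksumB",
  ].join("\n");

  const { deduplicatedContent, duplicatesRemoved } = deduplicateLockfileYaml(raw);
  const parsed = YAML.parse(deduplicatedContent);

  expect(duplicatesRemoved).toBe(0);
  expect(parsed.checksums.pathHash1.key1).toBe("checksumA");
  expect(parsed.checksums.pathHash2.key1).toBe("checksumB");
});
```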
Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!
Summary
Automatically deduplicates `i18n.lock` when the YAML contains duplicate keys (e.g. after Git merges), using a shared implementation in both the lockfile and delta code paths, with last-occurrence-wins semantics and an overwrite + log when duplicates are removed.

Changes
- `deduplicateLockfileYaml()` in lockfile.ts — new exported helper: `YAML.parseDocument` → collect key positions in `checksums` → last occurrence wins → remove in reverse order → return `{ deduplicatedContent, duplicatesRemoved }`. Idempotent and deterministic.
- `_loadLockfile()` (lockfile.ts) — uses `deduplicateLockfileYaml`; when `duplicatesRemoved > 0`, overwrites `i18n.lock` and logs: `Removed N duplicate entry/entries from i18n.lock`.
- `loadLock()` (delta.ts) — replaced inline deduplication with `deduplicateLockfileYaml`; same overwrite and `console.log` when `duplicatesRemoved > 0`. Keeps behavior the same for `status`, `run`, `i18n`, `frozen`.
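A quick usage sketch of the helper's contract as described above; the path-hash and key names are invented, and the exact serialization of the returned content may differ:

```ts
import { deduplicateLockfileYaml } from "./lockfile";

// Two entries for `greeting` under the same path-hash; the later checksum should win.
const raw = [
  "checksums:",
  "  abc123:",
  "    greeting: oldChecksum",
  "    farewell: keepMe",
  "    greeting: newChecksum",
].join("\n");

const { deduplicatedContent, duplicatesRemoved } = deduplicateLockfileYaml(raw);

console.log(duplicatesRemoved); // 1
console.log(deduplicatedContent.includes("greeting: newChecksum")); // true (last occurrence kept)
console.log(deduplicatedContent.includes("greeting: oldChecksum")); // false
```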
Testing

Business logic tests added:

- `deduplicateLockfileYaml` — no duplicates — unchanged content and `duplicatesRemoved === 0` when the lockfile is clean.
- Duplicates present: `duplicatesRemoved` and the parsed structure are correct.
- Missing `checksums` — no errors; `duplicatesRemoved === 0` and structure preserved.
- Idempotency: `duplicatesRemoved === 0` and identical output on a second pass.
- Git merge conflict duplicates: correct `duplicatesRemoved` and final keys/checksums.

Visuals
Required for UI/UX changes:
N/A — no UI changes.
Checklist
Closes #[issue-number]
Summary by CodeRabbit
New Features
Refactor
Tests