Within our MongoDB collection, we have identified duplicate revisions pertaining to the same transaction.
Our goal is to clean up this collection by retaining only the most recent revision for each transaction. I have devised a script that successfully removes duplicates; however, in certain instances, it fails to preserve the latest revision. For reference, you can access the script here: https://gist.github.com/praveenkrjha/1f02ed675e297a2c85e9020e39320a41
The initial record (with transaction id (tid) and revision number (rv)):
_id: 61b994a48d546d882462da25, tid: b1def2a8-01e0-4f8c-a859-7cd63287617c, rv: 42
_id: 61b994a48d546d882462da27, tid: b1def2a8-01e0-4f8c-a859-7cd63287617c, rv: 42
_id: 61b994a48d546d882462da29, tid: b1def2a8-01e0-4f8c-a859-7cd63287617c, rv: 42
_id: 61b994a48d546d882462da2b, tid: d9b9e197-6f91-480c-8757-2a24330bc4c5, rv: 21
_id: 61b994a48d546d882462da2d, tid: d9b9e197-6f91-480c-8757-2a24330bc4c5, rv: 22
_id: 61b994a48d546d882462da2f, tid: df0f1252-51d3-4471-be68-a81e8b9a11e2, rv: 31
_id: 61b994a48d546d882462da31, tid: df0f1252-51d3-4471-be68-a81e8b9a11e2, rv: 32
_id: 61b994a48d546d882462da33, tid: bc92c1ee-91b4-4eea-b9ad-752f8dba6c54, rv: 11
_id: 61b994a48d546d882462da35, tid: bc92c1ee-91b4-4eea-b9ad-752f8dba6c54, rv: 12
_id: 61b994a48d546d882462da37, tid: 2a131e9a-07f1-413f-a21f-4f1bbbb3d460, rv: 1
_id: 61b994a48d546d882462da39, tid: e2c7fb60-a1e2-46ea-8d6a-e94e14ded2d3, rv: 1
_id: 61b994a48d546d882462da3b, tid: 314f7cb2-1aad-4add-b2e1-17e910d97713, rv: 1
The resulting data after cleaning up:
_id: 61b994a48d546d882462da25, tid: b1def2a8-01e0-4f8c-a859-7cd63287617c, rv: 42
_id: 61b994a48d546d882462da2b, tid: d9b9e197-6f91-480c-8757-2a24330bc4c5, rv: 21
_id: 61b994a48d546d882462da2f, tid: df0f1252-51d3-4471-be68-a81e8b9a11e2, rv: 31
_id: 61b994a48d546d882462da33, tid: bc92c1ee-91b4-4eea-b9ad-752f8dba6c54, rv: 11
_id: 61b994a48d546d882462da37, tid: 2a131e9a-07f1-413f-a21f-4f1bbbb3d460, rv: 1
_id: 61b994a48d546d882462da39, tid: e2c7fb60-a1e2-46ea-8d6a-e94e14ded2d3, rv: 1
_id: 61b994a48d546d882462da3b, tid: 314f7cb2-1aad-4add-b2e1-17e910d97713, rv: 1
Upon reviewing the final results, it appears that the script mistakenly deletes the latest revision for some transactions. Any suggestions on how this issue can be addressed?