CSV Round-Trip Broke On Real-World Notes And Timestamps

One of the most dangerous engineering bugs is the one that looks correct in output, passes simple tests, and still loses user data in the real world.

That is exactly what happened in this CSV round-trip case.

Here is the trap in one sentence:

a valid exported row can still be destroyed by a bad import assumption.

At a surface level, everything looked fine:

export produced a CSV file
quoted fields were present
rows looked normal on happy-path records

But that surface success hid a deeper problem: the pipeline only really worked on toy rows.

The first failure mode showed up in notes.

Users could write multi-line notes. Export handled that correctly by quoting the field. So if you only inspected the CSV output, you might think the system was sound.

It was not.

On import, the old implementation split the full file on raw newlines before doing quote-aware CSV parsing. That meant a single valid record with a newline inside a quoted notes field got cut into two half-rows, and both halves became unparseable.

That is a classic integrity trap:

the export is technically valid, but a later stage invalidates the guarantee.

The second failure mode showed up in timestamps.

The export format used minute precision for createdAt. That looked harmless until two distinct events occurred within the same minute. Once exported and re-imported, they could collide on the dedup key and be treated as one event instead of two.

Again, the problem was not obvious by casually reading the file. The file looked normal. The silent damage emerged only when import semantics met realistic usage.

That is what makes this case so useful.

It was not "a CSV parser bug."

It was a broader round-trip integrity bug class caused by assumptions that seemed reasonable in isolation:

rows end at newlines
minute precision is enough
dedup keys only need obvious fields
if export looks right, import is probably fine

Those assumptions fail under user-shaped data.

The hardening pass fixed the bug class in a way that is worth copying:

row splitting became quote-aware
timestamps moved to second precision
dedup keys became stronger
regression tests were added for multiline notes
regression tests were added for same-minute records
regression tests were added for pathological note content and newline normalization

The biggest lesson is simple:

If you claim round-trip support, test round-trip integrity, not just output syntax.

That means using records with:

embedded newlines
commas
quotes
emoji
locale-shaped numbers
near-colliding timestamps

Because that is what real users generate.

Simple CSV fixtures are fine for getting started. They are not enough to prove you are preserving data.

If this kind of deep bug breakdown is useful, I can keep building a library of engineering cases where "the output looked fine" was exactly the trap.