UPDATE vs DELETE and INSERT
We have about 3 billion rows (and growing). For simplicity's sake, let's say the table consists of a non-unique id, a timestamp, a calculated DOUBLE, and an original-value DOUBLE. The combination of id and timestamp should be unique.
The data is captured roughly in timestamp sequence. Sometimes it is necessary to recalculate the calculated value for some of the ids, tens of thousands of rows at a time. When that happens, should we delete and re-insert with the new calculated value, and will the original extents be reused? Or am I better off updating the rows, keeping each timestamp in its original extent? Or do I completely misunderstand?
I'm thinking not just about the performance of delete and insert versus update, but also about future queries.
Answer (by David Hall):
You absolutely understand the issue.
For best future query performance, it's better to do an update to keep the same extents. Delete and insert will not necessarily use the same extents. For a table of that size, it would be unlikely to do so. Furthermore, delete can leave "holes" in your data. ColumnStore does not reuse deleted space within an active extent.
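As a rough illustration of the update-in-place approach, here is a minimal sketch. The table name `readings`, the column names, and the `recalculate` helper are all hypothetical, and the recalculation itself (multiplying the original value by a factor) is just a placeholder. Python's built-in `sqlite3` is used only so the sketch runs standalone; it has no extents and says nothing about ColumnStore storage behavior. The point is only the shape of the statement: one UPDATE that rewrites the calculated column for a batch of ids, leaving id and timestamp untouched.

```python
import sqlite3

# Hypothetical schema; sqlite3 stands in for the real server so this runs anywhere.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE readings ("
    " id INTEGER, ts TEXT, calc_value REAL, orig_value REAL,"
    " PRIMARY KEY (id, ts))"
)
rows = [
    (1, "2020-01-01 00:00:00", 0.0, 10.0),
    (1, "2020-01-01 00:01:00", 0.0, 20.0),
    (2, "2020-01-01 00:00:00", 0.0, 30.0),
]
conn.executemany("INSERT INTO readings VALUES (?, ?, ?, ?)", rows)


def recalculate(conn, ids, factor):
    """Recompute calc_value in place for the given ids with a single UPDATE.

    The formula (orig_value * factor) is a placeholder for the real calculation.
    Returns the number of rows updated.
    """
    if not ids:
        return 0
    placeholders = ",".join("?" for _ in ids)
    cur = conn.execute(
        "UPDATE readings SET calc_value = orig_value * ? "
        f"WHERE id IN ({placeholders})",
        [factor, *ids],
    )
    conn.commit()
    return cur.rowcount


# Recalculate only id 1; rows for id 2 are left as they were.
updated = recalculate(conn, [1], 1.5)
result = conn.execute(
    "SELECT id, ts, calc_value FROM readings ORDER BY id, ts"
).fetchall()
```

The delete-and-insert alternative would replace the single UPDATE with a DELETE for those ids followed by an INSERT of the recomputed rows, which is the pattern the answer above advises against on ColumnStore.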