Interesting post over at the Flare blog, where they had to delete over 900 million records from a MySQL table. They couldn’t simply run a DELETE FROM table query, as that would lock it:
When a deletion query using a WHERE clause starts, it will lock that table. All other queries that are run against the table will have to wait. Because of the massive size of the table, the deletion query would probably run for hours. The lock would be in place for hours, essentially halting the entire app.
What I learnt from the post is that you can add a LIMIT clause to a DELETE query:
# Will delete at most 1000 matching rows per run
DELETE FROM `events`
WHERE `added_on` < "2020-12-13 23:00:00"
LIMIT 1000;
That way you can chunk your DELETEs, so that your table isn’t locked for an indefinite amount of time. Here’s a little script that I built to run locally:
<?php

$cutoffDate = new \DateTime('-1 month');
$batchSize = 1000;
$db = …; // your database connection; executeStatement() here matches Doctrine DBAL’s API

echo "♺ Deleting events before " . $cutoffDate->format(\DateTime::RFC3339) . PHP_EOL;

$total = 0;
do {
    // Delete one batch; executeStatement() returns the number of affected rows
    $res = $db->executeStatement(
        'DELETE FROM `events` WHERE `added_on` < ? LIMIT ' . $batchSize . ';',
        [$cutoffDate->format('U')]
    );
    echo " - [" . date(\DateTime::RFC3339) . "] Deleted " . $res . " rows" . PHP_EOL;
    $total += $res;
} while ($res == $batchSize); // a partial batch means no matching rows remain

echo "\x07✔ Deleted a total of $total rows" . PHP_EOL . PHP_EOL;
In their blog post they mention other ways to perform these kinds of mass deletions, including Laravel’s built-in jobs mechanism.
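As a rough sketch of that jobs idea, here’s what a self-requeueing queued job could look like. This is my own assumption of how you might wire it up, not code from their post; the DeleteOldEvents class name and batch size are made up:

<?php

namespace App\Jobs;

use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Support\Facades\DB;

class DeleteOldEvents implements ShouldQueue
{
    use Dispatchable, Queueable;

    private const BATCH_SIZE = 1000;

    public function handle(): void
    {
        // Delete at most one batch of rows older than a month
        $deleted = DB::delete(
            'DELETE FROM `events` WHERE `added_on` < ? LIMIT ' . self::BATCH_SIZE,
            [now()->subMonth()->getTimestamp()]
        );

        // A full batch means there are likely more matching rows left,
        // so dispatch a follow-up job instead of looping in-process.
        // That keeps each queue run short.
        if ($deleted === self::BATCH_SIZE) {
            self::dispatch();
        }
    }
}

Kick it off once with DeleteOldEvents::dispatch() and the queue worker keeps chaining batches until the table is clean.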
How to safely delete records in massive tables on AWS using Laravel →