How to optimize ORDER BY RANDOM()

Doing an ORDER BY RAND() in SQL is bad. Very bad. As Tobias Petry details (and as Bernard Grymonpon always used to say at local meetups):

Ordering records in a random order involves these operations:

  1. Load all rows into memory matching your conditions
  2. Assign a random value RANDOM() to each row in the database
  3. Sort all the rows according to this random value
  4. Retain only the desired number of records from all sorted records

His solution is to pre-add randomness to each record, in an extra column. For this he uses the geometric data type POINT. In Postgres he then uses the following query, which orders the records by their distance to a new random point.

SELECT * FROM repositories ORDER BY randomness <-> point(0.753,0.294) LIMIT 3;

~

In MySQL you also have a POINT class (ever since MySQL 5.7.6) that you can use. However I don’t really see how that would work there, as you’d need to calculate the distance for each record using a call to ST_Distance:

SET @randomx = RAND();
SET @randomy = RAND();
SELECT *, ST_Distance(POINT(@randomx, @randomy), randomness) AS distance FROM repositories ORDER BY distance DESC LIMIT 0,3;

💁‍♂️ Using EXPLAIN on the query above verifies it doesn’t use an index, and thus goes over all records.

What I do see working instead is the use of a single float value to hold the pre-generated randomness:

-- Add column + index
ALTER TABLE `repositories` ADD `randomness` FLOAT(17,16) UNSIGNED NOT NULL;
ALTER TABLE `repositories` ADD INDEX(`randomness`);

-- Update existing records. New records should have this number pre-generated before inserting
UPDATE `repositories` SET randomness = RAND() WHERE 1;

With that column in place, you could then do something like this:

SET @randomnumber = RAND(); -- This number would typically be generated by your PHP code, and then be injected as a query param
SELECT * FROM repositories WHERE randomness < @randomnumber ORDER BY randomness DESC LIMIT 0,3;

Unlike the query using POINT(), this last query will leverage the index created on the randomness column 🙂
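To try the pre-randomness idea without a MySQL server at hand, here is a minimal sketch in Python. It makes several assumptions: it uses the standard library's sqlite3 module as a stand-in for MySQL, generates the pivot number in application code (as the comment in the query above suggests), and the names build_demo_db, pick_random_rows, query_plan and idx_repositories_randomness are made up for this illustration.

```python
import random
import sqlite3


def build_demo_db(row_count: int, seed: int = 42) -> sqlite3.Connection:
    """Create an in-memory table with a pre-generated, indexed randomness column."""
    rng = random.Random(seed)  # seeded so the demo data is reproducible
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE repositories ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL, randomness REAL NOT NULL)"
    )
    # Hypothetical index name; the index is what lets the range query avoid a full scan.
    conn.execute(
        "CREATE INDEX idx_repositories_randomness ON repositories (randomness)"
    )
    conn.executemany(
        "INSERT INTO repositories (name, randomness) VALUES (?, ?)",
        [(f"repo-{i}", rng.random()) for i in range(row_count)],
    )
    return conn


def pick_random_rows(conn: sqlite3.Connection, limit: int, pivot=None) -> list:
    """Return up to `limit` rows whose randomness lies just below `pivot`."""
    if pivot is None:
        pivot = random.random()  # generated by the application, passed as a query param
    cursor = conn.execute(
        "SELECT id, name, randomness FROM repositories "
        "WHERE randomness < ? ORDER BY randomness DESC LIMIT ?",
        (pivot, limit),
    )
    return cursor.fetchall()


def query_plan(conn: sqlite3.Connection) -> str:
    """Return SQLite's plan for the range query, to check whether the index is used."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT id, name, randomness FROM repositories "
        "WHERE randomness < 0.5 ORDER BY randomness DESC LIMIT 3"
    ).fetchall()
    return " | ".join(row[3] for row in rows)  # the plan text is the fourth column


if __name__ == "__main__":
    conn = build_demo_db(1000)
    print(pick_random_rows(conn, 3))
    print(query_plan(conn))
```

One thing this sketch makes easy to see: when the pivot lands close to 0, fewer rows than requested (or none at all) sit below it, so the calling code may want to retry with a new pivot in that case.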

~

How to optimize ORDER BY RANDOM() →

Via Freek

How to Clean up Async Effects in React

Dmitri Pavlutin walks us through properly cleaning up side-effects in React:

From time to time you might have difficulties at the intersection of component lifecycle (initial render, mount, update, unmount) and the side-effect lifecycle (start, in progress, complete).

Tackled are fetch requests, timers like setTimeout(), debounce or throttle functions, etc.

With the techniques applied, you should no longer see warnings like the one below:

Warning: Can't perform a React state update on an unmounted component.

How to Clean up Async Effects in React →

Chrome 92 — What’s New In DevTools

New in DevTools that ship with Chrome 92 (selection):

What’s New In DevTools (Chrome 92) →

Viewport Unit Based Typography vs. Safari

A common thing to do regarding font-sizing is to use Viewport Unit Based Typography, nowadays often combined with CSS min() or clamp():

:root {
  font-size: min(calc(1em + 1vw), 4em);
}

However, as Sara Soueidan details, Safari doesn’t co-operate here:

In Safari on macOS, the fluid text wasn’t really fluid—resizing the viewport did nothing to the font size, even though the latter is supposed to respond to the change in viewport width.

It’s a bug, slated to be fixed in the next version of Safari (Safari TP already has the fix). In the meantime there’s an easy workaround we can use.

More details + demo on Sara’s blog.

Working around the viewport-based fluid typography bug in Safari →

Re-reading that Viewport Unit Based Typography post from 2016, I now see that it also mentions that Safari doesn’t play nice with it. Let this underline the importance of filing bugs: because Sara filed one, the Safari team learned about the issue and fixed it (very fast, too).

You might as well timestamp it

Jerod Santo:

There are plenty of times in my career when I’ve stored a boolean and later wished I’d had a timestamp. There are zero times when I’ve stored a timestamp and regretted that decision.

Hear hear! Over the years I’ve come to include 9 meta fields for most of the tables I create: added_at, added_by, added_ip, edited_at, edited_by, edited_ip, deleted_at, deleted_by, and deleted_ip. Handy for whenever you receive a phone call saying that a record has disappeared, allowing you to pinpoint it to a user and a specific timestamp.

You might as well timestamp it →