CREATE TRIGGER (mysql)
Jeremy Cole wrote about using calculated indexes (or indices, as the stickler in me prefers) to speed up queries. Doing this in MySQL involves the use of triggers, which are:
[N]amed database object[s] that [are] associated with a table, and that activate[] when a particular event occurs for the table. — http://dev.mysql.com/doc/refman/5.0/en/using-triggers.html
An event in this context would include INSERTs and UPDATEs.
So let’s assume a table:
CREATE TABLE `store` ( `id` INTEGER UNSIGNED NOT NULL AUTO_INCREMENT, `shopper` VARCHAR(45) NOT NULL, `begin_time` VARCHAR(6) NOT NULL, `end_time` VARCHAR(6) NOT NULL, PRIMARY KEY (`id`)) ENGINE = MyISAM;
And some data:
1,Tom,060606,060607 2,Dick,100203,110203 3,Harry,084600,084700
Let’s assume I want to know how much time each shopper spent shopping. The naive query might look something like this:
SELECT shopper, TIMEDIFF(begin_time,end_time) FROM store;
Now, let’s also assume that the table contains 10M records. That’s a lot of calls to TIMEDIFF! Let’s also assume for the sake of argument that sometimes the begin time is later than the end time, and I want to calculate the number of seconds that elapsed between begin_time and end_time, irrespective of direction; i.e., the absolute value.
To MySQL’s credit, the date and time functions make this pretty easy:
SELECT shopper, ABS(TIME_TO_SEC(TIMEDIFF(begin_time,end_time))) FROM store;
Assume I also want to add a SELECT condition to restrict the result set to records where the time difference is within a certain range:
SELECT shopper, ABS(TIME_TO_SEC(TIMEDIFF(begin_time,end_time))) FROM store WHERE ABS(TIME_TO_SEC(TIMEDIFF(begin_time,end_time))) <5;
The above represents sixty-million function calls (10M rows*3 functions, twice).
A far better solution to this problem is to create an additional column, index it, and use insert/update triggers to populate it with the time difference:
ALTER TABLE `store` ADD COLUMN `duration` INTEGER(10) UNSIGNED NOT NULL AFTER `end_time`;
ALTER TABLE `store` ADD INDEX `duration`(`duration`);
CREATE TRIGGER shop_duration
BEFORE INSERT ON store
FOR EACH ROW
SET NEW.duration = ABS(TIME_TO_SEC(TIMEDIFF(NEW.begin_time,NEW.end_time)));
And now the query is optimal:
SELECT shopper, duration FROM store FORCE INDEX (duration) # this may not be necessary WHERE duration <5;
Hopefully MySQL will add function-based indexes, like the ones Oracle has at some point. MySQL’s Jim Starkey’s recent comments are encouraging:
A bunch of happy applications each doing super-fast in-thread queries in nice sandboxes disgorging their results in a single blast of data is a pretty good way to structure things.