Tag Archives: database design

Me count(*)

I just finished upgrading to WordPress 2.3: so, curious as any developer would be, I took a look in the WordPress database definitions and noticed that three new tables were added. All of them take care of categories and (the new) tagging system. This means, Wp_Cat is out and has been replaced with wp_terms: the actual distinction (i.e., which terms is a category and which one is a tag) is now made in the the wp_term_taxonomy table.

I have issues with that last table because it has a count column (to track the number of posts). This is the second table that has a count column (earlier): I mentioned before that version 2.0 introduced a comment_count in the wp_posts table. Why not make use of the regular aggregate functions (like the standard COUNT(*))? After all, these aggregates are generally highly optimized functions (written in C) for tables with the same primary key(s). Also, as a good database designer, during database design you should take the use of aggregate functions in account when setting up your tables structure.

Notice that count of posts for a tag/category can be done simply by:

select count(*),
p.term_taxonomy_id,
q.taxonomy,
t.name
from wp_term_relationships p,
wp_term_taxonomy q,
wp_terms t
where p.term_taxonomy_id
= q.term_taxonomy_id
and q.term_id = t.term_id
group by p.term_taxonomy_id,
q.taxonomy, t.name

Sure: a query on that same table with that count column may look as easy as ‘select count from wp_term_taxonomy’, but remember, it (always) takes an extra write to put a number in that table (via a regular UPDATE). If your transaction on that table ever failed, your fancy report will most likely report wrong numbers.