For a recent project (PeakVibe.com) there was a need to create a tag system to link various data together. In this case it was event data with various tags associated to each event. Since each event could have many tags and each tag could have many events a many-to-many relational database schema was needed.
Our database design involved three tables:
1) Events – holds all events and their associated event information
2) Tags – holds all the tags and their id number
3) Event_Tags – creates the relational link tying events and tags together (junction table)
Below is a visual representation of the table structure:
There are several sites that discuss the advantages to different database designs involved in a tagging system. Some even dive deep into the performance on each different type and the associated queries. I am not going to cover those details here as I simply just want to show a quick use case for this data model that worked for an event listing web application. For this project the many-to-many relational structure works great and performs well for data around 10,000+ rows using the queries outlined below.
Further down below you will see a quick PHP example of how to insert tags into the various tables we mentioned above. Depending on what your language of preference is, this may or may not be helpful. But first lets show quickly how to query some of the tags once we enter the various rows. Data is being pulled using SELECT statements. Here are a few handy MySQL queries to pull the data:
Return all tags for a specific event:
1 2 3 4 5 |
SELECT t.tag FROM tags t INNER JOIN event_tags et on t.id = et.tag_id INNER JOIN events e on et.event_id = e.event_id WHERE e.event_id = 254 |
Return all events having tag “beer”:
1 2 3 4 5 6 7 |
SELECT DISTINCT e. event_id, event_name FROM `events` e INNER JOIN event_tags et ON et.event_id = e.event_id INNER JOIN tags t ON t.id = et.tag_id WHERE t.tag IN ('beer') |
Return top tags (most used):
1 2 3 4 5 6 |
SELECT t.* FROM pv_tags AS t LEFT JOIN event_tags AS et ON t.id = et.tag_id GROUP BY t.id ORDER BY COUNT(t.id) DESC LIMIT 50 |
As you can see, we use JOIN to align the tag query with the event entry. This could be an area worth looking at to determine which has better performance. I prefer the JOIN method as I think it is easy to understand and performance is good for typical uses on most data sets.
For inserting the tags into a relational database design I have come up with the below rough PHP. Again, this may or may not be helpful depending on your use, but its worth mentioning for context of this article and as an example of where to get started.
Here we first check if the tag already exists, if it does we use the existing ID, if not we insert the new tag and ID into the ‘tags’ table. We then add the relation link into the ‘event_tags’ table to make the relational link between event ID and tag ID.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 |
$event_id = mysqli_insert_id($conn); //reference the insert ID of the event data row // (this insert is not shown here - it was assumed to be done ahead of this) // $conn was already completed and not shown here in full $tags_rel = array_keys($tags); // The $tags variable is already completed and not shown in this example for ($x = 0; $x < count($tags_rel); $x++) { //Check if tag already exists $tagquery = $conn->query('SELECT id FROM tags WHERE tag ="'.$tags_rel[$x].'"')->fetch_object()->id; if (isset($tagquery)) { //if the tag ID already exists... $tag_id = $tagquery; //set the tag_id for the relation to the existing ID }else { //Add new tag if does not exist $queryt = "INSERT INTO `tags` (`id`, `tag`) VALUES ('', '" . $tags_rel[$x] . "')"; $maket = $conn->prepare($queryt); if($maket->execute()){ print 'Success! Tag '. $tags_rel[$x] .' has been inserted: <br />'; }else{ print('Error : ('. $conn->errno .') '. $conn->error . '<br />'); } //Get tag id $tag_id = mysqli_insert_id($conn); $maket->close(); } //Add the relational link $querytm = "INSERT INTO `event_tags` (`event_id`, `tag_id`) VALUES ('$event_id', '$tag_id')"; $maketm = $conn->prepare($querytm); if($maketm->execute()){ print 'Success! ID of last inserted record is : ' .$maketm->insert_id .'<br />'; }else{ print('Error : ('. $conn->errno .') '. $conn->error . '<br />'); } $maketm->close(); } |
Again, the above snippet is for reference more than anything else. You could bring aspects of this into your own scripts if using PHP. This concept works great for my use and has proven itself in a production setting.
As always: Ask any questions in the comments field below. Post any suggestions on how you would do this differently or improve on the above.