13 December, 2007

How to create a tag cloud? (With formula and sample calculation)

I googled how to create a tag cloud. I found some tutorials, but I didn't like their approach because I think they did it the wrong way. That's why I wrote this post: it's my turn to share something educational.

But before anything else, what is a tag cloud? Let me define it in my own words. Visually, it is a group of terms displayed with varying font sizes and packed together so that it resembles a cumulus cloud. It is usually arranged alphabetically and center-aligned. Some tag clouds also have varying colors. In HTML, each tag is usually a hyperlink. Conceptually, each tag isn't a mere term; a tag in a tag cloud represents an idea, a concept, or something that can be weighted, so a bigger tag means a greater value or interest. (For example, the Flickr tag cloud: http://www.flickr.com/photos/tags/ )

Now the question is how. How do you make the tag sizes vary? Simple. In HTML, just use the CSS font-size property.

Example:
<a href="http://www.blogger.com/mylink" style="font-size: 20px"> tag item </a>

Look at the example above. If that looks strange to you, then stop reading right now and go away because you're not my target reader.

If you're still reading, then you know that that's an HTML tag for a link.

To have a tag cloud, you need many tags with varying font sizes. That's easy, isn't it? The hard part is generating those tags dynamically and computing the right size for each tag.

What you need is a database of tags. Query your database to get the list of tags and their number of occurrences. See the following table for an example.

tag          | occurrences
-------------+------------
birthday     | 144
christmas    | 108
valentines   | 211
thanksgiving | 168
liberation   | 88
halloween    | 114
new year     | 140
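
Here is a rough sketch of how such a table could come out of a database. The post doesn't say what the schema looks like, so the `taggings` table (one row per use of a tag), the `load_tag_counts` helper, and the in-memory SQLite database are all my assumptions for illustration.

```python
import sqlite3

# Sample counts copied from the table above.
SAMPLE = {
    "birthday": 144, "christmas": 108, "valentines": 211, "thanksgiving": 168,
    "liberation": 88, "halloween": 114, "new year": 140,
}


def load_tag_counts(conn):
    """Return {tag: occurrences} from the hypothetical `taggings` table."""
    rows = conn.execute(
        "SELECT tag, COUNT(*) FROM taggings GROUP BY tag ORDER BY tag"
    )
    return dict(rows)


# Build a throwaway in-memory database with one row per tag use (assumed schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE taggings (tag TEXT NOT NULL)")
for tag, n in SAMPLE.items():
    conn.executemany("INSERT INTO taggings (tag) VALUES (?)", [(tag,)] * n)

tag_counts = load_tag_counts(conn)
print(tag_counts)
```

If your database already stores a count per tag, a plain SELECT of those two columns does the same job.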

The table above is our sample data. Each tag represents your customers' favorite holiday. How can you present the tags as a tag cloud, with valentines as the biggest (50px font size) and liberation as the smallest (12px font size)?

We will use the following variables:
a = the smallest count (or occurrence).
b = the count of the tag being computed.
c = the largest count.
w = the smallest font-size.
x = the font-size for the tag. It is the unknown.
y = the largest font-size.


Now let's substitute the given values into their respective variables, assuming that we are solving for the "thanksgiving" font size.
a = 88
b = 168
c = 211
w = 12
x = ?
y = 50

And here's the formula:

    (b-a)(y-w)
x = ---------- + w
      (c-a)

Or to put it in one liner (using c-like syntax):

x = ( ((b-a) * (y-w)) / (c-a) ) + w

And that's it. That's the formula. You might be wondering where I got that formula. Well, it's hard to explain in words, but let me try. Using "ratio and proportion" from Mathematics, the ratio of the distance from a to b to the distance from a to c is set equal to the ratio of the distance from w to x to the distance from w to y.

Or, to put it simply,

 b-a     x-w
----- = -----
 c-a     y-w
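
For anyone who wants the missing algebra, here is how the proportion turns into the formula. It assumes c is greater than a, so the division is defined.

```latex
\begin{align*}
\frac{b-a}{c-a} &= \frac{x-w}{y-w} \\
x - w &= \frac{(b-a)(y-w)}{c-a} \\
x &= \frac{(b-a)(y-w)}{c-a} + w
\end{align*}
```

The second line multiplies both sides by (y-w), and the third adds w to both sides.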

Let's now continue computing the font size for thanksgiving. Substituting the values into the formula, we get...

x = ( ((168-88) * (50-12)) / (211-88) ) + 12
x ≈ 36.715447
x ≈ 37 (rounded to the nearest pixel)
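
The same arithmetic can be checked with a few lines of Python. This is only a sketch: the function name `font_size` is mine, and the fallback for the case where every tag has the same count is an assumption the formula itself doesn't cover.

```python
def font_size(b, a, c, w, y):
    """Linearly map count b from the range [a, c] onto the font range [w, y]."""
    if c == a:
        # Assumption, not from the formula: if all counts are equal,
        # use the smallest size instead of dividing by zero.
        return float(w)
    return ((b - a) * (y - w)) / (c - a) + w


x = font_size(b=168, a=88, c=211, w=12, y=50)
print(x)         # about 36.7154
print(round(x))  # 37
```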

The thanksgiving tag should have a 37px font size in the tag cloud. Try computing the rest of the tags. You should get:
birthday = 29px
christmas = 18px
valentines = 50px
thanksgiving = 37px
liberation = 12px
halloween = 20px
new year = 28px
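
To tie it together, here is a minimal sketch that computes every size in the list above and prints the cloud as HTML links, sorted alphabetically and center-aligned as described earlier. The names `scale` and `render_cloud`, the wrapping div, and the single placeholder URL are my assumptions; a real page would link each tag to its own URL.

```python
from html import escape

counts = {
    "birthday": 144, "christmas": 108, "valentines": 211, "thanksgiving": 168,
    "liberation": 88, "halloween": 114, "new year": 140,
}
MIN_PX, MAX_PX = 12, 50  # w and y from the post


def scale(count, lo, hi):
    """Apply the linear formula and round to the nearest whole pixel."""
    if hi == lo:
        return MIN_PX  # assumption: equal counts all get the smallest size
    return round((count - lo) * (MAX_PX - MIN_PX) / (hi - lo) + MIN_PX)


lo, hi = min(counts.values()), max(counts.values())
sizes = {tag: scale(n, lo, hi) for tag, n in counts.items()}


def render_cloud(sizes, base_url="http://www.blogger.com/mylink"):
    """Build one link per tag, in alphabetical order (placeholder URL)."""
    links = [
        f'<a href="{escape(base_url)}" style="font-size: {sizes[tag]}px">'
        f"{escape(tag)}</a>"
        for tag in sorted(sizes)
    ]
    return '<div style="text-align: center">\n' + "\n".join(links) + "\n</div>"


cloud_html = render_cloud(sizes)
print(sizes)
print(cloud_html)
```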

--End

Tip: When using Java, do the arithmetic on float (or double) values, not int, because integer division truncates the result.
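
Here is a small sketch of what goes wrong with integers. It is written in Python, where the `//` operator stands in for Java's int division (the two behave the same for the non-negative values used here).

```python
a, b, c = 88, 168, 211  # smallest count, thanksgiving count, largest count
w, y = 12, 50           # smallest and largest font size

# Floating-point division keeps the fraction.
float_result = ((b - a) * (y - w)) / (c - a) + w

# Integer division after multiplying: the fraction is dropped.
int_result = ((b - a) * (y - w)) // (c - a) + w

# Integer division before multiplying: the ratio collapses to zero.
int_ratio_first = ((b - a) // (c - a)) * (y - w) + w

print(float_result)     # about 36.7154, which rounds to 37
print(int_result)       # 36
print(int_ratio_first)  # 12
```

So with integers the thanksgiving tag comes out at 36px, or even 12px if the ratio is divided first, instead of 37px.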

23 comments:

Ratn Dwivedi said...

Good one. I usually use this formula for tag clouds. It gives good control and a linear proportion between font size and occurrence.

Unknown said...

Thanks for the article. It should be mentioned, however, that for most purposes a logarithmic (rather than linear) size relationship between the tags works better. Just google tag cloud logarithmic and you can see some examples.

Ratn Dwivedi said...

Adding to Kevin's comment: logarithmic is a good one, and exponential gives the reverse effect of logarithmic.

Anonymous said...

hmm.. stumbled by accident here...
but nevertheless it's a good formula. I use it for color instead of font size and not just shades of a base color.. i mean from green to red ^^

Anonymous said...

Fantastic, thank you.

Lucky said...

That's great, well written and simple, but you forgot the goal: how to put those labels in a div.

V.B.Sharma said...

Thanks, it works for me. But I also want a formula to compute the max and min font size according to the total number of words.
E.g., if there are only 5 words, the screen should not look blank; if there are 100 words, the screen should fit as many as it can.

Milox said...

Thank you man
