Auto-tagging Content with OpenCalais

by: Allan Thræn

One of my big passions has always been intelligent textual analysis in its various forms, probably a remnant of my search-engine days. Over the last couple of years I’ve built a bunch of prototypes that have one thing in common: they focus on the content part of content management. That is, they look at the actual text editors insert and try to do something semi-clever with it. I’ve done prototypes that extend the link collection property type with a “Suggest Related” button, prototypes that attempt to suggest meta-tag keywords automatically, and so on. Most of these I never got around to polishing and blogging about; they have mainly been fun little experiments. One of them dates from 8-9 months ago, when I spent a day playing around with OpenCalais. OpenCalais is an awesome service that lets you submit textual content, which it then tags, extracting key pieces of information such as which people, companies, technologies, etc. are mentioned in the text. I know I’m not alone in having played around with OpenCalais, but I figured I’d share my basic prototype here.

My initial idea was to make a “hidden” connection to OpenCalais, so that whenever a page was published it would automatically be tagged and the tags stored in the categories. However, this turned out to be quite a slow procedure, so I abandoned that idea. My current approach is a tab in edit mode where you can ask to have the page analyzed; if you like the result, you can push a button to save it into the categories.
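To make the two-step flow concrete, here is a minimal sketch in Python. It is not the prototype's code, and it does not call OpenCalais: the `analyze` callback, the `fake_analyze` stand-in, the `(type, name)` entity shape and the `"OpenCalais"` root category name are all assumptions made for illustration. Step one groups the suggested tags by entity type so an editor can review them; step two runs only on request and merges them into a category tree.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple

# Assumed shape of one extracted entity: (entity type, name).
Entity = Tuple[str, str]
CategoryTree = Dict[str, Dict[str, List[str]]]


def suggest_categories(
    text: str, analyze: Callable[[str], Iterable[Entity]]
) -> Dict[str, List[str]]:
    """Step 1: analyze the text and group the results by entity type for review."""
    grouped = defaultdict(set)
    for entity_type, name in analyze(text):
        grouped[entity_type].add(name.strip())
    return {t: sorted(names) for t, names in sorted(grouped.items())}


def save_to_categories(
    existing: CategoryTree,
    suggestions: Dict[str, List[str]],
    root: str = "OpenCalais",  # hypothetical root category name
) -> CategoryTree:
    """Step 2: merge accepted suggestions into a copy of the category tree."""
    tree = {r: {t: list(n) for t, n in types.items()} for r, types in existing.items()}
    branch = tree.setdefault(root, {})
    for entity_type, names in suggestions.items():
        current = branch.setdefault(entity_type, [])
        for name in names:
            if name not in current:  # skip tags that are already stored
                current.append(name)
    return tree


def fake_analyze(text: str) -> List[Entity]:
    """Stand-in for the remote analysis: matches a fixed list of known names."""
    known = [("Company", "EPiServer"), ("Technology", "OpenCalais"), ("Person", "Allan")]
    return [(t, n) for t, n in known if n in text]


suggestions = suggest_categories("EPiServer pages tagged with OpenCalais", fake_analyze)
tree = save_to_categories({}, suggestions)
```

Keeping the slow analysis step separate from the save step is what lets the editor trigger it on demand instead of paying for it on every publish. Saving the same suggestions twice leaves the tree unchanged.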

[Image: the analysis tab in edit mode]

If you choose to save it to categories, it will be set up in a category structure like this:

[Image: the resulting category structure]

The code is still “very prototypish”, so unless I hear a great, instant demand I’m not planning to release the code now. If anybody wants to play around with the binaries, you are welcome to: just unzip this package and copy it to your site folder. It works in both CMS 5 and CMS 6. NOTE: This is a prototype, not production-ready code. But feel free to be inspired. This was done in an afternoon; imagine what you can do with it in a full day.

22 June 2010


Comments

  1. Very cool. We have developed a similar solution on top of FAST ESP, but this product has its price :) Will be cool to deliver this functionality for "free". Would be cool to play around with the code if possible :)
Allan Thræn

About me

I am a product manager @ EPiServer, with a passion for the more geeky side of things. My technical interests are typically focused on user problems, user experience, search, information management, artificial intelligence and personalization.

Besides this blog, I also run the blog Allan On Technology, and I often cross-post.

DISCLAIMER: Unless otherwise stated in the posts, this blog expresses my personal opinions, experiments and views, not necessarily the views of EPiServer AB.


 

 
