Saturday, October 21, 2006

Unstructured Information Management

Unstructured information makes up most of the information content on the internet today. Some estimates hold that as much as 90% of the available information on the internet is unstructured. So with all the databases, portals, websites, repositories, hard drives and trillions of files that exist today, how do you harness this information? This is where the field of Information Management faces its technical challenge: turning all this raw information into useful information and knowledge. This is entirely conceivable given sufficient time, computing power and storage. The challenge is making it happen in near real time.

To meet this challenge, DARPA funded IBM Research in 2005 to create UIMA, which stands for the Unstructured Information Management Architecture. It is an open, industrial-strength, scalable and extensible platform for creating, integrating and deploying unstructured information management solutions from combinations of semantic analysis and search components. IBM makes UIMA available as a free SDK (alpha) and makes the core Java framework available as open source software (UIMA at SourceForge). The goal is to provide a common foundation for industry and academia to collaborate and accelerate the worldwide development of technologies critical for discovering the vital knowledge present in the fastest growing sources of information today. IBM developerWorks has a tutorial for using the UIMA SDK with Eclipse.
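To make the idea of "combinations of semantic analysis and search components" a little more concrete, here is a minimal sketch in Python. It is not the UIMA API (the real framework is Java). Every name in it (Document, Annotation, year_annotator, acronym_annotator, run_pipeline, search) is made up for illustration. It only shows the general pattern: small analysis components each add structured annotations to raw text, and a search step then queries those annotations.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Annotation:
    """A typed span of text found by an analysis component (hypothetical type)."""
    kind: str
    begin: int
    end: int
    text: str


@dataclass
class Document:
    """Raw text plus whatever structure the components have discovered so far."""
    text: str
    annotations: List[Annotation] = field(default_factory=list)


# An analysis component is just a function that adds annotations to a document.
Annotator = Callable[[Document], None]


def year_annotator(doc: Document) -> None:
    """Mark four-digit years starting with 19 or 20."""
    for m in re.finditer(r"\b(?:19|20)\d{2}\b", doc.text):
        doc.annotations.append(Annotation("Year", m.start(), m.end(), m.group()))


def acronym_annotator(doc: Document) -> None:
    """Mark runs of two or more capital letters as acronyms."""
    for m in re.finditer(r"\b[A-Z]{2,}\b", doc.text):
        doc.annotations.append(Annotation("Acronym", m.start(), m.end(), m.group()))


def run_pipeline(text: str, annotators: List[Annotator]) -> Document:
    """Run each component in order over the same document."""
    doc = Document(text)
    for annotate in annotators:
        annotate(doc)
    return doc


def search(doc: Document, kind: str) -> List[str]:
    """Return the text of every annotation of the given kind, in discovery order."""
    return [a.text for a in doc.annotations if a.kind == kind]
```

Calling run_pipeline on a sentence with [year_annotator, acronym_annotator] and then search(doc, "Acronym") returns the acronyms found in it. Adding a new kind of analysis means writing one more function and appending it to the list, which is the "extensible" part of the pitch in miniature.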

Since IBM released UIMA as open source in early 2006, it has been widely adopted. Open source projects such as GATE, OntoText, and many others have been using UIMA as a framework for unstructured information management research. As research into managing and harnessing unstructured information grows, more solutions to these problems will become available.

Sunday, October 01, 2006

The Expanding Google Earth

If you have not used Google Earth lately, then you are missing out on one of the killer applications that merges the web with rich native applications. The 3-D Warehouse of Google Earth plugins has been growing steadily since Google released SketchUp, the 3-D modeling application. SketchUp comes in a free version and a professional version. The nice thing about this approach is that you get to use a fully functioning program that scales to the professional level if you need that capability.

"The growing world of Google Earth" provides some insight into just how Google Earth has been evolving. I use Google Earth quite frequently to find places and to explore places that I have never been. You can spend hours doing this, and it is quite entertaining.

The real power of Google Earth is the API provided by Google, which allows you to build upon Google Earth's capabilities by integrating it into your custom applications. Additionally, an SDK is provided to allow even more customization. Google Earth is becoming a platform for creating new applications or mashups.
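As a small illustration of what building on Google Earth can look like, here is a sketch that assumes the simplest integration path I know of: generating a KML file, the XML format Google Earth opens to display custom placemarks. This is an assumption on my part and not necessarily the API or SDK mentioned above. The function name placemark_kml and its parameters are made up for the example.

```python
from xml.sax.saxutils import escape


def placemark_kml(name: str, lat: float, lon: float, description: str = "") -> str:
    """Build a KML document containing a single placemark (hypothetical helper).

    KML lists coordinates as longitude,latitude,altitude, in that order.
    """
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<kml xmlns="http://www.opengis.net/kml/2.2">\n'
        "  <Placemark>\n"
        f"    <name>{escape(name)}</name>\n"
        f"    <description>{escape(description)}</description>\n"
        "    <Point>\n"
        f"      <coordinates>{lon},{lat},0</coordinates>\n"
        "    </Point>\n"
        "  </Placemark>\n"
        "</kml>\n"
    )
```

Writing the returned string to a file with a .kml extension and opening it in Google Earth should drop a pin at that location. A mashup is often little more than this: take data from somewhere else, give each item a latitude and longitude, and emit placemarks.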

The imagery database is constantly being updated, and I can see the day when 3-D models for just about everything are integrated and available. I can think of several applications for integrating Google Earth technology and designing 3-D programs for everyday use. For now, we all get to watch this killer application evolve slowly, which is OK with me.