ClearForest launches a semantic web web service web site


I just had to throw another "web" in there somewhere... I haven't seen this type of offering from anyone else yet, but ClearForest has made its text analysis and classification system "web enabled", so that others can submit documents for analysis and get the results back as XML documents.

They are even holding a "mash-up" contest to bring attention to the new service and to see how people can use its results in some creative combination with other web services, like Google Maps.

So what does this thing do, exactly? You can try it out yourself using their web interface here:

DEMO

Currently, their system only performs "named entity recognition", returning results for people, places, organizations, etc.: things that have names. They seem to hint at being able to detect facts and events too, but that isn't available yet.

Some "poetic" mash-up ideas? You could:

  • swap a collection of named entities from one document with those of another
  • strip them out and use more generic terms (she/it/them/that country etc.)
  • strip them out completely and use the resulting document as a template to populate with random nouns of your choice.
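The three ideas above can be sketched in a few lines of Python. This is only a rough illustration, not ClearForest's API or output format: it assumes you have already turned the service's results into a list of character spans, and the `Entity` tuple, the type names ("Person", "Place", "Organization"), and all the function names are made up for the example.

```python
import random
from typing import NamedTuple

class Entity(NamedTuple):
    # Hypothetical shape for one recognized entity (not ClearForest's format).
    start: int  # offset of the first character
    end: int    # offset one past the last character
    kind: str   # assumed type label, e.g. "Person" or "Place"

def replace_entities(text, entities, pick):
    """Rebuild text, replacing each entity span with pick(entity, original)."""
    out, pos = [], 0
    for ent in sorted(entities, key=lambda e: e.start):
        out.append(text[pos:ent.start])                  # untouched text before the entity
        out.append(pick(ent, text[ent.start:ent.end]))   # the replacement
        pos = ent.end
    out.append(text[pos:])                               # whatever follows the last entity
    return "".join(out)

# Generic stand-ins per entity type; anything unknown becomes "it".
GENERIC = {"Person": "someone", "Place": "that country", "Organization": "that company"}

def generalize(text, entities):
    """Idea 2: strip the names out and use more generic terms."""
    return replace_entities(text, entities, lambda e, s: GENERIC.get(e.kind, "it"))

def swap_from(text, entities, donor_names):
    """Idea 1: use names of the same type taken from another document.

    donor_names maps a type label to a list of names; types with no
    donor names are left as they were.
    """
    counters = {}
    def pick(ent, original):
        names = donor_names.get(ent.kind)
        if not names:
            return original
        i = counters.get(ent.kind, 0)
        counters[ent.kind] = i + 1
        return names[i % len(names)]  # cycle if the donor has fewer names
    return replace_entities(text, entities, pick)

def fill_template(text, entities, nouns, rng=None):
    """Idea 3: treat the document as a template and drop in random nouns."""
    rng = rng or random.Random()
    return replace_entities(text, entities, lambda e, s: rng.choice(nouns))
```

For a sentence like "Alice flew to Paris for Acme." with the three names marked as a Person, a Place and an Organization, `generalize` would give "someone flew to that country for that company."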

There are a number of freely available systems that are not "web enabled" which do this same thing and have been out for a while, like the Java-based LingPipe and GATE.

Once you know exactly where in a document these types of things are, it's quite easy to do some interesting aesthetic transformations. The GTR Language Workbench has a variety of pattern detectors for basic linguistic structures (lines, paragraphs, sentences, words), parts of speech, ontological concepts (using a plugin to WordNet), and detectors for locating words with particular sounds or phonemic sequences. The Workbench itself is very close to a release, with a few bugs still to work out and the documentation to pump up. The named entity recognition systems and other pattern detectors found in LingPipe and GATE will be added to the Workbench in a post-1.0 release.
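To give a feel for what a "pattern detector" amounts to, here is a very naive sketch in Python. It is not the Workbench's code or API: the function names are invented for the example, the sentence splitter is a crude regular expression, and the "sound" detector just matches letter sequences in the spelling rather than real phonemes. Each detector returns (start, end) character offsets, which is all a later transformation needs.

```python
import re

def find_spans(pattern, text, flags=0):
    """Return (start, end) offsets for every match of pattern in text."""
    return [(m.start(), m.end()) for m in re.finditer(pattern, text, flags)]

def lines(text):
    # Every run of characters between newlines.
    return find_spans(r"[^\n]+", text)

def sentences(text):
    # Crude: a sentence runs up to and including the next ., ! or ?
    return find_spans(r"[^.!?\s][^.!?]*[.!?]*", text)

def words(text):
    # Letters and apostrophes only; digits and punctuation are skipped.
    return find_spans(r"[A-Za-z']+", text)

def words_with(text, letters):
    """Spelling-based stand-in for a sound detector: words containing letters."""
    wanted = letters.lower()
    return [(s, e) for s, e in words(text) if wanted in text[s:e].lower()]
```

Because everything comes back as offsets into the original text, the results of different detectors can be combined or handed to the same kind of span-replacing code used for named entities.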

Ok... back to bug fixin and document writing...

