Perl for poets

Perl is a programming language that has been around since the late 1980s. In that time it has established a loyal following and a reputation as a useful tool for manipulating and analysing textual data. For those new to programming, the syntax of Perl is relatively simple. It is also a very tolerant language, in that there are usually many ways to perform a desired task. For these reasons, Perl is an ideal language for poets interested in using the computer to transform, analyse and generate text.

Below are links to information about the language, books on learning and using Perl, various distributions for download, and a long list of Perl modules you can download to add functionality to your programs. Experiment and have fun!

UPDATED October 19/2006 - Added links to some Natural Language Processing (NLP) Perl libraries

 

About Perl
Perl FAQ
Learning Perl
The Perl Cookbook

 

Perl distributions/downloads
Microsoft Windows ActiveState’s Perl distribution
Mac OS X Perl is a standard feature on OS X, but this document may be useful if you wish to upgrade your existing version.

Mac OS 9 MacPerl
Linux Perl is a standard feature on nearly all Linux distributions.

 

Perl poetry articles/programs
Ron Starr’s poem tools A collection of Perl programs for generating poetry.
Searching for rhymes with Perl A Perl Journal article (in PDF format) describing how to construct a Perl program for generating rhymes.

 

Perl Natural Language Processing (NLP) Libraries
Dan Melamed's NLP Research Software Library 160+ Perl scripts for comparing, calculating, extracting, tokenizing and sorting words.
CLAIR project "The Clair library is intended to simplify a number of generic tasks in Natural Language Processing (NLP) and Information Retrieval (IR). Its architecture also allows for external software to be plugged in with very little effort." It is written in Perl and has a nice variety of features for web search, indexing, extraction, summarization, statistics and clustering. It looks to be quite easy to use.
Computational Linguistics Toolset A set of tools for doing Permutation Statistics on corpora, and for other computational linguistics tasks (like corpus cleaning, examination, and sensing using WordNet).

 

Perl Modules Description
Lingua::Wordnet Perl extension for accessing and manipulating WordNet databases. WordNet is a lexical database which can act as a combined dictionary and thesaurus. It records a rich set of lexical relationships between words. A very useful resource.
Lingua::LinkParser Perl module implementing the Link Grammar Parser by Sleator, Temperley and Lafferty at CMU.
Lingua::EN::Gender Inflect pronouns for gender.
Lingua::EN::Fathom Readability and general measurements of English text (number of words, sentences, text lines, etc.).
Lingua::EN::Infinitive Determine the infinitive form of a conjugated word
Lingua::EN::Keywords Automatically extracts keywords from text
Lingua::EN::Nickname Genealogical nickname matching (Liz=Beth)
Lingua::EN::Sentence Module for splitting text into sentences.
Lingua::EN::Summarize A simple tool for summarizing bodies of English text.
Lingua::EN::Syllable Routine for estimating syllable count in words
Lingua::Rhyme MySQL-based rhyme-lookups.
Lingua::Stem Stemming of words. Stemming is a process of word shortening which is useful when creating search indexes for the words in a text.
Lingua::Translate Translate text from one language to another (via Altavista).
Class::MakeMethods::Template::TextBuilder Basic text substitutions (great for text generation!)
DBIx::FullTextSearch Indexing documents with MySQL as storage (make your own search engine!)
HTML::FromText Create an HTML file from a plain text file.
HTML::FormatText Create a plain text file from an HTML file.
Text::Aligner Justify strings to various alignment styles.
Text::Affixes Prefix and suffix analysis of text.
Text::Brew An implementation of the Brew edit distance. This can be used to measure how "similar" two strings of text are.
Text::Autoformat Automatic and manual text wrapping and reformatting.
Text::Banner Create text resembling the Unix ‘banner’ command (great for birthday parties!)
Text::Bastardize A corruptor of innocent text (great for corporate reports!?)
Text::Contraction Find possible expansions for a contraction.
Text::DeDuper Detect similar (near-duplicate) documents based on their text.
Text::DeSupercite Remove supercite quotes and other non-standard quoting from text
Text::Document A text document subject to statistical analysis (has document similarity functions!)
Text::DoubleMetaphone Phonetic encoding of words. See also Text::Metaphone and Text::Soundex.
Text::ExtractWords Extracts words from text.
Text::Forge A templating system for creating dynamic web pages.
Text::Greeking Generates meaningless text that creates the illusion of the finished document.
Text::Lorem Generate random Latin looking text.
Text::Metaphone A modern soundex. Phonetic encoding of words. See also Text::Soundex and Text::DoubleMetaphone.
Text::Ngrams Flexible N-gram analysis (for characters, words, and more). N-grams are useful for generating text which mimics the style of another text.
Text::Echelon Create random Echelon related words. Use this to generate words suitable for achieving a status as a "person of heightened interest" from any number of national security agencies.
Text::Flowchart ASCII flowchart maker (very cool…)
Text::GenderFromName Guess the gender of a "Christian" first name.
Text::Graphics A text graphics rendering toolkit (ASCII graphics!)
Text::ParseWords Parse text into an array of tokens or array of arrays.
Text::Pluralize Simple pluralization routine.
Text::English Porter’s stemming algorithm. See also Lingua::Stem.
Text::Munge::Vowels Strips vowels and spaces from words and phrases to shorten simple text messages. This one may be hard to find.
Text::Soundex Implementation of the Soundex algorithm as described by Knuth. Useful for finding similar-sounding words. See also Text::DoubleMetaphone and Text::Metaphone.
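
The Text::Ngrams entry above mentions that N-grams are useful for generating text in the style of another text. Here is a minimal sketch of that idea, written in plain Perl with no modules. It does not use Text::Ngrams or its interface; the sub names build_bigram_model and generate_line and the sample text are made up for illustration. It records which word follows which in a source text (word bigrams), then strings together a new line by repeatedly picking a random follower.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build a table mapping each word to the list of words seen right after it.
# (Hypothetical helper name, for illustration only.)
sub build_bigram_model {
    my ($text) = @_;
    my @words = split ' ', $text;
    my %next_words;
    for my $i (0 .. $#words - 1) {
        push @{ $next_words{ $words[$i] } }, $words[$i + 1];
    }
    return \%next_words;
}

# Start from a seed word and keep picking a random follower until the
# line is long enough or the current word has no recorded follower.
sub generate_line {
    my ($model, $seed, $max_words) = @_;
    my @line    = ($seed);
    my $current = $seed;
    while (@line < $max_words) {
        my $followers = $model->{$current} or last;    # dead end
        $current = $followers->[ int rand @$followers ];
        push @line, $current;
    }
    return join ' ', @line;
}

# Sample source text (an assumption; substitute any text you like).
my $source = 'the rose is red the violet is blue the sky is wide';
my $model  = build_bigram_model($source);
print generate_line($model, 'the', 8), "\n";
```

The output changes from run to run because followers are chosen at random. Feeding in a longer source text, or keying the table on pairs of words instead of single words, should give results that stay closer to the style of the original.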