CRM114 - the Controllable Regex Mutilator

January 28th, 2007 by andreas

Last week I spent quite a lot of time on CRM114, which according to its website is “a system to examine incoming e-mail, system log streams, data files or other data streams, and to sort, filter, or alter the incoming files or data streams according to the user’s wildest desires. Criteria for categorization of data can be via a host of methods, including regexes, approximate regexes, a Hidden Markov Model, Orthogonal Sparse Bigrams, WINNOW, Correllation, KNN/Hyperspace, or Bit Entropy ( or by other means- it’s all programmable).”

CRM114’s programming language is unlike anything I’ve seen before: it has a declensional syntax instead of the more common positional syntax, unusual keywords, and no types (everything is a string). Once you get the hang of it, though, the author promises that you’ll be able to “write the filter of your dreams”.

A couple of example programs
First, a ROT13 implementation.

#!/usr/bin/crm
translate /a-zA-Z/ /n-za-mN-ZA-M/
accept

This code takes input from stdin and gives output to stdout.
Example

$ echo "CRM is so great."|./rot13.crm
PEZ vf fb terng.
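
For readers who don't have CRM114 installed, the same character mapping can be sketched in Python. This is only an illustration of what the two character ranges in the translate statement express, not a description of how CRM114 implements it; the names `ROT13_TABLE` and `rot13` are my own.

```python
import string

# Build the same mapping the two character ranges describe:
# a-z -> n-za-m and A-Z -> N-ZA-M (hypothetical helper names).
_lower = string.ascii_lowercase
_upper = string.ascii_uppercase
ROT13_TABLE = str.maketrans(
    _lower + _upper,
    _lower[13:] + _lower[:13] + _upper[13:] + _upper[:13],
)


def rot13(text):
    """Rotate every ASCII letter by 13 places; leave everything else alone."""
    return text.translate(ROT13_TABLE)


print(rot13("CRM is so great."))  # prints: PEZ vf fb terng.
```

Because ROT13 is its own inverse, applying `rot13` twice returns the original text.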

A Reverse Polish Notation calculator

#!/usr/bin/crm
{
eval (:_dw:) / :@:R:*:_dw: : /
output /:*:_dw:\n/
}

Example usage

$ echo "2 5 2 * + 5 +"|./rpn.crm
17
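
To make it clearer what the one-liner above is asked to compute, here is a minimal stack-based RPN evaluator in Python. It is a sketch under my own assumptions (whitespace-separated tokens, the four basic operators, floating-point arithmetic) and says nothing about how CRM114 evaluates the expression internally; `eval_rpn` and `OPERATORS` are hypothetical names.

```python
import operator

# Supported binary operators (an assumption; CRM114 may support more).
OPERATORS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def eval_rpn(expression):
    """Evaluate a whitespace-separated Reverse Polish Notation expression."""
    stack = []
    for token in expression.split():
        if token in OPERATORS:
            if len(stack) < 2:
                raise ValueError("operator %r needs two operands" % token)
            right = stack.pop()  # the most recent value is the right operand
            left = stack.pop()
            stack.append(OPERATORS[token](left, right))
        else:
            stack.append(float(token))
    if len(stack) != 1:
        raise ValueError("malformed expression")
    return stack[0]


print("%g" % eval_rpn("2 5 2 * + 5 +"))  # prints: 17
```

Walking through the example: 5 and 2 are multiplied to give 10, which is added to 2 to give 12, and the final 5 is added to reach 17.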

But the real strength of CRM114 lies in its ability to learn and classify text and other streams of data. At the time of writing, CRM114 has seven different classifiers, each with different advantages. Some can make N-way choices, while others only make simple yes/no choices.
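
To give a feel for what “learn and classify” with an N-way choice means, here is a toy word-count classifier in Python. It uses plain naive Bayes with add-one smoothing, which is a stand-in I picked for brevity and is not one of the CRM114 classifiers listed above; the class name, the labels and the training texts are all made up for illustration.

```python
import math
from collections import Counter, defaultdict


class TinyClassifier:
    """Toy naive Bayes text classifier (illustrative, not CRM114's method)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.doc_counts = Counter()              # label -> number of texts seen

    def learn(self, label, text):
        self.doc_counts[label] += 1
        self.word_counts[label].update(text.lower().split())

    def classify(self, text):
        vocabulary = set()
        for counts in self.word_counts.values():
            vocabulary.update(counts)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, seen in self.doc_counts.items():
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            # Log prior plus add-one smoothed log likelihood of each word.
            score = math.log(seen / total_docs)
            for word in text.lower().split():
                score += math.log(
                    (counts[word] + 1) / (total_words + len(vocabulary))
                )
            if score > best_score:
                best_label, best_score = label, score
        return best_label


clf = TinyClassifier()
clf.learn("spam", "buy cheap pills now")
clf.learn("spam", "cheap pills cheap offer")
clf.learn("work", "meeting notes for the thesis")
clf.learn("work", "thesis draft and meeting schedule")
clf.learn("syslog", "kernel error on disk sda")

print(clf.classify("cheap pills"))  # prints: spam
print(clf.classify("disk error"))   # prints: syslog
```

With three labels this is an N-way choice; training it on only two labels reduces it to a yes/no decision.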

Posted in Software, MSc

How to get research done

January 16th, 2007 by andreas

Last week I attended a short presentation given by Mark Burgess entitled “How to get research done”. While I’ve heard most of the advice before, it still seems valuable to repeat it from time to time. So without further introduction, here are some key elements:

Just do it!

  • It’s better to try and fail; at least you will learn something
  • Don’t wait until you think you are sure

Time management

  • Plan for the next day while your head is still focused on your work
  • Stop when you believe you know how to proceed the following day
  • Avoid interrupting progress by taking unnecessary breaks

How to make progress

  • Dream of your long term ambitions but focus on immediate goals
  • Always have a TODO list
  • Set aside whole days for research and postpone trivia
  • Give priority to things that move you forward

Interruption management

  • Protect yourself from interruptions or you will exhaust yourself doing nothing
  • Consider shutting down applications and devices that provide email, SMS, IM and music

Read, read, read

  • Read about anything, not just things you think are relevant
  • Your brain needs all kinds of food, not just specific nutrients — don’t be a vitamin freak
  • You don’t have to understand everything to get something out of it

Write, write, write

  • Always have a notebook within reach
  • Write down thoughts and ideas; they are never as clear as you think while they exist only in your head

Optimise work and rest

  • Make sure you sleep and eat properly
  • Exercise

Posted in MSc

First post on andreasblaafladt.com

January 4th, 2007 by andreas

This is my first post on this new, publicly available blog.

Posted in Uncategorized