textpow

Read {TextMate}[http://macromates.com/] syntax files and parse text with them.

== INSTALL:

gem install textpow

=== Ruby 1.8

Install oniguruma

Ubuntu

sudo apt-get -y install libonig-dev

OSX via brew or port

brew install oniguruma
port install oniguruma5

Then

gem install oniguruma

== Demo

  • {Ultraviolet}[http://grosser.github.com/ultraviolet/] uses textpow to generate HTML syntax highlighting
  • {Ruco (Ruby commandline editor)}[https://github.com/grosser/ruco] uses textpow to colorize the console

== USAGE:

  1. Load a syntax file ({list of included syntaxes}[https://github.com/grosser/textpow/tree/master/lib/textpow/syntax]):

require 'textpow'
syntax = Textpow.syntax('ruby') # or 'source.ruby' or 'lib/textpow/syntax/source.ruby.syntax'

  2. Initialize a processor:

processor = Textpow::DebugProcessor.new

  3. Parse some text:

syntax.parse(text, processor)

=== Syntax-files

Syntax files are written in the {TextMate grammar}[http://macromates.com/textmate/manual/language_grammars#language_grammars] format. A grammar can be stored as text, binary, XML or YAML, and all of these can be translated into each other. Textpow understands the YAML form (ending in .syntax) and, via the plist library, the XML form (ending in .tmSyntax, .plist or .tmLanguage), but not the text or binary form. All syntax files shipped with textpow are in the YAML form, since it is the easiest to read and the least verbose.
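
The extension rules above can be sketched as a small lookup. This is only an illustration: syntax_file_format and the two constants are hypothetical names, not part of textpow, and the extension is just a hint (a .plist file may still be in the unsupported text form).

```ruby
# Hypothetical helper (not part of textpow): guesses the format of a
# syntax file from its extension, following the description above.
YAML_EXTENSIONS = %w[.syntax].freeze
XML_EXTENSIONS  = %w[.tmsyntax .plist .tmlanguage].freeze

def syntax_file_format(path)
  # Compare case-insensitively, since .tmSyntax and .tmLanguage are mixed case.
  ext = File.extname(path).downcase
  return :yaml if YAML_EXTENSIONS.include?(ext)
  return :xml  if XML_EXTENSIONS.include?(ext)
  :unknown
end

syntax_file_format('lib/textpow/syntax/source.ruby.syntax') # => :yaml
syntax_file_format('Ruby.tmLanguage')                       # => :xml
```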

Adding a new syntax:

  • add download location and scopeName to update_syntax task in Rakefile
  • rake update_syntax
  • run the tests with rake to see if the syntax is parseable

=== Processors

A processor is a hook that is called while parsing a text. It must implement these methods:

class MyProcessor
  # called when a tag (scope) starts or ends at the given position
  def open_tag(tag_name, position_in_current_line); end
  def close_tag(tag_name, position_in_current_line); end

  # called before processing a line
  def new_line(line_content); end

  # called before parsing anything
  def start_parsing(scope_name); end

  # called after parsing everything
  def end_parsing(scope_name); end
end
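
As a rough illustration of this interface, the following sketch records every callback as an event so the call sequence can be inspected afterwards. RecordingProcessor and its events array are names made up for this example; only the five method names and their arguments come from the interface above.

```ruby
# Hypothetical processor (not part of textpow): stores each callback
# it receives so the sequence can be inspected after parsing.
class RecordingProcessor
  attr_reader :events

  def initialize
    @events = []
  end

  def start_parsing(scope_name)
    @events << [:start_parsing, scope_name]
  end

  def end_parsing(scope_name)
    @events << [:end_parsing, scope_name]
  end

  def new_line(line_content)
    @events << [:new_line, line_content]
  end

  def open_tag(tag_name, position_in_current_line)
    @events << [:open_tag, tag_name, position_in_current_line]
  end

  def close_tag(tag_name, position_in_current_line)
    @events << [:close_tag, tag_name, position_in_current_line]
  end
end
```

An instance would be passed as the second argument to syntax.parse(text, processor); afterwards its events array holds the calls in the order they were made.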

Textpow ensures that the methods are called in parsing order. If there are two subsequent calls to open_tag, the first having name="text.string", position=10 and the second having name="markup.string", position=10, it means that the "markup.string" tag is inside the "text.string" tag.
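
Because calls arrive in parsing order, nesting can be recovered with a simple stack. The sketch below is a minimal illustration, assuming tags are closed innermost-first; ScopeStackProcessor, paths and the scope name 'text.example' are hypothetical and not part of textpow. The callbacks are invoked by hand here to replay the example from the paragraph above.

```ruby
# Hypothetical processor (not part of textpow): keeps a stack of open
# tags and records the full nesting path each time a tag is opened.
class ScopeStackProcessor
  attr_reader :paths

  def initialize
    @stack = []
    @paths = []
  end

  # Reset state so the processor can be reused for another parse.
  def start_parsing(scope_name)
    @stack.clear
    @paths.clear
  end

  def end_parsing(scope_name); end

  # Tags may stay open across lines, so the stack is not reset here.
  def new_line(line_content); end

  def open_tag(tag_name, position_in_current_line)
    @stack.push(tag_name)
    @paths << [position_in_current_line, @stack.dup]
  end

  # Assumes the closing tag is the innermost open one (top of the stack).
  def close_tag(tag_name, position_in_current_line)
    @stack.pop if @stack.last == tag_name
  end
end

processor = ScopeStackProcessor.new
processor.start_parsing('text.example') # made-up scope name
processor.open_tag('text.string', 10)
processor.open_tag('markup.string', 10)
processor.paths.last # => [10, ["text.string", "markup.string"]]
```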

== TODO

  • smart + mel + applescript have some invalid UTF-8 characters that are currently ignored (search for INVALID_UTF8)
  • fix markdown and php syntax
  • parse text plist files {example}[https://raw.github.com/kangax/textmate-js-language-syntax-file/master/JavaScript.plist]
  • update more languages via github urls

== LICENSE: MIT

  • Copyright (c) 2011 {Michael Grosser}[http://github.com/grosser]
  • Copyright (c) 2010 {Chris Hoffman}[http://github.com/cehoffman]
  • Copyright (c) 2009 {Spox}[http://github.com/spox]
  • Copyright (c) 2007-2008 Dizan Vasquez

{Build status}[http://travis-ci.org/grosser/textpow]