Siaris
Simple Things

Looking at darcs

Andrew L Johnson

I like simple, so when I heard that darcs is a simple (but rich) distributed version control system, I had to take a look.

darcs is a relatively new entry on the version control landscape: officially announced in April of 2003, it passed the 1.0.0 milestone in November of 2004 and now sits at 1.0.1. It is written in Haskell, and you’ll need the Glasgow Haskell Compiler (GHC) to build it from source, but there are binaries available for many platforms. The Linux static binary worked out of the box.

What makes darcs different is that you don’t check out working directories, but rather fully functional repositories — essentially branches — without any of the fuss or administrative overhead of setting up local branches / repositories that some other version control systems require.

So, if I do

  darcs get http://www.abridgegame.org/repos/darcs

I wind up with a full repository of the darcs sources, in which I can make changes, record, unrecord, revert, pull down new patches, or create patches to apply back to the "main" repository (using darcs push if I had write access to that repository, or darcs send to send them via email, where they might be applied either manually or automatically). Of course, I could just as easily share my patches directly with another developer if we were both working on some fix or feature.

(Note: darcs get --partial … is recommended if you don’t really need the entire patch history of the project; it fetches only the patches since the last ‘tagged’ version and builds your repository from that. Currently, for darcs itself, that’s the difference between getting some 2500+ patches and getting and applying a few hundred.)

This tutorial illustrates how easy using darcs is, and the manual itself is a well written guide and reference (which also discusses the underlying Theory of patches of darcs). So I won’t duplicate that material here.

I will mention one experience I had with darcs itself. I decided to build darcs from source rather than rely on a binary, so I downloaded a binary install of GHC, used my binary of darcs to get the darcs repository, and built my own version. Unfortunately, the GHC I installed was broken on my system, and any system() calls failed (and darcs uses system() calls in a few places, such as bringing up your editor to write long log messages, or running automated test commands). My first attempt to build my own GHC from source also failed.

So I dove into the darcs sources (without ever having looked at Haskell code before) and was able to come up with a workaround patch that replaced any system() calls in darcs with an execvp based alternative — this solved my immediate problem, and everything worked fine.

Now I wouldn’t expect a patch for my isolated case to be incorporated into darcs, especially since my real problem was the GHC compiler and not darcs. But then, I do have my own fully functioning branch right here which I can still keep up to date by pulling down new patches from the main repository (resolving any conflicts if they arise). In fact, a couple of new patches were added to the repository, and I just did:

  darcs pull

within my own repository and it interactively asked me about applying each patch in turn (I could have had it not ask), applied them, and I recompiled — no problem, no headache.

That was quite nice to know. Fortunately, I did manage to get a working version of GHC compiled on my machine (with a little baby-sitting during the compile), thereby solving my real problem and allowing me to unpull my specialized patch from my repository. Cool.

In reworking the code section of this site, I decided to make the few small Ruby and Perl projects currently there available as darcs repositories (along with tarballs).

A couple of other darcs features:

  • atomic commits
  • file and directory moves handled
  • token-replace patches
  • symmetric repositories (full darcs power in any copy)

But the number one feature I like is that it is damn easy to use without sacrificing power.

However, it’s not all roses and there are a couple of caveats:

  • a repository costs two source trees plus a bit more in disk space (at the present time; hard-linking is done where possible between copies on local filesystems, and besides, disk space is cheap)
  • Large source trees with many changes can be slow to fetch, and require lots of memory (but memory is cheap too).
  • Really complex merges may take a long time (exponential algorithm).

Work is being done to address the above issues, and in the meantime, darcs is certainly capable of handling small to medium projects (say, perhaps, into the tens of thousands of LOS[1] range). Darcs is self-hosting and runs some 28,000+ LOS.

__END__

  [1] LOS = Lines Of Stuff
Asynchronous Sequential Messages

Andrew L. Johnson

The Io language has asynchronous messages that return transparent futures — asynchronous messages are put in a per-object queue and processed sequentially by a lightweight thread (coroutine). The return value is a transparent future which _turns into_ the actual result when it becomes available (using the future blocks if the result isn’t ready). I thought it might be interesting to POC the idea in Ruby code:

  require 'thread'
  class Async
    @@keep =  %w-__id__ __send__-
    (instance_methods - @@keep).each{|m| undef_method m}

    def initialize(&blk)
      @th = Thread.new(&blk)
      @th.abort_on_exception = true
      at_exit {@th.join}
    end

    def method_missing(sym, *args, &blk)
      __getobj__.__send__(sym, *args, &blk)
    end

    def __getobj__
      @obj ||= @th.value
    end

    # control/status messages
    def arun
      @th.run
      self
    end

    def await
      @th.join
    end

    def aready?
      ! @th.alive?
    end

    def aresult_to(obj,meth)
      Async.new {obj.send(meth,__getobj__)}
    end
  end

  class Object
    def async(msg,*args,&blk)
      @async_queue ||= Queue.new
      fut = Async.new {Thread.stop;self.send(msg,*args,&blk)}
      @async_queue << fut
      @async_thread ||= Thread.new do
        loop{@async_queue.pop.arun.await; Thread.pass}
      end
      fut
    end
  end
  __END__

Each object gets its own queue for asynchronous messages, which are handled in turn (FIFO). Thread control is explicitly passed upon completion of an asynchronous message (control can also be passed within an asynchronous message). The return value is a proxy-future that will block when used (aside from a few control messages) until the result is ready and then proxy that result. This version also allows for the result of an asynchronous message to be automatically passed to another object via the aresult_to method.
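
The queue-per-object idea can also be sketched without the transparent proxy. The following is a minimal sketch, not the code above reworked: `Future` and `Mailbox` are hypothetical names introduced for illustration, the future is explicit (you call `value` on it rather than using it as the result), and it assumes a reasonably current Ruby where `Queue`, `Mutex` and `ConditionVariable` are available.

```ruby
require 'thread'

# Hypothetical explicit future: blocks in #value until fulfilled.
class Future
  def initialize
    @mutex = Mutex.new
    @cond  = ConditionVariable.new
    @done  = false
  end

  # Called by the worker thread with either a result or an exception.
  def fulfill(value, error = nil)
    @mutex.synchronize do
      @value, @error, @done = value, error, true
      @cond.broadcast
    end
  end

  def ready?
    @mutex.synchronize { @done }
  end

  # Blocks until the result is available, then returns it
  # (or re-raises the exception the message produced).
  def value
    @mutex.synchronize { @cond.wait(@mutex) until @done }
    raise @error if @error
    @value
  end
end

# Hypothetical wrapper: one FIFO queue and one worker thread per target.
class Mailbox
  def initialize(target)
    @target = target
    @queue  = Queue.new
    @worker = Thread.new do
      loop do
        future, msg, args, blk = @queue.pop   # blocks while the queue is empty
        begin
          future.fulfill(@target.public_send(msg, *args, &blk))
        rescue StandardError => e
          future.fulfill(nil, e)
        end
      end
    end
  end

  # Queue a message and return its future immediately.
  def async(msg, *args, &blk)
    future = Future.new
    @queue << [future, msg, args, blk]
    future
  end
end

box     = Mailbox.new([3, 1, 2])
sorted  = box.async(:sort)
doubled = box.async(:map) { |x| x * 2 }
p sorted.value    #=> [1, 2, 3]
p doubled.value   #=> [6, 2, 4]
```

Because a single worker drains the queue, messages to one target are processed strictly in the order they were sent, which is the sequential property described above. Exceptions are stored in the future and surface only when `value` is called.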

There is probably more wrong than right with this proof-of-concept (deadlock, exceptions, garbage collection, etc.). Still, it was a cute little exercise — and I should mention that I freely borrowed ideas from Jim Weirich’s BlankSlate and kjana’s UDelegator (with its own versions of futures and promises).

__END__

PLEAC -- Ruby version still growing (slowly)

Andrew L. Johnson

Progress is slowly creeping forward on the Ruby version of PLEAC (Programming Languages Alike Cookbook), which is an attempt to represent the recipes in the Perl Cookbook in a variety of languages. I’ve been aware of the project for a couple of years, but only got around to joining the mailing list a few weeks ago.

I’ve submitted some 20+ Ruby sections in the last couple of weeks, trying to top off a few early chapters to the 100% mark. Currently, chapters 1, 2, 3, 4, 9, and 10 are done. Chapter 5 is missing an ordered hash recipe (should be coming soon). Chapter 6 (patterns) is just over 81% now, and chapter 7 should be close to 60% once a few recent submissions are accepted and committed. Chapter 8 sits at 33% at the moment. The first half (chapters 1-10) of the Ruby version could be brought to 100% in a relatively short time (though a few longish programs are still needed).

There are also significant inroads in many of the chapters in the second half (chapters 11-20). It might be a nice boon for Ruby to become the first language to achieve a 100% complete PLEAC (at the moment Ruby sits at 58.71% of completion, behind only Python’s 59.14%) hint hint :-)

__END__

Use the DATA handle

Andrew L. Johnson

One thing I recommend to newcomers to both Perl and Ruby is to make use of the DATA file(handle/object) for explorative purposes. It is quite useful both in the contexts of language exploration and solution-space exploration.

The basic idea is that within your Perl or Ruby script, any text after the special __END__ token is available to be read into the program via a preopened filehandle/object named DATA (in both languages). In Perl, the token may also be called __DATA__.

Here’s a simple example (Perl and Ruby side by side):

  #!/usr/bin/perl -w                        #!/usr/bin/ruby -w
  use strict;
  while (my $line = <DATA>) {               while line = DATA.gets
    my @arr = split /:/, $line;                arr = line.split(/:/)
    print join '|', @arr;                      print arr.join('|')
  }                                         end
  __END__                                   __END__
  abc:def:ghi                               abc:def:ghi
  123:456:789                               123:456:789

Note: the __END__ token must be flush with the left margin in Ruby code.

There are many ways to put this to use, but for my exploratory purposes it is only half of the equation. The other half is your text editor (or perhaps IDE). Many text editors can be configured to run the text (or some selected portion thereof) of the current buffer through an external program (usually as a filter). If you are writing Perl or Ruby code, that external ‘filter’ can be the Perl or Ruby interpreter, and you can arrange for the output to be displayed in another window (pane, buffer, whatever).

For example, I have the following in my .vimrc file:

  noremap <F9> :w !perl -w > ~/.vim/p_buff 2>&1 <NL> :sv ~/.vim/p_buff<CR>
  noremap <F10> :w !ruby -w > ~/.vim/r_buff 2>&1 <NL> :sv ~/.vim/r_buff<CR>

Now hitting the F9 or F10 key sends the buffer text (or the selected lines, when used from visual mode) to the interpreter, captures the output into a special file, and opens that file in a new buffer window. Both Ruby and Perl can receive scripts via STDIN, and both leave everything following the __END__ token to be read via the DATA handle.

This means I can have a buffer window open to play around with a new language element or feature, or to explore possible solutions to a problem. And if that problem requires some data reading/munging/parsing, I can paste in some representative lines of data and have one-key convenience for trying out various snippets of code.

__END__

Ruby: Enumerators and Generators

Andrew L Johnson

Included with the Ruby distribution are the generator library and the enumerator extension — both useful tools when ordinary iteration doesn’t quite measure up.

The enumerator extension is simple in concept: create a new Enumerable object given an object and a method of that object to be used as an iterator. For example, if we add an each_even iterator to the Array class to iterate over every element with an even numbered index, we can use enumerator to create enumerable versions of an array object that use each_even as the iterator:

  require 'enumerator'

  class Array
    def each_even
      self.each_with_index do|el,i|
        yield el if i % 2 == 0
      end
    end
  end

  arr = ['a','b','c','d','e','f','g','h']
  enum = Enumerable::Enumerator.new(arr, :each_even)
  ev = enum.map {|x| x + x}
  p ev                      #=> ["aa", "cc", "ee", "gg"]

In addition to the constructor above, the following convenience functions are added to the Object class:

  to_enum(:iter, *args)
  enum_for(:iter, *args)
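
On Ruby 1.9 and later, Enumerator lives in core (as `Enumerator` rather than `Enumerable::Enumerator`), so no require is needed there. Here is a minimal sketch of the two convenience forms, including the extra arguments; the `EvenSlots` class and its `each_nth` method are hypothetical, made up for illustration so that Array does not need to be reopened.

```ruby
# Hypothetical wrapper class, used here only for illustration.
class EvenSlots
  def initialize(items)
    @items = items
  end

  # Yields every element whose index is a multiple of +step+.
  # Without a block it returns an enumerator over itself.
  def each_nth(step = 2)
    return to_enum(:each_nth, step) unless block_given?
    @items.each_with_index { |el, i| yield el if (i % step).zero? }
  end
end

slots = EvenSlots.new(%w[a b c d e f g h])

# to_enum with just the iterator name (step defaults to 2)
doubled = slots.to_enum(:each_nth).map { |x| x + x }
p doubled                      #=> ["aa", "cc", "ee", "gg"]

# enum_for is the same thing; extra arguments are passed to the iterator
thirds = slots.enum_for(:each_nth, 3).to_a
p thirds                       #=> ["a", "d", "g"]
```

The `return to_enum(...) unless block_given?` line is a common idiom: it lets the iterator itself hand back an enumerator when called without a block.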

The Enumerable module is also extended with five additional methods:

  each_slice(n)    # iterates over non-overlapping chunks of size n
  enum_slice(n)    # new enumerator object using :each_slice(n)

    ('a'..'m').each_slice(4) {|sl| p sl}
    #  produces:
      ["a", "b", "c", "d"]
      ["e", "f", "g", "h"]
      ["i", "j", "k", "l"]
      ["m"]

  each_cons(n)     # iterates over successive chunks of size n
  enum_cons(n)     # new enumerator using :each_cons(n)

    ('a'..'m').each_cons(4) {|sl| p sl}
    # produces:
      ["a", "b", "c", "d"]
      ["b", "c", "d", "e"]
      ["c", "d", "e", "f"]
      ["d", "e", "f", "g"]
      ["e", "f", "g", "h"]
      ["f", "g", "h", "i"]
      ["g", "h", "i", "j"]
      ["h", "i", "j", "k"]
      ["i", "j", "k", "l"]
      ["j", "k", "l", "m"]

  enum_with_index  # new enumerator using :each_with_index
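
On newer Rubies (1.9 and later) the block-taking iterators return an enumerator when called without a block, and as far as I know the separate enum_* names are no longer provided there. A short sketch of that style, assuming such a Ruby:

```ruby
letters = ('a'..'g')

# each_slice without a block: an enumerator over the chunks
chunks = letters.each_slice(3).to_a
p chunks     #=> [["a", "b", "c"], ["d", "e", "f"], ["g"]]

# each_cons without a block, chained straight into map
windows = letters.each_cons(3).map(&:join)
p windows    #=> ["abc", "bcd", "cde", "def", "efg"]

# each_with_index without a block, then filter on the index
evens = letters.each_with_index.select { |_, i| i.even? }.map(&:first)
p evens      #=> ["a", "c", "e", "g"]
```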

The generator library generates external iterators from either blocks or Enumerable objects (in the latter case, the :each iterator is externalized).

  require 'generator'

  arr = ('a' .. 'm')
  gen = Generator.new(arr)
  while gen.next?
    p gen.next
  end
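
For comparison, here is a sketch of the same two styles of external iteration using the core `Enumerator` class of Ruby 1.9 and later, instead of the generator library. This swaps in a different class: `Enumerator` has no `next?`, so the end of the sequence is signalled by `StopIteration`, which `loop` rescues quietly.

```ruby
# Block form: values are produced by pushing onto the yielder.
countdown = Enumerator.new do |y|
  3.downto(1) { |n| y << n }
end

first  = countdown.next   # consumes the first value
peeked = countdown.peek   # looks at the next value without consuming it
second = countdown.next
p [first, peeked, second] #=> [3, 2, 2]

# Enumerable form: calling each without a block externalizes the iterator.
letters = ('a'..'c').each
seen = []
loop { seen << letters.next }   # loop stops cleanly on StopIteration
p seen                          #=> ["a", "b", "c"]
```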

This makes iterating over multiple objects relatively easy. However, the generator library also provides the SyncEnumerator class which makes multiple iteration a breeze:

  require 'generator'

  a = (4..5)
  b = ['a',nil,'c']
  c = ['x','y','x']

  enum = SyncEnumerator.new(a, b, c)
  enum.each {|row| p row}

  puts '---'
  table = [ [1,2,3], [4,5,6], [7,8,9] ]
  cols = SyncEnumerator.new(*table)
  cols.each {|col| p col}

  # produces:

    [4, "a", "x"]
    [5, nil, "y"]
    [nil, "c", "x"]
    ---
    [1, 4, 7]
    [2, 5, 8]
    [3, 6, 9]
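
Where the generator library is not available, the same row-by-row behaviour can be approximated with external enumerators. The following is a minimal sketch, assuming Ruby 1.9 or later; `sync_rows` is a hypothetical helper name, not part of any library. Like the output above, it pads exhausted collections with nil until the longest one runs out.

```ruby
# Hypothetical helper: walks several collections in step, returning rows.
# Collections that run out early contribute nil to the remaining rows.
def sync_rows(*collections)
  enums = collections.map(&:each)   # each without a block gives an Enumerator
  rows  = []
  loop do
    all_done = true
    row = enums.map do |e|
      begin
        value = e.next
        all_done = false
        value
      rescue StopIteration
        nil                         # this collection is exhausted
      end
    end
    break if all_done
    rows << row
  end
  rows
end

sync_rows(4..5, ['a', nil, 'c'], ['x', 'y', 'x']).each { |row| p row }
# prints:
#   [4, "a", "x"]
#   [5, nil, "y"]
#   [nil, "c", "x"]

table = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
p sync_rows(*table)   #=> [[1, 4, 7], [2, 5, 8], [3, 6, 9]]
```

When all the collections are arrays of equal length, `table.transpose` or `first.zip(*rest)` gives the same rows with less ceremony; the explicit loop only earns its keep when the lengths differ, since `zip` stops at the length of its receiver.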

All of which just goes to show: There’s more than one way to iterate an enumerable.

__END__