YAML to Ruby hash to CoffeeScript object

In one of my projects I found myself in the odd position of having data in a YAML file that needed some processing and then had to be inserted into a CoffeeScript file.

Now I could’ve just done a YAML to JSON conversion. But I already had the intermediate Ruby processing steps, and I really wanted the output to be CoffeeScript: I was likely to have to work with it and manipulate it further later anyway, and I would want to make changes to the CS directly instead of parsing the whole beast again.

So after loading the YAML into a Ruby hash and manipulating it appropriately I needed to do the conversion to CoffeeScript.

This is my solution:

      module HashToCS
        # Default output handler: print each chunk to stdout.
        # A constant is used because a local variable would not be
        # visible inside the method definition below.
        DEFAULT_PROC = Proc.new do |output|
          print output
        end

        #input is a Ruby hash (or any value nested inside one)
        #spaces is a prefix string of spaces used for whitespace significance
        #proc acts on the output
        def self.convert(input, spaces, proc = DEFAULT_PROC)
          if input.is_a? String
            proc.call spaces + '"' + input + '"' + "\n"
          elsif input.is_a? Array
            proc.call spaces + "[\n"
            input.each do |a|
              convert(a, spaces + "  ", proc)
            end
            proc.call spaces + "]\n"
          elsif input.is_a? Hash
            proc.call spaces + "{\n"
            input.each do |k, v|
              proc.call spaces + "  #{k}:\n"
              convert(v, spaces + "    ", proc)
            end
            proc.call spaces + "}\n"
          else
            proc.call spaces + input.to_s + "\n"
          end
        end
      end

Usage:

    proc = Proc.new do |output|
      coffee_script_file.puts output
    end

    HashToCS.convert(ruby_hash, "", proc)
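
To show the whole round trip in one place, here is a rough sketch that goes from YAML to a Ruby hash to CoffeeScript-style text. The module name `HashToCSSketch`, the sample YAML and the `StringIO` buffer are all made up for the example; the converter is just a condensed stand-in for the one above so that the block runs on its own.

```ruby
require "yaml"
require "stringio"

# Condensed stand-in for the converter above (hypothetical name).
module HashToCSSketch
  def self.convert(input, spaces, out)
    case input
    when String
      out.call "#{spaces}\"#{input}\"\n"
    when Array
      out.call "#{spaces}[\n"
      input.each { |item| convert(item, spaces + "  ", out) }
      out.call "#{spaces}]\n"
    when Hash
      out.call "#{spaces}{\n"
      input.each do |key, value|
        out.call "#{spaces}  #{key}:\n"
        convert(value, spaces + "    ", out)
      end
      out.call "#{spaces}}\n"
    else
      out.call "#{spaces}#{input}\n"
    end
  end
end

# Hypothetical sample data; any YAML document that loads to a hash will do.
yaml_source = <<~YAML
  title: Demo
  tags:
    - a
    - b
  count: 3
YAML

data = YAML.safe_load(yaml_source)
data["count"] += 1 # stands in for the intermediate Ruby processing step

# Collect the output in memory instead of writing to a file.
buffer = StringIO.new
HashToCSSketch.convert(data, "", ->(chunk) { buffer.print chunk })

puts buffer.string
```

Passing the output handler in as a proc is what makes this easy to test: the same converter can print, write to a file, or fill a buffer as it does here.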

I am using it from a rather intricate Thor script to create a data file for my CoffeeScript app to act on.
More on using Thor to manage intricate application builds in a later post.

Turning my MDSL plugin into a gem

For the past year+ I’ve been doing client work for Tom Locke’s Artisan
Technology web development consulting firm.

The app I’ve been working on allows a user to perform various standard business
intelligence queries on a dataset using a pretty cool AJAX front-end.

We write our BI queries in MultiDimensional eXpressions (MDX). We then
execute the queries against our database engine using the excellent and
open source Mondrian project.

When I was originally handed the project, all queries were built using
super general MDX snippets with several “REPLACE_ME_WITH_DATE_CLAUSE” tokens
that would get replaced or removed as needed for a particular query.

Also, once we executed a query against the database using Mondrian, the
results obviously came back as Mondrian (Java) objects with all the lovely
readability that that language is known for. Getting to the actual data in the
mondrian results turned out to be something like:

# to get the column headers
column_headers = mondrian_result.get_axes(0).get_positions.map(&:get_caption)

# to get the row headers
row_headers    = mondrian_result.get_axes(1).get_positions.map(&:get_caption)

# getting at the actual grid values will make your eyes bleed so it is omitted

Even then, you couldn’t refactor this very cleanly as not all queries returned
rows, so get_axes(1) would throw errors and gah! it was really messy!
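
One way to contain the "no rows" problem is to hide the axis lookup behind a small helper that returns an empty list when the axis is missing. This is only a sketch: `FakePosition`, `FakeAxis`, `FakeResult` and `axis_captions` are hypothetical names, the fake objects merely mimic the calls in the snippet above, and the use of `IndexError` is an assumption (the real exception coming out of Mondrian will be something else).

```ruby
# Hypothetical stand-ins for the Mondrian (Java) objects, shaped like the
# calls in the snippet above so the sketch runs in plain Ruby.
FakePosition = Struct.new(:get_caption)
FakeAxis     = Struct.new(:get_positions)

class FakeResult
  def initialize(axes)
    @axes = axes
  end

  # Assumption: asking for an axis the query did not return raises an error.
  def get_axes(index)
    @axes.fetch(index)
  end
end

# Captions for one axis, or an empty array when the query has no such axis.
def axis_captions(result, index)
  result.get_axes(index).get_positions.map(&:get_caption)
rescue IndexError
  []
end

columns      = FakeAxis.new([FakePosition.new("2009"), FakePosition.new("2010")])
columns_only = FakeResult.new([columns]) # a query that returned no rows

p axis_captions(columns_only, 0)
p axis_captions(columns_only, 1)
```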

This was not fun. I spent a couple of days trying to figure out a.)
what the hell is this MDX thing and what’s it do? and b.) how am I going to
figure out what text snippets go where and what placeholder needs to get
gsub’ed (or was that just sub?) with what other snippet.

Then I approached Tom with a pretty radical idea: give me a while and I’ll rip out all this text
manipulation stuff and replace all MDX strings with calls to ruby objects.

Tom, being damn awesome, answered exactly as you would hope!

So `git checkout -b heaven` and boom! I started hacking about. I had a rough
idea of what I wanted…

  1. I don’t want to write MDX in my classes
  2. A quagmire of a java object is not ‘results’, I want ruby objects that make
     sense.

The first version was called MDXBuilder, a utility class that provided a
pretty basic DSL for writing MDX queries and returned a simple hash with
values for rows, columns and grid. This was already a big improvement. So much
so, in fact, that I later did a complete rewrite that allowed you to define
your entire mondrian schema using ruby classes, then run a rake task to
generate your mondrian datamart definition XML (which sucks doing by hand,
trust me.) Think an ‘ORM’-like abstraction for business intelligence queries using
Mondrian. (ORM isn’t quite the right pattern though as BI queries don’t return
objects, but rather result sets.)
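
To give a feel for what "a pretty basic DSL for writing MDX queries" can look like, here is a tiny illustrative sketch. It is not the MDXBuilder API: the class `MdxQuerySketch`, its methods, and the cube and member names are all invented for the example, and it only builds the query string without running anything against Mondrian.

```ruby
# Hypothetical mini-DSL for building an MDX SELECT statement.
class MdxQuerySketch
  def initialize(cube)
    @cube = cube
    @axes = {}
  end

  # Members to place on the COLUMNS axis.
  def columns(*members)
    @axes["COLUMNS"] = members
    self
  end

  # Members to place on the ROWS axis.
  def rows(*members)
    @axes["ROWS"] = members
    self
  end

  # Assemble the MDX text from whichever axes were set.
  def to_mdx
    axes = @axes.map { |name, members| "{#{members.join(', ')}} ON #{name}" }
    "SELECT #{axes.join(', ')} FROM [#{@cube}]"
  end
end

query = MdxQuerySketch.new("Sales")
          .columns("[Measures].[Unit Sales]")
          .rows("[Time].[2010].[Q1]", "[Time].[2010].[Q2]")

puts query.to_mdx
```

Because each axis is optional, a query without rows simply leaves that clause out instead of needing a placeholder token to be removed afterwards.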

Anyway, I wrote it as a plugin and we’ve been using it and steadily improving it for
months now and every day I get happier and happier using it.

However, the fact that it’s a plugin has always gnawed at me. It needs to be
a gem. Also, I currently use inheritance on classes:

class SalesDimension < MDSL::Dimension
  ...
end

But since I saw the newer ruby ORMs move to mixins I’ve been in love with the
idea…it makes so much more sense.

So my next task is to:

  1. Turn MDSL plugin into a gem
  2. Use include MDSL::SomeClass instead of inheriting from it
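
For anyone who hasn’t seen the mixin pattern, this is roughly the shape I have in mind. It’s a minimal sketch with made-up names: `DimensionSketch`, `levels` and `level_names` are hypothetical and not part of MDSL; the point is only the `included` hook, which adds class-level macros without forcing a superclass.

```ruby
# Hypothetical mixin; names are illustrative, not MDSL's real API.
module DimensionSketch
  # Called by Ruby whenever a class includes this module.
  def self.included(base)
    base.extend(ClassMethods)
  end

  # Class-level macros made available to the including class.
  module ClassMethods
    def levels(*names)
      @levels = names unless names.empty?
      @levels || []
    end
  end

  # Instance-level behaviour shared by every including class.
  def level_names
    self.class.levels.map(&:to_s)
  end
end

class SalesDimension
  include DimensionSketch # no superclass needed

  levels :year, :quarter, :month
end

p SalesDimension.levels
p SalesDimension.new.level_names
```

The class stays free to inherit from whatever it actually needs to, which is the part that makes so much more sense to me.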

I hope to find time to work on this soon, so follow along if you’re
interested. For the next few posts I’ll be looking at cool patterns that I
find in some of the existing gems that do things I’d like to copy in mine.

Fragment caching with lambdas

Here’s some commonly seen Rails controller code:

Your controller

class PostsController < ApplicationController

  def index
    @posts = get_posts

    respond_to do |format|
      format.html
      format.js  { render :json => @posts}
      format.xml { render :xml  => @posts}
    end
  end

  private

  def get_posts
    # some complex query that
    # returns some posts, say...
  end

end

Technically the #get_posts method should be in the Post model,
but bear with me, I’m trying to prove a point.

index.html.erb

<h2>Posts!</h2>

<% @posts.each do |post| -%>
  <h3><%=link_to(post) %></h3>
<% end -%>

So that’s pretty standard. The site’s been doing great and everyone who is
anyone is checking out the index page!

Unfortunately your server is getting steadily dragged down and after
some profiling you realize that the main bottleneck is the #get_posts
method.

All is not lost: you’ve heard that caching can help! Sadly the rest of your
page (that is, various bits of your layout) is quite dynamic and cannot be
cached globally, so page and action caching aren’t going to help.

Luckily Rails supports fragment caching which allows you to cache a specific
bit of ERB. So your view becomes:

index.html.erb

<h2>Posts!</h2>

<% cache do -%>
  <% @posts.each do |post| -%>
    <h3><%=link_to(post) %></h3>
  <% end -%>
<% end -%>

So that looks great, you profile again and notice that from the second
request onwards, the server is indeed breathing a little easier…but
not nearly as much as you expected.

The reason is that the database queries are still being run in #get_posts,
and the results assigned to @posts. The fragment caching just skips looping
through the @posts array, replacing the entire ‘cache do … end’ block
with the HTML that got generated during the previous request.

This leads people to unhappy actions like this:

The associated helper

module PostsHelper

  def get_posts
    # some complex query that
    # returns some posts, say...
  end

end

Your controller

class PostsController < ApplicationController

  def index
    respond_to do |format|
      format.html
      format.js  { render :json => get_posts}
      format.xml { render :xml  => get_posts}
    end
  end

  private

  def get_posts
    # some complex query that
    # returns some posts, say...
  end

end

index.html.erb

<h2>Posts!</h2>

<% cache do -%>
  <% get_posts.each do |post| -%>
    <h3><%=link_to(post) %></h3>
  <% end -%>
<% end -%>

Don’t get me wrong, this works. Also, in this rather simple case you can
use Rails’ #helper_method function to remove the duplication.

However, something worth looking at when you run up against more involved
problems of this kind is to use lambdas for lazy evaluation in the view:

Your controller

class PostsController < ApplicationController

  def index
    @posts_finder = find_posts

    respond_to do |format|
      format.html
      format.js  { render :json => @posts_finder.call}
      format.xml { render :xml  => @posts_finder.call}
    end
  end

  private

  def find_posts
    Proc.new do
      # some complex query that
      # returns some posts, say...
    end
  end

end

index.html.erb

<h2>Posts!</h2>

<% cache do -%>
  <% @posts_finder.call.each do |post| -%>
    <h3><%=link_to(post) %></h3>
  <% end -%>
<% end -%>

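That way the query behind @posts_finder is lazily evaluated: it only runs
when the ‘cache do … end’ block actually executes, so on a cache hit it is
skipped entirely and you’ve cleanly removed your bottleneck.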
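
If you want to convince yourself outside of Rails, here is a small plain-Ruby simulation of the idea. Everything in it is a stand-in: `FragmentCacheSketch` is a hypothetical in-memory cache (not Rails’ fragment cache), the posts are hard-coded, and a counter takes the place of the expensive query.

```ruby
# Hypothetical in-memory stand-in for a fragment cache.
class FragmentCacheSketch
  def initialize
    @store = {}
  end

  # Runs the block only when nothing is stored under key yet.
  def fetch(key)
    @store[key] ||= yield
  end
end

fragments  = FragmentCacheSketch.new
query_runs = 0

# The expensive query, wrapped so that it only runs when something calls it.
posts_finder = lambda do
  query_runs += 1
  ["First post", "Second post"]
end

# Stand-in for rendering index.html.erb with its cached block.
render_index = lambda do
  fragments.fetch("posts/index") do
    posts_finder.call.map { |title| "<h3>#{title}</h3>" }.join("\n")
  end
end

first_html  = render_index.call # cache miss: the query runs
second_html = render_index.call # cache hit: the block, and the query, are skipped

puts "query ran #{query_runs} time(s)"
```

Two "requests" are rendered but the counter only goes up once, because the second render never enters the block that calls the lambda.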

Let me know what you think!

Gustav