Technical Outlet: July 2011

Friday, July 29, 2011

Ack!

I'm sure you've used grep at some point in your life as a programmer to search for that piece of code in your code base. I was using it myself, until today. It turns out there's a better alternative to all that time spent trying to get the perfect grep alias so it works as it should: searching recursively, ignoring the .git/.svn directories, ignoring backup files (foo~/#foo#), printing out the file names and line numbers, grouping matches by file name. You've got to listen when Jacob Kaplan-Moss, co-creator of Django, has this to say about it:

"Whoa, this is *so* much better than grep it's not even funny."

It's called ack. Without further ado, you can install it on your Ubuntu system using sudo apt-get install ack-grep (for other distros, check out the ack homepage). Then use it as: ack-grep search_term. It's really that simple (although the default colours are a little irritating). You can see what files/directories it ignores by default using ack-grep --help. Here's a screenshot for the impatient reader:

'nuff said. I'm gonna go explore the man page.
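
For comparison, here is roughly what the hand-rolled grep setup described above looks like when spelled out. It is only a sketch that assumes GNU grep, and the function name search_code is made up for illustration:

```shell
# search_code is a hypothetical helper name; the options are GNU grep's.
#   -r  search directories recursively
#   -n  print line numbers (file names are printed automatically with -r)
#   -I  skip binary files
search_code() {
    grep -rnI \
        --exclude-dir=.git --exclude-dir=.svn \
        --exclude='*~' --exclude='#*#' \
        -e "$1" "${2:-.}"
}

# Usage: search_code search_term [directory]
```

Even with all of that, the matches are still not grouped by file name, which is one of the things ack takes care of for you.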

Thursday, July 28, 2011

The alternative (and very useful) bash prompt

Almost all Linux users have used bash at some point, but how many of them know how to customize it to their liking? Here is a screenshot of one custom bash prompt I've created, which I find much better than the default (it took me a few hours to reach this point, owing mainly to my poor skills in bash scripting):


So, other than showing useful stuff related to git repositories, it gives me a full line to type my command, and separates each command with a hyphenated separator and a newline, so I can see it clearly as I scroll upwards. The code calls __git_ps1, a helper that comes with git's bash completion/prompt scripts, so those need to be loaded in your shell. The corresponding code snippet is (put it at the end of your ~/.bashrc if you wish to use it):

PS1='\[\033[01;34m\]\$\[\033[00m\] '

parse_git_dirty () {
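    # print "*" when inside a git repository that has uncommitted or untracked changes;
    # --porcelain output is empty for a clean tree and does not depend on git's wording
    [[ -n $(__git_ps1) ]] && [[ -n $(git status --porcelain 2> /dev/null) ]] && echo "*"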
}

print_pre_prompt () {
    # the first part of the prompt, i.e., the user's name
    local PS1U=$USER

    # the second part of the prompt, i.e., the present working directory
    local PS1WD="${PWD}"
    if [[ $PS1WD = "$HOME" ]]; then PS1WD=\~; fi
    # abbreviate only when $HOME is a prefix of the path, not when it appears anywhere in it
    if [[ $PS1WD = "$HOME"/* ]]; then PS1WD=\~${PS1WD#"$HOME"}; fi

    # the fourth and last part of the prompt, i.e., the git stuff
    local git_dirty=$(parse_git_dirty)
    local git_dirty_color="\033[1;31m"
    local git_clean_color="\033[1;30m"
    if [[ $git_dirty = "*" ]]; then local git_color=$git_dirty_color; else local git_color=$git_clean_color; fi
    local PS1R=$(__git_ps1 "%s")$git_dirty

    # the third part of the prompt, i.e., the separator consisting of hyphens
    local PS1SEP=""
    while (( ${#PS1U} + ${#PS1SEP} + ${#PS1WD} + ${#PS1R} + 3 < COLUMNS )); do
        PS1SEP=$PS1SEP-
    done

    # putting it all together
    printf "\n\033[1;32m%s \033[1;34m%s \033[0;36m%s $git_color%s" "$PS1U" "$PS1WD" "$PS1SEP" "$PS1R"
}

PROMPT_COMMAND=print_pre_prompt

Rather than explaining the code (which I'm not very clear about myself), I'll point you to some links which helped me along the way:
1. https://wiki.archlinux.org/index.php/Color_Bash_Prompt
2. http://superuser.com/questions/187455/right-align-part-of-prompt
3. https://gist.github.com/306785/4da3e39fc012475140fdf33ef0f6bc0a6e7c04a5
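
The one piece of arithmetic in the prompt is the separator: its length is the terminal width minus whatever the user name, directory, git info and the three spaces take up. Here is a minimal sketch of that step on its own, building the run of hyphens in one go instead of in a loop. The function name make_separator and its two arguments are made up for illustration and are not part of the snippet above:

```shell
# make_separator WIDTH USED prints WIDTH-USED hyphens (nothing if that is <= 0).
# Hypothetical helper: WIDTH stands in for $COLUMNS, USED for the space
# taken by the other parts of the prompt line.
make_separator() {
    sep_len=$(( $1 - $2 ))
    [ "$sep_len" -gt 0 ] || return 0
    # pad an empty string to sep_len columns, then turn the padding into hyphens
    printf '%*s' "$sep_len" '' | tr ' ' '-'
}

# Example: an 80-column terminal where the other parts occupy 30 columns
make_separator 80 30
echo
```

The early return matters when the path is wider than the terminal: the separator is then simply left out.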

-----
Liked this article? How about following me or Rajat on Twitter?

Friday, July 15, 2011

5 Cool Things in Git

Thing #1: Pretty coloured output

Command: git config --global color.ui auto

This will make all output colour-coded. So now git diff and git status will show up in red-green awesomeness (not that that's a great help to me, but it will be to you, undoubtedly).

Thing #2: Aliases

Command: git config --global alias.shortcut command

The beloved time-savers. Here are some of my aliases:

git config --global alias.l log

git config --global alias.s status

git config --global alias.co checkout

Thing #3: Awesome-looking git log

Command: git log --pretty=oneline --graph

Combine that with Thing #1, and you've got an ASCII-art style, rainbow-coloured git log. And, if you have a lot of free time, or if you're just way too interested in git, you can read man git-log. There are tons of options in there.

Thing #4: A bucket-load of ways to refer to commits

Ok, first of all, branches are not branches. Yes, you read that right. A "branch" is actually just a reference to the latest commit in it. A commit's lineage defines its history, and so arises the idea of a "branch".

You can refer to commits in many ways. Let's agree on some common notations before I explain that. Every commit can be referred to by a commit pointer. The commit pointer may be followed by any number of modifiers, which will together constitute another commit pointer. A commit pointer without any modifiers can have any one of the following forms:

a. the SHA key of the commit (the first 6-7 characters should suffice)

b. HEAD (refers to the latest commit in the current branch)

c. branch_name (refers to the latest commit in the branch specified by branch_name)

d. tag_name (refers to the commit pointed to by the tag called tag_name)

A modifier, on the other hand, can be either one of the following:
a. ~n (refers to the nth ancestor of the commit pointed to by the commit pointer preceding it)
b. ^ (refers to the parent of the commit pointed to by the commit pointer preceding it. If a commit has multiple parents, you can use ^n to refer to the nth one)

We all learn by example, so let's consider a possible scenario. Say we have a git repository with 2 branches, master and testing. Say the master branch has 2 commits with SHA keys 0af4780... (=HEAD) and 326399d..., and say the testing branch has 3 commits. Say the current branch is master (which is why HEAD is its latest commit). Here are some examples in the context of that scenario:
a. 0af4780... = HEAD = self explanatory
b. HEAD~1 = commit 326399d...
c. testing^^ = the oldest commit in the testing branch
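
If you want to try these out, the scenario can be rebuilt in a scratch repository. This is only a sketch: the file name, commit messages and the make_commit helper are made up, and the hashes you get will differ from the ones above. Here master is checked out, so HEAD resolves to the tip of master as in example a.

```shell
# Throwaway repository; every name and message below is made up for the demo.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Example User"
git config user.email "user@example.com"

make_commit() {
    echo "$1" >> notes.txt
    git add notes.txt
    git commit -q -m "$1"
}

make_commit "first commit"              # root commit, shared by both branches
git branch testing                      # testing starts at the root commit
make_commit "second commit on master"
git checkout -q testing
make_commit "second commit on testing"
make_commit "third commit on testing"
git checkout -q master                  # master is now the current branch

git rev-parse --short HEAD master             # the same hash twice
git rev-parse --short HEAD~1 'HEAD^'          # the parent of HEAD, written two ways
git rev-parse --short 'testing^^' testing~2   # the grandparent of testing's latest commit
```

Each rev-parse line prints the same hash twice, since both spellings name the same commit. In this particular repository the last two lines also agree with each other, because both branches grew from the same first commit.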

Thing #5: Alternate usages
In commands like git diff, which make sense for both commits and files, the commit pointers passed to them can be followed by :file_path, if you wish to compare 2 past versions of a file instead of 2 whole commits. For example, say we have a file called blah.c in the lib/ directory in our repository (which happens to be huge), and we want to see what changes were made to it between the second last and fourth last commits. We could do something like:

git diff HEAD~3:lib/blah.c HEAD~1:lib/blah.c

Another interesting thing is commit ranges, which you can pass to commands like git log to see what has changed in that range of commits. They have the form commit_pointer1..commit_pointer2, which selects the commits that are reachable from commit_pointer2 but not from commit_pointer1.
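
Both forms are easy to try in a scratch repository. The sketch below makes four commits that each append a line to lib/blah.c (the file contents and commit messages are made up), then compares two versions of the file and lists a range. git diff shows the changes going from its first argument to its second, so the older version is given first here and the added lines show up with a +.

```shell
# Throwaway repository with four commits touching lib/blah.c.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git config user.name "Example User"
git config user.email "user@example.com"
mkdir lib

for n in 1 2 3 4; do
    echo "line $n" >> lib/blah.c
    git add lib/blah.c
    git commit -q -m "commit $n"
done

# lib/blah.c as of the fourth last commit versus the second last commit
git diff HEAD~3:lib/blah.c HEAD~1:lib/blah.c

# commits reachable from HEAD~1 but not from HEAD~3, i.e. "commit 3" and "commit 2"
git log --oneline HEAD~3..HEAD~1
```

Note that the range leaves out its starting point: HEAD~3 itself ("commit 1") is not listed.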

If this got you interested, you really should read this amazing PDF on the internals of git.


Thursday, July 14, 2011

Sed - reducing effort since '74

Sed, short for 'stream editor', is a (pretty awesome) utility created by Lee E. McMahon of Bell Labs. I would like to take a detour here and mention that I am in real awe of Bell Labs. They have given us so much, from C and C++ to Unix itself. Another useful utility (which I should cover here soon) made by them is cscope.

Now, coming back to 'sed', it saved me quite some effort today while working on Kate. Well, frankly, learning some basic tricks and perfecting the command might have taken a little longer than making the change manually would have, but I won't count that time. Why? Because the time invested will save me plenty more in the future, whereas doing the task manually wouldn't save me anything later on.

So what was this task? As I have explained here, I am working on the modeline variable editor in Kate. The widget that my mentor and I had created, to be used in place of the 'QLineEdit', had some help text. I had forgotten to wrap the text in i18n() wrappers.

The initial text was something like this :-

item->setHelpText("Help text goes here");

Now had there been just two or three of such occurrences, I wouldn't have thought about using sed; but there were a lot more (I didn't check then, but as it turns out, about 40 changes were made using sed). Anyways, I needed the text to be modified to :-

item->setHelpText(i18n("Help text goes here"));

With sed, that wasn't too hard.

sed 's:setHelpText(\"\([a-zA-Z0-9.() ]*\)\"):setHelpText(i18n(\"\1\")):g' <old >new

Just one line and all the changes were made. And I had with me another patch to offer to the community ;-).
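
To see the substitution at work without touching real sources, you can feed it a single line. The sketch below is a slight variation on the command above, not the exact one I ran: it uses [^"]* (any run of characters other than a double quote) in place of the explicit character list, so help strings containing commas or apostrophes are matched too. The function name wrap_help_text is made up for illustration.

```shell
# wrap_help_text reads C++ source on stdin and wraps every string literal
# passed directly to setHelpText() in i18n().
#   \( ... \)  captures the text between the quotes
#   \1         puts the captured text back inside i18n("...")
wrap_help_text() {
    sed 's:setHelpText("\([^"]*\)"):setHelpText(i18n("\1")):g'
}

printf '%s\n' 'item->setHelpText("Help text goes here");' | wrap_help_text
# prints: item->setHelpText(i18n("Help text goes here"));
```

Running it a second time on its own output changes nothing, because an already wrapped call has i18n( rather than a quote right after setHelpText(. It will not cope with help strings that contain escaped quotes (\"), so check the diff before committing.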

Combined with regular expressions (regex), sed has a lot to offer.

Suggested links :

For more information on Internationalization (i18n), check out the Wikipedia page.
For a tutorial on Sed, I would suggest this.
Online manual for Sed is available here.

Thursday, July 7, 2011

Kate's Variable Editor

Well, as I have mentioned before, I am part of 'Season of KDE' this year and am working on Kate. My project statement is to create a 'Modeline Editor'.

Frankly speaking, modeline variables are a feature that I was unaware of before I read about this project. It is one of those subtle features that can make editing and formatting very easy. With just a line at the beginning (or end) of the file you can tell the editor things like the tab-width, syntax-highlighting, background-color and what not.
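
To give an idea of what such a line looks like, here is a small shell script carrying a Kate modeline in a comment. Treat it as a sketch: the variable names (tab-width, indent-width, replace-tabs, hl) are written as I understand them from Kate's documentation on document variables, and the script itself is just filler.

```shell
#!/bin/sh
# kate: tab-width 4; indent-width 4; replace-tabs on; hl Bash;
#
# The comment line above is the modeline: "kate:" followed by
# "variable value;" pairs. Kate reads it when the file is opened,
# while the shell ignores it like any other comment.

greet() {
    printf 'Hello, %s\n' "$1"
}

greet "Kate"
```

The variable editor described below is meant to save you from having to remember and type those pairs by hand.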

With a lot of assistance from my mentor, Dominik Haumann, I began working on the variable editor by creating a new 'widget'. For now, this remains independent of the rest of Kate, but once it is ready (which it almost is), we'll merge it.

Some important lessons learnt during the process that I'd like to share :-

Make use of version control: After spending about 24 hours on a part of the project, I e-mailed my mentor the new source code, only to learn that those very changes had already been made. So how did I miss it? I had updated my working copy at the beginning of those 24 hours, but not once during them did I bother to check whether any changes had been made upstream. That was one valuable lesson. As Vicky correctly pointed out though, there was a silver lining: I wrote some code that was also written by a more experienced person, so I had the opportunity to compare and learn.
Develop complex widgets separately: As I mentioned above, the variable editor is being developed as a separate widget. It required a combination of combo-boxes, spin boxes and line edits to cater for boolean, integer, color, font and other string values. Developing the widget 'within' the editor would have made it harder. With only this segment to focus on, we can forget about the rest of the editor while developing and testing the variable editor. Here are some snapshots of the widget:

Here you can see the combo-boxes for boolean and color values
Spin box for integer

The color combo and the line-edit for string values.

The color combo-box (KColorCombo)

The font combo-box (KFontCombo)

Yet another integer spin button and color-combo.

Soon, this widget will be a part of the Kate text editor.

Sunday, July 3, 2011

Beautiful code == poetry

This post is long and a little on the non-technical side, but I think the implications are very relevant to the readers. You might be somewhat pissed that I'm not talking about how sweet jQuery is, but here goes.

How many of us programmers just drool over clean, self-documenting code? I do. I literally strive to have my code look like elegant poetry, so much so that I spend a good portion of my time refactoring. Some might call that over-the-top. I call it an eye for detail. I believe code is like literature; you have to write it in a way so that others (and more importantly a future you) can immediately understand what you mean without getting bogged down with pathetic, obscure syntax or poor style. After all, code is not meant to be "write-once". Moreover, a few years down the line, you want to be able to look back on your code and say, "Whoa. I used to write amazing code even back then".

I try to take extreme care about naming my functions in just the right way, so that whenever I read one later on, I know exactly what it does. I won't say I'm very good at it, but I believe I'm getting better.

The language syntax is crucial to how readable the code is. That is one thing Ruby does well. Let's see a contrived example:

a = [1, 2, 3] + [4, 5, 6] - [5]   # = [1, 2, 3, 4, 6]
b = "blah"
return true if a.include? 4 and not b.empty?

Another thing Ruby is good at: Domain Specific Languages (DSLs). They're an attempt to make non-geeks write code to get stuff done easily. As an example, consider this: rocket scientists are good at space-talk, but bad at writing code. What if we could create a program which exposes an API that looks like rocket science? It doesn't get any cooler. The scientist doesn't need to know Ruby, only the API. He can think in terms of space shuttles and takeoff times and things like that. For example, a rocket science DSL in Ruby might allow a rocket scientist to write stuff like:

shuttle.fill_tank 50   # tank up 50 liters
shuttle.launch_after 3 if shuttle.prepared? # 3... 2... 1... go!

You might say, "Wait a sec. Isn't that your regular object-oriented code?". Yes, but normally we create that for ourselves. A DSL's API is meant to be used by the users to write code.

A related example comes to mind. The other day my friend and I were working on a DSL for access control in a web application, which controls what user gets to access what part of the application. The most important thing in a DSL, as you might've noticed, is the API. Everything else is secondary. We needed to name a function which would take a condition, and a result to be returned if the condition was met. We thought up names like make_rule, but we weren't satisfied. After some brainstorming, we came up with an API like this one:

given :condition => user.admin?, :return => true

I feel the function name given is the most important thing in this API. It conveys the purpose unambiguously and reads like natural English. Any other name would've made it confusing or misleading. And remember again, some John Doe titled "Maintainer" at a company is going to be using this API directly, without being good at Ruby. That's why the nomenclature is so very important.

If you're more of a JavaScript guy, you should check out CoffeeScript. It's "a little language that compiles into JavaScript", with Ruby- and Python-inspired syntax. Imagine being able to write stuff like the following and have it run as JavaScript:

volume = 10 if band isnt SpinalTap
letTheWildRumpusBegin() unless answer is no

I've presented my (rather long) case. Let the comments fly.

UPDATE: We're finally done with the access control DSL, and the API now looks like this:

given :condition => { :controller.is => :admin, :action.not.in => [:create, :show] },
      :return => true

Isn't that beautiful? :)

