Sunday, June 5, 2011

Software projects and 'grep'

This might sound like a weird title, especially when I emphasize that by 'software project' I mean absolutely 'any' project. After a few days of trying to understand the source code of a few projects (including Kate), I have come to realize that it is hard to understand the entire code at once.

Realizing the impossibility of understanding the entire code-base (which could be well over a hundred thousand lines), I started trying other means. The first step was to identify the 'main' file. As I am working with C/C++, this file is easy to identify, as it is the one containing the function :-
int main (int argc, char *argv[] ) 
Knowing what I am looking for, this simple command (or one of a few variations) allows me to spot the file where it all begins :-
grep "int main" * -R 
As should be clear, this searches the source code directory 'recursively' and finds me the files containing the keywords. At this point, I would like to highlight why a program whose 'first' function is named 'main' is easier to maintain: anyone new to the code knows exactly what to search for. Whichever language a program is written in, following this convention makes its starting point easy to find.
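One of the 'variations' mentioned above could look like the sketch below. It wraps the search in a small shell function (the name find_main is my own, purely for illustration), limits it to C/C++ source files and prints the file name and line number of each match. It assumes a grep that supports --include, such as GNU grep.

```shell
# find_main [DIR]: search DIR (default: current directory) recursively for
# "int main", looking only at C/C++ source files.
#   -r  recurse into sub-directories
#   -H  print the file name for every match
#   -n  print the line number for every match
# --include restricts the search to files whose names match the pattern
# (assumes GNU grep or another grep that supports this option).
find_main() {
    grep -rHn \
        --include='*.c' --include='*.cc' --include='*.cpp' --include='*.cxx' \
        "int main" "${1:-.}"
}

# Usage (from the top of a source tree):
#   find_main .
```

Restricting the file types keeps documentation and build files that merely mention "int main" out of the results.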

Once the main file has been spotted, I proceed to find the functions it calls in much the same fashion. This saves me the effort of going through pieces of code that are rarely (or at times never) used. I can follow the 'flow of the program' more easily and I know which function lies where. As I come to understand each function, I make notes about it in the form of comments. These notes could describe the task of the function, point to some other piece of information, or at times hint at how the function could be improved later.
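Following a function from 'main' onwards can be sketched the same way. The helper below (where_is is a made-up name, and loadDocument in the usage line is a hypothetical function, not one taken from any real project) lists every line that mentions a given name as a whole word, which usually turns up the declaration, the definition and the call sites.

```shell
# where_is NAME [DIR]: list every line under DIR (default: current directory)
# that mentions NAME as a whole word.
#   -r  recurse into sub-directories
#   -n  print the line number for every match
#   -w  match whole words only, so "loadDocument" does not also
#       match "loadDocumentFast"
where_is() {
    grep -rnw -- "$1" "${2:-.}"
}

# Usage (loadDocument is a hypothetical function name):
#   where_is loadDocument src/
```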

'grep' has proved very useful to me, and although I am not very good with RegEx (regular expressions), I now plan to get a better grip on them so that I can use them with 'egrep' to spot exactly what I want.
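As a first small step in that direction, here is a sketch of what a regular expression could add to the 'main' search. It uses 'grep -E', which accepts the same extended regular expressions as 'egrep' (recent GNU grep versions recommend it over 'egrep'). The function name find_main_regex is hypothetical, and the pattern only covers the simple case where the return type and 'main' are on the same line.

```shell
# find_main_regex [DIR]: find lines that look like a definition of main,
# whatever the spacing, and with either int or void as the return type.
# Pattern, piece by piece:
#   ^[[:space:]]*     optional indentation at the start of the line
#   (int|void)        the return type
#   [[:space:]]+      at least one space or tab
#   main              the function name
#   [[:space:]]*\(    optional spaces, then the opening parenthesis
find_main_regex() {
    grep -rnE '^[[:space:]]*(int|void)[[:space:]]+main[[:space:]]*\(' "${1:-.}"
}

# Usage:
#   find_main_regex .
```

Unlike the plain "int main" search, this skips comments and strings that merely contain the words, but it would miss a 'main' whose return type sits on the previous line.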


  1. To 'grep "int main" * -R' you could add '-Hn', which'll give you the filename and line number of each match. Also, have a look at this:

  2. on a lighter note: :D