Friday, March 4, 2011

Bash script for extracting text

This is a script that I made today to solve a particular problem, but I realise that I, or anyone for that matter, will often need something similar. For that reason, I decided to share the script here.

The script was made (as my IITG friends would understand) to extract a set of e-mail addresses from a text file that contained a lot more data. The file held the information of each student (roll no., name, e-mail) on a single line, with a space as the delimiter. Since the e-mail address is the last space-separated field on each line, I used this characteristic to extract it and dump it in a file.

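# Read one record per line from standard input.
while read -r line
do
    # Strip everything up to and including the last space, keeping the last field.
    line=${line##* }
    echo "$line"
done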
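
To see what the ${line##* } expansion does to a single record, here is a minimal sketch. The sample record is made up for illustration, and last_field is just a helper name I picked, not something from the script above.

```shell
# Hypothetical sample record: roll no., name, e-mail (made-up values).
record='10010101 Some Student student@example.com'

# last_field is an illustrative helper name.
last_field() {
    # '##* ' removes the longest prefix that ends in a space,
    # so only the text after the last space remains.
    printf '%s\n' "${1##* }"
}

email=$(last_field "$record")
echo "$email"    # prints student@example.com
```

With a single # instead of ##, only the shortest matching prefix (the roll no. and the first space) would be removed.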

This script was sufficient to extract the necessary information. The next task was to dump the output into another file (assuming the above was saved as script.sh):

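$ bash script.sh < list > emails

Note that the output goes to a different file. I had originally redirected it back into 'list' itself (cat list | bash script.sh > list), and in my case I had to pass the 'list' file through script.sh once again to get the desired result. Reading and writing the same file in one pipeline is unreliable, because the shell can empty the output file before the input has been read.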
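
If the result should replace the original file, a common pattern is to write to a temporary file first and rename it afterwards. The sketch below is only an illustration: it works in a scratch directory created with mktemp, and the two records in it are made up.

```shell
# Scratch directory with a made-up 'list' file (sample data, not real records).
workdir=$(mktemp -d)
printf '%s\n' '1 Alice alice@example.com' '2 Bob bob@example.com' > "$workdir/list"

# Redirecting straight to the input file would empty it before it is read,
# so the output goes to list.tmp instead.
while read -r line
do
    printf '%s\n' "${line##* }"
done < "$workdir/list" > "$workdir/list.tmp"

# Replace the original only after the loop has finished.
mv "$workdir/list.tmp" "$workdir/list"
cat "$workdir/list"
```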

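Next, to turn the result into a comma-separated list, the following can be used:

# Append each line to the string, preceded by a comma.
while read -r line
do
    string=$string,$line
done
# Drop the leading comma left over from the first iteration.
string=${string#,}
echo "$string"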

This takes care of concatenating the strings with a comma in between consecutive elements.
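
As a quick check of the idea, here is a small sketch that runs the same loop on three made-up addresses and compares it with the standard paste utility, which can do the join in one step. The join_lines name and the sample addresses are my own, for illustration only.

```shell
# Made-up sample addresses, one per line.
addresses=$(printf '%s\n' 'a@example.com' 'b@example.com' 'c@example.com')

# join_lines is an illustrative helper name; it reads lines from standard input.
join_lines() {
    joined=
    while read -r line
    do
        joined=$joined,$line
    done
    # Remove the single leading comma.
    printf '%s\n' "${joined#,}"
}

csv=$(printf '%s\n' "$addresses" | join_lines)
echo "$csv"    # prints a@example.com,b@example.com,c@example.com

# paste -s joins all input lines into one, using the -d delimiter.
csv_paste=$(printf '%s\n' "$addresses" | paste -s -d , -)
```

The loop is handy when each line needs extra processing along the way; for a plain join, paste is shorter.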

The above set of commands helped me create a list of over 580 e-mail addresses in less than 10 minutes (including the time taken to create the scripts), way faster than the manual method of 'copy-paste'. Note that I call it a manual method.

I hope this tutorial sheds some light on the use and power of simple bash scripts to solve everyday problems and save a lot of time. In case of any doubts or confusion, do reply through the comments.

exit(0)    // End of post
