Thursday, June 2, 2011

How to remove continuous (unnecessary) occurrences of a character?

In Debian Linux we have a superb command-line tool called 'tr' [translate or delete characters]. This simple, handy tool has many uses; here I am going to explain how to squeeze a sequence of repeated (identical) characters down to a single one.

The date command produces output as follows:
Ex-1-scenario:
$ date
Thu Jun  2 17:10:17 IST 2011
Note: There are 2 spaces between the month (Jun) and the day (2) because the day is a single digit and is padded with a space.
Ex-2-scenario:
$ date
Thu Jun 22 17:10:17 IST 2011

So, if I try to split the date output using the cut command with a single space ( ) as the delimiter, the field count changes depending on the day.
For example:
Ex-1-scenario:
f1-Thu f2-Jun f3-( ) f4-2 f5-17:10:17 f6-IST f7-2011

Ex-2-scenario:
f1-Thu f2-Jun f3-22 f4-17:10:17 f5-IST f6-2011
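
The field shift can be reproduced on any day by feeding the two sample lines to cut instead of running date. This is only a minimal sketch; the variable names (single, double, f3_single, f3_double) are made up for illustration.

```shell
# Sample lines copied from the two scenarios (note the two spaces before the 2).
single='Thu Jun  2 17:10:17 IST 2011'
double='Thu Jun 22 17:10:17 IST 2011'

# Ask cut for field 3, using a single space as the delimiter.
f3_single=$(printf '%s\n' "$single" | cut -d' ' -f3)
f3_double=$(printf '%s\n' "$double" | cut -d' ' -f3)

# With the single-digit day, field 3 is the empty string between the two spaces.
echo "single-digit day, f3: [$f3_single]"
# With the double-digit day, field 3 is the day itself.
echo "double-digit day, f3: [$f3_double]"
```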

So, how can I split the fields correctly regardless of whether the day has one or two digits? Here comes our tr tool. For both scenario 1 and scenario 2, it is done as follows:
$ date | tr -s ' '
This will output the date with a single space ( ) between fields.
Note: if you use tr -s 'y', it will squeeze a continuous occurrence such as 'yyyyy' and leave only a single 'y' there.
Happy scripting.. :)
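
To check that squeezing makes the field positions stable, the same two sample lines can be piped through tr -s ' ' before cut. Again, this is just a sketch with illustrative variable names, and the 'heyyyyy there' string is an invented input for the 'y' case.

```shell
# Same sample lines as in the two scenarios (two spaces before the 2).
single='Thu Jun  2 17:10:17 IST 2011'
double='Thu Jun 22 17:10:17 IST 2011'

# Squeeze runs of spaces first, then take field 3: it is the day in both cases.
day_single=$(printf '%s\n' "$single" | tr -s ' ' | cut -d' ' -f3)
day_double=$(printf '%s\n' "$double" | tr -s ' ' | cut -d' ' -f3)
echo "$day_single"
echo "$day_double"

# tr -s works for any character, not only spaces.
squeezed=$(printf '%s\n' 'heyyyyy there' | tr -s 'y')
echo "$squeezed"
```

In a real script the printf line would simply be replaced by date, as in date | tr -s ' ' | cut -d' ' -f3.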

For more information, see the man pages of the tr and date commands.