Saturday, June 2, 2012

Sed Back References: Defining Regions

Sed can define regions inside the regular expression (pattern) part of a command. These regions can then be back-referenced with the special sequences "\1" to "\9". To define a region, surround it with parentheses.

As an example, let's say you have a nine-character line that has to be split into three three-character regions.
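
A minimal sketch of such a split, assuming GNU sed. The sample line `abcdefghi` and the variable names `line` and `result` are made up for illustration; each parenthesized group captures three characters, and "\1", "\2" and "\3" put them back separated by spaces.

```shell
# Hypothetical nine-character input line; any nine characters would do.
line='abcdefghi'

# -r turns on extended regular expressions, so ( ) define regions unescaped.
# Each (.{3}) captures three characters; \1, \2 and \3 refer back to them.
result=$(printf '%s\n' "$line" | sed -r 's/^(.{3})(.{3})(.{3})$/\1 \2 \3/')

echo "$result"   # prints: abc def ghi
```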


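Note that I'm passing the "-r" option to sed (GNU sed also accepts "-E" as an equivalent). Without it, I would have to escape the parentheses as "\(" and "\)", because in basic regular expressions sed treats unescaped parentheses as literal characters.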
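
For comparison, here is a sketch of a nine-character split written without "-r", using basic regular expressions. The sample input `abcdefghi` and the variable name `result_bre` are assumptions for illustration; note that both the parentheses and the braces of the repetition count need a backslash.

```shell
# Hypothetical nine-character input line.
line='abcdefghi'

# No -r: in basic regular expressions, regions are written \( ... \)
# and the repetition count is written \{3\}.
result_bre=$(printf '%s\n' "$line" | sed 's/^\(.\{3\}\)\(.\{3\}\)\(.\{3\}\)$/\1 \2 \3/')

echo "$result_bre"   # prints: abc def ghi
```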

The following is a "real life" example. I have a plain text file with contact names and email addresses in this format:
name_1:last_name_1:mail_1@domain.com
name_2:last_name_2:mail_2@domain.com
name_3:last_name_3:mail_3@domain.com
name_4:last_name_4:mail_4@domain.com
This can become difficult to read as the file grows. With a single sed command that defines regions, the output of a cat command can be rewritten in a "human readable" format.
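
One possible way to do this, assuming GNU sed. The temporary file, the variable names and the "Name: ... | Email: ..." layout are choices made for this sketch, not necessarily the exact output format originally used. The three regions match the colon-separated fields, so "\1" is the name, "\2" the last name and "\3" the email address.

```shell
# Hypothetical sample file reproducing the format described above.
contacts=$(mktemp)
cat > "$contacts" <<'EOF'
name_1:last_name_1:mail_1@domain.com
name_2:last_name_2:mail_2@domain.com
EOF

# Three regions separated by colons: \1 = name, \2 = last name, \3 = email.
formatted=$(cat "$contacts" | sed -r 's/^([^:]*):([^:]*):(.*)$/Name: \1 \2 | Email: \3/')

echo "$formatted"
# prints:
# Name: name_1 last_name_1 | Email: mail_1@domain.com
# Name: name_2 last_name_2 | Email: mail_2@domain.com

rm -f "$contacts"
```

Using `[^:]*` instead of `.*` for the first two regions keeps each one from running past its colon.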

