I am trying to extract the To header from an email file using sed on Linux.
The problem is that the To header can span multiple lines.
For example:
To: [email protected], [email protected],
[email protected], [email protected],
[email protected]
Message-ID: <[email protected]>
I tried the following:
sed -n -e '/^[Tt]o: / { N; p; }' _message_file_ |
awk '{$1=$1;printf("%s ",$0)};NR%2==0{print ""}'
The sed command extracts the line starting with To: plus the one line that follows it. I pipe the output to awk to join everything onto a single line.
The full command prints one line:
To: [email protected], [email protected], [email protected], [email protected]
I don't know how to go further: how can I keep testing whether the next line starts with whitespace and, if it does, append it to the result?
What I want is all the addresses on one line:
To: [email protected], [email protected], [email protected], [email protected], [email protected]
Any help will be appreciated.
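One possible direction, as a rough sketch rather than a definitive answer: loop inside sed, appending the next line with N for as long as it begins with a space or tab, then print the joined header and quit. This assumes continuation lines really do start with whitespace (as folded headers normally do) and that only the first To: header matters. The function name extract_to is made up for illustration.

```shell
# Hypothetical helper: print the first To: header of a message file,
# with folded continuation lines joined onto a single line.
extract_to() {
  sed -n '
/^[Tt]o: / {
  :a
  # Append the next input line, unless we are already on the last one.
  $!N
  # If the appended line starts with a space or tab, it continues the header:
  # replace the newline and leading blanks with one space, then look again.
  /\n[[:blank:]]/ {
    s/\n[[:blank:]]*/ /
    $!ba
  }
  # Print up to the first newline (the joined header) and stop reading.
  P
  q
}' "$1"
}

# Usage: extract_to _message_file_
```

The loop replaces the fixed "N; p" pair, so the awk step is no longer needed, and the q stops sed from scanning the message body.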