SUMMARY: SED/AWK & null bytes

From: DAUBIGNE Sebastien - BOR <SDaubigne_at_bordeaux-bersol.sema.slb.com>
Date: Thu Feb 27 2003 - 05:10:34 EST
Many many thanks for so many relevant answers.

Larye D. Parkins and Sanjiv K. Bhatia suggested using the POSIX version of
"tr" in /usr/xpg4/bin, which handles null bytes properly, as stated in the
"tr" man page. That makes the C program unnecessary (RTFM would have saved
me the time spent writing it :-): "tr" replaces the null bytes with a
placeholder byte, "sed" substitutes the pattern, and "tr" then restores
the null bytes.
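
As a rough sketch of that tr+sed round trip (not the exact commands from the
replies): the function name nul_safe_sed, the TR variable and the choice of
byte \001 as the placeholder are all assumptions made for illustration. On
Solaris, TR would point at /usr/xpg4/bin/tr.

```shell
#!/bin/sh
# Sketch only: nul_safe_sed, TR and the \001 placeholder are illustrative
# assumptions, not part of the original replies.
# On Solaris, run with TR=/usr/xpg4/bin/tr to get the POSIX tr.
TR=${TR:-tr}

nul_safe_sed() {
    # $1: sed script. Data is read from stdin and written to stdout.
    # 1. hide every null byte behind \001 (assumed absent from the input)
    # 2. let sed do the substitution on null-free text
    # 3. turn every \001 back into a null byte
    "$TR" '\000' '\001' | sed "$1" | "$TR" '\001' '\000'
}
```

With this in place, something like
printf 'aaa\000bbb\n' | nul_safe_sed 's/a/f/g' | od -c
should show the null byte still sitting between "fff" and "bbb".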

Finally, John Julian, Darren Dunham, Tom Payerle and Dave Mitchell suggested
using Perl to do the pattern substitution, because Perl handles null bytes
properly (Tom Payerle suggested using "s2p" to translate the sed command into
a Perl one). This is the best solution.
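
A minimal sketch of the Perl approach, assuming a simple s/// expression; the
wrapper name perl_subst is hypothetical. It relies only on the standard -p
and -e switches of perl.

```shell
#!/bin/sh
# Sketch only: perl_subst is a hypothetical wrapper name.
perl_subst() {
    # $1: a Perl substitution such as 's/a/f/g'.
    # -p reads the input line by line, applies the expression to each line
    # and prints the result; -e takes the expression from the command line.
    # Null bytes inside a line pass through untouched.
    perl -pe "$1"
}
```

For instance, printf 'aaa\000bbb\n' | perl_subst 's/a/f/g' | od -c can be
used to check that the null byte survives the substitution.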
However, since I want to build a generic tool and Perl is not always
installed, I decided to test for Perl and fall back to the tr+sed method
when it is missing.
Admittedly, the tr+sed method is not completely fail-safe: the non-null byte
that stands in for the null bytes must not already occur in the original
file.
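
One possible shape for such a tool, as a sketch under stated assumptions
rather than the script I actually wrote: subst_file, tr_sed_subst, TR and
PLACEHOLDER are hypothetical names, and the expression is assumed to be valid
for both sed and Perl (otherwise it would need translating, e.g. with "s2p").
The fallback refuses to run when the placeholder byte already occurs in the
file, which covers the fail-safe concern.

```shell
#!/bin/sh
# Sketch only: subst_file, tr_sed_subst, TR and PLACEHOLDER are hypothetical
# names. The expression in $1 is assumed to be valid for both sed and Perl.
# On Solaris, TR would be set to /usr/xpg4/bin/tr.
TR=${TR:-tr}
PLACEHOLDER='\001'   # octal escape understood by tr

tr_sed_subst() {
    # $1: substitution expression, $2: input file. Output goes to stdout.
    # Count occurrences of the placeholder byte by deleting everything else.
    count=$("$TR" -cd "$PLACEHOLDER" < "$2" | wc -c | "$TR" -d ' ')
    if [ "$count" -ne 0 ]; then
        printf '%s\n' "placeholder $PLACEHOLDER already occurs in $2" >&2
        return 1
    fi
    # Hide the null bytes, substitute, then restore the null bytes.
    "$TR" '\000' "$PLACEHOLDER" < "$2" | sed "$1" | "$TR" "$PLACEHOLDER" '\000'
}

subst_file() {
    # Prefer Perl when it is on the PATH, otherwise fall back to tr+sed.
    if command -v perl >/dev/null 2>&1; then
        perl -pe "$1" "$2"
    else
        tr_sed_subst "$1" "$2"
    fi
}
```

A call such as subst_file 's/a/f/g' somefile writes the result to standard
output; the caller decides where it goes.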

---
Sebastien DAUBIGNE
sdaubigne@bordeaux-bersol.sema.slb.com
<mailto:sdaubigne@bordeaux-bersol.sema.slb.com>  - (+33)5.57.26.56.36
SchlumbergerSema - SGS/DWH/Pessac

	-----Original Message-----
	From:	DAUBIGNE Sebastien - BOR
	[SMTP:SDaubigne@bordeaux-bersol.sema.slb.com]
	Date:	Wednesday, 26 February 2003 16:13
	To:	sunmanagers@sunmanagers.org; sunhelp@sunhelp.org
	Subject:	SED/AWK & null bytes

	Are you aware of any Solaris tool that handles null bytes properly?
	I need to substitute regular expressions in files that contain null
	bytes, and I can't make sed/awk work well with them (I guess this is
	due to the terminating null byte of libc strings).
	SED removes the null bytes from the line, as does TR.
	AWK drops everything after the first null byte in the line.

	The only solution I've found is to write a C program that replaces
	null bytes with a non-null byte not found in the file (e.g. '%'),
	then run sed/awk, then turn that byte back into a null byte.

	> echo "aaa\0bbb" | sed 's/a/f/g' |od -c
	0000000   f   f   f   b   b   b  \n

	> echo "aaa\0bbb" | nawk '{gsub("a","f");print}' |od -c
	0000000   f   f   f  \n

	---
	Sebastien DAUBIGNE
	sdaubigne@bordeaux-bersol.sema.slb.com
	<mailto:sdaubigne@bordeaux-bersol.sema.slb.com>  - (+33)5.57.26.56.36
	SchlumbergerSema - SGS/DWH/Pessac
	_______________________________________________
	sunmanagers mailing list
	sunmanagers@sunmanagers.org
	http://www.sunmanagers.org/mailman/listinfo/sunmanagers
Received on Thu Feb 27 06:37:24 2003
