[PLUG] Translating ^M to \n
Rodney W. Grimes
freebsd at gndrsh.dnsmgr.net
Mon Aug 12 22:26:18 UTC 2019
> I have large (~111M) .csv data files exported from a Microsoft Access
> database. Each file is one large block of text using ^M (carriage return)
> embedded as the line separator.
Are you sure it is not ^M^J? You're probably only seeing the ^M in emacs;
that pair is known as CR LF line termination.
Many ways to fix it..
dos2unix is a common utility...
tr -d '\15' < winfile.txt > unixfile.txt
(that deletes every CR; if the file really has bare CRs and no LFs, use
tr '\15' '\12' instead)
vi can do it:  :1,$s/^v^m//g
perl, awk, ftp localhost in ascii mode *may* fix it.
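
Which tr invocation is right depends on whether the file has bare CRs or
CR LF pairs. A minimal sketch, assuming a POSIX sh with mktemp available;
the sample file names and the count helper are made up for illustration:

```shell
#!/bin/sh
# Sketch: count CR and LF bytes to tell CR-only from CR LF, then convert.
# The sample files and the count() helper are hypothetical placeholders.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

printf 'a,b\rc,d\r' > "$tmp/cronly.csv"      # bare CR as line separator
printf 'a,b\r\nc,d\r\n' > "$tmp/crlf.csv"    # CR LF as line terminator

# count CHAR FILE: print how many times CHAR occurs in FILE
count() { tr -cd "$1" < "$2" | wc -c | tr -d ' '; }

for f in "$tmp/cronly.csv" "$tmp/crlf.csv"; do
    cr=$(count '\r' "$f")
    lf=$(count '\n' "$f")
    if [ "$lf" -eq 0 ] && [ "$cr" -gt 0 ]; then
        # CR is the only separator: translate each CR to LF
        tr '\r' '\n' < "$f" > "$f.unix"
    else
        # CR LF pairs (or already Unix): delete CRs, keep the LFs
        tr -d '\r' < "$f" > "$f.unix"
    fi
done
```

Deleting the CRs from a CR-only file would glue every record into one
line, which is why the sketch checks for LFs first.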
> 'sed' is probably the best tool to translate that control character to a
> newline (\n) but I don't know how to write '^M' so sed recognizes it as a
> single character. In emacs it displays colored cyan rather than white.
Um, sed could work; to enter a ^M, use the sequence ^v^m (^v is the
"quote next character" character in most unices).
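
In a script, where typing ^v^m is awkward, printf can supply the CR
instead. A rough sketch, assuming a POSIX sh and sed; the variable names
cr and fixed are made up for illustration:

```shell
#!/bin/sh
# Sketch: put a literal CR into a shell variable so it need not be typed.
# Command substitution strips only trailing newlines, so the CR survives.
cr=$(printf '\r')

# Strip one trailing CR from every line (CR LF -> LF).
# Inside double quotes, \$ reaches sed as the end-of-line anchor $.
fixed=$(printf 'one\r\ntwo\r\n' | sed "s/${cr}\$//")

printf '%s\n' "$fixed"
```

This only helps with CR LF input. sed splits its input on LF, so a file
with bare CRs arrives as one giant line and is easier to handle with tr.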
> A web search told me that ^M is equivalent to the linux \r, but not how to
> specify it for sed or emacs.
Let's try to complete that information...
^M is hex 0x0d, "CR", "Carriage Return", expressed in "C" format string as \r
^J is hex 0x0a, "LF", "Line Feed", expressed in "C" format string as \n, also known as newline
WINDOWS outputs "\r\n", aka CR-LF, aka 0x0d,0x0a as its end of line marker.
UNIX outputs "\n", aka LF or newline as its end of line marker.
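
To see those bytes for yourself, od will dump them in hex. A small
sketch, assuming a POSIX od; the variable names dos and unix are made up
for illustration:

```shell
#!/bin/sh
# Sketch: dump the bytes of a DOS-style and a Unix-style line in hex.
# tr strips od's spacing so the two results are easy to compare.
dos=$(printf 'hi\r\n' | od -An -tx1 | tr -d ' \n')
unix=$(printf 'hi\n' | od -An -tx1 | tr -d ' \n')

echo "DOS:  $dos"     # 68690d0a  (h, i, CR, LF)
echo "Unix: $unix"    # 68690a    (h, i, LF)
```

Running od -c on the first few hundred bytes of the real file (head -c
is handy there, where available) shows the same thing as \r and \n.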
> Pointers needed.
struct foo {
int* where;
} pointers['\n'];
There, 10 pointers :-)
> Rich
--
Rod Grimes rgrimes at freebsd.org