[PLUG] Translating ^M to \n

Rodney W. Grimes freebsd at gndrsh.dnsmgr.net
Mon Aug 12 22:26:18 UTC 2019


> I have large (~111M) .csv data files exported from a Microsoft Access
> database. Each file is one large block of text using ^M (carriage return)
> embedded as the line separator.

Are you sure it is not ^M^J? You're probably only seeing the ^M in emacs;
this is known as CR LF line termination.

Many ways to fix it:
dos2unix is a common utility.
tr -d '\15' < winfile.txt > unixfile.txt    (\15 is octal for ^M)
vi can do it: :1,$s/^v^m//g

perl, awk, ftp localhost in ascii mode *may* fix it.
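
A minimal sketch of the tr approach, wrapped in two shell functions. The
function names strip_cr and cr_to_lf are made up for illustration. Note the
difference: deleting ^M is right for CR LF files, but if the file really
uses a bare ^M as its only separator, deleting it would glue every record
together, so translate it to a newline instead.

```shell
# strip_cr: read stdin, delete every CR (0x0d).
# Turns CR LF line endings into plain LF.
strip_cr() {
    tr -d '\r'
}

# cr_to_lf: read stdin, turn every CR into LF.
# For files that use a bare CR as the line separator.
cr_to_lf() {
    tr '\r' '\n'
}
```

Usage would look like "strip_cr < winfile.txt > unixfile.txt", with the
file names taken from the tr example above.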

> 'sed' is probably the best tool to translate that control character to a
> newline (\n) but I don't know how to write '^M' so sed recognizes it as a
> single character. In emacs it displays colored cyan rather than white.

Um, sed could work; to enter a ^M use the sequence ^v^m (^v is the
"quote next character" key in most unices).
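
If typing the control character is a nuisance, one way around it is to let
printf produce the ^M and hand it to sed in a shell variable. This is only
a sketch: the variable CR and the function sed_strip_cr are names invented
here, and it assumes the file has CR LF endings. With a bare-^M file sed
would see the whole thing as one enormous line, so tr is probably the
simpler tool in that case.

```shell
# CR holds a single carriage return byte (0x0d), produced by printf
# so that no control character has to be typed into the script.
CR=$(printf '\r')

# sed_strip_cr: read stdin, remove one CR at the end of each line.
# A CR in the middle of a line is left alone.
sed_strip_cr() {
    sed "s/${CR}\$//"
}
```

Usage would look like "sed_strip_cr < winfile.txt > unixfile.txt".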

> A web search told me that ^M is equivalent to the linux \r, but not how to
> specify it for sed or emacs.

Let's try to complete that information...
^M is hex 0x0d, "CR", "Carriage Return", expressed in "C" format string as \r
^J is hex 0x0a, "LF", "Line Feed", expressed in "C" format string as \n, also known as newline

WINDOWS outputs "\r\n", aka CR-LF, aka 0x0d,0x0a as its end of line marker.
UNIX outputs "\n", aka LF or newline as its end of line marker.
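
To find out which of these a file actually uses, counting the two bytes is
a rough but quick check. The sketch below is a heuristic, not a real
parser: count_cr, count_lf and eol_style are hypothetical names, and it
assumes that equal CR and LF counts mean the two always appear as a pair.

```shell
# count_cr / count_lf: count CR (0x0d) and LF (0x0a) bytes on stdin.
count_cr() { tr -cd '\r' | wc -c | tr -d ' '; }
count_lf() { tr -cd '\n' | wc -c | tr -d ' '; }

# eol_style: guess the line-ending convention of the file named in $1.
eol_style() {
    cr=$(count_cr < "$1")
    lf=$(count_lf < "$1")
    if [ "$cr" -eq 0 ] && [ "$lf" -eq 0 ]; then
        echo "none"     # no line endings at all
    elif [ "$cr" -eq 0 ]; then
        echo "LF"       # UNIX style
    elif [ "$lf" -eq 0 ]; then
        echo "CR"       # bare carriage returns
    elif [ "$cr" -eq "$lf" ]; then
        echo "CRLF"     # assumption: every CR is paired with an LF
    else
        echo "mixed"
    fi
}
```

Running "eol_style winfile.txt" on a bare-^M export would print CR, and on
a typical Windows text file CRLF. For a closer look at the raw bytes,
"od -c winfile.txt | head" shows them as \r and \n.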

> Pointers needed.

struct foo {
        int* where;
} pointers['\n'];

There, 10 pointers :-)

> Rich
-- 
Rod Grimes                                                 rgrimes at freebsd.org


