[PLUG] sort option for 'natural' sequence

Reid nrwahl at protonmail.com
Mon Mar 30 23:42:00 UTC 2020


You can use a variation of the "Decorate, Sort, Undecorate" (DSU) idiom (as documented in `info sort`).

$ awk -F, '{print length($1), $0}' sample.dat | sort -n | cut -f2- -d' '
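
The one-liner above decorates each line with the length of the first field. As a minimal sketch of the same idiom, you could instead decorate with the numeric value of that field, so the sort key is the number itself. The function name sort_by_first_field is made up for illustration, and the sketch assumes the data lines contain no tab characters, since a tab separates the key from the line.

```shell
# Hypothetical helper (not from the original post): print a file sorted on
# the numeric value of its first comma-separated field.
#
# Decorate:   awk copies field 1, strips everything that is not a digit
#             (here, the surrounding quotes), and prepends the result
#             as a tab-separated key.
# Sort:       sort compares only that key, numerically.
# Undecorate: cut drops the key and keeps the original line.
sort_by_first_field() {
    awk -F, '{ key = $1; gsub(/[^0-9]/, "", key); print key "\t" $0 }' "$1" |
        sort -t "$(printf '\t')" -k1,1n |
        cut -f2-
}
```

Called as `sort_by_first_field sample.dat`, this should put the line beginning with '8' first and the four-digit IDs last.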



‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Monday, March 30, 2020 4:30 PM, Rich Shepard <rshepard at appl-ecosys.com> wrote:

> sample.dat:
>
> '648',17,'2011-07-11','Insecta','Plecoptera','Chloroperlidae''Suwallia'
> '652',17,'2011-07-11','Insecta','Plecoptera','Pteronarcidae''Pteronarcella'
> '895',17,'2010-09-13','Insecta','Ephemeroptera','Baetidae''Baetis'
> '899',17,'2010-09-13','Insecta','Diptera','Psychodidae''Pericoma'
> '901',17,'2010-09-13','Insecta','Coleoptera','Hydrophilidae''Cymbiodyta'
> '907',17,'2010-09-13','Insecta','Trichoptera','Glossosomatidae''Glossosoma'
> '909',17,'2010-09-13','Insecta','Diptera','Chironomidae''Cladotanytarsus'
> '914',17,'2010-09-13','Insecta','Plecoptera','Nemouridae''Zapada'
> '918',17,'2010-09-13','Insecta','Trichoptera','Hydropsychidae''Hydropsyche'
> '919',17,'2010-09-13','Insecta','Coleoptera','Dytiscidae''Hydroporus'
> '920',17,'2010-09-13','Insecta','Trichoptera','Lepidostomatidae''Lepidostoma'
> '922',17,'2010-09-13','Insecta','Coleoptera','Elmidae''Narpus'
> '1120',17,'2006-06-27','Insecta','Diptera','Chironomidae''Polypedilum'
> '134',41,'2004-06-07','Insecta','Plecoptera','Nemouridae''Amphinemura'
> '135',3,'2004-06-07','Insecta','Ephemeroptera','Baetidae''Baetus'
> '137',41,'2004-06-07','Insecta','Ephemeroptera','Baetidae''Baetis'
> '138',3,'2004-06-07','Insecta','Coleoptera','Hydrophilidae''Berosus'
> '139',3,'2004-06-07','Insecta','Plecoptera','Chloroperlidae''Sweltsa'
> '141',41,'2004-06-07','Insecta','Plecoptera','Chloroperlidae''Suwallia'
> '145',3,'2004-06-07','Insecta','Diptera','Simulidae''Prosimulium'
> '148',3,'2004-06-07','Annelida','Oligochaeta','Lumbricidae''Ilyodrilus/Tubifex'
> '151',3,'2006-06-15','Insecta','Diptera','Chironomidae''Eukiefferiella'
> '154',41,'2004-06-07','Insecta','Coleoptera','Dytiscidae''Hydrovatus'
> '155',3,'2004-06-07','Insecta','Coleoptera','Dytiscidae''Hydrovatus'
> '216',SC,'2005-07-13','Insecta','Diptera','Ephydridae'''
> '1126',17,'2006-06-27','Insecta','Ephemeroptera','Baetidae''Baetis'
> '1128',17,'2006-06-27','Insecta','Trichoptera','Brachycentridae''Brachycentrus'
> '1129',17,'2006-06-27','Insecta','Diptera','Chironomidae''Tvetenia'
> '2060',11,'2012-07-11','Insecta','Coleoptera','Elmidae''Narpus'
> '2061',11,'2012-07-11','Insecta','Diptera','Chironomidae''Natarsia'
> '2062',11,'2012-07-11','Insecta','Trichoptera','Hydroptilidae''Ochrotrichia'
> '8',11,'2000-07-18','Insecta','Ephemeroptera','Leptophlebiidae''Paraleptophlebia'
> '11',11,'2000-07-18','Insecta','Trichoptera','Glossosomatidae''Agapetus'
> '12',11,'2000-07-18','Insecta','Diptera','Tipulidae''Tipula'
> '592',17,'2011-07-11','Annelida','Oligochaeta','Tubificidae'''
>
> I've tried the various sort options (-g, -h, -n, -k) and they all sort
> character-by-character which puts the line beginning with '8' between '652'
> and '895'. I know how to do this for cidr addresses but I've no local
> reference that works for only the first field.
>
> Clue wanted,
>
> Rich
>
> PLUG mailing list
> PLUG at pdxlinux.org
> http://lists.pdxlinux.org/mailman/listinfo/plug




