[PLUG] Counting Files

wes plug at the-wes.com
Tue Aug 17 02:22:43 UTC 2021


To get the count of unique callsigns, you can just feed this same command
into wc -l.

find Processed -type f -printf '%f\n' | sed "s/@.*//" | sort | uniq -c | wc -l
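
One thing to watch: uniq only collapses lines that are adjacent, and find
does not return names in sorted order, so the list needs to go through sort
before it is counted. Here is a rough sketch you can try in a scratch
directory first. The sample filenames (including the second W7ORE file) are
made up, modeled on your examples, and -printf assumes GNU find.

```shell
#!/bin/sh
# Hypothetical sample data in a scratch directory; filenames are
# modeled on the examples in this thread, not real files.
tmp=$(mktemp -d)
mkdir "$tmp/Processed"
for f in W7ORE@K-0496-20210526.txt WA7SKG@K-0497-20210714.txt \
         W7ORE@K-0497-20210714.txt N8QBX@K-4386-20210725.txt; do
    : > "$tmp/Processed/$f"
done

# Per-callsign report: strip everything from the @ onward, sort so that
# duplicates are adjacent, then let uniq -c count each run.
report=$(find "$tmp/Processed" -type f -printf '%f\n' | sed 's/@.*//' | sort | uniq -c)
printf '%s\n' "$report"

# Number of unique callsigns: sort -u drops duplicates, wc -l counts
# the lines that are left.
count=$(find "$tmp/Processed" -type f -printf '%f\n' | sed 's/@.*//' | sort -u | wc -l)
echo "unique callsigns: $count"

# Clean up the scratch directory.
rm -rf "$tmp"
```

Against the real folder, drop the setup and cleanup lines and replace
"$tmp/Processed" with Processed.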

-wes


On Mon, Aug 16, 2021 at 7:21 PM wes <plug at the-wes.com> wrote:

> if the @ is consistent with all the files, that makes it relatively easy.
>
> find Processed -type f -printf '%f\n' | sed "s/@.*//" | sort | uniq -c
>
> -wes
>
> On Mon, Aug 16, 2021 at 7:17 PM Michael Barnes <barnmichael at gmail.com>
> wrote:
>
>> On Mon, Aug 16, 2021 at 5:29 PM David Fleck <dcfleck at protonmail.ch>
>> wrote:
>>
>> > As Wes said, an example or two would help greatly.
>> >
>> > --- David Fleck
>> >
>> > ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
>> >
>> > On Monday, August 16th, 2021 at 7:17 PM, wes <plug at the-wes.com> wrote:
>> >
>> > > are firstnames and lastnames always separated by the same character
>> > > in each filename?
>> > >
>> > > are the names separated from the rest of the info in the filename the
>> > > same way for each file?
>> > >
>> > > are you doing this once, or will this be a repeating task that would
>> > > be handy to automate?
>> > >
>> > > would you be able to provide a few sample filenames, perhaps with the
>> > > personal info obfuscated?
>> > >
>> > > generally, the way I would approach this is to pare the filenames
>> > > down to the people's names, then sort that list and run uniq against
>> > > it. uniq -c will provide a count of how many times a given string
>> > > appears in the sorted input. if I'm doing this once, I would generate
>> > > a text file containing the list of filenames I will be working with,
>> > > for example:
>> > >
>> > > find Processed -type f > processed-files.txt
>> > >
>> > > then use a text editor to pare down the entries as described above,
>> > > using find and replace functions to remove the extra data, so only
>> > > the people's names remain. then simply sort that file, run uniq -c on
>> > > the result, and you're done. I personally use vi for this, but just
>> > > about any editor will do. I like this approach for a number of
>> > > reasons, not the least of which is that I can spot-check random
>> > > samples after each editing step to try to spot unexpected results.
>> > >
>> > > if you want to automate this, it may be a little more complicated,
>> > > and the answers to my initial questions become important. if you can
>> > > provide a little more context, I will try to help further.
>> > >
>> > > -wes
>> > >
>> > > On Mon, Aug 16, 2021 at 5:01 PM Michael Barnes
>> > > <barnmichael at gmail.com> wrote:
>> > >
>> > > > Here's a fun trivia task. For an activity I am involved in, I get
>> > > > files from members to process. The filename starts with the
>> > > > member's name and has other info to identify the file. After
>> > > > processing, the file goes in the ./Processed folder. There are
>> > > > thousands of files now in that folder. Right now, I'm looking for
>> > > > a couple basic pieces of information. First, I want to know how
>> > > > many unique names I have in the list. Second, I'd like a list of
>> > > > names and how many files go with each name.
>> > > >
>> > > > I'm sure this is trivial, but my mind is blanking out on it. A
>> > > > couple simple examples would be nice. Non-answers, like "easy to
>> > > > do with 'xxx'" or references to man pages or George's Book, etc.
>> > > > are not helpful right now.
>> > > >
>> > > > Thanks,
>> > > >
>> > > > Michael
>> >
>>
>> Actually, they are callsigns instead of names. A couple of examples:
>>
>> W7ORE@K-0496-20210526.txt
>> WA7SKG@K-0497-20210714.txt
>> N8QBX@K-4386-20210725.txt
>>
>> I would like a simple count of the unique callsigns on a random basis and
>> possibly an occasional report listing each callsign and how many files are
>> in the folder for each.
>>
>> Michael
>>
>


