Counting

Counting the occurrences of a word/phrase

I wanted to know how many times a searched-for word/phrase occurred in a block of text. Since I was going to have to split the text anyway, why not split on the word/phrase ($find) and subtract one? split returns the pieces between the matches, so the number of pieces is always one more than the number of matches.
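    # Case-insensitive whole-word match against $_. \Q...\E keeps regex
    # metacharacters in $find literal; the -1 limit keeps trailing empty
    # pieces, so a match at the very end of the text is still counted.
    my @words = split(/\b\Q$find\E\b/i, $_, -1);
    $count += @words - 1;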
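To check the split-and-subtract idea against a direct count, here is a minimal sketch. The helper names count_by_split and count_by_match are made up for illustration, and $find is assumed to be a literal word or phrase rather than a regular expression.

```perl
use strict;
use warnings;

# Hypothetical helper: count matches by splitting and subtracting one.
sub count_by_split {
    my ($text, $find) = @_;
    return 0 if $text eq '';    # split on an empty string returns no pieces
    # The -1 limit keeps trailing empty pieces, so a match at the end counts.
    my @pieces = split /\b\Q$find\E\b/i, $text, -1;
    return @pieces - 1;
}

# Hypothetical helper: count matches directly with a global match.
sub count_by_match {
    my ($text, $find) = @_;
    # Assigning the list of matches to () and then to a scalar yields the count.
    my $n = () = $text =~ /\b\Q$find\E\b/gi;
    return $n;
}

my $text = "The cat sat. The other cat ran. Cat";
print count_by_split($text, "cat"), "\n";   # prints 3
print count_by_match($text, "cat"), "\n";   # prints 3
```

The direct match skips building the list of pieces, so it may be the better choice when the pieces themselves are not needed.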

Links

Word counting with \W+ split: To count the words, I need to break each line up by words, and then add the number of words into the counter, not the number of lines. Just a few tweaks will do it.
    #!/usr/bin/perl
    while (<>) {
        @words = split(/\W+/);
        $count{$ARGV} += @words;
    }
    foreach $file (sort by_count keys %count) {
        print "$file has $count{$file} words\n";
    }
    sub by_count {
        $count{$b} <=> $count{$a};
    }
The list @words is created for each line by splitting the line on the regular expression /\W+/. This regular expression matches runs of non-word characters, that is, anything other than letters, digits, and underscores. The split operator drags this regular expression through the string (in this case, the contents of $_, because I didn't specify anything else). Every place the regular expression matches gets ripped out of the string as a delimiter -- everything else becomes an element of the list to be returned.
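One detail of the /\W+/ split is worth seeing in action: a line that starts with a delimiter yields an empty first element, which inflates the word count by one. A minimal sketch, with a made-up sample line and a hypothetical helper name word_fields:

```perl
use strict;
use warnings;

# Hypothetical helper: split a line on runs of non-word characters.
sub word_fields {
    my ($line) = @_;
    return split /\W+/, $line;
}

my $line   = '"Hello, world!" she said.';
my @fields = word_fields($line);

# The line starts with a quote, so the first field is an empty string.
# Trailing empty fields are dropped by split, leading ones are not.
my @words = grep { length } @fields;

print scalar(@fields), " fields, ", scalar(@words), " words\n";
# prints "5 fields, 4 words"
```

Filtering with grep { length } is one way to discard the empty field before counting.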

Once I have a list in @words, I can add the length of the list to the count. The name @words in a scalar context is the length of array @words. This will keep the elements of %count as a running total of words now, not lines.
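The scalar-context behavior can be tried out on its own. A minimal sketch, where the hash key "example" is a made-up stand-in for the file name in $ARGV:

```perl
use strict;
use warnings;

my @words = split /\W+/, "one two three";

my $n        = @words;     # scalar assignment: the number of elements
my ($first)  = @words;     # list assignment: the first element
my $last_idx = $#words;    # index of the last element, one less than the count

my %count;
$count{example} += @words; # += evaluates @words in scalar context, adding the count

print "$n $first $last_idx $count{example}\n";   # prints "3 one 2 3"
```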

http://www.stonehenge.com/merlyn/UnixReview/col02.html

Word counting on \s+ split: Given a text, return a list of words and word counts
    while (<>) {    # assumes input from STDIN or files named on the command line
        # Good idea to always do this. If the line starts with blanks,
        # then the first element of the array after splitting would be empty.
        $_ =~ s/^\s+//;
        @words_in_line = split(/\s+/, $_);   # splits the line into an array of words
        for ($count = 0; $count <= $#words_in_line; ++$count) {
            $word_count{$words_in_line[$count]}++;
        }
    }
    while (($key, $val) = each %word_count) {
        print "$key $val\n";
    }
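The same tally can be written more compactly. This is a sketch rather than the original author's method: it relies on the special split ' ' form, which skips leading whitespace by itself, so the separate s/^\s+// step is not needed. The helper name count_words and the sample lines are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: tally whitespace-separated words across lines.
sub count_words {
    my @lines = @_;
    my %word_count;
    for my $line (@lines) {
        # split ' ' ignores leading whitespace, so no empty first field appears.
        $word_count{$_}++ for split ' ', $line;
    }
    return \%word_count;
}

my $counts = count_words("  the cat\n", "the dog\n");
for my $word (sort keys %$counts) {
    print "$word $counts->{$word}\n";
}
```

Sorting the keys gives a stable output order, which each does not guarantee.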

http://www.cs.jhu.edu/~hajic/perlguide.txt

Line, word and character count:
    # exercise4.2.pl: a program that reads lines from standard input until
    #                 end-of-file, then prints the number of lines, words
    #                 and characters in the input, followed by the input
    #                 in reverse order (both lines and characters).
    # usage: exercise4.2.pl
    # 2000-03-03 zavrel@uia.ua.ac.be

    # initialize a line buffer
    @lines = ();
    # read lines of input
    while (defined($line = <>)) {
        chomp $line;                  # this means that newlines will not be counted
        $nlines++;                    # counts the lines
        @words = split ' ', $line;    # ' ' skips leading whitespace, avoiding an empty first word
        $nwords += @words;            # counts the words
        @chars = split //, $line;
        $nchars += @chars;            # counts the characters
        # reverse the characters in the line
        # and push this onto a stack
        @chars = reverse @chars;
        $string = join "", @chars;
        push @lines, $string;
    }
    print "lines: $nlines, words: $nwords, characters $nchars\n";
    print "reversed:\n";
    # by popping lines off the stack they come out in the
    # reverse order; defined() keeps an empty line or "0" from ending the loop early
    while (defined($line = pop @lines)) {
        print "$line\n";
    }

http://lcg-www.uia.ac.be/~erikt/perl/so04.html
