bash: Count how many times each line occurs

Say we have a file or data stream with many duplicate rows, and we want to find how many times each one is repeated, and perhaps which one is repeated most often. Here is an elegant one-liner that does that.

sort input.file | uniq -c | sort -n -r

Explanation:
The first sort groups identical records together, which matters because uniq only compares adjacent lines. Then uniq -c collapses each group into a single line prefixed with the number of times that record occurred. Finally, sort -n -r sorts the output of uniq -c numerically in descending order, listing the records from most frequent to least frequent.
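To see why the first sort matters, here is a small sketch that runs uniq -c on unsorted and then sorted input. The three-line input fed in by printf is made up for illustration and is not part of the example file used later.

```shell
# uniq compares each line only with the one directly before it,
# so unsorted input yields one count per run of adjacent duplicates.
printf 'one\ntwo\none\n' | uniq -c          # three output lines: "one" shows up twice

# Sorting first makes identical lines adjacent, so each value is counted once.
printf 'one\ntwo\none\n' | sort | uniq -c   # two output lines: one (count 2), two (count 1)
```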

Let's see an example. Let's say our data file contains the following.

unixite@sandbox:~$ cat test.txt
one
one
one
two
two
one
five
one
one
two
five
two
one
one
one
two
two
one
five
one
one
two
three
four
two
five
unixite@sandbox:~$ sort test.txt  | uniq -c
      4 five
      1 four
     12 one
      1 three
      8 two
unixite@sandbox:~$ sort test.txt  | uniq -c | sort -n -r
     12 one
      8 two
      4 five
      1 three
      1 four
unixite@sandbox:~$
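If only the most frequent records are of interest, the same pipeline can be cut short with head. The sketch below wraps it in a small function; the name top_lines and the default of 3 results are illustrative choices, not part of the original one-liner.

```shell
# top_lines FILE [N]: print the N most frequent lines of FILE with their counts.
# N defaults to 3 when omitted (an arbitrary default chosen for this sketch).
top_lines() {
    sort "$1" | uniq -c | sort -n -r | head -n "${2:-3}"
}
```

With the sample file above, top_lines test.txt 2 should print the lines for one (12) and two (8).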