Learn In Detail: Linux: How to find duplicate lines count in a file from terminal.

Sunday, 6 November 2016

Linux has many commands for processing and analyzing files. In this post I explain a simple command pipeline that prints the number of times each line occurs in a file.

So here is the command:

terminal$ sort yourfilename.txt | uniq -c

Here yourfilename.txt is only an example; substitute the name of your own file.
Suppose the contents of yourfilename.txt are:

line1

line2

line3

line1

line3

Output:

2 line1

1 line2

2 line3

Explanation:

The sort command puts identical lines next to each other, and its output is piped to uniq. This step matters because uniq only compares adjacent lines, so its input has to be sorted first (this is easy to forget, so keep it in mind). The -c option makes uniq prefix each distinct line with the number of times it occurs.
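To see why the sorting step matters, here is a small sketch that rebuilds the sample file and runs uniq with and without sort. The temporary file and the variable names (tmpfile, unsorted, counted, duplicates, ranked) are made up for this illustration and are not part of the original command. The last two pipelines are optional variations: uniq -d lists only the lines that occur more than once, and a second sort -rn orders the counts from highest to lowest.

```shell
# Recreate the sample file from this post in a temporary location (illustrative only).
tmpfile=$(mktemp)
printf 'line1\nline2\nline3\nline1\nline3\n' > "$tmpfile"

# Without sort: uniq only merges adjacent duplicates, so no lines are merged here
# and each of the five lines is reported with a count of 1.
unsorted=$(uniq -c "$tmpfile")

# With sort: identical lines become adjacent, so they are counted together
# (line1 twice, line2 once, line3 twice).
counted=$(sort "$tmpfile" | uniq -c)

# Variation: -d prints one copy of each line that occurs more than once.
duplicates=$(sort "$tmpfile" | uniq -d)

# Variation: sort the counts numerically in reverse, most frequent lines first.
ranked=$(sort "$tmpfile" | uniq -c | sort -rn)

printf 'Without sort:\n%s\n\n' "$unsorted"
printf 'With sort:\n%s\n\n' "$counted"
printf 'Duplicated lines only:\n%s\n\n' "$duplicates"
printf 'Ranked by count:\n%s\n' "$ranked"

# Clean up the temporary file.
rm -f "$tmpfile"
```

Note that uniq pads the count with leading spaces, so the real terminal output is indented a little compared with the simplified listing shown earlier.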
