CD-HIT is a widely used program for clustering and comparing protein or
nucleotide sequences.

Homepage:
https://github.com/weizhongli/cdhit
