CD-HIT is a widely used program for clustering and comparing protein or
nucleotide sequences.

Homepage:
http://weizhong-lab.ucsd.edu/cd-hit/
