Showing posts from June, 2013

Finding similar file names

This is a story about the creation of a nice script... A friend of mine needed a script to find similar files without reading their content. She suggested comparing the file attributes, like the file size and some other attributes. So, I gave her this script. It works great when there are not many files with the same size: in that case, you can safely say that files are similar if they have the same attributes. But in some cases this can be a real disaster, because there may be lots of files that share the same attributes and are not actually similar. So, I came up with a new idea. I suggested grouping the files by size and comparing their names: if the shorter name is contained in the longer name, then the files are similar. But not so fast. We need some rules, because we can have two files like 'A happy file name.txt' and 'a.txt'! They are not similar, even if the shorter name 'a' is found in the longer name 'A happy fil