Metadata-Version: 2.1
Name: duplicate-image-finder
Version: 0.2.10
Summary: duplicate image finder helps you find duplicate or similar images as well as delete them.
Home-page: https://github.com/LordAmit/duplicate_image_finder
Keywords: image,similar image,duplicate image,imagehash
Author: Amit
Author-email: lordamit@gmail.com
Requires-Python: >=3.9,<3.10
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Requires-Dist: Flask (>=2.1.2,<3.0.0)
Requires-Dist: Flask-Cors (>=3.0.10,<4.0.0)
Requires-Dist: ImageHash (>=4.2.1,<5.0.0)
Requires-Dist: Jinja2 (>=3.1.2,<4.0.0)
Requires-Dist: Pillow (>=9.1.1,<10.0.0)
Requires-Dist: more-itertools (>=8.13.0,<9.0.0)
Requires-Dist: pandas (>=1.4.2,<2.0.0)
Requires-Dist: pathlib (>=1.0.1,<2.0.0)
Requires-Dist: python-magic-bin (==0.4.14)
Requires-Dist: termcolor (>=1.1.0,<2.0.0)
Requires-Dist: types-termcolor (>=1.1.4,<2.0.0)
Project-URL: Repository, https://github.com/LordAmit/duplicate_image_finder
Description-Content-Type: text/markdown

# Duplicate Image Finder

Duplicate Image Finder uses image hashing to find similar or duplicate images in your local storage. All you have to do is:

1. install,
2. run (*this sets up the database and its table if no configuration is provided*),
3. run, specifying which directory to scan for images, and finally
4. run, asking it to show the duplicate/similar images.

Please note that this is a prototype; use it at your own discretion.

For example:

```sh
# 1. installing
python3.9 -m pip install --user duplicate-image-finder

# 2. show help
duplicate-image-finder --help

# 3. add directory images and calculate hashes using 4 threads
duplicate-image-finder --add <directory> --parallel 4

# 4. show the duplicate/similar images found in your browser
duplicate-image-finder --show
```

Running step 4 opens a browser that shows the duplicate/similar images. If you click delete on an image, it is moved to the `.Trash` folder.
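
Step 3 in the example above hashes images with several threads. As a rough illustration of that idea (not the project's actual code), the Python sketch below hashes every file in a directory with a pool of 4 worker threads. The names `hash_file`, `hash_directory`, and `demo` are hypothetical, and SHA-256 over the raw bytes is only a stand-in so the sketch runs without third-party packages; unlike a perceptual hash, it catches byte-identical files only.

```python
import hashlib
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def hash_file(path: Path) -> str:
    """Stand-in hash (assumption): SHA-256 of the raw bytes, exact duplicates only."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def hash_directory(directory: Path, workers: int = 4) -> dict:
    """Hash every regular file under `directory` using a thread pool."""
    files = sorted(p for p in directory.rglob("*") if p.is_file())
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so zip pairs each path with its hash.
        return dict(zip(files, pool.map(hash_file, files)))


def demo() -> int:
    """Hash three throwaway files (two identical) and count the distinct hashes."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "a.jpg").write_bytes(b"same bytes")
        (root / "b.jpg").write_bytes(b"same bytes")
        (root / "c.jpg").write_bytes(b"different bytes")
        hashes = hash_directory(root, workers=4)
        return len(set(hashes.values()))
```

Threads suit this kind of work because reading files is I/O-bound; whether the real tool uses threads or processes behind `--parallel` is not something this sketch asserts.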


## Requirements

There are lots, but all of them are installed automatically as dependencies as long as you are using `python3.9`. Unfortunately, some of the dependencies are not yet available for `python3.10`, so we are stuck on 3.9 for now.

## Poetry

Installing dependencies

```sh
poetry install
```

Running

```sh
poetry run python duplicate_image_finder/duplicate_finder.py --show
```

Testing

```sh
poetry run pytest
```
etc.

The source code of this duplicate image finder is inspired by, and partially copied from, https://github.com/philipbl/duplicate-images.git.

Significant changes from the referenced version are:

1. moved from `mongodb` to `sqlite`
2. is probably better at finding similar images (or perhaps I misunderstood the previous code)
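
To make the two points above concrete, here is a minimal sketch of keeping hashes in `sqlite` and treating two images as similar when their hashes differ by only a few bits. Everything in it is an assumption for illustration: the `images` table, the `hamming_distance` and `find_similar` helpers, the threshold of 4 bits, and the sample hash values are hypothetical and are not taken from this project's schema or code.

```python
import sqlite3


def hamming_distance(hash_a: str, hash_b: str) -> int:
    """Count the differing bits between two equal-length hex hash strings."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return bin(int(hash_a, 16) ^ int(hash_b, 16)).count("1")


def find_similar(conn: sqlite3.Connection, max_distance: int = 4) -> list:
    """Return (path_a, path_b) pairs whose hashes differ by at most max_distance bits."""
    rows = conn.execute("SELECT path, hash FROM images ORDER BY path").fetchall()
    pairs = []
    for i, (path_a, hash_a) in enumerate(rows):
        for path_b, hash_b in rows[i + 1:]:
            if hamming_distance(hash_a, hash_b) <= max_distance:
                pairs.append((path_a, path_b))
    return pairs


# Hypothetical schema and made-up 64-bit hashes, held in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE images (path TEXT PRIMARY KEY, hash TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO images (path, hash) VALUES (?, ?)",
    [
        ("a.jpg", "ffd8e0c0c0e0f0f8"),
        ("b.jpg", "ffd8e0c0c0e0f0f9"),  # differs from a.jpg by one bit
        ("c.jpg", "0000000000000000"),
    ],
)
similar_pairs = find_similar(conn)
```

The pairwise loop is quadratic in the number of images, which is fine for a sketch but would need a smarter index for a large collection.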

Concepts/Technologies I learned/tried to learn while doing this:

1. `poetry` for dependency management
2. `pytest` for unit tests
3. `pysqlite3` for the database
4. `concurrency` for performance
5. `imagehash` for perceptual image hashing to find similar images
6. grouping CLI arguments in Python (mutually exclusive, etc.) using `argparse`
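
As an illustration of the last item, grouping mutually exclusive CLI arguments with the standard-library `argparse` module might look like the sketch below. The flag names mirror the usage example in this README, but whether `--add` and `--show` are actually mutually exclusive in the tool, the default of `--parallel`, and the `build_parser` helper itself are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser in which exactly one of --add / --show must be given."""
    parser = argparse.ArgumentParser(prog="duplicate-image-finder")

    # Options in this group cannot be combined; required=True demands one of them.
    actions = parser.add_mutually_exclusive_group(required=True)
    actions.add_argument("--add", metavar="DIRECTORY",
                         help="directory to scan for images")
    actions.add_argument("--show", action="store_true",
                         help="show duplicate/similar images")

    # A regular option that lives outside the exclusive group.
    parser.add_argument("--parallel", type=int, default=1, metavar="N",
                        help="number of threads (default of 1 is assumed here)")
    return parser


args = build_parser().parse_args(["--add", "photos", "--parallel", "4"])
```

With this parser, passing both `--add photos` and `--show` makes `argparse` print a usage error and exit, so the check does not have to be written by hand.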

