Metadata-Version: 2.1
Name: ocrpy
Version: 0.3.5
Summary: Unified interface to Google Vision, AWS Textract, Azure, Tesseract OCR, and EasyOCR tools.
Project-URL: Source, https://github.com/maxent-ai/ocrpy
Author-email: Maxentlabs <maxentlabsai@gmail.com>
License: MIT
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Natural Language :: English
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Requires-Python: >=3.7
Requires-Dist: attrs==21.4.0
Requires-Dist: beautifulsoup4==4.11.1
Requires-Dist: beautifulsoup4==4.9.1
Requires-Dist: boto3==1.19.7
Requires-Dist: google-cloud-vision==1.0.0
Requires-Dist: numpy==1.21.1
Requires-Dist: opencv-python==4.1.2.30
Requires-Dist: pdf2image==1.14.0
Requires-Dist: pytesseract==0.3.6
Description-Content-Type: text/markdown

# ocrpy
[![Downloads](https://static.pepy.tech/personalized-badge/ocrpy?period=total&units=abbreviation&left_color=black&right_color=blue&left_text=Downloads)](https://pepy.tech/project/ocrpy)

A unified interface to the Google Vision, AWS Textract, Azure, and Tesseract OCR tools.


### Sample Usage

```python
from ocrpy import TextPipeline

SOURCE_DIR = 'source-dir-to-read-data'
DESTINATION_DIR = 'destination-path-to-write'

# Optional: when using AWS or GCP for OCR, pass the path to an env file.
# The AWS env file needs to contain the `region_name`, `aws_access_key_id`, and `aws_secret_access_key` variables.
AWS_ENV_FILE = 'path-to-aws-credentials-env-file'
CREDENTIALS = {'aws': AWS_ENV_FILE}
PARSER_TYPE = 'aws'

ocr_pipe = TextPipeline(SOURCE_DIR, DESTINATION_DIR, PARSER_TYPE, CREDENTIALS)
ocr_pipe.process_data()
```
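The sample above expects an env file holding the `region_name`, `aws_access_key_id`, and `aws_secret_access_key` variables. As a rough sketch, the snippet below writes a placeholder file and checks that all three variables are present before the path is handed to `TextPipeline`. It assumes a plain `KEY=VALUE` layout (dotenv style), which this README does not confirm; the helper names `write_env_file`, `read_env_file`, and `missing_keys` are hypothetical and not part of ocrpy, and the values are placeholders only.

```python
import tempfile
from pathlib import Path

# Assumption: the env file uses simple KEY=VALUE lines (dotenv style).
# The values below are placeholders; substitute real credentials and
# keep the file out of version control.
ENV_TEMPLATE = """\
region_name=us-east-1
aws_access_key_id=YOUR_ACCESS_KEY_ID
aws_secret_access_key=YOUR_SECRET_ACCESS_KEY
"""

# Variable names taken from the comment in the sample usage above.
REQUIRED_KEYS = {"region_name", "aws_access_key_id", "aws_secret_access_key"}


def write_env_file(path):
    """Write the placeholder template to `path` and return it as a string."""
    path = Path(path)
    path.write_text(ENV_TEMPLATE)
    return str(path)


def read_env_file(path):
    """Parse KEY=VALUE lines, skipping blank lines and `#` comments."""
    values = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(path):
    """Return the required variable names absent from the env file, sorted."""
    return sorted(REQUIRED_KEYS - set(read_env_file(path)))


if __name__ == "__main__":
    env_path = write_env_file(Path(tempfile.mkdtemp()) / "aws.env")
    print("env file:", env_path)
    print("missing variables:", missing_keys(env_path))
```

If `missing_keys` returns an empty list, the resulting path can be used as `AWS_ENV_FILE` in the sample usage.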

