Metadata-Version: 2.1
Name: pyresumize
Version: 0.1.9
Summary: Resume Parser Written in Python3
Author-email: Gokul Kartha <kartha.gokul@gmail.com>
Requires-Python: >=3.7
Description-Content-Type: text/markdown
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Requires-Dist: spacy==3.5.1
Requires-Dist: pdfminer.six==20221105
Requires-Dist: python.docx
Requires-Dist: pandas==1.5.3
Requires-Dist: nltk==3.8.1
Requires-Dist: click
Requires-Dist: pyspark>=3.0.0 ; extra == "spark"
Requires-Dist: bandit[toml]==1.7.5 ; extra == "test"
Requires-Dist: black==23.1.0 ; extra == "test"
Requires-Dist: check-manifest==0.49 ; extra == "test"
Requires-Dist: flake8-bugbear==23.3.12 ; extra == "test"
Requires-Dist: flake8-docstrings ; extra == "test"
Requires-Dist: flake8-formatter_junit_xml ; extra == "test"
Requires-Dist: flake8 ; extra == "test"
Requires-Dist: flake8-pyproject ; extra == "test"
Requires-Dist: pre-commit==3.2.0 ; extra == "test"
Requires-Dist: pylint==2.17.0 ; extra == "test"
Requires-Dist: pylint_junit ; extra == "test"
Requires-Dist: pytest-cov==3.0.0 ; extra == "test"
Requires-Dist: pytest-mock<3.10.1 ; extra == "test"
Requires-Dist: pytest-runner ; extra == "test"
Requires-Dist: pytest==7.2.2 ; extra == "test"
Requires-Dist: pytest-github-actions-annotate-failures ; extra == "test"
Requires-Dist: shellcheck-py==0.9.0.2 ; extra == "test"
Requires-Dist: psutil ; extra == "test"
Requires-Dist: twine ; extra == "test"
Project-URL: Documentation, https://github.com/karthagokul/pyresumize#readme
Project-URL: Source, https://github.com/karthagokul/pyresumize
Project-URL: Tracker, https://github.com/karthagokul/pyresumize/issues
Provides-Extra: spark
Provides-Extra: test


  

  

![](https://github.com/karthagokul/pyresumize/blob/main/logo.png?raw=true)

# Introduction
  

pyresumize is a Python module that extracts useful information from a resume and generates a JSON string from it. It currently supports only PDF and DOCX files as input.


### Todo

* Proper Logging to be added
* Support for other formats
* Performance Improvements 
* Bug Fixes
* Custom configuration of input data

### Note

The skills, employers and education keywords are supplied to the engine as .csv files; a reference set is available in the data folder.
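
As a rough illustration of how a keyword file like this can drive extraction, the sketch below reads a one-column CSV of skills and reports which ones occur in a piece of text. The column layout, the `load_keywords` and `find_keywords` helpers and the sample data are assumptions made for the example; check the files in the data folder for the layout pyresumize actually expects.

```python
import csv
import io
import re


def load_keywords(csv_file):
    """Read the first column of every non-empty row as a lowercase keyword."""
    reader = csv.reader(csv_file)
    return [row[0].strip().lower() for row in reader if row and row[0].strip()]


def find_keywords(text, keywords):
    """Return the keywords that appear in the text as whole words, in list order."""
    lowered = text.lower()
    found = []
    for keyword in keywords:
        # \b boundaries fail for keywords such as "c++", so use lookarounds instead
        pattern = r"(?<!\w)" + re.escape(keyword) + r"(?!\w)"
        if re.search(pattern, lowered):
            found.append(keyword)
    return found


# Hypothetical CSV content standing in for a skills keyword file
sample_csv = io.StringIO("Python\nC++\nSQL\nKubernetes\n")
skills = load_keywords(sample_csv)
matches = find_keywords("Built ETL jobs in Python and SQL; some C++ tooling.", skills)
print(matches)
```

Matching on whole words keeps a keyword such as "java" from firing on "javascript".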

### Todo

* Log integration
* Custom model

### Design  

The design lets developers create their own parsing rules and set them on the parser, which adds flexibility.

The following interfaces are currently exposed; developers can override the process method of any of them to apply custom processing rules.

- EmployerBaseInterface

Searches the resume for company information and identifies the employers.

- EducationBaseInterface

Processes education details together with universities. The results are stored in a map.

- EmailBaseInterface

Checks the resume for email addresses and returns one if found.

- PhoneBaseInterface

Processes phone numbers in the resume text. If more than one phone number is found, it returns them as a single comma-separated string.

- NameBaseInterface

Processes the name of the candidate.

- SkillBaseInterface

Processes the skills section and returns a list of the skills identified in the resume.

Any of these interfaces can be implemented as shown below.

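    # EmployerBaseInterface must be imported from pyresumize first
    class RemoteCompaniesChecker(EmployerBaseInterface):
        def process(self, resumetext):
            # Call a remote API with the resume text and collect the employers it reports
            employers = []
            return employers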
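
For a version that runs without a remote service, here is a minimal sketch of a custom phone processor that follows the behaviour described for PhoneBaseInterface (several numbers are joined with commas). The `PhoneBaseInterface` stub stands in for the real base class so the snippet is self-contained, and the `RegexPhoneProcessor` name and the regular expression are illustrative assumptions, not pyresumize's own implementation.

```python
import re


class PhoneBaseInterface:
    """Stand-in for pyresumize's base class, defined here only so the sketch runs alone."""

    def process(self, resumetext):
        raise NotImplementedError


class RegexPhoneProcessor(PhoneBaseInterface):
    # Optional leading "+", then 8 to 15 digits optionally separated by spaces or hyphens
    PHONE_PATTERN = re.compile(r"(?<![\w+])\+?\d(?:[ -]?\d){7,14}(?!\d)")

    def process(self, resumetext):
        numbers = []
        for match in self.PHONE_PATTERN.findall(resumetext):
            if match not in numbers:  # keep the first occurrence, drop repeats
                numbers.append(match)
        return ",".join(numbers)


text = "Call +91 98765 43210 or 040-2345-6789. Backup: +91 98765 43210"
print(RegexPhoneProcessor().process(text))
```

A real rule set would probably need stricter validation, since a pattern this loose can also pick up long digit runs that are not phone numbers.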

  
The ResumeEngine class has the member functions below; use the matching one to apply your custom engine.

    set_skills_engine(self, engine):    
    set_name_engine(self, engine):    
    set_email_engine(self, engine):    
    set_education_engine(self, engine):    
    set_employer_engine(self, engine):
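
The idea behind these setters can be sketched with a small stand-in. `MiniEngine`, `DefaultEmailProcessor`, `LowercaseEmailProcessor` and `process_text` are hypothetical names written for this example and are not pyresumize's `ResumeEngine` or its internals; the point is only that a setter swaps one processing rule while the rest of the engine stays as it is.

```python
import re


class DefaultEmailProcessor:
    """Default rule: return the first email-like token, or an empty string."""

    def process(self, resumetext):
        match = re.search(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+", resumetext)
        return match.group(0) if match else ""


class LowercaseEmailProcessor(DefaultEmailProcessor):
    """Custom rule: same search, but normalise the result to lowercase."""

    def process(self, resumetext):
        return super().process(resumetext).lower()


class MiniEngine:
    """Hypothetical stand-in for an engine with a pluggable email processor."""

    def __init__(self):
        self.email_engine = DefaultEmailProcessor()

    def set_email_engine(self, engine):
        # Swap the processor; everything else in the engine stays unchanged
        self.email_engine = engine

    def process_text(self, resumetext):
        return {"email": self.email_engine.process(resumetext)}


engine = MiniEngine()
before = engine.process_text("Contact: Jane.Doe@Example.COM")
engine.set_email_engine(LowercaseEmailProcessor())
after = engine.process_text("Contact: Jane.Doe@Example.COM")
print(before, after)
```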

## Usage  

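The module is available on PyPI (https://pypi.org/project/pyresumize/) and can be installed using pip:

    pip install pyresumize

Then download the spaCy model and the NLTK data it needs:

    python -m spacy download en_core_web_sm
    python -m nltk.downloader words
    python -m nltk.downloader stopwords

Finally, parse a resume from Python:

    from pyresumize import ResumeEngine

    r_parser = ResumeEngine()
    # Folder containing the skills, employers and education .csv files
    r_parser.set_custom_keywords_folder("data")
    # file is the resume to parse (.pdf or .docx)
    result = r_parser.process_resume(file)
    print(result)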
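
If the engine returns a JSON string as described in the introduction, it can be turned into Python objects with the standard `json` module. The sample string and its keys (`name`, `email`, `skills`) below are made up for illustration; the keys pyresumize actually emits may differ, so inspect the real output first.

```python
import json

# Hypothetical output; replace with the string returned by process_resume()
sample_output = '{"name": "Jane Doe", "email": "jane@example.com", "skills": ["python", "sql"]}'

parsed = json.loads(sample_output)  # JSON string -> dict

# .get() avoids a KeyError when a field is missing from the output
skills = parsed.get("skills", [])
phone = parsed.get("phone", "not found")

print(parsed["name"], skills, phone)
```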

