# sentiment_analysis_csci_e89

This package enables end-to-end sentiment analysis with state-of-the-art techniques.  
The API assumes a common data model that is described in detail in the
documentation. In short, the modules expect tabular datasets with the following fields for training data:
 1. text_id    
 2. text    
 3. label    

 and the following fields for live test data:  
 1. text_id    
 2. text    
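
The field lists above can be checked before data is handed to any module. The following is a minimal sketch using only the Python standard library; the helper `missing_fields`, the inlined sample rows, and the label values `positive`/`negative` are hypothetical illustrations and are not part of this package's API.

```python
import csv
import io

# Field names taken from the data model described above.
TRAINING_FIELDS = ("text_id", "text", "label")
LIVE_FIELDS = ("text_id", "text")


def missing_fields(header, required):
    """Return the required field names that are absent from a header row."""
    return [name for name in required if name not in header]


# Hypothetical two-row training file, inlined so the example is self-contained.
sample_csv = io.StringIO(
    "text_id,text,label\n"
    "1,I loved this film,positive\n"
    "2,The plot made no sense,negative\n"
)
reader = csv.DictReader(sample_csv)
rows = list(reader)

# An empty list means every required training field is present.
print(missing_fields(reader.fieldnames, TRAINING_FIELDS))
# A header without `label` is valid live data but not valid training data.
print(missing_fields(["text_id", "text"], TRAINING_FIELDS))
```

Running a check like this first surfaces a missing column as a short list of names rather than as an error deep inside a later step.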

The API contains five main modules:  
 1. data_cleaning: A class that was written to support a number of popular machine learning datasets. It cleans the raw data and
 structures it in a way that the other modules can use.  
 2. pre_processing: A class that provides a number of high-level functions to perform sophisticated data transformations and cleaning. This class is responsible for preparing the raw text data for the neural architectures.  
 3. modeling: A class that provides a number of methods, each dedicated to training a certain type of architecture. Refer to the documentation for the exact specification of each of the architectures provided.  
 4. pretrained_embeddings: A class that provides methods to prepare popular word embeddings (GloVe and word2vec) in a format that the networks can work with. The user must download the raw data from the appropriate sources. Details are included in the documentation.  
 5. predict_newdata: A class that provides methods to use the trained networks to make predictions on live data. Live data, as defined here, is test data that is processed and prepared separately from the data the model was trained and validated against.  
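
To give a sense of what preparing pretrained embeddings involves (item 4 above), here is a minimal sketch that parses the plain-text GloVe format, in which each line holds a token followed by its space-separated vector components. The function `parse_glove_lines` is a hypothetical helper written for illustration; it is not the `pretrained_embeddings` API, whose actual methods are described in the documentation.

```python
def parse_glove_lines(lines):
    """Map each word to its vector, given lines in the GloVe text format."""
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        word, values = parts[0], parts[1:]
        vectors[word] = [float(v) for v in values]
    return vectors


# Made-up 3-dimensional vectors; real GloVe files use many more dimensions.
sample = ["good 0.1 0.2 0.3", "bad -0.1 -0.2 -0.3"]
embeddings = parse_glove_lines(sample)
print(embeddings["good"])
```

With a downloaded file, the same function could be fed an open file handle, since iterating over a text file yields one line at a time.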

 A number of different neural architectures are provided with easy-to-call methods, allowing you to train sophisticated models with no more than a few lines of code. Some of the architectures implement transfer learning and require that certain files be downloaded
 locally.  

 Please refer to the documentation and the tutorial script.  
 The tutorial is a Jupyter notebook with a step-by-step implementation. Please find it here: https://github.com/stefano10p/-sentiment_analysis_csci_e89-/tree/master/tutorial  

 ## Installation  
 Run the following to install:  

 ```shell
 pip install sentiment-analysis-csci-e89
 ```
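
To confirm that the install succeeded, one option is to query the installed distribution's metadata from Python. This is a minimal sketch using the standard library's `importlib.metadata` (Python 3.8 or later); the helper name `installed_version` is hypothetical and not part of this package.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if the package is installed, otherwise None.
print(installed_version("sentiment-analysis-csci-e89"))
```
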
Please download the documentation from here:  
https://github.com/stefano10p/-sentiment_analysis_csci_e89-/tree/master/docs/_build/html  
Save each of the HTML files to a local directory on your machine.

You may also download this package from GitHub: https://github.com/stefano10p/-sentiment_analysis_csci_e89-

You will find a requirements.txt file when you clone the repository.
On your machine, create a virtual environment:  
`conda create --name sentiment_analysis`  
Activate the environment and use the requirements file to install all the necessary dependencies:  
`conda activate sentiment_analysis`  
`pip install -r requirements.txt`  
You are ready to use the package.
Thank you!