Pill-em-all is a medication review app where you can search user reviews and find ratings. You can search by symptom and see which medications reviewers report as working for it.
The project is developed in Python, using Flask for the web app. The system is built on the dataset available [here](https://www.kaggle.com/jessicali9530/kuc-hackathon-winter-2018/home), which contains patient reviews of specific drugs along with the related conditions, a 10-star patient rating reflecting overall patient satisfaction, and a useful count for each review.
Demo video:
Webapp: http://050e0e36.ngrok.io/
Data Preprocessing
Like most real-world datasets, this one has many missing values; entries with missing values are dropped. Each review is converted into a bag-of-words representation, and these representations are used to build the final system. Stop-word removal and lemmatization are applied to remove redundant words.
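The preprocessing steps above can be sketched as follows. This is a minimal illustration, not the project's actual code: the field names `review` and `condition`, the tiny `STOP_WORDS` set, and the `LEMMAS` lookup table (a stand-in for a real lemmatizer) are all assumptions made for the example.

```python
import re
from collections import Counter

# Assumption: a tiny illustrative stop-word list; a real system would use a fuller one.
STOP_WORDS = {"a", "an", "the", "is", "it", "and", "for", "my", "i", "was", "of", "to", "this"}

# Assumption: a toy lookup table standing in for a real lemmatizer.
LEMMAS = {"headaches": "headache", "helped": "help", "helps": "help", "pills": "pill"}


def drop_missing(rows, required=("review", "condition")):
    """Keep only rows in which every required field is present and non-empty."""
    return [r for r in rows if all(r.get(k) not in (None, "") for k in required)]


def bag_of_words(text):
    """Lowercase, tokenize, remove stop words, lemmatize, and count the remaining words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    kept = [LEMMAS.get(t, t) for t in tokens if t not in STOP_WORDS]
    return Counter(kept)


rows = [
    {"review": "This helped my headaches. It helps!", "condition": "Migraine"},
    {"review": None, "condition": "Pain"},  # dropped: the review text is missing
]
clean = drop_missing(rows)
bags = [bag_of_words(r["review"]) for r in clean]
```

Each resulting `Counter` maps a word to its count in one review, which is the form the later sketches take as input.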
Search Engine
An inverted index is used to build a vocabulary that stores the weight of every word across the entire dataset of reviews.
Cosine similarity between the query and each associated review is computed from the tf-idf weights of the query words that are present in the vocabulary.
Follow this blog to understand the implementation of the Inverted Index.
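As a rough sketch of the idea (not the project's implementation), the code below builds an inverted index of tf-idf weights and ranks reviews by cosine similarity to a query. The function names `build_index` and `search`, and the smoothed idf formula `log(N / df) + 1`, are assumptions chosen for the example; the original may weight terms differently.

```python
import math
from collections import Counter, defaultdict


def build_index(docs):
    """docs: list of token lists. Returns (index, idf, norms)."""
    n = len(docs)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # document frequency: count each term once per doc
    # Assumption: smoothed idf so that every weight is strictly positive.
    idf = {t: math.log(n / d) + 1.0 for t, d in df.items()}

    index = defaultdict(dict)  # term -> {doc_id: tf-idf weight}
    norms = [0.0] * n          # squared vector length per document
    for doc_id, tokens in enumerate(docs):
        for term, tf in Counter(tokens).items():
            w = tf * idf[term]
            index[term][doc_id] = w
            norms[doc_id] += w * w
    return index, idf, [math.sqrt(x) for x in norms]


def search(query_tokens, index, idf, norms, top_k=5):
    """Return up to top_k (doc_id, cosine similarity) pairs, best first."""
    # Only query words that exist in the vocabulary contribute.
    q = {t: tf * idf[t] for t, tf in Counter(query_tokens).items() if t in idf}
    q_norm = math.sqrt(sum(w * w for w in q.values()))
    if q_norm == 0:
        return []
    dots = defaultdict(float)
    for term, qw in q.items():
        for doc_id, dw in index[term].items():  # visit only docs containing the term
            dots[doc_id] += qw * dw
    ranked = sorted(((dot / (q_norm * norms[d]), d) for d, dot in dots.items()), reverse=True)
    return [(d, score) for score, d in ranked[:top_k]]


docs = [["headache", "relief", "fast"], ["nausea", "dizzy"], ["headache", "nausea"]]
index, idf, norms = build_index(docs)
results = search(["headache"], index, idf, norms)
```

Because the index maps each term to the documents that contain it, a query only touches reviews sharing at least one word with it, rather than scanning the whole dataset.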
Classifier
A multinomial Naive Bayes classifier, implemented in Python, gives the probability that a given review belongs to each of the top 10 condition classes in the dataset.
It uses the prior probability of every condition and the probability of the keywords in a review belonging to each condition class to compute the score.
It is a simple yet effective algorithm for text classification.
Follow the blog here to understand and implement the Naive Bayes’ classifier.
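The scoring described above can be sketched in a few lines. This is an illustrative version under stated assumptions: the class name `MultinomialNB`, the Laplace smoothing parameter `alpha`, and the two-condition toy data are not taken from the project, which works with the top 10 conditions.

```python
import math
from collections import Counter, defaultdict


class MultinomialNB:
    """Minimal multinomial Naive Bayes over tokenized reviews."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Assumption: Laplace (add-alpha) smoothing

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)     # reviews per condition
        self.word_counts = defaultdict(Counter)  # condition -> word counts
        self.vocab = set()
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        self.total_docs = len(labels)
        return self

    def log_scores(self, tokens):
        """log prior + sum of log word likelihoods, for each condition."""
        v = len(self.vocab)
        scores = {}
        for c, n_c in self.class_counts.items():
            total = sum(self.word_counts[c].values())
            score = math.log(n_c / self.total_docs)
            for t in tokens:
                if t in self.vocab:  # words never seen in training are ignored
                    p = (self.word_counts[c][t] + self.alpha) / (total + self.alpha * v)
                    score += math.log(p)
            scores[c] = score
        return scores

    def predict_proba(self, tokens):
        """Normalize the log scores into probabilities that sum to 1."""
        scores = self.log_scores(tokens)
        m = max(scores.values())  # subtract the max for numerical stability
        exp = {c: math.exp(s - m) for c, s in scores.items()}
        z = sum(exp.values())
        return {c: e / z for c, e in exp.items()}


docs = [["headache", "pain"], ["headache", "migraine"], ["sad", "mood"], ["mood", "low"]]
labels = ["Migraine", "Migraine", "Depression", "Depression"]
model = MultinomialNB().fit(docs, labels)
probs = model.predict_proba(["headache"])
```

Working in log space avoids underflow when a review contains many words, since the product of many small probabilities becomes a sum of logs.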
Recommender System
A content-based recommender shows reviews of medications that are similar to the selected review and associated with the same condition.
This is a fairly simple approach to recommendation, useful when user preference data or a detailed description of the items (in this case, medicines) is lacking.
Understand the basic idea behind the recommender system here.
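One way to sketch such a recommender is shown below. It is a hedged illustration, not the project's code: the source does not say which similarity measure is used, so cosine similarity over bag-of-words counts is an assumption, as are the function names and the `drug`, `condition`, and `tokens` fields with placeholder drug names.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two word-count Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def recommend(selected_id, reviews, top_k=3):
    """Rank other reviews for the same condition by similarity to the selected one."""
    target = reviews[selected_id]
    target_bag = Counter(target["tokens"])
    candidates = []
    for i, r in enumerate(reviews):
        # Skip the selected review itself and reviews for other conditions.
        if i == selected_id or r["condition"] != target["condition"]:
            continue
        candidates.append((cosine(target_bag, Counter(r["tokens"])), i))
    candidates.sort(key=lambda p: (-p[0], p[1]))  # best score first, ties by position
    return [(reviews[i]["drug"], score) for score, i in candidates[:top_k]]


# Placeholder data; the drug names are hypothetical.
reviews = [
    {"drug": "Drug A", "condition": "Migraine", "tokens": ["fast", "headache", "relief"]},
    {"drug": "Drug B", "condition": "Migraine", "tokens": ["headache", "relief", "slow"]},
    {"drug": "Drug C", "condition": "Migraine", "tokens": ["nausea", "bad"]},
    {"drug": "Drug D", "condition": "Depression", "tokens": ["fast", "headache", "relief"]},
]
recs = recommend(0, reviews)
```

Note that "Drug D" is never recommended here even though its review text matches exactly, because it is associated with a different condition.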
GitHub Repository: https://github.com/sanath-narasimhan/Pill-em-all2.0