Mod-a-Recipe was developed as my capstone project at the Metis data science immersive bootcamp, combining my passions for cooking and natural language processing. The idea came from my own habit of reviewing several recipes for a dish I had in mind to figure out which ingredients I could modify or swap out to suit my taste. Given a selected recipe, the app uses natural language processing and machine learning techniques to find similar recipes and suggest modifications that make the recipe your own.
Data Source
Recipe data (~80,000 recipes used here) was scraped by Eight Portions from Epicurious, AllRecipes, and FoodNetwork.
Tagged ingredient data from the NYTimes Ingredients Phrase Tagger project is also used (see GitHub).
Code at GitHub
Here is the GitHub repo for this project: link.
Tools Used
Python is used for data acquisition, cleaning, and modeling. Specific Python libraries used include:
- Modeling: scikit-learn
- Natural language processing: spaCy
- Web application: Flask, PostgreSQL, Heroku
Methodology Used
- The data sets from the three sources (Epicurious, AllRecipes, FoodNetwork) were merged into one and cleaned to remove irrelevant information. (refer to code here)
- Using 2,000 rows of tagged ingredient-list data from the NYTimes Ingredients Phrase Tagger project and 200 rows of manually tagged data from the Eight Portions data set, a spaCy NER model was trained to recognize ingredients in ingredient-list text.
- Using the trained spaCy NER model, ~40,000 unique ingredients were extracted from the ingredient lists of the recipe data. (refer to code here)
- A topic modeling technique (TF-IDF word vectors with non-negative matrix factorization, NMF) was then used to reduce dimensionality so that similar recipes could be found using cosine similarity. The NMF yielded 50 topics, each of which seemed to be representative of a certain type of recipe. For example:
Topic 1 (Asian recipes): soy sauce, sesame oil, green onion, ginger, sesame seed, rice vinegar, ginger root, scallion, rice wine vinegar, peanut oil
Topic 2 (baking): unsalted butter, pure vanilla extract, whole milk, light brown sugar, fine salt, nutmeg, shallot, kosher salt and freshly ground pepper, extra-large egg, semisweet chocolate
(refer to code here)
- Similar recipes are found as a result, and differences between their ingredient lists are highlighted as possible substitutions or enhancements to the selected recipe.
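To make the NER training step more concrete, here is a minimal sketch of how tagged ingredient phrases might be turned into character-offset annotations, the `(text, {"entities": [...]})` shape that spaCy's NER training examples are commonly built from. The helper `to_ner_example`, the `INGREDIENT` label, and the two sample phrases are assumptions made for illustration; the project's actual preprocessing lives in the linked code and may differ.

```python
def to_ner_example(phrase, ingredient, label="INGREDIENT"):
    """Build one (text, annotations) pair, or None if the ingredient is absent.

    Hypothetical helper: the label name and matching rule are illustrative.
    """
    # Case-insensitive search for the ingredient name inside the phrase.
    start = phrase.lower().find(ingredient.lower())
    if start == -1:
        return None
    end = start + len(ingredient)
    return (phrase, {"entities": [(start, end, label)]})


# Made-up phrases standing in for rows of tagged ingredient data.
tagged_rows = [
    ("2 cups chopped green onion", "green onion"),
    ("1 tablespoon soy sauce", "soy sauce"),
]

candidates = (to_ner_example(phrase, name) for phrase, name in tagged_rows)
train_data = [example for example in candidates if example is not None]

for text, annotations in train_data:
    start, end, label = annotations["entities"][0]
    print(f"{label}: {text[start:end]!r} at [{start}, {end})")
```

Storing character offsets rather than the ingredient string keeps each annotation tied to its exact position in the phrase, which is what an NER model is trained to predict.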
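The similarity and suggestion steps above can be sketched on toy data with scikit-learn. This is an illustration under stated assumptions, not the project's actual code: the four recipes, the choice of 2 topics (the project used 50), and the helpers `most_similar` and `suggest_modifications` are all made up for this example.

```python
from sklearn.decomposition import NMF
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy data: each recipe is a list of already-extracted ingredient names.
recipes = {
    "stir fry": ["soy sauce", "sesame oil", "ginger", "green onion", "chicken"],
    "fried rice": ["soy sauce", "sesame oil", "green onion", "rice", "peas"],
    "cookies": ["unsalted butter", "light brown sugar", "egg", "flour",
                "semisweet chocolate"],
    "brownies": ["unsalted butter", "sugar", "egg", "flour",
                 "semisweet chocolate"],
}
names = list(recipes)

# A callable analyzer makes each whole ingredient name a single feature,
# so "soy sauce" is not split into "soy" and "sauce".
vectorizer = TfidfVectorizer(analyzer=lambda ingredients: ingredients)
tfidf = vectorizer.fit_transform([recipes[name] for name in names])

# Reduce the TF-IDF matrix to a small number of topics with NMF.
nmf = NMF(n_components=2, init="nndsvda", random_state=0, max_iter=500)
topic_weights = nmf.fit_transform(tfidf)

# Compare recipes in topic space.
similarity = cosine_similarity(topic_weights)


def most_similar(name):
    """Return the name of the recipe closest to `name` in topic space."""
    i = names.index(name)
    scores = similarity[i].copy()
    scores[i] = -1.0  # exclude the recipe itself
    return names[int(scores.argmax())]


def suggest_modifications(selected, other):
    """List the ingredients that differ between two recipes."""
    selected_set, other_set = set(recipes[selected]), set(recipes[other])
    return {
        "only_in_selected": sorted(selected_set - other_set),
        "only_in_similar": sorted(other_set - selected_set),
    }


match = most_similar("stir fry")
print(match, suggest_modifications("stir fry", match))
```

Ingredients that appear only in the similar recipe are candidates for additions or swaps, while those that appear only in the selected recipe are candidates for removal or replacement.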