Projects

Data Science Projects

The Data Incubator is an intensive 8-week data science training fellowship for academic researchers, with a 2% acceptance rate (out of 3,000 applicants).

I completed a number of data science projects on the DigitalOcean cloud computing platform, including:

1. NYC Social Network Analysis – web scraping and network graph analysis

– Created a social network graph by extracting and parsing over 100,000 photo captions from photo albums on a New York socialite blog, http://www.newyorksocialdiary.com/nysd/partypictures
– Analyzed the structure of the social network using node degree, node PageRank and the highest-weighted edges of the graph
– You can find the code on GitHub

Tools: lxml, BeautifulSoup, regular expressions, pandas, networkx
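
The graph-building and ranking steps above can be sketched roughly as follows. This is a minimal, standard-library-only illustration, not the project's actual code: the captions, the name-splitting rule and the helper names (`parse_names`, `build_edges`, `degrees`, `pagerank`) are all assumptions made for the example. In practice networkx provides these measures directly.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical captions; the real ones were scraped from the blog.
captions = [
    "Alice Smith, Bob Jones, and Carol White",
    "Alice Smith and Bob Jones",
    "Carol White with Dan Brown",
]

def parse_names(caption):
    """Split a caption into names on commas, 'and' and 'with' (a simplification)."""
    parts = re.split(r",|\band\b|\bwith\b", caption)
    return sorted({p.strip() for p in parts if p.strip()})

def build_edges(captions):
    """Count how often each unordered pair of names shares a caption."""
    edges = Counter()
    for caption in captions:
        for a, b in combinations(parse_names(caption), 2):
            edges[(a, b)] += 1
    return edges

def degrees(edges):
    """Number of distinct neighbours for each node."""
    deg = Counter()
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg

def pagerank(edges, damping=0.85, iterations=50):
    """Weighted PageRank by power iteration on an undirected graph."""
    neighbours = {}
    for (a, b), w in edges.items():
        neighbours.setdefault(a, {})[b] = w
        neighbours.setdefault(b, {})[a] = w
    if not neighbours:
        return {}
    n = len(neighbours)
    rank = {node: 1.0 / n for node in neighbours}
    for _ in range(iterations):
        new_rank = {}
        for node in neighbours:
            # Each neighbour passes on rank in proportion to the edge weight.
            incoming = sum(
                rank[other] * w / sum(neighbours[other].values())
                for other, w in neighbours[node].items()
            )
            new_rank[node] = (1 - damping) / n + damping * incoming
        rank = new_rank
    return rank

edges = build_edges(captions)
deg = degrees(edges)
ranks = pagerank(edges)
print(edges.most_common(1))   # highest-weighted edge
print(deg.most_common(1))     # best-connected person
```

Real captions need much more careful parsing (titles, couples sharing a surname, non-name text), which this sketch ignores.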

2. NYC restaurant inspection database analysis

– Performed data analysis on an aggregated NYC restaurant inspection database with over half a million inspection reports using SQL, pandas and R
– An interactive CartoDB visualization of the average restaurant score for each of the five NYC boroughs can be found here
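
As an illustration of the per-borough aggregation behind that visualization, here is a small sketch using an in-memory SQLite database. The table name `inspections`, the columns `boro` and `score`, and the sample rows are hypothetical; the real database schema is not shown here.

```python
import sqlite3

# Hypothetical schema and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inspections (boro TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO inspections (boro, score) VALUES (?, ?)",
    [
        ("MANHATTAN", 10),
        ("MANHATTAN", 20),
        ("BROOKLYN", 12),
        ("QUEENS", 8),
        ("QUEENS", 10),
    ],
)

# Average inspection score per borough, in alphabetical order.
avg_by_boro = conn.execute(
    "SELECT boro, AVG(score) FROM inspections GROUP BY boro ORDER BY boro"
).fetchall()
conn.close()

for boro, avg_score in avg_by_boro:
    print(f"{boro}: {avg_score:.1f}")
```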

3. NLP analysis on Yelp reviews

– Used natural language processing (NLP) to perform sentiment extraction from over 1 million Yelp reviews (1GB JSON file).

4. Yelp review predictions with scikit-learn

– Used machine learning models in scikit-learn to predict a new venue’s popularity from the metadata available when the venue opens, e.g., where it is located, the type of food served, etc.

5. MapReduce in the Cloud

– Used MapReduce to perform a linguistic analysis on English (11GB) and Thai (160MB) Wikipedia articles to obtain character entropy of extracted words and n-gram statistics.
– You can find the code on GitHub
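
The character-entropy computation can be expressed as a map step and a reduce step. The sketch below runs both in a single process on two toy lines, so it only shows the shape of the computation, not the distributed job itself. The function names and the word-extraction regex are assumptions for the example.

```python
import math
import re
from collections import defaultdict

def mapper(line):
    """Map step: emit (character, 1) for every character of every word in a line."""
    # [^\W\d_]+ matches runs of letters; a simplification of real word extraction.
    for word in re.findall(r"[^\W\d_]+", line.lower()):
        for ch in word:
            yield ch, 1

def reducer(pairs):
    """Reduce step: sum the counts emitted for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

def entropy(counts):
    """Shannon entropy, in bits, of the distribution given by the counts."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

lines = ["abab", "cd cd"]  # toy input standing in for article text
counts = reducer(pair for line in lines for pair in mapper(line))
print(counts)
print(entropy(counts))
```

Four equally frequent characters give an entropy of 2 bits, which is a convenient sanity check before running on real text.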

6. Time Series Analysis

– Developed a model to predict the temperature in major US cities using Fourier analysis of over 500,000 data points
– Developed classification models to recognize the genre of a musical piece, first from pre-computed features and then from the raw waveform.
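
To give a feel for the Fourier approach to temperature prediction, here is a minimal sketch that fits a single annual harmonic to a synthetic daily series and extrapolates it. It is an assumption-laden simplification: the project's actual model and data are not shown here, and the 365-day period, the synthetic series and the function names are chosen only for illustration.

```python
import cmath
import math

def fit_harmonic(values, period):
    """Estimate the mean, amplitude and phase of one sinusoid with the given period."""
    n = len(values)
    mean = sum(values) / n
    # Project the de-meaned series onto a complex exponential (one Fourier component).
    coeff = sum(
        (v - mean) * cmath.exp(-2j * math.pi * t / period)
        for t, v in enumerate(values)
    ) * 2 / n
    return mean, abs(coeff), cmath.phase(coeff)

def predict(t, mean, amplitude, phase, period):
    """Evaluate the fitted harmonic at time index t."""
    return mean + amplitude * math.cos(2 * math.pi * t / period + phase)

PERIOD = 365  # assumed annual cycle, in days

# Synthetic two-year daily series: mean 15, amplitude 10, phase -0.5.
temps = [15 + 10 * math.cos(2 * math.pi * t / PERIOD - 0.5) for t in range(2 * PERIOD)]

mean, amplitude, phase = fit_harmonic(temps, PERIOD)
forecast = predict(800, mean, amplitude, phase, PERIOD)
print(mean, amplitude, phase, forecast)
```

The estimate is exact here only because the series is noise-free and spans a whole number of periods. Real temperature data would call for more harmonics and some handling of noise and trend.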