Data Analysis on Restaurants in Downtown Brooklyn

During my time in downtown Brooklyn, one of the things that drove me mad was looking at restaurant reviews. Friends and I would decide to go to restaurants, and we’d look at the ratings. When we did, we’d get an aggregate rating of 3.5 or 4. But what does it all mean? It really doesn’t mean anything unless we understand the distribution of the data.

It drove me nuts. I had to find out the average, and what the review landscape looked like.

So I did. I cobbled together a program in Python using Scrapy, Pandas, and Matplotlib. I would have left out Sir Scrapy, but the review website’s API had this random feature where if you queried a restaurant for all its reviews, it would give you only 3. Grr…

Process:

I used a certain restaurant review website’s own filters to home in on downtown Brooklyn within a one-mile radius. The website’s API request gave me two longitude-latitude pairs. They are shown below:

Interesting. So it’s either endpoints of a circle, or a rectangle. I think it’s a rectangle, so we’re going to refer to this area as the rectangle.

This search query gave me a starting point for all the restaurants, but each page displayed only 10 or 20. I scraped them, then had the program move on to the next page. I stored the set of restaurants in a directory. Then, for every restaurant, I automated an HTTP request to its main page, which is the anchor point for that restaurant’s set of reviews. I went through all those reviews and got them too.

Results:

Number of restaurants in the bounding box: ~500.

Number of restaurant reviews: ~70,000.

The average of all review ratings: 3.70

Standard deviation of all ratings: 1.32
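
For anyone curious how numbers like these fall out of Pandas, here is a minimal sketch. It assumes the scraped reviews sit in a DataFrame with hypothetical `restaurant` and `rating` columns; the rows below are made up for illustration and are not the Brooklyn data.

```python
import pandas as pd

# Toy stand-in for the scraped reviews. Column names and rows are
# assumptions for illustration, not the real dataset.
reviews = pd.DataFrame({
    "restaurant": ["A", "A", "A", "B", "B", "C"],
    "rating": [5, 4, 3, 1, 5, 4],
})

# Average over every individual review, not over restaurants.
mean_rating = reviews["rating"].mean()

# ddof=0 is the population standard deviation; pandas defaults to
# ddof=1 (sample), so pick whichever matches what you want to report.
std_rating = reviews["rating"].std(ddof=0)

# How many reviews gave each star value: the distribution itself.
rating_counts = reviews["rating"].value_counts().sort_index()

print(f"mean={mean_rating:.2f} std={std_rating:.2f}")
print(rating_counts)
```

The `rating_counts` series is the part that answers the original complaint: it shows the shape of the ratings rather than a single aggregate.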

Graphs:

So people in Brooklyn tend to rate 4s and 5s much more often than 1s and 2s. I wonder: if you did this for every city, could you gauge a friendliness metric for each one, and see if it correlates with the happiness ranking of every country? That would be AMAZING.

This is what you get if you take each restaurant’s average rating, round it to the nearest 0.5, and plot how many restaurants land in each bucket.
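
A rough sketch of that bucketing step, again with made-up rows and hypothetical column names. The trick is to double the average, round to a whole number, and halve it again.

```python
import pandas as pd

# Toy reviews; the column names and values are assumptions for illustration.
reviews = pd.DataFrame({
    "restaurant": ["A", "A", "A", "B", "B", "C", "C"],
    "rating": [5, 4, 4, 2, 3, 5, 5],
})

# One average rating per restaurant.
avg_by_restaurant = reviews.groupby("restaurant")["rating"].mean()

# Round to the nearest 0.5: double, round, halve.
# Note: NumPy rounds exact ties to the nearest even value, so an
# average of exactly 4.25 lands on 4.0 rather than 4.5.
half_star = (avg_by_restaurant * 2).round() / 2

# Number of restaurants in each half-star bucket, ready for a bar chart.
bucket_counts = half_star.value_counts().sort_index()
print(bucket_counts)
```

With Matplotlib installed, `bucket_counts.plot(kind="bar")` would draw the chart.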

This is simply the number of reviews for each restaurant. It seems to follow some power-law distribution, but I’m not quite sure. It may be just one outlier.

Top 20 restaurants by review count:

I guess this can be interpreted as popularity or to some degree how much people care. One can perhaps use this information to extrapolate the length of time the restaurants have been around.

Restaurant Number of Reviews
Grimaldi’s Pizzeria 4440
Juliana’s Pizza 1955
Junior’s Restaurant 1573
Joya 1259
Rocco’s Tacos & Tequila Bar 1196
The River Café 1038
Habana Outpost 962
Shake Shack 788
Mile End Delicatessen Brooklyn 744
Clover Club 728
Yaso Tangbao 703
Ki Sushi 671
Hanco’s 658
Vinegar Hill House 634
Alamo Drafthouse Cinema Downtown Brooklyn 621
Forno Rosso 604
Two 8 Two Bar & Burger 603
Dekalb Market Hall 589
Bedouin Tent 586
Sottocasa Pizzeria 586
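
One quick way to poke at the power-law guess is to fit a straight line to log(rank) versus log(review count) using the 20 counts in this table. This is only a heuristic sketch: a straight-ish line on a log-log plot is suggestive, not proof, and the top 20 is a truncated slice of roughly 500 restaurants. The residual for the first row gives a feel for how much of an outlier Grimaldi’s is.

```python
import math

# Review counts for the top 20 restaurants, copied from the table above.
counts = [4440, 1955, 1573, 1259, 1196, 1038, 962, 788, 744, 728,
          703, 671, 658, 634, 621, 604, 603, 589, 586, 586]

# Rank 1 is the most-reviewed restaurant.
xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
ys = [math.log(c) for c in counts]

# Ordinary least-squares line through the (log rank, log count) points.
n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x

# Distance of each point from the fitted line, in log units.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

print(f"fitted exponent: {slope:.2f}")
print(f"residual for rank 1: {residuals[0]:+.2f}")
```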

Top 50 restaurants by average rating (ignore the perfect 5s; they probably have too few reviews. Perhaps I will add the review count next to the rating later…)

Restaurant Average Rating
Moshman Dental 5
Pipitone’s Pizza 5
Bird’s Eye Vietnamese 5
VALENTINE’S CAFE 5
First Wok 5
New Fresco Tortilla Plus 5
Smith Gourmet Deli 4.952380952
Thai on Wheels 4.888888889
Lillo Cucina Italiana 4.870588235
GMC Temaxcal Deli & Grocery 4.833333333
Simple NYC-Downtown Brooklyn 4.80952381
Yumpling Food Truck 4.790697674
Cafe Gitane 4.75
Sunny Delicatessen 4.75
Pret A Manger 4.666666667
Ashland 4.625
Govinda’s Vegetarian 4.621118012
Dariush Persian Cuisine 4.580645161
Grand Canyon Restaurant 4.577777778
dot & line 4.52173913
Rice & Miso 4.519230769
dumboLUV 4.5
Kazi Halal 4.5
Saint Julivert 4.5
Yemen Cafe & Restaurant 4.452173913
E-bite 4.444444444
Sanpanino 4.444444444
ACE Thai Kitchen 4.414634146
Sushi Gallery 4.4140625
Bread & Spread 4.412698413
Lavatera Grill 4.409090909
Forcella Fried Pizza 4.407407407
Chicks Isan 4.404761905
Doner Kebab NYC 4.401360544
Mr. Fulton 4.4
W XYZ Bar 4.4
Yossi’s Cart 4.4
Juliana’s Pizza 4.396930946
Shawarma & Grill 4.375
Makina Cafe 4.375
Koji Izakaya 4.358974359
Daigo Handroll Bar 4.333333333
Metro Buffet 4.333333333
Warung Roadside 4.333333333
Taiki 4.327586207
Sultan Restaurant & Cafe Lounge 4.3125
Espresso Me 4.306122449
Piz-zetta 4.305220884
Downtown Natural Market 4.304347826
Sottocasa Pizzeria 4.298634812

To-do list:

  • I need to ask someone who is really knowledgeable in statistics, or do research on whether there’s a law that relates a restaurant’s number of reviews to its true rating. What should it converge on? For example, a restaurant with 1000 reviews and a 3.5 should be weighted differently from a restaurant with 10 reviews and a 3.5.
  • It would be interesting to plot the average words per review or aggregate words per restaurant.
  • What would be the most and least commonly used words? I want to run the reviews through a basic NLP program to stem the words, remove stop words, etc.
  • What if we take a list of words, give each one a positive or negative numerical value, and average those values over each restaurant’s reviews? Would the rankings be different?
  • What if you plot out the datestamps of the reviews, and use some metric for happiness/economic activity, and see if there’s a correlation between that and the stock market or the consumer sentiment index? Would there be a correlation?
  • See what another review service from a search engine’s data looks like.
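
On the first item, one common trick (not something I used here, just a sketch of a possible starting point) is a "Bayesian average": pretend every restaurant starts with a handful of phantom reviews at the overall mean, so a rating backed by few reviews gets pulled toward the mean while a rating backed by many barely moves. The prior mean of 3.70 is the overall average from the results above; the prior weight of 20 phantom reviews is an arbitrary assumption that would need tuning.

```python
def bayesian_average(avg_rating, n_reviews, prior_mean=3.70, prior_weight=20):
    """Shrink a restaurant's average toward the overall mean.

    prior_mean: overall average rating (3.70 in the results above).
    prior_weight: number of phantom reviews at the prior mean; the value
        20 is an assumption chosen for illustration, not a derived number.
    """
    total = prior_weight * prior_mean + n_reviews * avg_rating
    return total / (prior_weight + n_reviews)


# Same 3.5 average, very different amounts of evidence behind it.
print(round(bayesian_average(3.5, 1000), 3))
print(round(bayesian_average(3.5, 10), 3))

# A perfect 5 from two reviews versus a 4.4 from nearly two thousand.
print(round(bayesian_average(5.0, 2), 3))
print(round(bayesian_average(4.396930946, 1955), 3))
```

Re-sorting the top 50 by this adjusted score would push the perfect 5s with only a few reviews down the list.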

Examining all this data is a lot of fun! But for now, my experiments are on hiatus. There are already enough things to read and build! But if anyone wants some Brooklyn data, let me know!

Tips:

  1. You should probably use this dataset to get interesting insights: https://www.yelp.com/dataset/download
  2. DO NOT SIMPLY SCRAPE ON YOUR LOCAL NETWORK. If you mess up, it’ll get your IP banned. Your ISP will refresh your IP within a month, but it’ll annoy the people who share the internet connection with you. Either use a VPN, or spin up a cloud computing instance and use that to download your data.
  3. Rate limit your scraper. Don’t try to download all this data within the span of a minute. Pace your scraper to download a reasonable amount per minute so you don’t overload their APIs. Or get caught. There is no need to rush.
  4. You should learn regular expressions. I was talking to someone who was doing data collection with DOM traversal. That may be more painful than building a reasonable regular expression. My two cents.
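
To make the rate-limiting tip concrete, here is a minimal sketch of a pacing helper in plain Python. The name `paced` and the interval are my own choices for illustration; the helper only spaces out whatever loop you wrap with it and does no fetching itself.

```python
import time


def paced(items, min_interval_s, clock=time.monotonic, sleep=time.sleep):
    """Yield items no faster than one every min_interval_s seconds.

    clock and sleep are injectable so the pacing can be tested
    without actually waiting.
    """
    last = None
    for item in items:
        if last is not None:
            # Sleep only for whatever is left of the interval.
            wait = min_interval_s - (clock() - last)
            if wait > 0:
                sleep(wait)
        last = clock()
        yield item


# Hypothetical usage: at most one page every 3 seconds.
# for url in paced(restaurant_urls, 3.0):
#     fetch(url)  # fetch() stands in for whatever downloader you use
```

If you are already using Scrapy, its `DOWNLOAD_DELAY` and `AUTOTHROTTLE_ENABLED` settings do a similar job without any custom code.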

I Know Nothing

Humbled. Today is the 11th day of June of the year 2019. I’ve felt this somber melancholy that I really don’t know anything. Compared to the vast knowledge that has been built, and will be built, I know so little. There’s no need to be giddy that you’ve solved problem X, when it’s one among the infinite ones out there. Similarly, maybe I shouldn’t be extremely stressed out about mistakes and mess-ups, when those actions are cosmic dust in a spectrum of all available decisions. I reflect back on the younger days, when I thought I knew more, and that I was wiser than someone else. There’s some shame in that. I feel like an idiot for ever thinking that way.

You think you understand, then you take a second look. And you realize that you never really internalized or grasped it. Whether that be a simple story or a math equation, I feel that there is always deeper meaning underneath it all. It’s like scratching the surface of an iceberg. You can keep going down forever.

It’s a two-sided coin. On one hand, you will never run out of things to learn, explore, and think about. How exciting! But on the other hand, you will never learn everything. Impossible.

Other things I struggle with:

  1. You can call someone out for their bs, but I find that in similar situations and under the same conditions I can easily make the same mistakes. For example, I criticize someone for not being willing to do something, then a week later I find myself unwilling to do something easy for a stranger because I was in such a terrible mood.
  2. It’s hard to strike up conversation with people when you know that the relationship will never be deep. What’s the point of striking up a conversation with a person at the subway? You may never see them again. But then again, you may never know where conversations end up. They may offer you a book recommendation, or something super insightful and change the way you think. But you don’t want to annoy them either.
  3. Making good choices. For every decision in the decision space, there is a theoretical best decision. For yourself, there is a theoretical best decision that you can take at every time t. I want to take those optimal decisions, or decisions close to them. This is hard. The big recommendation I keep hearing is to study philosophy deeply. It has characteristics that cannot be defined by structured math and logic.
  4. Becoming a more efficient coder. One of the things I’ve realized I still need to work on is coding a little and testing a little. Knock on your steps before going too deep. Dip your toe in before jumping into the ocean. Don’t dig yourself into a cobweb of complexity.
  5. Becoming a better communicator. Lately, I feel like I’ve had a string of run-on sentences. People can’t keep track. I should cut, verify they understand, cut, and perhaps get more feedback. Usually shorter sentences are better than longer ones. Ugh. But at the same time, it annoys people if you keep asking for feedback. Maybe the better approach is to study people so I can detect confusion.

Drafted June 11th, revised June 13th.

Hustler on the Subway

Interesting stuff always happens on the New York Subway.

I saw this kid, maybe a middle schooler, selling fruit snacks on the train. He went around the car saying,

“Excuse me everyone. I have something to say. I’m an honest man trying to make a living. I’m not here to get your money to do drugs or anything like that. I’m fundraising for <something I can’t remember>. Two fruit snacks for a dollar.”

It made me smile. He went around asking the car if anyone would like fruit snacks. He wasn’t just standing there - he was actively looking for new customers. What a boss.

I wanted to buy one, but I didn’t have cash on me =(. Nothing. Nada.

When I see things like this, I wonder how much potential this kid has - he’s doing something that other kids, even adults like me, have a hard time doing. He’s also putting his own spin on it instead of just saying fruit snacks for a dollar. He’ll definitely be really successful if he gets the right mentorship and nurturing. There’s so much potential in kids like this. And fortune favors the bold. I could learn from him, and man, if I were a teacher I’d want an awesome student like him.

Originally drafted May 12, 2019.

Pascal On Love

I read this gem in the past month in Pascal’s Pensées. This is his snippet on love:

“What is the self? A man goes to the window to see the people passing by; if I pass by, can I say he went there to see me? No, for he is not thinking of me in particular. But what about a person who loves someone for the sake of her beauty; does he love her? No, for smallpox, which will destroy beauty without destroying the person, will put an end to his love for her.

And if someone loves me for my judgement or my memory, do they love me? me, myself? No, for I could lose these qualities without losing my self. Where then is this self, if it is neither in the body nor the soul? And how can one love the body or the soul except for the sake of such qualities, which are not what makes up the self, since they are perishable? Would we love the substance of a person’s soul, in the abstract, whatever qualities might be in it? That is not possible, and it would be wrong. Therefore we never love anyone, but only qualities.”

Wow. One can interpret it as saying that we never love anyone - only their qualities. It’s a very logical, cold, reductionist point of view. It kind of reminded me of principal component analysis - how you reduce information to the few components that explain the most.

Most qualities of a person are malleable. People change. A person can be selfless today, but tomorrow they can be selfish. They can be the nicest person you know today, but something tragic happens to them tomorrow, and they become the most bitter and cynical person you’ve ever known.

This quote kept me up at night. Is it true that we really never love anyone, only their qualities?

After much thought, I lean towards no.

One can argue that there are qualities of things that do not change no matter what. No matter how I change, I will still be my mother’s son. No matter how I change, I will still be me. Maybe there are things essential to being me that stand the test of time.

Secondly, let’s take a step back and evaluate this scenario. If someone asks me, “Why do you love your brother so much?”

I can list out all the qualities I love about my brother. He’s smart, funny, kind, and thoughtful. Hmm. But what if he were dumb, full of lame jokes, mean, and thoughtless - would I still love him? I think so. Aren’t there siblings we love who are like that already? There’s more behind it than simple qualities.

Similarly, if I love someone, I can list out the qualities of why I love that person, but I love them because of who they are. Maybe listing out qualities is one abstraction level down to explain things that you cannot explain. If someone asks you, “Why do you love Jane?” Wouldn’t the optimal answer be, “Because she’s Jane. There’s no one like her, and there will never be anyone like her. Ever.”

Also, love and building affection is a gradual process. People grow on you! So if one of their qualities suddenly changes and you decide to cut them off, maybe there wasn’t any substance in the first place. After all, you can’t love someone you don’t truly know, without understanding who they are.

The other side of change

I took a very negative approach to thinking about this in the beginning and focused on how people can change for the worse. But doesn’t Pascal’s thought apply in the reverse direction as well? People can change dramatically for the better. You can be bitter enemies with someone and abhor spending time with them because of certain traits. But everything is subject to change. They can change in ways you can’t imagine. Who knows. Maybe you two become besties later on? Similarly, you’re also subject to the forces of change. Maybe they didn’t change to become more likable - you did!

Perhaps people are like quarters. When we examine one side, we see an engraving of George Washington. Flip it over, and we see one of fifty states. Can you really predict this? You can’t predict which state, nor can you predict based on the picture of Washington that you’re going to get a state on the back. For better or for worse, no one can really know for sure how people change.

The Humble Fields Medalist (2019 Book #4)

I finished reading this book in Korean; its title translates to Joy of Learning. The book was originally in Japanese. It’s about mathematician Heisuke Hironaka, who made significant contributions to algebraic geometry. He received a Fields Medal for his work.

This book was significantly difficult for me to read. First, I had a hard time with the language - my Korean vocabulary isn’t strong. I had to do a LOT of lookup. My pages looked like this:

Secondly, the book is a translation from Japanese to Korean so I think there’s still meaning lost in translation.

This is probably my favorite book that I’ve read this year. It was one of my dad’s favorite books too and I can see why. What made it so special was that it thinks deeply about the meaning of life, and the sense of warmth and wisdom you feel from Professor Hironaka makes you feel fuzzy.

During my time reading, I’ve reflected a lot on my purpose in life.

I’ve also thought about death - perhaps I have only 60 years left in me at most under certain assumptions.

This is what I came up with while reading.

First, a big part of what defines death is that it is a state of no change. If we expand on this, then a life of routine, with zero growth and learning - how is it any different from being dead?

It makes sense. Studies suggest that when you travel or experience new things, time feels like it lasts a lot longer. So one can live 1000 years but have marginal growth. His perception of time will be skewed to be very short compared to someone who lived 20 years and spent that time learning and experiencing life to the fullest.

If I have to summarize the book into one sentence, it’s this: Life is a quest to obtain wisdom and create things.

This book is a strong recommend from me. I hope that one day he can translate it into English. I wrote him an email that I’d do it from Korean to English. Haven’t heard from him yet. 0.0.

Notes and Commentary

  • Having a dream that makes your heart beat by just thinking about it.
  • In order to create you have to learn.
  • Life for deeper understanding of yourself and self-discovery.
  • Despite winning the Fields Medal, this guy is really humble. He says that he’s seen people so smart that he’s questioned why God was so unfair. He said he got chills when he met geniuses that smart. I agree. I’ve met my fair share of smart people, and I’ve felt that way about some of them.
  • Benjamin Spock - Children need to have an advocate on their side.
  • If people cannot forget, then if some catastrophe falls on them, then they will be destroyed. That’s why forgetting is important. By forgetting, you can “reset” your brain.
  • It’s not that we forget because the content we’ve learned disappeared - we tend to maximize input absorption, but brains are terribly bad at recalling anything. It’s interesting - many of our computers are like this. You shove in all the data you can, and later on you retrieve it. Google is built on this philosophy as well - it’s not collecting that’s the problem. It’s retrieval.
  • Wisdom has depth, breadth, and strength. The ability to think widely, think in one subject deeply, and the strength and conviction to make a wise decision. Learn to grow in wisdom.
  • Lev Pontryagin - Topological groups.
  • During his Ph.D. days, he spent a lot of time with two other grad students. He said that when people ask him if he was jealous of their talents, he says that it’s not the case. Rather, he feels blessed to have learned beside them.
  • He spent two years on a problem until some youngster in Germany solved it before him. He was devastated, and he said he had tunnel-visioned on one approach because of a compliment on his work at a conference. Looking back, he said that it’s wise to have a simple mind.
  • ***** Say a guy loves a girl ****. First, he hopes that the girl likes him back. Then, the hope turns into a delusion that the other girl might like him. Small delusions can turn into reality, and that’s a consequence of human tolerance/cultivation/imagination. In some sense, it’s ironic that the creativeness of the human mind is also its greatest weakness. It’s hard to accept the truth as what it is. And facts as simply what they are. The line between observation and speculation is very thin - and to know the difference accurately is super important.
  • There’s no need to compare yourself to others. You need to have your own, personal goal. I’ve struggled a lot with this personally, and as I’ve grown I usually don’t compare myself to others. They will never be me, and I will never be them. To each his own.
  • Difference between Japanese and American students.

    Prof: What are you researching?

    Japanese student: I’m researching algebraic topology.

    American student: I’m looking into X. My hypothesis is X.

    This requires courage. Be fearless and bold.

  • Component analysis - the reason why the West has been successful in technological innovation and scientific research is the rigorous breaking down of a problem into components, and examining them one by one in microscopic detail.
  • He gave a lecture, and a famous professor said that, “This is too abstract. You need to add conditions and solve a more defined problem.”

    He was dismayed because he thought the professor meant that the problem was too ambitious for him, and he needed to tone down the scale of the problem.

    The professor then said that, “Only when you put constraints, and iterate, you’ll come up with the right approach to solve a more generalized version of the problem. The abstraction will come naturally.”

    ** So he went back to the States and worked on the problem. When he put constraints on it, the problem became murkier to understand and read. But when he removed the constraints, the essence of the problem became much clearer to see. **

  • Thus, he says that to build something like a good company, you need to not optimize for local optima, because doing so will keep you from focusing on the essence of the business.
  • This principle applies to so many areas that it requires deeper thought. For example, one of the ways Haruki Murakami’s writing is very different from other Japanese writers is that he drafts it in English and converts it to Japanese. This kind of constraint actually forces creativity.

    Or let’s say that you can only use 10 minutes to explain a hard concept in a video. You would cut and purify your reasoning to the essence. Usually shorter videos, shorter code, shorter things are more beautiful and crisp than longer things.

  • In research and in life, his attitude is:
    1. Take reality as it is.
    2. Develop a hypothesis
    3. Conduct component analysis on objects
    4. See the big picture when stuck
  • ** People always think from their point of view. If a mother says, “I’m saying this for your own good.” -> that’s actually not true - she’s probably thinking from her own point of view and her loss i.e. reputation.
  • Suppose you are solving a problem. He states that you need to flip it around and become the problem 0.0. He quotes a famous mathematician saying, “A genius is someone who can’t differentiate between himself and the problem.”
  • You have to be integrated with your goal and dream. If that doesn’t happen, then you can’t move forward.
  • “To live is to learn - there is joy in learning. Living is also creating something, and there is a certain joy that you can’t feel from learning when you are creating.”
  • Reading gives you the opportunity to think. Books are thinking devices.
  • Studying is not something that’s really difficult to do - anyone who loves to think can do it, and feel happiness from doing it.
  • Henri Poincaré said something like, “Creating is like mushrooms.” First, you have to sow the roots - become grounded. But then you need a catalyst/distractor, whether that be a change in weather or a foreign chemical, to create spores.
  • Ninomiya Kinjiro - to be looked up later.
  • To be an artist you have to be hungry. It’s the same for creators - they are always hungry.
  • Even if you think you’ve solved it, you have to check every minute detail.
  • Lessons of Creativity:
    • Flexible with solving problems.
    • The passion to create must come from within yourself. I feel that this is really hard to do. No one just randomly implants some passion on you. It has to come from within, but how does this process even happen??
    • Applications of what you create usually come after the creation.
  • Admits that he’s not a smart person, but a persistent person. That he will go to the ends of the earth to get something done.
  • Wait patiently, and when an opportunity comes grab it. Everything is persistence.
  • Life without challenges will not result in great amazement or happiness. Life is about self-discovery.
  • Recalls this one student of his at Columbia University who kept asking questions. He would call professors late into the night asking questions for an hour. He says that when this student was admitted, his skills were subpar. But after a year, he started asking really good questions, and by his fourth year in college he had an amazing thesis. He went on to become a professor at Stanford.
  • He mentions that what he liked about Americans is that they learn through asking questions, and don’t use questions as ego boosters. They also don’t differentiate between good and bad questions. It’s an attitude of learning by asking every question you can possibly have.
  • What’s fascinating is that American education optimizes for individuality and for accelerating people who are ridiculously talented. I hear about how on average American education lags behind, but this is a tradeoff. I’m willing to bet that the average performance of all students in the U.S. is low, but the standard deviation of skills is so high that we have a wider range of talents. As a consequence there are a lot more experts. It would be interesting to see some research on this.