top of page

DA/DS 求职刷题指南(下)

Updated: Jul 7, 2023

四、SQL


1. 常见面试问题

  • What is the difference between union and union all? where and having?

  • Table【in_app_purchase】: uid: unique user id. timestamp: specific timestamp detailed to seconds. purchase amount: the amount of a one-time purchase. This is a table containing in-app purchase data. A certain user could have multiple purchases on the same day Question 1: List out the top 3 names of the users who have the most purchase amount on '2018-01-01' Question 2: Sort the table by timestamp for each user. Create a new column named "cum amount" which calculates the cumulative amount of a certain user of purchase on the same day. Question 3: For each day, calculate the growth rate of purchase amount compared to the previous day. if no result for a previous day, show 'Null'. Question 4: For each day, calculate a 30day rolling average purchase amount.

  • Table【Friending】 time = timestamp of the action date = human-readable timestamp, i.e, 20108-01-01 action = {'send', 'accept'} actor_id = uid of the person pressing the button to take the action target_id = uid of another person who is involved in the action. Question: what was the friend request acceptance rate for requests sent out on 2018-01-01?

  • 题目二涵盖了简单的 aggregate 问题,cumulative 问题,rolling window 问题等 等。搞定这些,其他的都只是一些简单变形。 题目三涵盖了 self-join,并且有一些 tricky 的大于等于号的应用,有兴趣可以在地 里查一下 Facebook 面经的解答。 其他的题目无非是多了一些 table,join 麻烦一些或者加了一些 case when,难度 都不会有太大的变化。做好几个经典题,然后自己整理好就可以以不变应万变了。


2. 相关资料准备

  • 扫盲网站:SQL ZOO 和 W3schools,非常实用,适合翻阅。

  • 两个 Udemy 的 SQL 课:SQL - MySQL for Data Analytics and Business Intelligence 和 The Ultimate MySQL Bootcamp

  • 刷题的话,Leetcode上有一些题,可以做一下。还有好心人直接做了个整理,在这 里:summary of sql in leetcode。

  • Hackerrank 上的题自然是要全刷光的,因为难度非常简单,快的话一两天也许就做完 了。

  • DataCamp: Data Gathering- Why API Medium https://lnkd.in/gvahtsN Intro to SQLhttps://lnkd.in/giWs-3N Complete SQL Bootcamphttps://lnkd.in/gsgf_fF Data Visualization Mediumhttps://lnkd.in/g3FSRgY

  • 更多的网站:18 best sql online learning resources

  • 建议自己下载一个 My SQL 装到电脑上,模拟真实的 SQL 环境来学习。Mysql 里关于 Windows function 和 frame clause 的教程:Windows function ,Frame Clause。这 个非常重要,windows function 可以说是 SQL 面试里的大杀器,非常节省时间而且思 路清晰。

  • 建议也学会用 WITH common_table_expression。可以让你的 SQL 看起来非常整洁和 容易理解。

  • 最最重要的来了。如果你觉得刷完题或者学完以上的内容就万事大吉了,那还真的不 是。我一开始也有这样的误区。实际上刷完 Hackerank 也并不能帮你很快的做出我给的例题。而其实,对于 metrics 或者 product 的了解能够帮助你很好的准备 SQL 面 试,因为所有的 SQL 面试都是围绕着**与 business 相关的 metrics**而展开的。举例 而言,游戏公司一定会考 DAU(daily active user)或者 purchase rate, Facebook 就会 是 friend request 相关的,以此类推。所以熟悉你申请公司的业务再针对性准备 SQL,一定会事半功倍。

五、Product sense


1. 常见面试问题

  • Today you immediately notice that our app's new users are doubled. What could be the reason? Do you think it's good or not?

  • If we have an app with in-app purchase, name at least 4 metrics you would like to monitor in your dashboard.

  • If you are running an A/B testing and find that the result is very positive, thus you decide to launch it. In the first 2 weeks, the performance of our website is very positive./However, with time flying by, all metrics seem to go back to normal. How will you explain this result?

  • Assume we are Facebook and we would like to add a new 'love' button. should we do this?

  • We are running 30 tests at the same time, trying different versions of our home page. In only one case test wins against the old home page. P-value is 0.04. Would you make the change?

  • If after running an A/B testing you find the fact that the desired metric(i.e, Click Through Rate) is going up while another metric is decreasing(i.e., Clicks). How would you make a decision?

  • Assume that you are assigned to estimate the LTV(lifetime value) of our game app player. what kind of metrics would you like to calculate so as to make a good prediction?/Assume that you already collect all that you want. How would you make this prediction/estimation?

  • If you got a chance to add on new features for our app to increase our profit within a very short term. What will you do?

大多是围绕着 metrics 和如何提高 product performance 来展开的。说实话这些对于 new grads 来说非常不友好因为没有工作经验。


2. 相关资料准备


  • product school:https://www.productschool.com/。 貌似是各个湾区的 tech 公司的 DS 或 PM 大佬们来分享一些案例和学习经验。

  • Metrics:https://www.cooladata.com/19-metrics-every-mobile-games-needs-track/ 一个关于 Game metrics 的汇总

  • Critical metrics every product manager must track(https://productcoalition.com/critical-metrics-every-product-manager-must-track- c5f1e46e3423)

  • A/B testing:A/B 测试中 20 个必须知道的问题

  • A Collection of Data Science Takehome Challenges:可以让你有机会解决很多实际 的 DS 问题,并且也和 product 有关。

一些个人想法:

大多的产品问题都是围绕着产品的 metrics 或运营中遇到的问题展开的。如果针对互联网 行业来说, 一个典型的产品要从推出后经历以下阶段:user acquisition --> user engagment / retention --> monitization。

讲一个异曲同工的事件来帮助理解吧。了解头条系公司的都知道,公司内有一个流水线作 业的产品工厂模式。他们只有三个最核心的职能部门,技术,user growth 和商业化。技 术是保障了整体的运营,所以在整体的产品很稳定之后,就要看一下用户的留存率。如果 发现用户的留存率(retention)或使用率(engagement)非常高,则进入 user growth 的 推动环节,大批量的 marketing 来拉动新的用户增长(user acquisition)。最后把商业化 (广告等)的内容接入产品,迅速变现。虽然顺序和我说的不太一样,但是实际上就是在 在这三个部分对一个产品进行不断的迭代。所以 metrics 也离不开这三个部分。user acquisition 讲求新用户注册率,user engagement 讲求 DAU(日活量),monitization 讲 求 LTV 和 ARPDAU(Average revenue per daily active user)等等。逐渐加深对互联网 产品的理解,就能更好的应对 metrics 的问题。面试中 metrics 的问题,本质上一切都是 为了产品迭代和用户增长以及变现。

关于产品中被问到的 A/B testing 的问题,很多是面试官为了考察你是否能针对特 定的情况来分析 A/B testing 的结果。只看 P-value 是非常学生的思维,具体情况中的 A/B testing 是要从一开始的 new feature 的想法到后来设计整体的实验再到分析结果再到最后给建议的一整个流程。面试官想要的是一个批判思维,需要你对每一个步骤都扣的非常细 致,以此来确保实验和你的结论之间没有断层。


六、 Project

1. Projects/Competitions - Kaggle Kernels https://www.kaggle.com/ 2. Problem Solving Challenges - HackerRank https://lnkd.in/g9Ps2cb

七、ExtraSkills

1. Communication - Data Storytelling https://lnkd.in/gtiCSNT 2. Business Analytics- Geckoboard https://lnkd.in/g2X-Xtp 3. NLP - How to solve 90% of NLP https://lnkd.in/gh8bKe4 4. Recommendation Systems - Spotify https://lnkd.in/gH2GQKu 5. Time Series Analysis - Complete Guide https://lnkd.in/gFZU2Rb




详细咨询





bottom of page