News: PASS outstanding Volunteer award & stepping down as Business Analytics Virtual Group Co-leader

I am honored to receive the PASS Outstanding Volunteer award again for June 2017! It’s been so much fun helping grow the chapter from 1K to 10K members within the last 4 years. The PASS HQ Team & Dan English (Group Lead) were great to work with, and there’s so much more growth left for the next few years! The Group was recently classified as a “tier-1” group and got new sponsors, which means that the group has some funding to pursue paid growth opportunities that weren’t accessible before.

Outstanding Volunteer Award PASS

URL: http://www.pass.org/Community/GetInvolved/Volunteers/OutstandingVolunteers.aspx

Since the group has the perfect platform to continue growing and we have a really good process in place to keep our growth flywheel running, I figured it’s a great time to step down. Over the past few years, my career moved me from Business Intelligence -> Analytics -> Data Science and, along with that, I have slowly moved away from Microsoft-centric architectures too. I started out working for a Microsoft Gold Partner, then worked for an open-source-heavy shop at a startup-mode organization in Silicon Valley, and now I work in an organization that uses a little bit of everything: something like the best of both worlds. So there’s a much bigger gap now between where my career is taking me and the mission of the Business Analytics Virtual Group. They don’t perfectly align anymore, and even though it’s a very rewarding experience, after some reflection I figured the group deserves a leader whose mission aligns better than mine does.

Thank you PASS for the opportunity!

There’s also an open position for new volunteers on the Virtual Group, so if you would like to be involved, reach out to Dan English through the group’s website: http://bavc.pass.org/

What is the difference between courses offered by Springboard vs datacamp vs dataquest? Which is better?

I am a DataCamp subscriber, a mentor with Springboard, and I have completed the free content on DataQuest, so I am familiar with all three products in some way.

You need two things to have a successful career:

  1. Strong Foundation
  2. Continuous learning

Let’s talk about Continuous Learning first:

In a field that’s as dynamic as data science, you should always be learning! It could be through your projects at your work, side-projects or online resources.

I would categorize both DataCamp and DataQuest under this; they are great platforms for continuous learning. I am a subscriber on DataCamp and it’s a great platform to just dive in, do some hands-on exercises and learn something new. I love it! I have heard equally positive things about DataQuest, so if you are already working in the industry as a Data Scientist and just want to get deeper technically, then go for these platforms!

Now let’s talk about Strong Foundation:

You need a strong foundation to get hired as a Data Scientist. You would typically get that through a relevant college degree. But:

  1. A lot of people don’t have relevant college degrees OR
  2. They graduated a few years back and are looking to do a career transition now OR
  3. They are not willing to go back to do multi-year college programs focused on data science

If that’s the case, then there’s a new approach in the market where you attend “boot camps”. You still need some foundational skills (for example, math, programming and statistics) to be eligible for data science boot camps, and if you have those basic skills then you can go through one of them. There’s a bunch of them out there; just search for “data science boot camps”. Springboard is one of them and I have heard nothing but positive things about them, just like I have about DataCamp & DataQuest. I have personally mentored 6 students so far; all of them were looking for a career transition and had nothing but positive things to say! That’s just my empirical data though, so you should do a trial with them and/or check out the job guarantee in their career track if that is important to you.

Either way, it’s a “Bootcamp” offering, so it has regular mentor calls/check-ins, projects, career coaches and non-technical material like resume tips, giving you a structured approach to everything that you might need to get hired as a data scientist. You can expect intense guided learning over a short period of time. The Bootcamp approach is different from the self-learning, self-paced approach of DataCamp & DataQuest.

PS: I mentor for Springboard and I have a $750 OFF discount code to share if you decide to enroll. Please contact me through the “Let’s Connect!” page on the Insight Extractor blog to get the code. (If you prefer not to use the referral link, just search for “springboard” and sign up there.)

The World is not binary!

I am not saying that you can’t break into Data Science with just DataCamp and DataQuest. You would need to complement them with other resources and put in more effort to cover everything that you may need, but with enough motivation it could be done for sure! How fast you want to break into data science and how much time you can invest in figuring out the right resources are two of the biggest factors in determining whether you need to go through a Bootcamp.

Conclusion:

If you are already working as a data scientist, DataCamp and DataQuest are great for continuous learning! If you are new to this and don’t have a relevant education background then boot camps like Springboard are a great choice.

Hope that helps!

VIEW QUESTION ON QUORA

#PowerBI idea: Enable #PowerQuery Excel add-in for Mac/Apple/iOS

As a data professional, you will invariably end up spending a lot of time on data cleaning & transformation, and often you might be doing that work in Excel. If so, check out Power Query if you haven’t already! It will save you a LOT of time and unlock Jedi powers that you didn’t know you had!

BUT…

If you are using a Mac, and there are a lot of data scientists and data analysts on this platform, then you are unfortunately out of luck! So for the Mac users out there, I shared this feedback on the official Power BI ideas site, where it has 50 comments & 337 votes (as of 6/16/17). If you are one of the Mac users, then I encourage you to check it out and vote! Microsoft does take it seriously and their roadmap is heavily influenced by the ideas site.

URL: https://ideas.powerbi.com/forums/265200-power-bi-ideas/suggestions/7157571-enable-power-query-excel-add-in-for-mac-apple-ios

Power Query Excel Microsoft

 

Rumsfeld on Analytics:

I loved the “Donald Rumsfeld on Analytics” framework shared by Avinash Kaushik in his Strata talk. Even though the talk is from 5 years back, it is still relevant today! As data analysts and data science professionals, we should strive to automate fact-checking and reporting as much as we can, so that we can focus on the good stuff: validating (or invalidating) intuition and exploring unknowns!

Rumsfeld on Analytics

And if you like frameworks to structure your thoughts, you might also like the What-Why-What’s-Next (4W) framework for testing your analytics maturity. This is important because if your organization is not mature, you might get stuck in data puking (reporting/fact-checking) and never get to the good stuff that Avinash talks about in the framework above. So figure out the analytics maturity of your organization and then take steps to help your organization improve.

-Paras

How will bots impact the adoption of data platforms?


If you are a data science professional and haven’t heard about bots, you will soon! Most of the big vendors (Microsoft, Qlik, etc.) have started adding capabilities and have shown signs of serious product investment in this category. So, let’s step back and reflect: how will bots impact the adoption of data platforms, and why should you care?

So, let’s start with this question: What do you need to drive a data-driven culture in an organization? You need to focus on three areas to be successful:

  1. Data (you need to access it from multiple sources, merge/join it, clean it and store it in a central location)
  2. Modeling layer/Algorithm layer (you need to add business logic, transform data and/or add machine learning algorithms to add business value to your data)
  3. Workflow (you need to embed data & insights in the business user’s workflow OR help provide data/insights when they are in their decision-making process)

Over the past few years, there was a really strong push for “self-service”, which was good for data professionals. A data team builds a platform for analysts and business users to self-serve whenever they need data, so instead of focusing on one-off requests, the team can focus on continuously growing the central data platform and help satisfy a lot of requests. This is all great. Any business with more than 50-ish employees should have a self-service platform, and if yours doesn’t, consider building something like that. All the jazz comes after this! Data science, machine learning, predictive modeling etc. are much easier if you have a solid data platform (aka data warehouse, operational data store) in place! Of course, I am talking at a pretty high level and there are nuances and details that we could go into, but self-service was meant for business users and power users to “self-serve” their data needs, which is great!

Now, there is one problem with that! Self-service platforms don’t do a great job at the third piece, which is “workflow”. They are not embedded in every business user’s workflow, and the management team doesn’t always get the insights when they need to make a decision. Think of it this way: since it’s a self-serve platform, users will turn to it to react to business problems and might not have the chance to be proactive. OK, that may seem vague, so let me give you an example.

Let’s take a simple business workflow of a sales professional.

  1. She has a call coming up with one of her key customers since their account is about to expire. So she logs into the CRM (customer relationship management) software to learn about the customer. She looks at some information in the CRM system and then wants to learn about the product usage by that customer over the last 12 months.
  2. She opens a new browser tab and logs into the data platform. It takes about 10 minutes to navigate to the data model/app that has that information. She filters the data to the customer of interest and a chart comes up.
  3. She goes back to the CRM system. She needs something else, so she goes back to the data platform. That searching takes another 10 minutes!

Wasn’t that painful? She had to switch between multiple applications and waste 10 minutes each time just to answer a simple question. Business users will put up with this if the question is critical, but they will ignore your platform if it’s not business-critical.

So to improve your data-driven culture, you need to think about your business users’ workflows and think of ways to integrate data/insights into them. This is probably one of the most underrated things, and it has exponential payoffs!

So how do bots fit into all of this? We talked about how workflows are important, right? To address this, tools have had data alerts and embedded reports features, which work too, but now we have a new thing called “bots”, which enable deeper integration and help you embed data/insights into a business user’s workflow.

Imagine this: in the previous example, instead of logging into the data platform, the business user could just ask a question in one of the chat applications: show me the product usage of customer x. And a chart shows up. Boom! We saved 10 minutes, but more importantly, by removing friction and adding delight, we gained a loyal user who is going to be more data-driven than ever before!
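To make the idea a bit more concrete, here is a minimal Python sketch of the “ask a question, get usage back” step. Everything in it is an assumption for illustration: the `USAGE_BY_CUSTOMER` data, the `answer_question` function and the single question pattern are made up, and a real bot would sit behind a chat platform, query the data platform and return a chart instead of text.

```python
import re

# Hypothetical monthly product usage per customer (illustrative numbers only).
USAGE_BY_CUSTOMER = {
    "acme": [120, 135, 150],
    "globex": [80, 75, 90],
}

# The one question this toy bot understands.
QUESTION_PATTERN = re.compile(
    r"show me the product usage of customer\s+(?P<customer>.+)",
    re.IGNORECASE,
)


def answer_question(message, usage=USAGE_BY_CUSTOMER):
    """Return a text reply for a chat message asking about product usage."""
    match = QUESTION_PATTERN.search(message.strip())
    if match is None:
        return "Sorry, I can only answer product usage questions right now."

    # Normalize the customer name: drop trailing punctuation and lowercase it.
    customer = match.group("customer").strip().rstrip("?.!").lower()
    if customer not in usage:
        return f"No usage data found for customer '{customer}'."

    monthly = ", ".join(str(value) for value in usage[customer])
    return f"Product usage for {customer}: {monthly}"


print(answer_question("show me the product usage of customer Acme"))
```

The point of the sketch is the workflow, not the parsing: the user stays in the chat window, and the lookup that took 10 minutes of navigation becomes a single message.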

This is not fiction! Here’s a slack bot that a vendor built that does what I just talked about:

Product Usage Bots

So to wrap up, I think bots could have a tremendous impact on the adoption of data platforms, as they enable data professionals to work on the third pillar called “workflow” to further empower business users.

And the increase in data consumption is great for both data engineers and data scientists. It’s great for data engineers because people might ask more questions and you might have to integrate more data sources. It’s great for data scientists because if more people ask questions then, over time, they will get to asking bigger and bolder questions and you will be looped into those projects to help solve them.

What do you think? Do you think bots will impact the adoption of data platforms? If so, how? If not, why not? I am looking forward to hearing what you have to say! Please add your comments below.

-Paras Doshi

What are the must-know software skills for a career in data analytics after an MBA?

SQL, Excel & Tableau-like tools are good enough to start. Then add something like R eventually. And then there are tools that are specific to an industry, for example Google Analytics for the tech industry.

Other than that, you should know what to do with these tools. You need to know the following concepts and continuously build upon them as the industry use-cases and needs evolve:

  1. Spreadsheet modeling
  2. Forecasting
  3. Customer Segmentation
  4. Root cause Analysis
  5. Data Visualization and Dashboarding
  6. Customer Lifetime value
  7. A/B testing
  8. Web Analytics

VIEW QUESTION ON QUORA

Is it too late to become a good Data Scientist?

If you’re looking for a career change, that’s never too late!

If you’re looking to learn something new, that’s never too late!

If you’re looking to continue learning and go deeper in data science, that’s never too late!

If you don’t like Software engineering and want to switch to something else, that’s never too late!

But if you are after the “Data Science” gold rush, then you did miss the first wave! You are late.

But seriously, you should apply first-principles thinking to your career strategy and ideally not jump to whatever’s “hot” because by the time you get on that train, it’s usually too late.

VIEW QUESTION ON QUORA

[Resource] 8 Methods to calculate CLV:

There are a lot of ways to apply a CLV (customer lifetime value) model. But I hadn’t seen a single document that summarized all of them, until I saw this: http://srepho.github.io/CLV/CLV

If you are building a CLV model, one of the first things that you might want to figure out is whether you have a contractual or a non-contractual model, and then figure out which methodology would work best for you. Here are the 8 methods that are summarized in the link that I shared above:

Contractual
  • Naive
  • Recency Frequency Monetary (RFM) Summaries
  • Markov Chains
  • Hazard Functions
  • Survival Regression
  • Supervised Machine Learning using Random Forest

Non-Contractual

  • Management Heuristics
  • Distribution Based Approaches
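To make the simplest end of that list concrete, here is a minimal Python sketch of a back-of-the-envelope contractual CLV calculation. It assumes a constant margin per period, a constant retention rate and a constant discount rate. The function name and parameters are my own illustration and are not taken from the linked document, whose “Naive” method may well be defined differently.

```python
def naive_contractual_clv(margin_per_period, retention_rate, discount_rate, periods):
    """Sum the discounted expected margin over a fixed number of periods.

    Assumes the customer pays at the start of each period and survives
    to period t with probability retention_rate ** t.
    """
    clv = 0.0
    for t in range(periods):
        survival_probability = retention_rate ** t
        discount_factor = (1 + discount_rate) ** t
        clv += margin_per_period * survival_probability / discount_factor
    return clv


# Example: $100 margin per month, 80% monthly retention, 1% monthly discount rate.
print(round(naive_contractual_clv(100, 0.8, 0.01, 12), 2))
```

A calculation like this is a useful baseline for sanity-checking the fancier methods: if a survival or machine learning model gives a wildly different number, it is worth understanding why.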

Hope that helps!

As a student preparing for data analyst & science roles, should I generalize or specialize?

This question was posted on the Springboard forum.

Here’s my answer:

It depends on your target industry & where it is in its life cycle.

An industry’s life cycle has four stages: Startup, Growth, Maturity, Decline.

Industry lifecycle

Generalization is great in the earlier stages. If you are targeting jobs at startups, generalize. You should know enough about a lot of things.

T-shaped professionals are great for the Growth stage. They specialize in something but still know enough about a lot of things. E.g. a Sr Growth/Marketing Analyst knows enough about analytics & data science to be dangerous but specializes in marketing.

Specialization is great for mature industries. Specialists know a lot about a few things. E.g. statisticians in the insurance industry, who have made careers out of building risk models.

Any advice for moving into data science from business intelligence?

This was asked on Reddit: Any advice for moving into data science from business intelligence?

Here’s my answer:

I come from a “Business Intelligence” background and currently work as a Sr. Data Scientist. I found that you need two things to transition into data science:

Data Culture: You need a company where the data culture is such that managers/executives ask big questions that need a data science approach to solve. If your end-consumers are still asking a bunch of “what” questions, then your company might NOT be ready for data science. But if your CEO comes to you and says “hey, I got the customer list with the info I asked for but can you help me understand which of these customers might churn next quarter?”, then you have a data science problem at hand. So, try to find companies that have this culture.

Skills: You also need to upgrade your skills to be able to solve data science problems. BI is focused heavily on technology and automation, so you may need to unlearn a few things. For example, automation is not always important, since you might work on problems where a model is needed to predict just a couple of times; trying to automate wouldn’t be optimal in that case. Also, BI relies heavily on tools, but in data science you’ll need deeper domain knowledge & a problem-solving approach along with technical skills.

Also, I personally moved from BI (as a consultant) -> Analytics (as Analytics Manager) -> Data science (Sr Data Scientist) and this has been super helpful for me. I recommend transitioning into Analytics first and then eventually breaking into data science.

Hope that helps!

VIEW THREAD ON REDDIT