As a data analyst for the CEO in an e-commerce company, what kind of reports are expected of me?

Someone asked this on Quora and here’s my reply:

As a data analyst, you should meet with the CEO (or other decision makers) quarterly (or more often if possible) to do two things: #1, learn about the company’s strategic objectives and initiatives, and #2, figure out together how analytics could help those initiatives.

So why is learning about strategic initiatives from the executives important?

  1. Because analytics could be applied to a lot of problems, but you and your team probably have limited bandwidth.
  2. Also, executives want to stay focused on what’s important right now, so if your priorities align with theirs, you are much more likely to succeed in the role.

Let’s take an example:

Scenario 1: As a data analyst, you create a bunch of reports from, let’s say, Google Analytics and throw them at the CEO! They have everything: visitor stats, acquisition stats, retention stats, behavior stats and conversion stats, among others! By doing so, executives might get what they asked for, but they still have to go through the reports, map them back to their strategic initiatives and figure out the recommendations on their own. Executives might not have the time to do this and may miss critical insights.

Scenario 2: You know that one of the strategic initiatives for the quarter is to improve the conversion rate from landing pages to the order-complete page from 1.25% to 1.40%. So the analysis you send to the executive is not only focused on just that but also includes “recommendations”, for example: there seems to be a significant drop-off after customers learn about the shipping cost. The executive can then use those recommendations to drive actions. There’s another benefit too: any ad-hoc data request that doesn’t align with the strategic objectives can be postponed (or de-prioritized), which lets you focus on what’s most important for the company.
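To make scenario 2 a bit more concrete, here is a minimal Python sketch of the kind of funnel analysis that could surface a drop-off like the shipping-cost one. The step names and visitor counts are made up for illustration (chosen so the overall rate matches the 1.25% above); in practice they would come from your analytics tool.

```python
# Hypothetical funnel counts for one quarter (illustrative numbers only).
funnel = [
    ("landing_page", 100_000),
    ("product_page", 40_000),
    ("cart", 10_000),
    ("shipping_cost_shown", 6_000),
    ("order_complete", 1_250),
]


def step_drop_offs(steps):
    """Return (from_step, to_step, drop_off_rate) for each consecutive pair."""
    results = []
    for (name_a, count_a), (name_b, count_b) in zip(steps, steps[1:]):
        results.append((name_a, name_b, 1 - count_b / count_a))
    return results


# Overall conversion: visitors who completed an order / visitors who landed.
overall_rate = funnel[-1][1] / funnel[0][1]
drop_offs = step_drop_offs(funnel)
# The step that loses the largest share of the visitors who reached it.
worst = max(drop_offs, key=lambda row: row[2])

print(f"Overall conversion: {overall_rate:.2%}")
for name_a, name_b, rate in drop_offs:
    print(f"{name_a} -> {name_b}: {rate:.1%} drop-off")
print(f"Biggest drop-off: {worst[0]} -> {worst[1]}")
```

A table like this only points at where to look. The recommendation that goes to the executive still has to come from you.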

I prefer scenario #2, and I try to create this culture wherever I work. Executives should be open to sharing strategic initiatives at a high level with everyone in the company to help align everyone’s priorities.

Note: This doesn’t mean that you don’t create reports. You still do that for broader consumption, especially for the key performance indicators (KPIs) that are critical for success. But you should automate most of that reporting and focus on data analysis and on finding recommendations that executives can act on.

How to remove line feeds (LF) and carriage returns (CR) from a text field in SQL Server?

I was doing some data cleaning the other day and ran into text fields containing line feeds (LF) and carriage returns (CR), which create a lot of issues when you import or export data. I had run into this problem before and didn’t remember what I did back then, so I am putting the solution here so it can be referenced later if need be.

To solve this, you need to remove LF, CR and any combination of the two. Here’s the T-SQL syntax for SQL Server to do so:

-- CHAR(10) = line feed (LF), CHAR(13) = carriage return (CR); each one is replaced with a space
SELECT REPLACE(REPLACE(@YourFieldName, CHAR(10), ' '), CHAR(13), ' ')

If you’re using some other database system, you need to figure out how to identify CR and LF characters there. In SQL Server, the CHAR() function does that (CHAR(10) is LF and CHAR(13) is CR), and there should be something similar in the database system that you’re using.
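If the cleanup has to happen outside the database, for example on an exported file, the same idea can be sketched in Python. This is only an illustration: the function name `strip_line_breaks` is my own, and unlike the T-SQL above it treats a CR+LF pair as a single line break, so it produces one space instead of two.

```python
def strip_line_breaks(text, replacement=" "):
    """Replace CR, LF and CR+LF sequences in `text` with `replacement`."""
    # Handle CR+LF first so a Windows line ending becomes one replacement, not two.
    return (
        text.replace("\r\n", replacement)
        .replace("\n", replacement)
        .replace("\r", replacement)
    )


print(strip_line_breaks("line one\r\nline two\nline three\rline four"))
```

Pass `replacement=""` to drop the line breaks entirely rather than turning them into spaces.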

How do I pursue a career in data warehousing?

Someone asked this on Quora, and here’s my reply:

In the data world there are two broad sets of jobs available:

  1. Engineering-oriented: data engineers, data warehousing specialists, big data engineers, business intelligence engineers. All of these roles are focused on building the data pipeline, using code and tools to get the data into some centralized location.
  2. Business-oriented: data analysts, data scientists. All of these roles involve using data (from those centralized sources) to help business leaders make better decisions.*

*Smaller companies (or startups) tend to have roles where a small team (or just one person) does it all, so the distinction is not that apparent.

So, it seems like you are interested in engineering-oriented roles, the ones focused on building data pipelines. Since you are starting out, I would suggest that you broaden your scope and learn about other tools as well. While data warehousing is still relevant and will be, in some form or another, for the next few years, the industry (especially tech companies) has been slowly moving towards big data technologies, and you need to be able to adapt to these changes. So learn about data warehousing, maybe get a job or internship as an ETL/BI engineer, but keep an eye on other data engineering tools like the Hadoop ecosystem, Spark, Python, etc.

How Marketable is R programming?

Someone asked this on Quora: How Marketable is R programming?

Answer:

Let’s step back!

Why do you want to learn R? OR why do people learn R?

To solve problems that R can address. Right?

What problems do you have? OR what problems does your COMPANY have? OR what PROBLEMS does the dream company that you want to join have?

<< LIST THEM DOWN HERE>>

example:

  • I want to predict customers that are going to churn next quarter.
  • I want to identify the marketing channel that drove revenue growth last quarter.
  • etc.

What’s Next?

NOW, take all of these problems and find ways to solve them.

R may or may not help.

You could just do it in Excel. Then do that.

OR R helps you a little bit in the process but you need something else.

In some cases, R is a perfect solution! Like building a model to predict customer churn!

So, What?

You see, learning R is important, and you might get a job by showing that you have “R” chops, but that will not be enough for career growth. You should be focused on learning to solve business problems using data. Use R sometimes. Use Excel sometimes. Use Python sometimes. Use SQL. Use Tableau. Use << INSERT A TOOL HERE>>. Learn them. Apply them. Figure out their strengths and weaknesses. BUT learn to use all of these technology platforms to solve problems! Solve problems that are thorny. Solve problems that move the business needle. Solve problems that get your boss’s boss promoted.

If you do that, marketing your skills wouldn’t be a concern anymore.

It’s NOT easy. And it WILL take time.

TL;DR: Go for it! Learn R! But more importantly, learn to solve problems with data.

How do you generally detect a fraud using analytics?

There are two broad families of algorithms that can help you detect fraud: 1) classification (supervised) and 2) clustering (unsupervised).

Now, it’s a fair assumption that fraud is pretty rare and shows up as an outlier in your data. In other words, it’s an anomaly, and the process of identifying such anomalies is called anomaly detection.

So under classification, there are algorithms that specialize in “anomaly” detection, like one-class SVM and PCA-based anomaly detection. Try them out on your dataset and see whether they are able to capture the “anomalies” in it. While you are at it, don’t discount traditional classification algorithms either; they may be useful as well. You will have to train these algorithms on known examples (labeled fraud and non-fraud cases for traditional classifiers, mostly normal cases for the one-class methods), and that’s why they are grouped under “supervised” here.

There is an alternative approach: unsupervised algorithms, called “clustering” techniques. You could try something as simple as K-means or something more sophisticated. I haven’t used clustering much for solving fraud problems and have usually relied on anomaly detection algorithms instead, but I am throwing it out there to make sure you know all the options! I can see these algorithms being applied to exploratory analysis, where you are just exploring your data to find outliers to study.
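As a rough illustration of that exploratory use, here is a small, self-contained Python sketch: a bare-bones K-means followed by a ranking of points by distance from their own cluster centre. Everything in it is an assumption for the sake of the example: the nine "transactions" are made-up, already-scaled feature pairs, the starting centroids are picked by hand to keep the run deterministic, and the helper names are mine. On real data you would more likely reach for a library implementation.

```python
import math


def kmeans(points, initial_centroids, iterations=10):
    """Very small k-means for 2-D points; returns (centroids, labels)."""
    centroids = list(initial_centroids)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, labels


def farthest_points(points, centroids, labels, top_n=1):
    """Indices of the points farthest from their own cluster centre."""
    distances = [math.dist(p, centroids[label]) for p, label in zip(points, labels)]
    ranked = sorted(range(len(points)), key=lambda i: distances[i], reverse=True)
    return ranked[:top_n]


# Made-up, already-scaled features for nine transactions.
transactions = [
    (1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (1.1, 1.2),
    (5.0, 5.0), (5.2, 4.9), (4.8, 5.1), (5.1, 5.2),
    (9.0, 1.0),  # sits far away from both groups
]

centroids, labels = kmeans(transactions, initial_centroids=[(1.0, 1.0), (5.0, 5.0)])
suspects = farthest_points(transactions, centroids, labels, top_n=1)
print("Transactions worth a closer look:", suspects)
```

The flagged indices are candidates to study, not verdicts. K-means also lets an extreme point drag its centroid towards itself, which is one reason dedicated anomaly detection algorithms are often a better fit.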

Hope that helps!

What’s a good chart making software that can pull online data?

So essentially you want to build a *live* chart that pulls data from some online data-store (which changes often).

To do that you can do one of three things:

  1. See if they have an API that you can use — if so, you should be able to use that. If not, continue reading…
  2. Build a web scraper on your own. There are tutorials out there that would help you do so in the language of your choice.
  3. Use a software service like Import.io or Web Scraper, or find something else. I have used Import.io and was able to build an API using their service, which I used as the data store for my “charts”.

Side note: just make sure you are not violating any terms by scraping the website.
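If the site does offer an API (option 1), the plumbing can be quite small. Here is a minimal Python sketch, assuming the API returns a JSON list of records. The URL in the comment, the field names `date` and `price`, and the sample payload are all hypothetical, so swap in whatever your data source actually provides.

```python
import json
from urllib.request import urlopen


def fetch_json(url, timeout=10):
    """Download and decode a JSON document from `url`."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def to_series(records, x_key, y_key):
    """Turn a list of dicts into two parallel lists that a charting tool can plot."""
    rows = sorted(records, key=lambda record: record[x_key])
    return [row[x_key] for row in rows], [float(row[y_key]) for row in rows]


# In a live chart you would call something like:
#   records = fetch_json("https://example.com/api/prices")  # hypothetical URL
# Here a hard-coded sample payload stands in for the API response.
records = json.loads(
    '[{"date": "2016-01-02", "price": "10.5"},'
    ' {"date": "2016-01-01", "price": "9.75"}]'
)
dates, prices = to_series(records, "date", "price")
print(dates, prices)
```

Re-running the fetch on a schedule and feeding `dates` and `prices` to your charting tool is what keeps the chart "live".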

Can I be a data analyst at a tech company without a degree in computer science?

Yes. A CS degree is not a must-have to work as a data analyst. In fact, a lot of people come from a non-CS background and succeed in this role!

Let’s look at the pros and cons of having a computer science (CS) degree and this should help you evaluate where you fall:

Pros of having a CS-degree:

  • If the data analyst position requires a CS degree, then you qualify! Fortunately, this is not that common; usually the posting says a bachelor’s in CS, business administration or a related field is required, so as long as you have a bachelor’s degree for positions that require one, you should be fine.
  • You might already have the basic tech skills that are needed for data analysis jobs, and the CS degree can be used to validate that.
  • You can pick up new tech concepts and tools fast(er). With a CS background, it’s easier to pick up new concepts and tools, and you need to do that continuously to stay relevant.

Cons of having a CS-degree:

  • Not enough business problem-solving experience and/or a lack of depth in business knowledge. So if you have a degree in business, you come out ahead, especially if your background aligns with the role. For example, if you focused on marketing in your bachelor’s and the role is focused on marketing analytics, then you might have an edge.
  • I have a CS degree and followed it up with a master’s from a “business school”, so this is just based on my experience, but a few CS students (without real-world experience) are inclined to focus on “automation” and the “bleeding edge” instead of on what the problem needs. A lot of data analysis doesn’t need to be automated, or shouldn’t be, and not every company needs <<insert the latest tech trend here: big data, deep learning>>, but CS students tend to reach for those things because that’s what they feel most comfortable with. So while this doesn’t stop them from getting the job, it can impede their growth as a data analyst within the org.

Conclusion:

So, as you can see, even if you don’t have a CS degree, you can still find roles that align with your other skills. In fact, you might come out ahead if you can prove that you have the basic quantitative and tech skills needed to get the job done.

Related: Paras Doshi’s answer to How do I prepare myself for a career in Data Analysis?
