## A Top-Level View of Data Science and Machine Learning

December 31, 2016

**As a data scientist, you should not limit yourself to the training you have received. Think further ahead: how many variables could drive the system, and which of them really affect it, and at what level?**

Let’s look at three top questions:

__What is a Data Scientist?__

__Why does it appear to be such a hard job?__

__What skill sets does a data scientist need?__

The first thing to keep in mind is big data. Organizations collect a lot of data from various sources (mostly clients and their activities), but that data is huge, and they often have no idea how to organize it. **A data scientist’s job is to find meaning in that data.**

**What kind of skills should one look for in a data scientist?**

For a data scientist role, an organization can hire a PhD, a mathematician, or a statistician, or it can grow a data scientist within the organization.

**What is the fundamental job of a data scientist?**

A data scientist is someone who makes new discoveries from data.

That’s what scientists do: a scientist first forms a hypothesis and then tests it under various conditions. A data scientist does the same thing, just with data!
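As a small illustration of testing a hypothesis with data, here is a minimal sketch of an exact permutation test. The hypothesis ("group A has a higher average than group B"), the function name `exact_permutation_p_value`, and the sample numbers are all assumptions made up for this example; it is one of many possible tests, not a prescribed method.

```python
from itertools import combinations
from statistics import mean


def exact_permutation_p_value(group_a, group_b):
    """One-sided p-value for 'mean(group_a) > mean(group_b)'.

    Enumerates every way of relabeling the pooled values into two groups
    of the original sizes, so it is only practical for small samples.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    observed = mean(group_a) - mean(group_b)
    total = sum(pooled)

    at_least_as_extreme = 0
    splits = 0
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = sum_a / n_a - (total - sum_a) / n_b
        # Small tolerance guards against floating-point rounding.
        if diff >= observed - 1e-12:
            at_least_as_extreme += 1
        splits += 1
    return at_least_as_extreme / splits


# Hypothetical data: purchases per month for two made-up customer groups.
p_value = exact_permutation_p_value([5, 6, 7], [1, 2, 3])
print(p_value)
```

A small p-value suggests the observed difference would be unlikely if the group labels did not matter; with samples this tiny it is only a toy.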

Data scientists look for meaning and knowledge in data, and they do that in a couple of different ways.

**Visualization of Data:**

One way is data visualization. Data scientists visualize data in various forms and look for meaning in it. This is also what a business intelligence or data analyst might do.

**Using Algorithms:**

Advanced algorithms run through the data looking for meaning. Examples include machine learning algorithms such as neural networks, support vector machines (SVMs), regression, and k-means. There are dozens of such algorithms, and they are among the fundamental tools of a data scientist.
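To make one of these concrete, here is a minimal sketch of k-means in plain Python. The naive initialization (taking the first `k` points as starting centroids), the function name `k_means`, and the customer data are assumptions for illustration; real libraries use more careful initialization.

```python
from math import dist


def k_means(points, k, iterations=20):
    """Group points into k clusters by alternating two steps."""
    # Assumption: naive, deterministic start from the first k points.
    centroids = [tuple(p) for p in points[:k]]
    assignments = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged: nothing moved
        centroids = new_centroids
    return centroids, assignments


# Hypothetical customers: (visits per month, average spend).
customers = [(1, 10), (2, 12), (1, 11), (9, 95), (10, 100), (11, 98)]
centroids, labels = k_means(customers, k=2)
print(labels)
```

Here the algorithm separates occasional low spenders from frequent high spenders without being told those groups exist, which is the kind of "meaning" such algorithms look for.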

**So, to use those algorithms, a data scientist must have knowledge of mathematics, statistics, and computer science.**

**So how does a data scientist’s work get started and done?**

A data scientist is given a data set, large or perhaps small, along with a question.

Something like: which customers are likely to return?

Or: what are customers likely to buy on weekends?

Or: how many families buy sweets or fruit during festivals? (You can break this down by the income range of the families.) That is a classic statistics problem.
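The festival question can be sketched as a simple grouped proportion. The survey rows, the income labels, and the function name `purchase_rate_by_income` below are hypothetical, invented only to show the shape of the calculation.

```python
from collections import defaultdict

# Hypothetical survey rows: (income_range, bought_sweets_during_festival).
families = [
    ("low", True), ("low", False), ("low", True), ("low", True),
    ("middle", True), ("middle", True),
    ("high", False), ("high", True),
]


def purchase_rate_by_income(rows):
    """Return the share of families that bought sweets, per income range."""
    totals = defaultdict(int)
    buyers = defaultdict(int)
    for income_range, bought in rows:
        totals[income_range] += 1
        if bought:
            buyers[income_range] += 1
    return {r: buyers[r] / totals[r] for r in totals}


rates = purchase_rate_by_income(families)
print(rates)
```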

Now it is the data scientist’s job to run various algorithms on the data and look for the answer. One simple thing to think about is how or why a specific algorithm would work on the data. With even a basic knowledge of the algorithms, we can identify which algorithm would really answer such a question.

**“So data scientists go through various algorithms until they find some pattern in the data that answers the question.”**
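One way to picture "going through various algorithms" is to fit a few candidate models and keep the one with the lowest error on data it has not seen. The sketch below compares a constant (mean) predictor with a least-squares line; the candidate names, helper functions, and numbers are assumptions for illustration only.

```python
from statistics import mean


def fit_mean(xs, ys):
    """Baseline: always predict the average of the training targets."""
    m = mean(ys)
    return lambda x: m


def fit_line(xs, ys):
    """Simple linear regression fitted by ordinary least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    intercept = y_bar - slope * x_bar
    return lambda x: intercept + slope * x


def mse(model, xs, ys):
    """Mean squared error of a model on the given data."""
    return mean((model(x) - y) ** 2 for x, y in zip(xs, ys))


# Hypothetical data, split into training and held-out parts.
train_x, train_y = [1, 2, 3, 4], [2, 4, 6, 8]
test_x, test_y = [5, 6], [10, 12]

candidates = {"mean": fit_mean, "line": fit_line}
errors = {
    name: mse(fit(train_x, train_y), test_x, test_y)
    for name, fit in candidates.items()
}
best_name = min(errors, key=errors.get)
print(best_name, errors)
```

Knowing why the line fits this data (the target grows steadily with the input) is the kind of basic algorithm knowledge that saves trying every option blindly.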

The same applies to any trading strategy: in our research we have to ask why a specific algorithm would work. Another such question is how a data scientist can improve the recommendations of a recommendation engine.

Netflix ran a competition in which it would pay one million dollars to whoever could improve its recommendation algorithm by 10%.

**A team of data scientists actually came up with an algorithm that did that.**
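The Netflix competition scored entries by root mean squared error (RMSE) on held-out ratings. As a rough sketch of how a relative improvement such as 10% could be measured, the snippet below uses made-up ratings and predictions; the function names `rmse` and `relative_improvement` are illustrative, not part of any Netflix tooling.

```python
from math import sqrt


def rmse(actual, predicted):
    """Root mean squared error between two equal-length rating lists."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared) / len(squared))


def relative_improvement(baseline_rmse, new_rmse):
    """Fraction by which the new error is lower than the baseline error."""
    return (baseline_rmse - new_rmse) / baseline_rmse


# Hypothetical star ratings and two sets of predictions.
actual = [4, 3, 5, 2]
baseline_predictions = [3, 4, 4, 3]        # each off by one star
improved_predictions = [3.5, 3.5, 4.5, 2.5]  # each off by half a star

baseline = rmse(actual, baseline_predictions)
improved = rmse(actual, improved_predictions)
improvement = relative_improvement(baseline, improved)
print(baseline, improved, improvement)
```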

**So again, we can say that data scientists are people who answer questions, using data, or a combination of data and algorithms, to do so.**

When you have a large data set with many categories and columns, you have to rely on various algorithms, so a data scientist should have fundamental knowledge of them.

**What is a data scientist not?**

There are various myths about data scientists. A data scientist is not simply a Java programmer who knows Hadoop. Many people bill themselves as data scientists because they have such technical skills, but they are not data scientists unless they know data-discovery techniques and how algorithms work on that data!

**What is the difference between a data scientist and a data analyst?**

We should also not confuse a data analyst or business analyst with a data scientist. A data analyst is someone who creates reports, graphs, and dashboards based on data. Those reports reflect the analyst’s own judgment about what is “important” to show or consider.

**A data scientist is the one who hypothesizes what is important and then tries to prove that hypothesis.**

It is great for one person to have both programming skills and business domain knowledge, but the most important thing is a fundamental knowledge of algorithms, mathematics, and statistics. That is one reason it is a bit difficult to find a data scientist: the role needs a rare combination of skills.

