O's Notes

Data Science: What is it?

Photo by Joshua Sortino on Unsplash

Data science is the new buzzword and a red-hot career label that seems to be mushrooming everywhere, across career sites and social media platforms. Every job search site is filled with hundreds of listings for data scientists, data analysts, machine learning engineers, etc. There are thousands of questions posted across Quora, Reddit and other platforms asking for advice about how to become a data scientist. Many bootcamps advertise 2-6 month programs for becoming a data scientist. Colleges are responding to the interest by offering bachelor’s and master’s programs in data science. This post is for high school and college students looking for an introduction to the data science ecosystem.

What is data science?

Data science is an interdisciplinary subject that combines aspects of mathematics, statistics and computer science. A data scientist should be able to draw inferences and conclusions from data on a real-world topic using a variety of computational and statistical techniques, and present those conclusions in an efficient and relatable format. As the world has become data intensive, most companies are looking for data-based insights and making business decisions based on the results of their data science explorations.

Data Science in the workplace

Data science in the workplace is almost a catch-all label. No single job quite defines a data science job. While it may seem that all data science positions are similar, in reality data science is an umbrella term with many different tracks within it.

A data science job could be any of the following or even a mixture of the following broadly defined career tracks: data engineer, data or business analyst, data scientist, and machine learning engineer. The tracks are not set in stone, but can be fluid depending on whether we are analyzing the postings from a large company, small startup, academic research, etc. However, in general, the career tracks are far more sharply defined in a larger organization as compared to a smaller organization where the job can be a mix of various responsibilities.

Data Science Tracks

Each of the data science career tracks needs a slightly different mix, breadth and depth of knowledge. Data science careers are still evolving, and over time these tracks are likely to become more clearly defined. Let’s take a look at some of the biggest data science career tracks based on the number and the description of current job listings.

Data Analyst: A data analyst at a company collects, interprets and analyzes data and offers suggestions that impact business decisions. As data can be spread across various databases and sources, a data analyst often spends a fair bit of time cleaning the data. Using cleaned, relevant data, a data analyst applies statistical methods to understand patterns and trends. Data analysts also prepare reports and dashboards showing their analysis. Data analysts need knowledge of programming, computer science and statistics, and should know tools such as R, Python and SQL. Data analyst positions can be entry level positions for students with a bachelor’s or a master’s degree in computer science, statistics, math, engineering, econometrics, economics or any other quantitative discipline. Students with a bachelor’s in humanities can also become data analysts if they have a computer science or statistics minor along with some relevant programming experience.
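
To make the clean-then-analyze workflow a little more concrete, here is a minimal Python sketch. Everything in it is hypothetical and made up for illustration: the sales records, the field names, and the helper functions `clean` and `average_sales_by_region`. A real analyst would more likely pull the data from a database and use dedicated analysis tools, so treat this only as a rough picture of the idea.

```python
from statistics import mean

# Hypothetical raw sales records, as they might arrive from different sources.
raw_records = [
    {"region": " North ", "month": "2023-01", "sales": "1200"},
    {"region": "north", "month": "2023-02", "sales": "1350"},
    {"region": "South", "month": "2023-01", "sales": ""},  # missing value
    {"region": "south", "month": "2023-02", "sales": "980"},
    {"region": "NORTH", "month": "2023-03", "sales": "1500"},
    {"region": "South", "month": "2023-03", "sales": "1100"},
]


def clean(records):
    """Normalize region names and drop rows with a missing sales figure."""
    cleaned = []
    for row in records:
        sales = row["sales"].strip()
        if not sales:
            continue  # skip rows with no sales figure
        cleaned.append({
            "region": row["region"].strip().lower(),
            "month": row["month"],
            "sales": float(sales),
        })
    return cleaned


def average_sales_by_region(records):
    """Group cleaned rows by region and average the sales in each group."""
    by_region = {}
    for row in records:
        by_region.setdefault(row["region"], []).append(row["sales"])
    return {region: mean(values) for region, values in by_region.items()}


cleaned_records = clean(raw_records)
summary = average_sales_by_region(cleaned_records)
for region, avg in sorted(summary.items()):
    print(f"{region}: average monthly sales = {avg:.1f}")
```

With this made-up data, one row is dropped for its missing value, the three spellings of "north" collapse into one region, and the summary shows an average of 1350.0 for north and 1040.0 for south. That summary is the kind of number that would end up in a report or dashboard.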

Data Engineer: A data engineer deals with the nuts and bolts of data management: transferring data, building data pipelines, maintaining the architecture (databases, servers, etc.) and ensuring data reliability. A data engineer is often someone with a background in computer science and/or IT (information technology). It is an entry level position for students with a bachelor’s in computer science or IT, as well as for students with a master’s degree.

Data Scientist: Data scientist is not an entry level position. Most data scientist roles are for candidates with a master’s or Ph.D. in a quantitative field, or for experienced data analysts after a few years of working. The position requires deeper and broader technical expertise in computer science and statistical modeling. A data scientist is also required to offer actionable business insights based on the analysis and modeling of the data. Data scientists are often tasked with predicting trends that have a significant business impact, understanding customer behavior and patterns, or even suggesting new product opportunities. A data scientist is expected to understand not only the technical aspect of the job but also the business side, which is why it is an advanced position.

Machine Learning Engineer: A machine learning engineer is similar to a data scientist in that both work with data and models. While a data scientist covers a large spectrum of duties, an ML engineer focuses on the models and algorithms, automating models to be more self-learning and predictive, and deploying the models in production. Machine learning also overlaps with deep learning and artificial intelligence (AI) techniques such as neural networks, speech-to-text recognition and chatbots. For example, machine learning algorithms are used in recommendation systems such as those of Netflix, Amazon and YouTube: the system learns about your preferences from your choices and uses that learning to offer more personalized suggestions, so it “self learns”. Machine learning engineering is often not an entry level position after a bachelor’s, though that could be changing as more of this material is taught in undergraduate computer science programs. Most ML engineers have a degree in computer science.
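
To give a feel for what "learning from your choices" can mean, here is a toy Python sketch of a recommender that counts which items tend to be chosen together. It is an illustration only and not how Netflix, Amazon or YouTube actually build their systems. The class name `CoOccurrenceRecommender`, the users and the titles are all hypothetical.

```python
from collections import Counter, defaultdict


class CoOccurrenceRecommender:
    """Toy recommender: suggests items often chosen alongside a user's items."""

    def __init__(self):
        self.co_counts = defaultdict(Counter)  # item -> counts of items chosen with it
        self.history = defaultdict(set)        # user -> items that user has chosen

    def record_choice(self, user, item):
        """Update the counts each time a user picks something new."""
        if item in self.history[user]:
            return  # already recorded; avoid double counting
        for previous in self.history[user]:
            self.co_counts[previous][item] += 1
            self.co_counts[item][previous] += 1
        self.history[user].add(item)

    def recommend(self, user, k=2):
        """Return up to k unseen items that co-occur most with the user's items."""
        scores = Counter()
        for item in self.history[user]:
            scores.update(self.co_counts[item])
        for seen in self.history[user]:
            scores.pop(seen, None)  # never recommend what the user already chose
        return [item for item, _ in scores.most_common(k)]


rec = CoOccurrenceRecommender()
choices = [
    ("ana", "space documentary"), ("ana", "sci-fi series"),
    ("ben", "space documentary"), ("ben", "sci-fi series"), ("ben", "robot movie"),
    ("cy", "space documentary"),
]
for user, item in choices:
    rec.record_choice(user, item)

print(rec.recommend("cy"))  # ['sci-fi series', 'robot movie']
```

Every new choice updates the counts, so the suggestions shift as more people use the system. Real recommendation systems rely on far more sophisticated models, but the loop of observing choices, updating and suggesting is the same basic idea.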

Business Analyst: A business analyst uses statistical modeling to analyze the data and then presents the insights through reports, dashboards, and other visualization techniques for business purposes. Business analysts need to have knowledge of programming and statistics. Many companies prefer that their business analysts know tools such as R, Python and SQL, while business analysts at other companies use advanced Excel skills to analyze data and build their reports. This would be an entry level position for a student with a bachelor’s degree in data science, statistics, business analytics or math.

The above tracks and their variations will also be influenced partly by whether the position is in software engineering, finance, government, public policy, advertising or health care. Some of these domains may have requirements beyond proficiency in “data science”.

Data Science and High School Students

For a high school student, it is neither possible nor advisable to figure out, at the outset of college, a “final” destination in data science. We recommend that students who know they are interested in solving data science problems get a degree in any of the following majors: data science, computer science, statistics, math, physics, engineering or economics. Students who are more interested in business analytics can get a degree in business analytics or in business and statistics. Read our post on data science major in college to learn about the critical subjects that are taught as part of the major and the depth and breadth of an undergraduate data science program.

Data Science and College Students

College students will develop a better idea over their four years of the direction of their data science degree and what they would like to focus on. There is a growing trend of students transferring to newly minted data science programs or tracks, or doing a dual major or a major/minor combination, to be competitive for data science positions. Our post on data science major in college will answer many of the questions that you might have about learning computer science or math as part of your data science degree.