Artificial Intelligence | Lesson 1.5
Data Science
AI comes with a complicated vocabulary. Next, we explain what these terms mean so that you can discuss the concepts confidently and accurately with others.
In the last section, we talked about machine learning (ML) and its related terminology. One buzzword you have probably heard is still missing: data science.
Data science (DS) is a broad, multidisciplinary field that uses scientific methods and processes to analyze data and extract insights. In practice, when people refer to a data science project, they often mean extracting insights from data using AI methods and delivering them as a presentation or report.
Machine Learning vs. Data Science Examples
To elaborate, imagine you have a housing dataset with the size of each house, the number of bedrooms, and whatever other information you have managed to collect. If you want to build a house price predictor mobile app, the price will be your output and all the other information will be your input. This is a machine learning system: an application of AI where the goal is to use the output of the model directly. The result of an ML project is a model that maps inputs to outputs. The model is applied to individual instances, and as long as its performance is decent, no further intervention is necessary.
In contrast, the goal of a data science project is to use AI to gain insight. For example, imagine you want to invest in real estate and your budget is limited. You want to know what type of house will generate the most profit. A data scientist will use AI to find which features increase the price of a house the most. In a DS project, ML models may be used, but they are not the final goal. The goal is to use AI to turn information into insight.
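The difference can be sketched in a few lines of Python. This is only an illustration: the five houses below are made-up numbers, and the helper names (`fit_line`, `predict_price`, `correlation`) are hypothetical. A one-feature least-squares line stands in for a real model. The first half uses the model's output directly (the ML view); the second half uses the same data to ask which feature is most strongly related to price (the DS view).

```python
from math import sqrt
from statistics import mean

# Hypothetical toy data: (size in square meters, bedrooms, price in thousands).
houses = [
    (50, 1, 150),
    (70, 2, 200),
    (90, 2, 250),
    (110, 3, 310),
    (130, 4, 360),
]
sizes = [h[0] for h in houses]
bedrooms = [h[1] for h in houses]
prices = [h[2] for h in houses]


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def correlation(xs, ys):
    """Pearson correlation coefficient between two equally long lists."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# ML view: the trained model is the product; its prediction is used directly.
slope, intercept = fit_line(sizes, prices)


def predict_price(size):
    """Predicted price (in thousands) for a house of the given size."""
    return slope * size + intercept


print("Predicted price for 100 square meters:", predict_price(100))

# DS view: the same data is used to produce an insight for a human decision.
insight = {
    "size": correlation(sizes, prices),
    "bedrooms": correlation(bedrooms, prices),
}
strongest = max(insight, key=insight.get)
print("Feature most strongly related to price:", strongest)
```

In the app scenario, only `predict_price` would be shipped. In the investment scenario, the deliverable would be a statement about `insight`, with the usual caveat that a correlation on a small sample suggests where to look and does not prove that a feature causes higher prices.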
To summarize, in an AI project, AI is used directly to turn your input into your desired output. In a DS project, by contrast, an AI model is used to gain insight. The organization can turn this insight into knowledge, and that knowledge can in turn be used to gain wisdom. The Data, Information, Knowledge, Wisdom pyramid (DIKW pyramid) illustrates this process. As you move up the pyramid, more value is added and decision-making becomes easier. Figure 1.3 illustrates the pyramid, and Table 1.3 compares AI and data science.
Figure 1.3 | DIKW Pyramid
Artificial Intelligence | Data Science |
---|---|
Automates tasks or predicts future events based on data | Produces insights based on data |
Is commonly used “live”: it continuously processes new data and produces answers | Is commonly “one-off”: it produces some insights that inform decisions |
Commonly has the form of software | Commonly has the form of a presentation or report |
Data Analysis
Data analysis is the process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information to inform conclusions, support decision-making, and tell a story. The following video walks through the five steps of the data analysis process:
1. Defining the objective (problem statement question)
2. Collecting the data (quantitative or qualitative)
3. Cleaning the data (remove errors, duplicates, and unwanted data points)
4. Analyzing the data (techniques used depend on the goals – Descriptive, Diagnostic, Predictive, and Prescriptive)
5. Sharing the results (reports, dashboards, and interactive visualizations can be used while highlighting any gaps in data)
Video 1.6 | A Beginners Guide To The Data Analysis Process by CareerFoundry
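As a rough sketch of how the five steps listed above might look in code, the example below runs a tiny descriptive analysis using only the Python standard library. Everything in it is an assumption made for illustration: the objective, the records in `raw`, and the helper names `clean`, `describe`, and `report` do not come from the video.

```python
from statistics import mean, median

# Step 1. Define the objective (hypothetical):
#         "What is the typical house price in our sample?"

# Step 2. Collect the data. These records are made up; None marks a missing value.
raw = [
    {"id": 1, "size": 50, "price": 150},
    {"id": 2, "size": 70, "price": 200},
    {"id": 2, "size": 70, "price": 200},   # duplicate record
    {"id": 3, "size": 90, "price": None},  # missing price
    {"id": 4, "size": 110, "price": 310},
]


def clean(records):
    """Step 3. Drop duplicate ids and records with a missing price."""
    seen = set()
    cleaned = []
    for record in records:
        if record["id"] in seen or record["price"] is None:
            continue
        seen.add(record["id"])
        cleaned.append(record)
    return cleaned


def describe(records):
    """Step 4. Descriptive analysis: count, mean, and median price."""
    prices = [record["price"] for record in records]
    return {"count": len(prices), "mean": mean(prices), "median": median(prices)}


def report(summary, dropped):
    """Step 5. Share the result, including a note on gaps in the data."""
    return (
        f"Houses analyzed: {summary['count']} "
        f"(mean price {summary['mean']}, median price {summary['median']}). "
        f"Records dropped during cleaning: {dropped}."
    )


cleaned = clean(raw)
summary = describe(cleaned)
print(report(summary, dropped=len(raw) - len(cleaned)))
```

This covers only the descriptive kind of analysis. Diagnostic, predictive, or prescriptive questions would need different techniques in step 4, while the other steps would keep roughly the same shape.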