Any aspiring data scientist should understand the essential technical skills required for a career in big data analytics. Familiarity with the main concepts of big data, such as planning projects around exponentially growing data and the 5 V’s [14], will go a long way. For a full-fledged data scientist, however, solid technical skills will go even further.
The range of programming languages and skill sets that support big data analytics is growing, and Forbes also provides a breakdown of this trend [1]. While the diversity of big data platforms and languages means there are many technical skills to acquire, it also means there are just as many career paths and specializations. The industries hiring people with big data expertise are predominantly the scientific and information industries, but manufacturing, retail trade, sustainability, and finance together made up over a third of all national big data job postings in 2014 [1].
This variety calls for a diverse set of skills, including programming languages, database management, data collection, and data interpretation, to work with the information in these industries. Figure 5 gives a measurable reflection of this trend.
| Skill | % of Big Data Jobs Mentioning This Skill Set (multiple responses allowed) | % Growth in Demand for This Skill Set Over the Previous Year |
| --- | --- | --- |
| Java | 6.62% | 63.30% |
| Structured query language | 5.86% | 76.00% |
| Apache Hadoop | 5.45% | 49.10% |
| Software development | 4.70% | 60.30% |
| Linux | 4.10% | 76.60% |
| Python | 3.99% | 96.90% |
| NoSQL | 2.74% | 34.60% |
| Data warehousing | 2.73% | 68.80% |
| UNIX | 2.43% | 61.90% |
| Software as a Service | 2.38% | 54.10% |
Figure 5: Diversity of big data technical skills [1]
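As a small illustration of how the figures in Figure 5 can be read, the following Python sketch ranks the listed skills by growth in demand and by share of job postings. The numbers are copied from the table; the names `SKILLS`, `rank_by_growth`, and `rank_by_share` are illustrative choices made for this example and do not come from the cited source.

```python
# Values copied from Figure 5: (skill, % of big data jobs mentioning it,
# % growth in demand over the previous year).
# SKILLS, rank_by_growth and rank_by_share are hypothetical names for this sketch.
SKILLS = [
    ("Java", 6.62, 63.30),
    ("Structured query language", 5.86, 76.00),
    ("Apache Hadoop", 5.45, 49.10),
    ("Software development", 4.70, 60.30),
    ("Linux", 4.10, 76.60),
    ("Python", 3.99, 96.90),
    ("NoSQL", 2.74, 34.60),
    ("Data warehousing", 2.73, 68.80),
    ("UNIX", 2.43, 61.90),
    ("Software as a Service", 2.38, 54.10),
]


def rank_by_growth(skills):
    """Return the rows sorted by growth in demand, highest first."""
    return sorted(skills, key=lambda row: row[2], reverse=True)


def rank_by_share(skills):
    """Return the rows sorted by share of job postings, highest first."""
    return sorted(skills, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    # Print the three fastest-growing skills alongside their share of postings.
    for name, share, growth in rank_by_growth(SKILLS)[:3]:
        print(f"{name}: {growth:.1f}% growth, {share:.2f}% of postings")
```

Read this way, the table shows that the most frequently mentioned skill and the fastest-growing skill are not the same: Java leads on share of postings, while Python, mentioned in under 4% of postings, shows the highest growth in demand.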
Each skill has its own big data application, and professionals can choose to specialize in any one of them [15]. Each application domain, in turn, offers its own career paths. In the following sections, we explore many of the technical career paths a big data scientist can follow.