Understanding data, starting with the basics

Published on 20 November 2019

Reading time 3 minutes

Electric scooters, electric bicycles, connected watches, the internet, e-commerce, and even social networks: the volume of data available to companies in all sectors has exploded in recent years.


A growing volume of Data

Today, large companies and even SMEs can collect data on their customers, on the market, or on their competitors. Maxime Lahy, a data scientist at Keyrus, defines data as follows: "Data is all kinds of information in all forms. This data can be structured, such as a data table or Excel table, or unstructured, such as a movie or picture."

These datasets, which have become so large that they exceed human intuition and analytical skills, have a name: "Big Data".


Big Data is often described in terms of 5 Vs, an extension of the 3 Vs (volume, velocity, and variety) popularized by the research and consulting firm Gartner:

  • Velocity (reference to the speed at which new data is generated and moves. Just think of the messages on social networks that go viral in a few seconds)
  • Volume (reference to the huge amounts of data generated every second. Just think of all the emails, tweets, photos, videos, and sensor data that we produce and share every second)
  • Variety (reference to the different types of data: tables, databases, photos, web, audio, social, mobile)
  • Veracity (reference to the reliability of the data)
  • Value (it is great to have access to large amounts of data, but it must still be turned into value; otherwise it is useless)

Everyone is concerned

All business sectors are now looking to use the data at their disposal.

However, to be able to use this data, companies must rely on the know-how of highly qualified professionals capable of using analytical technologies: Big Data Architect, Data Engineer, Data Analyst, Data Scientist, and many others.


Understanding the data professions

But what is the difference between all these professionals who know how to use these huge volumes of data?

Patrycja-Ewa Klopotowska, Data and Big Data recruiter, explains the difference between some of these professions.

"In order to better understand who does what, we can rely on this document, which is divided into 3 phases:


STEP 1 - Data Scientist

The profession that falls under step 1 (prospecting & federation, categorization & process setting) is Data Scientist.

This role's tasks include extracting large amounts of data to identify trends, then examining data from multiple sources in order to apply and develop new analytical and statistical methods and machine learning models.

The Data Analyst also designs and creates reports and visualizations of the data to better understand the situation.


STEP 2 - Data Engineer & Data Architect  

The job that falls under step 2 (data engineering) is Data Engineer.

Working with the Data Architect, who designs the architecture of distributed systems and data platforms with scalability and continuous integration in mind, the Data Engineer creates reliable pipelines.


STEP 3 - Data Scientist

Finally, we find the Data Scientist profession in step 3 (Modeling/Implementation). The objective of this role is to design and create reports and visualizations to better understand the situation."


Patrycja-Ewa also adds a description of the Data Analyst profession:

"The Data Analyst resolves data conflicts and creates basic descriptive statistics. They visualize the data and communicate the statistics. They must, therefore, have a basic understanding of statistics."


She also maintains that working in Big Data requires a lot of logic, good analytical skills, curiosity, and good communication skills.
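
To make the Data Analyst tasks mentioned above a little more concrete, here is a minimal Python sketch of what "resolving data conflicts" and "creating basic descriptive statistics" might look like. Everything in it is an assumption made for illustration: the two sources (crm_records and web_records), the customer ages, and the rule that the first source wins when the two disagree are all hypothetical and do not come from the recruiter or from any real dataset.

```python
from statistics import mean, median, stdev

# Hypothetical records from two sources that disagree on one customer's age.
crm_records = [
    {"customer_id": 1, "age": 34},
    {"customer_id": 2, "age": 45},
    {"customer_id": 3, "age": 29},
]
web_records = [
    {"customer_id": 2, "age": 54},  # conflicts with the CRM value for customer 2
    {"customer_id": 4, "age": 41},
]


def merge_records(primary, secondary):
    """Merge two sources by customer_id.

    Assumed conflict rule (for illustration only): when both sources
    contain the same customer, the record from `primary` is kept.
    """
    merged = {record["customer_id"]: record for record in secondary}
    merged.update({record["customer_id"]: record for record in primary})
    return [merged[key] for key in sorted(merged)]


def describe(values):
    """Return a few basic descriptive statistics for a list of numbers."""
    return {
        "count": len(values),
        "mean": mean(values),
        "median": median(values),
        "stdev": stdev(values),  # sample standard deviation
    }


customers = merge_records(crm_records, web_records)
ages = [record["age"] for record in customers]
summary = describe(ages)
print(summary)
```

In practice the conflict rule is a business decision (most recent value, most trusted source, manual review), which is one reason the role calls for communication skills as well as statistics.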

As we have seen, data, or Big Data, is a complex subject at the heart of all companies. You now have the basics. If you want to go further, don't hesitate to watch this 3-minute video, beautifully made by Patrick Wampé, one of our fabulous data trainers!