Apply Data To The Problem And Find Patterns

NOT ALL DATA IS GOOD DATA

Not all data is good data; it is often corrupt or missing. Common problems include missing data for certain periods, incorrectly entered values, and time zone differences that were not taken into account when the data was collected.
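As an illustration of two of the problems above, here is a minimal Python sketch that converts timestamps to UTC and then looks for missing periods. The function names `to_utc` and `find_gaps`, the hourly sampling interval, and the sample readings are all assumptions made for this example. The sketch also assumes that each source's UTC offset is known and fixed.

```python
from datetime import datetime, timedelta, timezone


def to_utc(naive_local, utc_offset_hours):
    """Attach a fixed UTC offset to a naive timestamp and convert it to UTC."""
    local_zone = timezone(timedelta(hours=utc_offset_hours))
    return naive_local.replace(tzinfo=local_zone).astimezone(timezone.utc)


def find_gaps(timestamps, expected_step):
    """Return (earlier, later) pairs that are further apart than expected_step."""
    ordered = sorted(timestamps)
    return [
        (earlier, later)
        for earlier, later in zip(ordered, ordered[1:])
        if later - earlier > expected_step
    ]


# Hypothetical hourly readings, already in UTC; hours 3 and 4 are missing.
site_a = [datetime(2024, 1, 1, hour) for hour in (0, 1, 2, 5, 6)]
utc_a = [to_utc(t, 0) for t in site_a]
gaps = find_gaps(utc_a, timedelta(hours=1))

# A hypothetical reading recorded in local time at a site that is on UTC+2.
site_b_utc = to_utc(datetime(2024, 1, 1, 9, 0), 2)

for earlier, later in gaps:
    print(f"No data between {earlier.isoformat()} and {later.isoformat()}")
print(f"Site B reading in UTC: {site_b_utc.isoformat()}")
```

Converting everything to one time zone before looking for gaps matters: otherwise an offset between two sources can look like a missing period or an overlap.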

Here you have to use your intuition, along with many plots, to clean the data. This means finding data that is missing or incorrect and then replacing it with the correct values, removing it from the sample, or setting it to a representative mean. Forming a hypothesis about why data is missing, and finding people who can explain it, can also help fill the gaps. All of this is so that the models that are meant to capture the problem get as accurate a picture of it as possible.
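The three cleaning options mentioned above (replace with the correct value, remove, or set to a representative mean) can be sketched in plain Python. This is only an illustration: the helper names, the `-999.0` sentinel for a bad entry, and the sample values are assumptions, and a real project would more likely use a data-frame library.

```python
from statistics import mean


def apply_corrections(values, corrections):
    """Replace values at known positions with verified correct values."""
    return [corrections.get(i, v) for i, v in enumerate(values)]


def drop_missing(values):
    """Remove missing entries (None) from the sample."""
    return [v for v in values if v is not None]


def fill_with_mean(values):
    """Set missing entries (None) to the mean of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot compute a mean without observed values")
    fill = mean(observed)
    return [fill if v is None else v for v in values]


# Hypothetical raw series: None is a missing reading, and -999.0 is assumed
# to be a placeholder that was entered by mistake.
raw = [10.0, None, 12.0, -999.0, 14.0]

# Treat the incorrect entry as missing before choosing a strategy.
marked = [None if v == -999.0 else v for v in raw]

corrected = apply_corrections(marked, {1: 11.0})  # someone supplied the true value
dropped = drop_missing(marked)
filled = fill_with_mean(marked)

print(corrected)
print(dropped)
print(filled)
```

Which option is appropriate depends on why the data is missing. Filling with a mean keeps the sample size but flattens variation, while dropping rows can bias the sample if the gaps are not random.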

When the data is clean enough, you can start experimenting. You can spend almost unlimited time here, so it is important to prioritize. The goal is to find models that reveal interesting patterns in the data.