Introduction to Data Science
At the link below, you will find a dataset related to “online business sales 2017-2019”.
You should analyze the dataset in order to identify possible patterns and apply an appropriate data mining algorithm (classification, regression, etc.) to build a model that can predict the target variable.
Therefore, you have to propose a data mining model, carrying out all the analysis and calculations needed to generate it.
· From the analysis of the dataset, you should determine the target variable
· You should determine the appropriate algorithm
· You should carry out the calculations you consider necessary to justify the validity of the model
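As an illustration only, the three steps above (choosing a target variable, choosing an algorithm, and justifying validity) could be sketched as follows. This is a minimal sketch, assuming the target turns out to be numeric so that regression applies. The data is synthetic, and the names ad_spend and revenue are hypothetical and are not taken from the assignment dataset. Validity is checked here with R² on held-out rows; other metrics may suit your model better.

```python
import random


def fit_simple_linear_regression(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(actual, predicted):
    """Share of the variance in `actual` explained by `predicted`."""
    mean_y = sum(actual) / len(actual)
    ss_res = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in actual)
    return 1 - ss_res / ss_tot


# Synthetic stand-in data (hypothetical columns, NOT the real dataset).
rng = random.Random(0)
ad_spend = [rng.uniform(0, 100) for _ in range(200)]         # predictor
revenue = [50 + 3 * x + rng.gauss(0, 10) for x in ad_spend]  # target variable

# Hold out the last 25% of rows so the model is judged on unseen data.
split = int(0.75 * len(ad_spend))
train_x, test_x = ad_spend[:split], ad_spend[split:]
train_y, test_y = revenue[:split], revenue[split:]

# Fit on the training rows only, then evaluate on the held-out rows.
intercept, slope = fit_simple_linear_regression(train_x, train_y)
test_predictions = [intercept + slope * x for x in test_x]
test_r2 = r_squared(test_y, test_predictions)

print(f"intercept={intercept:.2f} slope={slope:.2f} test R^2={test_r2:.3f}")
```

If the target variable you identify is categorical instead, the same structure applies with a classification algorithm and a metric such as accuracy in place of R².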
Details of the task
· Individual
· The expected contents should include a description of the initial context, a description of the dataset, the goal of the data science project, and a description of the generated model, with the associated calculations.
· The deliverable is:
o a presentation with two parts
§ part 1, a presentation including all the slides you consider necessary, presenting the analysis, the generated model and the calculations
§ part 2, additional slides with a reflection that focuses on how the results could be interpreted and includes your conclusions about what you have learned regarding data-analytic thinking
o the deliverable must be a single pdf file containing both parts
Formalities:
· Word count: between 1,500 and 2,000 words
· Cover, Table of Contents, References and Appendix are excluded from the total word count.
· Font: Arial 12.5 pt.
· Text alignment: Justified.
· The in-text references and the bibliography must be in Harvard citation style.