7153CEM – M138CEM Assignment Help
Big Data Analytics and Data Visualisation Assignment Help
Task:
1. Select a dataset of your choice from Kaggle. The dataset should be suitable for Big Data analytics
(please see the description below).
2. Use PySpark (exclusively) to analyze the dataset. You should perform at least one data analysis
task (regression, clustering, classification, etc.). You have to explain your choice of the
techniques used.
3. Use Tableau (exclusively) to explore your dataset and/or to show the results of your analysis. All
figures must be created using Tableau.
4. Critically analyze your findings: the results and the methods used.
Procedure:
You have to write a project proposal (maximum of 1 A4 page), giving the title of the project, a
brief description of the problem and the tasks you plan to apply, the dataset you are using
(its name and a direct link to it, plus a clear and detailed description of how you plan to use it), and a
brief work plan. You have to submit the proposal by 24 March 2023 at 18.00 via the 7153CEM/
M138CEM project proposal link. It is your responsibility to make sure the dataset is suitable for
the CW according to the description, as this is part of the grade of the CW.
This document is for Coventry University students for their own use in completing their
assessed work for this module and should not be passed to third parties or posted on any
website. Any infringements of this rule should be reported to
facultyregistry.eec@coventry.ac.uk.
No two students can work on the same dataset. Once you have submitted your project
proposal you have to IMMEDIATELY state, under the assigned post on Aula, that you have
chosen that dataset (direct link to it) so that no other student can work on it. Once you choose
a dataset, the choice is final (no resubmission of the project proposal is allowed), so you have
to choose carefully.
The dataset should be freely accessible (no registration required). If there are several files in
the link, you have to clearly specify which one you plan to work on.
Before you choose a dataset, you have to check the assigned post on Aula to make sure that no
other student has chosen this dataset before you. It is your full responsibility to make sure no
other student has chosen it. If it turns out two or more students chose the same dataset, the
only project proposal that will be considered is the one that appears earlier on the
corresponding post on Aula.
The only place where you can state which dataset you have selected is the corresponding post
on Aula, and this is the only place students need to check to see if the dataset has already been
chosen by another student.
If you haven’t submitted a project proposal by the project proposal deadline, or you have
submitted one but later change your mind and want to work on another dataset,
you can still do that, but in this case you will get a zero on the project proposal. You also have
to follow the same procedure as above (check that the dataset hasn’t been chosen before, and
then state your choice under the corresponding Aula post). You also have to explain, at the very
beginning of your CW, why you have changed the dataset (in the case where you submitted a
project proposal and changed your mind later).
Because your colleagues only have access to the Aula post to know which dataset you have
chosen, you will get a zero on the project proposal if the dataset in the proposal doesn’t match the
one on the Aula post.
Your final CW submission will include a report (up to 3000 words – strict limit) where you
present your work.