
# 7089CEM: Introduction to Statistical Methods for Data Science

## Introduction to Statistical Methods for Data Science Assignment Help

Data:
The "simulated" EEG time-series data and the sound signal are provided in two separate CSV files. The X.csv file contains the EEG signals $\mathbf{x}_1$ and $\mathbf{x}_2$, measured from the prefrontal and auditory cortices respectively, and the y.csv file contains the sound signal $\mathbf{y}$ (i.e. the voice of the meditation guide). The file time.csv contains the sampling time of all three signals in seconds. There are 2 minutes of signal in total, collected with a sampling frequency of 20 Hz. All signals are subject to additive noise (assumed to be independent and identically distributed (i.i.d.) Gaussian with zero mean) of unknown variance, due to distortions during recording.

You should first perform an initial exploratory data analysis, by investigating:
• Time series plots (of the audio and EEG signals)
• The distribution of each signal
• Correlation and scatter plots (between the audio and brain signals) to examine their dependencies
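The three checks above can be sketched numerically before any plotting. This is a minimal sketch using NumPy only. The arrays are synthetic stand-ins (an assumption, since the CSV files are not reproduced here), and all variable names are illustrative.

```python
import numpy as np

# ASSUMPTION: synthetic stand-ins for the real signals. In practice, load
# X.csv, y.csv and time.csv instead (e.g. with np.loadtxt or pandas).
rng = np.random.default_rng(0)
fs = 20                              # sampling frequency in Hz (from the brief)
t = np.arange(120 * fs) / fs         # 2 minutes of sampling times, in seconds
x1 = rng.normal(size=t.size)         # stand-in for the prefrontal EEG signal
x2 = rng.normal(size=t.size)         # stand-in for the auditory EEG signal
y = 0.5 * x1 + rng.normal(scale=0.1, size=t.size)  # stand-in for the audio

signals = {"x1": x1, "x2": x2, "y": y}

# Distribution of each signal: summary statistics plus histogram bin counts.
for name, s in signals.items():
    counts, edges = np.histogram(s, bins=20)
    print(f"{name}: mean={s.mean():.3f}, std={s.std(ddof=1):.3f}, "
          f"histogram peak bin={counts.argmax()}")

# Pearson correlation matrix between all three signals (rows are variables).
corr = np.corrcoef(np.vstack([x1, x2, y]))
print(np.round(corr, 3))
```

If matplotlib is available, `plt.plot(t, y)`, `plt.hist(y, bins=20)` and `plt.scatter(x1, y)` would produce the corresponding time-series, distribution and scatter plots.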

Regression: modelling the relationship between audio and EEG signals
We would like to determine a suitable mathematical model to explain the relationship between the audio signal $\mathbf{y}$ and the two brain signals $\mathbf{x}_1$ and $\mathbf{x}_2$, assuming such a relationship can be described by a polynomial regression model. Below are 5 candidate nonlinear polynomial regression models, and only one of them can "truly" describe such a relationship. The objective is to identify this "true" model from the candidate models following Tasks 2.1 –

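The candidate models have the following structures:
Model 1: $y = \theta_1 x_1^3 + \theta_2 x_2^5 + \theta_{bias} + \varepsilon$
Model 2: $y = \theta_1 x_1^4 + \theta_2 x_2^2 + \theta_{bias} + \varepsilon$
Model 3: $y = \theta_1 x_1^3 + \theta_2 x_2 + \theta_3 x_1 + \theta_{bias} + \varepsilon$
Model 4: $y = \theta_1 x_1 + \theta_2 x_1^2 + \theta_3 x_1^3 + \theta_4 x_2^3 + \theta_{bias} + \varepsilon$
Model 5: $y = \theta_1 x_1^3 + \theta_2 x_1^4 + \theta_3 x_2 + \theta_{bias} + \varepsilon$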
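Each candidate structure is linear in its parameters, so it can be written as a design matrix whose columns are the polynomial terms followed by a column of ones for the bias. The sketch below is one possible way to build those matrices. The function name `design_matrix` and the column ordering (bias last) are illustrative choices, not part of the brief.

```python
import numpy as np

def design_matrix(model, x1, x2):
    """Return the design matrix for candidate model 1-5 (bias column last)."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    ones = np.ones_like(x1)
    columns = {
        1: [x1**3, x2**5, ones],
        2: [x1**4, x2**2, ones],
        3: [x1**3, x2, x1, ones],
        4: [x1, x1**2, x1**3, x2**3, ones],
        5: [x1**3, x1**4, x2, ones],
    }
    return np.column_stack(columns[model])

# Tiny illustrative inputs (not the assignment data).
x1_demo = np.array([1.0, 2.0])
x2_demo = np.array([3.0, 4.0])
print(design_matrix(1, x1_demo, x2_demo))
print(design_matrix(4, x1_demo, x2_demo).shape)
```

Keeping the bias as an explicit column of ones means the same least-squares formula applies to all five models without special handling.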

Estimate the model parameters $\boldsymbol{\theta} = \{\theta_1, \theta_2, \cdots, \theta_{bias}\}^T$ for every candidate model using Least Squares, $\hat{\boldsymbol{\theta}} = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}$, with the provided input and output datasets (use all the data for training).
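The least-squares estimate can be sketched as follows. The data here are synthetic and noise-free (an assumption made so the recovered parameters can be checked against known values), and the design matrix uses the Model 3 structure purely as an example.

```python
import numpy as np

def least_squares(X, y):
    """Solve the normal equations (X^T X) theta = X^T y for theta."""
    # Solving the linear system avoids forming the explicit inverse.
    return np.linalg.solve(X.T @ X, X.T @ y)

# ASSUMPTION: synthetic inputs and hypothetical "true" parameters.
rng = np.random.default_rng(1)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
X = np.column_stack([x1**3, x2, x1, np.ones_like(x1)])  # Model 3 structure
theta_true = np.array([0.5, -1.0, 2.0, 0.3])
y = X @ theta_true                                      # noise-free output

theta_hat = least_squares(X, y)
print(np.round(theta_hat, 6))
```

`np.linalg.lstsq(X, y, rcond=None)` gives the same estimate and is generally more robust when the columns of the design matrix are nearly collinear.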

Based on the estimated model parameters, compute the residual sum of squares (RSS) for every candidate model: $\text{RSS} = \sum_{i=1}^{n} (y_i - \mathbf{x}_i \hat{\boldsymbol{\theta}})^2$
Here $\mathbf{x}_i$ denotes the $i$-th row ($i$-th data sample) of the input data matrix $\mathbf{X}$, and $\hat{\boldsymbol{\theta}}$ is a column vector.
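In vectorised form, the RSS is the squared norm of the residual vector. A minimal sketch with a hand-checkable example (the numbers are illustrative, not assignment data):

```python
import numpy as np

def rss(X, y, theta_hat):
    """Residual sum of squares: sum over i of (y_i - x_i theta_hat)^2."""
    residuals = y - X @ theta_hat
    return float(residuals @ residuals)

# Three samples, one input column plus a bias column.
X_demo = np.array([[1.0, 1.0],
                   [2.0, 1.0],
                   [3.0, 1.0]])
y_demo = np.array([1.0, 2.0, 4.0])
theta_demo = np.array([1.0, 0.0])   # predictions are [1, 2, 3]

print(rss(X_demo, y_demo, theta_demo))  # residuals [0, 0, 1] -> 1.0
```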

Compute the log-likelihood function for every candidate model: $\ln p(D \mid \hat{\boldsymbol{\theta}}) = -\frac{n}{2}\ln(2\pi) - \frac{n}{2}\ln(\hat{\sigma}^2) - \frac{1}{2\hat{\sigma}^2}\,\text{RSS}$
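The log-likelihood can be evaluated directly from the RSS and the sample count. The text above does not say how $\hat{\sigma}^2$ is obtained, so the sketch below assumes the residual variance estimate $\hat{\sigma}^2 = \text{RSS}/(n-1)$. Treat that choice as an assumption and substitute whatever estimator the task specifies.

```python
import numpy as np

def log_likelihood(rss_value, n):
    """Gaussian log-likelihood of a fitted model, given its RSS and n samples."""
    # ASSUMPTION: residual variance estimated as RSS / (n - 1).
    sigma2_hat = rss_value / (n - 1)
    return (-n / 2 * np.log(2 * np.pi)
            - n / 2 * np.log(sigma2_hat)
            - rss_value / (2 * sigma2_hat))

# Illustrative comparison: for the same n, a smaller RSS gives a
# larger (less negative) log-likelihood.
ll_small = log_likelihood(1.0, 100)
ll_large = log_likelihood(2.0, 100)
print(ll_small > ll_large)
```

With this variance estimate the last term is always $-(n-1)/2$, so differences between models come entirely from the $\ln(\hat{\sigma}^2)$ term.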

February 12th, 2023 | Categories: Database assignment help