7089CEM: Introduction to Statistical Methods for Data Science


    The 'simulated' EEG time-series data and the sound signal are provided in separate CSV files. The X.csv file contains the EEG signals $\mathbf{x}_1$ and $\mathbf{x}_2$, measured from the prefrontal and auditory cortices respectively, and the y.csv file contains the sound signal $\mathbf{y}$ (i.e. the voice of the meditation guide). The file time.csv contains the sampling times of all three signals in seconds. There are 2 minutes of signal in total, collected at a sampling frequency of 20 Hz. All signals are subject to additive noise, assumed to be independent and identically distributed (i.i.d.) zero-mean Gaussian with unknown variance, due to distortions during recording.

    Task 1: Preliminary data analysis

    You should first perform an initial exploratory data analysis, by investigating:
    • Time series plots (of audio and EEG signals)
    • Distribution of each signal
    • Correlation and scatter plots (between the audio and brain signals) to examine their dependencies
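
    The numerical side of these checks could be sketched as follows. This is only an illustration under stated assumptions: the signals here are synthetic placeholders generated in the script (the real ones would be read from X.csv, y.csv and time.csv), the toy relationship between them is arbitrary, and the plots themselves are left to whichever plotting library is preferred.

```python
import numpy as np

# Placeholder signals (assumption): stand-ins for the contents of X.csv and y.csv.
rng = np.random.default_rng(0)
fs = 20.0                      # sampling frequency in Hz, as stated in the brief
n = int(120 * fs)              # 2 minutes of samples at 20 Hz
t = np.arange(n) / fs          # time axis in seconds (stand-in for time.csv)
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
# Arbitrary toy relationship, used only so the correlations are non-trivial.
y = 0.5 * x1**3 + 0.2 * x2 + rng.normal(scale=0.1, size=n)

signals = {"x1": x1, "x2": x2, "y": y}

# Distribution of each signal: sample mean, sample standard deviation, histogram.
summary = {}
for name, s in signals.items():
    counts, edges = np.histogram(s, bins=30)
    summary[name] = {
        "mean": s.mean(),
        "std": s.std(ddof=1),
        "counts": counts,
        "edges": edges,
    }

# Pearson correlation matrix; rows and columns are ordered x1, x2, y.
corr = np.corrcoef(np.vstack([x1, x2, y]))

for name, stats in summary.items():
    print(f"{name}: mean={stats['mean']:.3f}, std={stats['std']:.3f}")
print(np.round(corr, 3))
```

    The time axis `t` against each signal gives the time-series plots, the histogram counts give the distributions, and pairs such as (`x1`, `y`) give the scatter plots.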

    Task 2: Regression - modelling the relationship between audio and EEG signals
    We would like to determine a suitable mathematical model to explain the relationship between the audio signal $\mathbf{y}$ and the two brain signals $\mathbf{x}_1$ and $\mathbf{x}_2$, assuming such a relationship can be described by a polynomial regression model. Below are 5 candidate nonlinear polynomial regression models, and only one of them can 'truly' describe such a relationship. The objective is to identify this 'true' model from those candidate models following Tasks 2.1 –

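    The candidate models have the following structures:
    Model 1: $y = \theta_1 x_1^3 + \theta_2 x_2^5 + \theta_{bias} + \varepsilon$
    Model 2: $y = \theta_1 x_1^4 + \theta_2 x_2^2 + \theta_{bias} + \varepsilon$
    Model 3: $y = \theta_1 x_1^3 + \theta_2 x_2 + \theta_3 x_1 + \theta_{bias} + \varepsilon$
    Model 4: $y = \theta_1 x_1 + \theta_2 x_1^2 + \theta_3 x_1^3 + \theta_4 x_2^3 + \theta_{bias} + \varepsilon$
    Model 5: $y = \theta_1 x_1^3 + \theta_2 x_1^4 + \theta_3 x_2 + \theta_{bias} + \varepsilon$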
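
    Each candidate model is linear in its parameters, so it can be written as an input matrix whose columns are the polynomial terms plus a column of ones for the bias. A minimal sketch, reading the trailing digits in each term as powers of x1 and x2; the helper name `design_matrix` and the sample values are hypothetical:

```python
import numpy as np

def design_matrix(model, x1, x2):
    """Build the input matrix X for candidate model 1..5.

    Columns follow the order of the thetas in the model definition;
    the last column of ones corresponds to theta_bias.
    """
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    ones = np.ones_like(x1)
    columns = {
        1: [x1**3, x2**5, ones],
        2: [x1**4, x2**2, ones],
        3: [x1**3, x2, x1, ones],
        4: [x1, x1**2, x1**3, x2**3, ones],
        5: [x1**3, x1**4, x2, ones],
    }
    return np.column_stack(columns[model])

# Tiny made-up inputs, only to show the resulting shapes.
x1 = np.array([1.0, 2.0, -1.0])
x2 = np.array([0.5, -2.0, 3.0])
X3 = design_matrix(3, x1, x2)
print(X3.shape)  # one row per sample, one column per parameter
```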

    Estimate the model parameters $\boldsymbol{\theta} = \{\theta_1, \theta_2, \cdots, \theta_{bias}\}^T$ for every candidate model using Least Squares, $\hat{\boldsymbol{\theta}} = (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T \mathbf{y}$, with the provided input and output datasets (use all the data for training).
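
    The least squares estimate can be sketched as below. The data are synthetic and the "true" parameter values are made up for the demonstration; the sketch solves the normal equations with `np.linalg.solve` rather than forming the inverse explicitly, which gives the same estimate when the matrix is well conditioned.

```python
import numpy as np

def least_squares(X, y):
    """Return theta_hat solving the normal equations (X^T X) theta = X^T y."""
    return np.linalg.solve(X.T @ X, X.T @ y)

# Synthetic example (assumption): data generated from a Model 3 style structure.
rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
X = np.column_stack([x1**3, x2, x1, np.ones(n)])  # last column is the bias term
theta_true = np.array([0.5, -1.0, 2.0, 0.3])      # hypothetical parameter values
y = X @ theta_true + rng.normal(scale=0.05, size=n)

theta_hat = least_squares(X, y)
print(np.round(theta_hat, 3))
```

    With real data, `X` would be the input matrix of one candidate model and `y` the audio signal; the estimate is repeated once per model.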

    Based on the estimated model parameters, compute the residual sum of squares (RSS) for every candidate model: $RSS = \sum_{i=1}^{n} (y_i - \mathbf{x}_i \hat{\boldsymbol{\theta}})^2$
    Here $\mathbf{x}_i$ denotes the $i$th row (the $i$th data sample) of the input data matrix $\mathbf{X}$, and $\hat{\boldsymbol{\theta}}$ is a column vector.
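
    A small sketch of the RSS computation, using a made-up three-sample straight-line fit so that the result is easy to check by hand; the function name `rss` is an illustrative choice:

```python
import numpy as np

def rss(X, y, theta_hat):
    """Residual sum of squares: sum over i of (y_i - x_i theta_hat)^2."""
    residuals = y - X @ theta_hat
    return float(residuals @ residuals)

# Made-up data: a line-plus-bias fit to three points.
X = np.array([[1.0, 1.0],
              [2.0, 1.0],
              [3.0, 1.0]])
y = np.array([1.0, 2.0, 2.0])
theta_hat = np.linalg.solve(X.T @ X, X.T @ y)  # least squares estimate

value = rss(X, y, theta_hat)
print(value)
```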

    Compute the log-likelihood function for every candidate model: $\ln p(D \mid \hat{\boldsymbol{\theta}}) = -\frac{n}{2}\ln(2\pi) - \frac{n}{2}\ln(\hat{\sigma}^2) - \frac{1}{2\hat{\sigma}^2} RSS$
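
    The log-likelihood can be evaluated directly from the RSS and the number of samples n. The text above does not say how the variance estimate is obtained, so the sketch below assumes the common choice of RSS / (n - 1); substitute a different estimator if the brief specifies one.

```python
import math

def log_likelihood(rss, n):
    """Gaussian log-likelihood of the data given theta_hat.

    Assumption: the noise variance is estimated as sigma2_hat = RSS / (n - 1).
    """
    sigma2_hat = rss / (n - 1)
    return (-n / 2 * math.log(2 * math.pi)
            - n / 2 * math.log(sigma2_hat)
            - rss / (2 * sigma2_hat))

# Made-up numbers: with RSS = 2 and n = 3 the variance estimate is exactly 1.
ll = log_likelihood(2.0, 3)
print(ll)
```

    For a fixed n, a smaller RSS gives a larger log-likelihood under this variance estimate, so the value can be compared across the candidate models.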

    Posted February 12th, 2023. Category: Database assignment help
