YouTube Statistics EDA
- Applied several preprocessing techniques to prepare the dataset for machine learning:
- One-Hot Encoding: Converted categorical variables into binary indicator columns so that algorithms requiring numeric input could use them.
- Standardization: Scaled the features to have a mean of zero and a standard deviation of one, enhancing the convergence of gradient-based algorithms.
- Dimensionality Reduction: Applied techniques like PCA to reduce the number of features, minimizing redundancy and improving computational efficiency.
- Optimized machine learning algorithms to minimize Mean Squared Error (MSE):
- Evaluated and fine-tuned several models to reduce MSE as far as possible, improving prediction accuracy.
- Provided strategic guidance for maximizing YouTube channel earnings:
- Analyzed the factors influencing earnings and offered actionable recommendations to content creators.
- Leveraged advanced machine learning models for prediction:
- Random Forest Regression:
- Implemented and tuned a Random Forest model, whose ensemble of decision trees yielded robust predictions.
- Artificial Neural Network (ANN) model:
- Developed and optimized an ANN model that captured complex non-linear relationships within the data.
- Both Random Forest Regression and the ANN outperformed the other methods evaluated, delivering the most accurate predictions.
- Provided valuable insights for content creators:
- Used the predictions from the models to identify key factors that maximize YouTube earnings.
- Offered data-driven advice to optimize content strategy, enhancing revenue generation for creators.
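
The workflow summarized above (one-hot encoding, standardization, PCA, then a Random Forest and an ANN compared by MSE) can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: it assumes scikit-learn, uses `MLPRegressor` as a stand-in for the ANN, and runs on synthetic data. The column names (`category`, `subscribers`, `video_views`, `uploads`, `earnings`) and the helper functions are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

CATEGORICAL = ["category"]                              # hypothetical column
NUMERIC = ["subscribers", "video_views", "uploads"]     # hypothetical columns
TARGET = "earnings"                                     # hypothetical column


def make_synthetic_channels(n_rows=400, seed=0):
    """Build a made-up dataset; it only stands in for the real statistics."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "category": rng.choice(["Music", "Gaming", "Education"], size=n_rows),
        "subscribers": rng.lognormal(mean=12.0, sigma=0.5, size=n_rows),
        "video_views": rng.lognormal(mean=18.0, sigma=0.5, size=n_rows),
        "uploads": rng.integers(10, 5000, size=n_rows).astype(float),
    })
    # Arbitrary synthetic relationship, chosen only so there is signal to learn.
    bonus = df["category"].map({"Music": 1.0, "Gaming": 0.5, "Education": 0.0})
    noise = rng.normal(scale=0.3, size=n_rows)
    df[TARGET] = (3.0 * np.log(df["video_views"])
                  + np.log(df["subscribers"]) + bonus + noise)
    return df


def build_pipeline(model):
    """One-hot encode, standardize, reduce with PCA, then fit the given model."""
    preprocess = ColumnTransformer(
        transformers=[
            ("onehot", OneHotEncoder(handle_unknown="ignore"), CATEGORICAL),
            ("scale", StandardScaler(), NUMERIC),
        ],
        sparse_threshold=0.0,  # always return a dense array for PCA
    )
    return Pipeline([
        ("preprocess", preprocess),
        # Keep enough components to explain 95% of the variance.
        ("pca", PCA(n_components=0.95, svd_solver="full")),
        ("model", model),
    ])


def evaluate_models(df, seed=0):
    """Return the held-out MSE of each model, keyed by model name."""
    X = df[CATEGORICAL + NUMERIC]
    y = df[TARGET]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed
    )
    models = {
        "random_forest": RandomForestRegressor(n_estimators=200, random_state=seed),
        "ann": MLPRegressor(hidden_layer_sizes=(32, 16), max_iter=1000,
                            random_state=seed),
    }
    scores = {}
    for name, model in models.items():
        pipeline = build_pipeline(model)
        pipeline.fit(X_train, y_train)
        scores[name] = mean_squared_error(y_test, pipeline.predict(X_test))
    return scores


if __name__ == "__main__":
    for name, mse in evaluate_models(make_synthetic_channels()).items():
        print(f"{name}: test MSE = {mse:.3f}")
```

Wrapping the preprocessing in a `Pipeline` means the encoder, scaler, and PCA are fitted on the training split only, so the test MSE is not contaminated by test data. Note that PCA mixes the original columns, so a Random Forest's feature importances would then refer to components rather than raw features; to identify which factors drive earnings, one option is to fit the model without the PCA step.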