ENGR 421 • Final • Introduction to Machine Learning
Instructor
Nursena Köprücü Aslan
MSc in Machine Learning and AI
I studied Computer Engineering at Koç Üniversitesi with a double major in Mathematics, and then completed my master's degree in Machine Learning and Artificial Intelligence at Imperial College London. Along the way I took part in several research projects abroad and gained both academic and hands-on experience in machine learning, artificial intelligence, and data science. My goal in this course is to present the core concepts of machine learning in a clear, application-oriented way.
Complete the Package
🎓 92% of students at Koç Üniversitesi study with the full package.
Topics
Probability Review
Counting and Probability
Conditional Probability and Independence
Bayes' Rule
Discrete Random Variables
Continuous Random Variables
Expected Value and Variance
Bernoulli and Binomial Distributions
Continuous Uniform Distribution
Exponential Distribution
Normal Distribution
Laplace and Logistic Distributions
Dimensionality Reduction
Dimensionality Reduction
Principal Component Analysis (PCA)
PCA: Choose k Using Proportion of Variance
Feature Embedding & Factor Analysis (FA)
Singular Value Decomposition and Matrix Factorization
Multidimensional Scaling
Linear Discriminant Analysis (LDA)
Canonical Correlation Analysis
Isomap, Locally Linear Embedding, Laplacian Eigenmaps
Which Dimensionality Reduction Method and Why?
Clustering
Introduction and Mixture Densities
k-Means Clustering
One Iteration of k-Means
Expectation-Maximization (EM)
Mixture Models & Practical Use of Clusters
Spectral and Hierarchical Clustering
Choose the Right Clustering Tool
“Clustering as Preprocessing” Pitfall
Combining Multiple Learners
Why Combine Multiple Learners?
Voting & Linear Combination
Bayesian Perspective & Effect of Dependence
Fixed Combination Rules & ECOC
Bagging & AdaBoost
Mixture of Experts and Stacking
Fine-Tuning an Ensemble
Cascading
Combining Multiple Sources/Views
Which Ensemble Method Fits?
Correlation vs Ensemble Gain
Design and Analysis of ML Experiments
Why do we run ML experiments?
Algorithm preference
Factors & Response
Guideline
Spot the Leakage: Is This Cross-Validation Setup Valid?
Fix the Experiment: Where Does Each Step Belong?
Sample Final Questions I
Pass Rates & Majors (Bayes; Law of Total Probability)
Weighted Least Squares (Closed-Form Solution, Matrix View & Interpretation)
MLE for α (positive support, exponential tail)
Linear Discriminant with Equal Variance
MLP with Hard-Threshold Units
Should we initialize all MLP weights to zero?
One Shared Network vs. Three Separate Networks
Naive Histogram Estimator vs. Parzen Windows (Kernel)
Kernel Smoother
Naive Density Estimator (Bandwidth effect & validity)
Comparing Two Splits (Gini vs. Misclassification)
Prepruning vs. Postpruning (Which and Why?)
Sample Final Questions II
Why Not Regression for Classification?
From Binary to Multiclass: One-vs-All / One-vs-One with a Binary Classifier
Max-Shift for Softmax
Why Initialize Weights Near Zero?
Adaptive Learning Rates in Gradient Descent
When Do Direct Input-Output Links Help in an MLP?
Mahalanobis vs. Euclidean: Why and When?
Discrete Attribute in Decision Trees
Regularized Least Squares
Gaussian Generative Model → Logistic Posterior
Naive Bayes Text Classification with Binary Features
Derivative of Softmax
Kernel Density Estimation
Choosing Between Two Splits: Gini vs. Misclassification
Sample Final Questions III
Mean Square Error for Linear Regression
Gradient Descent Update
k-NN Regression Prediction
Decision Boundary and Building a Network for Binary Classification
Derivative of Squared Error
Computing Input and Output of a Convolution Node
True/False Reasoning on Activation, Linear Networks, and Gradient Descent
Computing Total Probability
True/False on Scaling, k-NN, Intrinsic Error and Model Complexity
Output Size of a Conv Layer
Regression: Test-Set MSE
Generalization & Overfitting: True/False
Baseline Error: ZeroR vs Random Guessing
Entropy: Fair Die & Bias Effect
Decision Trees: ID3 Optimality + Key Advantage
Decision Tree Split: Remaining Entropy
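As a small taste of the clustering unit above ("One Iteration of k-Means"), here is a minimal sketch of a single k-means iteration: the assignment step followed by the mean-update step. The data, centroids, and function name are illustrative, not taken from the course materials.

```python
import numpy as np

def kmeans_iteration(X, centroids):
    """One iteration of k-means: assign each point to its nearest
    centroid, then move each centroid to the mean of its points."""
    # Assignment step: Euclidean distance from every point to every centroid.
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    # Update step: each centroid becomes the mean of its assigned points
    # (a centroid with no assigned points is left unchanged).
    new_centroids = centroids.copy()
    for k in range(len(centroids)):
        members = X[labels == k]
        if len(members) > 0:
            new_centroids[k] = members.mean(axis=0)
    return labels, new_centroids

# Two well-separated groups and two starting centroids.
X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
centroids = np.array([[0.0, 0.5], [9.0, 9.0]])
labels, new_centroids = kmeans_iteration(X, centroids)
```

Repeating this step until the assignments stop changing gives the full k-means algorithm covered in the lectures.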
Reviews
There are no reviews yet.
Frequently Asked Questions
Take, for example, Koç Üniversitesi - MATH 101 (Calculus), or a similar course at another school: our packages are designed specifically for that course, so you study with pinpoint focus and save time.
Each package contains exam-specific videos: topic lectures, past exam questions with worked solutions, and summary notes, targeting the questions that appear most often on the exam. Our instructors follow the university's academic calendar and keep the packages continuously updated, so you can focus on raising your grade without wasting time on unnecessary detail.
