ENGR 421 • All Exams • Introduction to Machine Learning
Preparation package for the Koç University ENGR 421 (Introduction to Machine Learning) Midterm I exam.
Topics covered: Supervised Learning, Parametric Methods, Multivariate Methods, Linear Discrimination, Multilayer Perceptrons, Deep Learning.
1899 TL per month, in 3 installments at the upfront price
Instructor

Nursena Köprücü Aslan
PhD in Computer Science
I studied Computer Engineering at Koç University with a double major in Mathematics. I then completed my master's degree in Machine Learning and Artificial Intelligence at Imperial College London. I am currently pursuing my PhD at the University of Cambridge.
Topics
Introduction
3 lectures
Introduction to Machine Learning
Machine Learning Notation Explained
Machine Learning Preliminaries
Supervised Learning
4 lectures
Why Supervised?
Hypothesis Space & Occam's Razor
Loss Functions: Measuring Mistakes
Example: Least-Squares Linear Regression
Probability Review
11 lectures
Counting and Probability
Conditional Probability and Independence
Bayes' Rule
Discrete Random Variables
Continuous Random Variables
Expected Value and Variance
Bernoulli and Binomial Distributions
Continuous Uniform Distribution
Exponential Distribution
Normal Distribution
Laplace and Logistic Distributions
Parametric Methods
4 lectures · 4 questions
Maximum Likelihood Estimation (MLE)
Bernoulli Likelihood
Multinomial Likelihood and Smoothing
Bayes' Theorem
Parametric Classification
Unequal Variances → Quadratic Boundary
Gaussian Classification Boundary
Parametric & Polynomial Regression
Multivariate Methods
3 lectures · 4 questions
Modeling Multivariate Data: Estimation, Normal Distributions, and Naive Bayes
Multivariate Classification: Linear, Quadratic, and Model Selection
Discrete Features & Multivariate Regression
Sample Mean and Covariance Matrix
Mahalanobis Distance
LDA vs QDA Classification
Naive Bayes Classification (Discrete Features)
Linear Discrimination
10 lectures · 2 questions
The Problem
Linear Discriminant
Two Classes/Multiple Classes/Pairwise Separation
From Discriminants to Posteriors
Gradient Descent
Gradient Descent Update
Linear Regression + MSE: Gradient Descent Step Size Effects
Logistic Discrimination: Two Classes
Logistic Discrimination: K>2 Classes
Generalizing the Linear Model
Discrimination by Regression
Learning to Rank
Multilayer Perceptrons
7 lectures · 1 question
Perceptron
Training a Perceptron
Limitation: XOR
MLP Architecture & Representation View
Backpropagation
Regression
Discrimination
Should we initialize all MLP weights to zero?
Deep Learning
8 lectures
Introduction to Deep Learning & Activation Functions
Training Deep Networks
Regularization Techniques
Tuning Network Structure
Learning Time
Time-Delay Neural Networks (TDNN)
RNN / LSTM / GRU
Generative Adversarial Networks (GANs)
Sample Midterm Questions I
9 questions
Pass Rates & Majors (Bayes; Law of Total Probability)
Weighted Least Squares (Closed-Form Solution, Matrix View & Interpretation)
MLE for α (positive support, exponential tail)
Linear Discriminant with Equal Variance
MLP with Hard-Threshold Units
One Shared Network vs. Three Separate Networks
Naive Histogram Estimator vs. Parzen Windows (Kernel)
Kernel Smoother
Naive Density Estimator (Bandwidth effect & validity)
Sample Midterm Questions II
11 questions
Why Not Regression for Classification?
From Binary to Multiclass: One-vs-All / One-vs-One with a Binary Classifier
Max-shift for Softmax
Why Initialize Weights Near Zero?
Adaptive Learning Rates in Gradient Descent
When Do Direct Input-Output Links Help in an MLP?
Mahalanobis vs. Euclidean: Why and When?
Regularized Least Squares
Gaussian Generative Model → Logistic Posterior
Naive Bayes Text Classification with Binary Features
Derivative of Softmax
Bonus Questions
10 questions
Model Selection Using Validation Performance and Test MSE
k-NN Decision Boundaries and the Effect of k
Mean Square Error for Linear Regression
Gradient Descent Update
k-NN Regression Prediction
Decision Boundary and Building a Network for Binary Classification
Derivative of Squared Error
True/False Reasoning on Activation, Linear Networks, and Gradient Descent
Computing Total Probability
True/False on Scaling, k-NN, Intrinsic Error and Model Complexity
Non-parametric Methods
7 lectures · 3 questions
What Nonparametric Means
Density Estimation
Nonparametric Classification
Condensed Nearest Neighbor
Outlier Detection
Nonparametric Regression
Additive Models & How to Choose h/k
Histogram density estimator
Naive / uniform kernel density estimator
k-nearest neighbor (k-NN) classifier in 1D
Decision Trees
5 lectures · 1 question
What is a Decision Tree?
Splitting in Classification Trees
Pruning Trees
From Trees to Rules
Multivariate/Oblique Trees
Choosing Between Two Splits: Gini vs. Misclassification
Kernel Machines
10 lectures · 3 questions
What & Why
Maximum Margin Classification
Maximizing the Margin
Lagrangian Formulation of the Hard-Margin SVM
From Primal to Dual: Solving the SVM Optimization
Why only a few points matter (KKT & sparsity)
From 𝛼 to parameters
Prediction uses only support vectors
Soft Margin SVM
Soft Margin Dual
Margin, distance, and support vectors
Solving a tiny SVM dual problem (linear kernel)
Polynomial kernel and feature map
Dimensionality Reduction
8 lectures · 5 questions
Dimensionality Reduction
Principal Component Analysis (PCA)
True/False on Feature Selection and PCA Fundamentals
Derivation of the PCA Objective via Lagrange Multipliers
Numerical Computation of the First Principal Component
Feature Embedding & Factor Analysis (FA)
Singular Value Decomposition and Matrix Factorization
Multidimensional Scaling
Linear Discriminant Analysis (LDA)
LDA Objective and Its Contrast with PCA
Canonical Correlation Analysis
Isomap, Locally Linear Embedding, Laplacian Eigenmaps
Forward vs Backward Selection Trade-offs
Sample Midterm Questions I
9 questions
Linear Discriminant with Equal Variance
Naive Histogram Estimator vs. Parzen Windows (Kernel)
Kernel Smoother
Naive Density Estimator (Bandwidth effect & validity)
Comparing Two Splits (Gini vs. Misclassification)
Prepruning vs. Postpruning (Which and Why?)
Discrete Attribute in Decision Trees
Kernel Density Estimation
Naive Bayes Text Classification with Binary Features
Sample Midterm Questions II
7 questions
Decision Trees: Gini Impurity Split Comparison
Decision Trees: Entropy & Information Gain Split Comparison
Kernel Engineering
1-NN LOOCV on Patient Dataset
k-NN Regression Prediction
Decision Boundary and Building a Network for Binary Classification
True/False on Scaling, k-NN, Intrinsic Error and Model Complexity
Dimensionality Reduction
8 lectures · 2 questions
Dimensionality Reduction
Principal Component Analysis (PCA)
PCA: Choose k Using Proportion of Variance
Feature Embedding & Factor Analysis (FA)
Singular Value Decomposition and Matrix Factorization
Multidimensional Scaling
Linear Discriminant Analysis (LDA)
Canonical Correlation Analysis
Isomap, Locally Linear Embedding, Laplacian Eigenmaps
Which Dimensionality Reduction Method and Why?
Clustering
5 lectures · 3 questions
Introduction and Mixture Densities
K-Means Clustering
One Iteration of k-Means
Expectation-Maximization (EM)
Mixture Models & Practical Use of Clusters
Spectral and Hierarchical Clustering
Choose the Right Clustering Tool
“Clustering as Preprocessing” Pitfall
Combining Multiple Learners
9 lectures · 2 questions
Why Combine Multiple Learners?
Voting & Linear Combination
Bayesian Perspective & Effect of Dependence
Fixed Combination Rules & ECOC
Bagging & AdaBoost
Mixture of Experts and Stacking
Fine-Tuning an Ensemble
Cascading
Combining Multiple Sources/Views
Which Ensemble Method Fits?
Correlation vs Ensemble Gain
Design and Analysis of ML Experiments
4 lectures · 2 questions
Why do we run ML experiments?
Algorithm preference
Factors & Response
Guideline
Spot the Leakage: Is This Cross-Validation Setup Valid?
Fix the Experiment: Where Does Each Step Belong?
Sample Final Questions I
12 questions
Pass Rates & Majors (Bayes; Law of Total Probability)
Weighted Least Squares (Closed-Form Solution, Matrix View & Interpretation)
MLE for α (positive support, exponential tail)
Linear Discriminant with Equal Variance
MLP with Hard-Threshold Units
Should we initialize all MLP weights to zero?
One Shared Network vs. Three Separate Networks
Naive Histogram Estimator vs. Parzen Windows (Kernel)
Kernel Smoother
Naive Density Estimator (Bandwidth effect & validity)
Comparing Two Splits (Gini vs. Misclassification)
Prepruning vs. Postpruning (Which and Why?)
Sample Final Questions III
16 questions
Mean Square Error for Linear Regression
Gradient Descent Update
k-NN Regression Prediction
Decision Boundary and Building a Network for Binary Classification
Derivative of Squared Error
Computing Input and Output of a Convolution Node
True/False Reasoning on Activation, Linear Networks, and Gradient Descent
Computing Total Probability
True/False on Scaling, k-NN, Intrinsic Error and Model Complexity
Output Size of a Conv Layer
Regression: Test-Set MSE
Generalization & Overfitting: True/False
Baseline Error: ZeroR vs Random Guessing
Entropy: Fair Die & Bias Effect
Decision Trees: ID3 Optimality + Key Advantage
Decision Tree Split: Remaining Entropy