MSc in Machine Learning and AI
I studied Computer Engineering at Koç Üniversitesi, where I also completed a double major in Mathematics. I then earned my master's degree in Machine Learning and Artificial Intelligence at Imperial College London. Along the way I took part in various research projects abroad and gained both academic and hands-on experience, particularly in machine learning, artificial intelligence, and data science. My goal in this course is to share the fundamental concepts of machine learning with you in a clear, application-oriented way.
1799 TL
Probability Review
Counting and Probability
Conditional Probability and Independence
Bayes' Rule
Discrete Random Variables
Continuous Random Variables
Expected Value and Variance
Bernoulli and Binomial Distributions
Continuous Uniform Distribution
Exponential Distribution
Normal Distribution
Laplace and Logistic Distributions
Logistic Regression
Motivation
Probabilistic Interpretation
Parametric & Polynomial Regression
Binary Cross Entropy / Log-loss
Optimization with Gradient Descent
Classification with Logistic Regression
Summary & Multi-Class Logistic Regression
Why Not Regression for Classification?
Support Vector Machines (SVM)
What & Why
Maximum Margin Classification
Maximizing the Margin
Lagrangian Formulation of the Hard-Margin SVM
From Primal to Dual: Solving the SVM Optimization
Why Only a Few Points Matter (KKT & Sparsity)
From α to Parameters
Prediction Uses Only Support Vectors
Soft Margin SVM
Soft Margin Dual
Margin, Distance, and Support Vectors
Solving a Tiny SVM Dual Problem (Linear Kernel)
Polynomial Kernel and Feature Map
Multilayer Perceptrons
Perceptron
Training a Perceptron
Limitation: XOR
MLP Architecture & Representation View
Backpropagation
Regression
Discrimination
MLP with Hard-Threshold Units
Should We Initialize All MLP Weights to Zero?
Neural Networks and Deep Learning
Introduction to Deep Learning & Activation Functions
Training Deep Networks
Regularization Techniques
Tuning Network Structure
Learning Time
Time-Delay Neural Networks (TDNN)
RNN / LSTM / GRU
Generative Adversarial Networks (GANs)
Convolution vs. Fully Connected
Forward Pass in a Small Neural Network
Softmax and Cross-Entropy
Vanishing Gradients (True/False with Explanation)
Decision Trees
What is a Decision Tree?
Splitting in Classification Trees
Pruning Trees
From Trees to Rules
Multivariate/Oblique Trees
Choosing Between Two Splits: Gini vs. Misclassification
Ensemble Learning
Why Combine Multiple Learners?
Voting & Linear Combination
Bayesian Perspective & Effect of Dependence
Fixed Combination Rules & ECOC
Bagging & AdaBoost
Mixture of Experts and Stacking
Fine-Tuning an Ensemble
Cascading
Combining Multiple Sources/Views
Sample Final Questions
Pass Rates & Majors (Bayes; Law of Total Probability)
Weighted Least Squares (Closed-Form Solution, Matrix View & Interpretation)
MLE for α (positive support, exponential tail)
Linear Discriminant with Equal Variance
One Shared Network vs. Three Separate Networks
Naive Histogram Estimator vs. Parzen Windows (Kernel)
Kernel Smoother
Naive Density Estimator (Bandwidth Effect & Validity)
Comparing Two Splits (Gini vs. Misclassification)
Prepruning vs. Postpruning (Which and Why?)
Sample Final Questions II
From Binary to Multiclass: One-vs-All / One-vs-One with a Binary Classifier
Max-Shift for Softmax
Why Initialize Weights Near Zero?
Adaptive Learning Rates in Gradient Descent
When Do Direct Input-Output Links Help in an MLP?
Mahalanobis vs. Euclidean: Why and When?
Discrete Attribute in Decision Trees
Regularized Least Squares
Gaussian Generative Model → Logistic Posterior
Naive Bayes Text Classification with Binary Features
Derivative of Softmax
Kernel Density Estimation
Past Exam Questions
Single-Neuron Sigmoid + MSE
Decision Trees: Gini Impurity Split Comparison
Decision Trees: Entropy & Information Gain Split Comparison
MLE for a Discrete PMF
Kernel Engineering
1-NN LOOCV on Patient Dataset
Linear Regression + MSE: Gradient Descent Step Size Effects