> [!info]
> This is a General Introduction to Machine Learning
> written by Anvesh G. Jhuboo

## Table of Contents

- [[01-Fundamentals]]
- [[02-Feature Engineering]]
- [[03-Feature Selection Methods]]
- [[04-Supervised Learning Regressors and Classifiers]]
- [[05-Supervised Learning Evaluation Metrics for Classification]]
- [[06-Supervised Learning Advanced Regressors and Classifiers]]
- [[07-Unsupervised Learning Algorithms]]
- [[08-Regularization and Hyperparameter Tuning]]
- [[09-Ensemble Methods]]
- [[10-ML Workflows]]

## Machine Learning Categories

Machine learning can be broadly categorized into two main types:

1. **Supervised Learning**
2. **Unsupervised Learning**

### Supervised Learning

Supervised learning involves training a model on a labeled dataset, meaning the data includes both the input features and the corresponding output labels. The goal is to learn a mapping from inputs to outputs that can be used to make predictions on new, unseen data.

#### Key Concepts

1. **Regression:**
   - **Definition:** Predicting a continuous-valued output.
   - **Examples:**
     - Predicting housing prices in New York.
     - Estimating the value of cryptocurrencies.
2. **Classification:**
   - **Definition:** Predicting a label from a discrete set of classes.
   - **Examples:**
     - Determining if an image is of a human or a cyborg.
     - Identifying if an email is spam.

#### Example: Credit Card Fraud Detection

A supervised learning algorithm for detecting credit card fraud would be trained on a dataset of transactions, each labeled as either fraudulent or non-fraudulent. The model would learn to predict the likelihood of a transaction being fraudulent based on the input features.

### Unsupervised Learning

Unsupervised learning involves training a model on data that does not have labeled outputs. The goal is to learn the underlying structure or patterns within the data.

#### Key Concepts

1. **Clustering:**
   - **Definition:** Grouping data into clusters based on similarity.
   - **Examples:**
     - Clustering social network posts by topic.
     - Grouping consumers for personalized recommendations.
     - Organizing search engine results into related categories.

#### Example: Social Media User Categorization

A social media platform can use unsupervised learning to categorize users based on their engagement with different types of content. By collecting data on the number of hours users spend reading posts, watching videos, and engaging in virtual reality, the platform can use an algorithm like k-means clustering to segment users into distinct groups.

### Supervised vs. Unsupervised Learning

Both supervised and unsupervised learning are essential for different scenarios and types of data. They reflect different ways of learning:

- **Supervised Learning:** Similar to a teacher providing labeled examples, allowing the model to learn the mapping from inputs to outputs.
  - **Analogy:** Learning music genres by listening to labeled examples (e.g., "This is indie rock").
  - **Applications:** Image classification, spam detection, and credit card fraud detection.
- **Unsupervised Learning:** Similar to discovering patterns through observation without explicit labels.
  - **Analogy:** An alien observing human meals and categorizing foods without being told what each meal is.
  - **Applications:** Customer segmentation, anomaly detection, and topic modeling.

### Summary

- **Supervised Learning:**
  - Data is labeled.
  - The program learns to predict the output from the input data.
  - Examples include regression and classification problems.
- **Unsupervised Learning:**
  - Data is unlabeled.
  - The program learns to recognize the inherent structure in the input data.
  - Examples include clustering and dimensionality reduction.

Understanding these fundamental differences is crucial for selecting the appropriate machine learning approach based on the problem at hand and the nature of the available data.

Continue: [[02-Feature Engineering]]
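
#### Sketch: k-means on the Social Media Example

As a supplement to the social media user categorization example above, here is a minimal pure-Python sketch of k-means (Lloyd's algorithm). Everything concrete in it is an assumption made for illustration: the `users` data (hypothetical weekly hours spent reading, watching videos, and in virtual reality), the helper names, and the deterministic farthest-first initialization, which is just one simple way to pick starting centroids. In practice a library such as scikit-learn would typically be used instead.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first_init(points, k):
    """Pick k starting centroids deterministically (an assumption of this
    sketch): start from the first point, then repeatedly add the point
    that is farthest from all centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, max_iterations=100):
    """Lloyd's algorithm: alternate assignment and centroid-update steps."""
    centroids = farthest_first_init(points, k)
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(point, centroids[c]))
            for point in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the algorithm converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids


# Hypothetical data: (hours reading, hours watching videos, hours in VR).
users = [
    (9.0, 1.0, 0.5), (8.5, 1.5, 0.0), (10.0, 0.5, 1.0),  # mostly reading
    (1.0, 9.5, 0.5), (0.5, 8.0, 1.0), (1.5, 10.0, 0.0),  # mostly video
    (0.5, 1.0, 9.0), (1.0, 0.5, 8.5), (0.0, 1.5, 10.0),  # mostly VR
]

labels, centroids = kmeans(users, k=3)
for user, label in zip(users, labels):
    print(user, "-> group", label)
```

Note that no labels are passed in: the algorithm sees only the engagement hours and groups users by similarity. The group numbers it returns are arbitrary identifiers, so interpreting a group as "readers" or "video watchers" is left to whoever inspects the centroids.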