Comprehensive Machine Learning Terminology Guide

WBOY
Release: 2024-07-26 12:58:51

Introduction

Welcome to the Comprehensive Machine Learning Terminology Guide! Whether you're a newcomer to the field of machine learning or an experienced practitioner looking to brush up on your vocabulary, this guide is designed to be your go-to resource for understanding the key terms and concepts that form the foundation of ML.


Fundamental Concepts

Machine Learning (ML): A subset of artificial intelligence that focuses on building systems that can learn from and make decisions based on data.

Artificial Intelligence (AI): The broader field of creating machines that can simulate human thinking and behavior.

Deep Learning: A subset of machine learning based on artificial neural networks with multiple layers.

Dataset: A collection of data used for training and testing machine learning models.

Feature: An individual measurable property or characteristic of a phenomenon being observed.

Label: The target variable that we're trying to predict in supervised learning.

Model: A mathematical representation of a real-world process, learned from data.

Algorithm: A step-by-step procedure or formula for solving a problem.

Training: The process of teaching a model to make predictions or decisions based on data.

Inference: Using a trained model to make predictions on new, unseen data.


Types of Machine Learning

Supervised Learning: Learning from labeled data to predict outcomes for new, unseen data.

Unsupervised Learning: Finding hidden patterns or intrinsic structures in input data without labeled responses.

Semi-Supervised Learning: Learning from a combination of labeled and unlabeled data.

Reinforcement Learning: Learning to make decisions by interacting with an environment.

Transfer Learning: Applying knowledge gained from one task to a related task.


Model Evaluation and Metrics

Accuracy: The proportion of correct predictions among the total number of cases examined.

Precision: The proportion of true positive predictions among all positive predictions.

Recall: The proportion of true positive predictions among all actual positive cases.

F1 Score: The harmonic mean of precision and recall.
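As a minimal sketch of how accuracy, precision, recall, and F1 relate to one another, the function below counts true/false positives and negatives for a binary problem and derives each metric from those counts. The function name, the toy labels, and the convention of returning 0.0 when a denominator is zero are assumptions made for illustration.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    # Assumption: report 0.0 when a metric is undefined (zero denominator).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels, invented for this example.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

With these toy labels there are 3 true positives, 1 false positive, 1 false negative, and 3 true negatives, so all four metrics come out to 0.75.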

ROC Curve: A plot of a binary classifier's true positive rate against its false positive rate as the decision threshold varies.

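AUC (Area Under the Curve): The area under the ROC curve, a single-number measure of a classifier's ability to distinguish between classes. A value of 1.0 means perfect separation, and 0.5 is no better than random guessing.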
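One way to make AUC concrete is its ranking interpretation: it equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as half. The sketch below computes it by comparing every positive/negative pair. The function name and toy scores are illustrative assumptions, and the pairwise loop is only practical for small inputs.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive correctly ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(pos) * len(neg))


# Toy scores, invented for this example.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
print(auc)
```

Here three of the four positive/negative pairs are ordered correctly, giving an AUC of 0.75.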

Confusion Matrix: A table of predicted versus actual classes, used to describe the performance of a classification model.

Cross-Validation: A resampling procedure used to evaluate machine learning models on a limited data sample.
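A common form of cross-validation is k-fold: the data is split into k parts, and each part takes one turn as the test set while the rest is used for training. The sketch below only generates the index splits. The function name is hypothetical, and it assumes the data has already been shuffled, since it splits indices in order.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


folds = list(k_fold_indices(10, 3))
for train, test in folds:
    print("train:", train, "test:", test)
```

A model would be trained and scored once per fold, and the k scores averaged to estimate performance on unseen data.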

Overfitting: When a model learns the training data too well, including its noise and random fluctuations, and therefore generalizes poorly to new data.

Underfitting: When a model is too simple to capture the underlying structure of the data.


Neural Networks and Deep Learning

Neuron: The basic unit of a neural network, loosely modeled on the biological neuron.

Activation Function: A function that determines the output of a neuron given an input or set of inputs.

Weights: Parameters within a neural network that determine the strength of the connection between neurons.

Bias: An additional parameter in neural networks used to adjust the output along with the weighted sum of the inputs to the neuron.
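The neuron, weights, bias, and activation function fit together in a single computation: a weighted sum of the inputs plus the bias, passed through the activation. The sketch below assumes a sigmoid activation, and the function names and input values are chosen only for illustration.

```python
import math


def sigmoid(z):
    """Sigmoid activation: squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(inputs, weights, bias, activation=sigmoid):
    """Weighted sum of the inputs plus the bias, passed through the activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


# Illustrative values: 0.5 * 1.0 + (-0.25) * 2.0 + 0.0 = 0.0, and sigmoid(0) = 0.5
out = neuron_output([1.0, 2.0], [0.5, -0.25], 0.0)
print(out)
```

Raising the bias shifts the weighted sum upward, so the same inputs produce an output closer to 1.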

Backpropagation: An algorithm for training neural networks that applies the chain rule backward through the layers to compute how much each weight contributed to the prediction error, so the weights can be adjusted iteratively.

Gradient Descent: An optimization algorithm used to minimize the loss function by iteratively moving in the direction of steepest descent.

Epoch: One complete pass through the entire training dataset.

Batch: A subset of the training data used in one iteration of model training.

Learning Rate: A hyperparameter that controls how much to change the model in response to the estimated error each time the model weights are updated.
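Gradient descent, epochs, batches, and the learning rate can be seen working together in a small example. The sketch below fits a one-variable linear model with mini-batch gradient descent on mean squared error. The function name, the toy data (generated from y = 2x + 1), and the hyperparameter values are assumptions chosen so that the example converges, not recommended defaults.

```python
import random


def train_linear_model(xs, ys, learning_rate=0.1, epochs=2000,
                       batch_size=5, seed=0):
    """Fit y ~ w * x + b by mini-batch gradient descent on mean squared error."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    data = list(zip(xs, ys))
    for _ in range(epochs):                      # one epoch = one full pass
        rng.shuffle(data)
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            errors = [(w * x + b) - y for x, y in batch]
            # Gradients of the batch's mean squared error w.r.t. w and b.
            grad_w = 2 * sum(e * x for e, (x, _) in zip(errors, batch)) / len(batch)
            grad_b = 2 * sum(errors) / len(batch)
            # Step against the gradient, scaled by the learning rate.
            w -= learning_rate * grad_w
            b -= learning_rate * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (no noise).
xs = [i / 10 for i in range(10)]
ys = [2 * x + 1 for x in xs]
w, b = train_linear_model(xs, ys)
print(round(w, 3), round(b, 3))
```

A larger learning rate takes bigger steps and can overshoot or diverge, while a smaller one needs more epochs to get close to w = 2 and b = 1.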

Convolutional Neural Network (CNN): A type of neural network commonly used for image recognition and processing.

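Recurrent Neural Network (RNN): A type of neural network designed to recognize patterns in sequences of data.

Long Short-Term Memory (LSTM): A type of RNN capable of learning long-term dependencies.

Transformer: A model architecture that relies entirely on attention mechanisms to draw global dependencies between input and output.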


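Feature Engineering and Selection

Feature Engineering: The process of using domain knowledge to extract features from raw data.

Feature Selection: The process of selecting a subset of relevant features for use in model construction.

Dimensionality Reduction: Techniques for reducing the number of input variables in a dataset.

Principal Component Analysis (PCA): A statistical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables.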
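To give a feel for PCA, the sketch below finds only the first principal component, which is the direction of greatest variance in the data. It uses power iteration on the covariance matrix, a simplification chosen here to keep the example dependency-free. Library implementations typically use an eigendecomposition or SVD instead. The function name and toy data are assumptions, and the fixed starting vector is assumed not to be orthogonal to the component being sought.

```python
import math


def first_principal_component(rows, iterations=100):
    """Return (unit direction of greatest variance, per-feature means)."""
    n, dim = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dim)]
    centered = [[r[j] - means[j] for j in range(dim)] for r in rows]
    # Sample covariance matrix of the centered data.
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(dim)]
           for i in range(dim)]

    # Power iteration: repeatedly multiply by cov and renormalize.
    vec = [1.0] * dim
    for _ in range(iterations):
        vec = [sum(cov[i][j] * vec[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(v * v for v in vec))
        vec = [v / norm for v in vec]
    return vec, means


# Toy data lying exactly on the line y = 2x, invented for this example.
rows = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
component, means = first_principal_component(rows)
# Reduce each 2-D point to a single coordinate along the component.
projected = [sum((r[j] - means[j]) * component[j] for j in range(2)) for r in rows]
print(component, projected)
```

Because the toy points lie on a line, one coordinate along that line is enough to describe them, which is the dimensionality reduction PCA provides.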


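Ensemble Methods

Ensemble Learning: The process of combining multiple models to solve a computational intelligence problem.

Bagging: An ensemble method that trains different models on multiple subsets of the training data.

Boosting: An ensemble method that combines weak learners to create a strong learner.

Random Forest: An ensemble learning method that constructs a large number of decision trees.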
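The bagging idea behind random forests can be sketched in a few lines: draw bootstrap samples (sampling with replacement), train one model per sample, and combine predictions by majority vote. To stay short, the base learner below is a nearest-class-mean classifier on one feature rather than a decision tree, so this is bagging in general, not a random forest. All names and the toy data are assumptions for illustration.

```python
import random
from collections import Counter


def fit_centroids(sample):
    """Base learner: store the mean feature value of each class."""
    by_label = {}
    for x, label in sample:
        by_label.setdefault(label, []).append(x)
    return {label: sum(v) / len(v) for label, v in by_label.items()}


def predict_centroid(model, x):
    """Predict the class whose mean is closest to x."""
    return min(model, key=lambda label: abs(x - model[label]))


def fit_bagging(data, n_models=25, seed=0):
    """Train one base learner per bootstrap sample of the data."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        bootstrap = [rng.choice(data) for _ in data]   # sample with replacement
        models.append(fit_centroids(bootstrap))
    return models


def predict_bagging(models, x):
    """Combine the base learners' predictions by majority vote."""
    votes = Counter(predict_centroid(m, x) for m in models)
    return votes.most_common(1)[0][0]


# Toy (feature, label) pairs, invented for this example.
data = [(1.0, 0), (1.5, 0), (2.0, 0), (2.5, 0), (3.0, 0),
        (7.0, 1), (7.5, 1), (8.0, 1), (8.5, 1), (9.0, 1)]
models = fit_bagging(data)
print(predict_bagging(models, 2.0), predict_bagging(models, 8.0))
```

Boosting differs in that models are trained one after another, each focusing on the examples the previous ones got wrong, rather than independently on resampled data.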


Natural Language Processing (NLP)

Tokenization: The process of breaking text down into words or subwords.
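As a minimal sketch of word-level tokenization, the function below lowercases the text and splits it into word and punctuation tokens with a regular expression. The function name and the choice to lowercase are assumptions. Subword tokenizers used by modern language models rely on learned vocabularies and are not covered here.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word and punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", text.lower())


tokens = tokenize("Models learn from data, not rules.")
print(tokens)
# ['models', 'learn', 'from', 'data', ',', 'not', 'rules', '.']
```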

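Stemming: The process of reducing inflected words to their stem or root form.

Lemmatization: The process of grouping together the different inflected forms of a word.

Word Embedding: A learned representation of text in which words with similar meanings have similar representations.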
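"Similar representations" is usually measured with cosine similarity between embedding vectors. The sketch below uses hand-written three-dimensional vectors purely for illustration. Real embeddings are learned from data and typically have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy embeddings, NOT learned from any corpus.
embeddings = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.8, 0.9, 0.2],
    "car": [0.1, 0.2, 0.9],
}

cat_dog = cosine_similarity(embeddings["cat"], embeddings["dog"])
cat_car = cosine_similarity(embeddings["cat"], embeddings["car"])
print(round(cat_dog, 2), round(cat_car, 2))
```

With these toy vectors, "cat" is far closer to "dog" than to "car", which is the behavior a trained embedding is expected to show for related words.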

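Named Entity Recognition (NER): The task of identifying and classifying named entities in text.

Sentiment Analysis: The use of natural language processing to identify and extract subjective information from text.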


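Reinforcement Learning

Agent: The learner or decision-maker in a reinforcement learning scenario.

Environment: The world in which the agent operates and learns.

State: The current situation or condition of the agent in the environment.

Action: A move or decision made by the agent.

Reward: Feedback from the environment used to evaluate the action taken by the agent.

Policy: The strategy the agent uses to decide its next action based on the current state.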
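The terms above can be tied together with a small example. The sketch below uses tabular Q-learning, one of many reinforcement learning algorithms, on a hypothetical five-state corridor where the agent starts at the left end and is rewarded for reaching the right end. The environment, constants, and function names are all invented for illustration.

```python
import random

N_STATES = 5          # states 0..4; state 4 is the goal (terminal)
ACTIONS = (-1, +1)    # action 0 moves left, action 1 moves right


def step(state, action_index):
    """Environment: apply an action and return (next_state, reward, done)."""
    next_state = min(max(state + ACTIONS[action_index], 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train_q_table(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Learn action values with Q-learning under a random behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = rng.randrange(2)          # explore uniformly at random
            next_state, reward, done = step(state, action)
            # Q-learning target: reward plus discounted best future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = train_q_table()
# Policy: in each non-terminal state, pick the action with the highest value.
greedy_policy = [max((0, 1), key=lambda a: q_table[s][a])
                 for s in range(N_STATES - 1)]
print(greedy_policy)
```

The learned policy chooses "right" (action 1) in every state, since that is the shortest route to the reward.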


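Advanced Concepts

Generative Adversarial Network (GAN): A class of machine learning frameworks in which two neural networks compete against each other.

Attention Mechanism: A technique that mimics cognitive attention, enhancing the important parts of the input data and diminishing the irrelevant parts.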
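One widely used form of attention is scaled dot-product attention, the variant used in Transformers. The sketch below handles a single query in plain Python: it scores each key against the query, turns the scores into weights with a softmax, and returns the weighted average of the values. The function names and toy vectors are assumptions for illustration.

```python
import math


def softmax(values):
    """Turn arbitrary scores into positive weights that sum to 1."""
    m = max(values)                       # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def attention(query, keys, values):
    """Scaled dot-product attention for one query vector."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return output, weights


# Toy vectors, invented for this example.
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
output, weights = attention(query, keys, values)
print(weights, output)
```

The first key points in the same direction as the query, so it receives the largest weight and its value dominates the output.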

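Transfer Learning: A research problem in machine learning that focuses on storing knowledge gained while solving one problem and applying it to a different but related problem.

Few-Shot Learning: A type of machine learning in which a model is trained to recognize new classes from only a few examples.

Explainable AI (XAI): Artificial intelligence systems whose results can be understood by humans.

Federated Learning: A machine learning technique that trains an algorithm across multiple decentralized devices or servers holding local data samples.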
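One common way to combine what the devices learn is federated averaging: each client trains on its own data, and a server averages the resulting model weights, weighting each client by how many samples it holds. The sketch below shows only that aggregation step. The function name and the numbers are hypothetical, and real systems add client selection, multiple communication rounds, and privacy protections.

```python
def federated_average(client_weights, client_sizes):
    """Average client model weights, weighted by each client's sample count."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]


# Hypothetical weights from two clients holding 100 and 300 samples.
client_weights = [[1.0, 2.0], [3.0, 4.0]]
client_sizes = [100, 300]
global_weights = federated_average(client_weights, client_sizes)
print(global_weights)
```

Only model weights are sent to the server in this scheme, and the raw data stays on each client. The second client holds three times as much data, so the result of [2.5, 3.5] sits closer to its weights.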

AutoML: The process of automating the end-to-end process of applying machine learning to real-world problems.


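Conclusion

If you are reading this, thank you so much! I really appreciate it ❤️.

Follow me on Twitter at appyzdl5 for regular updates, insights, and engaging conversations about ML.

My GitHub, with projects such as miniGit and ML algorithms from scratch: @appyzdl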

Source: dev.to