大型语言模型的隐私与安全 (Chinese Edition)

by Baihan Lin
January 2026
Beginner to intermediate
318 pages
3h 38m
Chinese
O'Reilly Media, Inc.

Chapter 2. Understanding Large Language Models

This work has been translated using AI. We welcome your feedback and comments: translation-feedback@oreilly.com

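In recent years, large language models (LLMs) have emerged as a breakthrough technology in natural language processing (NLP). These powerful models have transformed how machines understand, generate, and process human language, enabling a wide range of applications such as language translation, text summarization, question answering, and content creation. This chapter explores the fundamentals of LLMs, covering their architecture, pretraining techniques, evaluation metrics, and the privacy and security assessments involved in developing and deploying them.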

Fundamentals of Large Language Models

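LLMs are a class of deep learning models designed to process and generate human language. They are typically trained on massive amounts of text data, which lets them learn the complexities and regularities of language at an unprecedented scale. LLMs can capture the semantic meaning, grammatical structure, and contextual nuances of text, which makes them perform well across a wide range of NLP tasks.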

Basic Building Blocks of Language Models

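We begin by examining the building blocks of language models, from the micro level to the macro level. Experienced readers can skip sections as needed. Later chapters explore specific layers in depth, such as fine-tuning and reinforcement learning from human feedback.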

Neural Networks

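At the core of a language model is an artificial neural network (ANN). An ANN is a computational model inspired by the structure and function of the human brain. It consists of interconnected nodes (neurons) organized in layers, and each node performs a simple operation on its input and passes the result to the next layer.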

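The neuron is the basic unit of an ANN: it receives inputs, applies weights to them, and passes the weighted sum through an activation function to produce an output. An ANN is made up of multiple layers of neurons, with the layers linked by weighted connections (Figure 2-1). The output of one layer becomes the input of the next, which lets the network learn complex patterns and relationships in the data. By adjusting the weights of the connections between neurons, the network learns to map input patterns to the desired outputs.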

Figure 2-1. Artificial neural network (ANN) architecture. The network consists of multiple layers of interconnected neurons, with each neuron applying a weight to its inputs and passing the result through an activation function to produce an output.
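
The neuron and layer mechanics described above can be sketched in a few lines of plain Python. This is a minimal illustration, not an implementation from the book: the sigmoid activation, the layer sizes, and the names `neuron`, `layer`, `forward`, and `random_layer` are all assumptions chosen for the example.

```python
import math
import random

def sigmoid(z):
    """Logistic activation: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias):
    """One neuron: weighted sum of the inputs plus a bias, then an activation."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(weighted_sum)

def layer(inputs, weight_rows, biases):
    """A layer is a group of neurons that all read the same inputs."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]

def forward(inputs, layers):
    """Feed the inputs through each layer; one layer's output is the next layer's input."""
    activation = inputs
    for weight_rows, biases in layers:
        activation = layer(activation, weight_rows, biases)
    return activation

def random_layer(n_in, n_out, rng):
    """Hypothetical helper: random weights and zero biases for one layer."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

# Toy network (sizes are arbitrary): 3 inputs -> 4 hidden neurons -> 2 outputs.
rng = random.Random(0)
network = [random_layer(3, 4, rng), random_layer(4, 2, rng)]
output = forward([0.5, -1.0, 2.0], network)
print(output)
```

Training would consist of adjusting the numbers in `network` so that `forward` maps inputs to the desired outputs; the sketch covers only the forward pass.
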
Tip

When using neural networks for language modeling, keep the following points in mind:

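  • Choose a network architecture suited to the specific task and the characteristics of the input data. Common architectures for language modeling include recurrent neural networks (RNNs), long short-term memory (LSTM) networks, and Transformer-based models, which are covered in detail in later sections.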

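  • Experiment with hyperparameters such as the number of layers, the number of hidden units, and the activation function to find the best configuration. Choosing hyperparameters is something of an art in itself, but techniques such as neural architecture search can assist the process.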

  • Regularize the network to prevent overfitting, using techniques such as dropout or L1/L2 regularization.¹
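
As a rough illustration of the regularization techniques named in the last tip, the sketch below shows inverted dropout and L1/L2 weight penalties in plain Python. The function names, the drop probability, the penalty strength, and the `data_loss` value are hypothetical; deep learning frameworks ship their own versions of these operations.

```python
import random

def dropout(activations, drop_prob, rng, training=True):
    """Inverted dropout: during training, zero each unit with probability
    drop_prob and rescale the survivors so the expected value is unchanged."""
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    if not training or drop_prob == 0.0:
        return list(activations)
    keep_prob = 1.0 - drop_prob
    return [a / keep_prob if rng.random() < keep_prob else 0.0 for a in activations]

def l1_penalty(weights, strength):
    """L1 regularization term: strength times the sum of absolute weights."""
    return strength * sum(abs(w) for w in weights)

def l2_penalty(weights, strength):
    """L2 regularization term: strength times the sum of squared weights."""
    return strength * sum(w * w for w in weights)

rng = random.Random(0)
hidden = [0.2, 1.5, -0.7, 0.9]
dropped = dropout(hidden, drop_prob=0.5, rng=rng)

weights = [0.5, -2.0, 1.0]
data_loss = 0.30  # hypothetical loss measured on the training data
total_loss = data_loss + l2_penalty(weights, strength=0.01)
print(dropped, total_loss)
```

At inference time `dropout` is called with `training=False` and returns the activations unchanged, while the penalty terms apply only to the training loss.
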

Recurrent Neural Networks

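A recurrent neural network (RNN) is a neural network architecture particularly well suited to sequential data such as text. Unlike feedforward neural networks, which process each input independently, an RNN typically processes tokens one at a time in sequence (Figure 2-2) and uses an internal state to capture dependencies between elements of the sequence.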

Figure 2-2. Sequence processing in an RNN model, showing the flow of inputs, model states, and outputs over time steps. Inputs are fed into the RNN one word at a time, and outputs are likewise generated one word at a time.

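In an RNN, the output at each time step depends not only on the current input but also on the previous hidden state. This lets the network retain a "memory" of past inputs and learn temporal dependencies in the data.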
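
The dependence on the previous hidden state can be made concrete with a small sketch. It assumes a simple recurrent update with a tanh activation, h_t = tanh(W_xh x_t + W_hh h_(t-1) + b_h), which is one common formulation rather than one the text specifies. The weight values and the names `rnn_step` and `run_rnn` are made up for illustration.

```python
import math

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    """One time step: combine the current input with the previous hidden state."""
    from_input = matvec(W_xh, x_t)
    from_state = matvec(W_hh, h_prev)
    return [math.tanh(i + s + b) for i, s, b in zip(from_input, from_state, b_h)]

def run_rnn(sequence, W_xh, W_hh, b_h):
    """Process a sequence one element at a time, carrying the hidden state forward."""
    h = [0.0] * len(b_h)  # initial hidden state: no memory yet
    states = []
    for x_t in sequence:
        h = rnn_step(x_t, h, W_xh, W_hh, b_h)
        states.append(h)
    return states

# Hypothetical toy weights: 2-dimensional inputs and a 2-dimensional hidden state.
W_xh = [[0.5, -0.3], [0.1, 0.8]]
W_hh = [[0.2, 0.0], [-0.4, 0.6]]
b_h = [0.0, 0.0]

# Two sequences that end with the same input but start differently.
seq_a = [[1.0, 0.0], [0.0, 1.0]]
seq_b = [[0.0, 0.0], [0.0, 1.0]]
states_a = run_rnn(seq_a, W_xh, W_hh, b_h)
states_b = run_rnn(seq_b, W_xh, W_hh, b_h)
print(states_a[-1], states_b[-1])
```

Both sequences share the same final input, yet their final hidden states differ because each state also carries information from the earlier step.
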

Traditionally, RNNs have been widely used for a variety of NLP tasks, such as:

Language modeling

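Predicting the next word in a sequence based on the preceding text. You will understand this mechanism more deeply when we discuss how language models are trained later.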

Machine translation

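Converting text from one language to another. The principle is intuitive: the source-language sentence is fed into the model as a sequence, and the model then generates, in order, the words that make up that sentence in the target language. ...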
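
To make the language modeling task listed above concrete, here is a deliberately simple stand-in: a count-based bigram model that predicts the next word from the single preceding word. It is not an RNN and not the method the book goes on to describe. The toy corpus and the names `train_bigram` and `predict_next` are assumptions for illustration only.

```python
from collections import Counter, defaultdict

def train_bigram(tokens):
    """Count how often each word follows each preceding word."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = counts.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]

corpus = "the cat sat on the mat and the cat slept".split()
model = train_bigram(corpus)
print(predict_next(model, "the"))  # → cat
```

An RNN-based language model plays the same role as `predict_next`, but it conditions on a hidden state that summarizes the whole preceding text instead of only the last word.
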

Publisher Resources

ISBN: 0642572313869