Lesson 10: Virtual Processors

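This lesson covers the third pillar of hardware abstraction: the virtual processor (thread). We will see how the system drops the assumption that one task owns one processor, and uses time-sharing so that hundreds or thousands of threads can run concurrently on a limited number of CPUs.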

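1. From Function Calls to the Thread Abstraction

To understand threads, first recall how function calls work under the hood:

  • Stack frame: each function call allocates a region on the stack that holds local variables, saved registers, arguments and, most importantly, the return address (the saved PC).
  • Execution-flow abstraction: a computation in progress is abstracted as a thread.
    • Core state: an instruction reference (PC), an environment reference (stack pointer SP plus registers), and an instruction set.
    • In essence: thread = stack top (SP) + stack bottom (the stack itself) + state descriptor.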

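2. Thread Lifecycle: Creation and Destruction

To manage threads, the system needs a complete set of lifecycle primitives. Their design contains some clever tricks.

2.1 Dynamic Creation (allocate_thread)

When the system needs a new execution flow:

  1. Allocate memory: request a new stack.
  2. Fabricate the context: push two frames onto the new stack in advance:
    • Bottom frame: its return address points to exit_thread.
    • Top frame: its return address points to the user-specified starting_procedure.
  3. Benefit: when the thread's main function executes return, the CPU pops the return address left in the bottom frame and jumps to exit_thread, so the thread destroys itself. The programmer does not have to write cleanup code by hand.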

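2.2 Destruction (destroy_thread)

  • Self-destruction: when the thread has finished its work it calls exit_thread, marks its state as free or kill, and then voluntarily gives up the CPU.
  • Forced destruction: what if a thread is stuck in an infinite loop? An external caller invokes destroy(id), which marks the target's state as kill. The next time the scheduler reaches that thread, it checks the state and cleans the thread up on the spot (lazy cleanup).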

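3. Core Mechanism: Switching and Scheduling (yield)

How can the CPU be handed from one thread to another without disturbing the logic of either computation?

For clarity, we split the system into two layers:

  • Thread layer: runs ordinary user code.
  • Processor layer (kernel layer): runs the scheduler code and has special privileges.

When a thread voluntarily gives up the CPU by calling yield:

  1. enter_processor_layer: save the current thread's context (PC, SP, general-purpose registers) and change its state from Running to Runnable.

  2. scheduler(): the scheduler takes over. It scans the thread table for the next thread whose state is Runnable.

    Runnable is what operating systems usually call the ready state.

    • Problem: if every thread is suspended, where does the scheduler itself run?
    • Solution: a processor-layer thread. At startup the system creates one special idle thread per CPU whose only job is to run the scheduler, so the CPU always has something to execute.
  3. exit_processor_layer: load the new thread's SP and restore its registers.

  4. Jump: the CPU loads the new thread's PC and resumes the new thread exactly where it last stopped.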

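4. Preemptive Scheduling: Enforced Fairness

Relying on threads to call yield voluntarily (cooperative scheduling) is unreliable: a malicious program can run while(1) and monopolize the CPU. We need preemptive scheduling.

4.1 Mechanism: Clock Interrupts

  • Hardware timer: raises an interrupt every few milliseconds.
  • Forced yield: the interrupt handler stops the current thread, acts as if that thread had called yield, and hands control to the kernel.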

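4.2 Concurrency Challenge: Deadlock

Once interrupts are introduced, the system faces a risk of deadlock:

  • Scenario: thread A is in the middle of yield and has acquired the thread-table lock in order to change states.

  • Surprise: an interrupt arrives at exactly that moment.

  • Deadlock: the interrupt handler also needs the thread table, so it tries to acquire the same lock.

    • Result: thread A, still holding the lock, cannot continue until the interrupt handler finishes, and the handler cannot finish until the lock is released. The CPU hangs.
  • Countermeasure: disable interrupts.

    • Rule: before acquiring any spinlock, disable local interrupts first, and re-enable them only after the lock has been released.

      This is only one way to avoid the deadlock; the operating systems course covers more thorough solutions.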

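5. Better Modularity: Threads + Virtual Memory

So far the system switches only the execution flow (SP/PC), while all threads still share one memory space (Assumption 3). To get real isolation (the goal of Lesson 06), we bring in the PMAR (Page Map Address Register).

  • Full switch
    When exit_processor_layer restores the context, it must restore SP and PC and also point the PMAR at the base address of the new thread's page table.
  • Effect
    When the new thread wakes up, not only has the code location changed, its whole memory address space (in effect, the entire world it lives in) has changed as well. This is the complete form of a virtual computer.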

6. Parallel Text

Lesson 10 Virtual Processors

This lesson explores the third pillar of hardware abstraction: Virtual Processors (Threads). We will learn how the system breaks the assumption that “one task exclusively owns one processor” and uses Time-sharing to allow hundreds of threads to execute concurrently on limited CPUs.

1. From Function Calls to Thread Abstraction

To understand threads, we must first review the underlying mechanism of function calls:

  • Stack Frame: Every function call allocates a region on the stack to save local variables, register states, parameters, and most importantly, the Return Address (PC).
  • Execution Flow Abstraction: Abstracting “a computation in progress” as a Thread.
    • Core State: Instruction Reference (PC), Environment Reference (Stack Pointer SP + Registers), Instruction Set.
    • Essence: Thread = Stack Top (SP) + Stack Bottom (Stack) + State Descriptor.
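
The "Essence" line can be made concrete with a small data structure. The sketch below is only an illustration in Python: the class name Thread, its fields and the push/pop helpers are hypothetical and are not taken from the lesson's system.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class State(Enum):
    RUNNABLE = auto()
    RUNNING = auto()


@dataclass
class Thread:
    """Hypothetical descriptor for one execution flow."""
    pc: int = 0                                    # instruction reference
    sp: int = 0                                    # stack top: index of the next free slot
    stack: list = field(default_factory=list)      # the thread's private stack
    registers: dict = field(default_factory=dict)  # rest of the environment
    state: State = State.RUNNABLE                  # state descriptor

    def push(self, value):
        self.stack.append(value)
        self.sp = len(self.stack)

    def pop(self):
        value = self.stack.pop()
        self.sp = len(self.stack)
        return value


t = Thread()
t.push("return-address")
t.push("local-variable")
```

Everything the system needs to suspend and later resume the computation lives in this one record, which is why switching threads reduces to saving and restoring it.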

2. Thread Lifecycle: Creation and Destruction

To manage threads, the system needs a complete set of lifecycle primitives, involving some very clever system design tricks.

2.1 Dynamic Creation (allocate_thread)

When the system needs a new execution flow:

  1. Allocate Memory: Request a new stack space.
  2. Fabricate the Scene: Pre-push two stack frames onto the new stack:
    • Stack Bottom: Points to exit_thread.
    • Stack Top: Points to the user-specified starting_procedure.
  3. Benefit: When the thread’s main function executes return, the CPU pops the return address left in the bottom frame and jumps to exit_thread, so the thread destroys itself. Programmers don’t need to write manual cleanup code.
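
The fabricated-stack trick can be imitated with a toy model. In the sketch below a thread is a plain dictionary, "return addresses" are Python functions, and run() plays the role of the CPU executing return instructions. The helper names run and hello, and the dictionary layout, are assumptions made for the illustration.

```python
def exit_thread(thread):
    thread["state"] = "KILL"          # hypothetical marker: ready to be reclaimed
    thread["log"].append("exit_thread")


def allocate_thread(starting_procedure):
    thread = {"state": "RUNNABLE", "log": [], "stack": []}
    # Fabricated frames: the bottom one "returns" into exit_thread,
    # the top one "returns" into the user's starting procedure.
    thread["stack"].append(exit_thread)
    thread["stack"].append(starting_procedure)
    return thread


def run(thread):
    # Toy CPU: every return pops the next return address and jumps to it.
    while thread["stack"]:
        target = thread["stack"].pop()
        target(thread)


def hello(thread):
    thread["log"].append("hello")     # user code; note there is no cleanup call


t = allocate_thread(hello)
run(t)
```

Although hello never mentions exit_thread, the log ends with it, because the return out of hello lands on the address planted at the bottom of the stack.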

2.2 Destruction (destroy_thread)

  • Self-Destruction: After finishing its task, the thread calls exit_thread, marks its state as free or kill, and actively yields the CPU.
  • Forced Destruction: What if a thread enters an infinite loop? An external call to destroy(id) marks its state as kill. The next time the scheduler picks this thread, it checks the state and cleans it up (lazy cleanup).
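
Lazy cleanup can be sketched as a flag that destroy sets and the scheduler acts on later. The table layout, the state names and the function scheduler_pick below are hypothetical; the point is only that destroy itself reclaims nothing.

```python
FREE, RUNNABLE, KILL = "FREE", "RUNNABLE", "KILL"

thread_table = [
    {"id": 0, "state": RUNNABLE, "stack": [0] * 4},
    {"id": 1, "state": RUNNABLE, "stack": [0] * 4},   # imagine this one loops forever
]


def destroy(thread_id):
    # Only sets a flag; nothing is reclaimed at this point.
    thread_table[thread_id]["state"] = KILL


def scheduler_pick(start):
    """Return the id of the next runnable thread, reaping KILL entries on the way."""
    n = len(thread_table)
    for offset in range(1, n + 1):
        entry = thread_table[(start + offset) % n]
        if entry["state"] == KILL:
            entry["stack"] = None      # release the stack
            entry["state"] = FREE      # the slot can be reused later
        elif entry["state"] == RUNNABLE:
            return entry["id"]
    return None


destroy(1)
state_before = thread_table[1]["state"]   # still KILL: cleanup has not happened yet
next_id = scheduler_pick(start=0)
```

The stack of thread 1 is released only when the scheduler's scan reaches its entry, which keeps destroy cheap and puts all reclamation in one place.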

3. Core Mechanism: Switching & Scheduling (Yield)

How do we pass the CPU from one thread to another without interrupting the computation logic?

For clarity, we divide the system into two layers:

  • Thread Layer: Runs ordinary user code.
  • Processor Layer (Kernel Layer): Runs scheduler code and possesses special privileges.

When a thread actively yields the CPU (calls yield):

  1. enter_processor_layer: Save the current thread’s context (PC, SP, General Purpose Registers) and change its state from Running to Runnable.

  2. scheduler(): The scheduler enters the stage. It scans the Thread Table to find the next thread waiting to be scheduled (state is Runnable).

    That is, the “Ready” state in Operating Systems.

    • Problem: If all threads are paused, where does the scheduler run?
    • Solution: Processor Layer Thread. Upon startup, the system creates a special idle thread for each CPU specifically to run the scheduler, ensuring the CPU always has something to do.
  3. exit_processor_layer: Retrieve the new thread’s SP and restore its registers.

  4. Jump: The CPU loads the new thread’s PC, instantly restoring execution to where the new thread last paused.
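
The four steps map loosely onto Python generators, which also save their position and local variables when they pause. The sketch below is an analogy, not the lesson's implementation: worker and run_all are hypothetical names, and a generator's yield stands in for a thread calling yield.

```python
from collections import deque


def worker(name, steps, trace):
    for i in range(steps):
        trace.append(f"{name}{i}")
        yield                     # stands in for the thread calling yield()


def run_all(threads):
    run_queue = deque(threads)    # every entry here is Runnable
    while run_queue:
        current = run_queue.popleft()     # scheduler(): pick the next Runnable
        try:
            next(current)                 # exit_processor_layer + jump: resume it
        except StopIteration:
            continue                      # the thread returned; drop it
        run_queue.append(current)         # enter_processor_layer: Running -> Runnable


trace = []
run_all([worker("A", 2, trace), worker("B", 3, trace)])
```

The trace interleaves the two workers (A0, B0, A1, B1, B2), and each one resumes exactly after the point where it last paused.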

4. Preemptive Scheduling: Enforced Fairness

Relying on threads to voluntarily call yield (Cooperative Scheduling) is unreliable; malicious programs could write a while(1) loop to hog the CPU. We need Preemptive Scheduling.

4.1 Mechanism: Clock Interrupts

  • Hardware Timer: Triggers an interrupt every few milliseconds.
  • Forced Yield: The interrupt handler forcibly interrupts the current thread, “pretends” it called yield, and hands control over to the kernel.
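
A forced yield can be simulated by counting instructions. In the sketch below each generator yield marks an instruction boundary where an interrupt may land, not a voluntary yield. QUANTUM, hog, polite and run_preemptive are hypothetical names, and the quantum of 3 instructions is an arbitrary choice for the example.

```python
from collections import deque
from itertools import count

QUANTUM = 3   # hypothetical: the timer fires after every 3 instructions


def hog(trace):
    for _ in count():             # while(1): never gives up the CPU on its own
        trace.append("H")
        yield                     # instruction boundary


def polite(trace, steps):
    for _ in range(steps):
        trace.append("P")
        yield


def run_preemptive(threads, total_ticks):
    run_queue = deque(threads)
    ticks = 0
    while run_queue and ticks < total_ticks:
        current = run_queue.popleft()
        finished = False
        used = 0
        while used < QUANTUM and ticks < total_ticks:
            try:
                next(current)          # execute one instruction
            except StopIteration:
                finished = True        # the thread returned on its own
                break
            used += 1
            ticks += 1
        if not finished:
            run_queue.append(current)  # timer interrupt: forced yield


trace = []
run_preemptive([hog(trace), polite(trace, steps=2)], total_ticks=8)
timeline = "".join(trace)
```

Even though hog never cooperates, polite still gets its turn after the first quantum expires.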

4.2 Concurrency Challenge: Deadlock

After introducing interrupts, the system still faces the risk of Deadlock:

  • Scenario: Thread A is calling yield, has acquired the Thread Table Lock, and is preparing to switch states.

  • Accident: Suddenly, an interrupt arrives!

  • Deadlock: The interrupt handler also needs to access the thread table, so it attempts to acquire the lock again.

    • Result: Thread A holds the lock waiting for the interrupt to finish; the interrupt handler waits for the lock to be released. The CPU hangs.
  • Countermeasure: Disable Interrupts.

    • Principle: Before acquiring any spinlock, local interrupts must be disabled. Interrupts can only be re-enabled after unlocking.

      This is only one way to avoid the deadlock; the Operating Systems course covers more detailed solutions.
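
The scenario and the countermeasure can be replayed on a simulated single CPU. Everything in the sketch below (the Cpu class, timer_handler, unsafe_yield, safe_yield) is hypothetical, and the self-deadlock is reported as an exception because a real spinlock would simply spin forever.

```python
class Cpu:
    def __init__(self):
        self.interrupts_enabled = True
        self.pending = []            # interrupts that arrived while disabled
        self.lock_held = False       # the thread-table spinlock
        self.log = []

    def raise_interrupt(self, handler):
        if self.interrupts_enabled:
            handler(self)                    # runs immediately, on top of the thread
        else:
            self.pending.append(handler)     # deferred, not lost

    def enable_interrupts(self):
        self.interrupts_enabled = True
        while self.pending:
            self.pending.pop(0)(self)

    def acquire(self):
        if self.lock_held:
            # A real spinlock would spin here forever on a single CPU.
            raise RuntimeError("deadlock: lock already held on this CPU")
        self.lock_held = True

    def release(self):
        self.lock_held = False


def timer_handler(cpu):
    cpu.acquire()
    cpu.log.append("handler touched thread table")
    cpu.release()


def unsafe_yield(cpu):
    cpu.acquire()
    cpu.raise_interrupt(timer_handler)   # interrupt lands inside the critical section
    cpu.release()


def safe_yield(cpu):
    cpu.interrupts_enabled = False       # disable interrupts before taking the lock
    cpu.acquire()
    cpu.raise_interrupt(timer_handler)   # same timing, but now deferred
    cpu.log.append("yield updated thread table")
    cpu.release()
    cpu.enable_interrupts()              # the handler runs only now


unsafe_cpu = Cpu()
try:
    unsafe_yield(unsafe_cpu)
    outcome = "ok"
except RuntimeError:
    outcome = "deadlock"

safe_cpu = Cpu()
safe_yield(safe_cpu)
```

With interrupts disabled around the critical section, the handler is delayed rather than dropped, so it runs only after the lock is free.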

5. Better Modularity: Threads + Virtual Memory

So far, the system has only switched the execution flow (SP/PC), but threads still share the same memory space (Assumption 3). To achieve true isolation (the goal of Lesson 06), we need to introduce the PMAR (Page Map Address Register).

  • Complete Switching:
    When restoring context in exit_processor_layer, besides restoring SP and PC, we must also point the PMAR register to the new thread’s page table base address.
  • Effect:
    When the new thread wakes up, not only has the code location changed, but the memory address space (essentially its entire world) has changed. This is the complete form of a Virtual Computer.
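
The effect of switching the PMAR can be shown with a toy address translator. The Mmu class, the single-level dictionary page tables and the frame numbers 7 and 9 below are all assumptions for the illustration; restoring SP and PC is left out so that only the address-space switch is visible.

```python
physical_memory = {}          # (frame, offset) -> value


class Mmu:
    def __init__(self):
        self.pmar = None      # points at the running thread's page table

    def translate(self, vaddr, page_size=4096):
        page, offset = divmod(vaddr, page_size)
        return self.pmar[page], offset

    def store(self, vaddr, value):
        physical_memory[self.translate(vaddr)] = value

    def load(self, vaddr):
        return physical_memory[self.translate(vaddr)]


# Hypothetical page tables: virtual page 0 maps to a different frame per thread.
page_table_a = {0: 7}
page_table_b = {0: 9}

mmu = Mmu()


def exit_processor_layer(page_table):
    # SP and PC restoration omitted; only the address-space switch is shown.
    mmu.pmar = page_table


exit_processor_layer(page_table_a)
mmu.store(0x10, "A's data")
exit_processor_layer(page_table_b)
mmu.store(0x10, "B's data")       # same virtual address, different frame
exit_processor_layer(page_table_a)
value_seen_by_a = mmu.load(0x10)
```

Both threads used virtual address 0x10, yet neither overwrote the other, because the PMAR selected a different page table each time.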

Author: linda1729
Reprint policy: Unless otherwise stated, all articles in this blog are licensed under CC BY 4.0. If you reproduce this article, please credit the source: linda1729.