Undefined behavior

From cppreference.net

Renders the entire program meaningless if certain rules of the language are violated.

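Erroneous behavior is always the consequence of incorrect program code.
The evaluation of a constant expression never results in erroneous behavior.
If the execution contains an operation specified as having erroneous behavior, the implementation is permitted and recommended to issue a diagnostic, and is permitted to terminate the execution at an unspecified time after that operation.
An implementation can issue a diagnostic if it can determine that erroneous behavior is reachable under an implementation-specific set of assumptions about the program behavior, which can result in false positives.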

Example of erroneous behavior

#include <cassert>
#include <cstring>
void f()
{   
    int d1, d2;       // d1, d2 have erroneous values
    int e1 = d1;      // erroneous behavior
    int e2 = d1;      // erroneous behavior
    assert(e1 == e2); // holds
    assert(e1 == d1); // holds, erroneous behavior
    assert(e2 == d1); // holds, erroneous behavior
    std::memcpy(&d2, &d1, sizeof(int)); // no erroneous behavior, but
                                        // d2 has an erroneous value
    assert(e1 == d2); // holds, erroneous behavior
    assert(e2 == d2); // holds, erroneous behavior
}
unsigned char g(bool b)
{
    unsigned char c;     // c has erroneous value
    unsigned char d = c; // no erroneous behavior, but d has an erroneous value
    assert(c == d);      // holds, both integral promotions have erroneous behavior
    int e = d;           // erroneous behavior
    return b ? d : 0;    // erroneous behavior if b is true
}

(since C++26)

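undefined behavior - there are no restrictions on the behavior of the program.

Some examples of undefined behavior are data races, memory accesses outside of array bounds, signed integer overflow, null pointer dereference, more than one modification of the same scalar in an expression without any intervening sequence point (until C++11) where the modifications are unsequenced (since C++11), access to an object through a pointer of a different type, etc.
Implementations are not required to diagnose undefined behavior (although many simple situations are diagnosed), and the compiled program is not required to do anything meaningful.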
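
Constant evaluation is one place where a diagnostic can be relied on: an expression whose evaluation would have core-language undefined behavior is not a core constant expression, so it is rejected wherever a constant expression is required. The following is a minimal sketch of that effect; the function name `increment` is hypothetical and chosen only for illustration.

```cpp
#include <limits>

// Hypothetical helper: overflows only when x is the largest int.
constexpr int increment(int x)
{
    return x + 1;
}

// Well-defined argument: the call is usable in a constant expression.
static_assert(increment(41) == 42, "evaluated at compile time");

// Uncommenting the next line is expected to make the program ill-formed,
// because the evaluation would overflow and therefore is not a core
// constant expression:
// constexpr int bad = increment(std::numeric_limits<int>::max());
```

At run time the same overflowing call would simply be undefined, with no diagnostic required.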

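runtime-undefined behavior - the behavior that is undefined except when it occurs during the evaluation of an expression as a core constant expression.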

(since C++11)

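UB and optimization

Because a correct C++ program is free of undefined behavior, an optimizing compiler may assume that UB never occurs. When a program that actually has UB is compiled with optimization enabled, the results can therefore be unexpected:

For example,

Signed overflow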

int foo(int x)
{
    return x + 1 > x; // either true or UB due to signed overflow
}

May be compiled as:

foo(int):
        mov     eax, 1
        ret
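
If the intent of `foo` is to ask whether `x + 1` is greater than `x`, the question can be answered without ever evaluating the overflowing expression. The sketch below assumes that "not representable" should be reported as false; the name `next_is_greater` is hypothetical and not part of the original example.

```cpp
#include <limits>

// Hypothetical overflow-free counterpart of foo: answers "is x + 1 greater
// than x?" without computing x + 1, so the result is defined for every x.
constexpr bool next_is_greater(int x)
{
    return x < std::numeric_limits<int>::max();
}
```

Because no arithmetic is performed on `x`, the compiler has no undefined case to exploit and cannot fold the function to a constant.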

Access out of bounds

int table[4] = {};
bool exists_in_table(int v)
{
    // returns true in one of the first 4 iterations, or UB due to out-of-bounds access
    for (int i = 0; i <= 4; i++)
        if (table[i] == v)
            return true;
    return false;
}

May be compiled as:

exists_in_table(int):
        mov     eax, 1
        ret
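
A version without the off-by-one error can derive the bound from the array itself instead of repeating the literal size. This is only a sketch; the name `exists_in_table_checked` is hypothetical.

```cpp
#include <algorithm>
#include <iterator>

int table[4] = {};

// Searches exactly the four valid elements; std::begin/std::end take the
// bounds from the array type, so the loop cannot run past the end.
bool exists_in_table_checked(int v)
{
    return std::find(std::begin(table), std::end(table), v) != std::end(table);
}
```

With every access in bounds, the function can return false, so the compiler can no longer reduce it to `return true`.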

Uninitialized scalar

#include <cstddef>
std::size_t f(int x)
{
    std::size_t a;
    if (x) // either x is nonzero or UB
        a = 42;
    return a;
}

May be compiled as:

f(int):
        mov     eax, 42
        ret

The output shown was observed on an older version of gcc.

#include <cstdio>
int main()
{
    bool p; // uninitialized local variable
    if (p)  // UB access to uninitialized scalar
        std::puts("p is true");
    if (!p) // UB access to uninitialized scalar
        std::puts("p is false");
}

Possible output:

p is true
p is false
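
Both examples become well-defined once the variable has a value on every path. The sketch below revisits `f` and assumes that 0 is an acceptable result when `x` is zero (the original leaves that case unspecified); the name `f_initialized` is hypothetical.

```cpp
#include <cstddef>

// Hypothetical counterpart of f with a value on every path.
std::size_t f_initialized(int x)
{
    std::size_t a = 0; // assumed default for the x == 0 case
    if (x)
        a = 42;
    return a;
}
```

The compiler must now keep the branch, because both outcomes are observable.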

Invalid scalar

int f()
{
    bool b = true;
    unsigned char* p = reinterpret_cast<unsigned char*>(&b);
    *p = 10;
    // reading from b is now UB
    return b == 0;
}

May be compiled as:

f():
        mov     eax, 11
        ret

Null pointer dereference

The examples demonstrate reading from the result of dereferencing a null pointer.

int foo(int* p)
{
    int x = *p;
    if (!p)
        return x; // either UB above or this branch is never taken
    else
        return 0;
}
int bar()
{
    int* p = nullptr;
    return *p; // unconditional UB
}

May be compiled as:

foo(int*):
        xor     eax, eax
        ret
bar():
        ret
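
The null check in `foo` is removed because it comes after the dereference. Testing the pointer first keeps the check meaningful. A minimal sketch follows; the helper `read_or` and its `fallback` parameter are hypothetical.

```cpp
// Hypothetical helper: returns *p, or fallback when p is null.
// The pointer is tested before it is dereferenced, so neither branch
// depends on undefined behavior.
int read_or(const int* p, int fallback)
{
    if (!p)
        return fallback;
    return *p;
}
```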

Access to pointer passed to std::realloc

Choose clang to observe the output shown.

#include <cstdlib>
#include <iostream>
int main()
{
    int* p = (int*)std::malloc(sizeof(int));
    int* q = (int*)std::realloc(p, sizeof(int));
    *p = 1; // UB access to a pointer that was passed to realloc
    *q = 2;
    if (p == q) // UB access to a pointer that was passed to realloc
        std::cout << *p << *q << '\n';
}

Possible output:

12
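
The usual way to stay clear of this is to treat the pointer passed to std::realloc as dead once the call succeeds and to continue only with the returned pointer. The following is a sketch under the assumptions that the buffer came from std::malloc or std::realloc and that `n` is nonzero; the helper name `resize_ints` is hypothetical.

```cpp
#include <cstdlib>

// Hypothetical helper: resizes a malloc'ed int buffer to n elements (n > 0).
// On success buf is overwritten with the new pointer, so the old value can
// no longer be used by accident. On failure buf is left untouched and the
// original block remains valid.
bool resize_ints(int*& buf, std::size_t n)
{
    void* q = std::realloc(buf, n * sizeof(int));
    if (!q)
        return false;           // the old block is still owned by the caller
    buf = static_cast<int*>(q); // from here on only the new pointer exists
    return true;
}
```

Storing the result in a temporary first also avoids leaking the original block when the reallocation fails.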

Infinite loop without side-effects

Choose clang or the latest gcc to observe the output shown.

#include <iostream>
bool fermat()
{
    const int max_value = 1000;
    // Non-trivial infinite loop with no side effects is UB
    for (int a = 1, b = 1, c = 1; true; )
    {
        if (((a * a * a) == ((b * b * b) + (c * c * c))))
            return true; // disproved :()
        a++;
        if (a > max_value)
        {
            a = 1;
            b++;
        }
        if (b > max_value)
        {
            b = 1;
            c++;
        }
        if (c > max_value)
            c = 1;
    }
    return false; // not disproved
}
int main()
{
    std::cout << "Fermat's Last Theorem ";
    fermat()
        ? std::cout << "has been disproved!\n"
        : std::cout << "has not been disproved.\n";
}

Possible output:

Fermat's Last Theorem has been disproved!
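
The surprising result depends on the loop never terminating. If the search is given a finite trip count, every path returns and the compiler has nothing to assume away. The sketch below is one such bounded variant; the name `fermat_bounded` is hypothetical, and it assumes `max_value` is at most 1000000 so that the cubes fit in `long long`.

```cpp
// Hypothetical bounded variant of fermat(): each loop has a finite trip
// count, so the function always returns.
// Assumption: max_value <= 1000000, which keeps a*a*a and b*b*b + c*c*c
// within the range of long long.
bool fermat_bounded(int max_value)
{
    for (long long c = 1; c <= max_value; ++c)
        for (long long b = 1; b <= max_value; ++b)
            for (long long a = 1; a <= max_value; ++a)
                if (a * a * a == b * b * b + c * c * c)
                    return true; // would be a counterexample for exponent 3
    return false; // no counterexample among the tested triples
}
```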

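Ill-formed with diagnostic message

Note that compilers are permitted to extend the language in ways that give meaning to ill-formed programs. The only thing the C++ standard requires in such cases is a diagnostic message (compiler warning), unless the program was "ill-formed no diagnostic required".

For example, unless language extensions are disabled via --pedantic-errors, GCC will compile the following example with only a warning, even though it appears in the C++ standard as an example of an "error" (see also GCC Bugzilla #55783).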

#include <iostream>
// Example tweak, do not use constant
double a{1.0};
// C++23 standard, §9.4.5 List-initialization [dcl.init.list], Example #6:
struct S
{
    // no initializer-list constructors
    S(int, double, double); // #1
    S();                    // #2
    // ...
};
S s1 = {1, 2, 3.0}; // OK: invoke #1
S s2{a, 2, 3}; // error: narrowing
S s3{}; // OK: invoke #2
// (end of example)
S::S(int, double, double) {}
S::S() {}
int main()
{
    std::cout << "All checks have passed.\n";
}

Possible output:

main.cpp:17:6: error: type 'double' cannot be narrowed to 'int' in initializer ⮠
list [-Wc++11-narrowing]
S s2{a, 2, 3}; // error: narrowing
     ^
main.cpp:17:6: note: insert an explicit cast to silence this issue
S s2{a, 2, 3}; // error: narrowing
     ^
     static_cast<int>( )
1 error generated.
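
The note in the diagnostic points at the portable fix: make the conversion explicit. The sketch below applies it to `s2`. The member `first` is an addition made only so that the effect can be observed and is not part of the example from the standard.

```cpp
double a{1.0};

struct S
{
    int first = 0; // illustrative member, not in the standard's example
    S(int i, double, double) : first(i) {} // #1
    S() {}                                 // #2
};

// The explicit conversion removes the narrowing, so this is well-formed
// without relying on any language extension.
S s2{static_cast<int>(a), 2, 3};
```

Written this way, the initialization no longer depends on whether the compiler treats narrowing as a warning or as an error.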

References

C++23 standard (ISO/IEC 14882:2024):
3.25 ill-formed program [defns.ill.formed]
3.26 implementation-defined behavior [defns.impl.defined]
3.66 unspecified behavior [defns.unspecified]
3.68 well-formed program [defns.well.formed]

C++20 standard (ISO/IEC 14882:2020):
TBD ill-formed program [defns.ill.formed]
TBD implementation-defined behavior [defns.impl.defined]
TBD unspecified behavior [defns.unspecified]
TBD well-formed program [defns.well.formed]

C++17 standard (ISO/IEC 14882:2017):
TBD ill-formed program [defns.ill.formed]
TBD implementation-defined behavior [defns.impl.defined]
TBD unspecified behavior [defns.unspecified]
TBD well-formed program [defns.well.formed]

C++14 standard (ISO/IEC 14882:2014):
TBD ill-formed program [defns.ill.formed]
TBD implementation-defined behavior [defns.impl.defined]
TBD unspecified behavior [defns.unspecified]
TBD well-formed program [defns.well.formed]

C++11 standard (ISO/IEC 14882:2011):
TBD ill-formed program [defns.ill.formed]
TBD implementation-defined behavior [defns.impl.defined]
TBD unspecified behavior [defns.unspecified]
TBD well-formed program [defns.well.formed]

C++98 standard (ISO/IEC 14882:1998):
TBD ill-formed program [defns.ill.formed]
TBD implementation-defined behavior [defns.impl.defined]
TBD unspecified behavior [defns.unspecified]
TBD well-formed program [defns.well.formed]

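See also

`[[ assume ( expression )]]` (C++23)	specifies that the expression will always evaluate to true at a given point (attribute specifier)
`[[ indeterminate ]]` (C++26)	specifies that an object has an indeterminate value if it is not initialized (attribute specifier)
unreachable (C++23)	marks an unreachable point of execution (function)
C documentation for Undefined behavior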

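External links

1.	The LLVM Project Blog: What Every C Programmer Should Know About Undefined Behavior #1/3
2.	The LLVM Project Blog: What Every C Programmer Should Know About Undefined Behavior #2/3
3.	The LLVM Project Blog: What Every C Programmer Should Know About Undefined Behavior #3/3
4.	Undefined behavior can result in time travel (among other things, but time travel is the funkiest)
5.	Understanding Integer Overflow in C/C++
6.	Fun with NULL pointers, part 1 (local exploit in Linux 2.6.30 caused by UB due to null pointer dereference)
7.	Undefined Behavior and Fermat's Last Theorem
8.	C++ programmer's guide to undefined behaviour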

