Learner's Notes: Exploring C++ Undefined Behavior (Part 3) – Best Practices for Low-Level Object Manipulation
Quick Note:
If you’re into the nitty-gritty details, be sure to check out Part 1 and Part 2. But if you’re here for practical advice, welcome to Part 3!
Best Practices
As promised in Part 2, here are the three key rules to keep in mind when dealing with low-level object manipulation:
- Always use arrays of std::byte or unsigned char as memory buffers instead of char.
- Always check whether std::launder is necessary.
- Always check whether you need to call std::destroy_at.
That’s it! If you just wanted the TLDR, you’re good to go. But if you’re curious about why these rules matter, stick around. The rest of this post will take a FAQ-style approach, where I go through key questions that came up during my own deep dive into these best practices.
Rule 1 FAQ: Using arrays of std::byte as Memory Buffers
Q: What do arrays of std::byte or unsigned char provide that arrays of char don’t?
A: Besides implicitly creating objects of implicit-lifetime types (as covered in previous posts), arrays of std::byte or unsigned char also provide storage for those created objects, as stated in [intro.object]/3.
Combining this with [intro.object]/4 and [basic.life]/1.5, we see that the lifetime of a std::byte or unsigned char array does not end when we reuse its storage for another object. In contrast, reusing storage from a char array does end its lifetime.
Q: Why does it matter that the lifetime of arrays of std::byte or unsigned char doesn’t end when reusing their storage?
A: This is a surprisingly tricky question to answer just by looking at the C++23 standard, but let’s go with an example to illustrate why it’s important:
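#include <iostream>
#include <new>  // placement new

struct SimpleIntBuffer {
    alignas(int) char buf[64];
    void f() { std::cout << "Hello World\n"; }
};

int main() {
    SimpleIntBuffer s;
    ::new (s.buf) int{2};  // reuses the storage of `buf`; a char array cannot provide storage for the int
    s.f();                 // Undefined behavior: `s`'s lifetime has ended
}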
When we reuse the storage of buf to create an int object, we unintentionally end the lifetime of s itself.
Why? Because the int object isn’t considered nested within buf, which means it also isn’t nested within s. According to [basic.life]/1.5, this means the lifetime of s is over. And once s’s lifetime has ended, calling f() on it is undefined behavior, as per [basic.life]/7.2.
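For contrast, here is a minimal sketch of the same idea with the buffer declared as an array of std::byte. The names ByteIntBuffer, emplace, answer, and store_then_use are hypothetical and only serve the illustration. Because the std::byte array provides storage for the int, the int is nested within the buffer and therefore within the enclosing object, so the enclosing object stays alive.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Same shape as SimpleIntBuffer, but the buffer is an array of std::byte,
// so it can provide storage for the int created inside it ([intro.object]/3).
struct ByteIntBuffer {  // hypothetical name
    alignas(int) std::byte buf[64];

    // Creates an int in buf and returns the pointer produced by placement new.
    int* emplace(int value) {
        static_assert(sizeof(int) <= sizeof(buf));
        return ::new (static_cast<void*>(buf)) int{value};
    }

    // Any non-static member function; stands in for f().
    int answer() const { return 42; }
};

// Creates an int inside the buffer, then keeps using the enclosing object.
int store_then_use(int value) {
    ByteIntBuffer s;
    int* p = s.emplace(value);  // the int is nested within s.buf, hence within s
    assert(*p == value);
    return s.answer() + *p;     // s is still within its lifetime here
}
```

Calling store_then_use(2) creates the int, reads it back through the pointer returned by placement new, and then calls a member function on the still-living object.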
Rule 2 FAQ: Checking Whether std::launder Is Necessary
Q: When do you generally need to use std::launder?
A: We’ve already seen its necessity for implicitly created objects, but more generally, std::launder is typically required to access an object after placement new if:
- You don’t use the pointer returned by placement new, and
- You access the object via the buffer using a reinterpret_cast.
If both conditions hold, then a call to std::launder is almost always needed. The rare exceptions are outlined in [basic.life]/8, but they usually don’t apply to memory buffer use cases.
Here’s a simple example:
#include <cstddef>   // std::byte
#include <iostream>
#include <new>       // placement new, std::launder

int main() {
    alignas(int) std::byte arr[16];
    ::new (arr) int{100};
    std::cout << *std::launder(reinterpret_cast<int*>(arr)) << '\n';  // std::launder is required here
}
Although it may not be necessary in the future due to proposal P3006, std::launder is still required at the moment, which is why I’ve included it here.
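The two conditions from the list can also be read the other way around: if you keep the pointer that placement new returns, neither condition applies and std::launder is not needed. The sketch below puts both options side by side. The function names are hypothetical, and it assumes a small local buffer just large enough for one int.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Option A: keep the pointer that placement new returns; no std::launder needed.
int read_via_returned_pointer(int value) {
    alignas(int) std::byte storage[sizeof(int)];
    int* p = ::new (static_cast<void*>(storage)) int{value};
    // Same address as the buffer, but p points to the int object itself.
    assert(static_cast<void*>(p) == static_cast<void*>(storage));
    return *p;
}

// Option B: the returned pointer was discarded, so the int is reached through
// the buffer with reinterpret_cast, which is the case that needs std::launder.
int read_via_launder(int value) {
    alignas(int) std::byte storage[sizeof(int)];
    ::new (static_cast<void*>(storage)) int{value};
    return *std::launder(reinterpret_cast<int*>(storage));
}
```

Option A is usually the simpler choice when the code that creates the object is also the code that uses it. Option B tends to come up when only the buffer is stored, for example inside a container-like class.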
Rule 3 FAQ: Checking Whether std::destroy_at Is Necessary
Q: What is std::destroy_at and when should we call it?
A: For the purpose of our discussion, std::destroy_at is essentially used to manually call the destructor of an object (we’ll skip discussing array types here for simplicity). It’s crucial to know when to invoke a destructor because, as we saw in [basic.life]/5, you can end an object’s lifetime without calling its destructor if you reuse or release the storage.
Here’s a simple illustration of why this matters:
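#include <cstddef>   // std::byte
#include <iostream>
#include <memory>    // std::destroy_at
#include <mutex>
#include <new>       // placement new

int main() {
    using Guard = std::lock_guard<std::mutex>;
    std::mutex m;
    // alignas on the pointer would only align the pointer variable, not the buffer;
    // new[] of std::byte already returns storage aligned for any fundamentally aligned object that fits
    std::byte* arr = new std::byte[sizeof(Guard)]{};
    Guard* lk_ptr = ::new (arr) Guard{m};
    std::destroy_at(lk_ptr);  // Without this line `m` stays locked, and `lk2` below typically deadlocks
    delete[] arr;
    Guard lk2{m};
    std::cout << "Hello World\n";
}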
The rule is simple: before you reuse or release the storage occupied by an object with a non-trivial destructor, call either std::destroy_at or the destructor manually so that its side effects (like cleanup) actually happen.
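To make the skipped side effect visible without involving a mutex, here is a small sketch with a hypothetical Tracker type whose destructor decrements a counter. The type, the counter, and the function reuse_storage are assumptions made for the illustration, not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>

// Hypothetical type whose destructor has an observable side effect.
struct Tracker {
    int* live;  // counter owned by the caller
    explicit Tracker(int* counter) : live(counter) { ++*live; }
    ~Tracker() { --*live; }
};

// Creates two Trackers one after the other in the same buffer and returns
// how many are still counted as live at the end.
int reuse_storage(bool destroy_first) {
    int live = 0;
    alignas(Tracker) std::byte storage[sizeof(Tracker)];

    Tracker* first = ::new (static_cast<void*>(storage)) Tracker{&live};
    assert(live == 1);
    if (destroy_first) {
        std::destroy_at(first);  // runs ~Tracker(), so the counter drops back to 0
    }
    // Reusing the storage ends the first object's lifetime either way, but its
    // destructor only runs if it was requested above.
    Tracker* second = ::new (static_cast<void*>(storage)) Tracker{&live};
    std::destroy_at(second);
    return live;
}
```

With destroy_first set to true the counter ends at zero. With false, the first Tracker is never destroyed and the counter stays off by one, which is the same kind of leaked side effect as the mutex that stays locked.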
Conclusion
And that wraps up our 3-part series! For the language lawyers, I hope you found something useful in Part 1 and Part 2. For the practical software developers out there, I hope Part 3 helps you recognize common pitfalls and adopt best practices when it comes to low-level object manipulation.
Till next time!