Variables, Constants, and Naughty Strings
This post is licensed under CC BY-NC-ND 4.0.
This note encapsulates some details regarding variables and constants in Rust and C++. Specifically, const, static, let in Rust, and const, auto/int/char/..., constexpr in C++.
Additionally, there are some extra notes related to references and strings in Rust.
In this note I'll use Rust's definition for things like "constants", and map C++'s definitions onto Rust's.
Rust
let: Local Variables, Mutable or Immutable
See TRPL 3.1. Variables and Mutability for details.
In Rust, we generally have three kinds of "values": mutable variables, immutable variables, and constants. This is different from normal conversation, where the term "variable" implies mutability since, well, the thing "varies".
A variable is defined using let, with type inference that almost always works. The rare exception is when the compiler has no way of figuring out for sure which type a variable is. An example of this is when using .collect() [1].
let a = 10; // a: u32
let mut b: &str = "hello, world";
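As a minimal sketch of the .collect() case mentioned above: the compiler cannot pick a collection type on its own, so either the variable or the call needs an annotation. The helper name `doubled` and the sample numbers are made up for illustration.

```rust
// Hypothetical helper, not from the original post: doubles every element.
fn doubled(input: &[u32]) -> Vec<u32> {
    // The function's return type tells `collect` which collection to build.
    input.iter().map(|x| x * 2).collect()
}

fn main() {
    let numbers = [1u32, 2, 3];

    // Option 1: annotate the variable; the target of `collect` is inferred.
    let a: Vec<u32> = numbers.iter().map(|x| x + 1).collect();

    // Option 2: use the turbofish; the type of the variable is inferred.
    let b = numbers.iter().map(|x| x + 1).collect::<Vec<u32>>();

    // let c = numbers.iter().map(|x| x + 1).collect(); // Would fail: type annotations needed

    assert_eq!(a, b);
    println!("{:?} {:?}", a, doubled(&numbers)); // [2, 3, 4] [2, 4, 6]
}
```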
By default, these variables are immutable. That is, their value cannot be changed. However, we can "shadow" the variable by declaring another variable of the exact same name. This new variable can be of a different type, different mutability, and different scope.
// a = 20; // Would fail compilation
{
let mut a = 20;
println!("{}", a); // 20
a *= 10; // Shadowed a becomes 200
} // Shadowed a was dropped here
// a = 20; // Would still fail compilation
println!("{}", a); // 10
let a = 30; // Original a no longer accessible
println!("{}", a); // 30
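The example above shadows with a different scope and mutability. A short sketch of the remaining case, shadowing with a different type, could look like this. The function names `shadow_demo` and `frozen_len` are hypothetical.

```rust
// Shadowing with a different type: the name is reused, the variable is new.
fn shadow_demo() -> usize {
    let spaces = "   "; // spaces: &str
    let spaces = spaces.len(); // spaces: usize, a brand new variable
    spaces
}

// Shadowing to drop mutability once the value is fully built.
fn frozen_len() -> usize {
    let mut items = Vec::new();
    items.push(1u8);
    items.push(2u8);
    let items = items; // Shadowed as immutable from here on
    // items.push(3); // Would fail compilation
    items.len()
}

fn main() {
    println!("{} {}", shadow_demo(), frozen_len()); // 3 2
}
```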
It is worth noting that, despite the code examples above, variables defined using let cannot have a global scope - they must exist in functions. Attempting to use let to define a global variable will yield this error message:
`let` cannot be used for global variables
help: consider using `static` or `const` instead of `let`
However, they can exist inside closures used with "lazy static". More on that later.
Quirky References
In Rust, you can take references to variables (or constants, anything really) and store them inside another variable. Sorta like pointers. I'll skip over stuff related to mutable and immutable references (& and &mut) since that's a whole new topic related to the borrow checker, but here's a quick example:
let mut c: &u32 = &a;
Here, c is a mutable variable holding an immutable reference to a. This means c can be changed to point at another u32, but you cannot modify a through c [2]. c also cannot be changed to point at any variable or constant of any other type (including other integer types), but it can be changed to point at a u32 constant.
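A small sketch of what "mutable variable holding an immutable reference" permits. The constant `LIMIT`, the variable `other`, and the function `retarget` are assumptions added for illustration only.

```rust
// Hypothetical constant used as a retargeting destination.
const LIMIT: u32 = 100;

fn retarget() -> (u32, u32, u32) {
    let a: u32 = 10;
    let other: u32 = 20;

    let mut c: &u32 = &a;
    let first = *c;

    c = &other; // OK: c itself is mutable, so it can point at another u32
    let second = *c;

    c = &LIMIT; // OK: pointing at a u32 constant works too
    let third = *c;

    // *c += 1; // Would fail compilation: c is a `&` reference, not `&mut`
    // let wrong: u8 = 1;
    // c = &wrong; // Would fail compilation: expected `&u32`, found `&u8`

    (first, second, third)
}

fn main() {
    println!("{:?}", retarget()); // (10, 20, 100)
}
```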
The real quirky stuff comes in when you consider literal strings. In Rust, hardcoded strings are &str, different from String. It's widely known that &str lives on the stack since it has a fixed size, while String lives on the heap since its size is unknown at compile time. However, the reason there's a & here is because b in the first example isn't a string in itself. It's a pointer to the literal string on the stack!
Note from the future: the part of the paragraph above claiming that &str lives on the stack (crossed out in the original post) is wrong. &str can be a reference to any string slice. It is in itself just a "fat pointer" (a pointer carrying an extra property, here the length) pointing to a string slice, similar to std::span in C++. Therefore the string it points to can live on the heap (in the case of a slice of a String), on the stack, or, in the case of a literal string (of type &'static str), in the rodata (read-only data) segment of the process. Having a &str pointing to a stack-allocated chunk of memory actually requires some magic using str::from_utf8(), which can take in a borrowed stack-allocated [u8; N].
Given that b is mutable in the examples above, we can actually do this:
b = "hello! world"; // Reassigning b to point at another string literal (stored in read-only data)
let s = String::from(b);
b = &s[0..2]; // Reassigning b to a string slice borrowing from the String's heap buffer
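To make the three possible homes of a string slice concrete, here is a minimal sketch. The helper `first_word` and the sample strings are invented for this example, and "stack" here simply means a local array.

```rust
// Hypothetical helper: accepts any &str, wherever the underlying bytes live.
fn first_word(text: &str) -> &str {
    text.split(' ').next().unwrap_or("")
}

fn main() {
    // A literal: &'static str pointing into read-only data.
    let literal: &'static str = "rodata lives here";

    // A slice of a String: points into the String's heap buffer.
    let owned = String::from("heap lives here");
    let heap_slice: &str = &owned[0..4];

    // A local byte array, viewed as a &str through str::from_utf8.
    let bytes: [u8; 16] = *b"stack lives here";
    let stack_str: &str = std::str::from_utf8(&bytes).expect("bytes are valid UTF-8");

    println!("{} {} {}", first_word(literal), heap_slice, first_word(stack_str));
    // rodata heap stack
}
```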
const: Constants
Constants are something completely different in Rust. They are declared using the const keyword, can never be mutable, and have some additional quirks. For example, their conventional naming is SCREAMING_SNAKE_CASE, and their types must be manually annotated at definition.
const C: u32 = 100;
Since constants are always immutable, they can be defined in either a local or a global scope, and reads are always thread-safe. It is worth noting that the compiler is permitted to copy them around (a constant is essentially inlined wherever it is used), so its content is not necessarily read from a fixed memory location.
Constants must be known at compile time, and thus only constant expressions are allowed inside. The details for allowed expressions can be found in The Rust Reference, but in general, basic operations are allowed: arithmetic, logical operations, dereferencing, casting, loops, and building structs/tuples/arrays. It is worth noting that things that would normally panic (e.g., out-of-bounds indexing) raise a compile error instead if they are inside a const expression.
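A rough sketch of a few constant expressions, assuming a reasonably recent stable compiler. All names here (`EXPERIMENTS`, `TABLE`, `sum_up_to`, and so on) are made up. The loop lives inside a const fn, which the text above does not mention but which is one common way to get a loop into a constant.

```rust
// Hypothetical const fn: loops are allowed in a const context.
const fn sum_up_to(n: u32) -> u32 {
    let mut total = 0;
    let mut i = 1;
    while i <= n {
        total += i;
        i += 1;
    }
    total
}

const EXPERIMENTS: u32 = 4 * 25; // Arithmetic
const TABLE: [u32; 3] = [1, 2, 3]; // Arrays
const SECOND: u32 = TABLE[1]; // Indexing, checked at compile time
const TOTAL: u32 = sum_up_to(EXPERIMENTS); // Evaluated during compilation
// const OOPS: u32 = TABLE[3]; // Would fail compilation: index out of bounds

fn local_scope() -> u64 {
    // Constants can also be declared inside a function.
    const FACTOR: u64 = 1 << 10; // 1024
    FACTOR * 2
}

fn main() {
    println!("{} {} {} {}", EXPERIMENTS, SECOND, TOTAL, local_scope());
    // 100 2 5050 2048
}
```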
It is also worth noting that the operations mentioned above (known as "constant expressions") can also be evaluated during compilation when they're in let statements. This doesn't necessarily happen, but the compiler is permitted to do so.
const is typically used to define actual constants, such as configurations, number of experiments, etc., within the code for easy access.
static: Global Variables, Mutable or Immutable
static is usually used to define global variables. Similar to let, they can be mutable, but similar to const, they must have annotated types and the convention is to use all caps.
Even though they are called "global variables", they can also be defined within a local scope, such as a function.
The reason it's called "static" is that a static item always has exactly one copy. All references to the item point at the exact same memory location. This is not the case for const. In technical terms, static items have the 'static lifetime, and drop is not called on them at the end of the program.
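A minimal sketch of the "one copy" property, plus a static declared inside a function. The names `ANSWER`, `answer_ref`, and `next_id` are hypothetical, and the atomic counter is an assumption on my part: it is one safe way to mutate a static without static mut.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

static ANSWER: u32 = 42;

// Every call hands out a reference to the same single item.
fn answer_ref() -> &'static u32 {
    &ANSWER
}

fn next_id() -> usize {
    // A static in a local scope: only visible here, but it still exists once
    // for the whole program and keeps its value between calls.
    static COUNTER: AtomicUsize = AtomicUsize::new(0);
    COUNTER.fetch_add(1, Ordering::Relaxed)
}

fn main() {
    // Both references point at the exact same memory location.
    assert!(std::ptr::eq(answer_ref(), answer_ref()));

    let first = next_id();
    let second = next_id();
    println!("{} {}", first, second); // 0 1
}
```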
static is typically used to define application configurations or other variables shared by the entire application. Read-only access to an immutable static is thread safe. It is possible to use crates such as once_cell to initialise static objects lazily at runtime (on first access), so it's often used for things like reading from .env or other configuration paths. Once initialised, the data stays accessible for the rest of the program's duration.
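As a sketch of runtime initialisation, here is the same idea using std::sync::LazyLock from the standard library rather than once_cell (an assumption: this needs a toolchain that ships LazyLock). The environment variable name `DEMO_SETTINGS_DIR` and the fallback path are invented for the example.

```rust
use std::env;
use std::sync::LazyLock;

// Hypothetical setting: initialised on first access, then kept for the rest
// of the program.
static SETTINGS_DIR: LazyLock<String> = LazyLock::new(|| {
    env::var("DEMO_SETTINGS_DIR").unwrap_or_else(|_| String::from("./settings"))
});

fn settings_dir() -> &'static str {
    SETTINGS_DIR.as_str()
}

fn main() {
    // The closure above runs here, on first use, not before main starts.
    println!("settings live in {}", settings_dir());
}
```

Later calls to `settings_dir()` return the already computed value without touching the environment again.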
static can be defined as mutable, but that is unsafe due to the potential of data races which will lead to undefined behaviour. It is a common pattern to use static for things like the global state of a web application, in which case you can wrap Mutex inside a static immutable variable like this:
use std::sync::{LazyLock, Mutex};
static ARRAY: LazyLock<Mutex<Vec<u8>>> = LazyLock::new(|| Mutex::new(vec![]));
fn do_a_call() {
ARRAY.lock().unwrap().push(1);
}
Example provided by a random StackOverflow question.
You can dereference another static (*OTHER_STATIC_VARIABLE) inside the initialiser closure of a lazily initialised static to enforce the order of initialisation.
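A small sketch of that ordering trick. The statics `BASE`, `FULL`, and `INIT_LOG`, along with the paths, are hypothetical. The log exists only to make the initialisation order observable.

```rust
use std::sync::{LazyLock, Mutex};

// Records which initialiser ran, in order (illustration only).
static INIT_LOG: Mutex<Vec<&'static str>> = Mutex::new(Vec::new());

static BASE: LazyLock<String> = LazyLock::new(|| {
    INIT_LOG.lock().unwrap().push("BASE");
    String::from("/etc/app")
});

static FULL: LazyLock<String> = LazyLock::new(|| {
    // Dereferencing BASE here forces it to be initialised first.
    let base: &String = &*BASE;
    INIT_LOG.lock().unwrap().push("FULL");
    format!("{}/config.toml", base)
});

fn main() {
    // Touching FULL first still initialises BASE before FULL.
    println!("{}", *FULL); // /etc/app/config.toml
    println!("{:?}", INIT_LOG.lock().unwrap()); // ["BASE", "FULL"]
}
```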
C++
In C++, const is a modifier just like mut (doing the opposite), and there is no keyword separating local and global variables. C++ has constexpr instead of Rust's const for actual constants, and that's basically it.
A few additional quirks:
- Type inference can be used with constexpr (constexpr auto a = 1;)
- constexpr variables are immutable just like the Rust constants, so there's no need to write const constexpr
- Similar to Rust's case, constant expressions can be evaluated at compile time even when they are not defined with constexpr
Simple mapping which is probably not entirely accurate:
| Rust | C++ | Description | Thread Safe | Evaluated at Compile Time |
|---|---|---|---|---|
| let mut | auto, int... | Mutable Local Variable | Yes | Possible |
| static mut [3] | auto, int... | Mutable Global Variable | No | Possible |
| let | const auto [4], const int... | Immutable Local Variable | Yes | Possible |
| static | const auto, const int... | Immutable Global Variable | Yes | Possible |
| const | constexpr auto | Constant | Yes | Yes, Always |
To achieve a "thread-safe mutable global variable" in C++, a similar design pattern (as described earlier) involving mutexes is often used.
C++, however, will not warn you at all that data races might happen if you define mutable global variables in a multi-threaded context, so if you accidentally do that, it'll probably be a couple of hours of painful debugging.
[1] You can either specify .collect::<Type>(), in which case the type of the variable is inferred, or specify the type of the variable, in which case the target of the collect method is inferred.
[2] A mutable reference to a would also be invalid in itself, since a is immutable.
[3] Any access to a static mut, reads as well as mutations, requires an unsafe block.
[4] You can also write this as auto const (less common). constexpr auto can likewise be written as auto constexpr, because declaration specifiers may appear in any order. The real difference is that constexpr is a declaration specifier while const is a type qualifier: constexpr applies to the declared variable as a whole and must appear among the declaration specifiers, before the declarator, whereas const can also appear inside the declarator, for example after a *.