myrmidon 22 hours ago [-]
> There are multiple articles of how C++ is superior to C, that everything you can do in C you can do in C++ with a lot of extras, and that it should be used even with bare metal development
An interesting perspective. Could turn it around as "everything you can do in C++ you can do in C with a lot less language complexity".
My personal experience with low-level embedded code is that C++ is rarely all that helpful, tends to bait you into abstractions that don't really help, brings additional linker/compiler/toolchain complexity and often needs significant extra work because you can't really leverage it without building C++ abstractions over provided C-apis/register definitions.
Would not generally recommend.
jonathrg 22 hours ago [-]
You definitely need discipline to use C++ in embedded. There are exactly 2 features that come to mind, which makes it worth it for me: 1) replacing complex macros or duplicated code with simple templates, and 2) RAII for critical sections or other kinds of locks.
kevin_thibedeau 17 hours ago [-]
Consteval is great for generating lookup tables without external code generators. You can use floating point freely, cast the result to integers, and then not link any soft float code into the final binary.
kryptiskt 20 hours ago [-]
> Could turn it around as "everything you can do in C++ you can do in C with a lot less language complexity".
No, you can't, C is lacking a lot that C++ brings to the table. C++ has abstraction capabilities with generic programming and, dare I say it, OO that C has no substitute for. C++ has compile-time computation facilities that C has no substitute for.
embeng4096 20 hours ago [-]
Is there an example of the generic programming that you've found useful?
The extent of my experience has been being able to replace functions like convert_uint32_to_float and convert_uint32_to_int32 by using templates to something like convert_uint32<float>(input_value), and I didn't feel like I really got much value out of that.
My team has also been using CRTP for static polymorphism, but I also feel like I haven't gotten much more value out of having e.g. a Thread base class and a derived class from that that implements a task function versus just writing a task function and passing it xTaskCreate (FreeRTOS) or tx_thread_create (ThreadX).
Typed compile-time computation is nice, though, good point: constexpr and such versus untyped #define macros.
csb6 18 hours ago [-]
The generic algorithms that come with the C++ standard library are useful. Once you get used to using them you start to see that ad-hoc implementations of many of them get written repeatedly in most code. Since most of the algorithms work on plain arrays as well as more complex containers they are still useful in embedded environments.
rkagerer 9 hours ago [-]
I had been programming for a long time before I learned OOP. After some years playing with it, I came to the conclusion there's not much I can't do about as well using simple functions and structs. The key is a well thought out and organized codebase. Always felt polymorphism in particular seemed more trouble than it was worth.
I still use modern languages on a regular basis, but when I drop back to more basic languages there are only a few ergonomics that I truly miss (eg. generics).
slaymaker1907 19 hours ago [-]
std::array can sometimes give you the best of both worlds for stack allocation in that you statically constrain the stack allocation size (no alloca) while guaranteeing that your buffers are large enough for your data. You can also do a lot of powerful things with constexpr that are just not possible with arrays. It is very convenient for maintaining static mappings from enums to some other values.
CyberDildonics 16 hours ago [-]
You've never used a template for a data structure and you've never used a destructor to free memory?
myrmidon 19 hours ago [-]
My point is trivially true as far as computability goes, but that is not what I meant.
All those abstraction capabilities can be a big detriment to any project, because they always come with a cost, and runtime is far from the only concern.
Specifically in an embedded project, toolchain complications and memory use (both RAM and code) are potentially much bigger concerns than for desktop applications, and your selection of programmers is more limited as well; it might be much more feasible to hold your developers to acceptable C coding standards than to make e.g. "template metaprogramming" a necessary prerequisite for your codebase and then have to teach your applicants electrical engineering.
Both object-oriented programming and compile-time computation are doable in a C codebase; they just need more boilerplate and maybe a code-generator step in your build, respectively. But that might well be an advantage, discouraging frivolous use of complexity that you don't actually need and that introduces hidden costs (understanding, ease of change, compile time) elsewhere.
bigfishrunning 15 hours ago [-]
> C++ has compile-time computation facilities that C has no substitute for.
The substitute for this is that C is insanely easy to generate. Do your compile time computation in your build system.
OO is also pretty trivial in C -- the Linux kernel is a great example of a program in C that is very Object Oriented.
Conscat 17 hours ago [-]
> An interesting perspective. Could turn it around as "everything you can do in C++ you can do in C with a lot less language complexity".
C can't parameterize a fixed-length bitset with optimal layout and zero overhead, nor can it pragmatically union error sets at scale.
bsoles 9 hours ago [-]
I find that encapsulation of devices (UART, I2C, etc.) as classes in C++ rather than global functions that take structs, etc. as input arguments in C to be much more manageable. Same for device drivers for individual sensor ICs.
myrmidon 9 hours ago [-]
Maybe, but I'd argue that this is a matter of taste, and what does it actually do for you?
In practice: Is it worth wrapping all your vendor-provided hardware abstraction in classes that you write yourself? Designing something like that to be "complete" (i.e. allow full access to all hardware functionality like IRQ/DMA setup) and to still stay hardware-independent is often borderline impossible anyway, and any attempt is invariably going to make your codebase more difficult to understand.
g947o 22 hours ago [-]
Mind if I ask whether you speak of that from a professional embedded system engineer's perspective?
myrmidon 22 hours ago [-]
I do. But talking about low-level embedded stuff here.
Generally, the more you deviate from your vendor's "happy path", the more busywork/unexpected difficulties you will run into, and a solid grasp of exactly how the architecture and toolchain work might become necessary (while staying on the "happy path" lets you stay blissfully unaware).
technothrasher 22 hours ago [-]
I struggle with this deviating from the vendor's "happy path" often. I mostly use the STM32 chips, and I don't particularly care for their HAL library: I find it overcomplicated, and it often has bugs that I have to track down and fix. But boy is it nice to use their STM32CubeMX program to generate all the low-level code so I can just get to work. I tend to end up building my own low-level libraries during my free time because I enjoy it and it gives me a better idea of how the hardware is actually working, but using the STM32 HAL library to write my actual client code at work.
patchnull 20 hours ago [-]
Same experience here. What worked for me was using CubeMX purely for pin and clock config, then dropping down to the LL (low-layer) drivers or direct CMSIS register access for anything in a hot path. The HAL interrupt handlers in particular add a surprising amount of overhead — on a tight DMA transfer loop I measured ~40% cycle waste just from HAL callback dispatch.
The LL API is basically thin inline wrappers around register writes, so you still get the CubeMX-generated init code but without the HAL abstraction tax at runtime.
bsoles 9 hours ago [-]
Also same experience here. I can write UART code with DMA in 20 lines of code on an STM32 microcontroller. Same functionality using HAL is astonishingly cumbersome. The reference manual and willingness to read it is all you need.
embeng4096 21 hours ago [-]
+1 to this and your above points (the embedded team I'm on has started using C++ over the last year or so).
I've definitely learned a lot, and I like the portability of CMake for cross-platform use (our team uses all 3 of Windows, Mac, and Linux). My experience sounds much like yours: there've been a lot of times where using the vendor's flavor of Eclipse-based IDE (STM32CubeIDE, Renesas e2studio, etc) would have saved us a lot of discovered work, or extra work adapting the "happy path" to CMake and/or C++.
Using C++ and/or CMake is fine when it's part of the happy path and for simpler things e.g. STM32CubeMX-generated CMake project + simple usage of HAL. For more complex things like including MCUboot or SBSFU, etc, it's forced me to really dig in. Or even just including FreeRTOS/ThreadX, we've created abstractions like a Thread class on top -- sometimes it's nice and convenient, others it feels like unnecessary complexity, but maybe I'm just not used to C++ yet.
One clear, simple example is needing to figure out CMake and Ninja install. In an Eclipse-based IDE, you install the platform and everything Just Works(tm). I eventually landed on using scoop to install CMake and Ninja, which was an easy solution and didn't require editing my PATH, etc, but that wasn't the first thing I tried.
adrian_b 15 hours ago [-]
I have never seen any advantage of CMake over the much simpler GNU make.
Ninja is supposed to be faster for compiling very big software projects, but I have never seen an embedded software project that is well organized and does not compile in a few seconds on a powerful development computer with many cores, so I do not see the benefit of Ninja for such projects.
All Eclipse-based IDEs that I have ever seen are extremely slow for anything, both for editing and for building a project, and they make the simplest project management operations extremely complicated. Even Visual Studio Code is much faster and more convenient than using Eclipse-based IDEs. Other solutions can be much faster still.
While the example programs provided for STM32 MCUs are extremely useful for starting a project, I believe that using the project-building methods provided by the vendor results in a waste of time. I have always obtained better results and faster development by building a GNU toolchain (e.g. binutils, gcc, gdb, some libc) from scratch and by using universal GNU makefiles, which work for any CPU target and for any software project with the customization of a few Make variables. I wrote a set of GNU Makefiles once, following its manual, around 1998, and I have never had to change them since, regardless of what platform I had as a target. For any new platform, there is just a small set of variables that must be changed, by generating one per-platform included file with things like the names of the compilers and other tools that must be invoked and their command-line options.
For new projects, there is one very small file that must be generated for each binary file that must be built, which must contain the type of the file (e.g. executable, static library, shared library) and a list with one or more directories where source files should be searched. No changes are needed when source files are created, deleted, moved or renamed, and dependencies are identified automatically. I am always astonished when I see how many totally unnecessary complications exist in the majority of the project building configurations that I have ever seen provided by the vendors or in most open-source projects.
chris_money202 3 hours ago [-]
Just switched one of my team's firmware projects from make to CMake + Ninja (with the help of GHCP). Build time went from 10 minutes to 2 minutes, and now we have the ability to build only what has changed.
myrmidon 2 hours ago [-]
You might have just had a bad makefile.
The "canonical" way to just build what was changed with make is to let the compiler generate header dependencies during the build and to include those (=> if you change only 1 header, just the affected translation units get rebuilt). Works like a charm (and never messes up incremental builds by not rebuilding on a header-only change).
If you did not have proper incremental builds before, I would blame the makefile specifically and not make itself.
Another way to mess up with make is to not allow anything in parallel, either by missing flags (-j) or by having a rule that does everything sequentially.
Ninja does have less overhead than make, but your build time should be dominated by how long compiler and linker take; such a big difference between the two indicates to me that your old make setup was broken in some way.
chris_money202 10 hours ago [-]
With anything low level you should choose the language you're most comfortable and competent in, imo. Learning both the platform and the language at once is just asking for headaches; control what you can.
bigfishrunning 16 hours ago [-]
I would honestly extend this sentiment out to all code. The benefits C++ has over C are much better served by Rust or Go or Python or Lisp, and if "Simple" is what you want, then C is a much better choice.
Honestly, I can't think of a single job for which C++ is the best tool.
embeng4096 21 hours ago [-]
I took a brief skim through so apologies if I missed that it was mentioned, but wanted to bring up the Embedded Template Library[0]. The (over)simplified concept is: it provides a statically-allocated subset (but large subset) of the C++ standard library for use in embedded systems. I used it recently in a C++ embedded project for statically-allocated container/list types, and for parsing strings, and the experience was nice.
So I use C++ heavily in the kernel. But couldn't you just set your own allocator and a couple of other things and achieve the same effect with the actual C++ STL? In kernel land, at the risk of oversimplifying, you just implement allocators and deallocators and it "just works", even on C++26.
nly 16 hours ago [-]
Do you typically just compile with -fno-rtti -fno-exceptions -nostdlib ?
Last time I did embedded work this was basically all that was required.
nulltrace 13 hours ago [-]
Those three flags cover most of it. One gotcha: -fno-exceptions makes `new` return nullptr instead of throwing, so if any library code expects exceptions you get silent corruption. We added -fcheck-new to catch that.
Also -nostdlib means no global constructors run, so static objects with nontrivial ctors need you to call __libc_init_array yourself.
pjmlp 23 hours ago [-]
While tag dispatching used to be a widely used idiom in C++ development, it was a workaround for which nowadays there are much better alternatives with constexpr, and concepts.
tialaramex 22 hours ago [-]
Surely one of the obvious reasons you'd want tagged dispatch in C++ isn't obviated by either of those features? Or am I missing something?
Suppose Doodads can be constructed from a Foozle either with the Foozle Resigned or with the Foozle Submitted. Using tagged dispatch we make Resigned and Submitted types and the Doodad has two specialised constructors for the two types even though substantively we pass only the Foozle in both cases.
In a language like Rust all the constructors have names — it's obvious what Vec::with_capacity does — while you will still see C++ programmers who think constructing a std::vector with a single integer parameter does the same thing, because it's just a constructor and you'd need to memorize what each overload happens to do.
quuxplusone 22 hours ago [-]
I wouldn't call the idiom you describe (like with unique_lock's defer_lock_t constructor) "tag dispatch"; to me, one defining characteristic of the "tag dispatch idiom" is that the tag you're dispatching on is computed somehow (e.g. by evaluating iterator_traits<T>::iterator_category()). The idiom you're describing, I'd call simply "a constructor overload set" that happens to use the names of "disambiguation tags" to distinguish semantically different constructors because — as you point out — C++ doesn't permit us to give distinct names to the constructor functions themselves.
You use if constexpr with requires expressions, to do poor man's reflection on where to dispatch, and eventually with C++26, you do it with proper reflection support.
daemin 10 hours ago [-]
I just thought I'd mention that Khalil Estell is doing some work regarding exceptions in embedded and bare metal programming. In the first of his talks about this topic (https://www.youtube.com/watch?v=bY2FlayomlE) he mentions that a lot of the bloat in exception handling is including printf in order to print out the message when terminating the program.
For those interested he has another presentation on this topic https://www.youtube.com/watch?v=wNPfs8aQ4oo
chris_money202 10 hours ago [-]
Didn’t watch the video but as opposed to what? Printf is definitely the most efficient way to retrieve the info needed to begin initial triage especially in “hacking” or bring up where a formal process isn’t defined
daemin 9 hours ago [-]
There are processors and platforms where including the standard library feature to print text to standard output significantly increases the size of the binary. In such cases just enabling exceptions also enables this feature only for the purpose of outputting a termination message, which does not get displayed or read because the device doesn't have a way to emit standard output.
randusername 22 hours ago [-]
> Although there is an opinion that templates are dangerous because of executable code bloating, I think that templates are developer’s friends, but the one must know the dangers and know how to use templates effectively. But again, it requires time and effort to get to know how to do it right.
idk man, obviously I don't know much since I don't have my own online book, but templates would not be at the start of my list when selling C++ for bare-metal.
unit suffixes, namespaces, RAII, constexpr, OOP (interfaces mostly), and I like a lot of the STL in avoiding inscrutable "raw loops".
I like the idea of templates, but it feels like a different and specialized skillset. If you are considering C++ from C, why not ease into it?
bluGill 21 hours ago [-]
Most of the good parts of the STL are implemented as templates. If you are considering C++ from C, then treating it like C with the STL templates is a great first step. Over time you will discover other useful features of C++ that solve specific problems. Virtual functions in a class - better than writing a table of function pointers. Lambda - sometimes much better than a function pointer. Custom templates - better than doing the same thing in macros (at least some times). Exceptions are sometimes really nice for error handling (and contrary to popular belief are not slow) And so on - C++ has a lot of features that can be useful, but most of them are features that can also be abused in places they don't belong creating a real problem.
lkjdsklf 20 hours ago [-]
Templates don’t have to be complicated.
Just very basic type substitution is one of the most useful uses of templates and is useful in pretty much all software
They’re also useful when you can’t use virtual dispatch. Concepts help a lot in making that tolerable.
Sure they can get stupid complicated and ugly as hell, but you don’t have to do that. Even their basic form is very useful
That said, RAII is probably the most useful thing.
Panzerschrek 6 hours ago [-]
> Yes, indeed, there are two calls to two different functions. However, the assembler code of these functions is almost identical.
As I understand this is due to requirement for distinct C++ functions to have unique addresses.
saltmate 23 hours ago [-]
This seems very well written, but has a lot of requirements/previous knowledge required by a reader.
Are there similar resources that fill these gaps?
birdsongs 22 hours ago [-]
I only skimmed the book, but I think this is an artifact of the embedded engineering side. (Something I do professionally.)
I've seen a lot of new people come into my team as juniors, or regular C/C++ engineers that convert to embedded systems. There is a real lack of good, concise resources for them, and the best result I've had is just mentoring them and teaching as we go.
You could look for an intro to embedded systems resource. Or just get a dev kit for something. Go different than the standard Pi or Arduino. Try and get something like a STM32G0 dev kit working and blinking its lights. It's less polished, but you'll have to touch more things and learn more.
If you want, core areas I would suggest to research are the very low level operations of a processor:
* How does the stack pointer work? (What happens to this during function calls?)
* How do parameters get passed to functions by value, by reference? What happens when you pass a C++ class to a function by value? What is a deep vs shallow copy of a C++ object, and how does that work when you don't have an OS or MMU?
* Where is heap memory stored? Why do we have a heap and a stack? How do these work in the absence of an OS?
* The Program Counter? (PC register). What happens to this as program execution progresses?
* What happens when a processor boots, how does it start receiving instructions? (This is vague, you could read the datasheet for something like the STM32G0 microcontroller, or the general Arm Cortex M0 core.)
* How are data/instructions loaded from disk to memory to cache to register? What are these divisions and why do we have them?
* Basic assembly language, you should know how loads and stores work, basic arithmetic, checks/tests/comparisons, jump operations.
* How do interrupts work, what's an ISR, IRQ, IVT? How do peripherals like UART, I2C (also what are these?), handle incoming data when you have a main execution thread already running on a single core processor?
Some of this may be stuff you already know, or seem rudimentary and not exactly relevant, but they are things that have to be rock solid before you start thinking about how compilers for different languages, like C++, create machine code that runs on a processor without an OS.
Assembly is often overlooked, but a critical skill here. It's really not that bad. Often when working with C++ (or Rust) on embedded systems, if I'm unsure of something, my first step is to disassemble and dump the assembly to investigate, or step through the assembly with GDB via a JTAG probe on the target processor if the target was too small to hold all the debug symbols (very common).
Anyways, this may have been more than you were asking for. Just me typing out thoughts over my afternoon coffee.
randusername 22 hours ago [-]
I like Realtime C++ (Kormanyos) and Making Embedded Systems (White)
The former is probably more what you are looking for.
skydhash 22 hours ago [-]
I'm not a professional embedded engineer, but I do hack around it. Some books I collected:
- Applied Embedded Electronics: Design Essentials for Robust Systems by J. Twomey. It goes over the whole process making a device and what knowledge would be required for each. Making Embedded Systems, 2nd Edition by E. White is a nice complement.
- Embedded System Interfacing by M. Wolf describes the signals and protocols behind everything. It's not necessary as a book, but can help understand the datasheets and standards
- But you want to start with something like Computer Architecture by C. Fox or Write Great Code - Volume 1 - Understanding the Machine, 2nd Edition by R. Hyde. There are better textbooks out there, but those are nice introductions to the binary world.
The gist is that you don't have a lot of memory or CPU power (if you do, adapting Linux is a more practical option, unless it's unsuited for another reason). So all the nice abstractions provided by an OS are a no-go; you need to take care of the hardware yourself, and it is really finicky.
mdocc 21 hours ago [-]
> StaticQueue: LinearisedIterator
Using C++ iterator interface to fix the main problem of a standard ring buffer of non-contiguous regions is a cute idea, but I like to use a "bip buffer"[1] instead which actually always gives you real contiguous regions so you can pass the pointers to things like a dma engine.
The tradeoff is that in the worst case only half the buffer is available: the ring buffer essentially becomes a kind of double buffer where you periodically switch between writing/reading at the end or the beginning of the storage.
VorpalWay 23 hours ago [-]
Why does the link go to the abstract classes heading, halfway down the page?
NooneAtAll3 22 hours ago [-]
link has extra "#_abstract_classes" in it that it would be better without
m00dy 22 hours ago [-]
just use Rust, and never look back.
soci 18 hours ago [-]
Why is this comment downvoted?
I mean, is it downvoted because Rust is bad for embedded systems?
bigstrat2003 17 hours ago [-]
Probably because it's kind of annoying to bring up Rust in a C++ thread.
menaerus 23 hours ago [-]
Outdated, opinionated, platform-specific, and incorrect.
superxpro12 21 hours ago [-]
Bare metal firmware tends to be platform specific tho.
menaerus 21 hours ago [-]
rPi with GCC 4.7 is neither firmware nor bare metal. GCC 4.7 is almost 15 years old.
bigstrat2003 17 hours ago [-]
Not firmware sure, but if one boots the Pi from the software that's just as bare metal as anything else. And the age of the GCC version is completely irrelevant to whether something is bare metal.
menaerus 17 hours ago [-]
Pi runs Linux. That's software, not bare metal. Bare metal is when you're running without an OS, or with a light abstraction such as an RTOS. Bare metal is normally associated with chips without an MMU. The Pi runs a SoC.
I didn't say the GCC version is relevant to the definition of bare metal; I said it's 15 years old. And when you try to draw conclusions, and derive strong opinions, from codegen output while using a 15-year-old toolchain (and, btw, a pretty shitty CPU core), something isn't quite right. This article is just a good display of never-ending cargo-cult programming myths.
An interesting perspective. Could turn it around as "everything you can do in C++ you can do in C with a lot less language complexity".
My personal experience with low-level embedded code is that C++ is rarely all that helpful, tends to bait you into abstractions that don't really help, brings additional linker/compiler/toolchain complexity and often needs significant extra work because you can't really leverage it without building C++ abstractions over provided C-apis/register definitions.
Would not generally recommend.
No, you can't, C is lacking a lot that C++ brings to the table. C++ has abstraction capabilities with generic programming and, dare I say it, OO that C has no substitute for. C++ has compile-time computation facilities that C has no substitute for.
The extent of my experience has been being able to replace functions like convert_uint32_to_float and convert_uint32_to_int32 by using templates to something like convert_uint32<float>(input_value), and I didn't feel like I really got much value out of that.
My team has also been using CRTP for static polymorphism, but I also feel like I haven't gotten much more value out of having e.g. a Thread base class and a derived class from that that implements a task function versus just writing a task function and passing it xTaskCreate (FreeRTOS) or tx_thread_create (ThreadX).
Typed compile-time computation is nice, though, good point. constexpr and such versus untyped #define macros.
I still use modern languages on a regular basis, but when I drop back to more basic languages there are only a few ergonomics that I truly miss (eg. generics).
All those abstraction capabilities can be a big detriment to any project, because they always come with a cost, and runtime is far from the only concern.
Specifically in an embedded project, toolchain complications and memory use (both RAM and code) are potentially much bigger concerns than for Desktop applications, and your selection of programmers is more limited as well; might be much more feasible to lock your developers onto acceptable C coding standards than to make e.g. "template metaprogramming" a necessary prerequisite for your codebase and then having to teach your applicants electrical engineering.
Both object oriented programming and compile time computation is doable for a C codebase, just needs more boilerplate and maybe a code-generator step in your build, respectively. But that might well be an advantage, discouraging frivolous use of complexity that you don't actually need, and that introduces hidden costs (understanding, ease of change, compile time) elsewhere.
The substitute for this is that C is insanely easy to generate. Do your compile time computation in your build system.
OO is also pretty trivial in C -- the Linux kernel is a great example of a program in C that is very Object Oriented.
C can't parameterize an optimal layout fixed-length bitset with zero overhead, nor can it pragmatically union error sets at scale.
In practice: Is it worth wrapping all your vendor-provided hardware abstraction in classes that you write yourself? Designing something like that to be "complete" (i.e. allow full access to all hardware functionality like IRQ/DMA setup) and to still stay hardware-independent is often borderline impossible anyway, and any attempt is invariably going to make your codebase more difficult to understand.
Generally, the more you deviate from your vendors "happy path", the more busy work/unexpected difficulties you will run into, and a solid grasp of how exactly architecture and toolchain work might become necessary (while staying on the "happy path" allows you to stay blissfully unaware).
The LL API is basically thin inline wrappers around register writes, so you still get the CubeMX-generated init code but without the HAL abstraction tax at runtime.
I've definitely learned a lot, and I like the portability of CMake for cross-platform use (our team uses all 3 of Windows, Mac, and Linux). My experience sounds much like yours: there've been a lot of times where using the vendor's flavor of Eclipse-based IDE (STM32CubeIDE, Renesas e2studio, etc) would have saved us a lot of discovered work, or extra work adapting the "happy path" to CMake and/or C++.
Using C++ and/or CMake is fine when it's part of the happy path and for simpler things e.g. STM32CubeMX-generated CMake project + simple usage of HAL. For more complex things like including MCUboot or SBSFU, etc, it's forced me to really dig in. Or even just including FreeRTOS/ThreadX, we've created abstractions like a Thread class on top -- sometimes it's nice and convenient, others it feels like unnecessary complexity, but maybe I'm just not used to C++ yet.
One clear, simple example is needing to figure out CMake and Ninja install. In an Eclipse-based IDE, you install the platform and everything Just Works(tm). I eventually landed on using scoop to install CMake and Ninja, which was an easy solution and didn't require editing my PATH, etc, but that wasn't the first thing I tried.
Ninja is supposed to be faster for compiling very big software projects, but I have never seen an embedded software project that is well organized and which is not compiled in a few seconds on a powerful development computer with many cores, so I do not see the benefit of Ninja for such projects.
All Eclipse-based IDEs that I have ever seen are extremely slow for anything, both for editing and for building a project and they make the simplest project management operations extremely complicated. Even Visual Code Studio is much faster and more convenient than using Eclipse-based IDEs. Other solutions can be much faster.
While the example programs provided for STM32 MCUs are extremely useful for starting a project, I believe that using the project-building methods provided by the vendor results in wasted time. I have always obtained better results and faster development by building a GNU toolchain (binutils, gcc, gdb, some libc) from scratch and by using universal GNU makefiles, which work for any CPU target and any software project with the customization of a few Make variables. I wrote a set of GNU Makefiles, following the manual, around 1998, and I have never had to change them since, regardless of the target platform. For each new platform, only a small set of variables must be changed, by generating one per-platform included file with things like the names of the compilers and other tools to invoke and their command-line options.
For new projects, one very small file must be generated for each binary to be built, containing the type of the file (e.g. executable, static library, shared library) and a list of one or more directories where source files should be searched. No changes are needed when source files are created, deleted, moved or renamed, and dependencies are identified automatically. I am always astonished at how many totally unnecessary complications exist in most of the build configurations I have seen, whether provided by vendors or in open-source projects.
The "canonical" way to just build what was changed with make is to let the compiler generate header dependencies during the build and to include those (=> if you change only 1 header, just the affected translation units get rebuilt). Works like a charm (and never messes up incremental builds by not rebuilding on a header-only change).
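A minimal sketch of that pattern (directory layout and variable names are illustrative):

```make
# Hedged sketch of compiler-generated header dependencies with GNU make.
SRCS := $(wildcard src/*.c)
OBJS := $(SRCS:.c=.o)
DEPS := $(OBJS:.o=.d)

# -MMD writes a .d file per translation unit as a side effect of compiling;
# -MP adds phony targets so deleting a header does not break the build.
CFLAGS += -MMD -MP

%.o: %.c
	$(CC) $(CFLAGS) -c $< -o $@

# Pull in the generated dependency files; the leading '-' ignores
# them on the first build, when they do not exist yet.
-include $(DEPS)
```

With this in place, touching one header rebuilds exactly the translation units that include it.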
If you did not have proper incremental builds before, I would blame the makefile specifically and not make itself.
Another way to mess up with make is to not allow anything in parallel, either by missing flags (-j) or by having a rule that does everything sequentially.
Ninja does have less overhead than make, but your build time should be dominated by how long the compiler and linker take; such a big difference between the two indicates to me that your old make setup was broken in some way.
Honestly, I can't think of a single job for which C++ is the best tool.
[0]: https://www.etlcpp.com/
Last time I did embedded work this was basically all that was required.
Also -nostdlib means no global constructors run, so static objects with nontrivial ctors need you to call __libc_init_array yourself.
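What __libc_init_array does is conceptually simple: walk a table of constructor function pointers. On a real bare-metal startup the table bounds come from linker-script symbols (__init_array_start / __init_array_end); in this hedged sketch the table is simulated so the snippet runs off-target:

```cpp
#include <cassert>

// Type of an entry in .init_array: a parameterless constructor function.
using init_fn = void (*)();

static int ctor_ran = 0;
static void fake_static_ctor() { ctor_ran = 1; }

// Stand-in for the linker-provided .init_array section; in real startup
// code you would use &__init_array_start and &__init_array_end instead.
static init_fn fake_init_array[] = { fake_static_ctor };

// The core of what __libc_init_array does (it also runs .preinit_array).
void run_init_array(init_fn* begin, init_fn* end) {
    for (init_fn* f = begin; f != end; ++f) (*f)();
}
```

If your startup code never makes this pass, static objects with nontrivial constructors are silently left uninitialized.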
Suppose Doodads can be constructed from a Foozle either with the Foozle Resigned or with the Foozle Submitted. Using tag dispatch, we make Resigned and Submitted types, and the Doodad has two specialised constructors for the two types, even though substantively we pass only the Foozle in both cases.
In a language like Rust all the constructors have names, so it's obvious what Vec::with_capacity does, while you will still see C++ programmers who think that constructing a std::vector with a single integer parameter does the same, because it's just a constructor and you need to memorize what each overload actually does.
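A sketch of both points, keeping the commenter's hypothetical Doodad/Foozle names (C++17):

```cpp
#include <cassert>
#include <vector>

// Empty tag types disambiguate two constructors that otherwise
// take the same argument (the hypothetical Foozle).
struct Foozle { int value = 0; };
struct resigned_t {};
struct submitted_t {};
inline constexpr resigned_t resigned{};
inline constexpr submitted_t submitted{};

struct Doodad {
    bool was_resigned;
    Doodad(const Foozle&, resigned_t)  : was_resigned(true)  {}
    Doodad(const Foozle&, submitted_t) : was_resigned(false) {}
};

// The std::vector pitfall from the comment: these two are not the same.
inline std::vector<int> five_zeros() { return std::vector<int>(5); } // size 5, zero-filled
inline std::vector<int> one_five()   { return std::vector<int>{5}; } // size 1, element 5
```

The call sites read as `Doodad d(foozle, resigned);`, which carries the same information a named constructor would, at the cost of declaring the tag types yourself.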
For more on disambiguation tags, see https://quuxplusone.github.io/blog/2025/12/03/tag-types/
and https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2025/p39...
idk man, obviously I don't know much since I don't have my own online book, but templates would not be at the start of my list when selling C++ for bare-metal.
unit suffixes, namespaces, RAII, constexpr, OOP (interfaces mostly), and I like a lot of the STL in avoiding inscrutable "raw loops".
I like the idea of templates, but it feels like a different and specialized skillset. If you are considering C++ from C, why not ease into it?
Just very basic type substitution is one of the most useful applications of templates, and it comes up in pretty much all software.
They’re also useful when you can’t use virtual dispatch. Concepts help a lot in making that tolerable.
Sure they can get stupid complicated and ugly as hell, but you don’t have to do that. Even their basic form is very useful
That said, RAII is probably the most useful thing
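A minimal sketch of the RAII critical-section guard idea: on a Cortex-M the enter/exit would typically be PRIMASK manipulation (__disable_irq and a restore); here the IRQ calls are simulated so the snippet runs anywhere:

```cpp
#include <cassert>

// Simulated interrupt control; on target these would touch PRIMASK.
static int irq_disable_depth = 0;
static void fake_disable_irq() { ++irq_disable_depth; }
static void fake_enable_irq()  { --irq_disable_depth; }

class CriticalSection {
public:
    CriticalSection()  { fake_disable_irq(); }
    ~CriticalSection() { fake_enable_irq(); }
    // Non-copyable: copying a lock guard makes no sense.
    CriticalSection(const CriticalSection&) = delete;
    CriticalSection& operator=(const CriticalSection&) = delete;
};

void update_shared_state(int& shared) {
    CriticalSection lock;   // interrupts off for this scope
    ++shared;
}                           // restored automatically, even on early return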
As I understand it, this is due to the requirement for distinct C++ functions to have unique addresses.
I've seen a lot of new people come into my team as juniors, or regular C/C++ engineers that convert to embedded systems. There is a real lack of good, concise resources for them, and the best result I've had is just mentoring them and teaching as we go.
You could look for an intro to embedded systems resource. Or just get a dev kit for something. Go different than the standard Pi or Arduino. Try and get something like a STM32G0 dev kit working and blinking its lights. It's less polished, but you'll have to touch more things and learn more.
If you want, core areas I would suggest to research are the very low level operations of a processor:
* How does the stack pointer work? (What happens to it during function calls?)
* How do parameters get passed to functions by value, by reference? What happens when you pass a C++ class to a function by value? What is a deep vs shallow copy of a C++ object, and how does that work when you don't have an OS or MMU?
* Where is heap memory stored? Why do we have a heap and a stack? How do these work in the absence of an OS?
* The Program Counter? (PC register). What happens to this as program execution progresses?
* What happens when a processor boots, how does it start receiving instructions? (This is vague, you could read the datasheet for something like the STM32G0 microcontroller, or the general Arm Cortex M0 core.)
* How are data/instructions loaded from disk to memory to cache to register? What are these divisions and why do we have them?
* Basic assembly language, you should know how loads and stores work, basic arithmetic, checks/tests/comparisons, jump operations.
* How do interrupts work, what's an ISR, IRQ, IVT? How do peripherals like UART, I2C (also what are these?), handle incoming data when you have a main execution thread already running on a single core processor?
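A couple of the questions above (pass by value vs by reference, shallow vs deep copy) can be poked at directly on a host compiler; a small illustrative sketch:

```cpp
#include <cassert>
#include <cstring>

// A struct holding a raw pointer copies only the pointer by default:
// a shallow copy, so both copies alias the same bytes.
struct Shallow {
    char* data;
};

// Owning the bytes inline makes the default copy a deep one.
struct Deep {
    char buf[16];
    Deep(const char* s) {
        std::strncpy(buf, s, sizeof buf - 1);
        buf[sizeof buf - 1] = '\0';
    }
};

void mutate_by_value(Deep d)      { d.buf[0] = 'X'; }  // caller unaffected
void mutate_by_reference(Deep& d) { d.buf[0] = 'X'; }  // caller sees change
```

On a bare-metal target with no MMU the mechanics are the same, which is exactly why an accidental by-value copy of a large object on a small stack hurts.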
Some of this may be stuff you already know, or seem rudimentary and not exactly relevant, but it has to be rock solid before you start thinking about how compilers for different languages, like C++, create machine code that runs on a processor without an OS.
Assembly is often overlooked, but it's a critical skill here. It's really not that bad. Often when working with C++ (or Rust) on embedded systems, if I'm unsure of something, my first step is to disassemble the binary and inspect the assembly, or step through the assembly with GDB via a JTAG probe on the target processor if the target was too small to hold all the debug symbols (very common).
Anyways, this may have been more than you were asking for. Just me typing out thoughts over my afternoon coffee.
The former is probably more what you are looking for.
- Applied Embedded Electronics: Design Essentials for Robust Systems by J. Twomey. It goes over the whole process making a device and what knowledge would be required for each. Making Embedded Systems, 2nd Edition by E. White is a nice complement.
- Embedded System Interfacing by M. Wolf describes the signals and protocols behind everything. It's not necessary as a book, but can help understand the datasheets and standards
- But you want to start with something like Computer Architecture by C. Fox or Write Great Code - Volume 1 - Understanding the Machine, 2nd Edition by R. Hyde. There are better textbooks out there, but those are nice introductions to the binary world.
The gist is that you don't have a lot of memory or CPU power (if you do, adapting Linux is a more practical option, unless it's unsuited for another reason). So all the nice abstractions provided by an OS are a no-go, and you need to take care of the hardware yourself, which is really finicky.
Using a C++ iterator interface to paper over the main problem of a standard ring buffer (non-contiguous regions) is a cute idea, but I like to use a "bip buffer"[1] instead, which always gives you real contiguous regions, so you can pass the pointers to things like a DMA engine.
[1] https://ferrous-systems.com/blog/lock-free-ring-buffer/
The tradeoff is that in the worst case you only have half the buffer available - the ring buffer essentially becomes a kind of double buffer where you periodically switch between writing/reading at the end or the beginning of the storage.
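For illustration, a minimal single-threaded sketch of the idea, simplified from the linked design (no atomics, so not actually lock-free; a real SPSC version needs them):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hedged bip-buffer sketch: grant() always returns one contiguous
// region, so the pointer could be handed straight to a DMA engine.
// When the tail of the storage is too small, the writer wraps to the
// front and leaves a watermark marking where valid data ends.
class BipBuffer {
    static constexpr size_t N = 8;
    uint8_t buf_[N];
    size_t read_ = 0, write_ = 0, watermark_ = N;
    bool wrap_pending_ = false;
public:
    // Reserve `want` contiguous bytes for writing (nullptr if impossible).
    uint8_t* grant(size_t want) {
        if (write_ >= read_) {                 // data is one run [read_, write_)
            if (N - write_ >= want) return &buf_[write_];
            if (read_ > want) {                // wrap to the front instead
                wrap_pending_ = true;
                return &buf_[0];
            }
            return nullptr;
        }                                      // inverted: free gap is [write_, read_)
        return (read_ - write_ > want) ? &buf_[write_] : nullptr;
    }
    void commit(size_t used) {
        if (wrap_pending_) { watermark_ = write_; write_ = used; wrap_pending_ = false; }
        else write_ += used;
    }
    // Largest contiguous readable region (len may be 0 when empty).
    const uint8_t* read_region(size_t& len) {
        if (write_ >= read_) { len = write_ - read_; }
        else if (read_ == watermark_) { read_ = 0; watermark_ = N; len = write_; }
        else { len = watermark_ - read_; }
        return &buf_[read_];
    }
    void release(size_t used) {
        read_ += used;
        if (write_ < read_ && read_ == watermark_) { read_ = 0; watermark_ = N; }
    }
};
```

The worst-case capacity loss described above shows up in grant(): a request that doesn't fit at the tail only succeeds if it also fits entirely before the read position.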
I didn't say that the GCC version is relevant to the bare-metal definition, I said it's 15 years old. And when you try to draw conclusions and derive strong opinions based on codegen output while using a 15-year-old toolchain (and, btw, a pretty shitty CPU core), something isn't quite right. This article is just a good display of never-ending cargo-cult programming myths.
Their stuff isn't running on top of Linux on the Pi.