It's a huge reminder C++ is missing a proper, hygienic, macro system. Too many things that are a pain to do in C++ would be easy with a real macro language. I have hope we'll get it sometime this decade, seeing all the work on the language since C++11 still happening more than 10 years later.
It's worth noting that a macro system plus basic reflection is where the real power of macros lies.
Actually, some of the problem may be with C++ itself. When the C preprocessor is used in a more flexible, dynamic language, you can do surprising things.
In the cppawk project, which combines the preprocessor with Awk, I used the preprocessor to create an iteration syntax with a vocabulary of useful clauses that combine together for parallel or cross-product iteration.
What helps is that you don't have to deal with types and declaration syntax. So many of the ideas in cppawk will not translate back to C or C++, or not without wrecking the syntax with additional arguments and whatnot.
The man page for the <iter.h> header has a section on defining a clause; I provided an example of defining a clause that iterates on alpha-numeric string ranges like from "A00" to "Z99".
As an experienced Lisp programmer (and implementor), I had to rub my eyes several times to believe I had such a thing working, under such a universally maligned and reviled preprocessor.
You don't get the guarantees for what code is generated that you do with macros and they take a lot longer to compile. Also you can't just modify the AST of the current scope like you can with macros - pretty much every single time I have to use a macro it's because I need to generate code within the current scope. Fortunately they are usually very short.
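To make the "generate code within the current scope" point concrete, here is a minimal sketch of something a function or template cannot do: return from the enclosing function. The names `RETURN_IF_NEGATIVE`, `step` and `run_steps` are made up for illustration and are not from any library.

```cpp
#include <cassert>

// Hypothetical helper macro: evaluates `expr` once and, if the result is
// negative, returns that value from the *enclosing* function.
#define RETURN_IF_NEGATIVE(expr)      \
    do {                              \
        const int rc_ = (expr);       \
        if (rc_ < 0) return rc_;      \
    } while (0)

// Illustrative step; a negative value stands for an error code.
inline int step(int value) { return value; }

int run_steps(int first, int second) {
    RETURN_IF_NEGATIVE(step(first));   // may return from run_steps right here
    RETURN_IF_NEGATIVE(step(second));
    return 0;                          // both steps succeeded
}
```

A function wrapper could report the error, but the early `return` itself has to be spliced into the caller's scope, which is why this stays a (short) macro.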
The two for me fill different niches - for generic type-safe functions, parametrised types, etc. - templates. For text generation - macros. A hygienic macro system that lets you generate AST nodes and gives you access to type information would be absolutely divine, but it doesn't seem like we're getting it. Imagine if we had a script language that had full access to the compiler's internals.
Most of the abuses of #define I have seen, a template or constexpr takes care of. There are still some cases where it is nice. But many times you probably should just write a function/method/template out of it anyway.
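A small sketch of the kind of #define that a template or constexpr function replaces. `SQUARE_MACRO`, `square` and `next_value` are illustrative names only; the point is that the macro pastes its argument text twice, while the function evaluates it once.

```cpp
#include <cassert>

// Classic function-like macro: the argument text appears twice in the expansion.
#define SQUARE_MACRO(x) ((x) * (x))

// Template/constexpr replacement: the argument is evaluated exactly once.
template <typename T>
constexpr T square(T x) { return x * x; }

static int g_calls = 0;
int next_value() { return ++g_calls; }  // side effect makes double evaluation visible

int macro_result() {
    g_calls = 0;
    return SQUARE_MACRO(next_value());  // expands to two calls: 1 * 2
}

int template_result() {
    g_calls = 0;
    return square(next_value());        // one call: 1 * 1
}

static_assert(square(4) == 16, "usable in constant expressions too");
```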
In case someone reads this comment as sarcasm: I got into a convo with Stroustrup about this once, back in the 90s. I said one thing I missed was the lack of macros, and he made a glancing comment about the preprocessor which I obviously dismissed and said didn’t even count. He bitterly said, “Yeah, unfortunately when something like that pollutes an ecological niche it becomes impossible to eradicate. The best I could get away with was templates.”
I'm standing by for the announcement that some caffeine-addled Boost metaprogramming madman has implemented the Rust borrow checker as a C++ template, or at least thinks that he may have, when the compilation completes sometime in the 2030s.
It's not clear how much boost.org magic they used. Failing that, a GCC extension could be needful.
From the ref:
---
Conclusion
We attempted to represent ownership and borrowing through the C++ type system, however the language does not lend itself to this. Thus memory safety in C++ would need to be achieved through runtime checks.
Don't forget function templates! D can do them, too, but we strongly discourage their use. The trouble is that people use them to create DSLs that are indistinguishable from C++ code. For example:
Yes, obviously operations with parser combinators are different than those with numbers. (Also, I find it kind of dumb to reserve short symbols for low-level operations that are rarely used in normal programming.)
Well, the problem here is the re-use of existing operators.
(That's why it's great that Haskell and other languages in that family allow you to define your own operators, instead of eg re-using bit-shifting for IO.)
Templates are nice, but they have shortcomings a more generic macro system wouldn't. They also have the issue that the more complex your task is, the more convoluted the code has to look; compilation times also increase, and parsers (ergo, IDEs too) have trouble giving meaningful info on parameters. Don't even get me started on template errors because that's an atrocity on another level :(
That's pretty cool. It's been a while since I've done C, but couldn't you use a `for` loop instead of a while and perform any necessary cleanup in the "update" section? i.e. https://gcc.godbolt.org/z/jq84jondh
(The condition is optimized away by the big three: msvc, clang, and gcc)
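For readers who don't follow the link, this is roughly what the `for`-loop variant could look like. It is a guess at the idea, not the code behind the godbolt URL; `Resource` and `WITH_RESOURCE` are hypothetical names.

```cpp
#include <cassert>

// Hypothetical resource that counts how often it was opened and closed.
struct Resource {
    int opened = 0;
    int closed = 0;
    void open()  { ++opened; }
    void close() { ++closed; }
};

// Setup runs in the init clause, the user's block is the loop body, and the
// cleanup sits in the "update" section. `once_` makes the body run one time.
#define WITH_RESOURCE(r) \
    for (int once_ = ((r).open(), 1); once_; once_ = 0, (r).close())

int use_resource(Resource& r) {
    int open_inside = 0;
    WITH_RESOURCE(r) {
        open_inside = r.opened - r.closed;  // 1 while the body is running
    }
    return open_inside;
}
```

One caveat with this shape: a `break` or `return` inside the body skips the update section, so the cleanup does not run.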
I think in part because do … while expects a ; at the end so you are obliged to provide one, which makes the macro feel more like a “real” function call.
Good point, thank you. The while (0) demands the expected ; The trailing else hopes for the expected ; but would tolerate a wide range of nonsense instead.
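A short sketch of why the `do { ... } while (0)` wrapper demands the semicolon and behaves like one statement. `BUMP_TWICE` and `classify` are made-up names for the example.

```cpp
#include <cassert>

// Multi-statement macro wrapped so that it forms exactly one statement
// once the caller adds the trailing semicolon.
#define BUMP_TWICE(counter) \
    do {                    \
        ++(counter);        \
        ++(counter);        \
    } while (0)

int classify(bool flag) {
    int hits = 0;
    if (flag)
        BUMP_TWICE(hits);   // the ';' completes the do-while statement
    else                    // so this else still pairs with the if above
        hits = -1;
    return hits;
}
```

With a bare `{ ... }` body instead, the caller's `;` would become an empty statement after the block and the `else` would no longer attach to the `if`.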
I wrote a useful extension to the C preprocessor for GCC, and submitted it to the gcc-patches mailing list in April. This went unnoticed, as have my subsequent pings since. I'm planning to ping once a month for the rest of 2022, and then switch to quarterly.
__EXP_COUNTER__ gives a macro expansion to a numeric value which uniquely enumerates that expansion.
The sister macro __UEXP_COUNTER__ allows a macro expansion to access the parent's value: if a macro is being expanded in the body of another macro, one level up, it provides the __EXP_COUNTER__ value of that parent macro.
This feature solves the problem of producing unique names in a macro. (Unique within a translation unit.)
The __LINE__ symbol gets abused for this. The problem is that it's not unique. A macro can be called two or more times in the same line of code. Moreover, the same line number like 42 can occur multiple times in the same translation unit due to #include; a line number is not unique within a translation unit.
__COUNTER__ is next to useless because on each access, its value changes. It's useful in a situation in which a name is needed syntactically, and has to be unique, but is otherwise never referenced: just mentioned once and that's it.
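The two behaviours described above can be checked directly. This sketch relies on `__COUNTER__`, which is a non-standard extension (available in GCC, Clang and MSVC); the variable names are illustrative.

```cpp
#include <cassert>

// Two uses on the same source line see the same __LINE__ value, so names
// built from it would collide.
constexpr int line_a = __LINE__, line_b = __LINE__;

// Every access to __COUNTER__ yields a fresh value, so a name built from it
// cannot be spelled a second time by reading __COUNTER__ again.
constexpr int counter_a = __COUNTER__;
constexpr int counter_b = __COUNTER__;

static_assert(line_a == line_b, "__LINE__ is not unique within a line");
static_assert(counter_b == counter_a + 1, "__COUNTER__ changes on each access");
```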
Multiple references to __EXP_COUNTER__ in the same macro expansion context produce the same value.
As someone who understands C macros relatively well and who makes frequent use of them, I don't think I understand what __EXP_COUNTER__ does, and how it is different from __COUNTER__. I would have to experiment each time before using it, and would then quickly forget again the intricate details about expansion order, etc., similar to how every time I do some kind of STRINGIFY macro I have to make sure to use the right number of forwarding macro calls.
Is there a concrete use case for this that really can't be solved by __LINE__? I've used __LINE__ in the past to generate unique identifiers used in macro-generated code chunks. I don't see that non-uniqueness thing you mentioned causing any problems except for global variables (so not really an issue in my book).
As much as I love the C preprocessor as a crude tool that can solve many practical issues that are solved in other languages with an order of magnitude more complexity (besides solving problems that other languages don't have), I think the value doesn't come from its unintelligible execution model. And if __EXP_COUNTER__ is so difficult to understand, I personally don't like it.
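For reference, the STRINGIFY forwarding mentioned in this comment looks like the sketch below. `BUFFER_SIZE` is a hypothetical macro; the extra level exists because an operand of `#` is not macro-expanded before being stringified.

```cpp
#include <cassert>
#include <cstring>

#define STRINGIFY_DIRECT(x) #x                   // stringifies the argument as written
#define STRINGIFY(x) STRINGIFY_DIRECT(x)         // one forwarding level expands it first

#define BUFFER_SIZE 42                           // hypothetical macro to stringify

const char* const direct_text   = STRINGIFY_DIRECT(BUFFER_SIZE);  // the macro's name
const char* const expanded_text = STRINGIFY(BUFFER_SIZE);         // the macro's value
```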
The parent comment explained pretty well the advantages over __LINE__ imo.
Crafting macros is often black magic, but using them shouldn't be (if they're well crafted). Having an implicit rule in your macro that it cannot be used twice in the same line is surprising and potentially dangerous.
Another example is where you have a macro doing lots of work, if it needs to use a submacro multiple times that itself needs a unique identifier, then __LINE__ is no longer sufficient.
__UEXP_COUNTER__ is a little more difficult to imagine a use-case for, I'll admit (I can see it allows passing counters around, but I can't see why a parameter couldn't do the same).
Again, preprocessor macros are black magic; these additions seem a lot simpler to understand than `__VA_OPT__` (and its predecessor `##__VA_ARGS__`) or MSVC's awful stringify problems, as you brought up.
__EXP_COUNTER__ has a stable value in a given token replacement sequence (right hand side of a macro).
__COUNTER__ __COUNTER__ __COUNTER__ might give you 42 43 44, whereas __EXP_COUNTER__ __EXP_COUNTER__ __EXP_COUNTER__ will produce 73 73 73.
We can imagine that every macro has a hidden parameter:
#define MAC(A, B, C, __EXP_COUNTER__)
We don't pass this parameter when calling the macro; the macro expander does that, and it passes an integer token whose value is incremented for each such call.
Right, but now you can't use that COUNTER value in a macro and have it stay the same value. Think about concatenating a variable name with the counter and trying to use that same new name later in the same macro. Like NAME ## __COUNTER = NAME ## __COUNTER -1; This won't work without some extra state.
This code won't work in any case. You can't do arithmetic like that. And I think it's much better code anyway to create the expansion once, because otherwise you have to construct NAME ## __COUNTER at each use and it quickly becomes unmaintainable and hard to change how you construct that name.
IMO the best solution if you want to avoid an extra indirection to inject some state, would be preprocessor variables that you can assign to in a macro expansion. Procedural preprocessor code basically. But the preprocessor doesn't work like that.
I'm probably missing something, but can't you use counter to generate a unique name once and then forward it to another macro so that it can be used multiple times?
Thus perhaps you may be able to get something like __EXP_COUNTER__ by splitting your macros into interface and implementation:
#define MAC_IMPL(A, B, EXP_COUNTER)
#define MAC(A, B) MAC_IMPL(A, B, __COUNTER__)
I'm guessing this is what you mean by forwarding.
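Filled in with a body, the interface/implementation split might look like this. It is only a sketch of the forwarding idea using the existing `__COUNTER__` extension, not the proposed `__EXP_COUNTER__`; `SWAP_INT`, `SWAP_INT_IMPL` and `swap_tmp_` are invented names.

```cpp
#include <cassert>

#define CONCAT_INNER(a, b) a##b
#define CONCAT(a, b) CONCAT_INNER(a, b)

// Implementation macro: `id` is an ordinary parameter, so every use of it in
// this body refers to the same number and therefore the same temporary name.
#define SWAP_INT_IMPL(x, y, id)              \
    do {                                     \
        int CONCAT(swap_tmp_, id) = (x);     \
        (x) = (y);                           \
        (y) = CONCAT(swap_tmp_, id);         \
    } while (0)

// Interface macro: mentions __COUNTER__ exactly once and forwards it.
#define SWAP_INT(x, y) SWAP_INT_IMPL(x, y, __COUNTER__)

int swapped_difference(int a, int b) {
    SWAP_INT(a, b);
    return a - b;  // after the swap: (old b) - (old a)
}
```

The cost is the one pointed out in the next comment: every macro that needs a unique name turns into a pair of macros.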
This could be a pretty major inconvenience, if you have to do it in the middle of a situation that is already stuffed with preprocessing contortions. Like say you had to define 32 macros that are similar to each other, for whatever reason, and you want this hack: now you have 64.
By the way, I'm also interested in solving the "no recursive macro" problem hinted at in this submission. While working on __EXP_COUNTER__, I looked into it a bit.
The big issue is that the GNU C preprocessor uses global state for tracking expansion. In effect, it takes advantage of the no-recursion rule and says that during a macro's expansion, only one context for that expansion needs to exist. That context is patched into the macro definition, or something like that. (I don't have the code in front of me and it's been a few months.) The preprocessor knows that there is a current macro being expanded, and there is a stack of those; but that is referenced by its static definition, which has a 1:1 relationship to expansion state, like parameters, location and whatnot. That might have to turn into a stack, perhaps; there is a refactoring job there, and the code is a bit of a hornet's nest.
In terms of syntax/deployment, it would be easy. I envision that there could be a #defrec directive that is like #define, but which creates a macro that is blessed for recursive expansion. Or other possibilities: #pragma rec(names, of, macros, ...) which is better for code that has to work without the extension, since it uses #define.
Your NBDKIT_UNIQUE_NAME(name) cannot produce the same name twice because it doesn't take a counter as a parameter.
__EXP_COUNTER__ adds the ability for a macro expansion to have its own counter for that expansion instance, without some other macro having to hand it one as an extra, visible parameter.
Macro systems inevitably wind up being used to create a specialized undocumented language that nobody but its creator understands.
I know how enticing they are, I designed and implemented one myself for the ABEL programming language. I used lots of clever C macros in my C programming, and was proud of them.
But, eventually, I removed all the macro usage, and quite preferred the resulting code. It was cleaner and easier to read.
It's not just C macros. It's the same for assembler macros. I've heard from others it's the same for other languages that rely on macros.
Essentially, macros are a cheap way to add power to a language. A better way is to add proper metaprogramming features. This is the route we chose to go with D, and it is satisfyingly successful.
I have seen this kind of "flip-flop" behaviour people have with macros a few times. First you go all in, burn yourself, and then go to the other extreme.
Personally, I think macros are a good way to automate some common tasks, but you have to be careful to keep them short. It is also a good idea to prune macros periodically to remove what you don't need.
In C++, if you find yourself choosing whether to use a macro or a template, choose the one which is more terse!
Also macros will always inline in debug while templates will generate functions in debug builds, without optimizations. This may be an important performance consideration at times.
I can hear the sound of a thousand LISP devs hurting in parens reading this comment lol.
Macros, as with most things (including even goto!) have their place, the problem is when they’re abused. But to say they’re never useful ever and you should instead always rely on language features is not something I agree with, and could even lead to language bloat if you need a full fledged feature for every little thing which would be trivially solved with a macro.
My unfettered opinion is that Lisp has not really caught on because it relies on macros to make it useful. Every project invents their own language on top of Lisp, incompatible with anyone else's.
It's like the problem with C++ before C++98. It had no string class, so everybody invented their own, all incompatible with everyone else's.
BTW, everyone says that they understand my point and use macros modestly and responsibly. Nearly all of them go on to create their own undocumented impenetrable language out of those macros.
It takes a programmer about 10 years of creating and using macros and dealing with other peoples' macros to come to the conclusion that the whole feature needs to be scrapped. Sadly, there aren't any shortcuts to this realization :-)
> My unfettered opinion is that Lisp has not really caught on because it relies on macros to make it useful. Every project invents their own language on top of Lisp, incompatible with anyone else's.
Been saying this for years - this way of programming is powerful for the lone hacker, but lethal for team efforts. I will never forget the guy who ported some weird function evaluation framework from Clojure to a Java app and then left for greener pastures, what he left behind was the gnarliest of mindfucks.
It also took the C++ community about 10 years to realize that the way iostreams was doing operator overloading to do pipelining was an abomination as well.
In the D community, we also strongly discourage operator overloading for any purpose other than creating arithmetic types.
I am doubtful that even WG21 as originally constituted would have accepted I/O Streams with its "Look at me, I've got operator overloading" operator abuse if it wasn't Stroustrup's own code. If some outsider had come along and said "Look at this slower, clumsier, operator abusing alternative to C's stdio" the committee might have quoted Stroustrup's own words condemning such abuse. "the ability to define new meanings for old operators can be used to write programs that are well nigh incomprehensible".
I'm with you up to a point on overloading, if it were up to me for example Rust would not implement Add and AddAssign on String, and certainly Java wouldn't special case += but we are where we are.
However Rust has several operators (fewer than C++ but still several) that aren't just for arithmetic types. Deref and DerefMut of course (used to implement smart pointers such as Arc), Index and IndexMut (for the indexing operator []) but also Try (implementation of the ? operator) and (though rather more distant into your future than Try if you write Stable Rust) the Function operator traits Fn, FnMut and FnOnce which represent callables.
Of course arguably Rust isn't overloading operators at all. Rust has no subtyping, and so whether you can Add or Multiply or Try something is a matter only of whether that type implements the associated Trait.
I think Rust's hygienic and declarative "by example" macros are very nice actually. You could of course do the same things with its procedural macros but that's messy and harder to maintain. Appropriate tools for the job, don't use a chainsaw to trim your rosebush.
I don't know D but it sounds like you do a lot of work at compilation which is good. I never understood why people took away the preprocessor but then forced the use of reflection which then breaks at runtime instead of breaking at compile time. When I write C# there would be so many opportunities for short preprocessor macros. Instead you either have to create a reflection monstrosity or copy/paste the same piece of code dozens of times.
My thoughts were "that's just #if true, no?", then "wait, static_cast is not part of the preprocessor, that can't work" to "wtf, it actually compiles"...
Edit: As people point out, you can click on it to get context. And yeah, that one is an oof.
> The #if statement replaces, after macro expansion, every remaining identifier with the pp-number 0. So #if static_cast<bool>(-1) is equivalent to #if 0<0>(-1), #if 0 > -1, and #if 1.
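The quoted rule can be observed with a small sketch like the one below; the variable names are made up. A compiler may warn about the undefined identifiers (for example under GCC's `-Wundef`), but the directives are accepted.

```cpp
#include <cassert>

// Preprocessor view: static_cast and bool become 0, giving (0 < 0) > (-1),
// which is 0 > -1, i.e. true.
#if static_cast<bool>(-1)
constexpr bool pp_minus_one = true;
#else
constexpr bool pp_minus_one = false;
#endif

// The same rewrite turns this into (0 < 0) > (1), i.e. 0 > 1, which is false.
#if static_cast<bool>(1)
constexpr bool pp_one = true;
#else
constexpr bool pp_one = false;
#endif

// The language proper, of course, converts 1 to true.
constexpr bool lang_one = static_cast<bool>(1);

static_assert(pp_one != lang_one, "preprocessor and language disagree here");
```

So the `-1` case only looks right by accident; with `1` the preprocessor and the compiler proper disagree.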
I suspect the point is that the preprocessor's expression syntax is out of whack with the host languages it is integrated into. If you hoist an expression of the language proper into a preprocessing directive, you may get gibberish.
This could happen by accident, particularly through layers of macros:
Say you have:
#if SOME_MACRO(ARG)
Originally, this expands to a constant expression in which everything is an integer; then someone edits the macro. Things may still compile, but the expression is gibberish, not doing what it looks like it's doing.
The macro could be used in non-preprocessing contexts:
int x = SOME_MACRO(X);
if (SOME_MACRO(Y)) ...
so that programmer might have a good reason for editing it; they just didn't notice it's also used in an #if directive.
I have some program I'm working on, doing the usual edit/compile/run/debug cycle. At some point I decide to compare two versions of some section of code, so I write out temporary files of the old section named "old" and the new section named "new". Then compiles start failing, but oddly it is a file that I haven't edited recently.
The issue is that some code (not necessarily even mine) has an "#include <new>" and it is picking up my temporary file named "new".
One of the most odd issues I have encountered was a test case that would fail if one random log line was deleted (which would normally mean UB or timing issues) but, wildly, not when the log line was commented out. Turns out it was interaction between the use of __LINE__ in a macro to generate unique identifiers and a violation of the One Definition Rule.
Heh. If I had a nickel for every time I shot myself in the foot over the years by dropping a temporary file named "test.py" somewhere... I don't know about rich, but I'd probably at least be able to buy myself a coffee.
Anyone know what SIMD at the bottom layer means here? I know what SIMD is; I mean in the context of the preprocessor (and apparently being the worst offender).
Sorry, I misread your comment as clickbait instead of clickable.
~~I can see how others are somewhat clickbait, but this is literally single instruction/continuation multiple data. It uses a single step in the continuation mechanism to compute on multiple elements of data to speed up the computation.~~
No worries, I probably misread your comment too :). I thought you were asking what SIMD/SCMD is in this context, but as the submitter you probably already did, and already knew that the links were clickable.
BTW: I don't think the titles of each entry are particularly clickbait-y.
Unrelated, but yesterday I found out you can have two bools in C++ that are both true but do not equal each other, by reinterpret casting them from (u)ints. I think this was for the standard bool type too... Now I'm questioning the most basic of things.
I'm very sure that's just UB. The standard requires unique representations of true and false (e.g. byte values of 1 and 0, but for all you care it could be 13 and 37). Converting an integer to bool (even implicitly, as in `if (3)`) is required to lead to those values in the abstract machine.
If you somehow force a bool to be a different value (e.g. `*(int*)(&myBool) = 7`), that's UB.
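The well-defined half of this is easy to show: converting integers to bool by value normalizes them, so any two nonzero inputs compare equal afterwards. The helper names below are invented for the example, and the code deliberately avoids any pointer casts.

```cpp
#include <cassert>

// Value conversion: any nonzero int becomes true, zero becomes false.
bool from_int(int value) { return static_cast<bool>(value); }

// Two converted values compare equal whenever both inputs were nonzero
// (or both were zero), no matter which integers they started as.
bool same_truth(int lhs, int rhs) { return from_int(lhs) == from_int(rhs); }
```

It is only when a bool object's storage is overwritten through a differently typed pointer that two "true" values can end up comparing unequal.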
And so it does! Thank you, will see if I can keep it to prevent me from running into this issue. I have a feeling the Unreal codebase will be full of UB abuse though.
Clang, MSVC and GCC all have options to turn off various flavours of UB, or rather to define the behaviour in those cases. I strongly suggest using -fwrapv -fno-strict-aliasing -fno-delete-null-pointer-checks (and the equivalent in MSVC) in every large project. That is the easiest UB to hit and while the program will have a bug, it will at least be easier to reason about and optimised vs debug builds will have the same behaviour. Debugging "the compiler deleted my if meant to catch and log an error condition because after the inlining pass some function dereferences a pointer, thus the pointer cannot be null, thus the if can be deleted" is... hard.
I know that's the case for ints, but in this case [0] the int was cast to a boolean, which I thought would ensure comparisons would perform as expected, but no such luck.
You are not casting an int to a bool (which would indeed do the right thing) but casting a pointer to int to a pointer to bool which violates strict aliasing.
This is not strict aliasing as far as I understand. Strict aliasing is about inference of distinctness of pointers. The case here is that an invalidly typed pointer is created (a bool pointer pointing to where there is no bool). Not sure what this situation is called in standardese.
After skimming this, I still think the strict aliasing rule is used by compiler to avoid re-reads. What you were talking about is probably something else, maybe an "invalid lvalue access" as per your link.
From cpp reference: "Strict Aliasing: Given an object with effective type T1, using an lvalue expression (typically, dereferencing a pointer) of a different type T2 is undefined behavior, unless [non relevant exceptions omitted]".
In this case the expression has type bool and the underlying object has type int, so it is a straightforward strict aliasing violation.
With GCC you can compile with -fno-strict-aliasing to ignore this rule. But now you fall afoul of the rule that prevents accessing an invalid representation (i.e. a trap-representation) of an object. This rule is also described in the link I posted before, under the object representation paragraph.
Ah fair, thanks for pointing it out! Wasn't my issue, just the minimal reproduction of behaviour that happened over a few external modules, causing this issue.
The C/C++ preprocessor is probably the esoteric programming language that sees the most real-world use. Or would that be C++ template metaprogramming...?
I don't know which is worse: preprocessor abuse in C land or object-oriented design abuse in C++ land. Both can lead to code that is quite hard to maintain...
(I know that you can have preprocessor abuse in C++ too, but that's not common practice.)
I started but couldn't finish. For example __FILE__ (and __LINE__ and similar) certainly can be used in user-written macros, but they evaluate to the file/line they appear at in the macro source, not at the line the macro is invoked.