Abusing the C preprocessor

August 29, 2011

Both tricks shown here are related with a change (by Peter Zijlstra) in the kmap_atomic() and kunmap_atomic() macros/functions. LWN has an excellent article about what that change involved. It basically ‘dropped’ support for atomic kmap slots, switching to a more general stack-based approach.

Now with this change, the number of arguments passed to the kmap_atomic() function changed too, and thus you end up with a huge patch covering all the tree, which fixed the issue (changing kmap_atomic(p, KM_TYPE) to kmap_atomic(p)).

But there was another way to go. Some C preprocessor magic.

#define kmap_atomic(page, args...) __kmap_atomic(page)

Yes, the C preprocessor supports va_args. :)
(which I found out when going through the reptyr code, but I’ll talk about it in an other post.)

Today, I saw a thread at the lkml, which actually did the cleanup I described. Andrew Morton responded:

I’m OK with cleaning all these up, but I suggest that we leave the back-compatibility macros in place for a while, to make sure that various stragglers get converted. Extra marks will be awarded for working out how to make unconverted code generate a compile warning

And Nick Bowler responded with a very clever way to do this (which involved abusing heavily the C preprocessor :P):

  #include <stdio.h>

  int foo(int x)
  {
     return x;
  }

  /* Deprecated; call foo instead. */
  static inline int __attribute__((deprecated)) foo_unconverted(int x, int unused)
  {
     return foo(x);
  }

  #define PASTE(a, b) a ## b
  #define PASTE2(a, b) PASTE(a, b)
  
  #define NARG_(_9, _8, _7, _6, _5, _4, _3, _2, _1, n, ...) n
  #define NARG(...) NARG_(__VA_ARGS__, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0)

  #define foo1(...) foo(__VA_ARGS__)
  #define foo2(...) foo_unconverted(__VA_ARGS__)
  #define foo(...) PASTE2(foo, NARG(__VA_ARGS__)(__VA_ARGS__))

  int main(void)
  {
    printf("%d\n", foo(42));
    printf("%d\n", foo(54, 42));
    return 0;
  }

The actual warning is printed due to the deprecated attribute of the foo_unconverted() function.

The fun part, however, is how we get to use the foo ‘identifier’/name to call either foo() or foo_uncoverted() depending on the number of arguments given. :)

The trick is to use the __VA_ARGS__ to ‘shift’ the numbers 9-0 in the NARG macro, so that when calling the NARG_ macro, _9 will match with the first __VA_ARGS__ argument, _8 with the second etc, and so n will match with actual number of arguments (I’m not sure I described it very well, but if you try doing it by hand, you’ll understand how it’s working).

Now that we have the number of arguments given to foo, we use the PASTE macro to ‘concatenate’ the number of the arguments with the function name, and the actual arguments given, and call the appropriate wrapper macro (foo1, foo2 etc).

Another interesting thing, which I didn’t know, is about argument expansion in macros. For macros that concatenate (##) or stringify (#) the arguments are not expanded beforehand. That’s why we have to use PASTE2 as a wrapper, to get the NARG() argument/macro fully expanded before concatenating.

Ok, C code can get at times a bit obfuscated, and yes you don’t have type safety etc etc, but, man, you can be really creative with the C language (and the C preprocessor)!
And the Linux kernel development(/-ers) prove just that. :)

Coolest hack/trick ever!

February 17, 2011

Some time ago, I wrote about lguest, a minimal x86 hypervisor for the Linux Kernel, which is mainly used for experimentation, and learning stuff about hypervisors, operating systems, even computer architecture/ISA(x86 in particular).

Today I cloned the git repo for the lguest64 port, and I started browsing through the documentation and the code. In the launcher code(the program that initializes/sets up and launches a new guest kernel), I saw the coolest programming hack/trick I’ve seen in a long time. :P

/*L:170 Prepare to be SHOCKED and AMAZED.  And possibly a trifle nauseated.
 *
 * We know that CONFIG_PAGE_OFFSET sets what virtual address the kernel expects
 * to be.  We don't know what that option was, but we can figure it out
 * approximately by looking at the addresses in the code.  I chose the common
 * case of reading a memory location into the %eax register:
 *
 *  movl <some-address>, %eax
 *
 * This gets encoded as five bytes: "0xA1 <4-byte-address>".  For example,
 * "0xA1 0x18 0x60 0x47 0xC0" reads the address 0xC0476018 into %eax.
 *
 * In this example can guess that the kernel was compiled with
 * CONFIG_PAGE_OFFSET set to 0xC0000000 (it's always a round number).  If the
 * kernel were larger than 16MB, we might see 0xC1 addresses show up, but our
 * kernel isn't that bloated yet.
 *
 * Unfortunately, x86 has variable-length instructions, so finding this
 * particular instruction properly involves writing a disassembler.  Instead,
 * we rely on statistics.  We look for "0xA1" and tally the different bytes
 * which occur 4 bytes later (the "0xC0" in our example above).  When one of
 * those bytes appears three times, we can be reasonably confident that it
 * forms the start of CONFIG_PAGE_OFFSET.
 *
 * This is amazingly reliable. */
static unsigned long intuit_page_offset(unsigned char *img, unsigned long len)
{
	unsigned int i, possibilities[256] = { 0 };

	for (i = 0; i + 4 < len; i++) {
		/* mov 0xXXXXXXXX,%eax */
		if (img[i] == 0xA1 && ++possibilities[img[i+4]] > 3)
			return (unsigned long)img[i+4] << 24;
	}
	errx(1, "could not determine page offset");
}

It’s very well commented, and so I don’t think there’s something I could explain any better.
Very very nice trick!
‘Prepare to be shocked and amazed!’ ;)

Follow

Get every new post delivered to your Inbox.

Join 276 other followers