Honestly, the cleanest approach is probably to put the lower half and the higher half in separate executables. That is what I do to avoid such problems, though I have no choice: I am developing a 64-bit OS, and the lower-half code must be 32-bit. In my case, the loader kernel gets the memory info from multiboot, marks its own memory as reserved, marks the main kernel's memory as reserved, and then allocates all the paging structures. At the end, it removes itself from the reserved memory and hands those structures to the higher-half kernel in a defined way. This adds no lasting memory footprint, since the loader kernel frees itself at the end.
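To make the hand-off a bit more concrete, here is a rough sketch of the bookkeeping, written as ordinary hosted C so it can be compiled and poked at. Every name in it (boot_handoff, region, reserve, unreserve, loader_prepare, MAX_RESERVED) is made up for illustration and is not from my kernel or any other. It also assumes the paging structures are tracked as a reserved region, which is a guess rather than something stated above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; a sketch, not a real kernel interface. */
#define MAX_RESERVED 8

struct region { uint64_t base, len; };

/* What the loader kernel passes to the higher-half kernel. */
struct boot_handoff {
    uint64_t pml4_phys;                   /* physical address of the top-level page table */
    struct region reserved[MAX_RESERVED]; /* memory the main kernel must not hand out */
    size_t reserved_count;
};

/* Append a region to the reserved list; returns -1 if the list is full. */
int reserve(struct boot_handoff *h, uint64_t base, uint64_t len) {
    if (h->reserved_count == MAX_RESERVED)
        return -1;
    h->reserved[h->reserved_count++] = (struct region){ base, len };
    return 0;
}

/* Remove the region starting at base; returns -1 if it is not in the list. */
int unreserve(struct boot_handoff *h, uint64_t base) {
    for (size_t i = 0; i < h->reserved_count; i++) {
        if (h->reserved[i].base == base) {
            /* Order does not matter, so fill the hole with the last entry. */
            h->reserved[i] = h->reserved[--h->reserved_count];
            return 0;
        }
    }
    return -1;
}

/* Reserve loader and kernel, record the paging structures, then drop the loader. */
struct boot_handoff loader_prepare(struct region loader, struct region kernel,
                                   struct region page_tables) {
    struct boot_handoff h = {0};
    int ok = reserve(&h, loader.base, loader.len) == 0
          && reserve(&h, kernel.base, kernel.len) == 0
          && reserve(&h, page_tables.base, page_tables.len) == 0;
    assert(ok);
    (void)ok;
    h.pml4_phys = page_tables.base;  /* assumption: the root table sits at the region start */
    unreserve(&h, loader.base);      /* the loader frees itself before handing over */
    return h;
}
```

The point of the sketch is only the ordering: the loader is reserved while it runs, and is gone from the list by the time the main kernel sees the structure.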
If you insist on continuing on your path, you can probably automate the process somewhat with macros, such as this (untested):
Code:
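#include <stdint.h> // for uintptr_t
// Pointer to x at its lower-half address: the linked (higher-half) address minus the offset.
// Taking &(x) makes this work for variables as well as functions.
#define lower_half(x) ((__typeof__(&(x)))((uintptr_t)&(x) - HIGHER_HALF_START_ADDR))
// and then you use it like:
lower_half(memcpy)(dest, src, len);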
But I wouldn't want to use such an approach, because the compiler can emit undecorated calls to memcpy/memmove/memset literally anywhere (for struct copies and initializers, for example), and those calls go to the higher-half addresses no matter how careful you are with the macro. That is why I would suggest separate executables. I think I remember Linux doing something similar, with a boot section that is separate from the rest of the kernel.
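As a small illustration of the hidden-call problem, consider a plain struct assignment. The names here (page_table, clone_table) are made up for the example. There is no memcpy in the source, yet GCC and Clang are allowed to implement the copy as a call to memcpy, and GCC's documentation says that even freestanding builds must provide memcpy, memmove, memset and memcmp. Whether a call is actually emitted depends on the compiler, target and optimization level, so check the disassembly rather than take this on faith.

```c
#include <stdint.h>

/* Hypothetical type: one 4 KiB page table with 512 64-bit entries. */
struct page_table { uint64_t entries[512]; };

/* Nothing here names memcpy, but the compiler may still lower this
   4096-byte struct assignment to a memcpy call at its linked address. */
void clone_table(struct page_table *dst, const struct page_table *src) {
    *dst = *src;
}
```

If a function like this runs before paging is set up, a macro such as lower_half cannot help, because there is no call site in your source to wrap.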