r6.1.0:Porting Guide (lfds)
Introduction
There are six issues to deal with when it comes to porting liblfds;
- The lock-free algorithms implemented in liblfds depend on atomic instructions and the API for these instructions varies by platform.
- Atomic instructions always need to work on aligned memory (e.g. memory allocated on pointer length boundaries) and as such liblfds needs for heap allocations access to platform specific aligned malloc and aligned free functions and for stack allocations access to the compiler specific keyword which controls the memory alignment of variables declared on the stack.
- The atomic instructions always correspond to a single underlying CPU instruction and it is important that the abstraction layer functions are inlined by the compiler, otherwise we have the overhead of a function call every time we use one of these instructions and one of the points of using lock-free is performance; as such, liblfds can be given (but does not require) access to the compiler specific keyword which controls function inlining.
- The word length of the underlying hardware can vary. In today's world, this means 32 bit CPUs and 64 bit CPUs and liblfds needs to know an integer type which has the word length of the underlying CPU.
- Compilers and processors re-order instructions such that compiler and processor barriers must be provided.
- Memory allocation and release functions (malloc and free) are required.
In response to these issues, liblfds implements an abstraction layer. The abstraction layer provides to the liblfds a standard API to mask these platform dependent issues. Porting consists of implementing the abstraction layer API on your platform.
Detailed Platform Requirements
To port liblfds, the target platform must provide;
- atomic single-word increment
- atomic single-word compare-and-swap
- atomic contigious double-word compare-and-swap
- malloc and free
- compiler directive for alignment of variables declared on the stack
- compiler directives for compiler barriers and processor barriers
A word here means a type equal in length to the platform pointer size.
Also, the target plaform may provide;
- compiler keyword for function inlining
It is desirable but not essential to have a compiler keyword for function inlining. If the abstraction layer define for this keyword is left empty, then inlining will not occur; everything will still work, it'll just be slower.
Note that the Alpha, IA64, MIPS, PowerPC and SPARC architectures do not provide contigious double-word compare-and-swap and as such, liblfds cannot be ported to those architectures.
The Abstraction Layer
At this point in the guide, you should briefly open the abstraction layer API page, lfds610_abstraction (liblfds), to get an overview of the API function prototypes, defines and typedefs.
Each function, define and typedef has a wiki page and those pages contain detailed information and examples specifically written with implementation in mind.
Implementing the Abstraction Layer
Implementing the abstraction layer requires;
- implementing the five abstraction layer functions;
- malloc
- free
- atomic single-word compare-and-swap function, compiler instrinic or instruction
- atomic contigious double-word compare-and-swap function, compiler instrinic or instruction
- atomic increment function, compiler instrinic or instruction
- providing the necessary set of typedefs and defines in liblfds.h;
- a type for lfds610_atom_t
- the (optional, but it's highly desireable) compiler inline directive for LFDS610_INLINE
- the compiler stack variable alignment directive for LFDS610_ALIGN
- the correct sizes for LFDS610_ALIGN_SINGLE_POINTER and LFDS610_ALIGN_DOUBLE_POINTER
- the directives for LFDS610_BARRIER_COMPILER_LOAD/STORE/FULL and LFDS610_BARRIER_PROCESSOR_LOAD/STORE/FULL
- #including any header files needed for your platform
The location of the functions, typedef and defines in the source files is as follows;
File | Function |
/liblfds610/src/lfds610_abstraction/lfds610_abstraction_free.c | lfds610_abstraction_free |
/liblfds610/src/lfds610_abstraction/lfds610_abstraction_malloc.c | lfds610_abstraction_malloc |
/liblfds610/src/lfds610_abstraction/lfds610_abstraction_cas.c | lfds610_abstraction_cas |
/liblfds610/src/lfds610_abstraction/lfds610_abstraction_dcas.c | lfds610_abstraction_dcas |
/liblfds610/src/lfds610_abstraction/lfds610_abstraction_increment.c | lfds610_abstraction_increment |
/liblfds610/inc/liblfds610.h | typedef, defines and includes |
When you come to implement, you will be adding your code alongside other existing implementations for other platforms. You must guard each function by an #if, such that it is only seen by platforms which can use that implementation. The set of the typedef and defines in liblfds610.h is similarly guarded by an identical #if.
So for example, lfds610_abstraction_dcas in lfds610_abstraction_dcas.c, on 64-bit Windows looks like this;
#if (defined _WIN64 && defined _MSC_VER) /* TRD : 64 bit Windows (user-mode or kernel) on any CPU with the Microsoft C compiler _WIN64 indicates 64 bit Windows _MSC_VER indicates Microsoft C compiler */ static LFDS610_INLINE unsigned char lfds610_abstraction_dcas( volatile lfds610_atom_t *destination, lfds610_atom_t *exchange, lfds610_atom_t *compare ) { assert( destination != NULL ); assert( exchange != NULL ); assert( compare != NULL ); return( _InterlockedCompareExchange128((volatile __int64 *) destination, (__int64) *(exchange+1), (__int64) *exchange, (__int64 *) compare) ); } #endif
And the typedef and defines in liblfds.h on 64-bit Linux with GCC looks like this;
#if (defined __unix__ && defined __x86_64__ && __GNUC__) // TRD : any UNIX with GCC on x64 #include <assert.h> #include <stdio.h> #include <stdlib.h> typedef unsigned long long int lfds610_atom_t; #define LFDS610_INLINE inline #define LFDS610_ALIGN(alignment) __attribute__( (aligned(alignment)) ) #define LFDS610_ALIGN_SINGLE_POINTER 8 #define LFDS610_ALIGN_DOUBLE_POINTER 16 #define LFDS610_BARRIER_COMPILER_LOAD __asm__ __volatile__ ( "" : : : "memory" ) #define LFDS610_BARRIER_COMPILER_STORE __asm__ __volatile__ ( "" : : : "memory" ) #define LFDS610_BARRIER_COMPILER_FULL __asm__ __volatile__ ( "" : : : "memory" ) #define LFDS610_BARRIER_PROCESSOR_LOAD __sync_synchronize() #define LFDS610_BARRIER_PROCESSOR_STORE __sync_synchronize() #define LFDS610_BARRIER_PROCESSOR_FULL __sync_synchronize() #endif
(We see here that this platform only provides a full compiler barrier and a full processor barrier).