r6.0.1:Porting Guide

From liblfds.org
Revision as of 14:07, 4 January 2015 by Admin (talk | contribs) (1 revision imported)
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to navigation Jump to search

Library

Introduction

There are four issues to deal with when it comes to porting liblfds;

  1. The lock-free algorithms implemented in liblfds depend on atomic instructions and the API for these instructions varies by platform.
  2. Atomic instructions always need to work on aligned memory (e.g. memory allocated on pointer length boundaries) and as such liblfds needs for heap allocations access to platform specific aligned malloc and aligned free functions and for stack allocations access to the compiler specific keyword which controls the memory alignment of variables declared on the stack.
  3. The atomic instructions always correspond to a single underlying CPU instruction and it is important that the abstraction layer functions are inlined by the compiler, otherwise we have the overhead of a function call every time we use one of these instructions and one of the points of using lock-free is performance; as such, liblfds needs access to the compiler specific keyword which controls function inlining.
  4. The word length of the underlying hardware can vary. In today's world, this means 32 bit CPUs and 64 bit CPUs and liblfds needs to know an integer type which has the word length of the underlying CPU.

In response to these issues, liblfds implements an abstraction layer. The abstraction layer provides to the liblfds a standard API to mask these platform dependent issues. Porting consists of implementing the abstraction layer API on your platform.

Detailed Platform Requirements

To port liblfds, the target platform must provide;

  • atomic single-word increment
  • atomic single-word compare-and-swap
  • atomic contigious double-word compare-and-swap
  • aligned malloc
  • aligned free
  • compiler keyword for alignment of variables declared on the stack

A word here means a type equal in length to the platform pointer size.

Also, the target plaform may provide;

  • compiler keyword for function inlining

It is desirable but not essential to have a compiler keyword for function inlining. If the abstraction layer define for this keyword is left empty, then inlining will not occur; everything will still work, it'll just be slower.

Currently, liblfds has been ported only to platforms which provide a contigious double-word compare-and-swap (x86, x64 and ARM). It is entirely possible to port to a platform which instead provides load-link/conditional-store where non-load-linked loads can be performed inside the LL/SC pair (this is how ABA is solved using single word LL/SC) but this has not yet been done and it is not reasonable to ask a third party to undertake such a port until I've done it and documented it. Please see lfds601_abstraction_dcas for more details should you wish to try this anyway.

Note that the SPARC and IA64 architectures provides neither contigious double-word compare-and-swap nor load-link/conditional-store and as such, liblfds cannot be ported to those architectures.

The Abstraction Layer

At this point in the guide, you should briefly open the abstraction layer API page, lfds601_abstraction (liblfds), to get an overview of the API function prototypes, defines and typedefs.

Each function, define and typedef has a wiki page and those pages contain detailed information and examples specifically written with implementation in mind.

Implementing the Abstraction Layer

Implementing the abstraction layer requires;

  • implementing the five abstraction layer functions;
    • aligned malloc
    • aligned free
    • atomic single-word compare-and-swap function, compiler instrinic or instruction
    • atomic contigious double-word compare-and-swap function, compiler instrinic or instruction
    • atomic increment function, compiler instrinic or instruction
  • providing the necessary set of typedefs and defines in liblfds.h;
    • a type for lfds601_atom_t
    • the compiler inline directive for LFDS601_INLINE
    • the compiler stack variable alignment directive for LFDS601_ALIGN
    • the correct sizes for LFDS601_ALIGN_SINGLE_POINTER and LFDS601_ALIGN_DOUBLE_POINTER
    • #including any header files needed for your platform

The location of the functions, typedef and defines in the source files is as follows;

File Function
/liblfds601/src/lfds601_abstraction/lfds601_abstraction_aligned_free.c lfds601_abstraction_aligned_free
/liblfds601/src/lfds601_abstraction/lfds601_abstraction_aligned_malloc.c lfds601_abstraction_aligned_malloc
/liblfds601/src/lfds601_abstraction/lfds601_abstraction_cas.c lfds601_abstraction_cas
/liblfds601/src/lfds601_abstraction/lfds601_abstraction_dcas.c lfds601_abstraction_dcas
/liblfds601/src/lfds601_abstraction/lfds601_abstraction_increment.c lfds601_abstraction_increment
/liblfds601/inc/liblfds601.h typedef, defines and includes

When you come to implement, you will be adding your code alongside other existing implementations for other platforms. You must guard each function by an #if, such that it is only seen by platforms which can use that implementation. The set of the typedef and defines in liblfds601.h is similarly guarded by an identical #if.

So for example, lfds601_abstraction_aligned_free in lfds601_abstraction_aligned_free.c, on Windows looks like this;

#if (defined _WIN32 && defined _MSC_VER)

  /* TRD : any Windows on any CPU with the Microsoft C compiler

           _WIN32    indicates 64-bit or 32-bit Windows
           _MSC_VER  indicates Microsoft C compiler
  */

  void lfds601_abstraction_aligned_free( void *memory )
  {
    _aligned_free( memory );

    return;
  }

#endif

And the typedef and defines in liblfds.h on 64-bit Windows looks like this;

#if (defined _WIN64 && defined _MSC_VER)
  // TRD : 64-bit Windows with the Microsoft C compiler, any CPU
  #include <assert.h>
  #include <stdlib.h>
  #include <stdio.h>
  #include <windows.h>
  typedef unsigned __int64            lfds601_atom_t;
  #define LFDS601_INLINE                extern __forceinline
  #define LFDS601_ALIGN(alignment)      __declspec( align(alignment) )
  #define LFDS601_ALIGN_SINGLE_POINTER  8
  #define LFDS601_ALIGN_DOUBLE_POINTER  16
#endif

Existing Implementations Matrix

When an abstraction layer function is implemented, the protective #if exactly defines the platform requirements necessary for that implementation. As such, many functions although implemented for a specific port are in fact useable on a wide range of platforms. For example, there is an implementation of lfds601_abstraction_increment which only requires GCC 4.1.0 or later and so this implementation works on any operating system and any CPU.

Using this table can be a bit of a mind twister. The best way is to look at the function you're interested in and then go from left to right; you can then see what implementations exist for that function and if your platform matches any of them. Do that for each function and then you'll know what work you have to do to port. Generally speaking, on a Linux platform with GCC, the only work that is required is implementing double-word compare-and-swap, since this is not supported by GCC in its built-in atomic operations.

Notes; (u) denotes user-mode, (k) denotes kernel-mode, x86 means i486 or better, ARM means architecture 6K or better.

Standard
OS
CPU
Compiler
N/A
Windows (u)
Any
Microsoft C
N/A
Windows (k)
Any
Microsoft C
POSIX >= 6.0
Any
Any
Any
N/A
Any
Any
GCC >= 4.1.0
N/A
Any
x86, x64, ARM
GCC
lfds601_abstraction_aligned_free y y y
lfds601_abstraction_aligned_malloc y y y
lfds601_abstraction_cas y y y
lfds601_abstraction_dcas y y y
lfds601_abstraction_increment y y y

Test Program

Introduction

There are three issues when it comes to porting the test program;

  1. The test program needs to run multiple concurrent threads and the API for thread management varies by platform.
  2. The number of threads issued is a multiple of the number of CPUs and the API for determining the number of CPUs varies by platform.
  3. The prototype for a thread function varies by platform.

In response to these issues, the test program implements an abstraction layer. The abstraction layer provides to the test program a standard API for these platform dependent functions. Porting consists of implementing these functions on your platform.

Detailed Platform Requirements

To port the test program, the target platform must provide;

  • a function to start a thread
  • a function to wait for a thread to terminate
  • the type of the value returned by the thread function prototype
  • the type used for thread state

The Abstraction Layer

At this point in the guide, you should briefly open the abstraction layer API page, abstraction (test), to get an overview of the API function prototypes, defines and typedefs.

Each function, define and typedef has a wiki page and those pages contain detailed information and examples specifically written with implementation in mind.

Implementing the Abstraction Layer

Implementing the abstraction layer requires;

  • implementing the three abstraction layer functions;
    • a function for finding the number of CPUs
    • a function for starting a thread
    • a function for waiting for a thread to terminate
  • providing the necessary set of typedefs and defines in lfds601_abstraction.h;
    • a type for lfds601_thread_state_t
    • a type for lfds601_thread_return_t
    • #including any header files needed for your platform

The location of the functions and typedefs in the source files is as follows;

File Function
/test/src/lfds601_abstraction_cpu_count.c lfds601_abstraction_cpu_count
/test/src/lfds601_abstraction_thread_start.c lfds601_abstraction_thread_start
/test/src/lfds601_abstraction_thread_wait.c lfds601_abstraction_thread_wait
/test/src/lfds601_abstraction.h the typedefs and includes

When you come to implement, you will be adding your code alongside other existing implementations for other platforms. You must guard each function by an #if, such that it is only seen by platforms which can use that implementation. The typedefs in abstraction.h are similarly guarded by an identical #if.

So for example, lfds601_abstraction_cpu_count in lfds601_abstraction_cpu_count.c, on Windows looks like this;

#if (defined _WIN32 && defined _MSC_VER)

  /* TRD : any Windows on any CPU with the Microsoft C compiler

           _WIN32    indicates 64-bit or 32-bit Windows
           _MSC_VER  indicates Microsoft C compiler
  */

  unsigned int lfds601_abstraction_cpu_count()
  {
    SYSTEM_INFO
      si;

    GetSystemInfo( &si );

    return( (unsigned int) si.dwNumberOfProcessors );
  }

#endif

And the typedefs in lfds601_abstraction.h on Windows look like this;

#if (defined _WIN32 && defined _MSC_VER)
  /* TRD : any Windows on any CPU with the Microsoft C compiler

           _WIN32    indicates 64-bit or 32-bit Windows
           _MSC_VER  indicates Microsoft C compiler
  */

  #include <windows.h>
  typedef HANDLE  lfds601_thread_state_t;
  typedef DWORD   lfds601_thread_return_t;
#endif

The abstraction layer API root document page is abstraction (test).

Each abstraction layer function and typedef has a wiki page. Those pages contain detailed information and examples specifically written with implementation in mind. Refer to those pages for the information you need when you come to implement.

Existing Implementations Matrix

When an abstraction layer function is implemented, the protective #if exactly defines the platform requirements necessary for that implementation. As such, many functions although implemented for a specific port are in fact useable on a wide range of platforms. For example, there is an implementation of lfds601_abstraction_increment which only requires GCC 4.1.0 or later and so this implementation works on any operating system and any CPU.

Standard
OS
CPU
Compiler
N/A
Any Windows
Any
Microsoft C
pthreads
UNIX
Any
Any
Linux
Any
Any
GCC
lfds601_abstraction_cpu_count y y
lfds601_abstraction_thread_start y y
lfds601_abstraction_thread_end y y

Note that pthreads is not in fact restricted to UNIX. Implementations exist for Windows. However, Windows has it's platform-native threads, which are preferable, so the use of pthreads is for now with UNIXes.

See Also