Intel Intrinsic Headers

The Intel intrinsic headers provide functions that map to the processor’s specialized instructions. Each intrinsic family has its own header. However, the <immintrin.h> header includes most of the other supported Intel intrinsic headers; in GCC, the AMD-specific families (such as 3DNow!, FMA4, LWP, and XOP) are instead pulled in by <x86intrin.h>.

  • Intel Intrinsics (including RDRND & FSGSBASE) – <immintrin.h>
  • IA-32 Intrinsics – <ia32intrin.h>
  • 3DNow! – <mm3dnow.h>
  • MMX – <mmintrin.h>
  • SSE – <xmmintrin.h>
  • SSE2 – <emmintrin.h>
  • SSE3 – <pmmintrin.h>
  • SSSE3 – <tmmintrin.h>
  • SSE4.1 or SSE4.2 – <smmintrin.h>
  • ADX – <adxintrin.h>
  • AES or PCLMUL – <wmmintrin.h>
  • AVX – <avxintrin.h>
  • AVX2 – <avx2intrin.h>
  • AVX512BW – <avx512bwintrin.h>
  • AVX512ER – <avx512erintrin.h>
  • AVX512F – <avx512fintrin.h>
  • AVX512PF – <avx512pfintrin.h>
  • AVX512VL – <avx512vlintrin.h>
  • AVX512BW or AVX512VL – <avx512vlbwintrin.h>
  • BMI – <bmiintrin.h>
  • BMI2 – <bmi2intrin.h>
  • F16C – <f16cintrin.h>
  • FMA – <fmaintrin.h>
  • FMA4 – <fma4intrin.h>
  • FXSR – <fxsrintrin.h>
  • LWP – <lwpintrin.h>
  • LZCNT – <lzcntintrin.h>
  • POPCNT – <popcntintrin.h>
  • PREFETCHW – <prfchwintrin.h>
  • RDSEED – <rdseedintrin.h>
  • RTM – <rtmintrin.h>
  • SHA – <shaintrin.h>
  • XOP – <xopintrin.h>
  • XSAVE – <xsaveintrin.h>
  • XSAVEOPT – <xsaveoptintrin.h>
  • XTEST – <xtestintrin.h>

NOTE: Some intrinsics inside adxintrin.h are available only if ADX is defined, while others are available even when ADX is undefined.

Data Format Notations

  • ep – extended packed
  • p – packed
  • s – scalar

Datatype Notations

  • s – single-precision floating point
  • d – double-precision floating point
  • i128 – signed 128-bit integer
  • i64 – signed 64-bit integer
  • u64 – unsigned 64-bit integer
  • i32 – signed 32-bit integer
  • u32 – unsigned 32-bit integer
  • i16 – signed 16-bit integer
  • u16 – unsigned 16-bit integer
  • i8 – signed 8-bit integer
  • u8 – unsigned 8-bit integer

Machine Modes Used in C Attributes

Some compilers (such as GCC) support the mode attribute (__attribute__((mode(X)))). The mode attribute and the various machine-mode parameters can be used to create various datatypes. For instance, on an x86-64 system, 128-bit integers and decimal floats can be created using the code below.

typedef signed int __attribute__((mode(TI)))   int128_t;
typedef unsigned int __attribute__((mode(TI)))   uint128_t;
typedef float __attribute__((__mode__(__SD__)))   _Decimal32;
typedef float __attribute__((__mode__(__DD__)))   _Decimal64;
typedef float __attribute__((__mode__(__TD__)))   _Decimal128;

The list below describes the commonly used parameters for the mode attribute. Not every mode is available on every target.

  • BI – 1 Bit
  • QI – Quarter Integer; 1 byte
  • HI – Half Integer; 2 bytes
  • PSI – Partial Single Integer; 4 bytes; not all bits used
  • SI – Single Integer; 4 bytes
  • PDI – Partial Double Integer; 8 bytes; not all bits used
  • DI – Double Integer; 8 bytes (64-bits)
  • TI – Tetra Integer; 16 bytes (128-bits)
  • OI – Octa Integer; 32 bytes (256-bits)
  • XI – Hexadeca Integer; 64 bytes (512-bits)
  • QF – Quarter Floating; 1 byte quarter-precision floating-point
  • HF – Half Floating; 2 byte half-precision floating-point
  • TQF – Three Quarter Floating; 3 byte three-quarter-precision floating-point
  • SF – Single Floating; 4 byte single-precision floating-point
  • DF – Double Floating; 8 byte double-precision floating-point
  • XF – Extended Floating; 12 byte extended-precision floating-point
  • TF – Tetra Floating; 16 byte tetra-precision floating-point
  • SD – Single Decimal Floating; 4 byte (32-bit) decimal floating-point
  • DD – Double Decimal Floating; 8 byte (64-bit) decimal floating-point
  • TD – Tetra Decimal Floating; 16 byte (128-bit) decimal floating-point
  • CQI – Complex Quarter Integer; 1 byte
  • CHI – Complex Half Integer; 2 bytes
  • CSI – Complex Single Integer; 4 bytes
  • CDI – Complex Double Integer; 8 bytes
  • CTI – Complex Tetra Integer; 16 bytes
  • COI – Complex Octa Integer; 32 bytes
  • QC – Quarter Complex; 1 byte quarter-precision complex floating-point
  • HC – Half Complex; 2 byte half-precision complex floating-point
  • SC – Single Complex; 4 byte single-precision complex floating-point
  • DC – Double Complex; 8 byte double-precision complex floating-point
  • XC – Extended Complex; 12 byte extended-precision complex floating-point
  • TC – Tetra Complex; 16 byte tetra-precision complex floating-point
  • QQ – Quarter-Fractional; 1-byte signed fractional number
  • HQ – Half-Fractional; 2-byte signed fractional number
  • SQ – Single-Fractional; 4-byte (32-bit) signed fractional number
  • DQ – Double-Fractional; 8-byte (64-bit) signed fractional number
  • TQ – Tetra-Fractional; 16-byte (128-bit) signed fractional number
  • UQQ – Unsigned Quarter-Fractional; 1-byte unsigned fractional number
  • UHQ – Unsigned Half-Fractional; 2-byte unsigned fractional number
  • USQ – Unsigned Single-Fractional; 4-byte (32-bit) unsigned fractional number
  • UDQ – Unsigned Double-Fractional; 8-byte (64-bit) unsigned fractional number
  • UTQ – Unsigned Tetra-Fractional; 16-byte (128-bit) unsigned fractional number
  • HA – Half-Accumulator; 2-byte (16-bit) signed accumulator
  • SA – Single-Accumulator; 4-byte (32-bit) signed accumulator
  • DA – Double-Accumulator; 8-byte (64-bit) signed accumulator
  • TA – Tetra-Accumulator; 16-byte (128-bit) signed accumulator
  • UHA – Unsigned Half-Accumulator; 2-byte (16-bit) unsigned accumulator
  • USA – Unsigned Single-Accumulator; 4-byte (32-bit) unsigned accumulator
  • UDA – Unsigned Double-Accumulator; 8-byte (64-bit) unsigned accumulator
  • UTA – Unsigned Tetra-Accumulator; 16-byte (128-bit) unsigned accumulator
  • CC – Condition Code
  • BLK – Block
  • VOID – Void
  • P – Address mode
  • V4SI – Vector; 4 single integers
  • V8QI – Vector; 8 single-byte integers
  • BND32 – 32-bit pointer bound
  • BND64 – 64-bit pointer bound

About Pointers in C

The C programming language (and similar languages) uses a coding concept called “pointers”. A pointer does not contain the data itself. Rather, it stores the memory address where the data resides. Many people have trouble understanding pointers and addressing, so I hope this helps.

A pointer is a programming object that references a memory location. A pointer is like a page number in the index of a book. The page number is the data stored by the pointer. The words on the actual page are like the data in memory.

NOTE: “Dereferencing” is the act of obtaining the data in memory pointed to by the pointer.

A pointer should be declared with the same datatype as the data to which it points. Also, a pointer must be initialized before it is dereferenced.

To make a pointer that points to an integer in memory, use the code below.

int int_in_mem = 32; // Declare and initialize integer
int* int_ptr; // Declare
int_ptr = &int_in_mem; // Initialize
int* ptr2 = int_ptr; // Create a second pointer that points to the same data

In the example, “int_in_mem” is an integer in memory. “int_ptr” only stores the memory address of the location of “int_in_mem”. “int_in_mem” has the value “32” while “int_ptr” contains the value that indicates where in memory “int_in_mem” resides. “ptr2” points at the same memory location as “int_ptr”. This is helpful when a particular memory location must be remembered while the other pointer is changed to point to another location.

The ampersand (&) is the address-of operator. Therefore, “&int_in_mem” gives the memory address of the data “int_in_mem”. The code &int_ptr would give the address in memory where “int_ptr” is stored. Thus, it is possible to have a pointer that points to a pointer. Also, the ampersand is helpful if code needs to know the literal memory address of a particular variable.

In the example, “int_ptr” is the plain pointer. *int_ptr is the data at that memory location (in this case, “32”). *int_ptr++ reads the value of the memory location and then increments the pointer. This means that the pointer will point to the memory location that comes after the location that is storing the “32”. (*int_ptr)++ will increment the data stored at that location. Thus, “32” will become “33”, but the pointer itself remains unchanged. Below are various pointer notations and their meaning and effects.

Pointer      Memory Address                      Data
ptr          Access memory address               Unchanged
*ptr         Unchanged                           Access data in memory
*ptr++       Increment address after reading     Unchanged
*(ptr++)     Increment address after reading     Unchanged
(*ptr)++     Unchanged                           Increment data after reading
*++ptr       Increment address before reading    Unchanged
*(++ptr)     Increment address before reading    Unchanged
++*ptr       Unchanged                           Increment data before reading
++(*ptr)     Unchanged                           Increment data before reading
--*ptr       Unchanged                           Decrement data before reading
ptr*++       Invalid                             Invalid
ptr++*       Invalid                             Invalid

Be careful when declaring multiple pointers. For instance, int* ptr_a, ptr_b; is equivalent to int* ptr_a; int ptr_b;.

Array notation is closely related to pointer notation, as seen in the table below (here, ptr points to the first element of array).

Array        Pointer Equivalent
array        ptr
array[1]     *(ptr + 1)
array[2]     *(ptr + 2)
array[1]     *(array + 1)

An array is a collection of elements of data. Arrays are stored in memory and pointers are used to point to and retrieve members/elements from the array. For instance, array[0] is the first element of the array (computers start counting at zero), and &array[0] is a pointer to that element. In many languages (such as C), strings are arrays of characters.

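With two pointers into the same array, the distance between elements (and thus the length of an array or string) can be measured. For instance, the code below creates a string (character array) and measures the distance between two of its elements.

char ALPHABET[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"; // The literal already ends with a '\0' terminator
char* first_ptr = &ALPHABET[0];
char* second_ptr = &ALPHABET[5];
int distance = (int)(second_ptr - first_ptr); // Pointer subtraction counts elements, not bytes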

distance will contain the value “5” since the memory address pointed to by “second_ptr” minus the address “first_ptr” is five. Thus, there are five elements from the first element of the array/string ([0]) to the sixth element ([5]). Remember, computers start counting at zero.

Now for another example.

int int_in_mem = 32; // Declare and initialize integer
int* int_ptr; // Declare
int_ptr = &int_in_mem; // Initialize
int* ptr2 = int_ptr; // Create a second pointer that points to the same data
int our_num = *ptr2; // Access data at the memory location indicated by ptr2
int num2 = (*int_ptr)++; // Read the data indicated by int_ptr, then increment the data in memory

In the above example, int_in_mem initially stores “32” while both int_ptr and ptr2 store the location of int_in_mem. our_num contains “32” because the code retrieves the data stored at the pointed memory location. num2 also contains “32” because the postfix increment yields the original value before incrementing; afterward, int_in_mem itself holds “33”.

To change the value of data in memory, use the code below, which changes “32” to “7”. If the code were int_ptr = 7; instead of *int_ptr = 7;, then int_ptr would be set to memory address 7 instead of changing the data (and the compiler would complain about the missing cast).

int int_in_mem = 32; // Initialize and declare integer
int* int_ptr; // Declare
int_ptr = &int_in_mem; // Initialize
*int_ptr = 7; // Place "7" at the pointed memory address

As for pointers to pointers, in the code below, d points to c, which points to b, which points to the memory address that stores the “3” placed in a. Thus, ***d yields “3”.

int a = 3;
int *b = &a;
int **c = &b;
int ***d = &c;

A NULL pointer (such as int* ptr = 0;) is a pointer whose value is zero. This means that it does not point to any data and must not be dereferenced.

A function pointer is a pointer to a function. Function pointers allow functions to be passed as arguments, which lets the caller tell a function which other function to use.

void test_func(int x) { printf("%d\n", x); } // Function
void (*func_ptr)(int); // Declare function pointer
func_ptr = &test_func; // Initialize function pointer
func_ptr(2); // Same as test_func(2)

Type Qualifiers in C

There are many different type-qualifiers in the C programming language. Here is a brief description of some type qualifiers in C (and similar languages).

Remember the “Clockwise/Spiral Rule”: start at the identifier and read the declaration backwards (right to left).

  • int* – pointer to int
  • int const* – pointer to const int (const int* == int const*)
  • int *const – const pointer to int
  • int const *const – const pointer to const int (const int* const == int const *const)
  • int ** – pointer to pointer to int
  • int **const – const pointer to pointer to int
  • int *const * – pointer to const pointer to int
  • int const ** – pointer to pointer to const int
  • int *const *const – const pointer to const pointer to int
  • volatile int *const – const pointer to volatile int
  • void (*signal(int, void (*fp)(int)))(int) – signal is a function taking an int and a pointer to a function (taking an int and returning void), and returning a pointer to a function taking an int and returning void

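Storage classes (listed below) conventionally come before type-qualifiers (also listed below). Only one storage class can be used when declaring a variable (except that _Thread_local may be combined with static or extern). Both storage classes and type qualifiers usually come before the datatype, although a qualifier may also follow it (as in int const).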

Storage Classes

  • auto – Stored on the stack for the duration of the code block (the default for local variables)
  • extern – Lasts the whole program, block, or compilation unit; globally visible
  • register – Hint to keep the variable in a CPU register (otherwise on the stack) during the code block
  • static – Lasts the whole program, block, or compilation unit; private in program
  • typedef – Declares a new name for a datatype
  • __thread – Thread-local storage; one instance per thread (GCC extension)
  • _Thread_local – Thread-local storage; one instance per thread (C11 keyword)

Type-Qualifiers

  • const – Value does not change; read-only
  • restrict – For the lifetime of the pointer, the object can only be accessed via the pointer
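  • volatile – The value may change outside the program's control; optimizing compilers must not remove or cache reads and writes
  • _Atomic – Reads and writes of the variable are atomic (indivisible); where the processor allows, the variable maps to a basic built-in type so that each access happens in a single instruction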