How can vectorization techniques be used to accelerate the conversion of an IPv4 address from a string to an integer?

DDD
Release: 2024-11-15 16:49:03

Fastest Way to Obtain IPv4 Address from String

Original Code In Question:

UINT32 GetIP(const char *p)
{
    UINT32 dwIP=0,dwIP_Part=0;
    while(true)
    {
        if(p[0] == 0)
        {
            dwIP = (dwIP << 8) | dwIP_Part;
            break;
        }
        if(p[0]=='.')
        {
            dwIP = (dwIP << 8) | dwIP_Part;
            dwIP_Part = 0;
            p++;
        }
        dwIP_Part = (dwIP_Part*10)+(p[0]-'0');
        p++;
    }
    return dwIP;
}

Faster Vectorized Solution:

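The version below uses x86 SIMD intrinsics to convert all four octets at once instead of looping over characters. It needs SSE3 (_mm_lddqu_si128), SSSE3 (_mm_shuffle_epi8, _mm_maddubs_epi16, _mm_hadd_epi16) and SSE4.1 (_mm_extract_epi32), and it relies on a lookup table, shuffleTable, that must be filled once before the first call: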

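#include <stdint.h>
#include <immintrin.h>   // SSE3, SSSE3 and SSE4.1 intrinsics

typedef uint32_t UINT32; // omit if <windows.h> already provides UINT32

// Declaration not shown in the source; 65536 entries cover every 16-bit mask.
// Filled by MyInit().
__m128i shuffleTable[65536];

// Note: always reads 16 bytes from str and performs no validation.
UINT32 MyGetIP(const char *str) {
    // Load 16 bytes and subtract '0': digits become 0..9, while '.' and the
    // terminating NUL wrap around to values with the top bit set
    __m128i input = _mm_lddqu_si128((const __m128i*)str);
    input = _mm_sub_epi8(input, _mm_set1_epi8('0'));

    // Collect the top bit of every byte into a 16-bit mask of separator
    // positions and look up the matching shuffle control
    __m128i cmp = input;
    UINT32 mask = _mm_movemask_epi8(cmp);
    __m128i shuf = shuffleTable[mask];
    // Move each octet into its own 4-byte group, units digit first;
    // bytes selected with index -1 become zero
    __m128i arr = _mm_shuffle_epi8(input, shuf);

    // Decimal weights within each 4-byte group: 1, 10, 100, 0
    __m128i coeffs = _mm_set_epi8(0, 100, 10, 1, 0, 100, 10, 1, 0, 100, 10, 1, 0, 100, 10, 1);

    // Multiply digits by weights and add adjacent pairs, then add the two
    // partial sums of each octet
    __m128i prod = _mm_maddubs_epi16(coeffs, arr);
    prod = _mm_hadd_epi16(prod, prod);

    // Gather the low byte of each 16-bit octet value into the low 4 bytes
    __m128i imm = _mm_set_epi8(-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 6, 4, 2, 0);
    prod = _mm_shuffle_epi8(prod, imm);

    // Extract the low 32 bits (SSE4.1)
    return _mm_extract_epi32(prod, 0);
}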
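
To make the mask step easier to follow, here is a minimal scalar sketch of what the subtraction and _mm_movemask_epi8 compute together. It is an illustration only, not part of the original solution: SeparatorMask and PadTo16 are hypothetical helper names, and the input is assumed to be zero-padded to 16 bytes.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copy a short string into a zero-padded 16-byte block,
// so that a 16-byte load never reads past the end of the source buffer.
void PadTo16(const char* str, char block[16]) {
    size_t n = std::strlen(str);
    assert(n <= 15);                 // "255.255.255.255" is the longest input
    std::memset(block, 0, 16);
    std::memcpy(block, str, n);
}

// Scalar model of _mm_sub_epi8(input, '0') followed by _mm_movemask_epi8:
// bit i of the result is the top bit of (block[i] - '0') taken modulo 256.
uint32_t SeparatorMask(const char block[16]) {
    uint32_t mask = 0;
    for (int i = 0; i < 16; i++) {
        uint8_t shifted = uint8_t(uint8_t(block[i]) - uint8_t('0'));
        if (shifted & 0x80)          // '.' gives 254, NUL gives 208
            mask |= 1u << i;
    }
    return mask;
}
```

For the zero-padded input "192.167.1.3" the set bits are 3, 7 and 9 (the dots) and 11 through 15 (the terminator and padding), so the mask is 0xFA88. With different bytes after the terminator the upper bits change, which is why the table has to cover every pattern of trailing bytes.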

Precalculation of ShuffleTable:

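// Fill shuffleTable for every layout of four octets of 1 to 3 digits each.
void MyInit() {
    int len[4];   // digit count of each octet
    for (len[0] = 1; len[0] <= 3; len[0]++)
        for (len[1] = 1; len[1] <= 3; len[1]++)
            for (len[2] = 1; len[2] <= 3; len[2]++)
                for (len[3] = 1; len[3] <= 3; len[3]++) {
                    // Bytes consumed: all digits, three dots and the terminating NUL
                    int slen = len[0] + len[1] + len[2] + len[3] + 4;
                    // Bytes after the terminator may hold arbitrary data
                    int rem = 16 - slen;
                    // Enumerate every top-bit pattern of those trailing bytes
                    for (int rmask = 0; rmask < 1<<rem; rmask++) {
                        int mask = 0;
                        // -1 makes _mm_shuffle_epi8 write a zero byte
                        char shuf[16] = {-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1};
                        int pos = 0;
                        for (int i = 0; i < 4; i++) {
                            for (int j = 0; j < len[i]; j++) {
                                // Octet i goes to bytes (3-i)*4 .. (3-i)*4+3, units digit first
                                shuf[(3-i) * 4 + (len[i]-1-j)] = pos;
                                pos++;
                            }
                            mask ^= (1<<pos);   // separator ('.' or NUL) after octet i
                            pos++;
                        }
                        mask ^= (rmask<<slen);
                        _mm_store_si128(&shuffleTable[mask], _mm_loadu_si128((__m128i*)shuf));
                    }
                }
}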
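
The effect of a single table entry can be checked without SIMD. The following is a scalar sketch under stated assumptions: BuildShuffleRow and ApplyShuffleRow are hypothetical names introduced here, the first mirrors the inner loops of MyInit for one combination of octet lengths, and the second stands in for the shuffle, multiply-add and horizontal-add steps of MyGetIP. It assumes well-formed input.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helper: one shuffle row for the given octet lengths,
// built the same way as the inner loops of MyInit.
std::array<int8_t, 16> BuildShuffleRow(const int len[4]) {
    std::array<int8_t, 16> shuf;
    shuf.fill(-1);                       // -1 means "produce a zero byte"
    int pos = 0;                         // current position in the input text
    for (int i = 0; i < 4; i++) {
        assert(len[i] >= 1 && len[i] <= 3);
        for (int j = 0; j < len[i]; j++) {
            shuf[(3 - i) * 4 + (len[i] - 1 - j)] = static_cast<int8_t>(pos);
            pos++;
        }
        pos++;                           // skip the '.' or NUL after octet i
    }
    return shuf;
}

// Scalar stand-in for the shuffle, multiply-add and horizontal-add steps.
uint32_t ApplyShuffleRow(const char* text, const std::array<int8_t, 16>& shuf) {
    static const uint32_t weight[4] = {1, 10, 100, 0};
    uint32_t ip = 0;
    for (int group = 0; group < 4; group++) {
        uint32_t value = 0;
        for (int b = 0; b < 4; b++) {
            int8_t src = shuf[group * 4 + b];
            uint32_t digit = (src < 0) ? 0 : uint32_t(text[src] - '0');
            value += weight[b] * digit;
        }
        ip |= value << (8 * group);      // group 3 holds the first octet
    }
    return ip;
}
```

With lengths {3, 3, 1, 1} and the text "192.167.1.3", the row places the digits of 192 in bytes 12 to 14 and the single digit 3 in byte 0, and the weighted sums combine to 0xC0A70103, the same value the scalar GetIP returns for that address.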

Evaluation:

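In the reported benchmark, the vectorized version runs 7.8 times faster than the original code and processes approximately 336 million IP addresses per second on a single core of a 3.4 GHz processor. The speedup comes from replacing the per-character loop and its branches with a fixed sequence of SIMD instructions and one table lookup. Keep in mind that MyInit must be called once before the first call to MyGetIP, that MyGetIP always reads 16 bytes from the input pointer, and that it does not validate the input.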
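
For readers who want to take a comparable measurement on their own hardware, here is a minimal timing sketch. It is not the benchmark behind the figures above: AddressesPerSecond and ParseScalar are hypothetical names, ParseScalar is a compact equivalent of the original GetIP, and results will vary with compiler, flags and input data.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <string>
#include <vector>

// Compact scalar parser equivalent to the original GetIP (no validation).
uint32_t ParseScalar(const char* p) {
    uint32_t ip = 0, part = 0;
    for (; *p; ++p) {
        if (*p == '.') { ip = (ip << 8) | part; part = 0; }
        else           { part = part * 10 + uint32_t(*p - '0'); }
    }
    return (ip << 8) | part;
}

// Times `rounds` passes over `inputs` and returns addresses parsed per second.
// The checksum is written out so the compiler cannot discard the parsing work.
double AddressesPerSecond(uint32_t (*parse)(const char*),
                          const std::vector<std::string>& inputs,
                          int rounds, uint32_t* checksum) {
    assert(rounds > 0 && !inputs.empty());
    uint32_t sum = 0;
    auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < rounds; r++)
        for (const std::string& s : inputs)
            sum += parse(s.c_str());
    auto stop = std::chrono::steady_clock::now();
    *checksum = sum;
    double seconds = std::chrono::duration<double>(stop - start).count();
    return double(rounds) * double(inputs.size()) / seconds;
}
```

Passing MyGetIP instead of ParseScalar gives the vectorized figure, provided each input is stored in a buffer of at least 16 readable bytes, since the vectorized function loads a full 16-byte block regardless of string length.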
