Hi All,
I am trying to port a module to RE and it relies heavily on SIMD structures and instructions like:
__m128
_mm_castsi128_ps
_mm_set_ps
_mm_storeu_ps
Has anyone had any luck porting code that packs floats into these types of structures before, or should I give up?
Murf.
SIMD is such a pain.
SIMD is available in Jukebox, even though the syntax looks a bit different:
https://developer.reasonstudios.com/doc ... types_simd
I used SIMD pretty heavily in Resonans, it made the modal resonator mode way more efficient.
You legend, thanks.
buddard wrote: ↑05 Apr 2023
SIMD is available in Jukebox, even though the syntax looks a bit different:
https://developer.reasonstudios.com/doc ... types_simd
I used SIMD pretty heavily in Resonans, it made the modal resonator mode way more efficient.
I am having fun with SIMD, and have been writing a bridge between Reason SIMD and xmmintrin/emmintrin/pmmintrin.
I will make it publicly available when done.
My current struggle is with comparisons (the functions that return the masks).
At first I tried:
Code:
inline static __m128 _mm_cmpgt_ps(__m128 a, __m128 b) {
    return a > b;
}
This produced trash, so then I tried a non-SIMD version based on the definition of the function:
Code:
TJBox_Float32 x0 = (TJBox_Float32)(a[0] > b[0] ? 0xffffffff : 0x0);
TJBox_Float32 x1 = (TJBox_Float32)(a[1] > b[1] ? 0xffffffff : 0x0);
TJBox_Float32 x2 = (TJBox_Float32)(a[2] > b[2] ? 0xffffffff : 0x0);
TJBox_Float32 x3 = (TJBox_Float32)(a[3] > b[3] ? 0xffffffff : 0x0);
return (__m128) {x0,x1,x2,x3};
Still trash.
If anyone has pulled this off please let me know. As I said, it will be made publicly available, but I don't expect anyone to divulge what they consider hard-earned private IP.
By the way, I have just defined the types per the Reason docs, and this has been working great for most intrinsic functions so far:
Code:
typedef TJBox_Float32 __m128 __attribute__((__vector_size__(16)));
typedef TJBox_Int32 __m128i __attribute__((__vector_size__(16)));
Murf.
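For readers following along, here is a minimal sketch of how comparison masks behave with GCC/Clang vector extensions, which is what the typedefs above appear to rely on. The names `f32x4`, `i32x4`, `cmpgt_mask`, `cast_i32_to_f32`, `select_f32` and `converted_all_ones` are hypothetical and not part of the Jukebox SDK or the Intel headers. It also illustrates one likely reason the scalar attempt failed: casting `0xffffffff` to a float converts the number (giving 4294967296.0) rather than producing the all-ones bit pattern that a mask lane holds.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the thread's __m128 / __m128i typedefs. */
typedef float   f32x4 __attribute__((vector_size(16)));
typedef int32_t i32x4 __attribute__((vector_size(16)));

/* Lane-wise a > b: each result lane is -1 (all bits set) or 0. */
static inline i32x4 cmpgt_mask(f32x4 a, f32x4 b) {
    return a > b;
}

/* Reinterpret the 128 bits as floats without converting the values
   (the role _mm_castsi128_ps plays in the Intel API). */
static inline f32x4 cast_i32_to_f32(i32x4 v) {
    f32x4 out;
    memcpy(&out, &v, sizeof out);
    return out;
}

/* Use the mask to pick lanes from a where it is set, from b elsewhere. */
static inline f32x4 select_f32(i32x4 mask, f32x4 a, f32x4 b) {
    i32x4 ai, bi;
    memcpy(&ai, &a, sizeof ai);
    memcpy(&bi, &b, sizeof bi);
    return cast_i32_to_f32((ai & mask) | (bi & ~mask));
}

/* What the scalar attempt computed per lane: a numeric conversion,
   which rounds 4294967295 up to 4294967296.0f, not a bit pattern. */
static inline float converted_all_ones(void) {
    return (float)0xffffffffu;
}
```

Keeping the mask as an integer vector until the point of use, and reinterpreting its bits only when a float-typed result is required, avoids depending on implicit vector conversions, whose behaviour can vary between compilers and flags.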
Ok, for anyone interested, I worked this out.
In a nutshell, "return a > b;" works perfectly for the compare function.
What I was doing wrong is this:
Code:
inline static __m128 _mm_set_ps(float w, float x, float y, float z) {
    return (__m128){w, x, y, z};
}
This is correct: (positions 1,3 and 2,4 swapped)
Code:
inline static __m128 _mm_set_ps(float w, float x, float y, float z) {
    return (__m128){y, z, w, x};
}
Murf
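As a cross-check on lane ordering, here is a small sketch of the order Intel documents for `_mm_set_ps`: the last argument goes into lane 0, so the arguments end up in memory fully reversed, while `_mm_setr_ps` takes them in memory order. This is not the same ordering as the swap reported above, so which one a given bridge needs may depend on how its other functions index lanes; treat this as something to test against rather than a drop-in fix. The names `f32x4`, `set_ps_intel_order`, `setr_ps` and `storeu_ps` are hypothetical.

```c
#include <string.h>

/* Hypothetical stand-in for the thread's __m128 typedef. */
typedef float f32x4 __attribute__((vector_size(16)));

/* Intel documents _mm_set_ps(e3, e2, e1, e0) as placing e0 in lane 0,
   so the arguments land in memory in reverse order. */
static inline f32x4 set_ps_intel_order(float e3, float e2, float e1, float e0) {
    return (f32x4){e0, e1, e2, e3};
}

/* _mm_setr_ps is the "reversed" variant: arguments in memory order. */
static inline f32x4 setr_ps(float e0, float e1, float e2, float e3) {
    return (f32x4){e0, e1, e2, e3};
}

/* Unaligned store of all four lanes, in the spirit of _mm_storeu_ps. */
static inline void storeu_ps(float *dst, f32x4 v) {
    memcpy(dst, &v, sizeof v);
}
```

Storing a vector built from known values and printing the four floats is a quick way to confirm which ordering a port actually produces.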