SIMD is such a pain.

This forum is for developers of Rack Extensions to discuss the RE SDK, share code, and offer tips to other developers.
Murf
RE Developer
Posts: 656
Joined: 21 Jun 2019
Location: Brisbane, Australia

05 Apr 2023

Hi All,
I am trying to port a module to RE and it relies heavily on SIMD structures and instructions like:
__m128
_mm_castsi128_ps
_mm_set_ps
_mm_storeu_ps

Has anyone had any luck porting code that packs floats into these kinds of structures, or should I give up?

Murf.

buddard
RE Developer
Posts: 1245
Joined: 17 Jan 2015
Location: Stockholm

05 Apr 2023

SIMD is available in Jukebox, even though the syntax looks a bit different:

https://developer.reasonstudios.com/doc ... types_simd

I used SIMD pretty heavily in Resonans; it made the modal resonator mode way more efficient.
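To give a rough idea of what "the syntax looks a bit different" means, here is a minimal sketch. It assumes the Jukebox SIMD types behave like Clang/GCC vector-extension types, as the typedefs posted further down this thread suggest. `float4`, `scale_and_offset` and `horizontal_sum` are made-up names for illustration, and plain `float` stands in for `TJBox_Float32`.

```cpp
// Hypothetical stand-in for a Jukebox SIMD type: four 32-bit floats in one
// 16-byte vector, declared with the Clang/GCC vector extension.
typedef float float4 __attribute__((__vector_size__(16)));

// Intrinsics style would be: _mm_add_ps(_mm_mul_ps(x, gain), offset).
// With vector-extension types the ordinary operators work lane by lane.
static inline float4 scale_and_offset(float4 x, float4 gain, float4 offset) {
    return x * gain + offset;
}

// Sum the four lanes using plain subscripting, which these types also allow.
static inline float horizontal_sum(float4 v) {
    return v[0] + v[1] + v[2] + v[3];
}
```

So most arithmetic intrinsics collapse to one-line operator expressions; the awkward ones tend to be shuffles, masks and lane ordering.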

Murf

05 Apr 2023

You legend thanks :)

Murf

14 Apr 2023

I am having fun with SIMD, and have been writing a bridge between Reason SIMD and xmmintrin/emmintrin/pmmintrin.
I will make it publicly available when done.

My current struggle is with comparisons (the functions that return masks).
At first I tried:

Code: Select all

// Lane-wise compare: each lane of the result is all ones where a > b, else zero.
inline static __m128 _mm_cmpgt_ps(__m128 a, __m128 b) {
	return a > b;
}
This produced trash, so I then tried a non-SIMD version based on the definition of the function:

Code: Select all

	TJBox_Float32 x0 = (TJBox_Float32)(a[0] > b[0] ? 0xffffffff : 0x0);
	TJBox_Float32 x1 = (TJBox_Float32)(a[1] > b[1] ? 0xffffffff : 0x0);
	TJBox_Float32 x2 = (TJBox_Float32)(a[2] > b[2] ? 0xffffffff : 0x0);
	TJBox_Float32 x3 = (TJBox_Float32)(a[3] > b[3] ? 0xffffffff : 0x0);
	return (__m128) {x0,x1,x2,x3};
Still trash.
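One reason a scalar version like this cannot produce a mask, whatever else is going on in the bridge: casting `0xffffffff` to a float is a numeric conversion, so the lane holds the value 4294967296.0 instead of an all-ones bit pattern. A minimal sketch of the difference follows, assuming Clang/GCC vector extensions. `float4`, `int4` and the function names are hypothetical, and plain `float`/`int32_t` stand in for the TJBox types.

```cpp
#include <cstdint>
#include <cstring>

typedef float   float4 __attribute__((__vector_size__(16)));
typedef int32_t int4   __attribute__((__vector_size__(16)));

// Value conversion: turns the integer 0xffffffff into the float 4294967296.0f,
// whose bit pattern is 0x4f800000, not an all-ones mask.
static inline uint32_t bits_of_value_cast() {
    float f = (float)0xffffffffu;
    uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}

// Scalar reference for a greater-than mask: build integer lanes first, then
// reinterpret the 16 bytes as floats (a bit cast, no numeric conversion).
static inline float4 cmpgt_mask_scalar(float4 a, float4 b) {
    int4 m = {a[0] > b[0] ? -1 : 0, a[1] > b[1] ? -1 : 0,
              a[2] > b[2] ? -1 : 0, a[3] > b[3] ? -1 : 0};
    return (float4)m;
}

// Read one lane's raw bits back out for inspection.
static inline uint32_t lane_bits(float4 v, int lane) {
    int4 m = (int4)v;
    return (uint32_t)m[lane];
}
```

Checking `lane_bits` on the result is a handy way to debug mask functions, since printing a mask lane as a float just shows NaN or 0.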

If anyone has pulled this off, please let me know. As I said, it will be made publicly available, but I don't expect anyone to divulge what they consider hard-earned private IP :)

By the way, I have just defined the types per the Reason doco, and this has been working great for most intrinsic functions so far:

Code: Select all

typedef TJBox_Float32 __m128 __attribute__((__vector_size__(16)));
typedef TJBox_Int32 __m128i __attribute__((__vector_size__(16)));
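For anyone wondering what bridge functions on top of typedefs like these might look like, here is a rough sketch of a few of them. It is only an illustration under assumptions: the `bridge_` names are hypothetical (chosen so they do not clash with the real intrinsics headers), `float4`/`int4` stand in for `__m128`/`__m128i`, and plain `float`/`int32_t` stand in for the TJBox types. The casts are written out explicitly so the code does not depend on the compiler allowing implicit vector conversions.

```cpp
#include <cstdint>
#include <cstring>

typedef float   float4 __attribute__((__vector_size__(16)));
typedef int32_t int4   __attribute__((__vector_size__(16)));

// Stand-in for _mm_castsi128_ps: reinterpret the bits, no numeric conversion.
static inline float4 bridge_castsi128_ps(int4 a) {
    return (float4)a;
}

// Stand-in for _mm_storeu_ps: memcpy makes no alignment assumption about dst.
static inline void bridge_storeu_ps(float* dst, float4 a) {
    std::memcpy(dst, &a, sizeof a);
}

// Stand-in for _mm_cmpgt_ps: the vector compare yields integer lanes of
// -1 or 0, which are then reinterpreted as a float mask.
static inline float4 bridge_cmpgt_ps(float4 a, float4 b) {
    return (float4)(a > b);
}

// Stand-in for _mm_and_ps: bitwise AND on the integer view of both vectors.
static inline float4 bridge_and_ps(float4 a, float4 b) {
    return (float4)((int4)a & (int4)b);
}
```

A typical use is `bridge_and_ps(a, bridge_cmpgt_ps(a, b))`, which keeps the lanes of `a` that are greater than `b` and zeroes the rest.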

Murf.

Murf

15 Apr 2023

OK, for anyone interested, I worked this out.
In a nutshell, "return a > b;" works perfectly for the compare function.
What I was doing wrong is this:

Code: Select all

inline static __m128 _mm_set_ps (float w, float x, float y, float z ) {
	return (__m128){w,x,y,z};
}
This is correct (positions 1 and 3 swapped, positions 2 and 4 swapped):

Code: Select all

inline static __m128 _mm_set_ps (float w, float x, float y, float z ) {
	return (__m128){y,z,w,x};
}
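One thing worth checking the swap above against: Intel's documentation describes `_mm_set_ps(e3, e2, e1, e0)` as putting the last argument, `e0`, in the lowest lane, which is a full reversal of the argument list and not a swap of the two halves. Below is a minimal sketch of that documented ordering, assuming Clang/GCC vector extensions. The `bridge_` names and `float4` are hypothetical stand-ins, and whether this or the swap above matches what a particular ported module expects would need testing.

```cpp
typedef float float4 __attribute__((__vector_size__(16)));

// Hypothetical stand-in for _mm_set_ps(e3, e2, e1, e0). Intel documents the
// LAST argument as lane 0, so the initializer lists the arguments in reverse.
static inline float4 bridge_set_ps(float e3, float e2, float e1, float e0) {
    float4 v = {e0, e1, e2, e3};
    return v;
}

// Hypothetical stand-in for _mm_setr_ps, which takes lanes in memory order.
static inline float4 bridge_setr_ps(float e0, float e1, float e2, float e3) {
    float4 v = {e0, e1, e2, e3};
    return v;
}
```

With this ordering, `bridge_set_ps(4, 3, 2, 1)` and `bridge_setr_ps(1, 2, 3, 4)` build the same vector, and storing either to memory gives 1, 2, 3, 4.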
Murf
