Abstract—This paper presents a new algorithm that speeds up OpenCV’s addWeighted function for blending images. We propose two implementations: one using SSE (Streaming SIMD Extensions) and one using AVX (Advanced Vector Extensions), which speed up the function by 3.49x and 5.77x, respectively. The multi-core version of our algorithm uses load balancing to distribute the work among user threads while preserving the memory alignment required by each SIMD instruction type. This approach runs 23.08x faster than the original implementation in the OpenCV library.
Index Terms—Image blending, multicore programming, AVX.
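As a rough illustration of the kind of vectorization the abstract describes, the sketch below blends two buffers with the addWeighted formula, dst = src1*alpha + src2*beta + gamma, first in scalar form and then four floats at a time with SSE intrinsics. It is not the authors' implementation: the function names `blend_scalar` and `blend_sse` are hypothetical, the pixel data is assumed to be 32-bit float (so no saturation step is needed), and unaligned loads are used for simplicity, whereas the paper keeps buffers aligned for each SIMD instruction type.

```cpp
#include <cassert>
#include <cstddef>
#include <xmmintrin.h>  // SSE intrinsics (x86 / x86-64 only)

// Scalar reference: dst[i] = src1[i]*alpha + src2[i]*beta + gamma.
// Hypothetical helper, assuming float pixel data.
void blend_scalar(const float* src1, float alpha,
                  const float* src2, float beta,
                  float gamma, float* dst, std::size_t n) {
    assert(src1 && src2 && dst);
    for (std::size_t i = 0; i < n; ++i) {
        dst[i] = src1[i] * alpha + src2[i] * beta + gamma;
    }
}

// SSE sketch: processes 4 floats per iteration, then finishes the
// remaining 0-3 elements with scalar code.
// Assumption: unaligned loads/stores, so any buffer address is accepted.
void blend_sse(const float* src1, float alpha,
               const float* src2, float beta,
               float gamma, float* dst, std::size_t n) {
    assert(src1 && src2 && dst);
    const __m128 va = _mm_set1_ps(alpha);  // broadcast alpha to 4 lanes
    const __m128 vb = _mm_set1_ps(beta);   // broadcast beta
    const __m128 vg = _mm_set1_ps(gamma);  // broadcast gamma

    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        const __m128 a = _mm_loadu_ps(src1 + i);
        const __m128 b = _mm_loadu_ps(src2 + i);
        const __m128 r = _mm_add_ps(
            _mm_add_ps(_mm_mul_ps(a, va), _mm_mul_ps(b, vb)), vg);
        _mm_storeu_ps(dst + i, r);
    }
    for (; i < n; ++i) {  // scalar tail
        dst[i] = src1[i] * alpha + src2[i] * beta + gamma;
    }
}
```

An AVX variant would follow the same pattern with 8-float `__m256` registers, and a multi-core version would split the buffer into per-thread ranges whose boundaries respect the required alignment; both are left out here to keep the sketch short.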
Panyayot Chaikan is with the Department of Computer Engineering, Faculty of Engineering, Prince of Songkla University, Thailand (e-mail: panyayot@coe.psu.ac.th).
Somsak Mitatha is with the Department of Computer Engineering, Faculty of Engineering, King Mongkut’s Institute of Technology Ladkrabang, Thailand (e-mail: kmsomsak@kmitl.ac.th).
Cite: Panyayot Chaikan and Somsak Mitatha, "Improving the Addweighted Function in OpenCV 3.0 Using SSE and AVX Intrinsics," International Journal of Engineering and Technology, vol. 9, no. 1, pp. 45-49, 2017.