Bitcoin Forum
Author Topic: scrypt130511.cl EndianSwap optimization  (Read 1826 times)
clyfish (OP)
February 03, 2014, 08:04:34 AM
Last edit: February 03, 2014, 02:02:32 PM by clyfish

Hello,

Today I optimized EndianSwap in scrypt130511.cl a little by getting it to compile to the v_bfi_b32 instruction.
For me, it raised my average hashrate from 408 kh/s to 415 kh/s.

Search for EndianSwap in scrypt130511.cl and replace
Code:
#define EndianSwap(n) (rotl(n & ES[0], 24U)|rotl(n & ES[1], 8U))
With
Code:
#define EndianSwap(n) (Ch(ES[0], rotl(n, 8U), rotl(n, 24U)))
Then delete the cached *.bin kernel files and restart cgminer so the kernel gets recompiled.
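
If you want to convince yourself that the two macros return the same value before touching your miner, here is a minimal host-side sketch in plain C. It assumes ES[0] is 0x00FF00FF and ES[1] is 0xFF00FF00 (the constants visible in the disassembly), and that Ch(x, y, z) behaves like a bit select that takes bits of y where x is 1 and bits of z where x is 0. The names rotl32, ch, endian_swap_old, endian_swap_new and count_mismatches are made up for this check and are not part of the kernel.

```c
#include <stdint.h>

/* Assumed values of the kernel's ES[] table (taken from the disassembly). */
const uint32_t ES[2] = { 0x00FF00FFu, 0xFF00FF00u };

/* Host-side stand-in for the OpenCL rotate: rotate left by r bits. */
uint32_t rotl32(uint32_t x, unsigned r)
{
    r &= 31u;
    return r ? (x << r) | (x >> (32u - r)) : x;
}

/* Assumed Ch semantics: bits of y where mask is 1, bits of z where it is 0. */
uint32_t ch(uint32_t mask, uint32_t y, uint32_t z)
{
    return (y & mask) | (z & ~mask);
}

/* Original macro: mask first, then rotate each half, then OR. */
uint32_t endian_swap_old(uint32_t n)
{
    return rotl32(n & ES[0], 24u) | rotl32(n & ES[1], 8u);
}

/* Replacement macro: rotate first, then one bit select. */
uint32_t endian_swap_new(uint32_t n)
{
    return ch(ES[0], rotl32(n, 8u), rotl32(n, 24u));
}

/* Counts sampled inputs on which the two variants disagree. */
unsigned count_mismatches(void)
{
    unsigned bad = 0;
    uint32_t n = 0x12345678u;
    for (int i = 0; i < 100000; ++i) {
        if (endian_swap_old(n) != endian_swap_new(n))
            ++bad;
        n = n * 1664525u + 1013904223u; /* simple LCG to vary the input */
    }
    return bad;
}
```

Both versions should turn 0xAABBCCDD into 0xDDCCBBAA. The only difference is that the new form masks after rotating, which is what lets the compiler fold the two ANDs and the OR into a single bit-field insert.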

Before optimization, EndianSwap compiles to:
Code:
v_and_b32     v2, 0x00ff00ff, v1
v_and_b32     v1, 0xff00ff00, v1
v_alignbit_b32  v2, v2, v2, 8
v_alignbit_b32  v1, v1, v1, 24
v_or_b32      v1, v2, v1
After:
Code:
v_alignbit_b32  v2, v1, v1, 24
v_alignbit_b32  v1, v1, v1, 8
s_mov_b32     s0, 0x00ff00ff
v_bfi_b32     v1, s0, v2, v1
That takes it from 5 VALU instructions down to 3 VALU instructions plus 1 SALU instruction.
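
To show that the two listings do the same work, here is a rough C model of each instruction sequence. It is only a sketch based on my reading of the GCN ISA: v_alignbit_b32 is modeled as a right shift of the 64-bit pair {src0, src1}, and v_bfi_b32 as (src0 & src1) | (~src0 & src2). The function names alignbit, bfi, swap_before and swap_after are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Model of v_alignbit_b32: shift the 64-bit pair {hi, lo} right, keep low 32 bits. */
uint32_t alignbit(uint32_t hi, uint32_t lo, unsigned shift)
{
    uint64_t pair = ((uint64_t)hi << 32) | lo;
    return (uint32_t)(pair >> (shift & 31u));
}

/* Model of v_bfi_b32: bits of a where mask is 1, bits of b where it is 0. */
uint32_t bfi(uint32_t mask, uint32_t a, uint32_t b)
{
    return (mask & a) | (~mask & b);
}

/* The 5-instruction sequence from the "before" listing. */
uint32_t swap_before(uint32_t v1)
{
    uint32_t v2 = 0x00ff00ffu & v1;   /* v_and_b32      v2, 0x00ff00ff, v1 */
    v1 = 0xff00ff00u & v1;            /* v_and_b32      v1, 0xff00ff00, v1 */
    v2 = alignbit(v2, v2, 8);         /* v_alignbit_b32 v2, v2, v2, 8      */
    v1 = alignbit(v1, v1, 24);        /* v_alignbit_b32 v1, v1, v1, 24     */
    return v2 | v1;                   /* v_or_b32       v1, v2, v1         */
}

/* The 4-instruction sequence from the "after" listing. */
uint32_t swap_after(uint32_t v1)
{
    uint32_t v2 = alignbit(v1, v1, 24); /* v_alignbit_b32 v2, v1, v1, 24 */
    v1 = alignbit(v1, v1, 8);           /* v_alignbit_b32 v1, v1, v1, 8  */
    uint32_t s0 = 0x00ff00ffu;          /* s_mov_b32      s0, 0x00ff00ff */
    return bfi(s0, v2, v1);             /* v_bfi_b32      v1, s0, v2, v1 */
}
```

With the same register fed to both source operands, alignbit acts as a rotate right, so a shift of 24 is a rotate left by 8 and a shift of 8 is a rotate left by 24.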

BTW, you can try another optimization I found two weeks ago. It also raises my hashrate by about 5 kh/s.
Search for "i<LOOKUP_GAP" and add "#pragma unroll" on the line before it.
Code:
#pragma unroll
for(uint i=0; i<LOOKUP_GAP; ++i)
    salsa(X);

If you like my work, please donate.

BTC Donate: 1C1Dzhe9V8qwZvCMCTLSu5p4D582rgpBGR
LTC Donate: LVmUpF9eP3YwcZ5r8LLMiK74YfVk7oJpCB
DOGE Donate: D5AQ1y7ukUzo1R6iJxYwbCRD9tgvi3mgws