Krnlx Nvidia xevan miner - 3.3+ mh on 1070, ~6mh on 80ti FREE, OPENSOURCE

palgin


Re: Krnlx Nvidia xevan miner - 3.3+ mh on 1070, ~6mh on 80ti FREE, OPENSOURCE

October 05, 2017, 03:50:07 PM
Last edit: October 05, 2017, 04:01:35 PM by palgin

#61

Quote from: stas260385 on October 05, 2017, 03:42:47 PM

That didn't help, because this file already exists in the source files. If I remove it and add it again, I get the same error:

Error 1 error LNK2001: unresolved external symbol "void __cdecl xevan_sha512_cpu_hash_64(int,int,unsigned int *)" (?xevan_sha512_cpu_hash_64@@YAXHHPEAI@Z) D:\ccminer-xevan-master\x17.cu.obj

Nothing will help with the Windows build until a Linux user sends the hashing output from the "debug" version of x17.cu, so don't even bother. You can check the yiimp bench: no ccminer-krnlx is running on Windows ;)

I think your error is caused by the wrong Curl lib or dev environment: you need VS2013 and Curl built with the same VS version, and don't forget about the architecture (x86/x64).
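A quick way to confirm which compiler and architecture a build really uses is to compile a tiny program with the same project settings and print them. This is only a sketch: `pointer_bits` and `describe_toolchain` are made-up helper names, `_MSC_VER == 1800` is the value Visual Studio 2013 reports, and it says nothing about how the Curl library itself was built.

```cpp
#include <cstddef>
#include <string>

// Width of a pointer in bits: 32 for an x86 build, 64 for an x64 build.
constexpr std::size_t pointer_bits() { return sizeof(void*) * 8; }

// Hypothetical helper: describes the compiler this file was built with.
std::string describe_toolchain() {
#if defined(_MSC_VER)
    // _MSC_VER is 1800 for Visual Studio 2013.
    return "MSVC " + std::to_string(_MSC_VER) + ", " +
           std::to_string(pointer_bits()) + "-bit";
#else
    return "non-MSVC, " + std::to_string(pointer_bits()) + "-bit";
#endif
}
```

If the string reports a different `_MSC_VER` or bit width than the Curl library was built for, that mismatch is worth fixing first.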


stas260385


October 05, 2017, 03:53:14 PM

#62

Quote from: palgin on October 05, 2017, 03:50:07 PM

Quote from: stas260385 on October 05, 2017, 03:42:47 PM

Nothing will help with the Windows build until a Linux user sends the hashing output from the "debug" version of x17.cu, so don't even bother. You can check the yiimp bench: no ccminer-krnlx is running on Windows ;)

I think your error is caused by the wrong Curl lib or dev environment: you need VS2013 and Curl built with the same VS version, and don't forget about the architecture (x86/x64).

All proper Curl libs, VS2013 and CUDA 7.5 are installed

palgin


October 05, 2017, 03:56:55 PM

#63

Quote from: stas260385 on October 05, 2017, 03:53:14 PM

All proper Curl libs, VS2013 and CUDA 7.5 are installed

The function you need is in cuda_sha512_lbry.cu; check that this file is actually added to your project.
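One more thing worth checking: the linker is looking for `void __cdecl xevan_sha512_cpu_hash_64(int,int,unsigned int *)`, i.e. the second parameter is an `int`. C++ treats functions with different parameter types as different symbols, so a declaration with `int threads` will not link against a definition that takes `uint32_t threads`. Whether that is what happens here is an assumption, since the definition in cuda_sha512_lbry.cu isn't shown in this thread. The sketch below uses a made-up name, `hash_64_demo`, to show that the two signatures are separate functions:

```cpp
#include <cstdint>

// Two overloads that differ only in the type of `threads`.
// The compiler emits a distinct symbol for each, which is why a
// declaration using one signature cannot link to the other.
int hash_64_demo(int thr_id, int threads, std::uint32_t* d_hash) {
    (void)thr_id; (void)threads; (void)d_hash;
    return 1; // the (int, int, uint32_t*) version
}

int hash_64_demo(int thr_id, std::uint32_t threads, std::uint32_t* d_hash) {
    (void)thr_id; (void)threads; (void)d_hash;
    return 2; // the (int, uint32_t, uint32_t*) version
}
```

If the declaration in x17.cu and the definition disagree like this, making the two signatures identical should resolve the LNK2001.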


anorganix


October 05, 2017, 04:13:53 PM

#64

Finally managed to compile it for Windows x64 @ CUDA 7.5.
Of course it's not working, looking forward to the fix from the experienced ones.


stas260385


October 05, 2017, 04:20:36 PM

#65

Quote from: anorganix on October 05, 2017, 04:13:53 PM

Finally managed to compile it for Windows x64 @ CUDA 7.5.
Of course it's not working, looking forward to the fix from the experienced ones.

https://i.imgur.com/Mgh3vBk.png

I've compiled it too, but I get the same issue... :(

integrale


October 05, 2017, 04:22:50 PM
Last edit: October 05, 2017, 04:46:59 PM by integrale

#66

Great job, guys!

I compiled it successfully on Ubuntu 16.04 LTS with CUDA 7.5 and Nvidia driver 384.90.

It's running on a poor little GTX 1050 at 1.1 MH/s with an intensity of 19.
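For reference, the comment `// 19=256*256*8` in x17.cu suggests that an integer intensity of 19 corresponds to 2^19 = 524288 CUDA threads per launch. Assuming that mapping (fractional intensities and any rounding done by the miner are ignored here), the sketch below works out the thread count and roughly how many kernel launches per second that hashrate implies. The helper names are made up for illustration.

```cpp
#include <cstdint>

// Threads per launch for an integer intensity, assuming throughput = 2^intensity.
constexpr std::uint32_t threads_for_intensity(unsigned intensity) {
    return std::uint32_t{1} << intensity;
}

// Approximate launches per second: hashes per second divided by hashes per launch.
constexpr double launches_per_second(double hashrate, std::uint32_t threads) {
    return hashrate / static_cast<double>(threads);
}
```

With intensity 19 and 1.1 MH/s this works out to roughly 2.1 launches per second.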


anorganix


October 05, 2017, 04:22:59 PM

#67

Quote from: palgin on October 05, 2017, 03:21:53 PM

If anyone with Linux wants to help Win users, replace your x17.cu contents with this code:

Code: (x17.cu)

/**
 * X17 algorithm (X15 + sha512 + haval256)
 */

extern "C" {
#include "sph/sph_blake.h"
#include "sph/sph_bmw.h"
#include "sph/sph_groestl.h"
#include "sph/sph_skein.h"
#include "sph/sph_jh.h"
#include "sph/sph_keccak.h"

#include "sph/sph_luffa.h"
#include "sph/sph_cubehash.h"
#include "sph/sph_shavite.h"
#include "sph/sph_simd.h"
#include "sph/sph_echo.h"

#include "sph/sph_hamsi.h"
#include "sph/sph_fugue.h"

#include "sph/sph_shabal.h"
#include "sph/sph_whirlpool.h"

#include "sph/sph_sha2.h"
#include "sph/sph_haval.h"
}

#include "miner.h"
#include "cuda_helper.h"
#include "x11/cuda_x11.h"

#define NBN 2

// Memory for the hash functions
static uint32_t *d_hash[MAX_GPUS];
static uint32_t *d_resNonce[MAX_GPUS];
static uint32_t *h_resNonce[MAX_GPUS];

void print_hash(unsigned int *data, int size){
	for (int i = 0; i<size; i++)
		gpulog(LOG_WARNING, 0, "%x ", data[i]);
	//gpulog(LOG_WARNING, 0, "-------------");
}

extern void x13_hamsi_fugue512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);

extern void x14_shabal512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);

extern void x15_whirlpool_cpu_init(int thr_id, uint32_t threads, int mode);
extern void x15_whirlpool_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void x15_whirlpool_cpu_free(int thr_id);

extern void x17_sha512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);

extern void x17_haval256_cpu_hash_64_final(int thr_id, uint32_t threads, uint32_t *d_hash, uint32_t* resNonce, uint64_t target);
extern void bmw256_cpu_hash_32_full(int thr_id, uint32_t threads, uint32_t *g_hash);
extern void quark_bmw512_cpu_hash_64x(int thr_id, uint32_t threads, uint32_t *d_nonceVector, uint32_t *d_hash);
extern void quark_groestl512(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void groestl512_cpu_init(int thr_id, uint32_t threads);
extern void groestl512_cpu_hash(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void xevan_skein512(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void keccak_xevan_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void qubit_luffa512_cpu_hash_80(int thr_id, uint32_t threads, uint32_t startNounce, uint32_t *d_outputHash);
extern void x11_cubehash512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void x11_shavite512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void xevan_shavite512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void x11_echo512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void xevan_echo512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void x11_simd512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void x11_simd512_cpu_init(int thr_id, uint32_t threads);
extern void xevan_simd512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void x13_hamsi512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void x13_fugue512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void xevan_whirlpool_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void xevan_sha512_cpu_hash_64(int thr_id, int threads, uint32_t *d_hash);
extern void xevan_haval512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void quark_blake512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_nonceVector, uint32_t *d_outputHash);
extern void xevan_blake512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void xevan_haval512_cpu_hash_64_final(int thr_id, uint32_t threads, uint32_t *d_hash, uint32_t *resNonce, uint64_t target);
extern void xevan_groestl512_cpu_hash(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void keccak_xevan_cpu_hash_64_A(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void quark_blake512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_nonceVector, uint32_t *d_outputHash);
extern void quark_blake512_cpu_hash_128(int thr_id, uint32_t threads, uint32_t *d_outputHash);
extern void quark_groestl512_cpu_hash_128(int thr_id, uint32_t threads,  uint32_t *d_hash);
extern void x11_luffa512_cpu_hash_128(int thr_id, uint32_t threads,uint32_t *d_hash);



// X17 CPU Hash (Validation)
extern "C" void x17hash(void *output, const void *input)
{
	uint32_t _ALIGN(64) hash[32]; // 128 bytes required
	uint32_t input_zero[20] = { 0 };
	const int dataLen = 128;

//return;
	sph_blake512_context     ctx_blake;
	sph_bmw512_context       ctx_bmw;
	sph_groestl512_context   ctx_groestl;
	sph_skein512_context     ctx_skein;
	sph_jh512_context        ctx_jh;
	sph_keccak512_context    ctx_keccak;
	sph_luffa512_context     ctx_luffa;
	sph_cubehash512_context  ctx_cubehash;
	sph_shavite512_context   ctx_shavite;
	sph_simd512_context      ctx_simd;
	sph_echo512_context      ctx_echo;
	sph_hamsi512_context     ctx_hamsi;
	sph_fugue512_context     ctx_fugue;
	sph_shabal512_context    ctx_shabal;
	sph_whirlpool_context    ctx_whirlpool;
	sph_sha512_context       ctx_sha512;
	sph_haval256_5_context   ctx_haval;

	print_hash(input_zero,20);
	gpulog(LOG_WARNING, 0, "--INPUT ZEROES--");

	sph_blake512_init(&ctx_blake);
	sph_blake512(&ctx_blake, input_zero, 80);
	sph_blake512_close(&ctx_blake, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--BLAKE512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	memset(&hash[16], 0, 64);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--AFTER MEMSET--");

	sph_bmw512_init(&ctx_bmw);
	sph_bmw512(&ctx_bmw, hash, dataLen);
	sph_bmw512_close(&ctx_bmw, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--BMW512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_groestl512_init(&ctx_groestl);
	sph_groestl512(&ctx_groestl, hash, dataLen);
	sph_groestl512_close(&ctx_groestl, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--GROESTL512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_skein512_init(&ctx_skein);
	sph_skein512(&ctx_skein, hash, dataLen);
	sph_skein512_close(&ctx_skein, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--SKEIN512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_jh512_init(&ctx_jh);
	sph_jh512(&ctx_jh, hash, dataLen);
	sph_jh512_close(&ctx_jh, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--JH512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_keccak512_init(&ctx_keccak);
	sph_keccak512(&ctx_keccak, hash, dataLen);
	sph_keccak512_close(&ctx_keccak, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--KECCAK512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_luffa512_init(&ctx_luffa);
	sph_luffa512(&ctx_luffa, hash, dataLen);
	sph_luffa512_close(&ctx_luffa, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--LUFFA512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_cubehash512_init(&ctx_cubehash);
	sph_cubehash512(&ctx_cubehash, hash, dataLen);
	sph_cubehash512_close(&ctx_cubehash, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--CUBEHASH512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_shavite512_init(&ctx_shavite);
	sph_shavite512(&ctx_shavite, hash, dataLen);
	sph_shavite512_close(&ctx_shavite, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--SHAVITE512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_simd512_init(&ctx_simd);
	sph_simd512(&ctx_simd, hash, dataLen);
	sph_simd512_close(&ctx_simd, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--SIMD512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_echo512_init(&ctx_echo);
	sph_echo512(&ctx_echo, hash, dataLen);
	sph_echo512_close(&ctx_echo, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--ECHO512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_hamsi512_init(&ctx_hamsi);
	sph_hamsi512(&ctx_hamsi, hash, dataLen);
	sph_hamsi512_close(&ctx_hamsi, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--HAMSI512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_fugue512_init(&ctx_fugue);
	sph_fugue512(&ctx_fugue, hash, dataLen);
	sph_fugue512_close(&ctx_fugue, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--FUGUE512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_shabal512_init(&ctx_shabal);
	sph_shabal512(&ctx_shabal, hash, dataLen);
	sph_shabal512_close(&ctx_shabal, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--SHABAL512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_whirlpool_init(&ctx_whirlpool);
	sph_whirlpool(&ctx_whirlpool, hash, dataLen);
	sph_whirlpool_close(&ctx_whirlpool, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--WHIRLPOOL--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_sha512_init(&ctx_sha512);
	sph_sha512(&ctx_sha512,(const void*) hash, dataLen);
	sph_sha512_close(&ctx_sha512,(void*) hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--SHA512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_haval256_5_init(&ctx_haval);
	sph_haval256_5(&ctx_haval,(const void*) hash, dataLen);
	sph_haval256_5_close(&ctx_haval, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--HAVAL256--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	memset(&hash[8], 0, dataLen - 32);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--AFTER MEMSET--");

	sph_blake512_init(&ctx_blake);
	sph_blake512(&ctx_blake, hash, dataLen);
	sph_blake512_close(&ctx_blake, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--BLAKE512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_bmw512_init(&ctx_bmw);
	sph_bmw512(&ctx_bmw, hash, dataLen);
	sph_bmw512_close(&ctx_bmw, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--BMW512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_groestl512_init(&ctx_groestl);
	sph_groestl512(&ctx_groestl, hash, dataLen);
	sph_groestl512_close(&ctx_groestl, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--GROESTL512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_skein512_init(&ctx_skein);
	sph_skein512(&ctx_skein, hash, dataLen);
	sph_skein512_close(&ctx_skein, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--SKEIN512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_jh512_init(&ctx_jh);
	sph_jh512(&ctx_jh, hash, dataLen);
	sph_jh512_close(&ctx_jh, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--JH512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_keccak512_init(&ctx_keccak);
	sph_keccak512(&ctx_keccak, hash, dataLen);
	sph_keccak512_close(&ctx_keccak, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--KECCAK512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_luffa512_init(&ctx_luffa);
	sph_luffa512(&ctx_luffa, hash, dataLen);
	sph_luffa512_close(&ctx_luffa, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--LUFFA512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_cubehash512_init(&ctx_cubehash);
	sph_cubehash512(&ctx_cubehash, hash, dataLen);
	sph_cubehash512_close(&ctx_cubehash, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--CUBEHASH512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_shavite512_init(&ctx_shavite);
	sph_shavite512(&ctx_shavite, hash, dataLen);
	sph_shavite512_close(&ctx_shavite, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--SHAVITE512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_simd512_init(&ctx_simd);
	sph_simd512(&ctx_simd, hash, dataLen);
	sph_simd512_close(&ctx_simd, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--SIMD512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_echo512_init(&ctx_echo);
	sph_echo512(&ctx_echo, hash, dataLen);
	sph_echo512_close(&ctx_echo, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--ECHO512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_hamsi512_init(&ctx_hamsi);
	sph_hamsi512(&ctx_hamsi, hash, dataLen);
	sph_hamsi512_close(&ctx_hamsi, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--HAMSI512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_fugue512_init(&ctx_fugue);
	sph_fugue512(&ctx_fugue, hash, dataLen);
	sph_fugue512_close(&ctx_fugue, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--FUGUE512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_shabal512_init(&ctx_shabal);
	sph_shabal512(&ctx_shabal, hash, dataLen);
	sph_shabal512_close(&ctx_shabal, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--SHABAL512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_whirlpool_init(&ctx_whirlpool);
	sph_whirlpool(&ctx_whirlpool, hash, dataLen);
	sph_whirlpool_close(&ctx_whirlpool, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--WHIRLPOOL512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_sha512_init(&ctx_sha512);
	sph_sha512(&ctx_sha512,(const void*) hash, dataLen);
	sph_sha512_close(&ctx_sha512,(void*) hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--SHA512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_haval256_5_init(&ctx_haval);
	sph_haval256_5(&ctx_haval,(const void*) hash, dataLen);
	sph_haval256_5_close(&ctx_haval, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--HAVAL256--");

	for (int i = 0; i<32; i++) hash[i] = 0;

	memcpy(output, hash, 32);
}

static bool init[MAX_GPUS] = { 0 };





extern "C" int scanhash_x17(int thr_id, struct work* work, uint32_t max_nonce, unsigned long *hashes_done){

	int dev_id = device_map[thr_id];

	uint32_t *pdata = work->data;
	uint32_t *ptarget = work->target;
	const uint32_t first_nonce = pdata[19];
	uint32_t default_throughput;
	if(device_sm[dev_id]<=500) default_throughput = 1<<20;
	else if(device_sm[dev_id]<=520) default_throughput = 1<<21;
	else if(device_sm[dev_id]>520) default_throughput = (1<<22) + (1<<21);
	
	if((strstr(device_name[dev_id], "1060")))default_throughput = 1<<20;
	if((strstr(device_name[dev_id], "1070")))default_throughput = 1<<20;
	if((strstr(device_name[dev_id], "1080")))default_throughput = 1<<20;
	
	uint32_t throughput = cuda_default_throughput(thr_id, default_throughput); // 19=256*256*8;
	if (init[thr_id]) throughput = min(throughput, max_nonce - first_nonce);

	throughput &= 0xFFFFFF80; // multiples of 128 due to simd_echo kernel

	if (opt_benchmark)
		((uint32_t*)ptarget)[7] = 0xff;

	gpulog(LOG_INFO,thr_id,"target %x %x %x",ptarget[5], ptarget[6], ptarget[7]);
	gpulog(LOG_INFO,thr_id,"target %llx",*(uint64_t*)&ptarget[6]);

	if (!init[thr_id])
	{
		cudaSetDevice(device_map[thr_id]);
		if (opt_cudaschedule == -1 && gpu_threads == 1) {
			cudaDeviceReset();
			// reduce cpu usage
			cudaSetDeviceFlags(cudaDeviceScheduleBlockingSync);
		}
		gpulog(LOG_INFO,thr_id, "Intensity set to %g, %u cuda threads", throughput2intensity(throughput), throughput);

		x15_whirlpool_cpu_init(thr_id, throughput, 0);
		groestl512_cpu_init(thr_id, throughput);
		x11_simd512_cpu_init(thr_id, throughput);
		CUDA_SAFE_CALL(cudaMalloc(&d_hash[thr_id], 8 * sizeof(uint64_t) * throughput));
		CUDA_SAFE_CALL(cudaMalloc(&d_resNonce[thr_id], NBN * sizeof(uint32_t)));
		h_resNonce[thr_id] = (uint32_t*) malloc(NBN  * 8 * sizeof(uint32_t));
		if(h_resNonce[thr_id] == NULL){
			gpulog(LOG_ERR,thr_id,"Host memory allocation failed");
			exit(EXIT_FAILURE);
		}		
		init[thr_id] = true;
	}

	uint32_t _ALIGN(64) endiandata[20];
	for (int k=0; k < 20; k++)
		be32enc(&endiandata[k], pdata[k]);

	quark_blake512_cpu_setBlock_80(thr_id, endiandata);
	cudaMemset(d_resNonce[thr_id], 0xff, NBN*sizeof(uint32_t));


	do {
		// Hash with CUDA

		quark_blake512_cpu_hash_80(thr_id, throughput, pdata[19], d_hash[thr_id]); //A
		quark_groestl512_cpu_hash_128(thr_id, throughput, d_hash[thr_id]);
		quark_skein512_cpu_hash_64(thr_id, throughput, NULL, d_hash[thr_id]);
		quark_jh512_cpu_hash_64(thr_id, throughput, NULL, d_hash[thr_id]); //A //fast
		x11_luffa512_cpu_hash_128(thr_id, throughput, d_hash[thr_id]); //A
		x11_cubehash512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //A 256
		xevan_shavite512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //P slow r2
		x11_simd512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //A slow r3
		x11_echo512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //A
		x13_hamsi512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //fast
		x13_fugue512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //fast ++
		//x13_hamsi_fugue512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);
		x14_shabal512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //fast
		xevan_whirlpool_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //opt2
		xevan_sha512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //fast
		xevan_haval512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //fast

		quark_blake512_cpu_hash_128(thr_id, throughput, d_hash[thr_id]); //BAD
		quark_bmw512_cpu_hash_64x(thr_id, throughput, NULL, d_hash[thr_id]);
		quark_groestl512_cpu_hash_128(thr_id, throughput, d_hash[thr_id]);
		quark_skein512_cpu_hash_64(thr_id, throughput, NULL, d_hash[thr_id]);
		quark_jh512_cpu_hash_64(thr_id, throughput, NULL, d_hash[thr_id]);
		x11_luffa512_cpu_hash_128(thr_id, throughput, d_hash[thr_id]); //A
		x11_cubehash512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);
		xevan_shavite512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //move to shared
		x11_simd512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);
		x11_echo512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);
		x13_hamsi512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);
		x13_fugue512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);
		//x13_hamsi_fugue512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);
		x14_shabal512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);
		xevan_whirlpool_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);
		xevan_sha512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);
		xevan_haval512_cpu_hash_64_final(thr_id, throughput, d_hash[thr_id], d_resNonce[thr_id], *(uint64_t*)&ptarget[6]);

		cudaMemcpy(h_resNonce[thr_id], d_resNonce[thr_id], NBN*sizeof(uint32_t), cudaMemcpyDeviceToHost);

		if (h_resNonce[thr_id][0] != UINT32_MAX){
			const uint32_t Htarg = ptarget[7];
			const uint32_t startNounce = pdata[19];
			uint32_t vhash64[8];
			be32enc(&endiandata[19], startNounce + h_resNonce[thr_id][0]);
			x17hash(vhash64, endiandata);
//			*hashes_done = pdata[19] - first_nonce + throughput + 1;
//			pdata[19] = startNounce + h_resNonce[thr_id][0];
			gpulog(LOG_WARNING, 0,"NONCE FOUND ");
//			return 1;
			if (vhash64[7] <= Htarg && fulltest(vhash64, ptarget)) {
				int res = 1;
				*hashes_done = pdata[19] - first_nonce + throughput + 1;
				work_set_target_ratio(work, vhash64);
				pdata[19] = startNounce + h_resNonce[thr_id][0];
				if (h_resNonce[thr_id][1] != UINT32_MAX) {
					pdata[21] = startNounce+h_resNonce[thr_id][1];
					if(!opt_quiet)
						gpulog(LOG_BLUE,dev_id,"Found 2nd nonce: %08x", pdata[21]);
					be32enc(&endiandata[19], pdata[21]);
					x17hash(vhash64, endiandata);
					if (bn_hash_target_ratio(vhash64, ptarget) > work->shareratio[0]){
						work_set_target_ratio(work, vhash64);
						xchg(pdata[19],pdata[21]);
					}
					res++;
				}
				return res;
			}
			else {
				gpulog(LOG_WARNING, thr_id, "result for %08x does not validate on CPU!", h_resNonce[thr_id][0]);
				cudaMemset(d_resNonce[thr_id], 0xff, NBN*sizeof(uint32_t));				
			}
		}

		pdata[19] += throughput;
	} while (!work_restart[thr_id].restart && ((uint64_t)max_nonce > (uint64_t)throughput + pdata[19]));

	*hashes_done = pdata[19] - first_nonce + 1;

	return 0;
}

// cleanup
extern "C" void free_x17(int thr_id)
{
	if (!init[thr_id])
		return;

	cudaDeviceSynchronize();

	free(h_resNonce[thr_id]);
	cudaFree(d_resNonce[thr_id]);
	cudaFree(d_hash[thr_id]);

	x11_simd_echo_512_cpu_free(thr_id);
	x15_whirlpool_cpu_free(thr_id);
	cudaDeviceSynchronize();
	init[thr_id] = false;
}
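
A side note on the `throughput` mask in `scanhash_x17`: the comment there says the value has to be a multiple of 128 for the simd_echo kernel. Rounding down to a multiple of 128 means clearing the low seven bits, i.e. masking with 0xFFFFFF80. The sketch below (the helper names are made up) shows that, and that a mask of 0xFFFFFF70 does not give the same guarantee. For the default power-of-two throughputs both masks leave the value unchanged, so this would only matter for unusual intensity settings.

```cpp
#include <cstdint>

// Round down to a multiple of 128 by clearing the low seven bits.
constexpr std::uint32_t round_down_128(std::uint32_t n) {
    return n & 0xFFFFFF80u;
}

// For comparison: 0x70 keeps bits 4-6 and clears bit 7, so the result
// is not always a multiple of 128.
constexpr std::uint32_t mask_70(std::uint32_t n) {
    return n & 0xFFFFFF70u;
}
```

For example, 1000 rounds down to 896 (7 * 128) with the first mask, while the second mask turns it into 864, which is not a multiple of 128.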

Then recompile and run for a few seconds, before first rejected share. Send me your output in PM or publish it here.

Thanks to everybody in advance!
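
Once output from a Linux build and a Windows build is available, the two logs can be compared stage by stage to find where they first disagree. The sketch below assumes each log has already been reduced to plain lines, where the hex words of a stage are followed by the `--NAME--` marker line that the debug code prints after it. The exact log format (timestamps, GPU prefixes) is an assumption and would need stripping first, and `is_marker` and `first_divergent_stage` are hypothetical helpers, not part of ccminer.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns true for marker lines such as "--BLAKE512--".
bool is_marker(const std::string& line) {
    return line.size() > 4 && line.compare(0, 2, "--") == 0 &&
           line.compare(line.size() - 2, 2, "--") == 0;
}

// Walks both logs in lockstep and returns the marker of the first stage
// whose values differ. Returns an empty string if every stage that ends
// with a marker in both logs matches; differences after the last marker
// are ignored.
std::string first_divergent_stage(const std::vector<std::string>& a,
                                  const std::vector<std::string>& b) {
    bool stage_differs = false;
    const std::size_t n = a.size() < b.size() ? a.size() : b.size();
    for (std::size_t i = 0; i < n; ++i) {
        if (is_marker(a[i])) {
            // The values for this stage were printed before its marker.
            if (stage_differs || a[i] != b[i]) return a[i];
        } else if (a[i] != b[i]) {
            stage_differs = true;
        }
    }
    return std::string();
}
```

The first marker it returns points at the hash stage where the two builds start to disagree, which narrows down which kernel or CPU routine to look at.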

Bump!
Any help much appreciated.


palgin


October 05, 2017, 04:27:12 PM

#68

Quote from: anorganix on October 05, 2017, 04:22:59 PM

Quote from: palgin on October 05, 2017, 03:21:53 PM

If anyone with Linux wants to help Win users, replace your x17.cu contents with this code:

Code: (x17.cu)

/**
 * X17 algorithm (X15 + sha512 + haval256)
 */

extern "C" {
#include "sph/sph_blake.h"
#include "sph/sph_bmw.h"
#include "sph/sph_groestl.h"
#include "sph/sph_skein.h"
#include "sph/sph_jh.h"
#include "sph/sph_keccak.h"

#include "sph/sph_luffa.h"
#include "sph/sph_cubehash.h"
#include "sph/sph_shavite.h"
#include "sph/sph_simd.h"
#include "sph/sph_echo.h"

#include "sph/sph_hamsi.h"
#include "sph/sph_fugue.h"

#include "sph/sph_shabal.h"
#include "sph/sph_whirlpool.h"

#include "sph/sph_sha2.h"
#include "sph/sph_haval.h"
}

#include "miner.h"
#include "cuda_helper.h"
#include "x11/cuda_x11.h"

#define NBN 2

// Memory for the hash functions
static uint32_t *d_hash[MAX_GPUS];
static uint32_t *d_resNonce[MAX_GPUS];
static uint32_t *h_resNonce[MAX_GPUS];

void print_hash(unsigned int *data, int size){
	for (int i = 0; i<size; i++)
		gpulog(LOG_WARNING, 0, "%x ", data[i]);
	//gpulog(LOG_WARNING, 0, "-------------");
}

extern void x13_hamsi_fugue512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);

extern void x14_shabal512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);

extern void x15_whirlpool_cpu_init(int thr_id, uint32_t threads, int mode);
extern void x15_whirlpool_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void x15_whirlpool_cpu_free(int thr_id);

extern void x17_sha512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);

extern void x17_haval256_cpu_hash_64_final(int thr_id, uint32_t threads, uint32_t *d_hash, uint32_t* resNonce, uint64_t target);
extern void bmw256_cpu_hash_32_full(int thr_id, uint32_t threads, uint32_t *g_hash);
extern void quark_bmw512_cpu_hash_64x(int thr_id, uint32_t threads, uint32_t *d_nonceVector, uint32_t *d_hash);
extern void quark_groestl512(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void groestl512_cpu_init(int thr_id, uint32_t threads);
extern void groestl512_cpu_hash(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void xevan_skein512(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void keccak_xevan_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void qubit_luffa512_cpu_hash_80(int thr_id, uint32_t threads, uint32_t startNounce, uint32_t *d_outputHash);
extern void x11_cubehash512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void x11_shavite512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void xevan_shavite512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void x11_echo512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void xevan_echo512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void x11_simd512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void x11_simd512_cpu_init(int thr_id, uint32_t threads);
extern void xevan_simd512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void x13_hamsi512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void x13_fugue512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void xevan_whirlpool_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void xevan_sha512_cpu_hash_64(int thr_id, int threads, uint32_t *d_hash);
extern void xevan_haval512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void quark_blake512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_nonceVector, uint32_t *d_outputHash);
extern void xevan_blake512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void xevan_haval512_cpu_hash_64_final(int thr_id, uint32_t threads, uint32_t *d_hash, uint32_t *resNonce, uint64_t target);
extern void xevan_groestl512_cpu_hash(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void keccak_xevan_cpu_hash_64_A(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void quark_blake512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_nonceVector, uint32_t *d_outputHash);
extern void quark_blake512_cpu_hash_128(int thr_id, uint32_t threads, uint32_t *d_outputHash);
extern void quark_groestl512_cpu_hash_128(int thr_id, uint32_t threads,  uint32_t *d_hash);
extern void x11_luffa512_cpu_hash_128(int thr_id, uint32_t threads,uint32_t *d_hash);



// X17 CPU Hash (Validation)
extern "C" void x17hash(void *output, const void *input)
{
	uint32_t _ALIGN(64) hash[32]; // 128 bytes required
	uint32_t input_zero[20] = { 0 };
	const int dataLen = 128;

//return;
	sph_blake512_context     ctx_blake;
	sph_bmw512_context       ctx_bmw;
	sph_groestl512_context   ctx_groestl;
	sph_skein512_context     ctx_skein;
	sph_jh512_context        ctx_jh;
	sph_keccak512_context    ctx_keccak;
	sph_luffa512_context     ctx_luffa;
	sph_cubehash512_context  ctx_cubehash;
	sph_shavite512_context   ctx_shavite;
	sph_simd512_context      ctx_simd;
	sph_echo512_context      ctx_echo;
	sph_hamsi512_context     ctx_hamsi;
	sph_fugue512_context     ctx_fugue;
	sph_shabal512_context    ctx_shabal;
	sph_whirlpool_context    ctx_whirlpool;
	sph_sha512_context       ctx_sha512;
	sph_haval256_5_context   ctx_haval;

	print_hash(input_zero,20);
	gpulog(LOG_WARNING, 0, "--INPUT ZEROES--");

	sph_blake512_init(&ctx_blake);
	sph_blake512(&ctx_blake, input_zero, 80);
	sph_blake512_close(&ctx_blake, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--BLAKE512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	memset(&hash[16], 0, 64);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--AFTER MEMSET--");

	sph_bmw512_init(&ctx_bmw);
	sph_bmw512(&ctx_bmw, hash, dataLen);
	sph_bmw512_close(&ctx_bmw, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--BMW512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_groestl512_init(&ctx_groestl);
	sph_groestl512(&ctx_groestl, hash, dataLen);
	sph_groestl512_close(&ctx_groestl, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--GROESTL512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_skein512_init(&ctx_skein);
	sph_skein512(&ctx_skein, hash, dataLen);
	sph_skein512_close(&ctx_skein, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--SKEIN512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_jh512_init(&ctx_jh);
	sph_jh512(&ctx_jh, hash, dataLen);
	sph_jh512_close(&ctx_jh, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--JH512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_keccak512_init(&ctx_keccak);
	sph_keccak512(&ctx_keccak, hash, dataLen);
	sph_keccak512_close(&ctx_keccak, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--KECCAK512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_luffa512_init(&ctx_luffa);
	sph_luffa512(&ctx_luffa, hash, dataLen);
	sph_luffa512_close(&ctx_luffa, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--LUFFA512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_cubehash512_init(&ctx_cubehash);
	sph_cubehash512(&ctx_cubehash, hash, dataLen);
	sph_cubehash512_close(&ctx_cubehash, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--CUBEHASH512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_shavite512_init(&ctx_shavite);
	sph_shavite512(&ctx_shavite, hash, dataLen);
	sph_shavite512_close(&ctx_shavite, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--SHAVITE512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_simd512_init(&ctx_simd);
	sph_simd512(&ctx_simd, hash, dataLen);
	sph_simd512_close(&ctx_simd, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--SIMD512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_echo512_init(&ctx_echo);
	sph_echo512(&ctx_echo, hash, dataLen);
	sph_echo512_close(&ctx_echo, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--ECHO512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_hamsi512_init(&ctx_hamsi);
	sph_hamsi512(&ctx_hamsi, hash, dataLen);
	sph_hamsi512_close(&ctx_hamsi, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--HAMSI512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_fugue512_init(&ctx_fugue);
	sph_fugue512(&ctx_fugue, hash, dataLen);
	sph_fugue512_close(&ctx_fugue, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--FUGUE512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_shabal512_init(&ctx_shabal);
	sph_shabal512(&ctx_shabal, hash, dataLen);
	sph_shabal512_close(&ctx_shabal, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--SHABAL512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_whirlpool_init(&ctx_whirlpool);
	sph_whirlpool(&ctx_whirlpool, hash, dataLen);
	sph_whirlpool_close(&ctx_whirlpool, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--WHIRLPOOL--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_sha512_init(&ctx_sha512);
	sph_sha512(&ctx_sha512,(const void*) hash, dataLen);
	sph_sha512_close(&ctx_sha512,(void*) hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--SHA512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_haval256_5_init(&ctx_haval);
	sph_haval256_5(&ctx_haval,(const void*) hash, dataLen);
	sph_haval256_5_close(&ctx_haval, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--HAVAL256--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	memset(&hash[8], 0, dataLen - 32);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--AFTER MEMSET--");

	sph_blake512_init(&ctx_blake);
	sph_blake512(&ctx_blake, hash, dataLen);
	sph_blake512_close(&ctx_blake, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--BLAKE512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_bmw512_init(&ctx_bmw);
	sph_bmw512(&ctx_bmw, hash, dataLen);
	sph_bmw512_close(&ctx_bmw, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--BMW512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_groestl512_init(&ctx_groestl);
	sph_groestl512(&ctx_groestl, hash, dataLen);
	sph_groestl512_close(&ctx_groestl, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--GROESTL512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_skein512_init(&ctx_skein);
	sph_skein512(&ctx_skein, hash, dataLen);
	sph_skein512_close(&ctx_skein, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--SKEIN512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_jh512_init(&ctx_jh);
	sph_jh512(&ctx_jh, hash, dataLen);
	sph_jh512_close(&ctx_jh, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--JH512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_keccak512_init(&ctx_keccak);
	sph_keccak512(&ctx_keccak, hash, dataLen);
	sph_keccak512_close(&ctx_keccak, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--KECCAK512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_luffa512_init(&ctx_luffa);
	sph_luffa512(&ctx_luffa, hash, dataLen);
	sph_luffa512_close(&ctx_luffa, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--LUFFA512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_cubehash512_init(&ctx_cubehash);
	sph_cubehash512(&ctx_cubehash, hash, dataLen);
	sph_cubehash512_close(&ctx_cubehash, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--CUBEHASH512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_shavite512_init(&ctx_shavite);
	sph_shavite512(&ctx_shavite, hash, dataLen);
	sph_shavite512_close(&ctx_shavite, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--SHAVITE512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_simd512_init(&ctx_simd);
	sph_simd512(&ctx_simd, hash, dataLen);
	sph_simd512_close(&ctx_simd, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--SIMD512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_echo512_init(&ctx_echo);
	sph_echo512(&ctx_echo, hash, dataLen);
	sph_echo512_close(&ctx_echo, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--ECHO512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_hamsi512_init(&ctx_hamsi);
	sph_hamsi512(&ctx_hamsi, hash, dataLen);
	sph_hamsi512_close(&ctx_hamsi, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--HAMSI512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_fugue512_init(&ctx_fugue);
	sph_fugue512(&ctx_fugue, hash, dataLen);
	sph_fugue512_close(&ctx_fugue, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--FUGUE512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_shabal512_init(&ctx_shabal);
	sph_shabal512(&ctx_shabal, hash, dataLen);
	sph_shabal512_close(&ctx_shabal, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--SHABAL512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_whirlpool_init(&ctx_whirlpool);
	sph_whirlpool(&ctx_whirlpool, hash, dataLen);
	sph_whirlpool_close(&ctx_whirlpool, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--WHIRLPOOL512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_sha512_init(&ctx_sha512);
	sph_sha512(&ctx_sha512,(const void*) hash, dataLen);
	sph_sha512_close(&ctx_sha512,(void*) hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--SHA512--");
	//for (int i = 0; i<32; i++) hash[i] = 0;

	sph_haval256_5_init(&ctx_haval);
	sph_haval256_5(&ctx_haval,(const void*) hash, dataLen);
	sph_haval256_5_close(&ctx_haval, hash);

	print_hash(hash, 32);
	gpulog(LOG_WARNING, 0, "--HAVAL256--");

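	// NOTE: this clears the whole buffer before the copy below, so x17hash()
	// always returns an all-zero hash and every GPU nonce passes the CPU check.
	for (int i = 0; i<32; i++) hash[i] = 0;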

	memcpy(output, hash, 32);
}

static bool init[MAX_GPUS] = { 0 };





extern "C" int scanhash_x17(int thr_id, struct work* work, uint32_t max_nonce, unsigned long *hashes_done){

	int dev_id = device_map[thr_id];

	uint32_t *pdata = work->data;
	uint32_t *ptarget = work->target;
	const uint32_t first_nonce = pdata[19];
	uint32_t default_throughput;
	if(device_sm[dev_id]<=500) default_throughput = 1<<20;
	else if(device_sm[dev_id]<=520) default_throughput = 1<<21;
	else default_throughput = (1<<22) + (1<<21);
	
	if((strstr(device_name[dev_id], "1060")))default_throughput = 1<<20;
	if((strstr(device_name[dev_id], "1070")))default_throughput = 1<<20;
	if((strstr(device_name[dev_id], "1080")))default_throughput = 1<<20;
	
	uint32_t throughput = cuda_default_throughput(thr_id, default_throughput); // 19=256*256*8;
	if (init[thr_id]) throughput = min(throughput, max_nonce - first_nonce);

	throughput&=0xFFFFFF80; //multiples of 128 due to simd_echo kernel

	if (opt_benchmark)
		((uint32_t*)ptarget)[7] = 0xff;

	gpulog(LOG_INFO,thr_id,"target %x %x %x",ptarget[5], ptarget[6], ptarget[7]);
	gpulog(LOG_INFO,thr_id,"target %llx",*(uint64_t*)&ptarget[6]);

	if (!init[thr_id])
	{
		cudaSetDevice(device_map[thr_id]);
		if (opt_cudaschedule == -1 && gpu_threads == 1) {
			cudaDeviceReset();
			// reduce cpu usage
			cudaSetDeviceFlags(cudaDeviceScheduleBlockingSync);
		}
		gpulog(LOG_INFO,thr_id, "Intensity set to %g, %u cuda threads", throughput2intensity(throughput), throughput);

		x15_whirlpool_cpu_init(thr_id, throughput, 0);
		groestl512_cpu_init(thr_id, throughput);
		x11_simd512_cpu_init(thr_id, throughput);
		CUDA_SAFE_CALL(cudaMalloc(&d_hash[thr_id], 8 * sizeof(uint64_t) * throughput));
		CUDA_SAFE_CALL(cudaMalloc(&d_resNonce[thr_id], NBN * sizeof(uint32_t)));
		h_resNonce[thr_id] = (uint32_t*) malloc(NBN  * 8 * sizeof(uint32_t));
		if(h_resNonce[thr_id] == NULL){
			gpulog(LOG_ERR,thr_id,"Host memory allocation failed");
			exit(EXIT_FAILURE);
		}		
		init[thr_id] = true;
	}

	uint32_t _ALIGN(64) endiandata[20];
	for (int k=0; k < 20; k++)
		be32enc(&endiandata[k], pdata[k]);

	quark_blake512_cpu_setBlock_80(thr_id, endiandata);
	cudaMemset(d_resNonce[thr_id], 0xff, NBN*sizeof(uint32_t));


	do {
		// Hash with CUDA

		quark_blake512_cpu_hash_80(thr_id, throughput, pdata[19], d_hash[thr_id]);//A
		quark_groestl512_cpu_hash_128(thr_id, throughput, d_hash[thr_id]);
		quark_skein512_cpu_hash_64(thr_id, throughput, NULL, d_hash[thr_id]);
		quark_jh512_cpu_hash_64(thr_id, throughput, NULL, d_hash[thr_id]);//A //fast
		x11_luffa512_cpu_hash_128(thr_id, throughput, d_hash[thr_id]);//A
		x11_cubehash512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //A 256
		xevan_shavite512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);//P slow r2
		x11_simd512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);  //A slow r3
		x11_echo512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);//A
		x13_hamsi512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //fast
		x13_fugue512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //fast ++
		//x13_hamsi_fugue512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);
		x14_shabal512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //fast
		xevan_whirlpool_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //opt2
		xevan_sha512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //fast
		xevan_haval512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //fast

		quark_blake512_cpu_hash_128(thr_id, throughput,  d_hash[thr_id]);//BAD
		quark_bmw512_cpu_hash_64x(thr_id, throughput, NULL, d_hash[thr_id]);
		quark_groestl512_cpu_hash_128(thr_id, throughput, d_hash[thr_id]);
		quark_skein512_cpu_hash_64(thr_id, throughput, NULL, d_hash[thr_id]);
		quark_jh512_cpu_hash_64(thr_id, throughput, NULL, d_hash[thr_id]);
		x11_luffa512_cpu_hash_128(thr_id, throughput, d_hash[thr_id]);//A
		x11_cubehash512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);
		xevan_shavite512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);//move to shared
		x11_simd512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);
		x11_echo512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);
		x13_hamsi512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);
		x13_fugue512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);
		//x13_hamsi_fugue512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);
		x14_shabal512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);
		xevan_whirlpool_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);
		xevan_sha512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);

		xevan_haval512_cpu_hash_64_final(thr_id, throughput, d_hash[thr_id],d_resNonce[thr_id],*(uint64_t*)&ptarget[6]);
		 

		cudaMemcpy(h_resNonce[thr_id], d_resNonce[thr_id], NBN*sizeof(uint32_t), cudaMemcpyDeviceToHost);

		if (h_resNonce[thr_id][0] != UINT32_MAX){
			const uint32_t Htarg = ptarget[7];
			const uint32_t startNounce = pdata[19];
			uint32_t vhash64[8];
			be32enc(&endiandata[19], startNounce + h_resNonce[thr_id][0]);
			x17hash(vhash64, endiandata);
//			*hashes_done = pdata[19] - first_nonce + throughput + 1;
//			pdata[19] = startNounce + h_resNonce[thr_id][0];
			gpulog(LOG_WARNING, 0,"NONCE FOUND ");
//			return 1;
			if (vhash64[7] <= Htarg && fulltest(vhash64, ptarget)) {
				int res = 1;
				*hashes_done = pdata[19] - first_nonce + throughput + 1;
				work_set_target_ratio(work, vhash64);
				pdata[19] = startNounce + h_resNonce[thr_id][0];
				if (h_resNonce[thr_id][1] != UINT32_MAX) {
					pdata[21] = startNounce+h_resNonce[thr_id][1];
					if(!opt_quiet)
						gpulog(LOG_BLUE,dev_id,"Found 2nd nonce: %08x", pdata[21]);
					be32enc(&endiandata[19], pdata[21]);
					x17hash(vhash64, endiandata);
					if (bn_hash_target_ratio(vhash64, ptarget) > work->shareratio[0]){
						work_set_target_ratio(work, vhash64);
						xchg(pdata[19],pdata[21]);
					}
					res++;
				}
				return res;
			}
			else {
				gpulog(LOG_WARNING, thr_id, "result for %08x does not validate on CPU!", h_resNonce[thr_id][0]);
				cudaMemset(d_resNonce[thr_id], 0xff, NBN*sizeof(uint32_t));				
			}
		}

		pdata[19] += throughput;
	} while (!work_restart[thr_id].restart && ((uint64_t)max_nonce > (uint64_t)throughput + pdata[19]));

	*hashes_done = pdata[19] - first_nonce + 1;

	return 0;
}

// cleanup
extern "C" void free_x17(int thr_id)
{
	if (!init[thr_id])
		return;

	cudaDeviceSynchronize();

	free(h_resNonce[thr_id]);
	cudaFree(d_resNonce[thr_id]);
	cudaFree(d_hash[thr_id]);

	x11_simd_echo_512_cpu_free(thr_id);
	x15_whirlpool_cpu_free(thr_id);
	cudaDeviceSynchronize();
	init[thr_id] = false;
}

Then recompile and run it for a few seconds, up to the first rejected share. Send me your output by PM or post it here.

Thanks to everybody in advance!

Bump!
Any help is much appreciated.

It seems no *nix user wants to help the MS guys ;D

Will have to wait for krnlx to run it on his machine.

integrale

Full Member

Offline

Activity: 144
Merit: 100

Eager to learn

October 05, 2017, 04:30:26 PM

#69

I'm running on *nix, if I can help. I'm not a pro, but I can try it.

AltCoin-Mining @ Xubuntu 16.04 LTS

stas260385

Newbie

Offline

Activity: 38
Merit: 0

October 05, 2017, 04:30:42 PM

#70

It's from Z-pool bench:

xevan   26h   1070   GeForce GTX 1070   10de:119d   SM 6.1   2.27 MH   21   1822   109   20.791 k   cpuminer-opt/6.6.6   win64   CUDA 8.0 382.53
xevan   17h   1070   EVGA GTX 1070 FTW   3842:6276   SM 6.1   2.26 MH   21   1797   101   22.363 k   cpuminer-opt/6.6.6   win64   CUDA 8.0 385.28
xevan   16h   1070   EVGA GTX 1070 FTW   3842:6276   SM 6.1   2.26 MH   21   1797   108   20.905 k   cpuminer-opt/6.6.6   win64   CUDA 8.0 385.28
xevan   17h   1070   EVGA GTX 1070 FTW   3842:6276   SM 6.1   2.26 MH   21   1797   107   21.088 k   cpuminer-opt/6.6.6   win64   CUDA 8.0 385.28
xevan   17h   1070   EVGA GTX 1070 FTW   3842:6276   SM 6.1   2.25 MH   21   1797   107   21.074 k   cpuminer-opt/6.6.6   win64   CUDA 8.0 385.28
xevan   26h   1070   GeForce GTX 1070   10de:119d   SM 6.1   2.25 MH   21   1822   110   20.479 k   cpuminer-opt/6.6.6   win64   CUDA 8.0 382.53
xevan   17h   1070   EVGA GTX 1070 FTW   3842:6276   SM 6.1   2.25 MH   21   1797   104   21.658 k   cpuminer-opt/6.6.6   win64   CUDA 8.0 385.28
xevan   16h   1070   EVGA GTX 1070 FTW   3842:6276   SM 6.1   2.25 MH   21   1797   140   16.056 k   cpuminer-opt/6.6.6   win64   CUDA 8.0 385.28
xevan   17h   1070   EVGA GTX 1070 FTW   3842:6276   SM 6.1   2.25 MH   21   1797   91   24.674 k   cpuminer-opt/6.6.6   win64   CUDA 8.0 385.28
xevan   17h   1070   EVGA GTX 1070 FTW   3842:6276   SM 6.1   2.24 MH   21   1797   108   20.739 k   cpuminer-opt/6.6.6   win64   CUDA 8.0 385.28
xevan   17h   1070   EVGA GTX 1070 FTW   3842:6276   SM 6.1   2.24 MH   21   1797   105   21.308 k   cpuminer-opt/6.6.6   win64   CUDA 8.0 385.28
xevan   17h   1070   EVGA GTX 1070 FTW   3842:6276   SM 6.1   2.24 MH   21   1797   96   23.289 k   cpuminer-opt/6.6.6   win64   CUDA 8.0 385.28
xevan   2d   1070   GeForce GTX 1070   10de:119d   SM 6.1   2 MH   21   1683   79   25.305 k   ccminer/8.13-KlausT   linux   CUDA 8.0 375.66
xevan   2d   1080 Ti   GeForce GTX 1080 Ti   10de:120f   SM 6.1   1.97 MH   21   1721   81   24.36 k   cpuminer-opt/6.6.6   win64   CUDA 8.0 384.94
xevan   2d   1080 Ti   GeForce GTX 1080 Ti   10de:120f   SM 6.1   1.9 MH   21   1721   186   10.209 k   ccminer/8.13-dj   win64   CUDA 8.0 384.94
xevan   2d   1080 Ti   GeForce GTX 1080 Ti   10de:120f   SM 6.1   1.71 MH   21   1721   96   17.835 k   ccminer/8.13-dj   win64   CUDA 8.0 384.94
xevan   2d   1080 Ti   GeForce GTX 1080 Ti   10de:120f   SM 6.1   1.61 MH   21   1721   153   10.535 k   ccminer/8.13-dj   win64   CUDA 8.0 384.94
xevan   2d   1080 Ti   GeForce GTX 1080 Ti   10de:120f   SM 6.1   1.54 MH   21   1721   117   13.157 k   ccminer/8.13-dj   win64   CUDA 8.0 384.94
xevan   2d   1080 Ti   GeForce GTX 1080 Ti   10de:120f   SM 6.1   1.52 MH   21   1721   106   14.386 k   ccminer/8.13-dj   win64   CUDA 8.0 384.94
xevan   2d   1080 Ti   GeForce GTX 1080 Ti   10de:120f   SM 6.1   1.48 MH   21   1721   124   11.964 k   ccminer/8.13-dj   win64   CUDA 8.0 384.94

sickofscamcoins

Newbie

Offline

Activity: 7
Merit: 0

October 05, 2017, 04:36:25 PM

#71

http://paste.ubuntu.com/25680610/

These are the output results after replacing x17.cu with the one palgin posted and recompiling.

palgin

Sr. Member

Offline

Activity: 266
Merit: 250

October 05, 2017, 04:37:47 PM
Last edit: October 05, 2017, 04:51:06 PM by palgin

#72

Quote from: sickofscamcoins on October 05, 2017, 04:36:25 PM

http://paste.ubuntu.com/25680610/

These are the output results after replacing x17.cu with the one palgin posted and recompiling.

Thank you, will check it!

integrale

Full Member

Offline

Activity: 144
Merit: 100

Eager to learn

October 05, 2017, 04:39:29 PM
Last edit: October 05, 2017, 04:51:15 PM by integrale

#73

Copied from the x17.cu in my Linux build; hopefully it helps.

Code:

/**
 * X17 algorithm (X15 + sha512 + haval256)
 */

extern "C" {
#include "sph/sph_blake.h"
#include "sph/sph_bmw.h"
#include "sph/sph_groestl.h"
#include "sph/sph_skein.h"
#include "sph/sph_jh.h"
#include "sph/sph_keccak.h"

#include "sph/sph_luffa.h"
#include "sph/sph_cubehash.h"
#include "sph/sph_shavite.h"
#include "sph/sph_simd.h"
#include "sph/sph_echo.h"

#include "sph/sph_hamsi.h"
#include "sph/sph_fugue.h"

#include "sph/sph_shabal.h"
#include "sph/sph_whirlpool.h"

#include "sph/sph_sha2.h"
#include "sph/sph_haval.h"
}

#include "miner.h"
#include "cuda_helper.h"
#include "x11/cuda_x11.h"

#define NBN 2

// Memory for the hash functions
static uint32_t *d_hash[MAX_GPUS];
static uint32_t *d_resNonce[MAX_GPUS];
static uint32_t *h_resNonce[MAX_GPUS];

extern void x13_hamsi_fugue512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);

extern void x14_shabal512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);

extern void x15_whirlpool_cpu_init(int thr_id, uint32_t threads, int mode);
extern void x15_whirlpool_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void x15_whirlpool_cpu_free(int thr_id);

extern void x17_sha512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);

extern void x17_haval256_cpu_hash_64_final(int thr_id, uint32_t threads, uint32_t *d_hash, uint32_t* resNonce, uint64_t target);
extern void bmw256_cpu_hash_32_full(int thr_id, uint32_t threads, uint32_t *g_hash);
extern void quark_bmw512_cpu_hash_64x(int thr_id, uint32_t threads, uint32_t *d_nonceVector, uint32_t *d_hash);
extern void quark_groestl512(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void groestl512_cpu_init(int thr_id, uint32_t threads);
extern void groestl512_cpu_hash(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void xevan_skein512(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void keccak_xevan_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void qubit_luffa512_cpu_hash_80(int thr_id, uint32_t threads, uint32_t startNounce, uint32_t *d_outputHash);
extern void x11_cubehash512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void x11_shavite512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void xevan_shavite512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void x11_echo512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void xevan_echo512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void x11_simd512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void x11_simd512_cpu_init(int thr_id, uint32_t threads);
extern void xevan_simd512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void x13_hamsi512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void x13_fugue512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void xevan_whirlpool_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void xevan_sha512_cpu_hash_64(int thr_id, int threads, uint32_t *d_hash);
extern void xevan_haval512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void quark_blake512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_nonceVector, uint32_t *d_outputHash);
extern void xevan_blake512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void xevan_haval512_cpu_hash_64_final(int thr_id, uint32_t threads, uint32_t *d_hash, uint32_t *resNonce, uint64_t target);
extern void xevan_groestl512_cpu_hash(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void keccak_xevan_cpu_hash_64_A(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void quark_blake512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_nonceVector, uint32_t *d_outputHash);
extern void quark_blake512_cpu_hash_128(int thr_id, uint32_t threads, uint32_t *d_outputHash);
extern void quark_groestl512_cpu_hash_128(int thr_id, uint32_t threads,  uint32_t *d_hash);
extern void x11_luffa512_cpu_hash_128(int thr_id, uint32_t threads,uint32_t *d_hash);



// X17 CPU Hash (Validation)
extern "C" void x17hash(void *output, const void *input)
{
	uint32_t _ALIGN(64) hash[32]; // 128 bytes required
	const int dataLen = 128;
//return;
	sph_blake512_context     ctx_blake;
	sph_bmw512_context       ctx_bmw;
	sph_groestl512_context   ctx_groestl;
	sph_skein512_context     ctx_skein;
	sph_jh512_context        ctx_jh;
	sph_keccak512_context    ctx_keccak;
	sph_luffa512_context     ctx_luffa;
	sph_cubehash512_context  ctx_cubehash;
	sph_shavite512_context   ctx_shavite;
	sph_simd512_context      ctx_simd;
	sph_echo512_context      ctx_echo;
	sph_hamsi512_context     ctx_hamsi;
	sph_fugue512_context     ctx_fugue;
	sph_shabal512_context    ctx_shabal;
	sph_whirlpool_context    ctx_whirlpool;
	sph_sha512_context       ctx_sha512;
	sph_haval256_5_context   ctx_haval;

//print_hash(input,20);
	sph_blake512_init(&ctx_blake);
	sph_blake512(&ctx_blake, input, 80);
	sph_blake512_close(&ctx_blake, hash);
//print_hash(hash,32);
	memset(&hash[16], 0, 64);

	sph_bmw512_init(&ctx_bmw);
	sph_bmw512(&ctx_bmw, hash, dataLen);
	sph_bmw512_close(&ctx_bmw, hash);
//print_hash(hash,32);
//for(int i=0;i<32;i++)hash[i]=0;
	sph_groestl512_init(&ctx_groestl);
	sph_groestl512(&ctx_groestl, hash, dataLen);
	sph_groestl512_close(&ctx_groestl, hash);
//print_hash(hash,32);
//for(int i=0;i<32;i++)hash[i]=0;

	sph_skein512_init(&ctx_skein);
	sph_skein512(&ctx_skein, hash, dataLen);
	sph_skein512_close(&ctx_skein, hash);

//print_hash(hash,32);
//for(int i=0;i<32;i++)hash[i]=0;
	sph_jh512_init(&ctx_jh);
	sph_jh512(&ctx_jh, hash, dataLen);
	sph_jh512_close(&ctx_jh, hash);
//print_hash(hash,32);

	sph_keccak512_init(&ctx_keccak);
	sph_keccak512(&ctx_keccak, hash, dataLen);
	sph_keccak512_close(&ctx_keccak, hash);
//print_hash(hash,32);
//for(int i=0;i<32;i++)hash[i]=0;
	sph_luffa512_init(&ctx_luffa);
	sph_luffa512(&ctx_luffa, hash, dataLen);
	sph_luffa512_close(&ctx_luffa, hash);
//print_hash(hash,32);
//for(int i=0;i<32;i++)hash[i]=0;
	sph_cubehash512_init(&ctx_cubehash);
	sph_cubehash512(&ctx_cubehash, hash, dataLen);
	sph_cubehash512_close(&ctx_cubehash, hash);
//print_hash(hash,32);
	sph_shavite512_init(&ctx_shavite);
	sph_shavite512(&ctx_shavite, hash, dataLen);
	sph_shavite512_close(&ctx_shavite, hash);
//print_hash(hash,32);
	sph_simd512_init(&ctx_simd);
	sph_simd512(&ctx_simd, hash, dataLen);
	sph_simd512_close(&ctx_simd, hash);
//print_hash(hash,32);
	sph_echo512_init(&ctx_echo);
	sph_echo512(&ctx_echo, hash, dataLen);
	sph_echo512_close(&ctx_echo, hash);
//print_hash(hash,32);
//for(int i=0;i<32;i++)hash[i]=0;
	sph_hamsi512_init(&ctx_hamsi);
	sph_hamsi512(&ctx_hamsi, hash, dataLen);
	sph_hamsi512_close(&ctx_hamsi, hash);
//print_hash(hash,32);
//for(int i=0;i<32;i++)hash[i]=0;
	sph_fugue512_init(&ctx_fugue);
	sph_fugue512(&ctx_fugue, hash, dataLen);
	sph_fugue512_close(&ctx_fugue, hash);
//print_hash(hash,32);
	sph_shabal512_init(&ctx_shabal);
	sph_shabal512(&ctx_shabal, hash, dataLen);
	sph_shabal512_close(&ctx_shabal, hash);
//print_hash(hash,32);
	sph_whirlpool_init(&ctx_whirlpool);
	sph_whirlpool(&ctx_whirlpool, hash, dataLen);
	sph_whirlpool_close(&ctx_whirlpool, hash);
//print_hash(hash,32);
//for(int i=0;i<32;i++)hash[i]=0;
	sph_sha512_init(&ctx_sha512);
	sph_sha512(&ctx_sha512,(const void*) hash, dataLen);
	sph_sha512_close(&ctx_sha512,(void*) hash);
//print_hash(hash,32);
	sph_haval256_5_init(&ctx_haval);
	sph_haval256_5(&ctx_haval,(const void*) hash, dataLen);
	sph_haval256_5_close(&ctx_haval, hash);
//print_hash(hash,32);

	memset(&hash[8], 0, dataLen - 32);

	sph_blake512_init(&ctx_blake);
	sph_blake512(&ctx_blake, hash, dataLen);
	sph_blake512_close(&ctx_blake, hash);

//print_hash(hash,32);

	sph_bmw512_init(&ctx_bmw);
	sph_bmw512(&ctx_bmw, hash, dataLen);
	sph_bmw512_close(&ctx_bmw, hash);

	sph_groestl512_init(&ctx_groestl);
	sph_groestl512(&ctx_groestl, hash, dataLen);
	sph_groestl512_close(&ctx_groestl, hash);

	sph_skein512_init(&ctx_skein);
	sph_skein512(&ctx_skein, hash, dataLen);
	sph_skein512_close(&ctx_skein, hash);

	sph_jh512_init(&ctx_jh);
	sph_jh512(&ctx_jh, hash, dataLen);
	sph_jh512_close(&ctx_jh, hash);

	sph_keccak512_init(&ctx_keccak);
	sph_keccak512(&ctx_keccak, hash, dataLen);
	sph_keccak512_close(&ctx_keccak, hash);

	sph_luffa512_init(&ctx_luffa);
	sph_luffa512(&ctx_luffa, hash, dataLen);
	sph_luffa512_close(&ctx_luffa, hash);

	sph_cubehash512_init(&ctx_cubehash);
	sph_cubehash512(&ctx_cubehash, hash, dataLen);
	sph_cubehash512_close(&ctx_cubehash, hash);

	sph_shavite512_init(&ctx_shavite);
	sph_shavite512(&ctx_shavite, hash, dataLen);
	sph_shavite512_close(&ctx_shavite, hash);

	sph_simd512_init(&ctx_simd);
	sph_simd512(&ctx_simd, hash, dataLen);
	sph_simd512_close(&ctx_simd, hash);

	sph_echo512_init(&ctx_echo);
	sph_echo512(&ctx_echo, hash, dataLen);
	sph_echo512_close(&ctx_echo, hash);

	sph_hamsi512_init(&ctx_hamsi);
	sph_hamsi512(&ctx_hamsi, hash, dataLen);
	sph_hamsi512_close(&ctx_hamsi, hash);

	sph_fugue512_init(&ctx_fugue);
	sph_fugue512(&ctx_fugue, hash, dataLen);
	sph_fugue512_close(&ctx_fugue, hash);

	sph_shabal512_init(&ctx_shabal);
	sph_shabal512(&ctx_shabal, hash, dataLen);
	sph_shabal512_close(&ctx_shabal, hash);

	sph_whirlpool_init(&ctx_whirlpool);
	sph_whirlpool(&ctx_whirlpool, hash, dataLen);
	sph_whirlpool_close(&ctx_whirlpool, hash);

	sph_sha512_init(&ctx_sha512);
	sph_sha512(&ctx_sha512,(const void*) hash, dataLen);
	sph_sha512_close(&ctx_sha512,(void*) hash);

//print_hash(hash,32);
	sph_haval256_5_init(&ctx_haval);
	sph_haval256_5(&ctx_haval,(const void*) hash, dataLen);
	sph_haval256_5_close(&ctx_haval, hash);
//print_hash(hash,8);
	memcpy(output, hash, 32);
}

static bool init[MAX_GPUS] = { 0 };


// Debug helper: log `size` 32-bit words of `data` in hex, one gpulog call per word.
void print_hash(unsigned int *data, int size)
{
	for (int i = 0; i < size; i++)
		gpulog(LOG_WARNING, 0, "%x ", data[i]);
	gpulog(LOG_WARNING, 0, "-------------");
}


extern "C" int scanhash_x17(int thr_id, struct work* work, uint32_t max_nonce, unsigned long *hashes_done){

	int dev_id = device_map[thr_id];

	uint32_t *pdata = work->data;
	uint32_t *ptarget = work->target;
	const uint32_t first_nonce = pdata[19];
/*
	uint32_t default_throughput = 1<<20;
	
	if (strstr(device_name[dev_id], "GTX 970")) default_throughput+=256*256*6;
	if (strstr(device_name[dev_id], "GTX 980")) default_throughput =1<<22;
	
	uint32_t throughput = cuda_default_throughput(thr_id, default_throughput); // 19=256*256*8;
*/
	uint32_t default_throughput;
	if(device_sm[dev_id]<=500) default_throughput = 1<<20;
	else if(device_sm[dev_id]<=520) default_throughput = 1<<21;
	else default_throughput = (1<<22) + (1<<21);
	
	if((strstr(device_name[dev_id], "1070")))default_throughput = 1<<20;
	if((strstr(device_name[dev_id], "1080")))default_throughput = 1<<20;
	
	uint32_t throughput = cuda_default_throughput(thr_id, default_throughput); // 19=256*256*8;
	if (init[thr_id]) throughput = min(throughput, max_nonce - first_nonce);

	throughput&=0xFFFFFF80; //multiples of 128 due to simd_echo kernel

	if (opt_benchmark)
		((uint32_t*)ptarget)[7] = 0xff;

	gpulog(LOG_INFO,thr_id,"target %x %x %x",ptarget[5], ptarget[6], ptarget[7]);
	gpulog(LOG_INFO,thr_id,"target %llx",*(uint64_t*)&ptarget[6]);

	if (!init[thr_id])
	{
		cudaSetDevice(device_map[thr_id]);
		if (opt_cudaschedule == -1 && gpu_threads == 1) {
			cudaDeviceReset();
			// reduce cpu usage
			cudaSetDeviceFlags(cudaDeviceScheduleBlockingSync);
//			cudaDeviceSetCacheConfig(cudaFuncCachePreferShared);
		}
		gpulog(LOG_INFO,thr_id, "Intensity set to %g, %u cuda threads", throughput2intensity(throughput), throughput);

//		x11_simd_echo_512_cpu_init(thr_id, throughput);
		x15_whirlpool_cpu_init(thr_id, throughput, 0);
		groestl512_cpu_init(thr_id, throughput);
		x11_simd512_cpu_init(thr_id, throughput);
//for(;;);
		CUDA_SAFE_CALL(cudaMalloc(&d_hash[thr_id], 8 * sizeof(uint64_t) * throughput));
		CUDA_SAFE_CALL(cudaMalloc(&d_resNonce[thr_id], NBN * sizeof(uint32_t)));
		h_resNonce[thr_id] = (uint32_t*) malloc(NBN  * 8 * sizeof(uint32_t));
		if(h_resNonce[thr_id] == NULL){
			gpulog(LOG_ERR,thr_id,"Host memory allocation failed");
			exit(EXIT_FAILURE);
		}		
		init[thr_id] = true;
	}

	uint32_t _ALIGN(64) endiandata[20];
	for (int k=0; k < 20; k++)
		be32enc(&endiandata[k], pdata[k]);
//		endiandata[k]=0;
//	print_hash(endiandata,20);
	quark_blake512_cpu_setBlock_80(thr_id, endiandata);
	cudaMemset(d_resNonce[thr_id], 0xff, NBN*sizeof(uint32_t));
//	x11_simd512_cpu_init(thr_id, throughput);
//	for(;;);
	do {
		// Hash with CUDA


		quark_blake512_cpu_hash_80(thr_id, throughput, pdata[19], d_hash[thr_id]);//A
		quark_groestl512_cpu_hash_128(thr_id, throughput, d_hash[thr_id]);

		quark_skein512_cpu_hash_64(thr_id, throughput, NULL, d_hash[thr_id]);
		quark_jh512_cpu_hash_64(thr_id, throughput, NULL, d_hash[thr_id]);//A //fast
//		keccak_xevan_cpu_hash_64_A(thr_id, throughput,  d_hash[thr_id]);//A

//cudaMemset(d_hash[thr_id], 0x00, 16*sizeof(uint32_t));
//		x11_luffa512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //P
//cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16*sizeof(uint32_t), cudaMemcpyDeviceToHost);
//print_hash(h_resNonce[thr_id],16);
//cudaMemset(d_hash[thr_id], 0x00, 16*sizeof(uint32_t));

		x11_luffa512_cpu_hash_128(thr_id, throughput, d_hash[thr_id]);//A
//cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16*sizeof(uint32_t), cudaMemcpyDeviceToHost);
//print_hash(h_resNonce[thr_id],16);
//for(;;);

		x11_cubehash512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //A 256
		xevan_shavite512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);//P slow r2
		x11_simd512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);  //A slow r3

//                cudaMemset(d_hash[thr_id], 0x00, 16*sizeof(uint32_t));


//		xevan_echo512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //slow r1

//                cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16*sizeof(uint32_t), cudaMemcpyDeviceToHost);
//		print_hash(h_resNonce[thr_id],16);


  //              cudaMemset(d_hash[thr_id], 0x00, 16*sizeof(uint32_t));

		x11_echo512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);//A


//                cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16*sizeof(uint32_t), cudaMemcpyDeviceToHost);
 //               print_hash(h_resNonce[thr_id],16);

//for(;;);

                x13_hamsi512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //fast
		x13_fugue512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //fast ++
		x14_shabal512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //fast
		xevan_whirlpool_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //opt2
		xevan_sha512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //fast
		xevan_haval512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //fast


//		xevan_blake512_cpu_hash_64(thr_id, throughput,  d_hash[thr_id]);//BAD
		quark_blake512_cpu_hash_128(thr_id, throughput, d_hash[thr_id]);//BAD

//
                quark_bmw512_cpu_hash_64x(thr_id, throughput, NULL, d_hash[thr_id]);
//                xevan_groestl512_cpu_hash(thr_id, throughput, d_hash[thr_id]);
		quark_groestl512_cpu_hash_128(thr_id, throughput, d_hash[thr_id]);

//                xevan_skein512(thr_id, throughput, d_hash[thr_id]);
                quark_skein512_cpu_hash_64(thr_id, throughput, NULL, d_hash[thr_id]);

                quark_jh512_cpu_hash_64(thr_id, throughput, NULL, d_hash[thr_id]);
//                keccak_xevan_cpu_hash_64_A(thr_id, throughput,  d_hash[thr_id]);
//                x11_luffa512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);
                x11_luffa512_cpu_hash_128(thr_id, throughput, d_hash[thr_id]);//A

                x11_cubehash512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);
                xevan_shavite512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);//move to shared
                x11_simd512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); 

//                xevan_echo512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);
                x11_echo512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);

                x13_hamsi512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);
                x13_fugue512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);
                x14_shabal512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);
                xevan_whirlpool_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);
                xevan_sha512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);







/*
for(int i = 10000;i< 10016;i++){
                cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][16*i], 16*sizeof(uint32_t), cudaMemcpyDeviceToHost);
print_hash(h_resNonce[thr_id],8);
}
		for(;;);

*/
		xevan_haval512_cpu_hash_64_final(thr_id, throughput, d_hash[thr_id],d_resNonce[thr_id],*(uint64_t*)&ptarget[6]);

		cudaMemcpy(h_resNonce[thr_id], d_resNonce[thr_id], NBN*sizeof(uint32_t), cudaMemcpyDeviceToHost);

		if (h_resNonce[thr_id][0] != UINT32_MAX){
			const uint32_t Htarg = ptarget[7];
			const uint32_t startNounce = pdata[19];
			uint32_t vhash64[8];
			be32enc(&endiandata[19], startNounce + h_resNonce[thr_id][0]);
			x17hash(vhash64, endiandata);
//			*hashes_done = pdata[19] - first_nonce + throughput + 1;
//			pdata[19] = startNounce + h_resNonce[thr_id][0];
			gpulog(LOG_WARNING, 0,"NONCE FOUND ");
//			return 1;
			if (vhash64[7] <= Htarg && fulltest(vhash64, ptarget)) {
				int res = 1;
				*hashes_done = pdata[19] - first_nonce + throughput + 1;
				work_set_target_ratio(work, vhash64);
				pdata[19] = startNounce + h_resNonce[thr_id][0];
				if (h_resNonce[thr_id][1] != UINT32_MAX) {
					pdata[21] = startNounce+h_resNonce[thr_id][1];
					if(!opt_quiet)
						gpulog(LOG_BLUE,dev_id,"Found 2nd nonce: %08x", pdata[21]);
					be32enc(&endiandata[19], pdata[21]);
					x17hash(vhash64, endiandata);
					if (bn_hash_target_ratio(vhash64, ptarget) > work->shareratio[0]){
						work_set_target_ratio(work, vhash64);
						xchg(pdata[19],pdata[21]);
					}
					res++;
				}
				return res;
			}
			else {
				gpulog(LOG_WARNING, thr_id, "result for %08x does not validate on CPU!", h_resNonce[thr_id][0]);
				cudaMemset(d_resNonce[thr_id], 0xff, NBN*sizeof(uint32_t));				
			}
		}

		pdata[19] += throughput;
	} while (!work_restart[thr_id].restart && ((uint64_t)max_nonce > (uint64_t)throughput + pdata[19]));

	*hashes_done = pdata[19] - first_nonce + 1;

	return 0;
}

// cleanup
extern "C" void free_x17(int thr_id)
{
	if (!init[thr_id])
		return;

	cudaDeviceSynchronize();

	free(h_resNonce[thr_id]);
	cudaFree(d_resNonce[thr_id]);
	cudaFree(d_hash[thr_id]);

	x11_simd_echo_512_cpu_free(thr_id);
	x15_whirlpool_cpu_free(thr_id);
	cudaDeviceSynchronize();
	init[thr_id] = false;
}
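
The CPU re-check near the end of scanhash_x17 accepts a nonce only if the recomputed hash is at or below the share target. Below is a rough sketch of that kind of comparison. It assumes both values are arrays of eight 32-bit words with the most significant word at index 7, which matches the quick `vhash64[7] <= Htarg` test in the listing. `hash_meets_target` is a hypothetical stand-in written for illustration, not the miner's actual fulltest().

```cpp
#include <cstdint>

// Hypothetical stand-in for a full target check: returns true when the
// 256-bit hash is less than or equal to the 256-bit target.
// Assumption: word 7 is the most significant word of both values.
bool hash_meets_target(const uint32_t hash[8], const uint32_t target[8]) {
    for (int i = 7; i >= 0; --i) {
        if (hash[i] < target[i]) return true;   // strictly below: accept
        if (hash[i] > target[i]) return false;  // strictly above: reject
    }
    return true;  // every word equal: hash == target still passes
}
```

Comparing word 7 first is why a cheap single-word test can reject most candidates before the full comparison runs.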

AltCoin-Mining @ Xubuntu 16.04 LTS

sickofscamcoins

Newbie

Offline

Activity: 7
Merit: 0

Re: Krnlx Nvidia xevan miner - 3.3+ mh on 1070, ~6mh on 80ti FREE, OPENSOURCE

October 05, 2017, 04:42:56 PM

#74

Quote from: palgin on October 05, 2017, 04:37:47 PM

Quote from: sickofscamcoins on October 05, 2017, 04:36:25 PM

http://paste.ubuntu.com/25680610/

These are the output results after replacing x17.cu with the one palgin posted and recompiling.

Thank you, will check it!

http://paste.ubuntu.com/25680643/

No escape characters this time.

palgin

Sr. Member

Offline

Activity: 266
Merit: 250

Re: Krnlx Nvidia xevan miner - 3.3+ mh on 1070, ~6mh on 80ti FREE, OPENSOURCE

October 05, 2017, 04:51:19 PM

#75

UPDATE: The hashing output is exactly the same as on Win, so this turns out to be a very weird problem. One thing I suspect is a difference in how the compilers handle ulong and uint on Win and on *nix: for example, on *nix ulong is the same size as uint64_t in a 64-bit build but not in a 32-bit build, while on Win it stays 32-bit either way. I'll look in this direction; maybe I'll come up with something.
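
The integer-width suspicion can be checked directly on each platform. Below is a minimal sketch; the helper names `bits_of`, `ulong_is_64bit` and `print_type_widths` are made up for illustration and are not part of ccminer. 64-bit Linux builds normally use the LP64 model, where unsigned long is 64 bits, while MSVC keeps unsigned long at 32 bits in both x86 and x64 builds, so code that treats ulong as uint64_t can behave differently between the two.

```cpp
#include <climits>
#include <cstdint>
#include <cstdio>

// Width of a type in bits on the current build.
template <typename T>
constexpr int bits_of() {
    return static_cast<int>(sizeof(T) * CHAR_BIT);
}

// uint64_t is exactly 64 bits wherever it exists; unsigned long is only
// guaranteed to be at least 32 bits, so the two may or may not match.
static_assert(sizeof(uint64_t) * CHAR_BIT == 64, "uint64_t is 64 bits");
static_assert(sizeof(unsigned long) * CHAR_BIT >= 32, "unsigned long is >= 32 bits");

// True when unsigned long has the same width as uint64_t on this build.
constexpr bool ulong_is_64bit() {
    return bits_of<unsigned long>() == bits_of<uint64_t>();
}

// Print the widths so output from different platforms can be compared.
void print_type_widths() {
    std::printf("unsigned int       : %d bits\n", bits_of<unsigned int>());
    std::printf("unsigned long      : %d bits\n", bits_of<unsigned long>());
    std::printf("unsigned long long : %d bits\n", bits_of<unsigned long long>());
    std::printf("uint64_t           : %d bits\n", bits_of<uint64_t>());
}
```

Calling `print_type_widths()` from a small test program built with each toolchain shows whether the two builds agree. Using the fixed-width types from `<cstdint>` in shared code sidesteps the question.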

sickofscamcoins

Newbie

Offline

Activity: 7
Merit: 0

Re: Krnlx Nvidia xevan miner - 3.3+ mh on 1070, ~6mh on 80ti FREE, OPENSOURCE

October 05, 2017, 04:58:52 PM

#76

Welp, I tried for my cookie anyway. If there's anything else I can do to help, lmk.

Nesp

Full Member

Offline

Activity: 124
Merit: 100

Re: Krnlx Nvidia xevan miner - 3.3+ mh on 1070, ~6mh on 80ti FREE, OPENSOURCE

October 05, 2017, 05:57:04 PM

#77

Any binaries?

anorganix

Copper Member
Sr. Member

Offline

Activity: 970
Merit: 287

Per aspera ad astra

Re: Krnlx Nvidia xevan miner - 3.3+ mh on 1070, ~6mh on 80ti FREE, OPENSOURCE

October 05, 2017, 05:59:06 PM

#78

Quote from: Nesp on October 05, 2017, 05:57:04 PM

Any binaries?

Yes, we have 0 and 1.

I will never send private messages with payment requests for my auctions. I only communicate transparently via the forum (not Telegram, Discord, Skype & others). Please be wary of scammers.

palgin

Sr. Member

Offline

Activity: 266
Merit: 250

Re: Krnlx Nvidia xevan miner - 3.3+ mh on 1070, ~6mh on 80ti FREE, OPENSOURCE

October 05, 2017, 06:01:08 PM
Last edit: October 05, 2017, 06:18:28 PM by palgin

#79

Next part of the check, now the GPU part. Same procedure as before; the x17.cu to use is the following:

Code:

/**
 * X17 algorithm (X15 + sha512 + haval256)
 */

extern "C" {
#include "sph/sph_blake.h"
#include "sph/sph_bmw.h"
#include "sph/sph_groestl.h"
#include "sph/sph_skein.h"
#include "sph/sph_jh.h"
#include "sph/sph_keccak.h"

#include "sph/sph_luffa.h"
#include "sph/sph_cubehash.h"
#include "sph/sph_shavite.h"
#include "sph/sph_simd.h"
#include "sph/sph_echo.h"

#include "sph/sph_hamsi.h"
#include "sph/sph_fugue.h"

#include "sph/sph_shabal.h"
#include "sph/sph_whirlpool.h"

#include "sph/sph_sha2.h"
#include "sph/sph_haval.h"
}

#include "miner.h"
#include "cuda_helper.h"
#include "x11/cuda_x11.h"

#define NBN 2

// Memory for the hash functions
static uint32_t *d_hash[MAX_GPUS];
static uint32_t *d_resNonce[MAX_GPUS];
static uint32_t *h_resNonce[MAX_GPUS];

extern void x13_hamsi_fugue512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);

extern void x14_shabal512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);

extern void x15_whirlpool_cpu_init(int thr_id, uint32_t threads, int mode);
extern void x15_whirlpool_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void x15_whirlpool_cpu_free(int thr_id);

extern void x17_sha512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);

extern void x17_haval256_cpu_hash_64_final(int thr_id, uint32_t threads, uint32_t *d_hash, uint32_t* resNonce, uint64_t target);
extern void bmw256_cpu_hash_32_full(int thr_id, uint32_t threads, uint32_t *g_hash);
extern void quark_bmw512_cpu_hash_64x(int thr_id, uint32_t threads, uint32_t *d_nonceVector, uint32_t *d_hash);
extern void quark_groestl512(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void groestl512_cpu_init(int thr_id, uint32_t threads);
extern void groestl512_cpu_hash(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void xevan_skein512(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void keccak_xevan_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void qubit_luffa512_cpu_hash_80(int thr_id, uint32_t threads, uint32_t startNounce, uint32_t *d_outputHash);
extern void x11_cubehash512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void x11_shavite512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void xevan_shavite512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void x11_echo512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void xevan_echo512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void x11_simd512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void x11_simd512_cpu_init(int thr_id, uint32_t threads);
extern void xevan_simd512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void x13_hamsi512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void x13_fugue512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void xevan_whirlpool_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void xevan_sha512_cpu_hash_64(int thr_id, int threads, uint32_t *d_hash);
extern void xevan_haval512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void quark_blake512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_nonceVector, uint32_t *d_outputHash);
extern void xevan_blake512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void xevan_haval512_cpu_hash_64_final(int thr_id, uint32_t threads, uint32_t *d_hash, uint32_t *resNonce, uint64_t target);
extern void xevan_groestl512_cpu_hash(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void keccak_xevan_cpu_hash_64_A(int thr_id, uint32_t threads, uint32_t *d_hash);
extern void quark_blake512_cpu_hash_64(int thr_id, uint32_t threads, uint32_t *d_nonceVector, uint32_t *d_outputHash);
extern void quark_blake512_cpu_hash_128(int thr_id, uint32_t threads, uint32_t *d_outputHash);
extern void quark_groestl512_cpu_hash_128(int thr_id, uint32_t threads,  uint32_t *d_hash);
extern void x11_luffa512_cpu_hash_128(int thr_id, uint32_t threads,uint32_t *d_hash);



// X17 CPU Hash (Validation)
extern "C" void x17hash(void *output, const void *input)
{
	uint32_t _ALIGN(64) hash[32]; // 128 bytes required
	const int dataLen = 128;
//return;
	sph_blake512_context     ctx_blake;
	sph_bmw512_context       ctx_bmw;
	sph_groestl512_context   ctx_groestl;
	sph_skein512_context     ctx_skein;
	sph_jh512_context        ctx_jh;
	sph_keccak512_context    ctx_keccak;
	sph_luffa512_context     ctx_luffa;
	sph_cubehash512_context  ctx_cubehash;
	sph_shavite512_context   ctx_shavite;
	sph_simd512_context      ctx_simd;
	sph_echo512_context      ctx_echo;
	sph_hamsi512_context     ctx_hamsi;
	sph_fugue512_context     ctx_fugue;
	sph_shabal512_context    ctx_shabal;
	sph_whirlpool_context    ctx_whirlpool;
	sph_sha512_context       ctx_sha512;
	sph_haval256_5_context   ctx_haval;

//print_hash(input,20);
	sph_blake512_init(&ctx_blake);
	sph_blake512(&ctx_blake, input, 80);
	sph_blake512_close(&ctx_blake, hash);
//print_hash(hash,32);
	memset(&hash[16], 0, 64);

	sph_bmw512_init(&ctx_bmw);
	sph_bmw512(&ctx_bmw, hash, dataLen);
	sph_bmw512_close(&ctx_bmw, hash);
//print_hash(hash,32);
//for(int i=0;i<32;i++)hash[i]=0;
	sph_groestl512_init(&ctx_groestl);
	sph_groestl512(&ctx_groestl, hash, dataLen);
	sph_groestl512_close(&ctx_groestl, hash);
//print_hash(hash,32);
//for(int i=0;i<32;i++)hash[i]=0;

	sph_skein512_init(&ctx_skein);
	sph_skein512(&ctx_skein, hash, dataLen);
	sph_skein512_close(&ctx_skein, hash);

//print_hash(hash,32);
//for(int i=0;i<32;i++)hash[i]=0;
	sph_jh512_init(&ctx_jh);
	sph_jh512(&ctx_jh, hash, dataLen);
	sph_jh512_close(&ctx_jh, hash);
//print_hash(hash,32);

	sph_keccak512_init(&ctx_keccak);
	sph_keccak512(&ctx_keccak, hash, dataLen);
	sph_keccak512_close(&ctx_keccak, hash);
//print_hash(hash,32);
//for(int i=0;i<32;i++)hash[i]=0;
	sph_luffa512_init(&ctx_luffa);
	sph_luffa512(&ctx_luffa, hash, dataLen);
	sph_luffa512_close(&ctx_luffa, hash);
//print_hash(hash,32);
//for(int i=0;i<32;i++)hash[i]=0;
	sph_cubehash512_init(&ctx_cubehash);
	sph_cubehash512(&ctx_cubehash, hash, dataLen);
	sph_cubehash512_close(&ctx_cubehash, hash);
//print_hash(hash,32);
	sph_shavite512_init(&ctx_shavite);
	sph_shavite512(&ctx_shavite, hash, dataLen);
	sph_shavite512_close(&ctx_shavite, hash);
//print_hash(hash,32);
	sph_simd512_init(&ctx_simd);
	sph_simd512(&ctx_simd, hash, dataLen);
	sph_simd512_close(&ctx_simd, hash);
//print_hash(hash,32);
	sph_echo512_init(&ctx_echo);
	sph_echo512(&ctx_echo, hash, dataLen);
	sph_echo512_close(&ctx_echo, hash);
//print_hash(hash,32);
//for(int i=0;i<32;i++)hash[i]=0;
	sph_hamsi512_init(&ctx_hamsi);
	sph_hamsi512(&ctx_hamsi, hash, dataLen);
	sph_hamsi512_close(&ctx_hamsi, hash);
//print_hash(hash,32);
//for(int i=0;i<32;i++)hash[i]=0;
	sph_fugue512_init(&ctx_fugue);
	sph_fugue512(&ctx_fugue, hash, dataLen);
	sph_fugue512_close(&ctx_fugue, hash);
//print_hash(hash,32);
	sph_shabal512_init(&ctx_shabal);
	sph_shabal512(&ctx_shabal, hash, dataLen);
	sph_shabal512_close(&ctx_shabal, hash);
//print_hash(hash,32);
	sph_whirlpool_init(&ctx_whirlpool);
	sph_whirlpool(&ctx_whirlpool, hash, dataLen);
	sph_whirlpool_close(&ctx_whirlpool, hash);
//print_hash(hash,32);
//for(int i=0;i<32;i++)hash[i]=0;
	sph_sha512_init(&ctx_sha512);
	sph_sha512(&ctx_sha512,(const void*) hash, dataLen);
	sph_sha512_close(&ctx_sha512,(void*) hash);
//print_hash(hash,32);
	sph_haval256_5_init(&ctx_haval);
	sph_haval256_5(&ctx_haval,(const void*) hash, dataLen);
	sph_haval256_5_close(&ctx_haval, hash);
//print_hash(hash,32);

	memset(&hash[8], 0, dataLen - 32);

	sph_blake512_init(&ctx_blake);
	sph_blake512(&ctx_blake, hash, dataLen);
	sph_blake512_close(&ctx_blake, hash);

//print_hash(hash,32);

	sph_bmw512_init(&ctx_bmw);
	sph_bmw512(&ctx_bmw, hash, dataLen);
	sph_bmw512_close(&ctx_bmw, hash);

	sph_groestl512_init(&ctx_groestl);
	sph_groestl512(&ctx_groestl, hash, dataLen);
	sph_groestl512_close(&ctx_groestl, hash);

	sph_skein512_init(&ctx_skein);
	sph_skein512(&ctx_skein, hash, dataLen);
	sph_skein512_close(&ctx_skein, hash);

	sph_jh512_init(&ctx_jh);
	sph_jh512(&ctx_jh, hash, dataLen);
	sph_jh512_close(&ctx_jh, hash);

	sph_keccak512_init(&ctx_keccak);
	sph_keccak512(&ctx_keccak, hash, dataLen);
	sph_keccak512_close(&ctx_keccak, hash);

	sph_luffa512_init(&ctx_luffa);
	sph_luffa512(&ctx_luffa, hash, dataLen);
	sph_luffa512_close(&ctx_luffa, hash);

	sph_cubehash512_init(&ctx_cubehash);
	sph_cubehash512(&ctx_cubehash, hash, dataLen);
	sph_cubehash512_close(&ctx_cubehash, hash);

	sph_shavite512_init(&ctx_shavite);
	sph_shavite512(&ctx_shavite, hash, dataLen);
	sph_shavite512_close(&ctx_shavite, hash);

	sph_simd512_init(&ctx_simd);
	sph_simd512(&ctx_simd, hash, dataLen);
	sph_simd512_close(&ctx_simd, hash);

	sph_echo512_init(&ctx_echo);
	sph_echo512(&ctx_echo, hash, dataLen);
	sph_echo512_close(&ctx_echo, hash);

	sph_hamsi512_init(&ctx_hamsi);
	sph_hamsi512(&ctx_hamsi, hash, dataLen);
	sph_hamsi512_close(&ctx_hamsi, hash);

	sph_fugue512_init(&ctx_fugue);
	sph_fugue512(&ctx_fugue, hash, dataLen);
	sph_fugue512_close(&ctx_fugue, hash);

	sph_shabal512_init(&ctx_shabal);
	sph_shabal512(&ctx_shabal, hash, dataLen);
	sph_shabal512_close(&ctx_shabal, hash);

	sph_whirlpool_init(&ctx_whirlpool);
	sph_whirlpool(&ctx_whirlpool, hash, dataLen);
	sph_whirlpool_close(&ctx_whirlpool, hash);

	sph_sha512_init(&ctx_sha512);
	sph_sha512(&ctx_sha512,(const void*) hash, dataLen);
	sph_sha512_close(&ctx_sha512,(void*) hash);

//print_hash(hash,32);
	sph_haval256_5_init(&ctx_haval);
	sph_haval256_5(&ctx_haval,(const void*) hash, dataLen);
	sph_haval256_5_close(&ctx_haval, hash);
//print_hash(hash,8);
	memcpy(output, hash, 32);
}

static bool init[MAX_GPUS] = { 0 };


// Debug helper: log `size` 32-bit words in hex, one per log line, then a separator.
void print_hash(unsigned int *data, int size)
{
	for (int i = 0; i < size; i++)
		gpulog(LOG_WARNING, 0, "%x ", data[i]);
	gpulog(LOG_WARNING, 0, "-------------");
}


extern "C" int scanhash_x17(int thr_id, struct work* work, uint32_t max_nonce, unsigned long *hashes_done){

	int dev_id = device_map[thr_id];

	uint32_t *pdata = work->data;
	uint32_t *ptarget = work->target;
	const uint32_t first_nonce = pdata[19];
/*
	uint32_t default_throughput = 1<<20;
	
	if (strstr(device_name[dev_id], "GTX 970")) default_throughput+=256*256*6;
	if (strstr(device_name[dev_id], "GTX 980")) default_throughput =1<<22;
	
	uint32_t throughput = cuda_default_throughput(thr_id, default_throughput); // 19=256*256*8;
*/
	uint32_t default_throughput;
	if(device_sm[dev_id]<=500) default_throughput = 1<<20;
	else if(device_sm[dev_id]<=520) default_throughput = 1<<21;
	else if(device_sm[dev_id]>520) default_throughput = (1<<22) + (1<<21);
	
	if((strstr(device_name[dev_id], "1070")))default_throughput = 1<<20;
	if((strstr(device_name[dev_id], "1080")))default_throughput = 1<<20;
	
	uint32_t throughput = cuda_default_throughput(thr_id, default_throughput); // 19=256*256*8;
	if (init[thr_id]) throughput = min(throughput, max_nonce - first_nonce);

	throughput&=0xFFFFFF80; //multiples of 128 due to simd_echo kernel

	if (opt_benchmark)
		((uint32_t*)ptarget)[7] = 0xff;

	gpulog(LOG_INFO,thr_id,"target %x %x %x",ptarget[5], ptarget[6], ptarget[7]);
	gpulog(LOG_INFO,thr_id,"target %llx",(unsigned long long)*(uint64_t*)&ptarget[6]);

	if (!init[thr_id])
	{
		cudaSetDevice(device_map[thr_id]);
		if (opt_cudaschedule == -1 && gpu_threads == 1) {
			cudaDeviceReset();
			// reduce cpu usage
			cudaSetDeviceFlags(cudaDeviceScheduleBlockingSync);
//			cudaDeviceSetCacheConfig(cudaFuncCachePreferShared);
		}
		gpulog(LOG_INFO,thr_id, "Intensity set to %g, %u cuda threads", throughput2intensity(throughput), throughput);

//		x11_simd_echo_512_cpu_init(thr_id, throughput);
		x15_whirlpool_cpu_init(thr_id, throughput, 0);
		groestl512_cpu_init(thr_id, throughput);
		x11_simd512_cpu_init(thr_id, throughput);
//for(;;);
		CUDA_SAFE_CALL(cudaMalloc(&d_hash[thr_id], 8 * sizeof(uint64_t) * throughput));
		CUDA_SAFE_CALL(cudaMalloc(&d_resNonce[thr_id], NBN * sizeof(uint32_t)));
		h_resNonce[thr_id] = (uint32_t*) malloc(NBN  * 8 * sizeof(uint32_t));
		if(h_resNonce[thr_id] == NULL){
			gpulog(LOG_ERR,thr_id,"Host memory allocation failed");
			exit(EXIT_FAILURE);
		}		
		init[thr_id] = true;
	}

	uint32_t _ALIGN(64) endiandata[20];
	for (int k=0; k < 20; k++)
		be32enc(&endiandata[k], pdata[k]);
//		endiandata[k]=0;
//	print_hash(endiandata,20);
	quark_blake512_cpu_setBlock_80(thr_id, endiandata);
	cudaMemset(d_resNonce[thr_id], 0xff, NBN*sizeof(uint32_t));
//	x11_simd512_cpu_init(thr_id, throughput);
//	for(;;);
	do {
		// Hash with CUDA


		quark_blake512_cpu_hash_80(thr_id, throughput, pdata[19], d_hash[thr_id]);//A

		cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
		print_hash(h_resNonce[thr_id], 16);
		cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));

		quark_groestl512_cpu_hash_128(thr_id, throughput, d_hash[thr_id]);

		cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16*sizeof(uint32_t), cudaMemcpyDeviceToHost);
		print_hash(h_resNonce[thr_id],16);
		cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));

		quark_skein512_cpu_hash_64(thr_id, throughput, NULL, d_hash[thr_id]);

		cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
		print_hash(h_resNonce[thr_id], 16);
		cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));

		quark_jh512_cpu_hash_64(thr_id, throughput, NULL, d_hash[thr_id]);//A //fast

		cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
		print_hash(h_resNonce[thr_id], 16);
		cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));
//		keccak_xevan_cpu_hash_64_A(thr_id, throughput,  d_hash[thr_id]);//A

//cudaMemset(d_hash[thr_id], 0x00, 16*sizeof(uint32_t));
//		x11_luffa512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //P
//cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16*sizeof(uint32_t), cudaMemcpyDeviceToHost);
//print_hash(h_resNonce[thr_id],16);
//cudaMemset(d_hash[thr_id], 0x00, 16*sizeof(uint32_t));

		x11_luffa512_cpu_hash_128(thr_id, throughput, d_hash[thr_id]);//A

		cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
		print_hash(h_resNonce[thr_id], 16);
		cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));

//cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16*sizeof(uint32_t), cudaMemcpyDeviceToHost);
//print_hash(h_resNonce[thr_id],16);
//for(;;);

		x11_cubehash512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //A 256

		cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
		print_hash(h_resNonce[thr_id], 16);
		cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));

		xevan_shavite512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);//P slow r2

		cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
		print_hash(h_resNonce[thr_id], 16);
		cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));

                x11_simd512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);  //A slow r3

				cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
				print_hash(h_resNonce[thr_id], 16);
				cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));


//                cudaMemset(d_hash[thr_id], 0x00, 16*sizeof(uint32_t));


//		xevan_echo512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //slow r1

//                cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16*sizeof(uint32_t), cudaMemcpyDeviceToHost);
//		print_hash(h_resNonce[thr_id],16);


  //              cudaMemset(d_hash[thr_id], 0x00, 16*sizeof(uint32_t));

		x11_echo512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);//A

		cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
		print_hash(h_resNonce[thr_id], 16);
		cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));



//                cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16*sizeof(uint32_t), cudaMemcpyDeviceToHost);
 //               print_hash(h_resNonce[thr_id],16);

//for(;;);

                x13_hamsi512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //fast

				cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
				print_hash(h_resNonce[thr_id], 16);
				cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));

		x13_fugue512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //fast ++

		cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
		print_hash(h_resNonce[thr_id], 16);
		cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));

		x14_shabal512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //fast

		cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
		print_hash(h_resNonce[thr_id], 16);
		cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));

		xevan_whirlpool_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //opt2

		cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
		print_hash(h_resNonce[thr_id], 16);
		cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));

		xevan_sha512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //fast

		cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
		print_hash(h_resNonce[thr_id], 16);
		cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));

		xevan_haval512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); //fast

		cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
		print_hash(h_resNonce[thr_id], 16);
		cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));



//		xevan_blake512_cpu_hash_64(thr_id, throughput,  d_hash[thr_id]);//BAD
		quark_blake512_cpu_hash_128(thr_id, throughput, d_hash[thr_id]);//BAD

		cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
		print_hash(h_resNonce[thr_id], 16);
		cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));


//
                quark_bmw512_cpu_hash_64x(thr_id, throughput, NULL, d_hash[thr_id]);

				cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
				print_hash(h_resNonce[thr_id], 16);
				cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));

//                xevan_groestl512_cpu_hash(thr_id, throughput, d_hash[thr_id]);
		quark_groestl512_cpu_hash_128(thr_id, throughput, d_hash[thr_id]);

		cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
		print_hash(h_resNonce[thr_id], 16);


//                xevan_skein512(thr_id, throughput, d_hash[thr_id]);
                quark_skein512_cpu_hash_64(thr_id, throughput, NULL, d_hash[thr_id]);

				cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
				print_hash(h_resNonce[thr_id], 16);
				cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));


                quark_jh512_cpu_hash_64(thr_id, throughput, NULL, d_hash[thr_id]);

				cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
				print_hash(h_resNonce[thr_id], 16);
				cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));

//                keccak_xevan_cpu_hash_64_A(thr_id, throughput,  d_hash[thr_id]);
//                x11_luffa512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);
                x11_luffa512_cpu_hash_128(thr_id, throughput, d_hash[thr_id]);//A

				cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
				print_hash(h_resNonce[thr_id], 16);
				cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));


                x11_cubehash512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);

				cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
				print_hash(h_resNonce[thr_id], 16);
				cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));

                xevan_shavite512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);//move to shared

				cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
				print_hash(h_resNonce[thr_id], 16);
				cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));

                x11_simd512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]); 

				cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
				print_hash(h_resNonce[thr_id], 16);
				cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));


//                xevan_echo512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);
                x11_echo512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);

				cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
				print_hash(h_resNonce[thr_id], 16);
				cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));


                x13_hamsi512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);

				cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
				print_hash(h_resNonce[thr_id], 16);
				cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));

                x13_fugue512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);

				cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
				print_hash(h_resNonce[thr_id], 16);
				cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));

                x14_shabal512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);

				cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
				print_hash(h_resNonce[thr_id], 16);
				cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));

                xevan_whirlpool_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);

				cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
				print_hash(h_resNonce[thr_id], 16);
				cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));

                xevan_sha512_cpu_hash_64(thr_id, throughput, d_hash[thr_id]);

				cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][0], 16 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
				print_hash(h_resNonce[thr_id], 16);
				cudaMemset(d_hash[thr_id], 0x00, 16 * sizeof(uint32_t));








/*
for(int i = 10000;i< 10016;i++){
                cudaMemcpy(h_resNonce[thr_id], &d_hash[thr_id][16*i], 16*sizeof(uint32_t), cudaMemcpyDeviceToHost);
print_hash(h_resNonce[thr_id],8);
}
		for(;;);

*/
		xevan_haval512_cpu_hash_64_final(thr_id, throughput, d_hash[thr_id],d_resNonce[thr_id],*(uint64_t*)&ptarget[6]);

		cudaMemcpy(h_resNonce[thr_id], d_resNonce[thr_id], NBN*sizeof(uint32_t), cudaMemcpyDeviceToHost);

		print_hash(h_resNonce[thr_id], NBN); // only NBN words were copied back above

		if (h_resNonce[thr_id][0] != UINT32_MAX){
			const uint32_t Htarg = ptarget[7];
			const uint32_t startNounce = pdata[19];
			uint32_t vhash64[8];
			be32enc(&endiandata[19], startNounce + h_resNonce[thr_id][0]);
			x17hash(vhash64, endiandata);
//			*hashes_done = pdata[19] - first_nonce + throughput + 1;
//			pdata[19] = startNounce + h_resNonce[thr_id][0];
			gpulog(LOG_WARNING, 0,"NONCE FOUND ");
//			return 1;
			if (vhash64[7] <= Htarg && fulltest(vhash64, ptarget)) {
				int res = 1;
				*hashes_done = pdata[19] - first_nonce + throughput + 1;
				work_set_target_ratio(work, vhash64);
				pdata[19] = startNounce + h_resNonce[thr_id][0];
				if (h_resNonce[thr_id][1] != UINT32_MAX) {
					pdata[21] = startNounce+h_resNonce[thr_id][1];
					if(!opt_quiet)
						gpulog(LOG_BLUE,dev_id,"Found 2nd nonce: %08x", pdata[21]);
					be32enc(&endiandata[19], pdata[21]);
					x17hash(vhash64, endiandata);
					if (bn_hash_target_ratio(vhash64, ptarget) > work->shareratio[0]){
						work_set_target_ratio(work, vhash64);
						xchg(pdata[19],pdata[21]);
					}
					res++;
				}
				return res;
			}
			else {
				gpulog(LOG_WARNING, thr_id, "result for %08x does not validate on CPU!", h_resNonce[thr_id][0]);
				cudaMemset(d_resNonce[thr_id], 0xff, NBN*sizeof(uint32_t));				
			}
		}

		pdata[19] += throughput;
	} while (!work_restart[thr_id].restart && ((uint64_t)max_nonce > (uint64_t)throughput + pdata[19]));

	*hashes_done = pdata[19] - first_nonce + 1;

	return 0;
}

// cleanup
extern "C" void free_x17(int thr_id)
{
	if (!init[thr_id])
		return;

	cudaDeviceSynchronize();

	free(h_resNonce[thr_id]);
	cudaFree(d_resNonce[thr_id]);
	cudaFree(d_hash[thr_id]);

	x11_simd_echo_512_cpu_free(thr_id);
	x15_whirlpool_cpu_free(thr_id);
	cudaDeviceSynchronize();
	init[thr_id] = false;
}
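
The listing rounds the thread count with a bit mask so that it is a multiple of 128, as its comment about the simd/echo kernel says. Below is a small sketch of that rounding for any power-of-two granularity. `round_down_pow2` is a hypothetical helper written for illustration, not something from ccminer.

```cpp
#include <cstdint>

// Round `value` down to a multiple of `granularity`.
// Assumption: `granularity` is a power of two (128 in the listing's comment).
constexpr uint32_t round_down_pow2(uint32_t value, uint32_t granularity) {
    return value & ~(granularity - 1u);
}

// For 128 the mask clears the low seven bits.
static_assert(static_cast<uint32_t>(~(128u - 1u)) == 0xFFFFFF80u,
              "mask for multiples of 128");
```

With the default throughputs in the listing (powers of two from 1<<20 upward) the mask changes nothing; it only matters when a custom intensity produces an odd thread count.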

Thanks to everybody who is helping with the investigation.

UPDATE: I think I've found the bug. It's in the xevan_haval512_cpu_hash_64_final function: h_resNonce[0] and h_resNonce[1] are always random on Win with exactly the same input data... I need confirmation that the behaviour is different on *nix, of course. A CUDA memcpy bug?
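
When comparing logs for this, it helps to look only at the slots that can hold a result. In the listing the device result buffer is filled with 0xff bytes, so an untouched slot reads back as UINT32_MAX, and only NBN words are copied back after the final kernel. Below is a minimal host-side sketch of reading those slots. `collect_nonces` is a hypothetical helper for illustration; unlike the listing, it checks each slot independently instead of looking at slot 1 only when slot 0 is set.

```cpp
#include <cstdint>
#include <vector>

constexpr int NBN = 2;  // number of result slots, same value as in the listing

// Hypothetical helper: given the NBN words copied back from the GPU and the
// start nonce of the batch, return the absolute nonces of the filled slots.
// A slot that still holds UINT32_MAX (the 0xff fill) is treated as empty.
std::vector<uint32_t> collect_nonces(const uint32_t res[NBN], uint32_t start_nonce) {
    std::vector<uint32_t> found;
    for (int i = 0; i < NBN; ++i) {
        if (res[i] != UINT32_MAX)
            found.push_back(start_nonce + res[i]);
    }
    return found;
}
```

This does not explain random values on Win by itself. It only separates meaningful slots from empty ones, so that outputs from Win and *nix can be compared word for word.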

Nesp

Full Member

Offline

Activity: 124
Merit: 100

Re: Krnlx Nvidia xevan miner - 3.3+ mh on 1070, ~6mh on 80ti FREE, OPENSOURCE

October 05, 2017, 06:18:19 PM

#80

Quote from: anorganix on October 05, 2017, 05:59:06 PM

Quote from: Nesp on October 05, 2017, 05:57:04 PM

Any binaries?

Yes, we have 0 and 1.

Hahah, nice one
