Paid software project request

Hello,
we are looking for a developer who can modify an existing AMD/NVIDIA OpenCL program so that it runs on an FPGA board instead of a GPU, in order to gain a significant performance boost while performing the same calculations.

The software in question is a cryptocurrency miner, so it basically performs cryptographic hash calculations.

The source code is here: https://github.com/OhGodAPet/wolf-xmr-miner
A docker file is also available here: https://github.com/minecoins/docker-wolf-xmr-miner
This software has many .cl kernels, one for each cryptocurrency algorithm to be mined.
But the only kernel we ask you to modify to run on an FPGA is cryptonight.cl, which computes the Cryptonight algorithm (currencies: Monero, Bytecoin and others).

We expect you to provide us with the source and Docker file of your software, which must be able to calculate at least 100,000 hashes/second on an FPGA.

The FPGA we are considering is the Nallatech 510T, but before confirming that, we would like you to suggest the board you think offers the best performance/cost ratio for this task.

You can also suggest a different mining software to modify instead of the one above, if you think it would be easier to get a higher mining profit with your alternative. It can even be for a different algorithm/currency.

Candidates for this project should give me:

  • Resume
  • Estimate of development cost and time
    Either here or via PM

Thanks!

Cryptonight is designed to be memory bound. This means two things.

  1. Hashrate cannot exceed that of a GPU with the same memory bandwidth. If you’re serious about the project, consider using a board with HBM like this one: https://www.xilinx.com/products/silicon-devices/fpga/virtex-ultrascale-plus.html
  2. The only advantage of an FPGA over a GPU would be power efficiency. But even this advantage may be slim: since the algorithm is memory-bound, the main power hog of a GPU, its compute units, will be stalled for part of the time, consuming practically nothing.

I cannot give you any estimates without measurements, but please be wary of the astronomical numbers some may offer.
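
To put rough numbers on the memory-bound argument, here is a back-of-the-envelope sketch in Python. The CryptoNight parameters in it (2 MiB scratchpad, 524,288 main-loop iterations, four 16-byte accesses per iteration) are assumptions about the algorithm, not figures from this thread, and the function names are made up for illustration. It ignores memory latency and DRAM burst granularity, so real hardware should land well below this ceiling.

```python
# Ceiling estimate: hashes/s <= memory bandwidth / bytes moved per hash.
# All constants below are ASSUMED CryptoNight parameters, not taken from the thread.
SCRATCHPAD_BYTES = 2 * 1024 * 1024   # assumed 2 MiB scratchpad per hash
MAIN_LOOP_ITERATIONS = 524_288       # assumed number of main-loop iterations
ACCESSES_PER_ITERATION = 4           # assumed 2 reads + 2 writes per iteration
ACCESS_BYTES = 16                    # assumed one 128-bit block per access


def bytes_per_hash(access_bytes: int = ACCESS_BYTES) -> int:
    """Memory traffic for one hash: fill scratchpad, random-access loop, read it back."""
    init_and_final = 2 * SCRATCHPAD_BYTES
    main_loop = MAIN_LOOP_ITERATIONS * ACCESSES_PER_ITERATION * access_bytes
    return init_and_final + main_loop


def max_hashrate(bandwidth_gb_per_s: float, access_bytes: int = ACCESS_BYTES) -> float:
    """Upper bound on hashes/s for a given memory bandwidth (1 GB = 1e9 bytes)."""
    return bandwidth_gb_per_s * 1e9 / bytes_per_hash(access_bytes)


def bandwidth_needed_gb_per_s(target_hashrate: float,
                              access_bytes: int = ACCESS_BYTES) -> float:
    """Memory bandwidth (GB/s) required to sustain a target hashrate."""
    return target_hashrate * bytes_per_hash(access_bytes) / 1e9


if __name__ == "__main__":
    # The request asks for at least 100,000 hashes/second.
    print(f"{bandwidth_needed_gb_per_s(100_000):.0f} GB/s needed for 100,000 H/s")
```

Under these assumptions the requested 100,000 H/s would need roughly 3.8 TB/s of memory traffic. If each 16-byte access really costs a full 64-byte burst, pass `access_bytes=64` and the requirement grows further.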

Thanks for the response.

So we need to aim for the highest memory bandwidth possible. In that case the FirePro W9100 or S9170 are the way to go, right? The 32GB versions have 320 GB/s memory bandwidth: technical.city/en/video/FirePro-V4900-vs-FirePro-S9170
www.amd.com/en-us/products/graphics/workstation/firepro-3d/9100#2
and they cost about $3k.

The Xilinx board you linked has 460 GB/s, but I guess it costs much more than $3k. We will verify that. Thanks

There is also the Tesla P100, which seems to offer 732 GB/s: www.nvidia.com/object/tesla-p100.html
That’s about 60% more bandwidth than the Xilinx board, but it costs about $7k per unit…

Another important aspect is the failure rate of the components. It seems GPUs have a high failure rate when used for mining, and I wonder if FPGAs have the same issue.

Honestly, there are consumer GPUs like the GTX 1080 Ti that can reach 484 GB/s bandwidth (see www.techpowerup.com/gpudb/2877/geforce-gtx-1080-ti ,
www.pcgamesn.com/nvidia/nvidia-gtx-1080-ti-release-date ) and cost around $650.
So there’s no reason to buy a Xilinx board unless it costs much less, because the memory bandwidth is roughly the same.
Therefore I guess that for Cryptonight we can expect about the same performance as a 1080 Ti.

If I am right about that, I would like to ask whether you can implement some other mining software, not necessarily Cryptonight, that you think may run faster on an FPGA than on high-end GPUs.
Maybe a miner for Equihash coins (ZCash, Hush, etc.) or Lbry (LBC), Diamond, etc.; it’s your call.

Let me know about any proposals and estimates, thanks

P.S.
We might have some FPGAs available for performance testing. I’ll let you know which ones we can let you test.

From what I can tell, most *-coins after Bitcoin and its forks were specifically designed so that such hardware would not have a huge edge over off-the-shelf products. Among the algorithms you’ve mentioned, it seems Diamond is about raw number-crunching, where FPGAs truly shine. Note that “truly shine” in this context means performance per watt, since an FPGA often cannot run at high clock frequencies and is not guaranteed to outperform a GPU in raw hashrate.

[QUOTE]We might have some FPGAs available to test the performance, i’ll let you know which ones we can let you test[/QUOTE]

Thanks for the offer, but I have to decline. I have some general expertise, but am lacking experience in HPC (especially with FPGAs) to pull off something worthy of the effort. I simply figured I could share a little insight to help you save some money should you run into a snake oil vendor promising miracles.