05:19:50
DataHoarder:
hmm sech1, but each one currently does get different initialization entropy
05:21:24
DataHoarder:
and each slot does get xor'd along the way with register data, which means you still have to store 3 values total across iterations
07:40:25
eureka:
speaking of zen-c, is there any major performance difference w/ RX on zen-c cores? after accounting for clock speed differences, that is
07:41:29
eureka:
clocks are lower and caches are smaller, but possibly cache latency?
07:42:24
eureka:
on 4/4c I see it's ~5ghz/3.7ghz so definitely some raw hashrate difference, but IPC is similar no?
08:00:14
DataHoarder:
afaik they left the other stuff equal in zen5c besides lowering l3
08:00:40
DataHoarder:
it's not similar to intel efficiency cores that change implementation
08:03:46
DataHoarder:
https://chipsandcheese.com/p/testing-amds-bergamo-zen-4c-spam < this says that L3 also takes a latency hit on zen4c
13:16:21
hyc:
just finished building gcc/g++ 14.3.0, still failed to build. Maybe gas is also out of date
13:16:40
hyc:
debian paste refused the output: Could not add your entry to the paste database:
13:16:40
hyc:
Spam detected: Content is primarily a list of links or hashes.
13:18:23
sech1:
Yes, risc-v vector aes instructions are very new, so gas must be new too
13:20:32
hyc:
https://pastebin.com/D5QpcLvh
13:20:42
hyc:
I'll try to get that built...
13:44:09
sech1:
Orange Pi RV2 has gcc 15 out of the box (in the distro from their official site), so it's not that bad for risc-v boards
13:51:45
hyc:
Licheepi appears to be abandoned, they haven't updated their OS image in 2 years
13:52:23
hyc:
a shame, since I got 16GB of RAM in mine
13:56:01
hyc:
ok. latest gas built, got past there
13:57:49
hyc:
all tests passed, SSSE/AVX skipped
13:57:55
hyc:
and fast reciprocal skipped
14:03:00
hyc:
mining with 1 thread https://paste.debian.net/hidden/6c5e26ee
14:05:41
sech1:
Nice
14:06:00
sech1:
I see you're testing the final v2 code, not just that RV64 vector PR?
14:06:29
sech1:
No large pages? You have 16 GB RAM, you can enable them
14:11:54
hyc:
ah yes, I thought this branch was just the PR merged on
14:12:08
hyc:
do I still need to check just the PR itself?
14:12:55
hyc:
4 threads mining https://paste.debian.net/hidden/fc6d9c0d
14:14:03
sech1:
The PR itself is included in v2 branch, so if it works, it works
14:14:18
hyc:
ok then I'm going to approve the PR
14:14:20
sech1:
And large pages?
14:14:26
hyc:
checking that now
14:14:41
sech1:
That's a more powerful board than RV2
14:14:44
sech1:
better hashrate
14:15:10
hyc:
lemme see, how many hugepages do I need here
14:15:47
sech1:
I always set it to 1280
14:16:11
sech1:
1200 should be enough too, but 1280 is a more round number :)
14:16:29
hyc:
ok
14:16:36
hyc:
I set 1200
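
[note] The 1200 vs 1280 figures above can be sanity-checked with a rough calculation. This is a minimal sketch, assuming 2 MiB huge pages, RandomX's default sizes (dataset of 2147483648 + 33554368 bytes, 256 MiB cache, 2 MiB scratchpad per thread), and that each allocation is rounded up to whole pages separately. `estimate_hugepages` is a hypothetical helper, not part of RandomX or XMRig.

```python
# Rough huge page estimate for RandomX fast mode.
# Assumptions (not taken from the chat): 2 MiB pages, default RandomX sizes,
# and every allocation rounded up to whole pages on its own.
HUGE_PAGE = 2 * 1024 * 1024
DATASET_BYTES = 2147483648 + 33554368   # dataset base size + extra size
CACHE_BYTES = 256 * 1024 * 1024         # Argon2 cache
SCRATCHPAD_BYTES = 2 * 1024 * 1024      # one scratchpad per mining thread


def pages(nbytes, page=HUGE_PAGE):
    """Number of pages needed to hold nbytes (ceiling division)."""
    return -(-nbytes // page)


def estimate_hugepages(threads):
    """Hypothetical helper: pages for dataset, cache and per-thread scratchpads."""
    return (pages(DATASET_BYTES)
            + pages(CACHE_BYTES)
            + threads * pages(SCRATCHPAD_BYTES))


if __name__ == "__main__":
    for t in (1, 4, 8):
        print(f"{t} thread(s): {estimate_hugepages(t)} pages")
```

Under these assumptions 4 threads come to 1172 pages and 8 threads to 1176, so 1200 leaves a small margin and 1280 a larger one. The value would then be written to `vm.nr_hugepages`.
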
14:18:42
hyc:
largepages https://paste.debian.net/hidden/84f42e24
14:19:06
sech1:
Definitely faster than RV2
14:19:10
sech1:
Nice to see it working
14:19:25
hyc:
too bad no vector AES but all good
14:19:29
sech1:
RV2 could do 110.7 h/s max on v1 (with XMRig)
14:19:35
sech1:
And I think 99 h/s on v1 with randomx-benchmark
14:19:48
sech1:
and 67 h/s or so on v2
14:20:00
sech1:
RV2 doesn't have AES either
14:20:09
sech1:
I don't know of any boards that have it now
14:21:44
sech1:
yes, found in the chat log: "Orange Pi RV2 (Ky X1 CPU, software AES): v1 99.3 h/s, v2 67.1 h/s"
14:27:54
hyc:
have you tried PR#174, monsterpages?
14:28:07
hyc:
seems decent enough
14:34:21
sech1:
No, I haven't tried it yet. But I can try it when I get back home, in a week or so
14:34:42
hyc:
cool
14:34:51
sech1:
"Memory initialized in 29.9764 s"
14:34:57
sech1:
RV2 initializes dataset in 15 seconds
14:35:00
sech1:
Thanks to vector code
14:35:07
hyc:
nice
14:37:24
sech1:
And maybe also because it's 8-core
14:41:18
hyc:
did you look at all at the v0.7 vector instructions? I wonder if the stuff you used is all present in there
14:42:09
hyc:
granted, only THead's patched gas supports it
14:43:31
sech1:
No, I downloaded v1.0 vector specs and only looked at that documentation
14:43:44
sech1:
v0.7 was an intermediary standard, all new boards will have v1.0
14:44:01
hyc:
ok. I have both v0.7 and v1.0 docs around here but haven't looked at any of it in a while
15:01:46
hyc:
bitmain also trying to contact me on reddit https://paste.debian.net/hidden/014a2749
15:05:18
DataHoarder:
> We want to discuss/clarify on the new updates on Monero.
15:14:39
hyc:
personally I don't see why anything needs to be discussed, everything is there in the github repos
15:17:49
DataHoarder:
and the irc channels and matrix rooms are public
15:20:10
sech1:
PR merged, https://github.com/SChernykh/RandomX/tree/v2 rebased
15:20:33
sech1:
Bitmain can't IRC, apparently
15:20:38
sech1:
Is it blocked in China?
15:20:52
DataHoarder:
matrix :)
15:21:22
sech1:
They can make a Github issue, ffs :)
15:22:01
sech1:
Damn, that FreeBSD packaging bug is still not fixed... https://github.com/SChernykh/RandomX/actions/runs/21253931716/job/61162948713
15:22:50
sech1:
What does Bitmain tech team want to discuss anyway, we even have RISC-V code for them to use now :)
15:24:11
hyc:
yeah they don't even have to lift a finger
15:25:24
DataHoarder:
just send them the irc logs
15:25:38
DataHoarder:
https://libera.monerologs.net/monero-pow/20260122
15:27:13
hyc:
btw, SG2044 is most likely what they're using now https://browser.geekbench.com/v6/cpu/8661173
15:27:53
sech1:
Does SG2044 have hardware AES, or do they use some kind of SoC with AES accelerator on it - that is the question...
15:28:19
sech1:
SG2044 is a powerful beast, especially in the memory department
15:29:03
sech1:
https://arxiv.org/html/2508.13840v1
15:30:51
DataHoarder:
> 1 Processor, 1 Core, 64 Threads
15:32:13
hyc:
SG2042 4 memory controllers, 4 channels. SG2044 32 memory controllers, 32 channels. What a beast.
15:33:52
DataHoarder:
datasheet for the core is afaik https://www.xrvm.com/product/xuantie/C920 https://www.xrvm.com/community/download?id=4240183866976964608
15:34:21
DataHoarder:
https://github.com/sophgo/sophgo-doc/blob/main/SG2042/T-Head/XuanTie-C910-C920-UserManual.pdf
15:35:22
DataHoarder:
cannot find AES mention there (?)
15:36:51
DataHoarder:
C920v2
15:37:31
DataHoarder:
if it implements 1.0 it should have the vector crypto no?
15:37:42
plowsof:
ask them for details via the provided email :D
15:37:51
hyc:
interesting, gcc 14 adopted support for THeadVector, so basically vector0.7
15:39:52
DataHoarder:
damn https://arxiv.org/html/2508.13840v1/graphs/streams.png
15:40:31
DataHoarder:
interesting comparisons against Zen2
15:40:33
hyc:
yeah I get the feeling it'd be a great database machine too
15:42:38
DataHoarder:
if it has aes, the x9 would still be a pretty nice platform if they don't become DoA like previous
15:43:24
hyc:
https://www.hackster.io/news/ghostwrite-a-serious-flaw-in-the-t-head-xuantie-c910-and-c920-cores-hits-popular-risc-v-sbcs-14833c98e33d
15:43:50
hyc:
apparently the vector unit can access any memory, independent of MMU permissions
15:43:52
DataHoarder:
that is afaik about the C920v1(?)
15:43:55
DataHoarder:
ahahahah
15:44:33
DataHoarder:
spectre is inevitable, just offer instructions to do this directly
15:44:40
hyc:
lol
15:46:35
DataHoarder:
email-to-irc service when :)
15:53:35
hyc:
I don't see any Zkr / Zkt references here https://gcc.gnu.org/pipermail/gcc-patches/2025-March/677772.html
15:53:42
hyc:
so probably no crypto extensions
15:54:03
sech1:
Zk are scalar crypto extensions
15:54:09
sech1:
Zvk* are vector crypto extensions
15:54:12
sech1:
They are different
15:54:25
hyc:
either way, not referenced here
16:01:02
sech1:
In addition to the standard RV64GCB[V] ISA, C907 has also implemented the XIE (XuanTie Instruction Extension). The XIE consists of extended instructions optimized for load/store, arithmetic, bitwise and cache/TLB operations. When enabled, these instructions improve the performance significantly.
16:01:04
sech1:
Interesting
16:01:14
sech1:
SG2044 has XIE too
16:01:36
sech1:
Looking for the list of instructions now
16:01:56
hyc:
I think current gcc with -march=native should already turn those on
16:02:17
hyc:
though perhaps some of these only make sense in asm
16:02:37
hyc:
crypto specs https://github.com/riscv/riscv-crypto/releases
16:03:13
hyc:
So yeah, Zvk doesn't get mentioned anywhere. and I guess it's separate from RVV1.0?
16:06:00
sech1:
Probably this: https://github.com/XUANTIE-RV/thead-extension-spec/releases/tag/2.3.0
16:06:19
sech1:
I don't see any crypto instructions mentioned
16:08:56
sech1:
Those extensions add instructions similar to the regular RISC-V zba/zbb/zbc instructions, we already have support for those
16:09:12
hyc:
Zvk is still pretty new https://dl.acm.org/doi/abs/10.1145/3658644.3691394
16:10:56
DataHoarder:
without AES in-line, that'd be pretty devastating
16:11:24
DataHoarder:
I wonder if they have an accelerator for scratchpad AES
16:14:50
sech1:
hyc "According to the latest RVA profile, vector crypto should be preferred: https://github.com/riscv/riscv-profiles/releases/tag/rva23-rvb23-ratified" - tevador
16:15:07
hyc:
what the hell are folks running on. Linux kernel already supports it all. https://lwn.net/Articles/952854/
16:15:35
sech1:
I don't know what they're running on, but I debugged my vector aes code in qemu
16:16:28
sech1:
They did the same, probably, since it was in 2023
16:16:57
hyc:
I guess
16:18:00
hyc:
zvkt here https://www.rt-rk.com/gcc-tuning-for-spacemit-x60-building-an-in-order-dual-issue-scheduler-model-part-i/
16:19:54
sech1:
zvkt = make certain instructions constant time
16:20:00
sech1:
it's not a specific instruction set
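
[note] The Zk / Zvk / Zkt / Zvkt distinction above can be illustrated by classifying an ISA string such as the `isa` line Linux prints in /proc/cpuinfo. This is a minimal sketch, assuming multi-letter extensions are separated by underscores. The extension names are the ratified scalar and vector crypto names, while `classify` and the example strings are hypothetical and do not describe any particular board.

```python
# Extensions that provide AES instructions (umbrella names include them).
SCALAR_AES = {"zknd", "zkne", "zkn", "zk"}           # scalar AES decrypt/encrypt
VECTOR_AES = {"zvkned", "zvkn", "zvknc", "zvkng"}    # vector AES (needs V)
# Timing guarantees only: these add no instructions of their own.
TIMING_ONLY = {"zkt", "zvkt"}


def parse_isa(isa):
    """Split 'rv64imafdcv_zba_zvkt' into the base string and a set of extensions."""
    parts = isa.strip().lower().split("_")
    return parts[0], {p for p in parts[1:] if p}


def classify(isa):
    """Hypothetical helper: report which AES-relevant features an ISA string names."""
    base, exts = parse_isa(isa)
    return {
        "has_vector": "v" in base[4:],               # single-letter V after 'rv64'
        "scalar_aes": bool(exts & SCALAR_AES),
        "vector_aes": bool(exts & VECTOR_AES),
        "timing_only": sorted(exts & TIMING_ONLY),
    }


if __name__ == "__main__":
    # Made-up strings for illustration only.
    print(classify("rv64imafdcv_zicsr_zba_zbb_zvkt"))
    print(classify("rv64imafdcv_zba_zbb_zvkned_zvkt"))
```

A string that lists only `zvkt` reports no AES support in this sketch, which matches the point made above: a constant-time guarantee is not the same thing as having vector AES.
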
17:17:35
hyc:
ok. so spacemit x60 claims RVA22 support and RVV1.0 but crypto is still optional in RVA22 and it sounds like they don't support it
17:28:57
sech1:
X5 used SG2042R which must be a custom chip, so maybe X9 uses a custom SG2044