10:41:47
basses:matrix.org:
good article, but have you looked into AI auditing services where actual maintainers of crypto software vouched for them, all findings were documented, and the patches were open source? > <@sgp_> Just in case anyone suggests an AI-driven security audit of Monero in the future: https://magicgrants.org/2026/03/09/AI-Not-Ready-for-Ethereum-Audits
10:44:26
basses:matrix.org:
> <@rbrunner7> @sgp_:monero.social: Thanks for the interesting info. But IMHO you won't be able to stop that with rational arguments and examples where it already went wrong. Somebody will try it with Monero. When the bursting of the AI bubble is nearing, desperate companies with AI products might even sponsor such work with bounties ...
10:44:26
basses:matrix.org:
it is good that people are trying (and hopefully someone knowledgeable will report it to Monero!)
10:44:26
basses:matrix.org:
if the Monero community just ignored them completely while attackers started using them, then we would be behind.
10:44:26
basses:matrix.org:
[... more lines follow, see https://mrelay.p2pool.observer/e/jNDihu4KSzRRUUFr ]
11:03:07
sgp_:
@basses:matrix.org: What are you referring to? The example I used was made by a security firm that typically does these audits so I thought it would be among the best fits
11:06:53
basses:matrix.org:
@sgp_: yeah, but it was just marketing material; no real test results were shown. Compare it to https://blog.mozilla.org/en/firefox/hardening-firefox-anthropic-red-team/ (I know it's a different product, but real results and verifiable patches were shown)
11:12:13
basses:matrix.org:
from what I've seen, benchmarks don't always reflect real testing
11:24:35
syntheticbird:
> <@basses:matrix.org> good article, have you looked into AI auditing services where actual maintainers of a crypto software vouched for them and all findings were documented and patches were open source?
11:24:35
syntheticbird:
AI is far too dumb for crypto protocol audits. The only crypto-related things they catch CORRECTLY are documented bad practices, like some morons setting the RSA exponent to 1. Try telling Claude to audit fcmp++ and it will completely hallucinate. Not only because it's state of the art, so not assimilated, but it is far from being [... too long, see https://mrelay.p2pool.observer/e/5d71h-4KbXc3WnAy ]
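The RSA example above is the kind of "documented bad practice" that is trivially machine-detectable. A minimal sketch with toy numbers (not from any real audit or codebase): with a public exponent of e = 1, RSA "encryption" is the identity map, so the ciphertext equals the plaintext.

```python
# Why e = 1 is a documented bad practice: c = m^e mod n with e = 1
# is just m mod n, so the "ciphertext" leaks the message verbatim.
# Toy primes for illustration only; real RSA uses ~1024-bit primes.
p, q = 61, 53
n = p * q              # modulus
e = 1                  # broken public exponent
m = 42                 # plaintext encoded as an integer < n
c = pow(m, e, n)       # "encryption"
assert c == m          # ciphertext equals plaintext: no secrecy at all
print(c)               # → 42
```

A linter or an LLM can flag this because the rule "e must be an odd integer > 1, conventionally 65537" is written down everywhere; novel protocol flaws have no such lookup table.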
11:26:41
321bob321:
I think the public-facing AIs are shit compared to the privately used ones
11:27:10
syntheticbird:
@basses:matrix.org: I've never seen such an enthusiastic proof of incompetence and clear misguidance. Mozilla can't help it, they really love being dragged in the mud.
11:29:44
syntheticbird:
@321bob321: Not just that. In Firefox's case, it's very clear that Anthropic took their talented staff, equipped them with Claude, and told them to find as many vulns as possible. It's not just Claude that found these vulnerabilities; there was human filtering and analysis. Neither Mozilla nor Anthropic will explicitly say so, however, because that would undermine Claude's credited involvement.
11:31:30
syntheticbird:
"We found 14 high-severity vulnerabilities" really boils down to: 1. Claude probably spat out a hundred false reports. 2. Mozilla is fucking lost in its own codebase. 3. There was probably extensive human tinkering along the way.
12:01:55
sgp_:
Fwiw, even after this, it's hard for me to fully shake the idea of using AI to help with certain types of reviews. I keep comparing it to how fuzzing can help test a ton of cases. But unlike fuzzing, AI tends to make stuff up and lie. So you basically need a full-time person (team?) just reviewing its output to triage. Maybe t [... too long, see https://mrelay.p2pool.observer/e/lL3-iO4KOTF6QU5i ]
13:07:42
sgp_:
https://mrelay.p2pool.observer/m/monero.social/mGkXMPrkqaWOBrasLcRVXTZr.pdf (justin_v12_zellic_serai__next_polkadot_sdk_e962cf4_findings_2026-03-03-findings.pdf)
13:08:01
sgp_:
@basses:matrix.org: this is the report I was given. You can see that V12 tried to make proofs of concept and remediation steps as well
13:31:36
ofrnxmr:xmr.mx:
@sgp_: M$ clearly uses Copilot to code and review changes to the GitHub mobile app, and they've broken it in so many ways recently. I have a hard time accepting the current state of AI as reliable without in-depth human review
13:56:22
jeffro256:
@sgp_: I would maybe use chatbots for code review only to brainstorm if I were in a rut, with the expectation that most of their ramblings will be useless (or worse than useless in this case). But so far, the chatbots I've seen for review only correctly pick up trivial things like typos, formatting, etc. I recently had some [... too long, see https://mrelay.p2pool.observer/e/7MWhjO4KNGRhaHl1 ]
15:19:32
nioc:
ChatRubberDuck
16:42:56
ravfx:xmr.mx:
@syntheticbird: Mozilla Corporation just released the AI-enabled Firefox. People don't want it. 4. They have to try to hype it up somehow.
17:28:04
kiersten5821:matrix.org:
@sgp_: I believe if you don't run your code through AI at least once, you are being negligent
17:28:24
kiersten5821:matrix.org:
https://x.com/ControlZ_1337/status/2005972623939756357#m
17:28:24
kiersten5821:matrix.org:
https://x.com/octane_security/status/2019603445368692738#m
17:29:39
kiersten5821:matrix.org:
unless you are smart enough to find a $250k bug, you do not have a strict 100% coverage advantage over the AI
17:29:57
kiersten5821:matrix.org:
somehow I doubt most of these AI haters have done that
17:31:46
kiersten5821:matrix.org:
it needs heavy manual review; throwing it out the window, however, is an extremely poor choice
17:45:02
sgp_:
I think I agree with SRLabs' suggestion to use AI at the very end: https://srlabs.de/blog/ai-verification-bottleneck
17:45:02
sgp_:
"The implication for organizations is practical: if AI is introduced early and broadly, the team spends review capacity on sorting tool output before it has built a mental model of the system. If AI is introduced late and narrowly, after humans have established context, it becomes a coverage and QA tool instead of a triage generator."
17:45:26
sgp_:
i.e., not to use it until the end
17:55:59
sgp_:
But I also don't need firms to do this usually; the original code writer can ask AI to check it if they want
18:19:48
jeffro256:
If you are resource/time constrained, consulting some oracle that is right 5% of the time is actively harmful because of the opportunity cost. So I definitely disagree with this assertion. I literally just spent ~3 hours debunking a stupid slop report > <@kiersten5821:matrix.org> i believe if you don't run your code through ai at least once, you are being negligent
18:24:26
moneromooo:
Reminds me a bit of Coverity tbh.
18:24:45
moneromooo:
Occasional good find. Pain to sift through otherwise.
18:32:47
kiersten5821:matrix.org:
@jeffro256: 5% is huge; I would be consulting it all the time. I think sgp is right that it should come at the end, after you are "finished" with the code, though. If you're the author, you can debunk much faster by being the one asking the AI for issues, since you know the whole thing much more thoroughly (assuming your 3h debunk is [... too long, see https://mrelay.p2pool.observer/e/7t-VlO4KU1otcl9T ]
18:37:56
jbabb:cypherstack.com:
open source as a whole is suffering a huge deluge of AI code submissions
18:37:56
jbabb:cypherstack.com:
it used to be that the code itself was proof, to some extent, that it was worth reviewing: somebody took the time to identify some missing function or application and toiled to make it a reality
18:37:56
jbabb:cypherstack.com:
now, code can just be produced on a whim: there's no guarantee it works or even that it was needed in the first place
18:41:34
jbabb:cypherstack.com:
on the other hand, I've also had reports from real people I trust that it can help with research a lot: sorting through many papers for the gems that need attention or that are relevant to open questions/ongoing work is the best consistently working application I've heard of
19:17:03
321bob321:
Reminds me of huntarr
19:51:54
UkoeHB:
I’ve also heard of success using it as a glorified search engine. It can be effective at hunting down known bugs in web dev apps (e.g. identifying upstream Mozilla bugs as culprits).
19:53:04
UkoeHB:
Not a perfect solution though, if the recent degradation of google search results is any indication.
19:55:39
intr:unredacted.org:
UkoeHB: this is what I use it for, if at all
19:56:06
intr:unredacted.org:
the trap with AI is it's way too easy to be lazy and take what it spits out at face value without doing further research
19:56:37
intr:unredacted.org:
also, nice to see your name again
19:57:05
UkoeHB:
:)
20:04:24
ravfx:xmr.mx:
UkoeHB: Yeah, that's the main use case I have for AI: glorified search engine.
20:04:25
ravfx:xmr.mx:
Running a big local model (like Minimax 2.5 235B or the like) is also nice for private search (you don't leak any packets onto the internet for the search, and you get an answer without loading a webpage).
20:04:25
ravfx:xmr.mx:
It's also faster and usually way more precise for technical questions, compared to digging through the crap you get from Google or elsewhere.
20:14:38
kayabanerve:matrix.org:
The issue is it doesn't function like a search engine: it doesn't yield the original sources, nor does it attribute, or send traffic to, the people who posted the information.
20:15:52
ravfx:xmr.mx:
Yeah, it does not give you the sources, but it gets right to the point of what you asked for.
20:26:04
hbs:matrix.org:
Advanced models like Gemini Pro/Thinking can give out some or all of the sources
20:36:39
321bob321:
Ad services source