← Home ← Back to /g/

Thread 105564145

45 posts 8 images /g/
Anonymous No.105564145 >>105564932 >>105566865 >>105570643 >>105572138 >>105573196
formal verification
there was quite a big effortpost thread on this a month or too ago about F* and I still can't get the concept of formal verification out of my head. A lot of the ideas around it with shit like rocq still fly over my head but the idea of going beyond unit testing and branch coverage, actually mathematically proving you code works is cool as fuck and a lot more applicable than 'muh memory safety'. compcert looks cool as fuck but isn't true free software and sel4 seems very promising as a kernel and even works with genode (another seldom discussed project) but for the most part there is very little in terms of formally verified software. Is it really that high of a skill barrier? How hard would it be to work on a formally verified implementation of a conceptually simple language like forth or scheme and work up from there? Can the verification go as far as the bootloader too? I'm talking out my ass but imagine some shit like dusk os where you not only have an insanely powerful platform that runs on a zx spectrum but the whole system could be mathematically proven to work too :O
Anonymous No.105564282
https://model-checking.github.io/kani/
Anonymous No.105564406
>is it really that high of a skill barrier?
Last time i checked seL4 was about 20,000 lines of code and 1.3 million lines of proof, so yes, it's hard.
Anonymous No.105564932
>>105564145 (OP)
i think it depends on a system you want to verify, i am only learning about it but i am using lean4 quite a lot and sometimes its proof capabilites

with lean4 and its type system you can verify functions quite easily, but if you want to verify some other system i think that will be very complex

Also, the idea of a contract is part of verification, C# has contracts and Kotlin too, also Ada SPARK can be used to create verified software
Anonymous No.105565545 >>105565591 >>105571365
Anyone has some resources to get into formal verification? What are some of the requirements? Knowledge of discrete mathematics?
Anonymous No.105565591 >>105566981 >>105567004 >>105570455
>>105565545
https://softwarefoundations.cis.upenn.edu
Don't know if it's good though I'm barely starting it myself
Anonymous No.105566865 >>105566890 >>105568336 >>105570877
>>105564145 (OP)
why do good technology/software threads always die in this board, but shitty consumer crap always passes bump limit?
Anonymous No.105566890 >>105566981 >>105568336
>>105566865
because 80% of this board literally has not coded anything beyond hello world
95% of this board has never gone beyond basic data structures like linked lists or hash maps
the remaining 5% that have actually had to glue together functionality across many libs/programs/configs, deployed actual software, written useful user interfaces, etc. are literally unable to communicate with the other 95% due to the 95% not understanding what types of problems can arise beyond anything but a very surface level.
Anonymous No.105566981
>>105565591
I am also taking a look at this. It will be awhile though, because like >>105566890 said, I am also low skill currently.
Anonymous No.105567004
>>105565591
that is an overkill desu, start with design by contract
Anonymous No.105568331 >>105568406 >>105572182
This is the kind of shit AI will do for us automatically. Humans will just review the requirements which the AI will happily convert back and forth between natural language and formal language and point out edge cases for us.
I think people overcomplicate things though. I think a simple first order logic system is probably enough.
I don't think formal verification is that important though. Testing, code review, documentation, modularity, good design, functional programming gets you 99% of the way there.
Anonymous No.105568336 >>105568423
>>105566865
>>105566890
what do you expect? It's an extremely niche topic
99% of programmers have never heard about formal verification
Another cool rabbit hole to go into is correct by construction program synthesis
Anonymous No.105568406
>>105568331
delusional
Anonymous No.105568423 >>105568446 >>105571156
>>105568336
It is sadly, but because at every turn programmers made the bad choice, OOP for example because of how much there is mutability is extreamly hard to verify, and things like Spring with endless amount of CockAndBallsTotureBeans are imposible to verify
Anonymous No.105568446 >>105568542
>>105568423
Yes, the midwits like Uncle Bob won.
But it's mostly a fault of academics for circle jerking with each other and gatekeeping rather than making things accessible.
It goes back to the joke that a monad is just a monoid in the category of endofunctors.
With friends like that who needs enemies?
Anonymous No.105568542
>>105568446
true sadly, by simply making things like Monads easier, they could have won
and education should be different people should learn functional languages first, mutation should be explicit and done only explictly, that would made everything better

it would also make programming easier for students, because they learn math longer than programming
Anonymous No.105570455
>>105565591
It's very good if you want to learn Coq/Rocq.
t. did 90% of exercises in the first book
Anonymous No.105570559 >>105570643 >>105571102
Even if it works mathematically stuff like random bit flips can still screw it up, on space computers they'd have a voting system to decide the answer.
Anonymous No.105570643 >>105570840
>>105564145 (OP)
>a lot more applicable than 'muh memory safety'.
I think that's not correct (or at least depends on what you mean by "memory safety").

If you mean something static (Rust, affine types, linear types) that's actually pretty close to formal verification, because static type systems already _are_ formal verification. They prove the absence of certain errors at runtime. Usually it's about lack of type errors (think string to int addition), but you can also guarantee that every resource (allocated memory, file descriptor, db connection) is eventually closed and closed only once, or that there is at most one mutable reference to the object.

Also see Curry-Howard correspondence linking types with theorems and programs with proofs. You can make your type system more and more powerful (cf. Barendregt's λ-cube) and eventually you can specify and prove arbitrarily complex properties (and in process you lose nice features like type inference).

And properties like "memory safety" or "lack of type errors" are both relatively simple to check and important for bug prevention - that's why we have relatively widespread languages that enforce them. On the other hand, full correctness proofs are really really expensive (in terms of development speed), require knowledge of weird math and honestly rarely needed. As an example of "expensiveness", you know the algorithm to check that parentheses are balanced, right? Trust me, proving this (in both directions, formalized here https://www.codewars.com/kata/5de1761596cc1d002d5e5d0a) is really hard, took me several days. And yet just coding the algorithm takes at most 15 minutes and after some testing you're 99.999% certain that you made no errors.
>>105570559
You can add possible bitflips in the semantics of your language, though it will be cumbersome.
Anonymous No.105570840
>>105570643 (me)
Also I think it's pretty funny that expert programmers in functional languages with powerful type systems sometimes advice to avoid complex galaxy-brained type contraptions. For example, here's the PR in OCaml repo where several guys who're well-known in the OCaml ecosystem think that phantom types are pain in the ass and usually don't worth it:
https://github.com/ocaml/ocaml/pull/13097#issuecomment-2109606118
https://github.com/ocaml/ocaml/pull/13097#issuecomment-2110088489
https://github.com/ocaml/ocaml/pull/13097#issuecomment-2110333656
It's cool when you can guarantee that some bugs are impossible, but eventually you hit diminishing returns.

Also, even with full formal verification you still have to trust something (e.g. that interfaces are specified correctly or that certain assumptions are true). Interesting article:
https://www.cs.purdue.edu/homes/pfonseca/papers/eurosys2017-dsbugs.pdf
>This paper thoroughly analyzes three state-of-the-art, formally verified implementations of distributed systems[...]. Through code review and testing, we found a total of 16 bugs, many of which produce serious consequences, including crashing servers, returning incorrect results to clients, and invalidating verification guarantees. These bugs were caused by violations of a wide-range of assumptions on which the verified components relied.

I still vibe with the idea of formal verification, but come on, it's important to understand real-world constraints.
Anonymous No.105570877 >>105571108
>>105566865
this board is intended for ephemeral threads that are not supposed to be archived. you don't come here to have long and well though discussions. your idea of image boards is wrong.
Anonymous No.105571102 >>105572078
>>105570559
True, but ECC memory works to counteract this to some extent, and you can add mitigations in software (checksums etc.).
And those failures, unlike bugs, wont get you hacked.
Anonymous No.105571108
>>105570877
Conversations aren't archived either, but I'd have no problem nerding out with somebody else IRL if they were into this stuff.
Anonymous No.105571156 >>105571249 >>105572020
>>105568423
oop itself doesnt make for example a C code more mutable than it already is
modularity, low coupling and classes as abstract data types (contracts and type safety) comes automatically with oop. it's the most approachable way to bring formal safety in a software. 80% for 20% of the effort.
Anonymous No.105571249 >>105571292
>>105571156
It does, and happens constantly, have you ever worked in java enterpries ?
it is hell because of OOP
Anonymous No.105571292 >>105571409
>>105571249
>have you ever worked in java enterpries ?
no. though it would be better if you give an example. a piece of data that turned mutable because of oop
Anonymous No.105571365 >>105571396
>>105565545
Discrete mathematics might be enough if you're a self starter, and willing to learn as you go.

I'd say Non-Deterministic (LTL, CTL, CTL*) model checking is more mathematical maturity heavy, but light on the actual prerequisites.

Probabilistic model checking is much harder on the actual maths you need to know (you'll need to have a reasonable working grasp of Markov Chains), and if you want to get into the details of how checkers work you'll also need lots of Monte Carlo type stuff, but the topic is generally less abstract.

Ideally for the non-deterministic stuff (the main stuff), you should have a proper formal logic course (dealing logic as a grammar which generates formal sentences, and then semantics being derived Boolean functions on them, by way of assignments) and a "Models of Computation" type automata-theory/language course as LTL checking relies on omega regular languages and Generalised non-deterministic Buchi automata and these are quite a bitch to get your head around if you haven't seen DFAs/NFAs. I only took my Uni's base course in FV but I looked into the second course and there's a lot about working with infinite automata.
Anonymous No.105571396
>>105571365
Also a second course in Algorithms (the bullshit you don't get leetcoded on) is strongly advised, especially getting familiar with reducing things to SAT/using SAT solvers like Z3 in practice.
Anonymous No.105571409 >>105571651
>>105571292
it is hard to explain, but consider the idea of Spring and what it does, you can clearly see in Spring and surrounding ecosystem the problems with mutable state and why it is hard to verify it
Anonymous No.105571651
>>105571409
I don't know about Spring either. why can't you just give an example purely on OOP terms?
>the problems with mutable state and why it is hard to verify it
it's not necessarily harder.
the classes you will find after object decomposition will have their inherent own state. but think in terms of object interactions and contracts. it should be enough if you can verify a class can satisfy its own invariant and postconditions
Anonymous No.105572020 >>105572641
>>105571156
OOP makes the language more complex which makes specifying it in a formal language harder and if you don't have a formal spec of your language then you can't formally prove anything about the programs in that language.
where is the python formal spec? where is the c++ formal spec? where is the java formal spec? of course they don't exist. to be fair I don't think Haskell is formalized either and it's also a very complex language. but C is formalized, standard ML is formalized, I'm sure some variant of Scheme is. all relatively simple languages compared to the OOP behemoths.
Anonymous No.105572078
>>105571102
You're still gonna rely on library calls for stuff like reading and parsing files.
Anonymous No.105572138 >>105573319
>>105564145 (OP)
>Is it really that high of a skill barrier?
no, it's just a lot of work.
also if you really want to go this route, the best option for real (though not real-time) work is idris
https://www.idris-lang.org/

there are problems with formal verification by the way
https://spinroot.com/spin/Doc/rax.pdf
tl;dr formally verified system failed because of an issue with the spec
the more complicated the thing you're working on (operating systems are VERY complicated) the more holes and contradictions there are going to be in your spec, and you end up playing conceptual jenga as you try and work out the kinks. small changes turn into huge refactors and architectural re-thinking.
it's "doable" but you need a lot of really good people working over a long period of time, and that's not cheap.
Anonymous No.105572182 >>105572340 >>105572421
>>105568331
>This is the kind of shit AI will do for us automatically.
ai can't even write a non-optimizing c compiler. a non-trivial formally verified system is an order of magnitude more complex and requires intensive and very careful ahead-of-time architectural planning, as well as requiring tight conceptual coherence, and is literally the most domain-specific code you can write. it's literally a worst case scenario for transformer markov chains, every single aspect of how they function is in direct conflict with the actual requirements of writing proofs.

and that sucks, because you're right. this is the kind of shit AI should do for us automatically. but our current generation of AI systems are the worst possible tool for the job. expert systems are better at this, but much more labor intensive since you still have to build the knowledge base for the domain (and they take forever)
Anonymous No.105572340 >>105572641 >>105572867
>>105572182
>and requires intensive and very careful ahead-of-time architectural planning
Does it? Can't you just specify properties for each function and then prove that the code for that function meets them?
That's why I don't like OOP or even just complicated data structures in functional programming. because then the proof depends on the implementation of each object that the function interacts with. if you have a simpler data structure like, say, a list of lists of strings or something simple like that, then the properties of that data structure are obvious unless you add additional restrictions on what that list can contain.
And that's where I see AI coming in. you specify in natural language what the function does and the AI writes the formal spec for you, then tries to find a proof, possibly leveraging human intuition if the proof is hard but also helping with language features so the user doesn't have to be an expert on that specific proof language to help design the proofs. Nobody expects AI to verify a whole project all by itself. The Isabelle proof assistant already uses machine learning models to try to automatically prove lemmas.
Anonymous No.105572421
>>105572182
>ai can't even write a non-optimizing c compiler
AI code usually doesn't compile lol.
Anonymous No.105572641
>>105572020
I'm having hard time to understand the relevance of this post
>>105572340
>Can't you just specify properties for each function and then prove that the code for that function meets them?
that's already how automated theorem provers like CBMC work with assumptions and assertions afaik but since it's computationally complex you're required to split up your code to more modular structures that can be verified independently
>because then the proof depends on the implementation of each object that the function interacts with
it doesnt if you do oop. every object have clearly defined interfaces and contracts, so that implementation details only matter while you're proving the invariant and postconditions for individual objects
Anonymous No.105572867 >>105572895
>>105572340
>Can't you just specify properties for each function and then prove that the code for that function meets them?
That's already how it works.
But consider this, you have a function which is designed to work asynchronously. As you progress further along, you realize that your model where this function is asynchronous is actually inappropriate and it needs to become serial. This causes the proof itself to radically change, and the guarantees of the function are greatly different. Now every other function with a proof which relied on this first function to be concurrent now needs to be redesigned, and functions those functions rely on, and so on. On top of any actual code that needs to be changed. In a codebase with a couple hundred thousand functions, you don't even need to change a core assumption for this to explode into an issue.

This is just an obvious example. This kind of refactor hell is actually quite common in complex, formally verified software. That's why projects which rely on formal verification aim to have a very long conceptualization period where the whole thing is planned out from the very beginning. The more complex the project, the more you end up either leaning on very specific interfaces, or very wide proof chains. In both cases, you're getting sent to refactor hell if you have to change anything meaningful. If you're in a real-time domain where you NEED to write highly impure software out of strict performance constraints the problem increases by several orders of magnitude.

>then the properties of that data structure are obvious unless you add additional restrictions on what that list can contain.
Which is unavoidable for the overwhelming majority of software. Very rarely have I ever seen an actual usecase for a list of arbitrary strings. Those strings usually have a domain and half the reason for formal verification is ensuring they actually fit the domain. EG ensuring input is fuzzed properly before it can be passed to a database.
Anonymous No.105572895 >>105572941
>>105572867
>fuzzed
sanitized*
Anonymous No.105572941 >>105572995
>>105572895
Too late, your proof failed.
Anonymous No.105572995 >>105573087
>>105572941
Actually the proof checker didn't complain because fuzzing was also semantically valid to check for here. However now our database is vulnerable and we have no idea because this is what the spec said to do!
Anonymous No.105573087
>>105572995
And the meme magick was lifted from a small European town when Basedjak touched the core, yet it isn't a hacker's game for which exact reason?
Anonymous No.105573196
>>105564145 (OP)
I recently read through the little typer and would highly recommend it. Dependently typed functional programming and untyped programming shares the same relationship in my mind as conservative vector fields to non-conservative fields. Technically a non-conservative vector field doesn't exists and you could model any problem using only conservative fields if you are willing to include enough variables, but you're not so you model using non-conservative fields. Similarly if you want to program like this you have to arity of your types must match the arity of the problem domain exactly. Once you've modeled your types correctly the logic is really easy and natural.
Anonymous No.105573319 >>105573628
>>105572138
Where does it say there was a problem with the spec?
Anonymous No.105573628
>>105573319
I think that's implied because the code is proofed