Izvestiya Instituta Matematiki i Informatiki Udmurtskogo Gosudarstvennogo Universiteta
2024. Volume 64. Pp. 17-33
MSC2020: 68N17, 03F03 © M. Joudakizadeh, A. P. Beltiukov
ADAPTIVE HUMAN-MACHINE THEOREM PROVING SYSTEM
This paper presents a novel approach to constructing human-machine theorem proving systems. These systems integrate machine learning capabilities, human expert knowledge, and rigorous logical control for the effective construction and verification of proofs. The novelty of this approach lies in its openness: the user is given a tool for building such systems. Users can create theorem proving systems by selecting existing logical inference strategies or adding new ones in accordance with the provided interfaces and connection agreements. The systems built in this way are essentially adaptive and human-machine in nature. During their operation, a meta-strategy is developed and trained on the basis of elements of the existing strategies. The system is designed as a universal framework for managing various strategies: it provides a basic architecture and a library of strategies, with the possibility of adding new ones. During learning, a system of structural characteristics is also accumulated and trained, on the basis of which a decision is made about which strategy elements to use at the next step. The presentation is given for the sequential calculus of minimal positive predicate logic, the logic most suitable for the deductive synthesis of programs and algorithms, which is the intended application of this approach.
Keywords: automated theorem proving, minimal predicate logic, machine learning, human-machine interaction, artificial intelligence, sequential calculus, deep learning, proof verification.
DOI: 10.35634/2226-3594-2024-64-02
Introduction
Automated theorem proving is a key area of research in mathematical logic and artificial intelligence. Despite significant advancements in this field, existing automated theorem proving systems face several limitations, particularly when dealing with complex and non-trivial theorems. Conversely, traditional theorem proving methods that rely solely on human expertise are often labor-intensive and error-prone.
In this paper, we present a novel approach to addressing this issue: an integrated adaptive human-machine theorem proving system based on minimal predicate logic. Our system combines the strengths of machine learning, human intuition, and rigorous logical control to create an effective and reliable tool for constructing and verifying proofs.
A key feature of the proposed approach is its adaptability, which is realized through a mechanism for managing the elements of strategies. It is assumed that the proposed tool initially contains a ready-made set of strategic elements and structural characteristics that can be utilized in decision-making. For instance, one such characteristic is the history of inference, that is, the record of how specific elements appeared in the premises of the theorem being proven. This historical data enables the system to make decisions based on previously taken steps, avoiding repetitions and loops.
During the initial setup of the system, the user selects the most important strategic elements and characteristics of the propositions to be proven. If necessary, the user can add new strategic elements and characteristics in accordance with the provided interfaces and connection agreements. Applying the resulting system involves learning which strategic elements and characteristics to use based on the success of the proof attempts.
The choice of minimal predicate logic as the foundation for our system is motivated by its simplicity and its applicability to a wide range of mathematical and logical problems. The use of
straightforward sequential calculus ensures clarity and transparency in the proof process, which is crucial for the initial exposition of the system’s concepts.
It is important to note that our system serves as a universal framework for managing various proof strategies and characteristics of propositions used for training. The user is provided with a basic architecture and a library of strategies and characteristics with the option to select existing ones or add their own through the provided interface. This ensures the system’s flexibility and adaptability to different types of tasks, learning methods, and user preferences.
The process of working with the system involves the following stages:
• the user selects the desired strategies from the library or adds their own;
• the human-machine system is trained on example problems chosen by the user for a specific subject area;
• after training, the system is ready to tackle new problems in the selected domain.
The following sections discuss the system’s architecture and its main components: a module for representing sequents with history, a machine-based proposition generator, an expert interface, a logical filter, a knowledge base for proofs, and a training module. Special attention is given to the functioning of the system, demonstrating the interaction between the human expert and the machine component at various stages of the proof process.
Examples of the system’s performance on a range of logical problems of varying complexity are presented, showcasing its ability to find proofs and adapt to the reasoning style of specific experts.
§ 1. Related works
The field of human-machine learning systems and automated theorem proving has made significant advances in recent years, laying the groundwork for our innovative approach. This section provides a comprehensive overview of the seminal works that have shaped the landscape of theorem proving and formal verification.
§ 1.1. Foundations of automated theorem proving
The roots of modern automated theorem proving can be traced back to the pioneering work of Novikov (1977) [1], who made fundamental contributions to proof theory by developing constructive logic. This work was built upon the intuitionistic interpretation of logic proposed by Kolmogoroff (1925) [2], which later found application in constructive methods of theorem proving. Shanin (1962) [3] further advanced the field by developing an algorithmic approach to constructive analysis, making significant contributions to automated proof methods in mathematical analysis.
The resolution method proposed by Robinson (1965) [4] marked a turning point in the field, becoming the foundation for many contemporary automated theorem proving systems. This was complemented by Dragalin’s (2003) [5] exploration of constructive logics and their applications in computer science, which contributed significantly to the development of automated theorem proving systems.
§ 1.2. Interactive theorem proving systems
The advent of interactive theorem proving systems represented a significant leap forward in the field. Paulson (1994) [6] created the Isabelle system, an interactive proof assistant that has been widely adopted in formal verification. Similarly, Harrison’s (1996) [7] HOL Light system has been instrumental in the formalization of complex mathematical proofs. Wiedijk’s (2006) [8] comparative analysis of various automated theorem proving systems enhanced our understanding of their respective strengths and weaknesses, paving the way for more sophisticated approaches.
§ 1.3. Machine learning in theorem proving
The integration of machine learning techniques with automated reasoning has opened new frontiers in theorem proving. Lenat’s (1983) [9] Eurisko system, a successor to his Automated Mathematician (AM), demonstrated the potential of machine-generated heuristics in mathematical discovery. Urban’s (2007) [10] MaLARea system combined machine learning and automated reasoning to solve problems in large mathematical theories, setting the stage for more advanced hybrid approaches.
Recent years have seen a surge in the application of deep learning and reinforcement learning techniques to theorem proving. Bansal et al. (2019) [11] developed HOList, a machine learning environment for tactics-based theorem proving, demonstrating the potential of deep learning in guiding proof search. Piotrowski and Urban’s (2018) [12] ATPboost system showcased the effectiveness of learning-based premise selection in large-theory mathematics.
Kaliszyk et al. (2018) [13] applied reinforcement learning to automated theorem proving, guiding a connection-style prover with learned policies and highlighting the potential of reinforcement learning in this domain. Huang et al. (2018) [14] developed GamePad, a learning environment and benchmark for theorem proving built on the Coq proof assistant, providing valuable insights into the challenges of integrating machine learning with interactive theorem provers.
§ 1.4. Advanced language models and formal mathematics
The application of large language models to theorem proving has opened up exciting new avenues in the field. Polu and Sutskever (2020) [15] introduced GPT-f, a language model fine-tuned on formal mathematics, demonstrating the potential of these models to understand and generate mathematical proofs. Their work showed that language models can be effectively applied to tasks in formal mathematics, paving the way for further research in this direction.
Wu et al. (2022) [16] built upon this foundation by developing autoformalization techniques. Their research focused on translating informal mathematics into formal representations using large language models. This work addressed the challenge of bridging the gap between human-written mathematical texts and machine-readable formal proofs, a crucial step in automating mathematical reasoning.
Polu et al. (2022) [17] presented a formal mathematics statement curriculum learning approach. Their method focused on training large language models to generate and complete formal mathematical statements. By carefully designing a curriculum of increasingly complex mathematical statements, they achieved significant improvements in the model’s ability to reason about and generate formal mathematics, demonstrating the importance of structured learning approaches in this domain.
Chen et al. (2023) [18] introduced TheoremQA, a theorem-driven question answering dataset. This comprehensive dataset was designed to evaluate the mathematical reasoning capabilities of large language models across a wide range of mathematical topics. TheoremQA tests not only a model’s understanding of theorems but also its ability to apply those theorems in problem-solving contexts, providing a robust benchmark for assessing mathematical reasoning in AI systems.
§ 1.5. Recent advancements in theorem proving systems
Recent years have seen remarkable progress in the development of sophisticated theorem proving systems. Gauthier et al. (2021) [19] introduced TacticToe, a learning-based proof search system integrated into the HOL4 proof assistant. TacticToe employs machine learning techniques to recommend tactics in interactive theorem proving, significantly enhancing the efficiency of proof development by leveraging patterns from previously proven theorems.
Li et al. (2020) [20] introduced IsarStep, a benchmark for high-level mathematical reasoning built from intermediate steps of Isabelle/Isar proofs. Their results show that neural models can propose non-trivial intermediate proof steps, demonstrating the effectiveness of integrating machine learning into traditional automated reasoning techniques and informing the design of learning-guided proof search.
Yang et al. (2023) [21] introduced LeanDojo, an innovative theorem proving system that combines retrieval-augmented language models with formal reasoning techniques. It utilizes a large-scale retrieval mechanism to access relevant mathematical knowledge and fine-tuned language models to understand and generate complex mathematical content. LeanDojo supports interactive proof development, allowing human guidance, and ensures correctness through formal verification within the Lean theorem prover. With an adaptive learning mechanism that improves over time, LeanDojo significantly enhances the efficacy of formal theorem proving, facilitating the tackling of complex mathematical problems and advancing research in the field.
§ 1.6. Comparison with our work
Our adaptive human-machine theorem proving system for minimal predicate logic offers several distinct advantages over existing approaches:
• Specialization in minimal predicate logic: Unlike more general systems, our focused approach allows for optimized algorithms and interfaces, potentially leading to more efficient human-machine interaction during the proof process.
• Adaptive learning: Our system continually adapts to the reasoning style of specific users, distinguishing it from systems with fixed models or pre-trained language models.
• Integrated approach: We combine the strengths of automated theorem proving with interactive proof capabilities, effectively utilizing both computational power and human expert intuition.
• Transparency and interpretability: Our system ensures complete transparency in the proof process through the use of inference history, making it particularly suitable for educational contexts and users who need to understand each step of the proof.
• Effective knowledge reuse: By leveraging previously found proof patterns stored in a knowledge base, our system achieves significant efficiency gains in proving related theorems.
• Flexible architecture: Our modular design allows for easy integration of new strategies and characteristics, enabling the system to evolve with advancements in the field.
• Human-machine collaboration: We emphasize the synergy between human intuition and machine capabilities, making our system particularly effective for complex proofs where human insight can guide the automated search.
• Domain adaptability: The system’s ability to be trained on user-selected example problems makes it adaptable to specific domains of mathematics or logic.
In conclusion, our system represents a unique combination of specialization, adaptability, and interpretability, making it particularly effective for working with proofs in minimal predicate logic. It integrates the strengths of existing approaches while offering novel solutions to enhance the efficiency and transparency of the theorem proving process.
§ 2. Simple sequential calculus of minimal predicate logic
§ 2.1. Syntax of first-order language
The calculus language is based on first-order formulas. Its context-free syntax is defined by the following rules (in Backus-Naur notation, where terminals are enclosed in single quotes and curly braces denote the operation of iteration):
<formula> ::= <name> '(' { <name> ',' } <name> ')'
    | '(' <formula> <connective> <formula> ')'
    | <quantifier> <name> <formula>
<connective> ::= '&' | 'V' | '=>'
<quantifier> ::= 'A' | 'E'
<sequent> ::= '->' <formula> | <formula> <sequent> | <name> <sequent>
For simplicity, the simple calculus does not use complex terms, only names. This can be done without diminishing the applicability of such systems, and the calculus can be extended in the future. For convenience in machine implementation, the quantifiers are denoted by the usual capital letters: A (universal) and E (existential). Here, <name> stands for non-empty sequences of letters (for example, from the English alphabet).
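To make the grammar concrete, the following sketch shows one possible in-memory representation of formulas and sequents corresponding to this BNF. It is an illustration only, not part of the calculus definition; all class names (Atom, Binary, Quantified, Sequent) are our own.

```python
from dataclasses import dataclass
from typing import Tuple, Union

# Atomic formula: a predicate name applied to names, e.g. P(x, y).
@dataclass(frozen=True)
class Atom:
    name: str
    args: Tuple[str, ...]

# Binary formula built with one of the connectives '&', 'V', '=>'.
@dataclass(frozen=True)
class Binary:
    connective: str      # '&', 'V' or '=>'
    left: "Formula"
    right: "Formula"

# Quantified formula: quantifier 'A' (universal) or 'E' (existential).
@dataclass(frozen=True)
class Quantified:
    quantifier: str      # 'A' or 'E'
    variable: str
    body: "Formula"

Formula = Union[Atom, Binary, Quantified]

# A sequent: a list of premises (formulas and/or names) and a goal formula.
@dataclass
class Sequent:
    premises: Tuple[Union[Formula, str], ...]
    goal: Formula

# Example: the sequent  ExP(x) -> EyP(y)
example = Sequent(
    premises=(Quantified('E', 'x', Atom('P', ('x',))),),
    goal=Quantified('E', 'y', Atom('P', ('y',))),
)
```

A parser from the concrete syntax above into these structures can be layered on top of this representation in the usual way.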
It is important to note that the minimal calculus does not use negation, which makes it especially suitable for practical deductive synthesis of algorithms and programs. This approach implies not only ruling out certain possibilities for the processed values but also detecting and handling errors if they arise.
§ 2.2. Postulates of the calculus
To formally define the rules of inference in minimal sequential calculus, the following metavariables are used:
• G — finite sequences of formulas and/or names,
• B,C,D — formulas,
• x,y — names of variables bound in formulas,
• b — names in the sequent,
• a — new name added to the sequent,
• f,g — proofs, where special terms reflecting the structure of the inference are used as notation for proofs. These terms can then be used in practical deductive synthesis of algorithms.
The postulates of the calculus are expressed in the form:
name : sequent ⇐ conditions, premises
where:
• name — designation of the postulate, a name with possible parameters,
• sequent — conclusion of the postulate,
• conditions — conditions for applying the postulate,
• premises — premises of the postulate.
§ 2.3. Basic postulates
The basic postulates of our calculus define the fundamental rules of inference. Each postulate is presented in a structured format, consisting of a name, a sequent (which represents the conclusion), and any necessary conditions or premises. Let’s examine each postulate in detail.
§ 2.3.1. Axiom (Pr)
The Axiom postulate, denoted as pr[i], allows us to derive a formula B from a sequence G if B is identical to the ith element of G. Formally:
pr[i] : (G → B) ⇐ (B == G[i]).
Here, == denotes graphical (literal) equality, and G[i] is the ith element of the list G, with elements numbered from right to left starting at zero.
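As a small illustration of this indexing convention and of the axiom check, one might write the following helpers (the function names premise and axiom_applies are our own, not part of the system):

```python
# Premises are numbered from right to left starting at zero,
# i.e. G[0] is the rightmost (most recently added) premise.
def premise(G, i):
    """Return the i-th premise of the list G under right-to-left numbering."""
    return G[len(G) - 1 - i]

def axiom_applies(G, B, i):
    """Condition of the postulate pr[i]: the goal B must literally equal G[i]."""
    return premise(G, i) == B

# With G written left to right as D, C, B we have G[0] == "B" and G[1] == "C".
G = ["D", "C", "B"]
assert premise(G, 0) == "B" and premise(G, 1) == "C"
assert axiom_applies(G, "C", 1)
```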
§2.3.2. Conjunction Introduction (Cg)
The Conjunction Introduction rule, cg(f, g), allows us to derive a conjunction (B&C) if we can separately derive both B and C from the same sequence G:
cg(f, g) : (G → (B&C)) ⇐ f : (G → B), g : (G → C).
§ 2.3.3. Disjunction Introduction (Dg)
We have two Disjunction Introduction rules, dg0(f) and dg1(f), which allow us to derive a disjunction (B V C) if we can derive either B or C:
dg0(f) : (G → (B V C)) ⇐ f : (G → B); dg1(f) : (G → (B V C)) ⇐ f : (G → C).
§ 2.3.4. Implication Introduction (Ig)
The Implication Introduction rule, ig(f), allows us to derive an implication (B ⇒ C) if we can derive C from G extended with B:
ig(f) : (G → (B ⇒ C)) ⇐ f : (GB → C).
§2.3.5. Universal Quantifier Introduction (Ag)
The Universal Quantifier Introduction rule, ag(f), allows us to derive a universally quantified formula (AxB(x)) if we can derive B(a) for an arbitrary name a:
ag(f) : (G → AxB(x)) ⇐ f : (Ga → B(a)).
Here, B(a) is the result of substituting all free occurrences of x in B(x) with a. The names bound in the sequent and in the formulas are distinct.
§2.3.6. Existential Quantifier Introduction (Eg)
The Existential Quantifier Introduction rule, eg[i](f), allows us to derive an existentially quantified formula (ExB(x)) if we can derive B(b) for some specific name b:
eg[i](f) : (G → ExB(x)) ⇐ (G[i] == b), f : (G → B(b)).
§2.3.7. Conjunction Elimination (Cu)
The Conjunction Elimination rule, cu[i](f), allows us to use a conjunction (B&C) to derive a formula D:
cu[i](f) : (G → D) ⇐ (G[i] == (B&C)), f : (GBC → D).
§ 2.3.8. Disjunction Elimination (Du)
The Disjunction Elimination rule, du[i](f, g), allows us to use a disjunction (B V C) to derive a formula D if we can derive D from both B and C separately:
du[i](f, g) : (G → D) ⇐ (G[i] == (B V C)), f : (GB → D), g : (GC → D).
§ 2.3.9. Implication Elimination (Iu)
The Implication Elimination rule, iu[i](f, g), allows us to use an implication (B ⇒ C) to derive a formula D:
iu[i](f, g) : (G → D) ⇐ (G[i] == (B ⇒ C)), f : (G → B), g : (GC → D).
§ 2.3.10. Universal Quantifier Elimination (Au)
The Universal Quantifier Elimination rule, au[i, j](f), allows us to use a universally quantified formula (AxB(x)) to derive a formula D:
au[i, j](f) : (G → D) ⇐ (G[i] == AxB(x)), (G[j] == b), f : (GB(b) → D).
§ 2.3.11. Existential Quantifier Elimination (Eu)
The Existential Quantifier Elimination rule, eu[i](f), allows us to use an existentially quantified formula (ExB(x)) to derive a formula D:
eu[i](f) : (G → D) ⇐ (G[i] == ExB(x)), f : (GaB(a) → D).
These postulates form the foundation of our calculus, providing a comprehensive set of rules for logical inference within the framework of minimal predicate logic.
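To illustrate how a logical filter might verify proof terms built from these postulates, the sketch below checks three of the rules (pr, cg and ig) over a simplified representation in which formulas are nested tuples and premises are plain lists. This is our own simplification for exposition, not the system's implementation; the remaining rules are checked analogously.

```python
# Proof terms are nested tuples: ("pr", i), ("cg", f, g), ("ig", f).
# Formulas are atoms (plain strings) or tuples ("&", B, C) and ("=>", B, C).
# A sequent is a pair (G, goal), where G is the list of premises.

def premise(G, i):
    # Premises are numbered from right to left starting at zero, as in the text.
    return G[len(G) - 1 - i]

def check(proof, sequent):
    """Return True if `proof` is a correct derivation of `sequent`."""
    G, goal = sequent
    kind = proof[0]
    if kind == "pr":                       # axiom pr[i]: the goal must literally equal G[i]
        _, i = proof
        return 0 <= i < len(G) and premise(G, i) == goal
    if kind == "cg":                       # conjunction introduction
        _, f, g = proof
        return (isinstance(goal, tuple) and goal[0] == "&"
                and check(f, (G, goal[1])) and check(g, (G, goal[2])))
    if kind == "ig":                       # implication introduction: derive C from G extended with B
        _, f = proof
        return (isinstance(goal, tuple) and goal[0] == "=>"
                and check(f, (G + [goal[1]], goal[2])))
    return False                           # remaining rules omitted in this sketch

# Example: a proof of the sequent  B -> (B & B)  from the single premise B.
assert check(("cg", ("pr", 0), ("pr", 0)), (["B"], ("&", "B", "B")))
# Example: a proof of  -> (B => (B & B))  from an empty list of premises.
assert check(("ig", ("cg", ("pr", 0), ("pr", 0))), ([], ("=>", "B", ("&", "B", "B"))))
```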
§ 2.4. History of inference
In our system, each formula and variable in the premises of a sequent is assigned a history of inference, which allows for tracking the origin (the sequence of rule applications used to derive) of that formula or variable and enables analysis of the proof. The history can take one of the following forms:
• Init: the formula or variable comes from the initial sequent entered by the user,
• rf, rn: the formula is obtained by applying the rule au to the formula rf and the variable rn; here and thereafter, rf and rn can be represented by numbers or, in examples, by labels of formulas,
• rf: the formula or variable is obtained by applying the rule to the formula numbered rf; in the case of the rule eu, both a formula and a variable are produced.
This information about the origin of each formula plays a crucial role in the operation of the machine idea generator and in the learning process of the system.
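A minimal way to attach such histories to the premises of a sequent is sketched below; the record layout and field names are our own illustrative choice.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class History:
    """Origin of a premise formula or variable, as described above."""
    rule: str                              # "Init", or the name of the rule that produced it, e.g. "au", "eu"
    source_formula: Optional[int] = None   # number or label rf of the formula the rule was applied to
    source_name: Optional[int] = None      # number or label rn of the variable (for the rule au)

@dataclass
class Premise:
    content: str                           # the formula or name itself
    history: History

# A premise taken from the initial sequent:
p0 = Premise("ExP(x)", History("Init"))
# A premise produced by applying the rule eu to premise number 0:
p1 = Premise("P(a)", History("eu", source_formula=0))
```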
§ 3. Architecture of the system
The architecture of our human-machine system consists of the following components; a minimal interface sketch of these modules is given at the end of this section.
§ 3.1. Sequent Representation Module with History performs the following functions:
• stores the current state of the proof in the form of sequents with attached histories of inference;
• provides an interface for manipulating sequents, including adding new formulas and updating histories;
• ensures efficient indexing and searching through formulas and their histories.
§ 3.2. Machine Proposal Generator performs the following functions:
• utilizes machine learning methods to suggest the next steps in the proof;
• takes the history of inference into account when generating proposals, analyzing successful patterns from past proofs;
• outputs a set of possible next sequents with an assessment of their viability, based on the statistics of the success of similar steps in the past;
• applies machine learning techniques to recognize complex logical structures and propose non-trivial steps in the proof.
§ 3.3. Expert Interface performs the following functions:
• provides a user-friendly graphical interface for interaction between the human expert and the system;
• allows the expert to select the best sequent from those proposed by the machine, evaluating their viability;
• enables the expert to propose their own proof steps by entering new sequents or modifying existing ones;
• visualizes the history of inference as a tree or graph for ease of analysis and navigation through the proof;
• provides tools for annotating proof steps and adding comments.
§3.4. Logical Filter is implemented as a verified program that guarantees the reliability of checks; it performs the following functions:
• verifies the correctness of each step in the proof, ensuring compliance with the rules of minimal sequential calculus;
• rejects incorrect steps, maintaining the logical integrity of the proof;
• provides detailed explanations for the rejection of incorrect steps to educate the system and inform the expert.
§ 3.5. Knowledge Base of Proofs performs the following functions:
• stores successful proofs and sub-proofs along with their histories in a structured format;
• utilizes efficient data structures for quick searching and retrieval of relevant proof patterns;
• allows for the reuse of identified patterns in new tasks, accelerating the proof process;
• supports versioning and the ability to revert to previous states of the knowledge base.
§ 3.6. Learning Module performs the following functions:
• analyzes successful proofs to identify effective strategies and improve search heuristics;
• updates machine learning models based on interactions with the expert, adapting to their reasoning style;
• applies reinforcement learning methods to optimize rule selection strategies.
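Purely as an illustration of the division of responsibilities described in this section, the modules can be summarized by the following minimal interfaces. All names and signatures are assumptions of this sketch, not the actual implementation; in particular, sequents and proof steps are reduced to strings.

```python
from typing import List, Protocol, Tuple

class SequentStore(Protocol):
    """§ 3.1: stores sequents together with their histories of inference."""
    def current_state(self) -> List[str]: ...
    def apply_step(self, sequent: str, history: str) -> None: ...
    def is_complete(self) -> bool: ...
    def final_proof(self) -> str: ...

class ProposalGenerator(Protocol):
    """§ 3.2: suggests candidate next sequents with viability scores."""
    def propose(self, state: List[str]) -> List[Tuple[str, float]]: ...

class ExpertInterface(Protocol):
    """§ 3.3: lets the expert pick, modify or reject proposed steps."""
    def choose(self, candidates: List[Tuple[str, float]]) -> Tuple[str, str]: ...
    def report(self, explanation: str) -> None: ...

class LogicalFilter(Protocol):
    """§ 3.4: verifies each step against the rules of the calculus."""
    def verify(self, step: str) -> Tuple[bool, str]: ...

class KnowledgeBase(Protocol):
    """§ 3.5: stores and retrieves successful proofs and sub-proofs."""
    def store(self, proof: str) -> None: ...
    def retrieve_similar(self, goal: str) -> List[str]: ...

class LearningModule(Protocol):
    """§ 3.6: updates the models from accepted and rejected steps."""
    def update(self, step: str, accepted: bool) -> None: ...
```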
§ 4. Functioning of the system
The proof process in our human-machine system occurs as follows:
§4.1. Initialization:
• the expert inputs the initial sequent through the system interface;
• the sequent representation module creates the initial state of all formulas and names in the premises with the history labeled as Init.
§ 4.2. Proof Completion Check or Proposal Generation:
• the machine generator analyzes the current state and history of inference;
• a set of possible next sequents is generated with viability assessments;
• each generated sequent is accompanied by its corresponding history.
§ 4.3. Expert Selection:
• the expert reviews the machine-proposed options through the interface;
• the expert can choose the best path for the proof or propose an alternative;
• if necessary, the expert can add comments or annotations.
§ 4.4. Filter Verification:
• the selected or expert-proposed sequent passes through the logical filter;
• the filter checks the correctness of rule applications;
• in the case of an error, the filter provides a detailed explanation.
§ 4.5. State Update:
• if the step passes the filter check, the sequent representation module updates the current state of the proof;
• the history of inference is updated according to the applied rule.
§ 4.6. Learning and Adaptation:
• the learning module analyzes the new step and updates the machine learning models;
• the knowledge base of proofs is enriched with a new pattern if deemed useful.
§4.7. Iteration:
• the process repeats from the second step.
• when the proof is completed, the system saves it in the knowledge base along with the complete history of inference;
• the expert can conduct a final analysis and add general comments to the proof.
This process ensures effective interaction between the human expert and the machine component of the system, leveraging the strengths of both approaches to solving logical problems. The history of inference plays a key role at each stage, providing transparency in the proof process and enabling analysis and improvement.
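Under the assumed interfaces sketched at the end of § 3, the workflow of § 4.1–4.7 corresponds roughly to the following loop. This is a schematic rendering only: expert interaction is reduced to a choose/report callback, and the history attached to a step is supplied by the expert interface together with the step.

```python
def prove(initial_sequent, store, generator, expert, logical_filter, learner, kb, max_steps=100):
    """Illustrative main loop of the human-machine proof process (a sketch, not the implementation)."""
    store.apply_step(initial_sequent, "Init")                   # § 4.1 initialization
    for _ in range(max_steps):
        if store.is_complete():                                 # § 4.2 proof completion check
            proof = store.final_proof()
            kb.store(proof)                                     # § 4.7 save the finished proof
            return proof
        candidates = generator.propose(store.current_state())   # § 4.2 proposal generation
        step, history = expert.choose(candidates)               # § 4.3 expert selection or own step
        ok, explanation = logical_filter.verify(step)           # § 4.4 filter verification
        if not ok:
            expert.report(explanation)                          # detailed explanation of the rejection
            continue
        store.apply_step(step, history)                         # § 4.5 state update with history
        learner.update(step, accepted=True)                     # § 4.6 learning and adaptation
    return None                                                 # no proof found within the step budget
```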
§ 5. Examples of system operation
To illustrate the effectiveness and capabilities of the proposed human-machine learning system, we will explore several specific examples of its operation within the framework of minimal sequential calculus. These examples will highlight both the system’s reasoning processes and the collaborative interaction between the machine and the human expert.
§5.1. Proof of a simple theorem
We begin by considering the following formula:
(∃x(P(x) ∧ Q(x)) ⇒ (∃yP(y) ∧ ∃zQ(z)))
This formula asserts that if there exists an x such that both P(x) and Q(x) hold true, then there must also exist a y for which P(y) is true and a z for which Q(z) is true.
§5.1.1. Initialization:
The expert initiates the process by inputting the sequent:
→ (∃x(P(x) ∧ Q(x)) ⇒ (∃yP(y) ∧ ∃zQ(z)))
§ 5.1.2. First Iteration:
The machine begins generating an initial proof based on the provided sequent:
- f0 : → (∃x(P(x) ∧ Q(x)) ⇒ (∃yP(y) ∧ ∃zQ(z)))
- f0 = ig(f1)
- f1 : ∃x(P(x) ∧ Q(x)) → (∃yP(y) ∧ ∃zQ(z))
Next, the proof is broken down into two core components:
- f1 = cg(f2, f3)
- f2 : ∃x(P(x) ∧ Q(x)) → ∃yP(y)
- f3 : ∃x(P(x) ∧ Q(x)) → ∃zQ(z)
Each of these components is further refined through the application of various rules:
- For f2:
- f2 = eu[0](f4)
- f4 : a(P(a) ∧ Q(a)) → ∃yP(y)
- f4 = eg[1](f5)
- f5 : a(P(a) ∧ Q(a)) → P(a)
- f5 = cu[1](f6)
- f6 : aP(a)Q(a) → P(a)
- f6 = pr[1]
- For f3:
- f3 = eu[0](f7)
- f7 : a(P(a) ∧ Q(a)) → ∃zQ(z)
- f7 = eg[1](f8)
- f8 : a(P(a) ∧ Q(a)) → Q(a)
- f8 = cu[1](f9)
- f9 : aP(a)Q(a) → Q(a)
- f9 = pr[2]
§ 5.1.3. Expert Analysis:
Upon reviewing the initial proof, the expert identifies opportunities for optimization, noting that the number of steps taken could be significantly reduced.
§5.1.4. Feedback:
The expert suggests starting with the existential quantifier in the second step.
§ 5.1.5. Second Iteration:
Taking the expert's suggestion into account, the machine generates a new proof:
- f0 : → (∃x(P(x) ∧ Q(x)) ⇒ (∃yP(y) ∧ ∃zQ(z)))
- f0 = ig(f1)
- f1 : ∃x(P(x) ∧ Q(x)) → (∃yP(y) ∧ ∃zQ(z))
- f1 = eu[0](f2)
- f2 : a(P(a) ∧ Q(a)) → (∃yP(y) ∧ ∃zQ(z))
- f2 = cg(f3, f4)
- f3 : a(P(a) ∧ Q(a)) → ∃yP(y)
- f4 : a(P(a) ∧ Q(a)) → ∃zQ(z)
- f3 = eg[1](f5)
- f5 : a(P(a) ∧ Q(a)) → P(a)
- f5 = cu[1](f6)
- f6 : aP(a)Q(a) → P(a)
- f6 = pr[1]
- f4 = eg[1](f7)
- f7 : a(P(a) ∧ Q(a)) → Q(a)
- f7 = cu[1](f8)
- f8 : aP(a)Q(a) → Q(a)
- f8 = pr[2]
§5.1.6. Filter Verification:
The filter checks each step of the proof and confirms its logical correctness.
§ 5.1.7. Expert Analysis:
The expert reviews the new proof and notes its clarity and effectiveness.
§5.1.8. Completion:
The expert confirms the final version of the proof. The system saves the result in the knowledge base.
Consider the deduction:
- f0 : → (∃xP(x) ⇒ (∀y∃zQ(y, z) ⇒ ∃u∃w(P(u) ∧ Q(u, w))))
- f0 = ig(f1)
- f1 : ∃xP(x) → (∀y∃zQ(y, z) ⇒ ∃u∃w(P(u) ∧ Q(u, w)))
- f1 = ig(f2)
- f2 : ∃xP(x)∀y∃zQ(y, z) → ∃u∃w(P(u) ∧ Q(u, w))
- f2 = eu[0](f3)
- f3 : aP(a)∀y∃zQ(y, z) → ∃u∃w(P(u) ∧ Q(u, w))
- f3 = eg[1](f4)
- f4 : aP(a)∀y∃zQ(y, z) → ∃w(P(a) ∧ Q(a, w))
- f4 = eg[1](f5)
- f5 : aP(a)∀y∃zQ(y, z) → (P(a) ∧ Q(a, a))
- f5 = au[1, 1](f6)
- f6 : aP(a)∀y∃zQ(y, z)∃zQ(a, z) → (P(a) ∧ Q(a, a))
- f6 = eu[2](f7)
- f7 : aP(a)∀y∃zQ(y, z)∃zQ(a, z)bQ(a, b) → (P(a) ∧ Q(a, a))
This deduction can continue indefinitely without success.
The expert suggests that it would have been better to execute:
- f4 = au[1, 1]
This deduction can then be completed:
- f0 : → (∃xP(x) ⇒ (∀y∃zQ(y, z) ⇒ ∃u∃w(P(u) ∧ Q(u, w))))
- f0 = ig(f1)
- f1 : ∃xP(x) → (∀y∃zQ(y, z) ⇒ ∃u∃w(P(u) ∧ Q(u, w)))
- f1 = ig(f2)
- f2 : ∃xP(x)∀y∃zQ(y, z) → ∃u∃w(P(u) ∧ Q(u, w))
- f2 = eu[0](f3)
- f3 : aP(a)∀y∃zQ(y, z) → ∃u∃w(P(u) ∧ Q(u, w))
- f3 = eg[1](f4)
- f4 : aP(a)∀y∃zQ(y, z) → ∃w(P(a) ∧ Q(a, w))
- f4 = au[1, 1](f5)
- f5 : aP(a)∀y∃zQ(y, z)∃zQ(a, z) → ∃w(P(a) ∧ Q(a, w))
- f5 = eu[3](f6)
- f6 : aP(a)∀y∃zQ(y, z)∃zQ(a, z)bQ(a, b) → ∃w(P(a) ∧ Q(a, w))
- f6 = eg[4](f7)
- f7 : aP(a)∀y∃zQ(y, z)∃zQ(a, z)bQ(a, b) → (P(a) ∧ Q(a, b))
- f7 = cg(f8, f9)
- f8 : aP(a)∀y∃zQ(y, z)∃zQ(a, z)bQ(a, b) → P(a)
- f9 : aP(a)∀y∃zQ(y, z)∃zQ(a, z)bQ(a, b) → Q(a, b)
- f8 = pr[1]
- f9 = pr[4]
For forced deductions, the machine automatically attempts to complete the proof in k steps (where k is fixed, e.g., 1, 2, or 3). This allows the system to quickly test various options for continuing the proof without delving too deeply into potentially dead-end branches. If the proof cannot be completed in k steps, the system reverts to the previous state and tries a different path or requests assistance from the expert.
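A minimal sketch of this bounded lookahead with backtracking is given below. It assumes a successors function that enumerates the sequents reachable by one rule application and an is_closed test for completed goals (both names are our own); for simplicity, each rule application is treated as leaving a single open sequent.

```python
def try_complete(sequent, successors, is_closed, k):
    """Try to close `sequent` within at most k rule applications.

    Returns the list of applied steps on success, or None if no completion
    was found within the budget; the caller then reverts to the previous
    state, tries a different path, or asks the expert for assistance.
    """
    if is_closed(sequent):
        return []                      # nothing left to prove on this branch
    if k == 0:
        return None                    # budget exhausted: treat as a dead end
    for step, next_sequent in successors(sequent):
        rest = try_complete(next_sequent, successors, is_closed, k - 1)
        if rest is not None:
            return [step] + rest       # success: record this step and the rest of the branch
    return None                        # no continuation closed the goal within k steps
```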
§6. Conclusion
In conclusion, it can be noted that the developed integrated human-machine system for adaptive theorem proving in minimal predicate logic demonstrates significant potential in the field of automating logical reasoning. The system successfully combines the strengths of machine learning and human intuition, enabling the resolution of complex logical problems more efficiently than using purely automated or purely manual methods.
A key achievement of the system is its adaptability and learnability. Thanks to its inference history mechanism and learning module, the system can adapt to the reasoning style of a specific expert and improve its proof strategies over time. This not only enhances operational efficiency but also opens up new opportunities for a personalized approach to automated theorem proving.
Particular attention has been paid to the reliability and verifiability of results. The use of a logical filter ensures the correctness of each step in the proof. This is critically important for applying the system in areas that require mathematical rigor.
The system also demonstrates significant potential in the accumulation and utilization of knowledge. Its extensible knowledge base allows the system to gather successful proof samples, which significantly increases the efficiency of solving new problems. This feature is especially valuable for utilizing the system in research and educational contexts.
The transparency of the proof process, ensured by a detailed history of inference and the ability to add comments, makes the system a powerful tool not only for professional researchers but also for educational purposes. Step-by-step explanations and the possibility of interactive engagement make the process of learning logic and proof methods more accessible and effective.
The prospects for further development of the system are quite extensive. There are plans to extend its functionality to more complex logical systems, including classical and intuitionistic logic. The development of more advanced machine learning methods will improve the generation of ideas and proof strategies, while the creation of a library of frequently used theorems and lemmas will expedite the proof process for complex assertions.
Special emphasis will be placed on developing interfaces for integration with existing automated theorem proving systems. This will allow for the combination of the strengths of various approaches and create an even more powerful tool for addressing a wide range of logical problems.
Thus, the proposed system represents a significant advance in the field of automated theorem proving. It clearly demonstrates how the synergy between human and machine intelligence can lead to the creation of tools for solving complex logical challenges. Further development and refinement of this system promise improvements in approaches to automating logical reasoning and may find broad applications across various fields of science and engineering where rigorous logical justification is required. The presented system will serve as a foundation for further research in adaptive human-machine systems for logical inference and will contribute to advancements in understanding and applying formal logical methods.
REFERENCES
1. Novikov P. S. Konstruktivnaya matematicheskaya logika s tochki zreniya klassicheskoi (Constructive mathematical logic from the point of view of classical logic), Moscow: Nauka, 1977.
2. Kolmogoroff A. Sur le principe de tertium non datur, Matematicheskii Sbornik, 1925, vol. 32, no. 4, pp. 646-667 (in Russian). https://www.mathnet.ru/eng/sm7425
3. Shanin N. A. Constructive real numbers and constructive functional spaces, Trudy Matematicheskogo Instituta imeni V. A. Steklova, 1962, vol. 67, pp. 15-294 (in Russian). https://www.mathnet.ru/eng/tm1757
4. Robinson J. A. A machine-oriented logic based on the resolution principle, Journal of the ACM, 1965, vol. 12, no. 1, pp. 23-41. https://doi.org/10.1145/321250.321253
5. Dragalin A. G. Konstruktivnaya teoriya dokazatel’stv i nestandartnyi analiz (Constructive proof theory and nonstandard analysis), Moscow: Editorial URSS, 2003. https://zbmath.org/1102.03003
6. Paulson L. C. (Ed.). Isabelle. A generic theorem prover, Berlin-Heidelberg: Springer, 1994. https://doi.org/10.1007/BFb0030541
7. Harrison J. HOL Light: A tutorial introduction, Formal methods in computer-aided design, Berlin-Heidelberg: Springer, 1996, pp. 265-269. https://doi.org/10.1007/BFb0031814
8. Wiedijk F. The seventeen provers of the world, Berlin-Heidelberg: Springer, 2006. https://doi.org/10.1007/11542384
9. Lenat D.B. Eurisko: A program that learns new heuristics and domain concepts: The nature of heuristics III: Program design and results, Artificial Intelligence, 1983, vol. 21, issues 1-2, pp. 61-98. https://doi.org/10.1016/S0004-3702(83)80005-8
10. Urban J. MaLARea: a metasystem for automated reasoning in large theories, ESARLT: Proceedings of the CADE-21 Workshop on Empirically Successful Automated Reasoning in Large Theories, Bremen, Germany, 15th July 2007, pp. 45-58. https://ceur-ws.org/Vol-257/05_Urban.pdf
11. Bansal K., Loos S. M., Rabe M.N., Szegedy C., Wilcox S. HOList: An environment for machine learning of higher-order theorem proving, arXiv:1904.03241 [cs.LO], 2019. https://doi.org/10.48550/arXiv.1904.03241
12. Piotrowski B., Urban J. ATPBOOST: Learning premise selection in binary setting with ATP feedback, Automated Reasoning, Cham: Springer, 2018, pp. 566-574. https://doi.org/10.1007/978-3-319-94205-6_37
13. Kaliszyk C., Urban J., Michalewski H., Olsak M. Reinforcement learning of theorem proving, arXiv:1805.07563 [cs.AI], 2018. https://doi.org/10.48550/arXiv.1805.07563
14. Huang D., Dhariwal P., Song D., Sutskever I. GamePad: A learning environment for theorem proving, arXiv:1806.00608 [cs.LG], 2018. https://doi.org/10.48550/arXiv.1806.00608
15. Polu S., Sutskever I. Generative language modeling for automated theorem proving, arXiv:2009.03393 [cs.LG], 2020. https://doi.org/10.48550/arXiv.2009.03393
16. Wu Y., Jiang A. Q., Li W., Rabe M.N., Staats C., Jamnik M., Szegedy C. Autoformalization with Large Language Models, arXiv:2205.12615 [cs.LG], 2022. https://doi.org/10.48550/arXiv.2205.12615
17. Polu S., Han J. M., Zheng K., Baksys M., Babuschkin I., Sutskever I. Formal mathematics statement curriculum learning, arXiv:2202.01344 [cs.LG], 2022. https://doi.org/10.48550/arXiv.2202.01344
18. Chen Wenhu, Yin Ming, Ku Max, Lu Pan, Wan Yixin, Ma Xueguang, Xu Jianyu, Wang Xinyi, Xia Tony. TheoremQA: A theorem-driven question answering dataset, Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, Singapore: Association for Computational Linguistics, 2023, pp. 7889-7901. https://doi.org/10.18653/v1/2023.emnlp-main.489
19. Gauthier T., Kaliszyk C., Urban J., Kumar R., Norrish M. TacticToe: learning to prove with tactics, Journal of Automated Reasoning, 2021, vol. 65, issue 2, pp. 257-286. https://doi.org/10.1007/s10817-020-09580-x
20. Li Wenda, Yu Lei, Wu Yuhuai, Paulson L. C. IsarStep: a benchmark for high-level mathematical reasoning, arXiv:2006.09265 [cs.LO], 2020. https://doi.org/10.48550/arXiv.2006.09265
21. Yang Kaiyu, Swope A. M., Gu Alex, Chalamala Rahul, Song Peiyang, Yu Shixing, Godil Saad, Prenger Ryan, Anandkumar Anima. LeanDojo: Theorem proving with retrieval-augmented language models, arXiv:2306.15626 [cs.LG], 2023. https://doi.org/10.48550/arXiv.2306.15626
Accepted 10.11.2024
Milad Joudakizadeh, Post-Graduate Student, Department of Theoretical Foundations of Computer Science, Udmurt State University, ul. Universitetskaya, 1, Izhevsk, 426034, Russia.
ORCID: https://orcid.org/0000-0002-6167-6237 E-mail: [email protected]
Anatoly Petrovich Beltiukov, Doctor of Physics and Mathematics, Professor, Department of Theoretical Foundations of Computer Science, Udmurt State University, ul. Universitetskaya, 1, Izhevsk, 426034, Russia.
ORCID: https://orcid.org/0000-0002-3433-9067 E-mail: [email protected]
Citation: M. Joudakizadeh, A. P. Beltiukov. Adaptive human-machine theorem proving system, Izvestiya Instituta Matematiki i Informatiki Udmurtskogo Gosudarstvennogo Universiteta, 2024, vol. 64, pp. 17-33.
M. Джудакизаде, А. П. Бельтюков
Адаптивная человеко-машинная система построения доказательств теорем
Ключевые слова: автоматическое доказательство теорем, минимальная предикатная логика, машинное обучение, человеко-машинное взаимодействие, секвенциальное исчисление, глубокое обучение, верификация доказательств.
УДК: 510.649
DOI: 10.35634/2226-3594-2024-64-02
В данной работе представлен новый подход к построению человеко-машинных систем доказательства теорем. Системы объединяют возможности машинного обучения, экспертные знания человека и строгий логический контроль для эффективного построения и верификации доказательств. Новизна подхода заключается в его открытости: в руки пользователя дается инструмент построения таких систем. С помощью него пользователь может создавать системы доказательства теорем, выбирая уже имеющиеся стратегии построения логического вывода или добавляя новые в соответствии с предоставленными интерфейсами или соглашениями о связях. Строящиеся системы имеют существенный человеко-машинный адаптивный характер. При их работе строится и обучается метастратегия на основе исходных элементов имеющихся стратегий. Система разработана как универсальная основа для управления различными стратегиями, с предоставлением базовой архитектуры и библиотеки стратегий с возможностью добавления новых. При обучении также накапливается и обучается система структурных характеристик, на основе которых принимается решение об использовании того или иного элемента стратегий на очередном шаге. Изложение ведется для секвенциального исчисления минимальной позитивной логики предикатов как наиболее подходящей для дедуктивного синтеза программ и алгоритмов, для чего и предполагается использовать данный подход.
СПИСОК ЛИТЕРАТУРЫ
1. Новиков П. С. Конструктивная математическая логика с точки зрения классической. М.: Наука, 1977.
2. Колмогоров А. Н. О принципе tertium non datur // Математический сборник. 1925. Т. 32. № 4. С. 646-667. https://www.mathnet.ru/rus/sm7425
3. Шанин Н.А. Конструктивные вещественные числа и конструктивные функциональные пространства // Труды Математического института имени В. А. Стеклова. 1962. Т. 67. С. 15-294. https://www.mathnet.ru/rus/tm1757
4. Robinson J. A. A machine-oriented logic based on the resolution principle // Journal of the ACM. 1965. Vol. 12. No. 1. P. 23-41. https://doi.org/10.1145/321250.321253
5. Драгалин А. Г. Конструктивная теория доказательств и нестандартный анализ. М.: Едиториал УРСС, 2003. https://zbmath.org/1102.03003
6. Paulson L. C. (Ed.). Isabelle. A generic theorem prover. Berlin-Heidelberg: Springer, 1994. https://doi.org/10.1007/BFb0030541
7. Harrison J. HOL Light: A tutorial introduction // Formal methods in computer-aided design. Berlin-Heidelberg: Springer, 1996. P. 265-269. https://doi.org/10.1007/BFb0031814
8. Wiedijk F. The seventeen provers of the world. Berlin-Heidelberg: Springer, 2006. https://doi.org/10.1007/11542384
9. Lenat D. B. Eurisko: A program that learns new heuristics and domain concepts: The nature of heuristics III: Program design and results //Artificial Intelligence. 1983. Vol. 21. Issues 1-2. P. 61-98. https://doi.org/10.1016/S0004-3702(83)80005-8
10. Urban J. MaLARea: a metasystem for automated reasoning in large theories // ESARLT: Proceedings of the CADE-21 Workshop on Empirically Successful Automated Reasoning in Large Theories, Bremen, German, 15th July, 2007. 2007. P. 45-58. https://ceur-ws.org/Vol-257/05_Urban.pdf
11. Bansal K., Loos S.M., Rabe M.N., Szegedy C., Wilcox S. HOList: An environment for machine learning of higher-order theorem proving // arXiv:1904.03241 [cs.LO]. 2019. https://doi.org/10.48550/arXiv.1904.03241
12. Piotrowski B., Urban J. ATPBOOST: Learning premise selection in binary setting with ATP feedback // Automated Reasoning. Cham: Springer, 2018. P. 566-574. https://doi.org/10.1007/978-3-319-94205-6_37
13. Kaliszyk C., Urban J., Michalewski H., Olsak M. Reinforcement learning of theorem proving // arXiv:1805.07563 [cs.AI]. 2018. https://doi.org/10.48550/arXiv.1805.07563
14. Huang D., Dhariwal P., Song D., Sutskever I. GamePad: A learning environment for theorem proving // arXiv:1806.00608 [cs.LG]. 2018. https://doi.org/10.48550/arXiv.1806.00608
15. Polu S., Sutskever I. Generative language modeling for automated theorem proving // arXiv:2009.03393 [cs.LG]. 2020. https://doi.org/10.48550/arXiv.2009.03393
16. Wu Y., Jiang A. Q., Li W., Rabe M. N., Staats C., Jamnik M., Szegedy C. Autoformalization with Large Language Models // arXiv:2205.12615 [cs.LG]. 2022. https://doi.org/10.48550/arXiv.2205.12615
17. Polu S., Han J. M., Zheng K., Baksys M., Babuschkin I., Sutskever I. Formal mathematics statement curriculum learning // arXiv:2202.01344 [cs.LG]. 2022. https://doi.org/10.48550/arXiv.2202.01344
18. Chen Wenhu, Yin Ming, Ku Max, Lu Pan, Wan Yixin, Ma Xueguang, Xu Jianyu, Wang Xinyi, Xia Tony. TheoremQA: A theorem-driven question answering dataset // Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. Singapore: Association for Computational Linguistics, 2023. P. 7889-7901. https://doi.org/10.18653/v1/2023.emnlp-main.489
19. Gauthier T., Kaliszyk C., Urban J., Kumar R., Norrish M. TacticToe: learning to prove with tactics // Journal of Automated Reasoning. 2021. Vol. 65. Issue 2. P. 257-286. https://doi.org/10.1007/s10817-020-09580-x
20. Li Wenda, Yu Lei, Wu Yuhuai, Paulson L. C. IsarStep: a benchmark for high-level mathematical reasoning // arXiv:2006.09265 [cs.LO]. 2020. https://doi.org/10.48550/arXiv.2006.09265
21. Yang Kaiyu, Swope A. M., Gu Alex, Chalamala Rahul, Song Peiyang, Yu Shixing, Godil Saad, Prenger Ryan, Anandkumar Anima. LeanDojo: Theorem proving with retrieval-augmented language models // arXiv:2306.15626 [cs.LG]. 2023. https://doi.org/10.48550/arXiv.2306.15626
Поступила в редакцию 13.10.2024
Принята к публикации 10.11.2024
Джудакизаде Милад, аспирант, кафедра теоретических основ информатики, Удмуртский государственный университет, 426034, Россия, г. Ижевск, ул. Университетская, 1.
ORCID: https://orcid.org/0000-0002-6167-6237 E-mail: [email protected]
Бельтюков Анатолий Петрович, д. ф.-м. н., профессор, кафедра теоретических основ информатики, Удмуртский государственный университет, 426034, Россия, г. Ижевск, ул. Университетская, 1. ORCID: https://orcid.org/0000-0002-3433-9067 E-mail: [email protected]
Цитирование: M. Джудакизаде, А. П. Бельтюков. Адаптивная человеко-машинная система построения доказательств теорем // Известия Института математики и информатики Удмуртского государственного университета. 2024. Т. 64. С. 17-33.