Genetic Programming for Formula Evolution
A symbolic regression system that uses genetic programming to automatically discover mathematical formulas for predicting numerical outcomes. The project evolves mathematical expressions by minimizing Mean Squared Error (MSE) while balancing accuracy and complexity.
GitHub Repository
Key Features
- Tree-Based Genetic Programming: Evolves hierarchical mathematical expressions.
- Optimized Evolutionary Process: Selection, crossover, and mutation to improve formula accuracy.
- Symbolic Simplification: Uses SymPy to refine evolved formulas for better readability.
- Configurable Parameters: Adjustable mutation rates, crossover rates, and selection mechanisms.
- Bloat Control & Parsimony: Prevents unnecessary formula complexity while maintaining accuracy.
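
To make the tree representation and the parsimony idea concrete, here is a minimal sketch, not the project's actual code. It assumes expressions are stored as nested tuples of the form `(op, left, right)`, that the primitive set is limited to `add`, `sub`, and `mul`, and that complexity is penalized by adding `parsimony * size` to the MSE. The names `OPS`, `evaluate`, `size`, and `fitness`, and the coefficient 0.01, are all illustrative assumptions.

```python
import numpy as np

# Assumed primitive set: binary operators applied element-wise by NumPy.
OPS = {
    "add": np.add,
    "sub": np.subtract,
    "mul": np.multiply,
}

def evaluate(tree, x):
    """Evaluate an expression tree on a NumPy array of inputs."""
    if isinstance(tree, tuple):              # internal node: (op, left, right)
        op, left, right = tree
        return OPS[op](evaluate(left, x), evaluate(right, x))
    if tree == "x":                          # variable terminal
        return x
    return np.full_like(x, tree, dtype=float)  # constant terminal

def size(tree):
    """Count the nodes in a tree; used here as the complexity measure."""
    if isinstance(tree, tuple):
        return 1 + size(tree[1]) + size(tree[2])
    return 1

def fitness(tree, x, y, parsimony=0.01):
    """MSE plus a size penalty (hypothetical coefficient); lower is better."""
    mse = float(np.mean((evaluate(tree, x) - y) ** 2))
    return mse + parsimony * size(tree)

# x*x + 1 encoded as a nested tuple, scored against data it fits exactly.
tree = ("add", ("mul", "x", "x"), 1.0)
x = np.linspace(-1.0, 1.0, 5)
y = x ** 2 + 1
score = fitness(tree, x, y)
```

With a penalty of this kind, two formulas with the same error are ranked by size, so the smaller tree wins. That is one common way to discourage bloat, though the project may use a different mechanism.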
Technologies Used
- Python, NumPy (Vectorized numerical operations)
- SymPy (Symbolic manipulation & formula simplification)
- Genetic Programming (Tree-based evolution of mathematical expressions)
- Mean Squared Error (MSE) (Primary fitness evaluation metric)
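
As an illustration of the SymPy simplification step, the sketch below takes a redundant expression of the kind evolution tends to produce and reduces it with `sympy.simplify`. The expression itself is made up for this example, and the project may apply different or additional SymPy routines.

```python
import sympy as sp

x = sp.Symbol("x")

# A hypothetical evolved formula containing redundant structure.
raw = (x * x - 1) / (x - 1) + x * (x + 1) - x

# simplify() applies a set of heuristics and returns an equivalent,
# usually shorter, expression.
simplified = sp.simplify(raw)

print("raw:       ", raw)
print("simplified:", simplified)
```

The simplified form is algebraically equivalent to the raw one wherever the raw expression is defined, so it can be reported to the user without changing predictions on the training data.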
Optimization & Results
- Performance Analysis: Tuned the mutation rate (0.3) and crossover rate (0.7), the settings that gave the best results.
- Formula Complexity vs. Accuracy Tradeoff: Adjusted the parsimony pressure to balance expression length and prediction quality.
- Dataset Generalization: Experimented across multiple datasets, refining hyperparameters dynamically.
- Best Evolution Strategies Identified: Fine-tuned selection mechanisms such as tournament selection with elitism.
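
The selection strategy named above can be sketched as follows. This is a generic illustration under stated assumptions, not the project's implementation: the function names, the tournament size `k=3`, and the single elite are hypothetical, while the default rates of 0.7 and 0.3 are the crossover and mutation values listed above. To keep the example self-contained, plain numbers stand in for expression trees, and the crossover and mutation operators are passed in as callables.

```python
import random

def tournament_select(population, scores, k=3, rng=random):
    """Draw k distinct individuals at random and return the best (lowest score)."""
    contenders = rng.sample(range(len(population)), k)
    winner = min(contenders, key=lambda i: scores[i])
    return population[winner]

def next_generation(population, score_fn, crossover, mutate,
                    crossover_rate=0.7, mutation_rate=0.3,
                    n_elite=1, rng=random):
    """Build a new population of the same size using elitism and tournaments."""
    scores = [score_fn(ind) for ind in population]
    # Elitism: copy the best individuals into the next generation unchanged.
    ranked = sorted(range(len(population)), key=lambda i: scores[i])
    new_pop = [population[i] for i in ranked[:n_elite]]
    while len(new_pop) < len(population):
        child = tournament_select(population, scores, rng=rng)
        if rng.random() < crossover_rate:
            mate = tournament_select(population, scores, rng=rng)
            child = crossover(child, mate, rng)
        if rng.random() < mutation_rate:
            child = mutate(child, rng)
        new_pop.append(child)
    return new_pop

# Toy usage: numbers stand in for formulas, and the target value is 3.0.
rng = random.Random(0)
score = lambda v: (v - 3.0) ** 2
pop = [rng.uniform(-10, 10) for _ in range(20)]
best_per_gen = [min(map(score, pop))]
for _ in range(15):
    pop = next_generation(pop, score,
                          crossover=lambda a, b, r: (a + b) / 2,
                          mutate=lambda a, r: a + r.gauss(0, 0.5),
                          rng=rng)
    best_per_gen.append(min(map(score, pop)))
```

Because the elite is copied unchanged, the best score in `best_per_gen` can never get worse from one generation to the next, which is the main reason to pair elitism with a stochastic selection scheme.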
Conclusion
This project demonstrates evolutionary computing and genetic programming applied to symbolic regression. By combining tree-based expression evolution, symbolic simplification, and genetic optimization, it effectively discovers mathematical models that generalize across numerical datasets.