[pull] master from mruby:master #175
Merged
pull[bot] merged 15 commits into sysfce2:master on Jan 12, 2026
Add optimized squaring algorithm that exploits symmetry for ~1.5x speedup over general multiplication. Includes both schoolbook and Karatsuba variants.

- mpz_sqr_basic_limbs: n(n+1)/2 limb multiplications instead of n^2
- mpz_sqr_karatsuba: 3 recursive squarings instead of 3 multiplications
- mpz_sqr: high-level wrapper with fast paths for power-of-2 and all-ones

The optimization triggers when mpz_mul is called with identical pointers (u == v), which occurs in internal operations like mpz_pow.

Co-authored-by: Claude <noreply@anthropic.com>
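
To make the symmetry argument concrete, here is a minimal sketch of schoolbook squaring on little-endian 32-bit limbs. It is not the mruby code: the name sqr_limbs and the limb layout are assumptions for illustration. Each cross product a[i]*a[j] with i < j is computed once and doubled, then the diagonal squares a[i]^2 are added.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: r[0..2n) = a[0..n)^2, little-endian 32-bit limbs. */
void sqr_limbs(uint32_t *r, const uint32_t *a, size_t n)
{
  memset(r, 0, 2 * n * sizeof(uint32_t));

  /* 1. Off-diagonal products a[i]*a[j] for i < j, each computed once. */
  for (size_t i = 0; i < n; i++) {
    uint64_t carry = 0;
    for (size_t j = i + 1; j < n; j++) {
      uint64_t t = (uint64_t)a[i] * a[j] + r[i + j] + carry;
      r[i + j] = (uint32_t)t;
      carry = t >> 32;
    }
    r[i + n] = (uint32_t)carry;   /* this limb has not been written yet */
  }

  /* 2. Double the off-diagonal sum (shift left by one bit). */
  uint32_t top = 0;
  for (size_t k = 0; k < 2 * n; k++) {
    uint32_t next = r[k] >> 31;
    r[k] = (r[k] << 1) | top;
    top = next;
  }

  /* 3. Add the diagonal squares a[i]^2 at limb position 2*i. */
  uint64_t carry = 0;
  for (size_t i = 0; i < n; i++) {
    uint64_t sq = (uint64_t)a[i] * a[i];
    uint64_t t = (uint64_t)r[2 * i] + (uint32_t)sq + carry;
    r[2 * i] = (uint32_t)t;
    t = (uint64_t)r[2 * i + 1] + (sq >> 32) + (t >> 32);
    r[2 * i + 1] = (uint32_t)t;
    carry = t >> 32;
  }
  assert(carry == 0);             /* a^2 always fits in 2n limbs */
}
```

The inner loop runs n(n-1)/2 times and the diagonal pass adds n more multiplications, which gives the n(n+1)/2 count quoted in the commit message.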
Numbers with few bits set (popcount <= 8) are multiplied using shift-add instead of Karatsuba. This is O(k*n) where k is the popcount, much faster than O(n^1.585) for sparse patterns like 2^100000 + 2^50000 commonly generated by fuzzers.

Co-authored-by: Claude <noreply@anthropic.com>
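
A rough sketch of the shift-add idea, assuming little-endian 32-bit limbs. The names add_shifted and mul_sparse are hypothetical, and the popcount <= 8 dispatch from the commit is left out; the caller is assumed to have already decided that one operand is sparse.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* r[0..rn) += a[0..an) << shift (hypothetical helper). */
void add_shifted(uint32_t *r, size_t rn,
                 const uint32_t *a, size_t an, size_t shift)
{
  size_t k = shift / 32;                  /* whole-limb part of the shift */
  unsigned bits = (unsigned)(shift % 32); /* remaining bit shift */
  uint32_t prev = 0;
  uint64_t carry = 0;

  for (size_t i = 0; i <= an && k < rn; i++, k++) {
    uint32_t cur = (i < an) ? a[i] : 0;
    uint32_t piece = bits
      ? (uint32_t)((cur << bits) | (prev >> (32 - bits)))
      : cur;
    uint64_t t = (uint64_t)r[k] + piece + carry;
    r[k] = (uint32_t)t;
    carry = t >> 32;
    prev = cur;
  }
  for (; carry != 0 && k < rn; k++) {     /* ripple any remaining carry */
    uint64_t t = (uint64_t)r[k] + carry;
    r[k] = (uint32_t)t;
    carry = t >> 32;
  }
}

/* r = dense * sparse: one shifted addition per set bit of `sparse`. */
void mul_sparse(uint32_t *r, size_t rn,
                const uint32_t *dense, size_t dn,
                const uint32_t *sparse, size_t sn)
{
  assert(rn >= dn + sn);                  /* the product must fit */
  memset(r, 0, rn * sizeof(uint32_t));
  for (size_t i = 0; i < sn; i++)
    for (unsigned b = 0; b < 32; b++)
      if ((sparse[i] >> b) & 1u)
        add_shifted(r, rn, dense, dn, 32 * i + b);
}
```

Each set bit costs one O(n) pass over the dense operand, so k set bits cost O(k*n) in total.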
The udiv function had two buggy modifications to Knuth's Algorithm D:

1. A "three-limb pre-adjustment" that only decremented qhat once
2. A "3-limb refinement" loop with incorrect carry handling

These caused incorrect quotients for certain decimal divisions like 10^52 / 10^26. Restored standard Knuth Algorithm D, which uses only 2-limb qhat refinement with correction via subtract and add-back.

Co-authored-by: Claude <noreply@anthropic.com>
The final carry was stored at z->p[y->sz], but when x is larger than y, this index falls within the already-computed result and corrupts it. Store at z->p[i] instead, which correctly points to max(x->sz, y->sz) after all loops complete.

This bug caused incorrect results when adding a small number to an all-ones number with 1124+ limbs (35968+ bits).

Co-authored-by: Claude <noreply@anthropic.com>
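
The indexing issue can be illustrated with a stand-alone unsigned addition on plain limb arrays. This is a sketch, not the mruby uadd: the signature is an assumption, and z is assumed to have room for max(xn, yn) + 1 limbs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* z = x + y on little-endian 32-bit limbs; returns the limbs used.
 * Hypothetical signature: z needs room for max(xn, yn) + 1 limbs. */
size_t uadd(uint32_t *z, const uint32_t *x, size_t xn,
            const uint32_t *y, size_t yn)
{
  if (xn < yn) {                     /* make x the longer operand */
    const uint32_t *tp = x; x = y; y = tp;
    size_t tn = xn; xn = yn; yn = tn;
  }

  uint64_t carry = 0;
  size_t i;
  for (i = 0; i < yn; i++) {         /* limbs present in both operands */
    uint64_t t = (uint64_t)x[i] + y[i] + carry;
    z[i] = (uint32_t)t;
    carry = t >> 32;
  }
  for (; i < xn; i++) {              /* remaining limbs of the longer one */
    uint64_t t = (uint64_t)x[i] + carry;
    z[i] = (uint32_t)t;
    carry = t >> 32;
  }

  /* i == max(xn, yn) here. Writing the carry at z[yn] instead would
   * overwrite an already computed result limb whenever xn > yn. */
  assert(i == xn);
  z[i] = (uint32_t)carry;
  return i + (carry != 0);
}
```

Adding 1 to an all-ones number exercises exactly this path: the carry ripples through every limb of the longer operand and has to land one position past its end.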
For base-10 conversion of numbers with >1000 digits, use a recursive divide-and-conquer algorithm that splits the number using precomputed powers of 10. This reduces complexity from O(n^2) to O(n log^2 n).

The algorithm:

1. Precompute 10^1, 10^2, 10^4, 10^8, ... by repeated squaring
2. Find the largest power that splits digits roughly in half
3. Divide by this power to get high and low parts
4. Recursively convert each part, padding low part with zeros
5. Base case: use simple divide-by-10 for <= 1000 digits

Co-authored-by: Claude <noreply@anthropic.com>
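
The recursion can be sketched at a much smaller scale on uint64_t values. This only shows the shape of the algorithm (power table by repeated squaring, split, recurse, zero-pad the low half); the speedup itself only appears with real multi-limb division. All names here are hypothetical, the sketch always starts at the top level rather than searching for the best split, and its base case is a single digit instead of the 1000-digit threshold.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LEVELS 5                     /* 10^(2^5) = 10^32 exceeds 2^64 */

static uint64_t pow10_tab[LEVELS];   /* pow10_tab[k] = 10^(2^k) */

/* Step 1: build 10^1, 10^2, 10^4, 10^8, 10^16 by repeated squaring. */
void init_pow10(void)
{
  pow10_tab[0] = 10;
  for (int k = 1; k < LEVELS; k++)
    pow10_tab[k] = pow10_tab[k - 1] * pow10_tab[k - 1];
}

/* Write n as exactly 2^level zero-padded digits.
 * Requires n < 10^(2^level) and a prior call to init_pow10(). */
void conv_recur(uint64_t n, int level, char *out)
{
  if (level == 0) {                  /* base case: one digit */
    assert(n < 10);
    out[0] = (char)('0' + (int)n);
    return;
  }
  uint64_t p = pow10_tab[level - 1]; /* splits the digits in half */
  size_t half = (size_t)1 << (level - 1);
  conv_recur(n / p, level - 1, out);         /* high part */
  conv_recur(n % p, level - 1, out + half);  /* low part, zero-padded */
}

/* Convert n to decimal; buf needs at least 33 bytes. Returns length. */
size_t u64_to_dec(uint64_t n, char *buf)
{
  char tmp[32];
  conv_recur(n, LEVELS, tmp);
  size_t skip = 0;
  while (skip < 31 && tmp[skip] == '0') skip++;  /* strip leading zeros */
  memcpy(buf, tmp + skip, 32 - skip);
  buf[32 - skip] = '\0';
  return 32 - skip;
}
```

Padding the low part to a fixed width is what keeps the two halves aligned: the recursion always emits exactly 2^level characters, and leading zeros are stripped once at the top.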
Extract 9 decimal digits at once by dividing by 10^9 instead of 10. This reduces the number of divisions in the base case by 9x, improving performance of large bigint to_s conversion by approximately 2x.

Co-authored-by: Claude <noreply@anthropic.com>
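
As an illustration of the chunked base case, the following sketch divides a little-endian 32-bit limb array by 10^9 repeatedly and emits nine digits per division. The names div_small and to_decimal are assumptions, not mruby functions, and the input array is destroyed in the process.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Divide n[0..len) in place by d and return the remainder. */
uint32_t div_small(uint32_t *n, size_t len, uint32_t d)
{
  uint64_t rem = 0;
  for (size_t i = len; i-- > 0;) {
    uint64_t cur = (rem << 32) | n[i];
    n[i] = (uint32_t)(cur / d);
    rem = cur % d;
  }
  return (uint32_t)rem;
}

/* Convert n to decimal in buf (destroys n); returns the digit count.
 * Each 32-bit limb contributes fewer than 10 digits. */
size_t to_decimal(uint32_t *n, size_t len, char *buf, size_t bufsz)
{
  assert(bufsz >= 10 * len + 2);
  char *end = buf + bufsz - 1, *p = end;
  *p = '\0';

  while (len > 0 && n[len - 1] == 0) len--;      /* trim leading zeros */
  if (len == 0) *--p = '0';

  while (len > 0) {
    uint32_t chunk = div_small(n, len, 1000000000u);  /* 9 digits at once */
    while (len > 0 && n[len - 1] == 0) len--;
    /* Inner chunks are zero-padded to 9 digits; the leading one is not. */
    for (int k = 0; k < 9 && (chunk != 0 || len > 0); k++) {
      *--p = (char)('0' + (int)(chunk % 10));
      chunk /= 10;
    }
  }

  size_t digits = (size_t)(end - p);
  memmove(buf, p, digits + 1);                   /* shift to buffer start */
  return digits;
}
```

The expensive operation is the pass over all limbs in div_small; doing it once per nine digits rather than once per digit is where the saving comes from.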
Follow the codebase convention of using _recur suffix for recursive functions (e.g., codedump_recur, dump_recur).

Co-authored-by: Claude <noreply@anthropic.com>
Apply Lemire's small table technique: use a 200-byte lookup table to convert digit pairs (00-99) instead of computing each digit separately. Reduces operations from 9 to 5 per 9-digit batch in the base case.

Co-authored-by: Claude <noreply@anthropic.com>
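
A minimal sketch of the digit-pair table, assuming a 9-digit chunk below 10^9. The names PAIRS and put9 are hypothetical. Four table lookups cover eight digits and a single direct conversion handles the leading one, which gives the five operations mentioned above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* 200 bytes: the two-character decimal form of 00..99. */
static const char PAIRS[201] =
  "00010203040506070809"
  "10111213141516171819"
  "20212223242526272829"
  "30313233343536373839"
  "40414243444546474849"
  "50515253545556575859"
  "60616263646566676869"
  "70717273747576777879"
  "80818283848586878889"
  "90919293949596979899";

/* Write exactly 9 zero-padded digits of v into out[0..9), no NUL. */
void put9(char *out, uint32_t v)
{
  assert(v < 1000000000u);
  for (int pos = 7; pos >= 1; pos -= 2) {  /* pairs at 7, 5, 3, 1 */
    uint32_t pair = v % 100;
    v /= 100;
    memcpy(out + pos, PAIRS + 2 * pair, 2);
  }
  out[0] = (char)('0' + (int)v);           /* v < 10 after four steps */
}
```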
Use the mathematical identity 10^k = 2^k * 5^k to speed up the divide-and-conquer decimal string conversion. Dividing by 5^k is faster than dividing by 10^k because 5^k has ~30% fewer bits (log2(5) ≈ 2.32 vs log2(10) ≈ 3.32). The 2^k component is handled with fast bit shifts.

Benchmarks show 3-8% improvement for large numbers:

- 800K bits: 1.00s -> 0.97s
- 1.6M bits: 3.95s -> 3.84s
- 2.4M bits: 8.79s -> 8.46s

Co-authored-by: Claude <noreply@anthropic.com>
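
The identity can be checked on machine words. The sketch below (hypothetical name divmod_pow10, uint64_t only) computes quotient and remainder by 10^k with one shift, one division by 5^k, and a recombination step. Writing n = hi * 2^k + lo and hi = q * 5^k + s gives n = q * 10^k + (s * 2^k + lo), and s * 2^k + lo is always below 10^k.

```c
#include <assert.h>
#include <stdint.h>

/* q = n / 10^k and r = n % 10^k using 10^k = 2^k * 5^k. */
void divmod_pow10(uint64_t n, unsigned k, uint64_t *q, uint64_t *r)
{
  assert(k <= 19);                            /* 10^19 still fits in 64 bits */

  uint64_t pow5 = 1;
  for (unsigned i = 0; i < k; i++) pow5 *= 5;

  uint64_t hi = n >> k;                       /* divide by 2^k with a shift */
  uint64_t lo = n & (((uint64_t)1 << k) - 1); /* the k bits shifted out */

  *q = hi / pow5;                             /* divide by the smaller 5^k */
  *r = ((hi % pow5) << k) | lo;               /* s * 2^k + lo */
}
```

For bigints the same recombination applies limb-wise; the gain reported in the commit comes from the divisor 5^k being shorter than 10^k.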
When comparing bigint values with the <=> operator, the comparison would convert both operands to float, losing precision for values > 2^53. This caused incorrect results like (10^20+1) <=> (10^20+2) returning 0 instead of -1.

Add direct bigint comparison paths in cmpnum() to avoid float conversion when both operands can be handled by mrb_bint_cmp().

Co-authored-by: Claude <noreply@anthropic.com>
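
The precision loss is easy to reproduce with 64-bit integers. This sketch uses 10^18 rather than 10^20 so the values fit in uint64_t, and assumes IEEE 754 doubles with the default round-to-nearest mode; the function names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Three-way compare after converting to double (the lossy path). */
int cmp_via_double(uint64_t a, uint64_t b)
{
  double x = (double)a, y = (double)b;
  return (x > y) - (x < y);
}

/* Three-way compare on the exact integer values. */
int cmp_exact(uint64_t a, uint64_t b)
{
  return (a > b) - (a < b);
}

/* Near 10^18 adjacent doubles are 128 apart, so both operands
 * round to the same double and the lossy path reports equality. */
void demo(void)
{
  uint64_t a = 1000000000000000001ULL;  /* 10^18 + 1 */
  uint64_t b = 1000000000000000002ULL;  /* 10^18 + 2 */
  assert(cmp_via_double(a, b) == 0);    /* wrong: reports equal */
  assert(cmp_exact(a, b) == -1);        /* right: a < b */
}
```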
Use stack allocation with zero-initialization and MRB_TRY/MRB_CATCH to ensure heap-allocated mpz_t data is freed even when an exception occurs during conversion.

Co-authored-by: Claude <noreply@anthropic.com>
Add MRB_TRY/MRB_CATCH to ensure local mpz_t variables are freed when an exception (e.g., RangeError from shift overflow) occurs during the all-ones multiplication optimization.

Co-authored-by: Claude <noreply@anthropic.com>
rational_eq_b was using the wrong struct fields (p1->numerator/denominator, which access i.num/i.den) for bigint-backed rationals, which use b.num/b.den. Also added the missing MRB_TT_BIGINT case to prevent fallthrough to the default case, which caused ping-pong recursion between Rational#== and Integer#==.

Co-authored-by: Claude <noreply@anthropic.com>
When shift amount is 0, urshift() and ulshift() called mpz_set() which copies data without trimming leading zero limbs. This caused bigint values to have inflated sz fields, making ucmp() comparisons incorrect.

For example, a 256-bit remainder from division could have sz=18 instead of sz=8 because the divisor had 18 limbs. This made it compare greater than values with fewer limbs, even when numerically smaller.

The bug also caused memory leaks when the incorrect comparison led to taking wrong code paths in division, triggering size overflow exceptions after memory was allocated.

Co-authored-by: Claude <noreply@anthropic.com>
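
Why an inflated sz breaks comparison can be shown with a simplified model. The struct num_t and the functions below are assumptions that mirror only the idea of a size-first comparison; they are not the mruby definitions of mpz_t, trim, or ucmp.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified bigint: little-endian limbs plus a count. */
typedef struct {
  uint32_t *p;
  size_t sz;
} num_t;

/* Drop leading (most significant) zero limbs. */
void trim(num_t *x)
{
  assert(x != NULL);
  while (x->sz > 0 && x->p[x->sz - 1] == 0) x->sz--;
}

/* Size-first unsigned compare: only valid on trimmed operands. */
int ucmp(const num_t *a, const num_t *b)
{
  if (a->sz != b->sz) return a->sz < b->sz ? -1 : 1;
  for (size_t i = a->sz; i-- > 0;) {
    if (a->p[i] != b->p[i]) return a->p[i] < b->p[i] ? -1 : 1;
  }
  return 0;
}
```

With a = {5, 0, 0} (sz = 3) and b = {7} (sz = 1), ucmp reports a > b even though 5 < 7; after trim(&a) the sizes match and the limb comparison gives the correct answer.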
Add trim() calls after mpz_set/mpz_move in shift operations where actual bit manipulation is skipped:

- mpz_mul_2exp when e==0 (no shift needed)
- mpz_mul_2exp when bs==0 (limb-only shift)
- mpz_div_2exp when e==0 (no shift needed)
- mpz_div_2exp when bs==0 (limb-only shift)

This prevents inflated sz values from propagating through operations, complementing the earlier fix to urshift/ulshift when n==0.

Co-authored-by: Claude <noreply@anthropic.com>
See Commits and Changes for more details.
Created by
pull[bot] (v2.0.0-alpha.4)
Can you help keep this open source service alive? 💖 Please sponsor : )