High-efficiency floating-point neural network inference operators for mobile, server, and Web
Updated Dec 10, 2025 - C
🔍 Analyze CUDA matrix multiplication performance and power consumption on NVIDIA Jetson Orin Nano across multiple implementations and settings.
🔍 Explore GEMM: a C/C++ library for efficient matrix multiplication using OpenMP, designed for parallel computing learners and practitioners.
🔍 Simulate quantum entanglement using C++ with grids and random particles, offering an engaging way to explore this fundamental concept.
🚀 Explore GPU capabilities on Mac with hands-on comparisons of CPU and Metal GPU performance for AI training using PyTorch and TensorFlow.
Data Structure & Algorithm: This journey is not just about coding but also about developing problem-solving thinking, optimizing solutions, and building a strong foundation for coding interviews and real-world programming. So far I am loving it.
💥 Fast matrix-multiplication as a self-contained Python library – no system dependencies!
DBCSR: Distributed Block Compressed Sparse Row matrix library
Seminar on parallel programming with OpenMP
Benchmark suite comparing LabVIEW GPU toolkits (CuLab, G2CPU, Graiphic Accelerator). Includes methods, sources, results, and reproducible test pipelines.
A powerful library extending VBA with over 100 functions for math, stats, finance, and data manipulation. It supports matrix operations and user-defined functions, enhancing automation and analysis within Microsoft Office and LibreOffice environments for data management, financial calculations, and more.
M4RI is a library for fast arithmetic with dense matrices over GF(2)
High-performance GPU-accelerated linear algebra library for scientific computing. Custom kernels outperform cuBLAS+cuSPARSE by 2.4x in iterative solvers. Built for circuit simulation workloads.
A lightweight Triton-based General Matrix Multiplication (GEMM) library.
High-performance GPU matrix multiplication achieving 6,436 GFLOPS (69% of peak) on Tesla P100 through progressive CUDA optimization
Distributed Communication-Optimal Matrix-Matrix Multiplication Algorithm
[DEPRECATED] Moved to ROCm/rocm-libraries repo
generalized (square) matrix multiplication w/ C++26 experimental::simd