Website | Slack | Docs



Scale compute-intensive serverless workloads

Cortex is a Kubernetes-based serverless platform built for AWS.


Deploy realtime, batch, and async workloads

  • Realtime - respond to requests in real-time and autoscale based on in-flight request volume.
  • Batch - run distributed, fault-tolerant batch processing jobs on demand.
  • Async - process requests asynchronously and autoscale based on request queue length.
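All three kinds are described declaratively. As a rough illustration only (a hypothetical Python rendering; the field names are assumptions, not the exact Cortex schema), the kind of a workload determines which signal it scales on:

    # Hypothetical sketch of Cortex-style API specs as Python dicts.
    # Field names are illustrative assumptions; the actual schema lives in the Cortex docs.

    realtime_api = {
        "name": "text-classifier",
        "kind": "RealtimeAPI",                     # scales on in-flight request volume
        "autoscaling": {"target_in_flight": 8},    # assumed field name
    }

    batch_api = {
        "name": "image-resizer",
        "kind": "BatchAPI",                        # jobs are submitted on demand and run to completion
    }

    async_api = {
        "name": "report-generator",
        "kind": "AsyncAPI",                        # scales on request queue length
        "autoscaling": {"target_queue_length": 4}, # assumed field name
    }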

Scale across hundreds of CPU and GPU instances

  • No resource limits - allocate as much CPU, GPU, and memory as each workload requires.
  • No cold starts - keep a minimum number of API replicas running to ensure that requests are handled in real-time.
  • No timeouts - run workloads for as long as you want.
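To make the list above concrete, here is a hedged sketch (a Python dict with assumed field names, not the exact Cortex schema) of the per-workload settings these guarantees map to:

    # Hypothetical sketch: per-API compute and replica settings as a Python dict.
    # Field names are assumptions; actual keys vary by Cortex version.
    api_resources = {
        "compute": {
            "cpu": 4,           # request as many vCPUs as the workload needs
            "gpu": 1,           # GPU capacity can be requested alongside CPU and memory
            "mem": "16Gi",      # memory request
        },
        "autoscaling": {
            "min_replicas": 1,  # keeping at least one replica warm avoids cold starts
            "max_replicas": 50,
        },
        # No request timeout is imposed here; long-running work is allowed to finish.
    }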

Control your AWS spend

  • Spot instance management - Cortex automatically runs workloads on spot instances and falls back to on-demand instances to ensure reliability.
  • Multi-instance type clusters - choose the ideal EC2 instance type for your workloads or mix and match several instance types in the same cluster.
  • Customizable autoscaling - optimize the autoscaling behavior for each workload to ensure efficient resource utilization.
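As a rough sketch of how these controls are typically expressed (hypothetical field names in the spirit of a Cortex cluster configuration, not its exact schema):

    # Hypothetical sketch of a multi-instance-type cluster with spot and on-demand
    # node groups, written as a Python dict; the real cluster configuration is YAML
    # with its own schema.
    cluster_config = {
        "region": "us-east-1",
        "node_groups": [
            {
                "name": "cpu-spot",
                "instance_type": "m5.large",
                "min_instances": 0,
                "max_instances": 20,
                "spot": True,    # prefer cheaper spot capacity
            },
            {
                "name": "gpu-on-demand",
                "instance_type": "g4dn.xlarge",
                "min_instances": 0,
                "max_instances": 5,
                "spot": False,   # on-demand capacity for reliability
            },
        ],
    }

The dict above is only meant to show which knobs exist (spot vs. on-demand, multiple instance types, instance count ranges); consult the Cortex documentation for the real configuration format.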
