It would have been easier to build on top of the usual stack. Take PyTorch, add a serving framework, wrap it in an API and call it a product. We chose not to — and that choice defines how we work.
Three principles
Everything we ship comes back to the same three ideas, whether it's an inference engine, a compiler, or a security platform.
- Own the full stack — rebuild from first principles instead of stacking dependencies. Fewer layers, more control.
- Numbers, not marketing — every claim is reproducible on real hardware, with the benchmarks and the honest limitations published alongside.
- Open by default — our research ships as software you can read, run and build on, from a laptop GPU to a data-center cluster.
Why depth pays off
When you control the kernels, the engine and the format end to end, problems that look impossible from inside a framework become tractable. Seven-second cold starts, single-card 7B serving, one kernel across four architectures — none of those are reachable if you're three abstraction layers away from the metal.
“We rebuild from first principles — kernel compilers, inference engines, formats — instead of stacking dependencies.”
— Zyora Labs
We're a small team from Nagercoil building low-level systems for AI, in the open. If that's the kind of work you want to read about — or do — this blog is where we'll keep sharing it.