A high-performance LLM inference server with structured and unstructured output validation, designed for distributed systems and optimized for high-throughput, low-latency operation at large scale.
The core components of the project are:
Connection pool
Batching strategies
Structured vs. unstructured output generation (results can be checked here; a validation sketch follows this list)
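
As a rough illustration of the structured vs. unstructured distinction, the sketch below validates a model response either against a simple field/type schema (structured) or with a basic non-empty check (unstructured). The schema shape, function names, and the `validate_structured` / `validate_unstructured` entry points are illustrative assumptions, not the project's actual API.

```python
import json
from typing import Any

# Illustrative schema: expected top-level fields and their Python types.
# This is an assumed format, not the project's actual schema definition.
EXAMPLE_SCHEMA: dict[str, type] = {"answer": str, "confidence": float}


def validate_structured(raw: str, schema: dict[str, type]) -> dict[str, Any]:
    """Parse a model response as JSON and check field names and types."""
    data = json.loads(raw)  # raises json.JSONDecodeError on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("structured output must be a JSON object")
    for field, expected_type in schema.items():
        if field not in data:
            raise ValueError(f"missing field: {field}")
        if not isinstance(data[field], expected_type):
            raise ValueError(f"field {field!r} should be {expected_type.__name__}")
    return data


def validate_unstructured(raw: str) -> str:
    """Minimal check for free-form text output: non-empty after stripping."""
    text = raw.strip()
    if not text:
        raise ValueError("empty unstructured output")
    return text


if __name__ == "__main__":
    # Structured path: the response must parse as JSON and match the schema.
    print(validate_structured('{"answer": "42", "confidence": 0.9}', EXAMPLE_SCHEMA))
    # Unstructured path: plain text is accepted as long as it is non-empty.
    print(validate_unstructured("The answer is 42."))
```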