# Bijay's Blog - LLMs.txt

# Website title and description
name: Bijay's Blog
description: Personal Blog about anything that occurs to me
url: https://blog.regmi.dev

# Sitemap
sitemap: https://blog.regmi.dev/sitemap.xml

# About
name: About
url: https://blog.regmi.dev/about
description: ![Github Avatar](https://avatars.githubusercontent.com/u/23026528?s=400&u=700ee0be121faedd98ba4c10fb799a2db2412776&v=4) I was born in Nepal and later moved to Germany, where I studied Medicine at t

# Blog Posts
- title: Understanding LLMs and Modern Inference Engines
  url: https://blog.regmi.dev/post/understanding-llms-and-modern-inference-engines
  description: Choosing an LLM inference engine is a hardware-and-systems decision, not a meme. For real self-hosting, runtime, throughput, concurrency, and cost matter as much as the model.
  date: 2026-05-21
- title: Ein medizinisches Modell mit synthetischen Daten
  url: https://blog.regmi.dev/post/ein-medizinisches-modell-mit-synthetischen-daten
  description: Optimized AI models for medical coding: how BERT and synthetic data are revolutionizing everyday clinical practice.
  date: 2026-05-17
- title: State of Naïve RAG vs Agentic RAG in 2026
  url: https://blog.regmi.dev/post/state-of-naive-rag-vs-agentic-rag-in-2026
  description: RAG is not dead. In 2026, agentic RAG often beats naïve RAG for accuracy and complex retrieval, but naïve RAG still wins for simple, fast, low-cost use cases.
  date: 2026-05-17

# Tags
tags: ai, ai-infrastructure, ai_engineering, data_engineering, de, en, english, german, gpu, inference, inference-engines, ki, llama-cpp, llms, medizin, nvidia, open-source-models, rag, self-hosting, sglang, tensorrt-llm, vllm