# Bijay's Blog - LLMs.txt

# Website title and description
name: Bijay's Blog
description: Personal blog about anything that occurs to me
url: https://blog.regmi.dev

# Sitemap
sitemap: https://blog.regmi.dev/sitemap.xml

# About
name: About
url: https://blog.regmi.dev/about
description: ![GitHub Avatar](https://avatars.githubusercontent.com/u/23026528?s=400&u=700ee0be121faedd98ba4c10fb799a2db2412776&v=4)

I was born in Nepal and later moved to Germany, where I studied Medicine at t

# Blog Posts
- title: Understanding LLMs and Modern Inference Engines
  url: https://blog.regmi.dev/post/understanding-llms-and-modern-inference-engines
  description: Choosing an LLM inference engine is a hardware-and-systems decision, not a meme. When you actually self-host, the runtime, throughput, concurrency, and cost matter as much as the model.
  date: 2026-05-21

- title: Ein medizinisches Modell mit synthetischen Daten (A Medical Model with Synthetic Data)
  url: https://blog.regmi.dev/post/ein-medizinisches-modell-mit-synthetischen-daten
  description: Optimized AI models for medical coding: how BERT and synthetic data are revolutionizing everyday clinical practice.
  date: 2026-05-17

- title: State of Naïve RAG vs Agentic RAG in 2026
  url: https://blog.regmi.dev/post/state-of-naive-rag-vs-agentic-rag-in-2026
  description: RAG is not dead. In 2026, agentic RAG often beats naïve RAG for accuracy and complex retrieval, but naïve RAG still wins for simple, fast, low-cost use cases.
  date: 2026-05-17

# Tags
tags: ai, ai-infrastructure, ai_engineering, data_engineering, de, en, english, german, gpu, inference, inference-engines, ki, llama-cpp, llms, medizin, nvidia, open-source-models, rag, self-hosting, sglang, tensorrt-llm, vllm