What is Private AI?
Definition of private AI, on-premise architecture, BezChmury 11B as the 2026 sweet spot.
A pivot from Villa Mamma SEO to AI compliance for Polish businesses. Six months of training on BezChmury 11B v3, two RTX 5090 GPUs, 41,016 training pairs - and a decision to never send a single invoice to the United States.
BezChmury did not start as an idea for an AI startup. It started with questions that began landing at the table where we usually talked about how many chairs fit into a wedding hall.
For several years I ran SEO for Villa Mamma - a wedding venue in Nadarzyn near Warsaw. It was a good business: local, deeply Polish, with very specific clients. Within the family and client circle of Villa Mamma there were a handful of accountants. Towards the end of 2024 they began asking questions that came up more and more often in the same form: "Listen, do you use this ChatGPT thing? Would you give it our full books?" Or: "Can I paste a JPK_V7M into it? My client's data?".
The first GPT-4 tests with Polish invoices delivered three results. First - hallucinations. The model was confident that KSeF had been live since 2024, when in reality mandatory KSeF was rescheduled in stages (large taxpayers from 1 February 2026, smaller entities from 1 April 2026). Second - no knowledge of FA(3), the latest logical structure for the e-invoice, which takes effect in 2026. Third, and worst of all: every question containing a client's NIP was a transfer of personal data to OpenAI's servers in the United States.
Schrems II in 2020 invalidated Privacy Shield. The Data Privacy Framework from 2023 partially replaces it, but not all major providers are confirmed as certified - as of 1 May 2026 the status of OpenAI and Anthropic in the DPF still requires per-provider verification. This means that, in the worst case, a Polish accountant was sending client data they had no legal right to expose to anyone into an American data centre.
In Q4 2024 and Q1 2025 I started looking for Polish language models. I looked at PLLuM (project formally concluded 31 December 2024 - model still available, but the roadmap closed), at Trurl (good, but not for our use cases), and finally at the BezChmury 11B family from SpeakLeash. BezChmury 11B looked like something completely different from the rest: open-weights, Polish tokeniser, Polish team, Polish training data. The decision was not even difficult - the question was not "should we build on Bielik" but "how fast can we start".
The first version of BezChmury was supposed to run on Llama 70B. After two months of prototyping we changed the path entirely.
In April 2026 we made the pivot to BezChmury 11B v3 from SpeakLeash. Concrete facts from the official model cards (accessed 1 May 2026):
BezChmury 11B v3 quantised to Q4_K_M is a file of around 6.5 GB. It fits comfortably on a MacBook M2 with 16 GB of memory. It fits on a PC with an RTX 4060 / 4070. Whatever Llama 70B promised in theory, BezChmury 11B delivers in practice - and it does it in Polish.
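As a sanity check, the ~6.5 GB figure is consistent with a back-of-envelope estimate. Note the ~4.85 bits-per-weight value for Q4_K_M is an approximation drawn from llama.cpp community measurements, not an official spec:

```python
# Rough on-disk size of an 11B-parameter model quantised to Q4_K_M.
# 4.85 bpw is an approximate effective figure (weights + quant scales).
BITS_PER_WEIGHT_Q4_K_M = 4.85

def gguf_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of a quantised GGUF file in gigabytes."""
    return n_params * bits_per_weight / 8 / 1e9

size = gguf_size_gb(11e9, BITS_PER_WEIGHT_Q4_K_M)
print(f"~{size:.1f} GB")  # lands in the ~6.5 GB ballpark quoted above
```

The same arithmetic explains why the file also leaves room for the KV cache on a 16 GB machine.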
I deliberately won't put loss curves, eval curves or benchmark percentages here. The reason is simple: for the BezChmury 11B v3 family I have not found a publicly confirmed set of standard benchmarks that would let me honestly compare my fine-tuned version against the baseline. Rather than make up numbers, I describe the methodology. Full reports are available on request under NDA.
BezChmury 11B is base-trained on general Polish. We add very narrow domain knowledge on top: KSeF, ZUS, VAT, JPK, GDPR, the Polish codes. If we hit the model with aggressive SFT without a warm-up, we risk catastrophic forgetting - the model "forgets" basic Polish to make room for the new knowledge.
Stage 1 is a gentle LoRA pass with parameters: LR 1.0-1.2e-5, LoRA r=32, short run. The task: teach the model the "feel" of the BezChmury domain without disturbing the base weights.
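To see why an r=32 LoRA pass is "gentle", compare the adapter's trainable parameters against a full weight update. The 4096x4096 layer shape below is purely illustrative, not the actual model geometry:

```python
# For a d_out x d_in weight matrix, LoRA trains two low-rank factors
# of r*(d_in + d_out) parameters instead of updating all d_in*d_out
# base weights - so the base model is barely touched.
def lora_param_fraction(d_in: int, d_out: int, r: int) -> float:
    full = d_in * d_out            # parameters of the frozen base layer
    adapter = r * (d_in + d_out)   # parameters of the A and B adapters
    return adapter / full

frac = lora_param_fraction(4096, 4096, 32)
print(f"LoRA r=32 trains {frac:.2%} of this layer's parameters")
```

At r=32 the adapter is under 2% of the layer, which is what keeps the base Polish intact during the warm-up.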
Full SFT on 41,016 training pairs. The dataset was built manually and semi-automatically from three sources: official KSeF/ZUS/VAT documentation, our own corpus of accountant questions (Villa Mamma + network), and synthetic augmentations generated by larger models with verification in Polish.
Stage 2 parameters: LR 3.0-3.5e-6, 1 epoch, long sequences (chunking + multi-chunk + hierarchy + map-reduce + explicit uncertainty). The canary A/B experiment: two LRs running simultaneously on two GPUs, 7B Mini and 11B v3 side by side - twice as fast as sequential.
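The chunking step of the long-sequence recipe can be sketched like this. Whitespace tokenisation here is a stand-in; a real pipeline would use the model tokenizer:

```python
# Split a token stream into fixed-size windows with overlap, so no
# fact is cut in half exactly at a chunk boundary.
def chunk(tokens: list[str], size: int, overlap: int) -> list[list[str]]:
    assert 0 <= overlap < size
    step = size - overlap
    return [tokens[i:i + size]
            for i in range(0, max(len(tokens) - overlap, 1), step)]

doc = "art 106 ust 1 ustawy o VAT okresla elementy faktury".split()
for c in chunk(doc, size=4, overlap=1):
    print(c)
```

Each window shares `overlap` tokens with its neighbour; the hierarchy and map-reduce steps then summarise the windows back together.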
A short pass with LR 1.0e-6, replaying general Polish knowledge (90% PL / 10% EN). The point: when SFT turns out to be too aggressive, this stage restores some of the base knowledge without breaking the freshly learned skills. Insurance against catastrophic forgetting.
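A minimal sketch of the 90% PL / 10% EN replay mix; the dataset contents are placeholders, not the real corpora:

```python
import random

# Resample a replay buffer so that roughly 90% of the examples are
# general Polish and 10% English, per the stage's data recipe.
def replay_mix(pl: list, en: list, n: int,
               pl_ratio: float = 0.9, seed: int = 0) -> list:
    rng = random.Random(seed)          # fixed seed for reproducible mixes
    n_pl = round(n * pl_ratio)
    return rng.choices(pl, k=n_pl) + rng.choices(en, k=n - n_pl)

batch = replay_mix(["pl_ex"] * 100, ["en_ex"] * 100, n=1000)
print(batch.count("pl_ex"), batch.count("en_ex"))  # 900 100
```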
A 2,000-pair pack split into 5 blocks: (1) IKE/IKZE anti-stale - fixes persistent hallucinations on contribution limits; (2) fresh facts top fails - the most common errors from the eval probe; (3) no-EN - hard-blocks English "leak" in Polish responses; (4) long-context PL grounding - full Polish contexts; (5) retention - a safety net for earlier stages.
Parameters: LR 6e-7 (11B) / LR 8e-7 (Mini), 1 epoch. The reason this stage had to appear at all: earlier Stage 3 DPO attempts caused a regression (Mini EN-leak: 12 to 21, WORSE). DPO teaches preferences; it does not nail down numbers. Microfacts require SFT.
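A toy version of the EN-leak probe behind numbers like "12 to 21" could look like this. The marker word list is illustrative only; a production check would use proper language identification rather than a stop list:

```python
import re

# Common English function words that should never appear in a
# Polish-language answer (illustrative subset).
EN_MARKERS = {"the", "and", "is", "of", "to", "with", "however", "therefore"}

def en_leak_score(answer: str) -> int:
    """Count English marker words leaking into a Polish response."""
    words = re.findall(r"[a-ząćęłńóśźż]+", answer.lower())
    return sum(w in EN_MARKERS for w in words)

clean = "Limit wplat na IKZE w 2026 roku zalezy od formy dzialalnosci."
leaky = "Limit wplat na IKZE is defined by the ustawa and rozporzadzenie."
print(en_leak_score(clean), en_leak_score(leaky))  # 0 3
```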
Direct Preference Optimization, parameters LR 5e-7, beta 0.1. For v3 this is an optional and light stage - SpeakLeash has already done DPO-P (114k pairs) and GRPO (143k pairs) on their side, so we do not duplicate their work. We only add a domain "polish" of around 1.5-3k BezChmury pairs.
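For reference, the DPO objective itself fits in a few lines, which makes clear what beta=0.1 controls. The log-probabilities below are made-up scalars, not real model outputs:

```python
import math

# DPO pushes the policy to prefer the "chosen" answer over the
# "rejected" one *relative to a frozen reference model*; beta scales
# how hard the margin is enforced.
def dpo_loss(logp_chosen: float, logp_rejected: float,
             ref_chosen: float, ref_rejected: float,
             beta: float = 0.1) -> float:
    margin = beta * ((logp_chosen - ref_chosen)
                     - (logp_rejected - ref_rejected))
    return -math.log(1 / (1 + math.exp(-margin)))  # -log(sigmoid(margin))

# Policy already prefers the chosen answer more than the reference does,
# so the loss is small:
print(round(dpo_loss(-5.0, -9.0, -6.0, -8.0), 4))
```

Because the loss only shapes a preference margin, it has no mechanism to pin down exact figures - which is exactly why the microfacts stage had to be SFT.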
We train on two RTX 5090s. The chip's Tj Max is 90 degrees Celsius; soft throttle starts around 87-88 degrees Celsius - those are the real numbers, not the 83 degrees of older GPUs. Asymmetric power: 500W on GPU 0 and 550W on GPU 1 (GPU 0 has historically had worse thermals, peaking at 92 degrees Celsius).
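A guard that encodes these thresholds might look like the sketch below. The 87/90 degree thresholds match the numbers above; the 50 W step and the 0.7 hard-cut factor are our own illustrative assumptions:

```python
# Decide a per-GPU power cap from the current temperature:
# below the soft-throttle band keep the configured cap, inside it
# back the cap off, and cut hard once the chip reaches Tj Max.
def power_cap(temp_c: float, configured_w: int,
              soft_c: float = 87.0, tjmax_c: float = 90.0) -> int:
    if temp_c >= tjmax_c:
        return int(configured_w * 0.7)  # hard cut at/above Tj Max
    if temp_c >= soft_c:
        return configured_w - 50        # back off in the soft-throttle band
    return configured_w                 # normal operation

print(power_cap(82, 550), power_cap(88, 550), power_cap(92, 550))
# 550 500 385
```

The asymmetric 500W/550W configuration simply means GPU 0 enters this logic with a lower `configured_w`.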
The cheapest thermal upgrade for the whole setup was not an extra card - it was Vornado #1 as an intake fan. Wedged under the desk as a forced cool-air feed, it cut temperatures by 13 degrees Celsius while letting us raise the power budget by 25%. Plan for summer 2026: a second Vornado plus air conditioning at a 16 degrees Celsius setpoint, for a target of 575W/575W.
Every change in the PC environment is made live, without rebooting the machine. The practical reason: 8 x 32 GB of DDR5 gives plenty of headroom for training, but DDR5 memory training after a reboot can take hours. systemctl reload > restart, every day of the week.
The whole product is built around five principles, which are hard engineering decisions and not just marketing slogans.
AI Act 2024/1689 - Annex III high-risk obligations apply from 2 August 2026. BezChmury in its current form is not classified as high-risk (it is an information assistant, it does not make administrative or judicial decisions on behalf of a person), but we design to the highest Annex III standards anyway - simply with lower operational friction.
Schrems II + DPF 2023 - as I wrote above: the status of OpenAI and Anthropic in the Data Privacy Framework requires per-provider verification. A Polish legal entity processing client data does not have the luxury of waiting - either the data stays in Poland or we fall into a regulatory gap.
Cloud Act 2018 + FISA 702 - a fundamental asymmetry: any United States provider can be compelled by US law to surrender a European customer's data without that customer's knowledge, regardless of European DPF guarantees. This is not a hypothetical risk; it is a structural one.
Polish DPA trend 2024-2026. Decision DKN.5131.3.2025 underlines the obligation of risk analysis before deploying AI. 13 August 2024 - a fine of PLN 1,499,000 against a medical entity. We do not need a crystal ball to forecast that 2026-2027 will see the first Polish DPA decisions concerning accounting firms which routinely send client data to public chatbots. Better to be on the right side of that history.
The roadmap is deliberately conservative. I list only what I can already build today - no "nice to have" without a code commitment behind it.
Four packages go public: KSeF Lite (PLN 1,490), KSeF Private (PLN 4,990), Accounting Private (PLN 9,990), GDPR Pro Bundle (PLN 14,900). Annual Update Packs PLN 990-5,990. Enterprise on-prem from PLN 49,900 (custom scoping under NDA). The Beta programme - 100 seats: 30 days of free KSeF Private trial, 1:1 onboarding with me, influence on Q3 roadmap priorities.
The Bielik-PL-Minitron-7B-v3.0-Instruct variant (33.4% fewer parameters than the 11B) as the base for the mobile edition. Target: older laptops with 8-12 GB of RAM. Use cases: quick lookup outside the office (a tax inspection in the field, a phone call from the lift, a quick KSeF code check on a phone tethered to the laptop).
The plan assumes an upgrade to BezChmury 11B v3.1 once that release is published by SpeakLeash (date unknown; we follow bielik-papers).
Annual update pack for current Pro customers: a new SSoT (VAT changes, new error codes), new weights if the baseline shifts. Roadmap explicitly "NOT committed" - the decision will be research-driven, not date-driven.
ISO 27001 - path planned, with certification targeted for 2027. Enterprise on-prem multi-tenant - deployments for larger entities (chains of law firms, large accounting firms with 10+ staff). EN locale - English-language support in the UI for Polish firms with international clients (now live, you are reading it). All of these are directions, not promises.
If you run an accounting firm, a law firm with a focus on GDPR/healthcare, or a mid-sized medical company - I want to talk to you.
The Beta programme is not a "waiting list". It is 100 seats for people who want a real say in the product before it goes public.
The application takes five minutes.
Fill in the demo form - in the "source" field I will automatically tag blog-launch, so I know you came from this post.
Finally, Polish AI compliance that doesn't send your invoices to the United States. Six months of training, eleven billion parameters, two graphics cards, one Vornado fan under the desk, local model. Let's start together.
SOURCES
All verbatim quotes in the article come from the official sources above. Inline references marked [N] link to this list.
A short KSeF Private demo (15 min). We will show local execution, control questions, source base and how BezChmury reduces the risk of hallucinations.