A connected sensor manufacturer we work with bought a six-figure continuous validation platform two years ago. The sales demo was excellent. The proof of concept hit every milestone. The team signed.
Eighteen months later, they have spent more engineering time working around the platform than they would have spent building one themselves. The vendor’s roadmap diverged from their product roadmap. Custom test integrations require professional services hours billed at €280 per hour. A migration off the platform now would cost an estimated four months of engineering time.
We see this story regularly enough that we now treat continuous validation platform selection as a one-way door. The decision you make here will shape your firmware release cadence, your compliance posture, and your test coverage for the next three to five years.
This post is a vendor-neutral framework. We do not sell a platform. What we do is help teams select, integrate, and sometimes migrate away from one.
What Continuous Validation Actually Means for Embedded Systems
The term is borrowed from cloud-native software, where continuous validation means automated test execution against every commit, with results visible to every engineer. For pure software, this is well-understood territory: pipelines, runners, containers, mocked dependencies.
For embedded systems, “continuous” has different physics. You cannot mock a power rail. You cannot containerise a sensor reading at -20°C. You cannot scale a HIL bench by spinning up another EC2 instance. The naive translation of software CI/CD to firmware ends at “we built the firmware on every PR”, which catches compile-time and lint-level defects but misses the integration, timing, and hardware-specific bugs that dominate field failures.
A real continuous validation platform for embedded systems covers at least four loops, and up to six for safety-critical work where Model-in-the-Loop (MIL) and Processor-in-the-Loop (PIL) sit between Build and SIL, and between SIL and HIL respectively:
- Build & static analysis: Every commit produces a signed firmware artifact, runs static analysis (including MISRA C:2023 or AUTOSAR C++14 checks where relevant), and generates a fresh SBOM.
- Simulation (SIL): The firmware runs in a software model of the hardware, with mocked peripherals. Fast feedback, low fidelity.
- Hardware-in-the-Loop (HIL): The firmware runs on real hardware connected to instruments that inject realistic stimuli. Slower, higher fidelity.
- Fleet validation: A subset of real devices in real conditions report telemetry back to the validation system. Slowest, highest fidelity.
The platforms on the market today address some combination of these four loops. Almost none of them address all four well. This is the first question to ask: which loops does your product actually need, and which does the vendor cover?
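One way to make that first question concrete is to write down the loops the product needs and the loops each vendor claims, then look at the difference. The sketch below is illustrative only: `vendor_a`, `vendor_b`, and their coverage are invented placeholders, not assessments of real products.

```python
from enum import Enum


class Loop(Enum):
    BUILD = "build & static analysis"
    SIL = "simulation (SIL)"
    HIL = "hardware-in-the-loop (HIL)"
    FLEET = "fleet validation"


def coverage_gaps(required: set[Loop], covered: set[Loop]) -> set[Loop]:
    """Return the loops the product needs that the vendor does not cover."""
    return required - covered


# Hypothetical inputs: what the product needs and what two made-up vendors claim.
required = {Loop.BUILD, Loop.HIL, Loop.FLEET}
vendors = {
    "vendor_a": {Loop.BUILD, Loop.SIL, Loop.HIL},
    "vendor_b": {Loop.HIL},
}

for name, covered in vendors.items():
    gaps = coverage_gaps(required, covered)
    print(name, "leaves uncovered:", sorted(gap.value for gap in gaps))
```

Anything left in the gap set has to come from another tool or from in-house work, and belongs in the cost estimate.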
The Build-vs-Buy Question (And Why It’s Not Binary)
Most engineering leaders frame this as build-or-buy. In practice, every successful continuous validation system we have seen is a hybrid.
The components that should almost always be bought (or used off-the-shelf open source):
- The build runner itself (GitHub Actions, GitLab CI, Jenkins: pick one and stop debating)
- The static analysis engine (Clang-Tidy, Cppcheck, Coverity, PVS-Studio, Helix QAC, or Polyspace depending on budget and regulatory pressure)
- The artifact storage (any modern S3-compatible store)
- The SBOM generation tool (Syft, CycloneDX CLI)
The components that should almost always be built in-house (or owned and modified):
- The HIL test fixtures themselves (you understand your hardware better than any vendor)
- The fleet telemetry interpretation logic (your data model is yours)
- The release decision logic (your risk tolerance is yours)
The components where the question is genuinely open:
- HIL automation orchestration (vendor tools vs. self-hosted)
- Test data management and analysis
- Integration with issue tracking and release management
- Compliance evidence generation (SBOMs, vulnerability disclosure logs, CRA reporting)
The vendor case is strongest in the third category, where the question is genuinely open. The trap is buying a platform that promises to cover all three categories and ends up forcing you to use its versions of the things you should have built yourself.
The Seven Criteria We Use for Platform Selection
When we evaluate platforms with clients, we score every candidate against the same seven criteria. The score is less important than the conversation it forces.
1. Coverage Across the Four Loops
How does the platform handle build, SIL, HIL, and fleet validation? Does it pretend to cover loops it does not, or is it honest about where it stops?
The honest vendors will say: “We do HIL orchestration well. We do not do fleet validation; we integrate with your existing telemetry pipeline.” The vendors to be wary of are the ones who claim to cover all four loops with one product.
2. Hardware Abstraction Quality
How does the platform represent your hardware in software? Is the abstraction useful, or does every new device variant require a vendor engagement?
The test: ask the vendor how their platform handles a new MCU variant the team has never seen before. If the answer involves their engineering team and a four-week timeline, that is a lock-in signal.
3. Test Authoring Workflow
Who writes the tests, and in what language? If tests are authored in a proprietary DSL or a vendor-specific GUI, the team is being trained in a skill that does not transfer. If tests are authored in Python, C, or another standard language with a vendor library, knowledge stays with the team.
We have seen teams whose entire test suite is locked inside a vendor’s proprietary tooling. Migration off the platform would require rewriting the test suite from scratch. This is the most common form of platform lock-in.
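A common way to keep test knowledge portable is to write the test logic against a small interface the team owns and to confine vendor calls to an adapter behind it. The following is a minimal sketch of that idea, not any vendor's API: `PowerSupply`, `FakePowerSupply`, and `check_idle_current` are hypothetical names, and the fake stands in for real hardware so the example runs anywhere.

```python
from typing import Protocol


class PowerSupply(Protocol):
    """The only surface the tests depend on (hypothetical interface)."""

    def set_voltage(self, volts: float) -> None: ...

    def read_current(self) -> float: ...


class FakePowerSupply:
    """In-memory stand-in so the example runs without hardware.

    A real adapter would wrap a vendor library or an instrument driver
    behind the same two methods.
    """

    def __init__(self, load_ohms: float) -> None:
        self._load_ohms = load_ohms
        self._volts = 0.0

    def set_voltage(self, volts: float) -> None:
        self._volts = volts

    def read_current(self) -> float:
        # Ohm's law on a purely resistive, made-up load.
        return self._volts / self._load_ohms


def check_idle_current(psu: PowerSupply, max_amps: float = 0.05) -> bool:
    """Test logic written against the interface, not against a vendor API."""
    psu.set_voltage(3.3)
    return psu.read_current() <= max_amps


print(check_idle_current(FakePowerSupply(load_ohms=100.0)))
```

With this shape, leaving a platform means rewriting one adapter class rather than every test that uses it.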
4. Data Egress and Format Lock-In
Where does the test data live, and in what format? Can you export it? Can you query it without going through the vendor’s UI?
A platform that stores results in an open format (Parquet, JSON, SQLite) and exposes them via a documented API is recoverable. A platform that stores results in a proprietary binary format and gates queries behind a paid analytics module is a hostage situation waiting to happen.
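To show what "recoverable" can look like in practice, here is a small sketch that stores results in SQLite, queries them directly with SQL, and exports everything as JSON using only the Python standard library. The table layout and column names (`commit_sha`, `test_name`, `stage`, `passed`, `duration_s`) are assumptions made for illustration, not a standard schema.

```python
import json
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a results database with an illustrative schema."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS results ("
        "commit_sha TEXT, test_name TEXT, stage TEXT, "
        "passed INTEGER, duration_s REAL)"
    )
    return conn


def record(conn, commit_sha, test_name, stage, passed, duration_s):
    """Append one test result."""
    conn.execute(
        "INSERT INTO results VALUES (?, ?, ?, ?, ?)",
        (commit_sha, test_name, stage, int(passed), duration_s),
    )
    conn.commit()


def failures_for(conn, commit_sha):
    """Query failures for a commit directly, with no vendor UI in between."""
    rows = conn.execute(
        "SELECT test_name FROM results "
        "WHERE commit_sha = ? AND passed = 0 ORDER BY test_name",
        (commit_sha,),
    )
    return [row[0] for row in rows]


def export_json(conn) -> str:
    """Full export in an open format."""
    cur = conn.execute(
        "SELECT commit_sha, test_name, stage, passed, duration_s "
        "FROM results ORDER BY rowid"
    )
    columns = [description[0] for description in cur.description]
    return json.dumps([dict(zip(columns, row)) for row in cur.fetchall()], indent=2)


conn = open_store()
record(conn, "abc123", "boot_time", "hil", True, 4.2)
record(conn, "abc123", "brownout_recovery", "hil", False, 9.8)
print(failures_for(conn, "abc123"))
print(export_json(conn))
```

A reasonable proof-of-concept request is that the vendor demonstrates the equivalent of `failures_for` and `export_json` against their own store, through a documented API.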
5. CI/CD Integration
How does the platform integrate with the build system the team already uses? Is it a first-class plugin, or is the team expected to write glue code?
The good platforms have native GitHub Actions, GitLab CI, and Jenkins integrations, and treat the build system as the source of truth for what to test and when. The bad platforms try to be the build system, replacing what the team already runs and adding another set of credentials, another dashboard, and another upgrade cycle to manage.
6. Total Cost Over Three Years (Not License Cost)
License cost is the visible part. The real cost includes:
- Hardware required to run the platform (often a dedicated server, sometimes more)
- Vendor professional services for setup and custom integrations
- Annual maintenance and support
- Engineering time to maintain integrations as the vendor releases new versions
- Training time for new engineers joining the team
- Migration cost when the platform no longer fits
For mid-size embedded teams, three-year total cost of ownership for a commercial platform typically lands between €80K and €350K based on the engagements we have walked clients through, dominated by license fees, professional services, and integration maintenance. A self-hosted equivalent using open-source components plus dedicated engineering time typically lands between €40K and €150K, with significantly more flexibility and zero vendor risk.
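The cost components listed above can be added up in a few lines, which makes it easier to compare candidates on the same basis. This is a rough sketch: every figure in the example is a placeholder rather than a quote, except the €280 hourly services rate, which is borrowed from the opening anecdote. The split between one-off and recurring items is our assumption.

```python
from dataclasses import dataclass


@dataclass
class CostModel:
    """Illustrative cost model; all amounts in euros."""

    license_per_year: float
    support_per_year: float
    hardware_one_off: float
    services_hours: float
    services_rate: float
    integration_days_per_year: float
    training_days_per_year: float
    engineer_day_rate: float
    migration_reserve: float = 0.0

    def total(self, years: int = 3) -> float:
        # Engineering time spent on the platform each year, priced at a day rate.
        engineering_per_year = (
            self.integration_days_per_year + self.training_days_per_year
        ) * self.engineer_day_rate
        recurring = self.license_per_year + self.support_per_year + engineering_per_year
        one_off = (
            self.hardware_one_off
            + self.services_hours * self.services_rate
            + self.migration_reserve
        )
        return one_off + recurring * years


# Placeholder inputs for a hypothetical commercial platform.
commercial = CostModel(
    license_per_year=30_000,
    support_per_year=5_000,
    hardware_one_off=8_000,
    services_hours=100,
    services_rate=280,
    integration_days_per_year=20,
    training_days_per_year=5,
    engineer_day_rate=600,
)
print(f"Three-year TCO: EUR {commercial.total():,.0f}")
```

Setting `migration_reserve` to a non-zero value is a simple way to price in the exit before signing.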
7. Exit Plan
Before signing, ask: if we decide to leave in 18 months, what happens? A platform with a credible exit plan is one where:
- Test scripts are in a standard language and can run elsewhere with modest refactoring
- Test data can be exported in full
- HIL fixtures are physical assets the team owns, not vendor-locked instruments
- The contract does not have egress fees or extended termination clauses
If the vendor cannot answer this question coherently, that is the answer.
The Three Architectural Paths
Once the seven criteria are scored, three broad paths emerge.
Path A: Fully Self-Hosted, Open Source Stack
GitHub Actions or GitLab CI as the orchestrator. Python-based HIL controllers using PyVISA or pyserial for instrument control, python-can for CAN bus, pyOCD for debug-probe automation, and pytest-embedded as the test runner. Test results stored in PostgreSQL or SQLite. Reporting via Grafana or simple HTML. SBOM generation via Syft.
When this works:
Teams with at least one engineer comfortable building internal tooling, no immediate regulatory pressure, and a product line stable enough that the investment compounds.
Estimated effort:
3 to 6 months of one engineer’s time to reach feature parity with a commercial platform for a single product line. For context, Path C commercial setups typically take 1 to 3 months of vendor-led work plus 3 to 6 months of in-house customization, so the time savings are smaller than they appear.
Three-year TCO:
€40K–€150K (mostly engineering time).
Path B: Hybrid, Self-Hosted Orchestration + Commercial HIL Hardware
Use open source for build, CI integration, and reporting. Use a commercial HIL hardware platform (NI, dSPACE, Speedgoat, ETAS, or Vector for automotive bus-heavy testing) for the parts where the hardware genuinely matters and the open-source equivalents have meaningful fidelity gaps.
When this works:
Most teams. This is the 80/20 split: commercial where it earns its keep, open source where it does not. The discipline is keeping the commercial layer thin enough to be replaceable.
Estimated effort:
2–4 months setup, ongoing maintenance lower than fully self-hosted because the hardest hardware integration is bought.
Three-year TCO:
€100K–€250K.
Path C: Commercial End-to-End Platform
A single-vendor platform that covers build, simulation, HIL, and sometimes fleet. Examples include Parasoft and LDRA (build and static analysis), VectorCAST and Tessy (unit and integration testing), NI TestStand and VeriStand or dSPACE AutomationDesk (HIL orchestration). Note that no single commercial product genuinely covers all four loops; Path C is almost always a curated set of commercial tools, not one platform.
When this works:
Teams with strict regulatory requirements (DO-178C aerospace, IEC 62304 medical, ISO 26262 automotive) where vendor certification dossiers materially reduce certification cost. Also teams without the bandwidth to maintain internal tooling.
When this becomes a trap:
Teams without regulatory drivers who buy “to save engineering time” and discover the platform creates more engineering time than it saves.
Estimated effort:
1–3 months of vendor-led setup, typically followed by the 3 to 6 months of in-house customization noted under Path A.
Three-year TCO:
€150K–€500K+ depending on team size and product complexity.
What Has Changed in 2026
Three forces are reshaping this decision today, and they should weigh in any evaluation.
The EU Cyber Resilience Act.
SBOM generation, vulnerability tracking, and per-device update evidence become compliance requirements: vulnerability and incident reporting applies from 11 September 2026, and full product compliance from 11 December 2027. Platforms that treated compliance as an afterthought are scrambling. Platforms that built it in from the start (or self-hosted stacks where compliance is just another scripted step) are better positioned. We covered this in detail in the CRA draft guidance piece.
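As an illustration of compliance as "just another scripted step", the sketch below gates a build on a basic sanity check of an SBOM in CycloneDX JSON, the kind of file a tool such as Syft can produce. The specific checks are our assumption of a sensible minimum, not a statement of what the CRA requires, and the sample SBOM is hand-written for the example.

```python
import json


def sbom_problems(sbom_text: str) -> list[str]:
    """Return human-readable problems; an empty list means the gate passes."""
    try:
        sbom = json.loads(sbom_text)
    except json.JSONDecodeError as exc:
        return [f"SBOM is not valid JSON: {exc}"]
    if not isinstance(sbom, dict):
        return ["SBOM top level is not a JSON object"]

    problems = []
    if sbom.get("bomFormat") != "CycloneDX":
        problems.append("bomFormat is not 'CycloneDX'")

    components = sbom.get("components") or []
    if not components:
        problems.append("no components listed")
    for index, component in enumerate(components):
        # Name and version are the minimum needed to match against advisories.
        for field in ("name", "version"):
            if not component.get(field):
                problems.append(f"component {index} is missing '{field}'")
    return problems


# Hand-written sample; the second component deliberately lacks a version.
sample = json.dumps(
    {
        "bomFormat": "CycloneDX",
        "specVersion": "1.5",
        "components": [
            {"type": "library", "name": "mbedtls", "version": "3.6.0"},
            {"type": "library", "name": "freertos-kernel"},
        ],
    }
)
print(sbom_problems(sample))
```

In a pipeline, a non-empty result would fail the job, and the SBOM itself would be archived next to the signed firmware artifact as evidence.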
AI-assisted test generation.
AI coding tools (GitHub Copilot, Cursor, and dedicated tools like Diffblue Cover for unit-test generation) are now competent at generating embedded test cases from requirements documents and existing code, particularly for boilerplate-heavy unit tests. The platforms that expose their test authoring through standard languages (Python, C) integrate cleanly. The platforms that locked tests inside proprietary DSLs are being left behind.
OTA-driven validation.
Fleet validation as the final loop is no longer optional for connected products. The platforms that treat fleet data as a first-class signal feeding regressions back into the SIL/HIL stages are the ones building toward a defensible position. We wrote about the architectural side of this in Why Most Embedded Teams Get OTA Wrong.
The Decision Matrix
When we walk clients through this, we use a one-page matrix. Score each platform 1–5 against the seven criteria. Weight the criteria by what matters to the product:
| Criterion | Weight (typical) | Higher weight if… |
|---|---|---|
| Coverage across loops | 15% | Product needs all four loops day one |
| Hardware abstraction quality | 10% | Product line uses many MCU variants |
| Test authoring workflow | 20% | Team plans to scale engineering headcount |
| Data egress / lock-in | 15% | Long product lifecycle (5+ years) |
| CI/CD integration | 10% | Existing CI/CD investment is significant |
| Three-year TCO | 15% | Budget pressure is real |
| Exit plan | 15% | Vendor risk is meaningfully high |
The platform that wins is rarely the one that scored highest on any single criterion. It is the one that scored “good enough” across all seven without catastrophic weakness in any single one. A 4-out-of-5 average across the board usually beats a 5-out-of-5 in three criteria and a 1-out-of-5 in two others.
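The arithmetic behind the matrix is simple enough to script. The sketch below uses the typical weights from the table; the two score sets are invented to mirror the "balanced" and "spiky" profiles described above, and the threshold for a catastrophic weakness (a score of 2 or lower) is our assumption.

```python
# Typical weights from the table, in percent (they sum to 100).
WEIGHTS = {
    "coverage": 15,
    "hw_abstraction": 10,
    "test_authoring": 20,
    "data_egress": 15,
    "ci_cd": 10,
    "tco": 15,
    "exit_plan": 15,
}


def weighted_percent(scores: dict[str, int]) -> float:
    """Weighted score as a percentage of the maximum (all criteria at 5)."""
    points = sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)  # out of 500
    return points / 5


def has_catastrophic_weakness(scores: dict[str, int], floor: int = 2) -> bool:
    """True if any single criterion scores at or below the floor."""
    return min(scores[name] for name in WEIGHTS) <= floor


# Hypothetical platforms: one balanced, one strong in places but weak in two.
balanced = {name: 4 for name in WEIGHTS}
spiky = {
    "coverage": 5,
    "hw_abstraction": 5,
    "ci_cd": 5,
    "test_authoring": 4,
    "tco": 4,
    "data_egress": 1,
    "exit_plan": 1,
}

for label, scores in (("balanced", balanced), ("spiky", spiky)):
    print(label, weighted_percent(scores), has_catastrophic_weakness(scores))
```

With these made-up numbers the balanced platform lands at 80% and the spiky one at 69%. The gap depends on which criteria carry the weak scores, so the weakness check is worth reporting alongside the average rather than folding it in.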
When to Stop Evaluating and Start Building
A failure mode we see often: teams spend six months evaluating platforms when the right answer was to start building the self-hosted stack on day one.
Three signals it is time to stop evaluating:
- You have evaluated more than three platforms and none have scored above 70% on your weighted matrix
- The vendor sales cycles are starting to exceed the time the team has spent on actual test development
- The team has at least one engineer who has built internal tooling before and is energised by the prospect of doing it again
If any two of these are true, the answer is almost always Path A or Path B. The opportunity cost of further evaluation exceeds the value of the marginally better platform that might exist.
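The two-of-three rule can be written down as a small check, which helps when the team wants to revisit it at each evaluation milestone. This is a sketch of the rule as stated above: the function names are ours, and measuring the sales cycle and test development in weeks is an assumption.

```python
def stop_signals(
    platform_scores_pct: list[float],
    sales_cycle_weeks: float,
    test_dev_weeks: float,
    has_tooling_engineer: bool,
) -> list[str]:
    """Return which of the three stop-evaluating signals currently apply."""
    signals = []
    if len(platform_scores_pct) > 3 and all(s <= 70 for s in platform_scores_pct):
        signals.append("more than three platforms evaluated, none above 70%")
    if sales_cycle_weeks > test_dev_weeks:
        signals.append("sales cycles exceed time spent on test development")
    if has_tooling_engineer:
        signals.append("an engineer is ready and keen to build internal tooling")
    return signals


def should_stop_evaluating(signals: list[str]) -> bool:
    """Any two signals are enough to favour Path A or Path B."""
    return len(signals) >= 2


# Hypothetical situation: four platforms scored, slow sales cycles, no tooling engineer.
current = stop_signals([62, 68, 55, 70], sales_cycle_weeks=10, test_dev_weeks=6,
                       has_tooling_engineer=False)
print(current, should_stop_evaluating(current))
```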
At Better Devices, we help embedded teams select, integrate, and sometimes migrate away from continuous validation platforms. We do not sell a platform. We help teams pick the right one, build the parts that should be built, and avoid the lock-in that costs them later. Related reading: How to Build a HIL Test Bench for Embedded Devices, 5 HIL Testing Mistakes That Ship Bugs to the Field, and Beyond the Lab Bench: 5 Continuous Validation Strategies. If platform selection is on your roadmap and you would value a vendor-neutral conversation, book a vendor-neutral platform selection review.
