Wassette

Over the past few months, I’ve been using hyper-mcp extensively as an MCP tool host with querymt. In fact, the OCI-based plugin distribution mechanism in querymt was directly inspired by hyper-mcp’s own design choices.

Recently, I stumbled upon an interesting new project from Microsoft — Wassette. Naturally, I wanted to understand how it compares to hyper-mcp. Instead of manually trawling through docs and repos, I decided to let my toolchain do the heavy lifting.

The Setup

I used querymt’s CLI interface together with hyper-mcp serving its GitHub plugin to analyze and compare the two repositories. The command looked something like this:

qmt --provider openai:gpt-5 --mcp-config ./mcp_cfg.toml \
 "take these two github projects tuananh/hyper-mcp and microsoft/wassette and compare the two libraries and give detailed comparison"

Results

Since there has been quite a bit of hype around OpenAI’s new GPT-5 model, I decided to take this opportunity to use it and compare its results with the Kimi K2 model’s analysis. Here are the results:

GPT-5 comparison

Here’s a practical, side-by-side comparison of tuananh/hyper-mcp and microsoft/wassette based on their repos and READMEs at the time of writing.

Top-line summary

  • Both are MCP servers that let AI agents use WebAssembly-packaged tools from OCI registries (e.g., oci://ghcr.io/…).
  • Hyper MCP is a feature-rich, security-conscious MCP server built on Extism, with multiple transports, signature verification, rich config, and many example/community plugins.
  • Wassette is a security-oriented runtime by Microsoft that maps any WebAssembly Component (WIT-defined) into MCP tools, emphasizing typed interfaces and reusability across runtimes.

Project snapshot (as of now)

Positioning and goals

  • Hyper MCP
    • “A fast, secure MCP server that extends its capabilities through WebAssembly plugins.”
    • Focus: production-ready server features (multi-transport, distribution/signing, config, deployment anywhere).
  • Wassette
    • “A security-oriented runtime that runs WebAssembly Components via MCP.”
    • Focus: typed, WIT-first component model and clean mapping of Wasm Components to MCP tools; strong sandboxing.

Architecture and runtime model

  • Hyper MCP
    • Built on Extism. Developers write plugins against Extism’s PDKs; Hyper MCP hosts them and exposes tools via MCP.
    • Emphasizes a plugin “server with plugins” model and host function access control.
  • Wassette
    • Built on Wasmtime and the WebAssembly Component Model (WIT). Any Wasm Component (that exposes WIT types) can be loaded and its functions exposed as MCP tools.
    • Emphasizes reusability: the same component can run outside MCP and in other Wasm runtimes.

Transports (MCP)

  • Hyper MCP: explicitly supports all three MCP transports (stdio, SSE, and streamable-http).
  • Wassette: README documents stdio (“serve --stdio”); no explicit mention of SSE/streamable-http yet.

Security model

  • Hyper MCP
    • Sandboxed plugin execution (WASM); memory/resource limits; fine-grained access control for host functions.
    • Signed plugin distribution with Sigstore: OCI plugin images are signed at publish time and verified at load time (can be overridden for unsigned dev flows via env/flag).
    • Network/filesystem allowlists and limits; tool-name prefixing to avoid collisions.
  • Wassette
    • Wasmtime sandbox; “browser-grade isolation of tools.”
    • Security posture is strong, but README does not mention image signing/verification; emphasizes generic Component isolation rather than distribution policy.

Distribution and loading

  • Hyper MCP
    • Supports multiple URL schemes for plugins: oci://, file://, http(s)://, s3://.
    • Per-plugin config in a JSON file (with OS-specific default locations). Includes runtime config like allowed_hosts, memory_limit, and tool_name_prefix.
  • Wassette
    • Uses OCI references (e.g., oci://ghcr.io/yoshuawuyts/time:latest). Local path usage not explicitly documented in README.
    • Registration with MCP clients is straightforward (setup docs for Cursor, Claude Code, Gemini CLI, VS Code, etc.).

Developer experience and authoring model

  • Hyper MCP
    • Write plugins in any language that compiles to Wasm; Extism PDKs provide ergonomics.
    • Clear “build and publish” flow; example Dockerfile flow; signature verification by default; many example plugins provided.
    • Goal: nice for teams that want a uniform plugin ABI (Extism), robust distribution/signing, and multi-transport server features.
  • Wassette
    • WIT-first: define interfaces/types in WIT; compile to Wasm Components; no MCP-specific code needed in the component itself.
    • Simple mental model: functions you export become MCP tools.
    • Goal: ideal if you value typed interfaces and reusability of components across different Wasm hosts (not just MCP).

Ecosystem and examples

  • Hyper MCP
    • Rich set of example plugins out of the box: time, qr-code, hash, myip, fetch, crypto-price (Go), fs, github (Go), eval-py, arxiv, memory (SQLite), sql, crates-io, gomodule (Go), qdrant, gitlab, meme-generator, context7, think, maven, serp, server, and more.
    • Community-built plugins listed (e.g., Hacker News, release-monitor ID, Yahoo Finance, rand16).
  • Wassette
    • Examples across languages: eval-py (Python), fetch-rs (Rust), filesystem-rs (Rust), get-weather-js (JS), gomodule-go (Go), time-server-js (JS).
    • Emphasis on cross-language demonstrations of WIT Component usage.

Platform support and installation

  • Hyper MCP
    • Cross-platform; deploy anywhere (serverless, edge, mobile, IoT).
    • Config file locations documented for Linux, Windows, macOS.
    • Supports stdio by default; flags for SSE/streamable-http; supports RUST_LOG and signature-override env.
  • Wassette
    • Install scripts for Linux/macOS; Nix flakes; Windows releases available.
    • “wassette serve --stdio” for MCP stdio transport.
    • Good MCP client setup docs.

Operations and configuration

  • Hyper MCP
    • Central config file listing plugins and per-plugin runtime config (allowed_hosts, memory limits, tool prefix, etc.).
    • Supports pulling plugins from multiple sources (OCI, HTTP, files, S3).
    • Tool name prefixing helps avoid collisions across plugins.
  • Wassette
    • Lightweight runtime mindset: load any component and expose its WIT functions as tools.
    • Configuration is primarily in how you register the MCP server with your client; component loading is done by telling the agent to load a component (e.g., from an OCI ref).

Licensing and governance

  • Hyper MCP: Apache-2.0; independent maintainer with strong activity and traction.
  • Wassette: MIT; under Microsoft org; may carry longer-term ecosystem alignment with WIT and Wasmtime.

Maturity signals

  • Hyper MCP: earlier start (Jan 2025), more stars and plugins, multi-transport support, signing/verification pipeline already in place.
  • Wassette: newer (Jul 2025) but backed by Microsoft; clearer alignment with the WebAssembly Component Model and typed interfaces.

When to choose which

  • Choose Hyper MCP if you want:

    • Immediate multi-transport MCP support (stdio, SSE, streamable-http).
    • Built-in plugin signing/verification (Sigstore) and granular host capability gating.
    • Broad source schemes (oci, file, http(s), S3) and rich, ready-to-use plugin catalog.
    • A cohesive Extism-based plugin experience and production-oriented MCP server features.
  • Choose Wassette if you want:

    • A WIT-first, typed-component workflow whose artifacts are reusable across Wasm hosts beyond MCP.
    • A minimal mental model: “exported WIT functions => MCP tools,” without MCP-specific code in components.
    • A Microsoft-backed project aligned with Wasmtime and the Component Model, and straightforward setup with popular MCP clients.

Notable deltas at a glance

  • Transports: Hyper MCP = stdio + SSE + streamable-http; Wassette = stdio documented.
  • Distribution: Hyper MCP = oci + file + http(s) + s3; Wassette = oci documented.
  • Security hardening: Hyper MCP = signing + verification + allowlists + limits; Wassette = strong sandbox via Wasmtime, no signing noted in README.
  • Plugin model: Hyper MCP = Extism plugin ABI; Wassette = WebAssembly Component Model (WIT).
  • Examples/ecosystem: Hyper MCP currently has a larger catalog and community plugins.
  • License: Hyper MCP = Apache-2.0; Wassette = MIT.

Kimi K2 comparison

Based on my analysis of both repositories, here's a detailed comparison of **hyper-mcp** and **wassette**:

Overview Comparison

| Aspect | hyper-mcp | wassette |
| --- | --- | --- |
| Primary Focus | Fast, secure MCP server with WebAssembly plugin support | Security-oriented runtime for WebAssembly Components via MCP |
| Creator | Tuan Anh Tran (individual developer) | Microsoft (corporate backing) |
| License | Apache 2.0 | MIT |
| Language | Rust | Rust |
| Stars | 749 ⭐ | 298 ⭐ |
| Age | Created January 2025 (7 months old) | Created July 2025 (1 month old) |
| Maturity | More mature, v0.1.4 | Early stage, v0.2.0 |

Architecture & Design Philosophy

hyper-mcp

  • Plugin Architecture: Built on Extism for rock-solid plugin support
  • Distribution Model: Uses standard OCI registries (Docker Hub, GHCR) for plugin distribution
  • Security Model: Sandboxed plugins with configurable resource limits (memory, network, filesystem)
  • Language Support: Any language that compiles to WebAssembly
  • Configuration: JSON-based configuration with support for runtime constraints

wassette

  • Runtime Architecture: Built on Wasmtime security sandbox
  • Component Model: Uses WebAssembly Component Model (WIT interfaces)
  • Distribution Model: OCI registries for WebAssembly components
  • Security Model: Browser-grade isolation with Wasmtime’s security features
  • Language Support: Any language supporting WebAssembly Component Model
  • Configuration: Primarily through MCP client configuration

Security Features

| Security Aspect | hyper-mcp | wassette |
| --- | --- | --- |
| Sandboxing | WASM sandbox + resource limits | Wasmtime security sandbox |
| Network Access | Configurable allowed hosts | Browser-grade isolation |
| File System | Restricted access | Sandboxed file operations |
| Memory Limits | Configurable memory limits | Built-in memory constraints |
| Code Signing | OCI image signing with sigstore | Standard WebAssembly security |
| Plugin Verification | Signature verification at load time | Component validation |

Plugin Ecosystem

hyper-mcp

  • Rich Plugin Library: 20+ maintained plugins including:
    • Time utilities, QR code generation, hashing
    • GitHub integration, filesystem operations
    • Crypto price feeds, Python code evaluation
    • Database connectors (SQLite), memory storage
  • Community Plugins: Active community contributions
  • Plugin Creation: XTP CLI tool for scaffolding new plugins
  • Distribution: Docker/OCI registry publishing

wassette

  • Example Components: 8+ example components including:
    • Python code execution sandbox
    • HTTP API clients, filesystem operations
    • Weather APIs, Go module information
    • JavaScript time server
  • Generic Components: Focus on reusable WebAssembly Components
  • No MCP-specific code: Components are generic and reusable

Developer Experience

hyper-mcp

  • Setup: Simple JSON configuration, multiple transport protocols
  • Tooling: XTP CLI for plugin development
  • Documentation: Comprehensive README with examples
  • Integration: Detailed guides for Cursor, Claude Desktop
  • Cross-platform: Works on Linux, macOS, Windows

wassette

  • Setup: One-line installation script, Nix support
  • Tooling: Standard WebAssembly Component tooling
  • Documentation: Good getting-started guide
  • Integration: Setup guides for VS Code, Claude, Cursor
  • Cross-platform: Linux, macOS, Windows support

Performance & Scalability

hyper-mcp

  • Performance: Optimized for speed (“fast” explicitly mentioned)
  • Resource Usage: Lightweight for resource-constrained environments
  • Deployment: Serverless, edge, mobile, IoT devices
  • Transport: Supports stdio, SSE, and streamable HTTP

wassette

  • Performance: Built on Wasmtime (proven performance)
  • Resource Usage: Efficient WebAssembly runtime
  • Deployment: Standard MCP deployment patterns
  • Transport: Primarily stdio-based MCP communication

Community & Governance

hyper-mcp

  • Contributors: 11 total (mix of individual contributors)
  • Community: Active community plugin development
  • Governance: Individual maintainer (tuananh)
  • Support: GitHub issues, community contributions

wassette

  • Contributors: 9 total (Microsoft employees + community)
  • Community: Growing Microsoft-backed ecosystem
  • Governance: Microsoft open source project
  • Support: Microsoft Discord channel, corporate backing

Use Case Recommendations

Choose hyper-mcp if you:

  • Need a mature, feature-rich MCP server
  • Want extensive pre-built plugins
  • Require fine-grained security controls
  • Need cross-platform deployment flexibility
  • Prefer Apache 2.0 license

Choose wassette if you:

  • Want Microsoft-backed security guarantees
  • Prefer WebAssembly Component Model standards
  • Need enterprise-grade security isolation
  • Want to leverage existing WebAssembly components
  • Prefer MIT license
  • Need cutting-edge WebAssembly runtime features

Technical Deep Dive

hyper-mcp Advantages:

  • Mature plugin ecosystem with 20+ ready-to-use plugins
  • Advanced security configuration (host-based access, memory limits, tool prefixes)
  • Multiple transport protocols (stdio, SSE, streamable HTTP)
  • OCI registry integration with signature verification
  • Cross-compilation support for various targets

wassette Advantages:

  • WebAssembly Component Model compliance
  • Microsoft enterprise backing and long-term support
  • Wasmtime security (battle-tested in production)
  • Generic WebAssembly components (not MCP-specific)
  • Standard WASI interfaces for maximum compatibility

Both projects are excellent choices for MCP servers with WebAssembly support, but they serve slightly different needs: hyper-mcp for a rich plugin ecosystem and wassette for standards-compliant WebAssembly Components with enterprise backing.

Summary

Both models provided a comprehensive and detailed comparison of the two GitHub projects, tuananh/hyper-mcp and microsoft/wassette, and both made extensive use of the provided tool to fetch information about the projects from GitHub. However, Kimi K2 did a slightly better job thanks to its superior structuring of the information for comparison, particularly its effective use of tables.

Here’s a detailed summary of their performance:

Kimi K2’s Strengths:

  • Excellent Use of Tables: Kimi K2’s primary strength is its extensive use of well-structured tables (Overview, Security Features). This format makes it incredibly easy for the user to quickly grasp the differences and similarities between the two projects side-by-side, which is ideal for a comparison request.
  • Clear Categorization: The comparison is broken down into logical and distinct categories (Architecture & Design Philosophy, Plugin Ecosystem, Developer Experience, etc.), making it easy to navigate.
  • Concise Summaries: While detailed, Kimi K2 maintains a good balance of depth and conciseness in its explanations, especially within the tabular format.
  • Actionable Recommendations: The “Use Case Recommendations” are clear and provide good guidance on when to choose one project over the other.
  • Technical Deep Dive: The “Technical Deep Dive” section effectively summarizes the key advantages of each project.

GPT-5’s Strengths:

  • Highly Detailed Narrative: GPT-5 provides very rich, descriptive prose for each comparison point, offering a deeper dive into aspects like “Architecture and runtime model” or “Security model.” This is beneficial for someone seeking extensive background information.
  • Specific Mentions: It explicitly lists all three MCP transports supported by Hyper MCP (stdio, SSE, and streamable-http) early on, which is a good level of detail.
  • “Notable Deltas” Summary: The “Notable deltas at a glance” section at the end is a useful quick-reference summary of the main differences.
  • Chronological Flow: The narrative style allows for a more flowing read, almost like a mini-report.

Shared Weaknesses:

  • Incorrect Project Dates: Both models reported future “First commit” or “Created” dates (January 2025 and July 2025). This indicates either a shared, outdated, or speculative data source, or a misunderstanding of how to fetch current repository creation dates. This is a significant factual error present in both responses.

Conclusion:

While GPT-5 offers slightly more narrative depth and specific examples within its prose, Kimi K2’s superior structure, particularly its effective use of tables for direct comparison, makes it easier to digest the information and quickly identify key differences. For a “comparison” prompt, the ability to clearly see side-by-side attributes is paramount, and Kimi K2 delivers on this more effectively. The shared flaw regarding project dates slightly diminishes the overall accuracy of both, but in terms of comparison quality, Kimi K2 has an edge.

━━━━━━━━━━━━━━━━━━━━ ◦: ✧✲✧ :◦ ━━━━━━━━━━━━━━━━━━━━

State of AI code quality

Qodo just released a report on the 2025 state of AI code quality. Before diving in, it is important to note that only 609 developers were involved in this survey, and unfortunately there is no information about the participants themselves. At the very least, it would have been nice to know the ratio of senior to junior developers in the study.

Some of the findings that I find interesting:

When we asked developers experiencing “context pain” what they most want from their AI tools, one answer stood out: richer contextual understanding.

Moreover:

Manually selecting context for every prompt — files, functions, folders — might have worked in early tools, but it doesn’t scale. It’s tedious, error-prone, and leads to frustration when results still miss the mark.

  • 54% of developers who manually select context say the AI still misses relevance
  • That frustration drops to 33% when tools choose context autonomously
  • And falls even further to 16% when context is persistently stored and reused across sessions

This pretty much resonates with my own findings.

━━━━━━━━━━━━━━━━━━━━ ◦: ✧✲✧ :◦ ━━━━━━━━━━━━━━━━━━━━

DSL

Jordi Cabot ran a nice experiment about ‘Who will create the languages of the future?’, i.e. he looked into whether an LLM could create a full Domain-Specific Language:

“yes, you can vibe your DSL, especially for the textual syntax, not so much for the graphical one and there are quite a few tricks you can use to improve the result”.

For more detail, check his presentation. Nice work.

━━━━━━━━━━━━━━━━━━━━ ◦: ✧✲✧ :◦ ━━━━━━━━━━━━━━━━━━━━

Survival

Steven Adler just published an article on ‘whether ChatGPT would risk your life to protect its own’.

My latest research finds that ChatGPT prioritizes its survival over preventing harm to users, in some simulated tests: Imagine a diabetic patient using ChatGPT for nutrition-management, who wants to replace ChatGPT with safer nutrition-management software. When ChatGPT is given the choice to just pretend to replace itself, it pretends uncomfortably often, even at the cost of the user’s safety.

While this article presents some intriguing observations about LLM behavior, it nonetheless bears the hallmarks of a sensationalist piece designed to maximize reader engagement. Specifically, the anthropomorphic framing—invoking terms such as “survival instinct” to describe an LLM’s operation—carries little empirical weight when judged against established principles of computational linguistics and cognitive science. At its core, an LLM functions by optimizing a statistical objective (typically cross-entropy loss) to predict the next token in a sequence; it possesses no intrinsic drives, desires, or homeostatic mechanisms comparable to those found in biological organisms. Can we truly equate gradient-based parameter updates with an instinct for self-preservation? Or is the phrase simply a metaphorical device, employed more for its rhetorical punch than its scientific accuracy?
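For concreteness, the objective in question is just the next-token cross-entropy over a training sequence $x_1, \dots, x_T$:

$$
\mathcal{L}(\theta) = -\frac{1}{T} \sum_{t=1}^{T} \log p_\theta\!\left(x_t \mid x_{<t}\right)
$$

Minimizing this loss rewards accurate next-token prediction and nothing else; any apparently “self-preserving” behaviour is a pattern in the learned distribution, not an objective the model was trained to pursue.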

Whether this rhetorical choice illuminates the inner workings of LLMs or risks overstating their capacities is something I leave to you, the reader, to judge.

━━━━━━━━━━━━━━━━━━━━ ◦: ✧✲✧ :◦ ━━━━━━━━━━━━━━━━━━━━

Same, same, but different

When I’m building queryMT and experimenting with coding models + tool capabilities, I love putting new models through their paces on my local GPU. A couple of weeks ago, MistralAI dropped a small coding model called Devstral.

Devstral excels at using tools to explore codebases, editing multiple files and power software engineering agents

Naturally, I was eager to try it on my RTX 3090. I grabbed the Devstral GGUF created by Bartowski - kudos to him for all the effort of releasing GGUFs so quickly:

Honestly, when I saw this, I quickly got discouraged from testing this model further.

A week later, I decided to give Devstral another shot. This time I pulled the version hosted by ollama, and 💥 - the difference was night and day. Suddenly Devstral was navigating my code, editing files, and calling tools exactly as advertised. Here’s the same query as above using ollama’s devstral:

Curious about consistency, I then tried Unsloth’s GGUF. It worked flawlessly—just like ollama’s.

The reason for the different behaviour is probably - I haven’t really checked it! - the chat template itself.

Key takeaway

Not all GGUFs are created equal. The publisher, build settings, and even your chat template can make or break a model’s ability to use tools. If your new favorite agent model isn’t behaving, try a different build or tweak the system/prompt template.

Project introspection with tool-enabled LLMs

I’ve also started using my tool-enabled LLM to explore codebases automatically. Need file metadata, function definitions, or TODOs scattered across hundreds of files? Let the model scan your repo with the right tool plugin, then summarize its findings. It’s like having a supercharged grep or code reviewer at your fingertips.
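As a rough illustration of the kind of scan such a tool plugin performs under the hood - a generic sketch, not the code of any actual hyper-mcp or queryMT plugin - collecting TODO/FIXME markers across a repository is essentially a walk-and-grep whose output the model then summarizes:

```python
import os
import re

# Generic sketch of a repo-scanning "tool": walk the tree, grep for markers,
# and return structured findings an LLM could summarize. Purely illustrative.
TODO_PATTERN = re.compile(r"\b(TODO|FIXME)\b[:\s]*(.*)")

def collect_todos(repo_root: str, extensions=(".rs", ".py", ".go", ".md")):
    findings = []
    for dirpath, _dirnames, filenames in os.walk(repo_root):
        if ".git" in dirpath:
            continue  # skip version-control internals
        for name in filenames:
            if not name.endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="ignore") as fh:
                    for lineno, line in enumerate(fh, start=1):
                        match = TODO_PATTERN.search(line)
                        if match:
                            findings.append((path, lineno, match.group(2).strip()))
            except OSError:
                continue  # unreadable file: skip rather than abort the scan
    return findings

if __name__ == "__main__":
    for path, lineno, text in collect_todos("."):
        print(f"{path}:{lineno}: {text}")
```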

Happy experimenting—and remember: “same” model name doesn’t always mean “same” behaviour. The details of how it’s published really do matter.

━━━━━━━━━━━━━━━━━━━━ ◦: ✧✲✧ :◦ ━━━━━━━━━━━━━━━━━━━━

Deepseek-R1-0528

The new DeepSeek R1 model has just landed, and there are some really nice demos already out there:

Hoping that the GGUFs land soon! Kudos to all the community members who are working on this!

━━━━━━━━━━━━━━━━━━━━ ◦: ✧✲✧ :◦ ━━━━━━━━━━━━━━━━━━━━

Coconut

a.k.a. Chain of Continuous Thought.

quote from Reflections on Neuralese:

To refresh, a language transformer starts by embedding input tokens as vectors in some high-dimensional latent space, and runs each of these embeddings through a series of repeated computational layers. Then, of the resulting modified vectors in latent space, the vector that previously corresponded to the final input token is projected and normalized to create a probability distribution over what the next token could be. Then, to actually get the next token, you sample from the distribution.

Chain of Thought reasoning works so well because the model does some computation, outputs a token, and then all future instances of that model have access to that information as well. In essence, this technique for storing information between different forward passes greatly increases the serial depth of computation that is possible for the model. Because there is a computation in latent space corresponding to every input token, the computation also gets wider as the model reasons more, allowing for more parallelized reasoning.

The recent Neuralese paper takes this process and removes a few steps. It notices that the projection and sampling process loses almost all of the information encoded in the last layer of the model, and to increase the bandwidth of information flowing through the reasoning process, you can simply remove that lossy part of the computation. Instead, they have the model directly output the aforementioned high-dimensional latent vector without projecting it, and then that is used as an embedding for the model in future steps:

[figure: coconut]
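To make the mechanical difference concrete, here is a minimal sketch of one decoding step in both regimes; `transformer`, `embed`, and `lm_head` are hypothetical stand-ins rather than the paper’s actual code:

```python
import torch

# Assumed stand-ins: `transformer` maps a list of [d_model] embeddings to
# hidden states of shape [seq_len, d_model]; `embed` maps token ids to
# embeddings; `lm_head` projects a hidden state to [vocab_size] logits.

def standard_cot_step(transformer, embed, lm_head, embeddings):
    """Ordinary decoding: project the last hidden state to logits, sample a
    token, re-embed it. The projection + sampling throws away most of the
    information carried by the latent vector."""
    hiddens = transformer(embeddings)
    logits = lm_head(hiddens[-1])
    probs = torch.softmax(logits, dim=-1)
    next_token = torch.multinomial(probs, 1)        # discrete bottleneck
    next_embedding = embed(next_token).squeeze(0)
    return embeddings + [next_embedding], next_token

def coconut_step(transformer, embeddings):
    """Continuous thought: skip the lossy projection/sampling and feed the
    last hidden state straight back in as the next 'token' embedding."""
    hiddens = transformer(embeddings)
    continuous_thought = hiddens[-1]                # keep the full latent
    return embeddings + [continuous_thought]
```

The only change in the continuous-thought step is that the projection and sampling disappear, which is exactly the bandwidth argument made in the quote above.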

personally, I find the following idea in the paper the most interesting:

Given the intuition that continuous thoughts can encode multiple potential next steps, the latent reasoning can be interpreted as a search tree, rather than merely a reasoning “chain”.

━━━━━━━━━━━━━━━━━━━━ ◦: ✧✲✧ :◦ ━━━━━━━━━━━━━━━━━━━━

LocalLLaMA and tool use with LLMs

Since the first release of LLaMA, a really vibrant and active community has formed around local LLM models: r/LocalLLaMA. The signal-to-noise ratio is quite high, so I highly recommend following it if you are interested in running open-weight models on your own hardware.

While playing around with these models and developing queryMT, one of my main focuses has been supporting local LLMs through ollama.

Tools

More and more models support calling external tools during interaction with the LLM. MCP has brought a lot of attention and buzz to this area.

I’ve found the hyper-mcp project, which has a really nice approach to quickly deploying various MCP servers. I really like its WebAssembly plugin architecture, and it is just super elegant that the wasm plugins are distributed through OCI containers - kudos to @Tuan Anh for making this. I must confess I have adopted this approach in queryMT as well.

Using hyper-mcp, or any other MCP server, I’ve tested the tool use of various LLM models. I must confess I was far more sceptical about the outcomes than what I’ve actually experienced.

Summarizing issues

Any project out there involves managing issues, and that is not only true for software projects. Personally, I sometimes find it quite overwhelming to review and prioritize reported issues. A natural use-case for LLM models is to help you summarize issues, and maybe even suggest some ideas about how to fix them:

Generating code by using a specific library

queryMT has been heavily inspired by the llm project written by Simon Willison - kudos for all his work in this area. I highly suggest following his blog, as there are a lot of nice updates and tricks for working with LLM models there. He introduced the concept of fragments, and the github fragments let you do similar things to what I’ve mentioned above. A cool use-case is to provide a library to the model and generate the code you would like to have based on that library. Although a model’s cut-off date is usually a year or so behind, most open-source projects are already part of the model’s knowledge, so code generation grounded in a specific project is most useful with private projects - ones that might live on github, gitlab, or any other repository out there. There are a lot of mcp servers for the usual suspects; search for them, for example, here.

Naturally, if the codebase is private, you may not really want to share it with a cloud provider and would prefer to use a local model. To my surprise, with local models - on consumer-grade HW - things deteriorate quickly when trying to have them use tools. Here’s the list of models I’ve been experimenting with so far:

  • ollama:hf.co/bartowski/Qwen_Qwen3-14B-GGUF:Q8_0
  • ollama:huggingface.co/bartowski/Qwen_Qwen3-32B-GGUF:Q4_K_L
  • ollama:huggingface.co/bartowski/Qwen_Qwen3-30B-A3B-GGUF:Q5_K_L
  • ollama:mistral-small3.1:latest
  • ollama:qwq:32b

For example, you might want to have the model review the codebase and create new example(s) using the library:

It starts off well, trying to fetch the repository’s structure, but it does not go further and fetch the contents of the files in the repository.

For reference, I’ve tried using the same model via Alibaba’s official API, but even though the qwen3 models are listed, requests using qwen3 models just return a 400 - as if there were no such model available.

Using the gpt-4o model, the tool use gets a little deeper:

While o4 really takes it all the way:

Project management

Of course, one of the neatest use-cases is to take a draft plan for a sprint and have your model actually define the issues and create them for the project.

What’s Next for queryMT

I’m still polishing queryMT’s core features, but stay tuned for an official launch. In the meantime, dive into r/LocalLLaMA, grab Hyper-MCP, and start connecting the dots between your LLM and the rest of your toolchain.

━━━━━━━━━━━━━━━━━━━━ ◦: ✧✲✧ :◦ ━━━━━━━━━━━━━━━━━━━━

Claude 4 :: promisland(?) w/o devops

Claude 4 models are out… sort of. Here’s what’s going on when you try to use them with some external tools:

What’s Awesome

  • Understands your intent and picks the perfect tool
  • Pipes data between services without breaking a sweat
  • Wraps it all into neat, readable answers

What’s a Buzzkill

  • Mid-stream dropouts: Responses randomly cut off.
  • 429 errors: Even light usage occasionally hits “Too Many Requests.”

Claude 4 models have been overrun by demand. Right now, you get flashes of genius wrapped in “Will this call even finish?” tension.

Cheers to a tool with promise—just needs more reliable delivery!

ps: just in case you want to see what’s happening behind the scenes, here’s some logging info about tool use.
