Microsoft.AI.Foundry.Local.WinML
1.0.0
Prefix Reserved
dotnet add package Microsoft.AI.Foundry.Local.WinML --version 1.0.0
Foundry Local C# SDK
The Foundry Local C# SDK provides a .NET interface for running AI models locally via the Foundry Local Core. Discover, download, load, and run inference entirely on your own machine — no cloud required.
Features
- Model catalog — browse and search all available models; filter by cached or loaded state
- Lifecycle management — download, load, unload, and remove models programmatically
- Chat completions — synchronous and IAsyncEnumerable streaming via OpenAI-compatible types
- Audio transcription — transcribe audio files with streaming support
- Download progress — wire up an Action<float> callback for real-time download percentage
- Model variants — select specific hardware/quantization variants per model alias
- Optional web service — start an OpenAI-compatible REST endpoint (/v1/chat_completions, /v1/models)
- WinML acceleration — opt-in Windows hardware acceleration with automatic EP download
- Full async/await — every operation supports CancellationToken and async patterns
- IDisposable — deterministic cleanup of native resources
Installation
dotnet add package Microsoft.AI.Foundry.Local
Building from source
cd sdk/cs
dotnet build src/Microsoft.AI.Foundry.Local.csproj
Or open Microsoft.AI.Foundry.Local.SDK.sln in Visual Studio / VS Code.
WinML: Automatic Hardware Acceleration (Windows)
On Windows, Foundry Local can leverage WinML for GPU/NPU hardware acceleration via ONNX Runtime execution providers (EPs). EPs are large binaries downloaded on first use and cached for subsequent runs.
To enable WinML acceleration, install the WinML package variant instead of the base package:
dotnet add package Microsoft.AI.Foundry.Local.WinML
Or build from source with:
dotnet build src/Microsoft.AI.Foundry.Local.csproj /p:UseWinML=true
Triggering EP download
EP management is explicit via two methods:
- DiscoverEps() — returns an array of EpInfo describing each available EP and whether it is already registered.
- DownloadAndRegisterEpsAsync(names?, progressCallback?, ct?) — downloads and registers the specified EPs (or all available EPs if no names are given), returning an EpDownloadResult. Overloads are provided so you can pass just a callback without specifying names.
// Initialize the manager first (see Quick Start)
await FoundryLocalManager.CreateAsync(
new Configuration { AppName = "my-app" },
NullLogger.Instance);
var mgr = FoundryLocalManager.Instance;
// Discover what EPs are available
var eps = mgr.DiscoverEps();
foreach (var ep in eps)
{
Console.WriteLine($"{ep.Name} — registered: {ep.IsRegistered}");
}
// Download and register all EPs
var result = await mgr.DownloadAndRegisterEpsAsync();
Console.WriteLine($"Success: {result.Success}, Status: {result.Status}");
// Or download only specific EPs
var result2 = await mgr.DownloadAndRegisterEpsAsync(new[] { eps[0].Name });
Per-EP download progress
Pass an optional Action<string, double> callback to receive (epName, percent) updates
as each EP downloads (percent is 0–100):
string currentEp = "";
await mgr.DownloadAndRegisterEpsAsync((epName, percent) =>
{
if (epName != currentEp)
{
if (currentEp != "")
{
Console.WriteLine();
}
currentEp = epName;
}
Console.Write($"\r {epName} {percent,6:F1}%");
});
Console.WriteLine();
Catalog access no longer blocks on EP downloads. Call DownloadAndRegisterEpsAsync explicitly when you need hardware-accelerated execution providers.
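For example, a minimal sketch (assuming the manager has already been initialized as in Quick Start) that downloads only the EPs that are not yet registered, so already-cached EPs are skipped:

```csharp
using System.Linq; // for Where/Select

var mgr = FoundryLocalManager.Instance;

// Keep only EPs that have not been registered yet.
var missing = mgr.DiscoverEps()
    .Where(ep => !ep.IsRegistered)
    .Select(ep => ep.Name)
    .ToArray();

if (missing.Length > 0)
{
    var result = await mgr.DownloadAndRegisterEpsAsync(missing);
    if (!result.Success)
        Console.Error.WriteLine($"EP registration failed: {result.Status}");
}
```

This relies only on the members documented above (DiscoverEps, EpInfo.Name, EpInfo.IsRegistered, DownloadAndRegisterEpsAsync, EpDownloadResult.Success/Status).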
Quick Start
using Microsoft.AI.Foundry.Local;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Logging.Abstractions;
using Betalgo.Ranul.OpenAI.ObjectModels.RequestModels;
// 1. Initialize the singleton manager
await FoundryLocalManager.CreateAsync(
new Configuration { AppName = "my-app" },
NullLogger.Instance);
// 2. Get the model catalog and look up a model
var catalog = await FoundryLocalManager.Instance.GetCatalogAsync();
var model = await catalog.GetModelAsync("phi-3.5-mini")
?? throw new Exception("Model 'phi-3.5-mini' not found in catalog.");
// 3. Download (if needed) and load the model
await model.DownloadAsync();
await model.LoadAsync();
// 4. Get a chat client and run inference
var chatClient = await model.GetChatClientAsync();
var response = await chatClient.CompleteChatAsync(new[]
{
new ChatMessage { Role = "user", Content = "Why is the sky blue?" }
});
Console.WriteLine(response.Choices![0].Message.Content);
// 5. Clean up
FoundryLocalManager.Instance.Dispose();
Usage
Initialization
FoundryLocalManager is an async singleton. Call CreateAsync once at startup:
await FoundryLocalManager.CreateAsync(
new Configuration { AppName = "my-app" },
loggerFactory.CreateLogger("FoundryLocal"));
Access it anywhere afterward via FoundryLocalManager.Instance. Check FoundryLocalManager.IsInitialized to verify creation.
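Since CreateAsync should run only once, a guard on the documented IsInitialized flag is a reasonable pattern in apps with multiple startup paths (whether a second CreateAsync call throws is not stated above, so this sketch simply avoids it):

```csharp
if (!FoundryLocalManager.IsInitialized)
{
    await FoundryLocalManager.CreateAsync(
        new Configuration { AppName = "my-app" },
        NullLogger.Instance);
}

var mgr = FoundryLocalManager.Instance; // safe to use from here on
```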
Catalog
The catalog lists all models known to the Foundry Local Core:
var catalog = await FoundryLocalManager.Instance.GetCatalogAsync();
// List all available models
var models = await catalog.ListModelsAsync();
foreach (var m in models)
Console.WriteLine($"{m.Alias} — {m.Info.DisplayName}");
// Get a specific model by alias
var model = await catalog.GetModelAsync("phi-3.5-mini")
?? throw new Exception("Model 'phi-3.5-mini' not found in catalog.");
// Get a specific variant by its unique model ID
var variant = await catalog.GetModelVariantAsync("phi-3.5-mini-generic-gpu-4")
?? throw new Exception("Variant 'phi-3.5-mini-generic-gpu-4' not found in catalog.");
// List models already downloaded to the local cache
var cached = await catalog.GetCachedModelsAsync();
// List models currently loaded in memory
var loaded = await catalog.GetLoadedModelsAsync();
Model Lifecycle
Each model may have multiple variants (different quantizations, hardware targets). The SDK auto-selects the best variant, or you can pick one. All models implement the IModel interface.
// Check and select variants
Console.WriteLine($"Selected: {model.Id}");
foreach (var v in model.Variants)
Console.WriteLine($" {v.Id} (cached: {await v.IsCachedAsync()})");
// Switch to a different variant
model.SelectVariant(model.Variants[1]);
Download, load, and unload:
// Download with progress reporting
await model.DownloadAsync(progress =>
Console.WriteLine($"Download: {progress:F1}%"));
// Load into memory
await model.LoadAsync();
// Unload when done
await model.UnloadAsync();
// Remove from local cache entirely
await model.RemoveFromCacheAsync();
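Because a loaded model holds memory until UnloadAsync runs, it can be worth pairing load and unload in a try/finally; a sketch:

```csharp
await model.DownloadAsync();
await model.LoadAsync();
try
{
    var chatClient = await model.GetChatClientAsync();
    // ... run inference ...
}
finally
{
    // Free model memory even if inference throws.
    await model.UnloadAsync();
}
```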
Chat Completions
var chatClient = await model.GetChatClientAsync();
var response = await chatClient.CompleteChatAsync(new[]
{
new ChatMessage { Role = "system", Content = "You are a helpful assistant." },
new ChatMessage { Role = "user", Content = "Explain async/await in C#." }
});
Console.WriteLine(response.Choices![0].Message.Content);
Streaming
Use IAsyncEnumerable for token-by-token output:
using var cts = new CancellationTokenSource();
await foreach (var chunk in chatClient.CompleteChatStreamingAsync(
new[] { new ChatMessage { Role = "user", Content = "Write a haiku about .NET" } }, cts.Token))
{
Console.Write(chunk.Choices?[0]?.Message?.Content);
}
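To cap a long generation, the same CancellationTokenSource can be constructed with a timeout; a sketch (the 30-second value is illustrative):

```csharp
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(30)); // auto-cancel after 30s
try
{
    await foreach (var chunk in chatClient.CompleteChatStreamingAsync(
        new[] { new ChatMessage { Role = "user", Content = "Tell a long story" } }, cts.Token))
    {
        Console.Write(chunk.Choices?[0]?.Message?.Content);
    }
}
catch (OperationCanceledException)
{
    Console.WriteLine("\n[stopped after timeout]");
}
```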
Chat Settings
Tune generation parameters per client:
chatClient.Settings.Temperature = 0.7f;
chatClient.Settings.MaxTokens = 256;
chatClient.Settings.TopP = 0.9f;
chatClient.Settings.FrequencyPenalty = 0.5f;
Audio Transcription
var audioClient = await model.GetAudioClientAsync();
// One-shot transcription
var result = await audioClient.TranscribeAudioAsync("recording.mp3");
Console.WriteLine(result.Text);
// Streaming transcription
await foreach (var chunk in audioClient.TranscribeAudioStreamingAsync("recording.mp3", CancellationToken.None))
{
Console.Write(chunk.Text);
}
Audio Settings
audioClient.Settings.Language = "en";
audioClient.Settings.Temperature = 0.0f;
Web Service
Start an OpenAI-compatible REST endpoint for use by external tools or processes:
// Configure the web service URL in your Configuration
await FoundryLocalManager.CreateAsync(
new Configuration
{
AppName = "my-app",
Web = new Configuration.WebService { Urls = "http://127.0.0.1:5000" }
},
NullLogger.Instance);
await FoundryLocalManager.Instance.StartWebServiceAsync();
Console.WriteLine($"Listening on: {string.Join(", ", FoundryLocalManager.Instance.Urls!)}");
// ... use the service ...
await FoundryLocalManager.Instance.StopWebServiceAsync();
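While the service is running, any OpenAI-compatible client can talk to it. As a sketch, a plain HttpClient call against the /v1/models endpoint mentioned above (the base address is whatever Urls you configured):

```csharp
using var http = new HttpClient();
// 127.0.0.1:5000 matches the Urls value configured above.
var json = await http.GetStringAsync("http://127.0.0.1:5000/v1/models");
Console.WriteLine(json); // OpenAI-style model list as JSON
```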
Configuration
| Property | Type | Default | Description |
|---|---|---|---|
| AppName | string | (required) | Your application name |
| AppDataDir | string? | ~/.{AppName} | Application data directory |
| ModelCacheDir | string? | {AppDataDir}/cache/models | Where models are stored locally |
| LogsDir | string? | {AppDataDir}/logs | Log output directory |
| LogLevel | LogLevel | Warning | Verbose, Debug, Information, Warning, Error, Fatal |
| Web | WebService? | null | Web service configuration (see below) |
| AdditionalSettings | IDictionary<string, string>? | null | Extra key-value settings passed to Core |
Configuration.WebService
| Property | Type | Default | Description |
|---|---|---|---|
| Urls | string? | 127.0.0.1:0 | Bind address; semicolon-separated for multiple |
| ExternalUrl | Uri? | null | URI for accessing the web service from a separate process |
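Putting the two tables together, a configuration that overrides several defaults might look like this (the paths, URLs, and the AdditionalSettings key are illustrative, not documented values):

```csharp
var config = new Configuration
{
    AppName = "my-app",              // required
    ModelCacheDir = "D:/models",     // overrides {AppDataDir}/cache/models
    LogLevel = LogLevel.Information,
    // Semicolon-separated bind addresses for the optional web service.
    Web = new Configuration.WebService
    {
        Urls = "http://127.0.0.1:5000;http://127.0.0.1:5001"
    },
    // Hypothetical key — Core's accepted keys are not listed in this README.
    AdditionalSettings = new Dictionary<string, string> { ["example-setting"] = "value" }
};
```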
Disposal
FoundryLocalManager implements IDisposable. Dispose stops the web service (if running) and releases native resources:
FoundryLocalManager.Instance.Dispose();
API Reference
Auto-generated API docs live in docs/api/. See GENERATE-DOCS.md to regenerate.
Key types:
| Type | Description |
|---|---|
| FoundryLocalManager | Singleton entry point — create, catalog, web service |
| Configuration | Initialization settings |
| ICatalog | Model catalog interface |
| IModel | Model interface — identity, metadata, lifecycle, variant selection |
| Model | Model with variant selection (implements IModel) |
| OpenAIChatClient | Chat completions (sync + streaming) |
| OpenAIAudioClient | Audio transcription (sync + streaming) |
| ModelInfo | Full model metadata record |
Tests
dotnet test
See test/FoundryLocal.Tests/LOCAL_MODEL_TESTING.md for prerequisites and local model setup.
| Product | Compatible and computed target frameworks |
|---|---|
| .NET | net9.0-windows10.0.26100 is compatible; net10.0-windows was computed. |
Dependencies (net9.0-windows10.0.26100):
- Betalgo.Ranul.OpenAI (>= 9.1.0)
- Microsoft.AI.Foundry.Local.Core.WinML (>= 1.0.0)
- Microsoft.Extensions.Logging (>= 9.0.9)
NuGet packages
This package is not used by any NuGet packages.
GitHub repositories (2)
Showing the top 2 popular GitHub repositories that depend on Microsoft.AI.Foundry.Local.WinML:
- microsoft/ai-dev-gallery — An open-source project for Windows developers to learn how to add AI with local models and APIs to Windows apps.
- rwjdk/MicrosoftAgentFrameworkSamples — Samples demonstrating the Microsoft Agent Framework in C#