Tuesday, 23 December 2025

Memory-Saving Techniques to Boost Performance in ASP.NET Core Web APIs

When an ASP.NET Core Web API starts slowing down under load, the root cause is often not "CPU is too high" — it's memory pressure. Rising allocations trigger frequent garbage collections (GC), increased latency, and sometimes a steadily growing working set that ends in restarts or out-of-memory events (especially in containers).

This article walks through practical, high-impact memory-saving techniques for ASP.NET Core Web APIs and ties them together with a real-life style example of optimizing an "Orders" endpoint in an e-commerce system.

Why memory matters in Web APIs

Every request creates objects: DTOs, strings, collections, EF Core tracking graphs, serialized JSON buffers, logs, etc. If your API allocates more per request than necessary, you’ll see:

  • Higher latency (GC pauses)
  • Lower throughput (more time collecting than doing work)
  • Memory spikes during traffic bursts
  • Unstable performance over time (working set growth)

The goal isn’t “use no memory” — it’s to allocate less per request, avoid buffering large payloads, and keep caches bounded so memory stays predictable.

Real-life example scenario: “Orders API” under load

Context: An e-commerce platform exposes:

GET /api/orders/search?customerId=…

Traffic pattern:

  • 200–500 requests/sec during peak
  • Many clients ask for large date ranges
  • A small percentage of customers have tens of thousands of orders

Symptoms:

  • P95 latency jumps from 120ms to 900ms during peak
  • Gen 2 GCs become frequent
  • Memory usage climbs after each traffic spike
  • Occasional container restarts

The “before” implementation (common anti-patterns)

Typical issues:

  • Loading full entity graphs
  • Tracking enabled for read endpoints
  • Materializing full lists in memory
  • Returning massive payloads without pagination

[HttpGet("search")]
public async Task<IActionResult> Search(Guid customerId)
{
    var orders = await _db.Orders
        .Include(o => o.Items)
        .Include(o => o.Payments)
        .Where(o => o.CustomerId == customerId)
        .OrderByDescending(o => o.CreatedAt)
        .ToListAsync();

    var dto = orders.Select(o => new OrderDto
    {
        Id = o.Id,
        CreatedAt = o.CreatedAt,
        Total = o.Total,
        Items = o.Items.Select(i => new ItemDto { /* ... */ }).ToList()
    }).ToList();

    return Ok(dto);
}

This looks harmless, but at scale it causes:

  • Large allocations for List<>, nested List<>, DTO graphs
  • EF Core tracking overhead (stores snapshots, references, fixup)
  • Bigger JSON serialization buffers
  • Higher GC pressure and latency

The optimized version: memory-saving changes that matter

Below are the highest-impact changes for memory and performance.

1) Always paginate large result sets

If your endpoint can return “all orders,” it will eventually return “too many orders.”

Rule of thumb: enforce a maximum pageSize, even for internal APIs.

[HttpGet("search")]
public async Task<IActionResult> Search(Guid customerId, int page = 1, int pageSize = 50)
{
    page = Math.Max(page, 1);
    pageSize = Math.Clamp(pageSize, 1, 200);

    // ...
}

Why this saves memory: you cap the number of entities/DTOs/materialized rows in memory at once.

2) Use AsNoTracking() for read-only queries

For read endpoints, EF Core tracking is often wasted memory.

var query = _db.Orders
    .AsNoTracking()
    .Where(o => o.CustomerId == customerId);

Why this saves memory: tracking creates internal data structures for every entity row. AsNoTracking() avoids them.

3) Project directly to DTOs (avoid loading entity graphs)

Instead of Include + mapping after the fact, shape the response in the database query.

var results = await _db.Orders
    .AsNoTracking()
    .Where(o => o.CustomerId == customerId)
    .OrderByDescending(o => o.CreatedAt)
    .Skip((page - 1) * pageSize)
    .Take(pageSize)
    .Select(o => new OrderSummaryDto(
        o.Id,
        o.CreatedAt,
        o.Total,
        o.Items.Count))
    .ToListAsync();

Why this saves memory:

  • You avoid materializing full Order, Items, Payments graphs
  • You allocate fewer objects overall
  • JSON serialization is smaller and faster

Bonus: this often reduces database load too.

4) Stream results for truly large exports (avoid buffering)

Some endpoints are inherently large (exports, reports). For those, don’t return huge JSON arrays in one shot.

Options:

  • CSV export streamed to the response
  • NDJSON streaming (one JSON object per line)
  • IAsyncEnumerable streaming (careful with client expectations)

A CSV streaming example outline:

  • Write headers
  • Stream rows in batches
  • Flush periodically

This keeps memory stable because you never hold the full dataset in memory.
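The outline above can be sketched as follows. This is a minimal illustration against the same Orders set; the "export" route, batch size, and column list are assumptions, not part of the original API.

```csharp
// Sketch of a streamed CSV export: rows go to the client in batches,
// so the full result set is never materialized in memory.
[HttpGet("export")]
public async Task Export(Guid customerId)
{
    Response.ContentType = "text/csv";
    await using var writer = new StreamWriter(Response.Body);

    await writer.WriteLineAsync("Id,CreatedAt,Total"); // header row

    const int batchSize = 1000;
    var page = 0;
    while (true)
    {
        // Materialize only one batch at a time.
        var batch = await _db.Orders
            .AsNoTracking()
            .Where(o => o.CustomerId == customerId)
            .OrderBy(o => o.Id) // stable order so Skip/Take paging is deterministic
            .Skip(page * batchSize)
            .Take(batchSize)
            .Select(o => new { o.Id, o.CreatedAt, o.Total })
            .ToListAsync();

        if (batch.Count == 0) break;

        foreach (var row in batch)
            await writer.WriteLineAsync($"{row.Id},{row.CreatedAt:O},{row.Total}");

        await writer.FlushAsync(); // push the batch to the client, keep memory flat
        page++;
    }
}
```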

5) Bound your caches (don't "accidentally DoS yourself")

Unbounded in-memory caching is a classic cause of memory growth.

If you use IMemoryCache, configure size limits and set size per entry:

  • Enable SizeLimit
  • Every entry must call SetSize(…)
  • Use absolute expiration

Why this saves memory: the cache becomes self-limiting instead of growing with traffic patterns.
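A minimal sketch of the three rules above. Here cache is assumed to be an injected IMemoryCache, and LoadOrderSummaryAsync is a hypothetical loader; note that sizes are in arbitrary units you define (entries here), not bytes.

```csharp
// In Program.cs: give the cache a total size budget.
builder.Services.AddMemoryCache(options =>
{
    options.SizeLimit = 10_000; // total budget across all entries, in your own units
});

// At the call site: once SizeLimit is set, every entry must declare a size
// (otherwise the insert throws), and absolute expiration caps its lifetime.
var summary = await cache.GetOrCreateAsync($"orders:{customerId}", entry =>
{
    entry.SetSize(1);                                     // this entry costs 1 unit
    entry.SetAbsoluteExpiration(TimeSpan.FromMinutes(5)); // hard upper bound on lifetime
    return LoadOrderSummaryAsync(customerId);             // hypothetical loader
});
```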

6) Avoid request/response body buffering unless required

Middleware or logging that reads the body often forces buffering. Buffering large payloads multiplies memory use during concurrency.

Guidance:

  • Don’t enable request buffering globally
  • Don’t log entire bodies in production
  • Stream uploads/downloads
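For downloads, one way to stay un-buffered is to hand ASP.NET Core a stream and let it copy to the response in chunks. A sketch, where OpenReportStream and the route are hypothetical:

```csharp
// FileStreamResult copies the stream to the response incrementally;
// the full payload is never held in memory at once.
[HttpGet("report")]
public IActionResult Report()
{
    Stream reportStream = OpenReportStream(); // hypothetical: blob, file, DB export...
    return File(reportStream, "application/octet-stream", "report.bin");
}
```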

7) Logging: reduce hidden allocations

High-volume logging allocates:

  • formatted strings
  • structured state objects
  • scope dictionaries

Prefer structured logs:

  • Good: logger.LogInformation("Fetched {Count} orders for {CustomerId}", count, customerId);
  • Avoid: interpolated strings in hot paths, or logging huge serialized objects
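For the hottest paths, the [LoggerMessage] source generator (available since .NET 6) goes a step further and avoids the boxing and params-array allocations of the plain LogInformation overloads. A sketch; OrderLog and the event id are illustrative:

```csharp
// The generator emits a cached delegate plus an IsEnabled guard,
// so disabled log levels cost almost nothing.
public static partial class OrderLog
{
    [LoggerMessage(
        EventId = 1001,
        Level = LogLevel.Information,
        Message = "Fetched {Count} orders for {CustomerId}")]
    public static partial void FetchedOrders(ILogger logger, int count, Guid customerId);
}

// Usage in the endpoint:
// OrderLog.FetchedOrders(_logger, results.Count, customerId);
```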

Putting it together: an “after” endpoint

Here’s a more memory-friendly version of the original endpoint:

[HttpGet("search")]
public async Task<ActionResult<PagedResult<OrderSummaryDto>>> Search(
    Guid customerId,
    int page = 1,
    int pageSize = 50)
{
    page = Math.Max(page, 1);
    pageSize = Math.Clamp(pageSize, 1, 200);

    var baseQuery = _db.Orders
        .AsNoTracking()
        .Where(o => o.CustomerId == customerId);

    var total = await baseQuery.CountAsync();

    var data = await baseQuery
        .OrderByDescending(o => o.CreatedAt)
        .Skip((page - 1) * pageSize)
        .Take(pageSize)
        .Select(o => new OrderSummaryDto(
            o.Id,
            o.CreatedAt,
            o.Total,
            o.Items.Count))
        .ToListAsync();

    return Ok(new PagedResult<OrderSummaryDto>(data, total, page, pageSize));
}

public record OrderSummaryDto(Guid Id, DateTime CreatedAt, decimal Total, int ItemCount);

public record PagedResult<T>(IReadOnlyList<T> Data, int Total, int Page, int PageSize);

This version:

  • Caps payload size
  • Avoids tracking
  • Avoids loading large graphs
  • Allocates fewer objects per request

A quick checklist you can apply to most APIs

  • Pagination everywhere for list endpoints
  • AsNoTracking() for reads
  • Projection (Select) instead of Include + mapping
  • Streaming for exports and large payloads
  • Bounded caching (size + expiration)
  • No global body buffering
  • Structured logging with controlled volume

How to validate improvements (what to measure)

To prove memory optimizations, track:

  • Allocated bytes/sec
  • Gen 0/1/2 GC counts
  • P95/P99 latency
  • Working set / private bytes
  • Request rate under load

In practice, the clearest signs you're winning are lower allocations per request, fewer Gen 2 collections, and stable memory during bursts.
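One convenient way to watch these counters on a live process is the dotnet-counters global tool (a usage sketch; substitute your API's process id for <PID>):

```shell
# Install once: dotnet tool install -g dotnet-counters
# Then stream GC, allocation-rate, and working-set counters for the API:
dotnet-counters monitor --process-id <PID> --counters System.Runtime
```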

Hope you like the article. Happy Programming.
