Blog

AWS MediaTailor SSAI in Production: The Hidden Complexity of Server-Side Ad Insertion

7 min read·Mar 10, 2026

 
Server-side ad insertion sounds cleaner than client-side ad insertion. In some ways, it is. But in production, it also moves monetization directly into the playback path.
 
From working on streaming platforms that used AWS MediaTailor, the biggest lesson was simple: a managed SSAI solution reduces implementation effort, but it does not remove system complexity.
 
It shifts complexity into manifests, cache behavior, tracking, latency budgets, failure handling, and observability. Those are the parts that determine whether an ad-supported streaming platform feels stable or fragile.
 
In this post, I will walk through the main engineering lessons I learned from using AWS MediaTailor for SSAI, focusing on what it improves, what it complicates, and what teams usually underestimate before going live.

 

 

Why MediaTailor Is Attractive

 
The appeal of AWS MediaTailor is obvious.
 
It centralizes ad insertion logic, reduces player-side complexity, and gives teams a more consistent ad experience across devices.
 
Instead of pushing ad orchestration into every client application, the stitching happens server-side, closer to the manifest layer.
 
That gives you a few immediate advantages:
  • Less ad-specific logic inside players
  • More consistent behavior across platforms
  • Lower exposure to client-side ad blocking than CSAI
  • A more controlled monetization flow
 
That part is real. MediaTailor solves meaningful problems.
 
But it also creates a dangerous illusion: that ad insertion is now "handled" and no longer a core architectural concern.
 
That is the mistake.

 

 

SSAI Moves Ads Into the Critical Playback Path

 
The most important operational change with SSAI is this:
 
Ads are no longer adjacent to playback. They are part of playback.
 
Once manifests are being personalized and stitched with ad decisions, monetization latency starts affecting viewer experience directly.
 
That means:
  • Ad decision latency matters
  • Manifest stitching latency matters
  • Playback session setup reliability matters
  • Upstream dependency failures can degrade playback
 
On architecture diagrams, managed SSAI often looks neat. In production, it turns ad infrastructure into part of the serving path.
 
If you do not treat it with the same discipline as any other critical dependency, you create a fragile system very quickly.

 

Typical production questions look like this:

 

  • If ad session setup becomes slow, do you wait or degrade?
  • If monetization services are unhealthy, do you preserve revenue or playback continuity?
  • If the content path is healthy but the ad path is not, what fails first?
 
Those decisions define the system more than the product documentation does.

 

 

Personalized Manifests Change Cache Behavior

 
One of the easiest mistakes in SSAI systems is assuming that caching will behave the same way it did before ad stitching.
 
It will not.
 
With MediaTailor, manifests become more personalized and session-aware. That changes cache efficiency and makes CDN behavior less predictable if you designed the system with generic streaming assumptions.
 
The core issue is simple:
 
Personalization and cacheability are in constant tension.
 
Once two viewers can receive different ad experiences for the same content, manifest-level caching becomes more complicated.
 
That leads to consequences such as:
  • Lower cache hit ratios on manifest traffic
  • More origin pressure during spikes
  • Harder reasoning about edge behavior
  • More difficult debugging when playback differs across users
 
In non-SSAI pipelines, many cache assumptions are stable. With SSAI, some of those assumptions quietly stop being true.
 
That does not mean the model is bad. It means cache design must be revisited explicitly, not inherited blindly from a simpler delivery architecture.

 

The practical consequence:

 

A streaming platform that behaves well under shared manifests can behave very differently once manifests become individualized around ad sessions.
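A toy illustration of that tension, assuming hypothetical key shapes (not MediaTailor's actual cache keys): session-scoped manifest keys diverge per viewer, while segment keys stay shared and cacheable.

```go
package main

import "fmt"

// With SSAI, the manifest key becomes session-scoped while the media
// segments stay shared across viewers. Key formats here are illustrative.
func manifestKey(channel, sessionID string) string {
	return fmt.Sprintf("manifest:%s:session:%s", channel, sessionID)
}

func segmentKey(channel, segment string) string {
	return fmt.Sprintf("segment:%s:%s", channel, segment)
}

func main() {
	// Two viewers, same content: manifest keys diverge (cache miss),
	// segment keys stay identical (cache hit).
	fmt.Println(manifestKey("news", "a1") == manifestKey("news", "b2"))
	fmt.Println(segmentKey("news", "00042.ts") == segmentKey("news", "00042.ts"))
}
```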

 

 

Tracking Is Harder Than It Looks

 
A common misunderstanding around SSAI is that server-side insertion automatically makes ad measurement clean.
 
It does not.
 
It makes some parts more controlled, but measurement is still full of edge cases.
 
The reason is straightforward:
 
An inserted ad is not the same thing as a confirmed ad view.
 
A manifest can contain an ad break. A session can be created correctly. Tracking beacons can be emitted. And the viewer can still abandon playback, hit buffering, switch devices, or fail to actually consume the ad as expected.
 
That is where teams start discovering that monetization truth and playback truth are not identical.
 
In practice, you need to reason separately about:
  • Ads inserted
  • Ads requested
  • Ads started
  • Ads actually watched
  • Tracking successfully emitted
  • Tracking successfully received downstream
 
If those layers are not clearly instrumented, the system becomes difficult to trust.
 
You may think you have a playback problem when you actually have a tracking problem. Or the opposite.
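One way to keep those layers honest is to count them separately and watch the drop-off between adjacent stages. A minimal sketch, with a hypothetical AdFunnel type and made-up numbers:

```go
package main

import "fmt"

// AdFunnel tracks each measurement layer separately so that
// "inserted" is never conflated with "watched". The stage names
// mirror the list above; the values below are illustrative.
type AdFunnel struct {
	Inserted, Requested, Started, Watched, Emitted, Received int
}

// DropOff returns the loss between two adjacent stages as a fraction.
func DropOff(from, to int) float64 {
	if from == 0 {
		return 0
	}
	return 1 - float64(to)/float64(from)
}

func main() {
	f := AdFunnel{Inserted: 1000, Requested: 980, Started: 900, Watched: 700, Emitted: 690, Received: 650}
	fmt.Printf("insert→watch drop-off: %.1f%%\n", 100*DropOff(f.Inserted, f.Watched))
	fmt.Printf("emit→receive drop-off: %.1f%%\n", 100*DropOff(f.Emitted, f.Received))
}
```

A large gap between "inserted" and "watched" points at a playback problem; a gap between "emitted" and "received" points at a tracking problem.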

 

 

Failure Modes Matter More Than Happy Paths

 
SSAI demos focus on the happy path.
 
Production systems are defined by failure paths.
 
When using MediaTailor, the real engineering work starts when some part of the ad-serving chain is slow, unavailable, or inconsistent.
 
Typical questions appear immediately:
  • What happens if ad decisioning is slow?
  • What happens if session setup fails?
  • What happens if manifests are healthy but tracking is degraded?
  • What happens if content delivery is fine but monetization dependencies are not?
 
Those are not implementation details. They are product decisions expressed through architecture.
 
For example, if the ad path becomes unhealthy during a live event, do you block playback, delay playback, or choose a content-only fallback to preserve continuity?
 
There is no universally correct answer. But there must be an answer.
 
Teams that avoid these decisions early usually end up making them in the middle of an incident, which is the worst possible moment.

 

The real architectural decision is choosing among options like:

 

  • Protect monetization first
  • Protect viewer continuity first
  • Use different fallback behavior for live and VOD
 
The specific answer matters less than the fact that it is explicit, tested, and observable.

 

 

Observability Is the Real Work

 
The operational difference between a manageable SSAI platform and a chaotic one is not the vendor.
 
It is observability.
 
Without clear visibility, teams end up debugging monetization, manifests, player behavior, and CDN issues through guesswork.
 
That does not scale.
 
At minimum, I would want visibility into:
  • Manifest stitching latency
  • SSAI session setup latency
  • Ad decision latency
  • Ad fill rate
  • Playback errors around ad breaks
  • Fallback activation rate
  • CDN hit ratio changes
  • Mismatch between inserted ads and confirmed views
 
That is the difference between "SSAI is enabled" and "SSAI is operable."
 
In ad-supported streaming, visibility is not a nice-to-have. It is what lets you separate playback incidents from monetization incidents and respond correctly to both.
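For the latency metrics in that list, percentiles matter more than averages, because the ad path tends to dominate the tail. A self-contained sketch of the arithmetic (a real deployment would use histograms from a metrics library, and the sample values are made up):

```go
package main

import (
	"fmt"
	"sort"
)

// percentile returns the p-th percentile of latency samples (ms),
// using nearest-rank rounding on a sorted copy of the input.
func percentile(samples []float64, p float64) float64 {
	s := append([]float64(nil), samples...)
	sort.Float64s(s)
	idx := int(p/100*float64(len(s)-1) + 0.5)
	return s[idx]
}

func main() {
	stitchMs := []float64{42, 45, 48, 51, 55, 60, 70, 90, 150, 400}
	fmt.Println(percentile(stitchMs, 50)) // median looks healthy
	fmt.Println(percentile(stitchMs, 95)) // the tail tells the real story
}
```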

 

Where debugging gets much harder:

 

One of the hardest parts of debugging SSAI playback is that server-side signals are not enough.
 
If player-side telemetry is limited, expensive to retain, or simply unavailable, teams end up trying to explain viewer behavior from manifests, backend timings, and CDN signals alone.
 
That creates a serious blind spot.
 
You may know that an ad break was stitched correctly and that delivery looked healthy from the server side, while still being unable to explain what the viewer actually experienced during playback.
 
That gap makes ambiguous playback issues much slower to diagnose and much harder to resolve with confidence.

 

 

Managed SSAI Does Not Remove Architectural Responsibility

 
The biggest mistake teams make with products like MediaTailor is assuming that outsourcing part of the implementation also outsources the architectural burden.
 
It does not.
 
Managed SSAI can reduce the amount of custom machinery you need to build. That is valuable.
 
But the hard questions remain yours:
  • How much latency can the playback path tolerate?
  • What degrades first when dependencies fail?
  • How do you preserve viewer continuity?
  • How do you validate monetization correctness?
  • How do you debug individualized manifest behavior under load?
 
AWS MediaTailor helps. It does not absolve.

 

 

Final Thoughts

 
From my experience, MediaTailor is useful for exactly the reason many teams adopt it: it reduces the amount of client-side ad complexity and gives you a more centralized SSAI model.
 
But that is only the beginning.
 
The real engineering challenge is not enabling ad insertion.
 
The real challenge is making ad-supported playback reliable, debuggable, and predictable under real production conditions.
 
That means treating SSAI as a distributed systems problem, not just a monetization feature.
 
Once ads become part of the delivery path, the standard changes.
 
You are no longer just serving video.
 
You are serving video, monetization, timing, and system resilience in the same request flow.
 
That is where the real complexity starts.

Designing a Scalable GraphQL Backend in Go for a Commerce Platform

5 min read·Jan 29, 2026

 
GraphQL is extremely attractive for commerce platforms. It gives frontend teams flexibility, reduces over-fetching, and allows rapid feature iteration. But that flexibility comes with architectural responsibility.
 
Without discipline, a GraphQL backend can become the primary bottleneck in your system. Query flexibility can translate into unpredictable database load, cache pressure, and rising infrastructure costs.
 
In this post, I will walk through practical strategies for designing a scalable GraphQL backend in Go, focusing on batching, caching, schema evolution, and architectural boundaries.

 

 

Architecture Boundaries First

 
The most important design rule is simple:
 
GraphQL is a transport layer, not a business logic layer.
 
Resolvers should orchestrate domain services. They should not contain core business rules, persistence logic, or infrastructure concerns.
 
A clean Go structure typically looks like this:
  • graph: schema and resolvers
  • domain: business logic
  • repository: database access
  • infrastructure: cache, messaging, external integrations
 
This separation ensures:
  • Business logic is independently testable.
  • Transport concerns do not leak into core services.
  • The system can evolve without coupling everything to the GraphQL schema.
 
When GraphQL becomes the center of your architecture instead of an interface, scalability problems follow.

 

 

Eliminating the N+1 Problem

 
The most common scalability issue in GraphQL systems is the N+1 query problem.
 
Imagine querying 200 products and resolving inventory and pricing for each product individually. A naive resolver might trigger a separate database query per product.
 
Instead of executing:
 
SELECT * FROM inventory WHERE product_id = $1
 
200 times, we batch.
 
In Go, a request-scoped DataLoader pattern works well.

 

Example loader:

 

type InventoryLoader struct {
    repo *repository.InventoryRepo
}

func (l *InventoryLoader) LoadMany(ctx context.Context, productIDs []int) (map[int]Inventory, error) {
    return l.repo.GetByProductIDs(ctx, productIDs)
}

 

Repository implementation:

 

func (r *InventoryRepo) GetByProductIDs(ctx context.Context, ids []int) (map[int]Inventory, error) {
    // pq.Array requires the github.com/lib/pq driver package.
    query := `SELECT product_id, quantity FROM inventory WHERE product_id = ANY($1)`
    rows, err := r.db.QueryContext(ctx, query, pq.Array(ids))
    if err != nil {
        return nil, err
    }
    defer rows.Close()

    result := make(map[int]Inventory)
    for rows.Next() {
        var inv Inventory
        if err := rows.Scan(&inv.ProductID, &inv.Quantity); err != nil {
            return nil, err
        }
        result[inv.ProductID] = inv
    }
    // Surface any error encountered during iteration.
    return result, rows.Err()
}
 
This reduces database round trips dramatically and stabilizes latency under load.
 
Batching is not optional in a high-traffic commerce backend. It is foundational.

 

 

Caching Strategy

 
In commerce systems, reads dominate writes.
 
A layered caching strategy typically includes:
  • Request-level in-memory caching
  • Redis for cross-request caching
  • CDN for public assets
 
The critical part is defining cache boundaries explicitly.
 
Cache keys must reflect business semantics. For example:
  • product:{id}
  • category:{id}:products
  • pricing:{region}:{product_id}
 
Invalidation must be designed before caching is implemented. Product updates must invalidate both product-level and category-level entries. Otherwise, you trade performance for consistency bugs.
 
Blind caching increases system fragility. Structured caching increases system predictability.
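A minimal sketch of that invalidation rule, using an in-memory map standing in for Redis and the key formats from the list above:

```go
package main

import "fmt"

// Cache keys mirror the business semantics above. The in-memory map
// stands in for Redis; only the invalidation logic is the point here.
type Cache struct{ data map[string]string }

func productKey(id int) string          { return fmt.Sprintf("product:%d", id) }
func categoryKey(categoryID int) string { return fmt.Sprintf("category:%d:products", categoryID) }

// InvalidateProduct removes both the product entry and the category
// listing that contains it, so a stale list can never be served.
func (c *Cache) InvalidateProduct(id, categoryID int) {
	delete(c.data, productKey(id))
	delete(c.data, categoryKey(categoryID))
}

func main() {
	c := &Cache{data: map[string]string{
		productKey(42): `{"name":"Mug"}`,
		categoryKey(7): `[42,43]`,
	}}
	c.InvalidateProduct(42, 7)
	fmt.Println(len(c.data)) // both entries gone
}
```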

 

 

Schema Evolution Without Breaking Clients

 
GraphQL promises flexibility, but schema evolution is still a real operational concern.
 
In a fast-moving commerce environment, frontend and backend deployments are not always synchronized. Breaking changes in the schema can disrupt multiple teams.
 
Our approach was explicit versioning at the field level.
 
When introducing a new shape or behavior, instead of modifying an existing field, we introduced:
  • productPrice_v2
  • productPrice_v3
 
The older field was marked as deprecated:

 

type Product {
  productPrice: Money @deprecated(reason: "Use productPrice_v2")
  productPrice_v2: Money
}
 
We then observed usage of deprecated fields through schema analytics and resolver-level logging. Only after confirming that no active clients depended on the deprecated field did we remove it.
 
This approach provided:
  • Backward compatibility
  • Safe iteration
  • Clear migration paths
  • No surprise breakages
 
It does increase schema verbosity temporarily. But in a commerce platform where uptime and stability directly affect revenue, predictability outweighs aesthetic purity.

 

 

Concurrency and Context Discipline

 
Go makes concurrency simple. That does not mean it is safe by default.
 
Every resolver must respect context cancellation and timeouts.

 

Example:

 

func (r *Resolver) Product(ctx context.Context, id int) (*Product, error) {
    select {
    case <-ctx.Done():
        return nil, ctx.Err()
    default:
        return r.productService.GetByID(ctx, id)
    }
}

 

Downstream calls must propagate context properly:

 

func (r *ProductRepo) GetByID(ctx context.Context, id int) (*Product, error) {
    var p Product
    err := r.db.QueryRowContext(ctx, "SELECT ... WHERE id = $1", id).Scan( /* &p.… fields */ )
    if err != nil {
        return nil, err
    }
    return &p, nil
}

 

Ignoring context propagation leads to:
  • Wasted database queries
  • Goroutine leaks
  • Increased CPU under partial failures
 
Under load, these inefficiencies accumulate quickly.

 

 

Query Cost Control

 
GraphQL allows deeply nested queries. Without safeguards, a single request can become computationally expensive.
 
Defensive measures include:
  • Maximum query depth limits
  • Complexity scoring per field
  • Mandatory pagination for lists
  • Resolver-level timeouts
 
These constraints protect the database and cache layers from accidental or malicious misuse.
 
Flexibility without limits is instability.
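A self-contained sketch of a depth limit, using a hypothetical Field type standing in for the parsed query AST (real servers such as gqlgen expose this through their parser, and also ship built-in complexity limits):

```go
package main

import "fmt"

// Field is a minimal, hypothetical representation of a parsed GraphQL
// selection set, defined here only to keep the sketch self-contained.
type Field struct {
	Name     string
	Children []Field
}

// depth returns the maximum nesting depth of a selection set.
func depth(fields []Field) int {
	max := 0
	for _, f := range fields {
		if d := depth(f.Children) + 1; d > max {
			max = d
		}
	}
	return max
}

// allow rejects queries nested deeper than the limit before any
// resolver runs, protecting the database from pathological requests.
func allow(query []Field, maxDepth int) bool {
	return depth(query) <= maxDepth
}

func main() {
	// products { category { products } }  → depth 3
	q := []Field{{Name: "products", Children: []Field{{Name: "category", Children: []Field{{Name: "products"}}}}}}
	fmt.Println(allow(q, 5), allow(q, 2))
}
```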

 

 

Federation vs Modular Monolith

 
GraphQL Federation is supported in Go through tools like gqlgen and Apollo-compatible federation approaches.
 
However, federation introduces:
  • Cross-service latency
  • Schema coordination complexity
  • Operational overhead
  • Harder local development environments
 
For smaller teams or tightly coupled domains, a well-structured modular monolith in Go often provides:
  • Better performance
  • Lower cognitive load
  • Faster iteration
 
Federation should be driven by domain boundaries and team scale, not architectural fashion.

 

 

Observability is Not Optional

 
GraphQL can hide expensive work behind nested fields.
 
We instrumented:
  • Resolver latency
  • Database query duration
  • Cache hit ratios
  • Error rates per field
  • Query complexity distribution
 
Without field-level observability, performance regressions go unnoticed until customers feel them.
 
Scalability requires visibility.

 

 

Final Thoughts

 
GraphQL is neither inherently scalable nor inherently dangerous. It depends entirely on how it is designed.
 
In commerce systems, scalability is about:
  • Clear architectural boundaries
  • Batching and caching discipline
  • Safe schema evolution
  • Concurrency correctness
  • Controlled query complexity
  • Strong observability
 
The goal is not only high throughput.
 
The goal is predictable behavior as traffic grows and features evolve.
 
In backend engineering, scalability is less about speed and more about controlled complexity.

Best Practices for Writing Scalable and Maintainable Code

8 min read·Mar 22, 2025

 
Writing scalable and maintainable code isn't just about solving the immediate problem - it's about designing software that stands the test of time. We've all been there - coding a feature quickly and realizing later that the code becomes unmanageable. In this post, we'll explore how to avoid that fate by following best practices for scalable and maintainable code. Along the way, we'll also see how the SOLID principles can be applied to make your code more modular, flexible, and easier to maintain.

 

 

Modular Code

 
One of the core principles in software development is keeping your code modular and simple. By creating small, reusable functions and breaking down your code into manageable components, you make it easier to maintain and extend.

 

In Go, modularity can be achieved by dividing code into packages. This allows you to separate concerns and promote reuse. Let's look at an example from an e-commerce system, where we separate the logic for processing payments into its own package.

 

 

Example: Modular Code in Go (Processing Payments in an E-Commerce System)

 

// main.go
package main

import (
    "fmt"

    "myapp/payment"
)

func main() {
    transactionID := payment.ProcessPayment(100.50, "credit_card")
    fmt.Println("Transaction successful, ID:", transactionID)
}

 

// payment/payment.go
package payment

import "fmt"

// ProcessPayment handles payment logic and returns a transaction ID.
func ProcessPayment(amount float64, method string) string {
    fmt.Printf("Processing %.2f payment using %s...\n", amount, method)
    return "TXN123456" // Mock transaction ID
}

 

This example follows the Single Responsibility Principle (SRP) since the payment package only deals with payments. The Open/Closed Principle (OCP) is also supported: new payment methods (like PayPal) can be added by extending the payment package without modifying existing logic.

 

 

Design Patterns

 

Now that we've explored the importance of modularity in code and how it contributes to maintainability, it's time to turn our attention to design patterns. These patterns are reusable solutions to common problems that arise when building scalable and maintainable systems. By understanding and applying design patterns, you can structure your code more effectively, improve its flexibility, and keep it modular.

 

Let’s dive into some of the most widely used design patterns, such as Singleton, Factory, and Repository, and see how they can help your code follow SOLID principles while solving complex architectural problems.

 

 

Example: Singleton Pattern for Database Connection

 

A real-world use case for the Singleton pattern is a database connection manager, which ensures that only one connection instance exists.

 

package database

import (
    "database/sql"
    "sync"

    _ "github.com/lib/pq" // PostgreSQL driver
)

type Database struct {
    connection *sql.DB
}

var (
    instance *Database
    once     sync.Once
)

// GetInstance returns the single database connection instance.
func GetInstance() *Database {
    once.Do(func() {
        db, err := sql.Open("postgres", "user=admin dbname=mydb sslmode=disable")
        if err != nil {
            panic(err) // fail fast on misconfiguration; a real app would log and retry
        }
        instance = &Database{connection: db}
    })
    return instance
}

 

This follows SRP because the database package handles only DB connections. To also honor the Dependency Inversion Principle (DIP), high-level modules should depend on an interface around this connection rather than on the concrete *sql.DB.

 

 

Example: Factory Pattern for User Authentication

 

The Factory Pattern can be used in user authentication, where multiple authentication methods (email, OAuth, etc.) are supported without modifying existing logic.

 

package auth

type Authenticator interface {
    Authenticate(credentials string) bool
}

type EmailAuth struct{}

func (e EmailAuth) Authenticate(credentials string) bool {
    return credentials == "valid-email"
}

type OAuthAuth struct{}

func (o OAuthAuth) Authenticate(credentials string) bool {
    return credentials == "valid-oauth-token"
}

func AuthFactory(method string) Authenticator {
    switch method {
    case "email":
        return EmailAuth{}
    case "oauth":
        return OAuthAuth{}
    default:
        return nil
    }
}

 

This supports OCP (new authentication types can be added without modifying existing logic) and Liskov Substitution Principle (LSP) (any Authenticator implementation can replace another without altering the system's behavior).

 

 

Example: Repository Pattern for User Data Management

 

A use case for the Repository Pattern is handling user data management in a microservices architecture. This example also follows the Interface Segregation Principle (ISP) by ensuring that different services only depend on the specific methods they need.

 

package repository

import "errors"

// User represents a user in the system.
type User struct {
    ID       int
    Name     string
    Email    string
    Password string
}

// Segregated interfaces following ISP
type UserReader interface {
    GetUserByID(id int) (User, error)
    GetAllUsers() ([]User, error)
}

type UserWriter interface {
    CreateUser(user User) error
    DeleteUser(id int) error
}

type UserUpdater interface {
    UpdateUserEmail(id int, email string) error
    UpdateUserPassword(id int, password string) error
}

// Concrete implementation using an in-memory store
type InMemoryUserRepo struct {
    users map[int]User
}

// NewInMemoryUserRepo initializes the repository.
func NewInMemoryUserRepo() *InMemoryUserRepo {
    return &InMemoryUserRepo{users: make(map[int]User)}
}

// Implementing UserReader
func (repo *InMemoryUserRepo) GetUserByID(id int) (User, error) {
    user, exists := repo.users[id]
    if !exists {
        return User{}, errors.New("user not found")
    }
    return user, nil
}

func (repo *InMemoryUserRepo) GetAllUsers() ([]User, error) {
    var userList []User
    for _, user := range repo.users {
        userList = append(userList, user)
    }
    return userList, nil
}

// Implementing UserWriter
func (repo *InMemoryUserRepo) CreateUser(user User) error {
    if _, exists := repo.users[user.ID]; exists {
        return errors.New("user already exists")
    }
    repo.users[user.ID] = user
    return nil
}

func (repo *InMemoryUserRepo) DeleteUser(id int) error {
    if _, exists := repo.users[id]; !exists {
        return errors.New("user not found")
    }
    delete(repo.users, id)
    return nil
}

// Implementing UserUpdater
func (repo *InMemoryUserRepo) UpdateUserEmail(id int, email string) error {
    user, exists := repo.users[id]
    if !exists {
        return errors.New("user not found")
    }
    user.Email = email
    repo.users[id] = user
    return nil
}

func (repo *InMemoryUserRepo) UpdateUserPassword(id int, password string) error {
    user, exists := repo.users[id]
    if !exists {
        return errors.New("user not found")
    }
    // For simplicity, storing a plain-text password (not recommended in real apps)
    user.Password = password
    repo.users[id] = user
    return nil
}

 

This design adheres to ISP by ensuring that different services (e.g., authentication, user management, reporting) only depend on the necessary interfaces. It also promotes Separation of Concerns, making the code more maintainable and flexible.

 

 

Code Reviews

 

Regular code reviews are an essential practice for ensuring that code is both high-quality and maintainable. Not only do they help catch errors early, but they also provide opportunities for learning and improving. Code reviews promote knowledge sharing within a team and encourage adherence to best practices. I always appreciate feedback and enjoy helping others improve their code.

 

 

Naming Conventions

 

Clear and consistent naming conventions play a crucial role in making code more readable and understandable. Variables, functions, and class names should reflect their intended purpose, avoiding vague or overly generic terms. For instance, instead of naming a variable temp, opt for something descriptive like customerSessionData or articleList. This not only aids current developers but also benefits future developers who will work with the code.

 

In Go, the language encourages simplicity, and certain short names are widely accepted - such as err for an error value and ctx for a context.Context. However, consistency is key. Once you establish a naming convention, it’s important to stick with it across the codebase. This ensures that other developers can easily understand the intent behind the code, without needing to delve too deeply into documentation.

 

By following consistent naming conventions, your code becomes more approachable and understandable, improving collaboration and speeding up future development.

 

 

Test-Driven Development (TDD)

 

Test-Driven Development (TDD) is a practice I strongly believe in. Writing tests before the code ensures that your solution meets the required functionality from the outset. TDD also provides immediate feedback on whether your code is working as expected. It helps you catch bugs early, which ultimately saves time and resources in the long run.

 

 

Refactoring

 

Even after code is written, it's important to keep refactoring to improve its readability and efficiency. As a project grows, it's easy for code to become messy or redundant. Periodically revisiting the code and simplifying it helps maintain its quality and reduces the likelihood of technical debt.

 

 

Documentation

 

Clear documentation is essential for any project, especially for large teams or long-term projects. Writing concise comments for complex sections of code or creating external documentation for the overall project ensures that everyone is on the same page. Documentation serves as a helpful guide for future developers, saving them time in understanding the codebase.

 

 

Conclusion

 

Writing maintainable and scalable code is not only important for the short-term success of a project but also for long-term growth. It improves collaboration, ensures fewer bugs, and makes it easier to add new features or modify existing ones without breaking the system. By following these practices, you'll not only improve the quality of your code but also set your project up for long-term success, keeping it clean, flexible, and adaptable to future changes.

 

 

Recommended Reading

 

  • "Clean Code: A Handbook of Agile Software Craftsmanship" – Robert C. Martin
    A must-read for writing readable, maintainable, and efficient code. Covers naming, functions, comments, and structuring code.

 

  • "Design Patterns: Elements of Reusable Object-Oriented Software" - Erich Gamma, Richard Helm, Ralph Johnson, John Vlissides
    The bible of design patterns. Covers Factory, Singleton, Observer, Strategy, and more. Essential for software architecture.

 

  • "Agile Software Development, Principles, Patterns, and Practices" – Robert C. Martin
    This is where SOLID principles were first introduced in depth.

 

  • "The Pragmatic Programmer: Your Journey to Mastery" – Andy Hunt & Dave Thomas
    Covers a broad range of best practices, including modular design, automation, and debugging.

 

  • "Domain-Driven Design: Tackling Complexity in the Heart of Software" – Eric Evans
    Essential for designing modular, scalable, and maintainable applications with DDD principles.