Refactoring to CQRS

cqrs-kulendayz.tpetrina.dev

About me 👋

Toni Petrina
SRE @ Visma e-conomic
github.com/tpetrina
@tonipetrina1

What is CQRS?

Command Query Responsibility Segregation

Or, in its weaker form, Command Query Separation (CQS)

Simply put

separate commands (operations that change data) from queries (operations that read data).

Can be done at:

code level - requires diligence (sometimes libraries can help)
architecture level - harder, but more benefits
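
At the code level, the split can be as small as making sure each method either changes state or returns data, never both. A minimal sketch, using hypothetical `Counter` and `MixedCounter` classes that are not part of the talk:

```csharp
// Hypothetical example: a class that follows command/query separation.
public class Counter
{
    private int value;

    // Command: changes state, returns nothing.
    public void Increment() => value++;

    // Query: returns data, changes nothing.
    public int Current => value;
}

// Violates the split: one call both mutates state and returns a value.
public class MixedCounter
{
    private int value;

    public int IncrementAndGet() => ++value;
}
```

Calling `Current` any number of times is safe; calling `IncrementAndGet` twice gives two different answers.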

How do we do it today?

"Services" mix read and write operations
GETs that modify state (breaking referential transparency)
Lack of common interface between "operations"

An aside

In n-tier architecture, operations are clustered - all operations on User go in UserService and UserRepository

Hard to do higher-order operations e.g. decorators.

// No similarity between methods
public class UserService {
  public UserDto GetUser(int id) {}
  public IEnumerable<UserDto> GetUsers() {}
}
public class OrganizationService {
  // No similarity between this and GetUser
  public OrganizationDto GetOrganization(Guid id) {}
}

An aside - example

Motivation

In CRUD, domain entities are resources
Most UIs are task-based and expressive, REST is not (but RPC is)
REST is incompatible with a task-based approach (but RESTful isn't)

class UserController
{
  [HttpGet]
  ActionResult<UsersDto> GetUsers() { /* */ }
  [HttpGet]
  ActionResult<UserDto> GetUser(UserId) { /* */ }
  [HttpPost]
  ActionResult<UserId> CreateUser(UserDto) { /* */ }
  [HttpPut]
  ActionResult<Unit> UpdateUser(UserDto) { /* */ }
  [HttpDelete]
  ActionResult<Unit> DeleteUser(UserId) { /* */ }
}

Simple RESTful UserController

Another problem with non-semantic entry points

Controllers have generic actions
Services have generic interface
Databases have generic mapping (generic repository pattern)
Which leads to anemic domains and lots of boilerplate
Services, repositories and controllers are too similar, but can't be unified (role vs header interface)

Questions

How to know what is updated?
- Update email must have confirmation flow
- Update password is not the same as changing email preference
Fetching data is too generic
- Overfetching is rampant
- Hard to match fetch/update methods

CQRS is often combined with:

DDD - why aren't you using it anyway?
Event Sourcing - beyond the scope

Let's refactor!

record ChangeEmailDto(string NewEmail);
// new
class UserController {
  [HttpPost("change-email")]
  ActionResult<bool> ChangeEmail(ChangeEmailDto dto) {
    return _userService.ChangeEmail(dto);
  }
}
// vs
class UserController {
  [HttpPut("user/{id}")]
  ActionResult<Unit> UpdateUser(UserDto) { /* */ }
}

Semantic actions

interface IUserService {
  /* */
  bool ChangeEmail(ChangeEmailDto dto);
  /* */
}
class UserService : IUserService {
  /* */
  bool ChangeEmail(ChangeEmailDto dto)
  {
    /* */
  }
  /* */
}

Implementation in the service layer

class UserService {
  UserService(
    IUserRepository userRepository,
    IEmailService emailService,
    IStorage storage
  ) {}
  bool UpdateInfo() { /* */ }
  bool ChangeEmail() { /* */ }
  bool UploadAvatar() { /* */ }
  bool RemoveAvatar() { /* */ }
}

Explosion of dependencies

Problem

Becomes repetitive in a different sense
Service layer essentially becomes namespace/module
Performance hit in the request DI build-up phase
Proliferation of get methods - semantic ones (but it might be good!)

interface IUserService
{
  /* */
  bool ChangeEmail(ChangeEmailDto dto);
  /* */
}
// actually typed as:
// ChangeEmailDto => bool

Start with the original method

// let's introduce command instead of Dto
class ChangeEmailCommand
{
  int UserId { get; }
  string NewEmail { get; }
}

Obvious intention and data (serializable method call)

// and the handler
class ChangeEmailHandler
{
  ChangeEmailHandler(
    /* dependencies */
  )
  {}
  bool Execute(ChangeEmailCommand command)
  {
    /* */
  }
}

Handling commands is isolated

interface IUserService
{
  /* */
  bool ChangeEmail(ChangeEmailDto dto);
  /* */
}
// =>
class ChangeEmailHandler
{
  bool Execute(ChangeEmailCommand command) { /* */ }
}

Refactoring to command

// marker interface
interface ICommand {}
// generic handler
interface ICommandHandler<TCommand>
  where TCommand : ICommand
{
  bool Execute(TCommand command);
}
// our implementation
class ChangeEmailHandler
  : ICommandHandler<ChangeEmailCommand>
{
  bool Execute(ChangeEmailCommand command) { /* */ }
}

Refactoring to command

class UserController
{
  [HttpPost("change-email")]
  ActionResult<bool> ChangeEmail(
    [FromBody]     ChangeEmailDto     dto,
    [FromServices] ChangeEmailHandler handler)
  {
    var command = new ChangeEmailCommand(CurrentUser, dto.NewEmail);
    return handler.Execute(command);
  }
}

Time to update our user controller

class UserController
{
  ICommandHandler<ChangeEmailCommand> handler;
  UserController(ICommandHandler<ChangeEmailCommand> handler)
  {
    this.handler = handler;
  }
  [HttpPost("change-email")]
  ActionResult<bool> ChangeEmail([FromBody] ChangeEmailDto dto)
  {
    var command = new ChangeEmailCommand(CurrentUser, dto.NewEmail);
    return handler.Execute(command);
  }
}

Use Dependency Injection

Should Dto actually be a command?

Easier to evolve separately (contract)
Enrich command from ambient (e.g. controller) data
Merge route parameters and body
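
Keeping the two types apart could look like the sketch below. The `ToCommand` extension method and the `ChangeEmailMapping` class are hypothetical names introduced here for illustration; the record shapes are assumed from the earlier slides.

```csharp
// Contract sent by the client: only what the client is allowed to supply.
public record ChangeEmailDto(string NewEmail);

// Internal command: the complete, self-describing request.
public record ChangeEmailCommand(int UserId, string NewEmail);

public static class ChangeEmailMapping
{
    // userId comes from ambient data (a route parameter or the authenticated
    // user), never from the request body.
    public static ChangeEmailCommand ToCommand(this ChangeEmailDto dto, int userId)
        => new ChangeEmailCommand(userId, dto.NewEmail);
}
```

The Dto can now gain or lose fields without touching the handler, and the command never trusts a user id supplied by the client.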

Some design decisions

Naming: Execute, Handle, Do, _
Return type: void, Task, Task<Either<Error, Unit>>, Task<Result>
Manual construction of handlers vs DI
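
As one illustration of the return type decision, here is a minimal sketch of a handler that returns a result object instead of `bool`. The `Result` record is a hypothetical hand-rolled type, not a specific library's API, and the sketch is synchronous for brevity where real code might return `Task<Result>`.

```csharp
// Hypothetical result type: success flag plus an error message.
public record Result(bool IsSuccess, string Error)
{
    public static Result Ok() => new Result(true, "");
    public static Result Fail(string error) => new Result(false, error);
}

public record ChangeEmailCommand(int UserId, string NewEmail);

public class ChangeEmailHandler
{
    public Result Handle(ChangeEmailCommand command)
    {
        // Deliberately naive validation, just to produce a failure case.
        if (!command.NewEmail.Contains('@'))
            return Result.Fail("Invalid email address");

        // ... persist the change here ...
        return Result.Ok();
    }
}
```

Compared to `bool`, the caller learns why a command was rejected and can map the error to an HTTP response.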

On the query side

Similar interface IQuery<TQueryResult>
Similar handler IQueryHandler<TQuery, TQueryResult>
Also known as query object pattern
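
The two interfaces named above could be declared roughly as follows. This is a sketch mirroring the command side; `GetUserNameQuery`, `UserNameResult` and the in-memory handler are hypothetical names used only to show the shape.

```csharp
using System.Collections.Generic;

// Marker interface that ties a query to its result type.
public interface IQuery<TResult> { }

public interface IQueryHandler<TQuery, TResult>
    where TQuery : IQuery<TResult>
{
    TResult Execute(TQuery query);
}

// Hypothetical query and result.
public record UserNameResult(string Name);
public record GetUserNameQuery(int UserId) : IQuery<UserNameResult>;

// In-memory handler standing in for a real data source.
public class GetUserNameHandler
    : IQueryHandler<GetUserNameQuery, UserNameResult>
{
    private readonly IReadOnlyDictionary<int, string> names;

    public GetUserNameHandler(IReadOnlyDictionary<int, string> names)
        => this.names = names;

    public UserNameResult Execute(GetUserNameQuery query)
        => new UserNameResult(names[query.UserId]);
}
```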

Meanwhile... on the query side of my architecture

Why are repositories bad?

Either you specialize them too much or have generic repository
Not all reads are equal
Specification pattern can be replaced with query object pattern
Reading in commands is different than reading from the client!

class UserRepository
{
  UserDto[] GetUsers();
  UserDto GetUser();
  UsersDto GetUserForAccount(int accountId);
  UsersDto GetArchivedUsers(int accountId);
  /* */
}

Either you specialize them too much or have generic repository

class UserRepository {
  // for HttpGet
  // mapping required, not all data needs to be present
  // /user/1
  // or as a lookup
  UserDto GetUser();
  // enriching (JOINs)
  // /me
  UserDtoWithNotifications GetUserWithNotifications();
  /* */
}
public class UserDto {
  int Id { get; }
  string Name { get; }
  // have fun with nullable
  IEnumerable<UserNotificationDto>? Notifications { get; }
}

Not all reads are equal

class UserRepository
{
  // specification is custom for each repository
  // combinatorial explosion
  // lack of predictability
  // tedious to implement
  // might as well switch to IQueryable
  UserDto[] GetUsers(UserSpecification specification);
}

Specification pattern can be replaced with query object pattern

class UserRepository
{
  // for HttpGet
  // mapping required, not all data needs to be present
  // not tracked by EF
  UserDto GetUser();
  // for updating user (domain model)
  // tracked by EF
  UserEntity GetUser();
  /* */
}

Reading in commands is _different_ than reading from the client!

Recap - read models

Hydrate for command
Lookup data for command handler (validation or reference)
A view for UI or report
Admin view (for all) vs user view (for me) - multi-tenant scenarios

Query objects are powerful

Avoid using domain model completely!
Can use raw SQL (or Dapper as opposed to EF in the command model)
- Use views!
Caching can be done only for this layer
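
Because every query goes through the same handler interface, caching can be added as a wrapper around any handler. A minimal sketch, assuming queries are records (so equal queries compare equal) and ignoring expiry, invalidation and thread safety; `CachingQueryHandler`, `GetUserCountQuery` and `CountingHandler` are hypothetical names.

```csharp
using System.Collections.Generic;

public interface IQuery<TResult> { }

public interface IQueryHandler<TQuery, TResult>
    where TQuery : IQuery<TResult>
{
    TResult Execute(TQuery query);
}

// Remembers the result for each distinct query value.
public class CachingQueryHandler<TQuery, TResult>
    : IQueryHandler<TQuery, TResult>
    where TQuery : IQuery<TResult>
{
    private readonly IQueryHandler<TQuery, TResult> inner;
    private readonly Dictionary<TQuery, TResult> cache = new();

    public CachingQueryHandler(IQueryHandler<TQuery, TResult> inner)
        => this.inner = inner;

    public TResult Execute(TQuery query)
    {
        if (!cache.TryGetValue(query, out var result))
        {
            result = inner.Execute(query);
            cache[query] = result;
        }
        return result;
    }
}

// Hypothetical query plus a handler that counts how often it really runs.
public record GetUserCountQuery(int AccountId) : IQuery<int>;

public class CountingHandler : IQueryHandler<GetUserCountQuery, int>
{
    public int Calls { get; private set; }

    public int Execute(GetUserCountQuery query)
    {
        Calls++;
        return 5; // stand-in for a database call
    }
}
```

Wrapping `CountingHandler` in `CachingQueryHandler` and executing the same query twice should hit the inner handler only once.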

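using System.Collections.Generic;
using System.Data;
using System.Linq;
using Dapper;

// ActiveUser and its columns are assumed for illustration
record ActiveUser(int Id, string Name);
record GetActiveUsersResult(IReadOnlyList<ActiveUser> Users);
record GetActiveUsersQuery() : IQuery<GetActiveUsersResult>;
class GetActiveUsersHandler
  : IQueryHandler<GetActiveUsersQuery, GetActiveUsersResult>
{
  // use Dapper (extension methods on IDbConnection)
  IDbConnection connection;
  GetActiveUsersHandler(IDbConnection connection)
    => this.connection = connection;
  GetActiveUsersResult Execute(GetActiveUsersQuery query)
  {
    // Query<T> returns a sequence of rows, so wrap it in the result
    var users = connection.Query<ActiveUser>(
      "SELECT Id, Name FROM Users WHERE IsActive = 1");
    return new GetActiveUsersResult(users.ToList());
  }
}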

An example query

So if the read layer can look different...

Separating databases

SQL -> use replication and have replica used in the read model
Use a different database for reads (Mongo, Elasticsearch)
Key idea is that we can split read from write model
- Not all applications have similar load on the read and write sides
- Read model can be simplified (document) while modifications are done on the normalized database

Example 1: digital signage application

CMS for editing media items, playlists, channels
Items in multiple playlists, playlists in multiple channels (normalized model, rare writes)
End devices pick channel and play schedule (poll for update, JSON document)
Update [dbo].[Schedule] on every change
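
The idea in this example, a normalized write model flattened into one document per channel, can be sketched as below. The record shapes, the `ScheduleProjection` class and the JSON layout are all assumptions made for illustration; the real application's schema is not shown in the talk.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Normalized write model (hypothetical shapes).
public record MediaItem(int Id, string Url);
public record Playlist(int Id, List<int> ItemIds);
public record Channel(int Id, string Name, List<int> PlaylistIds);

public static class ScheduleProjection
{
    // Flattens a channel into the document that devices poll for.
    // Would be re-run whenever an item, playlist or channel changes.
    public static string Build(
        Channel channel,
        IReadOnlyDictionary<int, Playlist> playlists,
        IReadOnlyDictionary<int, MediaItem> items)
    {
        var urls = channel.PlaylistIds
            .SelectMany(playlistId => playlists[playlistId].ItemIds)
            .Select(itemId => items[itemId].Url)
            .ToList();

        return JsonSerializer.Serialize(
            new { channel = channel.Name, items = urls });
    }
}
```

Writes pay the cost of rebuilding the document; reads become a single lookup with no joins.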

Example 2: static content for mobile app

CMS for editing pages and their content
Devices poll for updates
Manual publishing - [dbo].[Cache] - it might take a minute

Spectrum of options:

Just separate commands and queries
Use optimized queries/views for reads
Add columns for caching
Add tables for caching
Different SQL instances (replicas)
Different DB technologies

Adopting CQRS increases performance

Faster to reason with (self-contained code)
Faster to run (smaller DI graph)
More manageable code (append-only files, fewer conflicts)
Easier to test

Testability

class TestUsers
{
  [Fact]
  void User__CanChangeEmail()
  {
    // Arrange
    var handler = new ChangeEmailHandler();
    var command = new ChangeEmailCommand(1, "user@email.com");
    // Act
    var result = handler.Execute(command);
    // Assert
    result.Should().BeTrue();
  }
}

Unit testing is easy
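
When a handler has dependencies, a hand-written fake is usually enough because each handler needs so few of them. A sketch under assumptions: `IEmailService.SendConfirmation` and `FakeEmailService` are hypothetical, and the handler body is a stand-in for the real logic.

```csharp
using System.Collections.Generic;

public record ChangeEmailCommand(int UserId, string NewEmail);

// Hypothetical dependency of the handler.
public interface IEmailService
{
    void SendConfirmation(string address);
}

// Hand-written fake that records calls instead of sending mail.
public class FakeEmailService : IEmailService
{
    public List<string> Sent { get; } = new();

    public void SendConfirmation(string address) => Sent.Add(address);
}

public class ChangeEmailHandler
{
    private readonly IEmailService emailService;

    public ChangeEmailHandler(IEmailService emailService)
        => this.emailService = emailService;

    public bool Execute(ChangeEmailCommand command)
    {
        if (string.IsNullOrWhiteSpace(command.NewEmail)) return false;

        // Changing an email must go through the confirmation flow.
        emailService.SendConfirmation(command.NewEmail);
        return true;
    }
}
```

A test then constructs the handler with `new FakeEmailService()`, executes a command and asserts on both the return value and the `Sent` list.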

[Test]
void UnverifiedNewUser__CannotCreateTasks()
{
  // Arrange
  new CreateUserCommandHandler().Execute(
    new CreateUserCommand("user@email.com"));
  // Act
  var result = new CreateTaskCommandHandler().Execute(
    new CreateTaskCommand(
    name: "Task name",
    label: "Green"
  ));
  // Assert
  result.Should().BeFalse();
}

Easy to do integration tests!

[Test]
void VerifiedNewUser__CanCreateTasks()
{
  // Arrange
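  // assumes the create handler returns the new user's id
  var id = new CreateUserCommandHandler(/* ... */)
    .Execute(new CreateUserCommand("user@email.com"));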
  new VerifyUserCommandHandler()
    .Execute(new VerifyUserCommand(id));
  // Act
  var result = new CreateTaskCommandHandler()
    .Execute(new CreateTaskCommand(
    name: "Task name",
    label: "Green"
  ));
  // Assert
  result.Should().BeTrue();
}

Easy to do integration tests!

But that's not all!

Decorators on commands extract cross cutting concerns.

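class EFTransactionHandler<TCommand>
  : ICommandHandler<TCommand>
  where TCommand : ICommand
{
  DbContext context;
  ICommandHandler<TCommand> handler;
  EFTransactionHandler(
    DbContext context,
    ICommandHandler<TCommand> handler)
    => (this.context, this.handler) = (context, handler);
  bool Execute(TCommand command)
  {
    // an exception in the inner handler disposes the transaction
    // without committing, which rolls it back
    using (var transaction = context.Database.BeginTransaction())
    {
      var result = handler.Execute(command);
      transaction.Commit();
      return result;
    }
  }
}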

Handler wraps another handler!

var handler = new EFTransactionHandler<ChangeEmailCommand>(
  _context,
  new ChangeEmailHandler(/* ... */)
);

Example usage
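
The same wrapping works for any cross-cutting concern. Below is a sketch of a logging decorator; `LoggingHandler` and the `Action<string>` log sink are assumptions made for illustration, and the `ChangeEmailHandler` body is a stand-in.

```csharp
using System;

public interface ICommand { }

public interface ICommandHandler<TCommand> where TCommand : ICommand
{
    bool Execute(TCommand command);
}

// Logs before and after delegating to the wrapped handler.
public class LoggingHandler<TCommand> : ICommandHandler<TCommand>
    where TCommand : ICommand
{
    private readonly ICommandHandler<TCommand> inner;
    private readonly Action<string> log;

    public LoggingHandler(ICommandHandler<TCommand> inner, Action<string> log)
        => (this.inner, this.log) = (inner, log);

    public bool Execute(TCommand command)
    {
        log($"Executing {typeof(TCommand).Name}");
        var result = inner.Execute(command);
        log($"{typeof(TCommand).Name} returned {result}");
        return result;
    }
}

public record ChangeEmailCommand(int UserId, string NewEmail) : ICommand;

public class ChangeEmailHandler : ICommandHandler<ChangeEmailCommand>
{
    // Stand-in logic so the sketch runs without a database.
    public bool Execute(ChangeEmailCommand command)
        => command.NewEmail.Contains('@');
}
```

Decorators stack: a `LoggingHandler` can wrap a transaction handler that wraps `ChangeEmailHandler`, and the controller still only sees `ICommandHandler<ChangeEmailCommand>`.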

Use a library

MediatR for C#
Helper classes in testing

Takeaway

Refactoring a traditional "service" object to CQRS starts with separating the read part from the write part. After that, the write part is split into task-based methods as opposed to resource-based methods. We can have multiple POST/PUT/PATCH methods, each with its own semantic meaning.

Questions?

Thank you!

Toni Petrina
@tonipetrina1

cqrs-kulendayz.tpetrina.dev