Overview

Led development of a high-throughput crypto trading platform (CoinSwitch Exchange – CSX), including an in-memory, event-sourced matching engine with strict ordering, exactly-once processing, and high availability.

Technologies Used

  • Distributed Systems
  • Event Sourcing
  • CQRS
  • Matching Engines
  • Go
  • Microservices
  • AWS
  • CloudWatch
  • Docker
  • Kubernetes

Context

CoinSwitch was building CSX, a crypto exchange platform with multiple order types, high throughput requirements, and strong correctness guarantees.

The system needed to ensure:

  • no order loss
  • strict sequential processing
  • financial correctness under concurrency
  • scalability across multiple trading pairs
  • operational clarity across many microservices

My Role

  • Technical lead and execution owner
  • Owned the matching engine and coordinated development across 7 microservices
  • Led and mentored a team of ~8 engineers
  • Coordinated with project managers and stakeholders on delivery and rollout
  • Responsible for:
    • architecture decisions
    • code reviews
    • deployment strategy
    • cross-team consistency and discipline

System Architecture Overview

Order Flow

  • API Gateway routed requests to the User Service:
    • user validation and authentication
  • Order request passed to the Order Service:
    • balance checks
    • order validation
  • Validated orders were placed onto a currency-specific input queue
  • Each currency pair had its own dedicated matching engine
  • Matching results were published to an output queue
  • Downstream services consumed updates to reflect order state

An additional AML verification step existed in the flow before final acceptance.

Matching Engine Design

  • In-memory, single-threaded:
    • avoided locking and race conditions
    • guaranteed strict ordering
  • Exactly-once semantics (logical)
  • Event-sourced:
    • every state transition recorded
    • full replayability of order books
  • Currency-partitioned:
    • each trading pair isolated
    • improved scalability and fault isolation

If a matching engine instance failed, another instance could:

  • replay events
  • rebuild state
  • resume processing without order loss

Order Handling Semantics

  • Market & Limit orders supported
  • Partial fills – remaining quantity stayed in the order book
  • Cancellations – removed from the matching engine if still open
  • State mutation was not allowed outside the matching engine
  • All external services treated matching results as immutable facts

CQRS & Microservices Discipline

  • CQRS applied to Orders:
    • write service: order creation & validation
    • read service: order status, history, views
  • Services communicated asynchronously
  • Duplicate processing prevented via order IDs
  • Clear ownership boundaries between services

Observability & Debugging

  • Built a shared logging module used across all services
  • Structured logs with:
    • requestId
    • correlationId
  • End-to-end tracing across microservices using these IDs
  • Debugging via CloudWatch log correlation
  • Designed for production readiness (work completed up to UAT)

Performance & Reliability

  • Benchmarked throughput at ~7,000 orders/sec
  • Designed for:
    • high availability via leader election
    • fast recovery using event replay
  • Focused on correctness first, performance second

Leadership & Execution

  • Enforced:
    • consistent logging and tracing
    • architectural boundaries
    • clear communication standards
  • Ruthlessly prioritized P0 / P1 work
  • Blocked scope creep and unfocused discussions
  • Mentored a team of early-career engineers new to Go
    • intensive PR reviews
    • teaching correctness, not just syntax

What I’d Improve If Rebuilding Today

  • Stronger idempotency guarantees across services
  • Formalized queue abstractions earlier
  • Better domain abstractions for products and pricing
  • Even stricter separation of concerns between orchestration and execution