Tech Stack

Thanks to a state-of-the-art tech stack, mainly written in Golang, TypeScript, and Scala and deployed by GitLab CI to modern container orchestration systems, our event-driven backend platform handles a few billion requests and events every day.

Looking for a job in tech?

1.5 billion requests / day

Hundreds of millions of mobile devices are served by our APIs, and we provide a global, fail-safe backend infrastructure.

Distributes incoming requests to the right target

Container Orchestration System

~200 services running
~1000 tasks on
~200 EC2 nodes

200 services running

We chose Golang as our primary language because it is a modern tool for building simple, reliable, and efficient software. It is also built for high concurrency, performance, and throughput. Another big advantage is that Golang compiles into small binaries with a small memory footprint, which results in very small Docker containers.
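To illustrate the concurrency point, here is a minimal sketch of a worker pool built from goroutines and channels. It is not taken from our codebase: `processAll`, its parameters, and the squaring handler are hypothetical names chosen for the example, and `workers` is assumed to be at least 1.

```go
package main

import (
	"fmt"
	"sync"
)

// processAll (hypothetical helper) fans the inputs out to a fixed number of
// worker goroutines and returns the results in input order.
// workers must be at least 1, otherwise nothing would read from the channel.
func processAll(inputs []int, workers int, handle func(int) int) []int {
	results := make([]int, len(inputs))
	jobs := make(chan int) // carries indexes into inputs

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				// Each index is written by exactly one goroutine,
				// so no lock is needed around results.
				results[i] = handle(inputs[i])
			}
		}()
	}

	for i := range inputs {
		jobs <- i
	}
	close(jobs) // lets the workers' range loops finish
	wg.Wait()
	return results
}

func main() {
	squares := processAll([]int{1, 2, 3, 4}, 2, func(n int) int { return n * n })
	fmt.Println(squares) // prints [1 4 9 16]
}
```

The number of workers bounds how much work runs at once, which is one common way to keep throughput high without overloading a downstream dependency.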

Billions of logs per day

Fluent Bit is our main log processor and forwarder. It makes all application logs from the ECS services available in our monitoring system.

Event Layer

More than a million events per minute

We believe in event-driven architecture. Our main event bus is SNS + SQS.
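The fan-out idea behind a topic with several subscribed queues can be sketched in memory as follows. This is only a model of the pattern, not the AWS SDK and not our production code: `Event`, `Topic`, `Queue`, and the queue names are hypothetical, and real SNS and SQS add delivery guarantees, retries, and visibility timeouts that are left out here.

```go
package main

import "fmt"

// Event is a hypothetical message envelope.
type Event struct {
	Type    string
	Payload string
}

// Queue is an in-memory stand-in for a message queue (the SQS role).
type Queue struct {
	Name     string
	messages []Event
}

// Topic is an in-memory stand-in for a publish/subscribe topic (the SNS role).
type Topic struct {
	subscribers []*Queue
}

// Subscribe registers a queue to receive every event published to the topic.
func (t *Topic) Subscribe(q *Queue) { t.subscribers = append(t.subscribers, q) }

// Publish delivers a copy of the event to every subscribed queue.
func (t *Topic) Publish(e Event) {
	for _, q := range t.subscribers {
		q.messages = append(q.messages, e)
	}
}

// Receive pops the oldest message and reports false when the queue is empty.
func (q *Queue) Receive() (Event, bool) {
	if len(q.messages) == 0 {
		return Event{}, false
	}
	e := q.messages[0]
	q.messages = q.messages[1:]
	return e, true
}

func main() {
	topic := &Topic{}
	billing := &Queue{Name: "billing"}
	analytics := &Queue{Name: "analytics"}
	topic.Subscribe(billing)
	topic.Subscribe(analytics)

	topic.Publish(Event{Type: "install", Payload: "device-1"})

	// Each consumer reads from its own queue, independently of the others.
	for _, q := range []*Queue{billing, analytics} {
		if e, ok := q.Receive(); ok {
			fmt.Printf("%s received %s (%s)\n", q.Name, e.Type, e.Payload)
		}
	}
}
```

Because every consumer owns its queue, a slow or failing consumer does not block the publisher or the other consumers.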

Makes it easy to generate events based on DynamoDB database activity

Billions of records and 2 TB of data per day

Large-scale event streaming

Logging and Metrics Layer

Because many open source projects officially support running their software in containers on a Kubernetes cluster, we use Kubernetes to run our monitoring platform, for example Elasticsearch and Kibana.

Billions of log events processed, stored and viewable – thanks to self-hosted Elastic

Awesome dashboard for our metrics

Observability plays a big role in our stack. CloudWatch is our main pillar for collecting and analysing metrics, not only for the AWS services themselves: all of our Golang applications also emit a lot of data points to give meaningful insights.

In addition to logging application errors, we’re using Sentry as our error tracking software to get an easy overview of what’s going wrong in the system.

Incident management tool – never miss an outage or service degradation

Logs and metrics are not enough? Tracing is another important part of observability in a distributed system. The X-Ray SDK helps us instrument our Golang applications to collect traces.

Database Layer

Millions of queries per minute

One of the most popular in-memory databases out there. We do a lot of key-value lookups, for which we use Redis as part of our caching layer.
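A key-value lookup through a cache is often written in the cache-aside style: check the cache first and fall back to the source of truth on a miss. The sketch below assumes that pattern and uses an in-memory map in place of a real Redis client, so `Cache`, `Lookup`, the loader function, and the key `app:42` are all hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// Cache is an in-memory stand-in for a key-value cache such as Redis.
type Cache struct {
	mu   sync.Mutex
	data map[string]string
}

// NewCache returns an empty cache.
func NewCache() *Cache { return &Cache{data: make(map[string]string)} }

// Get returns the cached value and whether the key was present.
func (c *Cache) Get(key string) (string, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	v, ok := c.data[key]
	return v, ok
}

// Set stores a value under the given key.
func (c *Cache) Set(key, value string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.data[key] = value
}

// Lookup tries the cache first and only calls load on a miss,
// storing the loaded value so later lookups are served from the cache.
func Lookup(c *Cache, key string, load func(string) (string, error)) (string, error) {
	if v, ok := c.Get(key); ok {
		return v, nil
	}
	v, err := load(key)
	if err != nil {
		return "", err
	}
	c.Set(key, v)
	return v, nil
}

func main() {
	loads := 0
	load := func(key string) (string, error) {
		loads++ // counts how often the slower source is hit
		return "value-for-" + key, nil
	}

	cache := NewCache()
	for i := 0; i < 3; i++ {
		v, _ := Lookup(cache, "app:42", load)
		fmt.Println(v)
	}
	fmt.Println("loads from source:", loads) // prints loads from source: 1
}
```

A real cache would also need expiry and invalidation, which this sketch leaves out.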

~300 tables in use

For all high-traffic, business-critical, frontend-facing applications, we’re using several hundred DynamoDB tables as persistence layers, which handle over a million reads and writes per minute.

4 TB data

On top of our data lake in S3, we’re running Redshift as our main data warehouse to provide unified access to our data, including tables from our MySQL databases.

1 TB data

No system is complete without a relational database. Of course we have these, too. They are a perfect fit for the entity management of our low-traffic internal management applications, which don’t need auto scaling.

10 times faster than Redshift

The engine that powers the dashboard our customers use to access their data. A powerful analytics database that provides interactive real-time queries with sub-second response times.

Big Data

Data Lake Pipeline

Billions of records and 2 TB of data per day

Able to handle any volume of our data and process it in real time

Firehose makes it possible to write this data in an optimal size and format (Parquet) for querying

Dozens of Lambdas

Connected to DynamoDB streams, captures table activity and sends it to Kinesis
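The shape of such a forwarding function can be sketched as below. This is an illustration under stated assumptions, not our actual Lambda code and not the AWS SDK: `StreamRecord`, `OutRecord`, `Sink`, and `Forward` are hypothetical types and names, the record layout is simplified, and using the table name as partition key is only a placeholder choice.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// StreamRecord is a simplified, hypothetical view of one table change.
type StreamRecord struct {
	EventName string            `json:"eventName"` // e.g. INSERT, MODIFY, REMOVE
	Keys      map[string]string `json:"keys"`
	NewImage  map[string]string `json:"newImage,omitempty"`
}

// OutRecord is a hypothetical stand-in for a record written to a stream.
type OutRecord struct {
	PartitionKey string
	Data         []byte
}

// Sink abstracts the stream client so the handler can run without AWS.
type Sink interface {
	Put(records []OutRecord) error
}

// Forward serializes each change as JSON and hands the whole batch to the sink.
func Forward(table string, changes []StreamRecord, sink Sink) error {
	out := make([]OutRecord, 0, len(changes))
	for _, c := range changes {
		data, err := json.Marshal(c)
		if err != nil {
			return fmt.Errorf("marshal change for %s: %w", table, err)
		}
		// Placeholder partition key; a real pipeline would pick a key
		// that spreads records across shards.
		out = append(out, OutRecord{PartitionKey: table, Data: data})
	}
	return sink.Put(out)
}

// memorySink collects records in memory for the demo.
type memorySink struct{ records []OutRecord }

func (m *memorySink) Put(records []OutRecord) error {
	m.records = append(m.records, records...)
	return nil
}

func main() {
	sink := &memorySink{}
	changes := []StreamRecord{
		{EventName: "INSERT", Keys: map[string]string{"id": "42"}, NewImage: map[string]string{"id": "42", "name": "demo"}},
		{EventName: "REMOVE", Keys: map[string]string{"id": "7"}},
	}
	if err := Forward("apps", changes, sink); err != nil {
		panic(err)
	}
	for _, r := range sink.records {
		fmt.Printf("%s %s\n", r.PartitionKey, r.Data)
	}
}
```

Hiding the stream client behind the small `Sink` interface keeps the transformation logic testable without any cloud resources.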

Glue is the backbone of this pipeline. It holds the schema and table information that is used for processing as well as for querying.

Data Science

Hundreds of CPUs in use

Spark is used for ad hoc analytics, transformations and migrations in our data lake and on arbitrary files in S3. As our data grows, this tool becomes indispensable.

Convenient web-based notebook to query our data lake via Spark and do ad hoc analytics.

Presto, fully managed in AWS Athena, is our first choice for running queries in our data lake. It is a distributed SQL query engine for running interactive analytics queries against data sources of all sizes, ranging from gigabytes to petabytes.

Deep Storage

300 TB data

At justtrack, S3 is our persistence layer for everything that doesn’t go into a database. Especially with the support of the Hadoop Distributed File System (HDFS), it is quite easy to use as the storage layer for our data lake.

Interested?

Get in touch and apply today!
