No-Index Log Management at S3 Scale

Database indexes are invaluable for information systems with low throughput, low latency and high consistency requirements. Creating indexes requires both compute and disk space, along with the operational overhead of maintaining them. Often the resources, time, and cost of maintaining indexes far outweigh the performance benefit they bring to the log management tool itself.

LOGIQ’s log analytics takes a unique no-index approach to log management, allowing infinite scale while preserving search and query performance. To achieve this, we have to solve the problem of infinite scale for both our data and metadata stores.

LOGIQ maintains its metadata in Postgres. However, Postgres cannot scale infinitely without incurring significant cost. Our hybrid metadata layer manages the migration of metadata tables between Postgres and S3. Old metadata is seamlessly tiered to S3 and fetched on demand when needed. The key/value nature of S3 lets us fetch granular metadata on demand without maintaining additional indexes.
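The tiering idea can be illustrated with a small sketch. This is not LOGIQ's implementation: the class name `HybridMetadataStore`, its methods, and the age-based policy are all hypothetical, and two plain dictionaries stand in for Postgres (hot tier) and an S3 bucket (cold tier) so that the example runs without external services.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class HybridMetadataStore:
    """Toy two-tier metadata store (hypothetical, for illustration only).

    `hot` stands in for Postgres and `cold` for an S3 bucket.
    """
    max_age_seconds: float
    hot: Dict[str, dict] = field(default_factory=dict)
    cold: Dict[str, dict] = field(default_factory=dict)

    def put(self, key: str, record: dict, created_at: float) -> None:
        """New metadata always lands in the hot tier."""
        self.hot[key] = {"created_at": created_at, "record": record}

    def tier_out(self, now: float) -> int:
        """Move entries older than max_age_seconds from hot to cold."""
        old = [k for k, v in self.hot.items()
               if now - v["created_at"] > self.max_age_seconds]
        for k in old:
            self.cold[k] = self.hot.pop(k)
        return len(old)

    def get(self, key: str) -> Optional[dict]:
        """Check the hot tier first, then fall back to a direct key
        lookup in the cold tier. No secondary index is consulted."""
        entry = self.hot.get(key)
        if entry is None:
            entry = self.cold.get(key)
        return None if entry is None else entry["record"]
```

The point of the sketch is the read path: once a record has been tiered out, `get` still finds it with a single lookup by key, which mirrors the key/value access pattern described above.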

A similar approach is applied to data. Incoming data is broken into chunks and stored in a partitioned manner, so object lookups for, say, a namespace or an application do not need additional indexes. The object key implicitly encodes the index information. This makes lookups and retrievals efficient when the requested data is not in the local disk cache and has to come from the S3 layer.
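As a rough sketch of how an object key can carry the index information, consider the layout below. The key format (`namespace/application/YYYY/MM/DD/HH/chunk-N.log`) and the function names are assumptions made for illustration, not LOGIQ's actual scheme, and a dictionary stands in for the bucket.

```python
from datetime import datetime, timezone
from typing import Dict, List


def chunk_key(namespace: str, application: str,
              ts: datetime, chunk_id: int) -> str:
    """Build an object key whose prefix encodes namespace,
    application, and the UTC hour of the chunk (hypothetical layout)."""
    hour = ts.astimezone(timezone.utc).strftime("%Y/%m/%d/%H")
    return f"{namespace}/{application}/{hour}/chunk-{chunk_id:06d}.log"


def list_chunks(bucket: Dict[str, bytes], prefix: str) -> List[str]:
    """Return the keys under a prefix, the way an S3 prefix listing
    would. `bucket` is an in-memory stand-in for a real bucket."""
    return sorted(k for k in bucket if k.startswith(prefix))
```

With keys built this way, a query such as "all apache logs in namespace prod for 17 May 2020" becomes a listing of the prefix `prod/apache/2020/05/17/`. The trailing slash matters: without it, the prefix `prod/apache` would also match an application named `apache2`.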

LOGIQ’s architecture offers unique advantages by using S3 as its primary storage location. Yes! S3 is not a secondary storage tier in our architecture.

  • S3 storage for data and metadata

    Storing both data and metadata in S3 rather than on local storage significantly reduces the total cost of the solution. Most scaled-out, self-service log analytics solutions require costly management of volumes at scale. LOGIQ abstracts storage behind an S3 API.

  • No-Index log management

    Eliminates the costly compute and storage that would otherwise be consumed constantly by indexing, index rebuilds, etc.

  • Eliminate data egress cost

    When running in public cloud environments, deploying LOGIQ with the S3 bucket in the same region avoids the egress and data transfer charges that can run into tens of thousands of dollars when sending data to an external cloud provider.

LOGIQ is the first real-time platform to bring together the benefits of an object store (scalability, one-hop lookup, better retrieval, ease of use, identity management, lifecycle policies, data archival, etc.) and distributed compute via Kubernetes, along with highly configurable dashboarding, query, alerting and search. As a result, we provide much lower cost, easy integration with other analytics tools, and operational agility.

Log Analytics finally got wings!

Log Analytics got wings with S3

By LOGIQ team

A few months back, a few of us came together with one mission: to give log analytics its much-desired wings. By the way, it’s not just AWS S3; you can use any S3-compatible store.

Logging is cumbersome, so we built the LOGIQ platform to bring easy logging to the masses. The platform is so easy that you can be up and running in less than 15 minutes. Send logs from Kubernetes, on-prem servers, or cloud virtual machines with ease.

What If

  • you can retain your logs forever
  • you don’t have to worry about how many gigabytes of data you produce
  • you can get all crucial events out of logs without a single query
  • you can access all of it using a simple CLI or a set of APIs
  • you can build cool real-time dashboards

And what if all of this comes with a fixed cost?

How Do We Do It?

We use S3 (or S3-compatible storage) as our primary storage for data at rest. The LOGIQ platform exposes protocol endpoints: Syslog, RELP, and MQTT. You can use your favorite log aggregation agent (syslog, rsyslog, Fluentd, Fluent Bit, or Logstash) to send data to the LOGIQ protocol endpoints. (Oh, we have REST endpoints too.) Our platform indexes all the data as we ingest it. Knowing where the data is makes it easier to serve your queries. Use our CLI/REST/gRPC endpoints to get data, and it is served directly from S3. Yes, “The Wings”.
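For applications that log directly rather than through an agent, a syslog endpoint can be reached with nothing more than the Python standard library. The sketch below is a generic syslog client, not LOGIQ-specific code: the helper name `make_syslog_logger` is made up for this example, and the host, port, and use of UDP are assumptions that should be replaced with the values of your own deployment.

```python
import logging
import logging.handlers
import socket


def make_syslog_logger(host: str, port: int, app_name: str) -> logging.Logger:
    """Return a logger that forwards records to a syslog endpoint.

    UDP transport is an assumption of this sketch; check which
    transport and port your endpoint actually listens on.
    """
    logger = logging.getLogger(app_name)
    logger.setLevel(logging.INFO)
    handler = logging.handlers.SysLogHandler(
        address=(host, port), socktype=socket.SOCK_DGRAM
    )
    # Prefix each message with the application name.
    handler.setFormatter(logging.Formatter(f"{app_name}: %(message)s"))
    logger.addHandler(handler)
    return logger
```

A call such as `make_syslog_logger("logiq.example.com", 514, "apache").info("GET /index.html 200")` would then emit one syslog datagram per log record; `logiq.example.com` and port 514 are placeholders.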

Dashboards, Covered!

If you think it’s just about the CLI and APIs, think again: we have cool customizable dashboards too, and a powerful SQL interface to dig deep into your logs.

Real-time, Covered!

Want to know what is happening on your servers in real time? Use our CLI to tail logs from your servers as they arrive, just like you would with tail -f.

> logiqbox tail -h 10.123.123.24 -a apache

What Do We Solve?

Most of the solutions out there hold data in proprietary databases that use disks and volumes to store their data. What’s wrong with that? Well, disks and volumes need to be monitored, managed, replicated, and sized. Put clustering in the mix, and your log analytics project is now managing someone else’s software and storage nightmare.

With S3-compatible storage, you abstract your storage behind an API. 10 GB, 100 GB, 1 TB, or 10 TB: it makes no difference.