Report: Treasure Data fair usage limits
Executive summary
Treasure Data publishes a detailed ICDP Fair Usage and Performance Policy that sets tiered limits for ingest, storage, instances, users, and sandbox usage. The policy is explicit about enforcement (notifications, cure periods, and automatic upgrades/overage charges) and prescribes operational best practices. In practice these limits are generous for many enterprise workloads (high ingestion rates, petabyte-scale storage and a stated ~2 million records/sec ingestion capability), but they can also produce unexpected cost and operational friction when customers exceed thresholds or attempt production-scale testing in Sandbox environments.
This report presents evidence from Treasure Data documentation and customer case studies, contrasts supportive and critical perspectives, and ends with a pragmatic design checklist and patterns to reliably operate within the fair usage boundaries.
What Treasure Data says (selected excerpts)
"If Customer exceeds any of the above limitations for their purchased ICDP Tier: Notification of the excess will be sent to the e-mail address associated with Customer’s Instance; Customer will have until the end of the month in which notice is sent to reduce usage below the limits; and If usage continues to exceed the specified limit at the end of the month in which notice is sent, Customer will be automatically upgraded to the next level per the 'Upgrade Pricing & Overages Table'..." (ICDP Fair Usage and Performance Policy).
"Customers shall not run production-level volumes and/or frequency of real-time data processing in Sandbox Instances. Customers shall not perform sustained or high-throughput load testing, including ongoing simulations of user activity or latency tracking, in Sandbox Instances." (Terms of Service / Sandbox guidance).
"Treasure Data will provision sufficient infrastructure resources for customers to ingest, cleanse, unify, segment, activate, orchestrate and analyze their profile and behavior data volumes while meeting the following minimum performance standards: Query Success Rate (QSR): 99.9%." (ICDP Fair Usage and Performance Policy).
The two perspectives: TD Advocate vs Skeptical DBA
TD Advocate (supportive)
- Documentation is explicit and tiered: monthly ingestion and daily storage row limits are published and permit very large volumes (for high tiers, examples cited include monthly ingestion up to billions of records and Tier 13+ allowances up to 10,000 billion records) (policy).
- Enterprise customers successfully run large workloads: case studies show ingestion spikes like "61 million records in under 10 minutes" and unified profiles in the millions, with measurable business value (campaign time reduction, improved ROI) (FinOps case study).
- Treasure Data offers architectural guidance, e.g., aggregation at source/destination, Fluentd-driven collection for containers, and Lambda-like patterns to combine batch and stream processing—practical advice to stay efficient within limits (Distributed logging blog).
Skeptical DBA (critical)
- Enforcement mechanics create operational and financial risk: automatic tier upgrades or invoices for overages can produce unexpected charges if workloads spike briefly or testing is mis-scheduled (policy excerpt).
- Sandbox limitations and the modest provisioning of Dev instances ("typically 10% of a Production Instance") mean load-testing or accurate production simulation in Sandbox is not feasible—leading to blind spots that can cause overages in production (ToS and Sandbox guidance).
- The public policy does not describe robust built-in monitoring or automated overage prevention; customers must self-manage and notify Treasure Data of planned usage increases, which is a procedural (not automated) safeguard (policy notification section).
Where the perspectives agree
- The platform is architected to scale and supports very high ingestion rates and large storage footprints, but customers are expected to design resource-efficient workloads.
- Treasure Data provides guidance (aggregation, sequential workloads, avoiding idle sessions) aimed at avoiding unnecessary resource usage.
- Exceeding published limits is handled via notification, cure periods, and either upgrade or invoiced overage—so budgeting and planning are essential.
Practical failure modes and operational risks (evidence)
- Automatic tier upgrades or invoiced overages when limits are persistently exceeded can create unplanned cost increases (policy).
- Sandbox instances are not production-like (10% of resources), restricting realistic load testing and increasing the chance of surprises when moving to production (ToS sandbox guidance).
- If customers do not proactively notify Treasure Data of planned increases, Treasure Data may defer non-critical workloads or apply fees—so reactive discovery of spikes is risky (policy notice).
Design principles to operate safely within fair usage limits
- Plan capacity and notify TD before big changes
  - Notify your Customer Success Manager at least 5 business days before planned increases in query volume or complexity to avoid surprises (policy guidance).
- Reduce surface area: aggregate at source and destination
  - Perform deduplication and summarization as close to the source as possible. Aggregate logs or events in Fluentd or edge collectors to reduce raw record counts (distributed logging); see the aggregation sketch after this list.
- Favor sequential, scheduled workloads over parallel bursts
  - Treasure Data recommends running workloads sequentially to limit resource contention and reduce burst-triggered overages.
- Partition and sample intelligently
  - Use partitioned writes and time-based batching to limit hot partitions and sudden ingestion spikes; for testing, use representative samples rather than full-volume tests in Sandbox (see the sampling sketch after this list).
- Use hybrid patterns where appropriate
  - Offload extremely high-frequency telemetry to purpose-built streaming systems (Kafka, Kinesis) and send only aggregated, enriched records to Treasure Data. This follows the Lambda-like guidance combining batch and stream processing.
- Instrument and automate guardrails
  - Implement ingestion throttles, quotas, and alerts in your collector layer (Fluentd, custom agents) so you can shed or buffer traffic before reaching Treasure Data limits.
- Optimize query patterns and caching
  - Use pre-aggregation, materialized views (where supported), and cached results to limit heavy ad-hoc query loads.
- Stage migrations and do dark launches
  - When moving large workloads to TD, ramp traffic in stages and use feature flags/dark launches to measure real consumption before committing.
- Treat Sandbox for integration tests only
  - Because Dev Sandboxes are intentionally small, keep performance/load testing to short, controlled windows coordinated explicitly with Treasure Data, or use a paid temporary higher-tier environment for safe testing.
- Contractual & cost controls
  - Negotiate clear overage terms, set alerts for automatic upgrades, and include budget guardrails in your procurement to limit financial exposure.
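As a concrete illustration of source-side aggregation, here is a minimal Python sketch. It assumes a simple event-dict schema and a hypothetical flush_to_collector() shipper (for example, a Fluentd forwarder or HTTP endpoint); it is not Treasure Data's SDK, and the field names are placeholders.

```python
"""Minimal sketch of source-side aggregation: dedupe and roll up raw events
into time-windowed summaries before shipping them downstream."""
import time
from collections import defaultdict

WINDOW_SECONDS = 60  # rollup window; tune to your latency needs


def aggregate_window(events):
    """Collapse raw events into one summary row per (user_id, event_type).

    Reduces the record count shipped downstream while preserving counts
    and last-seen timestamps for analytics rollups.
    """
    summaries = defaultdict(lambda: {"count": 0, "last_seen": 0})
    seen_ids = set()
    for ev in events:
        if ev["event_id"] in seen_ids:      # drop duplicates at the edge
            continue
        seen_ids.add(ev["event_id"])
        key = (ev["user_id"], ev["event_type"])
        summaries[key]["count"] += 1
        summaries[key]["last_seen"] = max(summaries[key]["last_seen"], ev["ts"])
    return [
        {"user_id": uid, "event_type": etype, **agg}
        for (uid, etype), agg in summaries.items()
    ]


def run(source, flush_to_collector):
    """Buffer raw events for one window, then ship only the summaries."""
    buffer = []
    window_end = time.time() + WINDOW_SECONDS
    for ev in source:                        # source yields raw event dicts
        buffer.append(ev)
        if time.time() >= window_end:
            flush_to_collector(aggregate_window(buffer))  # hypothetical shipper
            buffer, window_end = [], time.time() + WINDOW_SECONDS
```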
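For the sampling principle, the following sketch shows deterministic, per-user sampling for Sandbox tests. Sampling whole users rather than random rows keeps per-user histories intact so unification logic behaves realistically; the hashing scheme and 5% rate are illustrative assumptions, not Treasure Data requirements.

```python
"""Minimal sketch of deterministic per-user sampling for Sandbox testing."""
import hashlib

SAMPLE_RATE = 0.05  # e.g. 5% of users for a Dev/Sandbox instance


def in_sample(user_id: str, rate: float = SAMPLE_RATE) -> bool:
    """Stable per-user sampling: the same user is always in or out."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest[:8], 16) / 0xFFFFFFFF < rate


def sample_events(events, rate: float = SAMPLE_RATE):
    """Filter an event stream down to the sampled user population."""
    return (ev for ev in events if in_sample(ev["user_id"], rate))
```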
Concrete architecture patterns and recipes
- Source-side aggregation (recommended)
  - Use Fluentd or a short-lived collector to aggregate and deduplicate events into time-windowed summaries before shipping. This reduces record counts and preserves rollups for analytics. (See Fluentd integration guidance: Fluentd + Docker logging.)
- Hybrid stream + batch (Lambda-like)
  - Keep low-latency real-time routing for critical events; asynchronously aggregate and flush to TD for storage and heavy analytics. This limits write amplification and supports high-throughput use cases without constant spikes.
- Sequential job orchestration
  - Use a scheduler (Airflow, Prefect) to run heavy ETL jobs in non-overlapping windows and avoid concurrent load that can trigger overage detection; a minimal Airflow sketch follows this list.
- Throttled collectors with buffering
  - Collector agents should buffer to disk or SQS/Kafka when facing transient downstream pressure, and apply backpressure upstream to avoid sudden spikes; see the buffered-shipper sketch after this list.
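The sequential-orchestration pattern can be expressed as a small Airflow DAG. The sketch below assumes Airflow 2.4+ and uses a hypothetical run_td_job() placeholder in place of real Treasure Data query or workflow triggers; the nightly schedule and task names are illustrative only.

```python
"""Minimal Airflow sketch of sequential (non-overlapping) ETL scheduling."""
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def run_td_job(job_name: str):
    """Placeholder for triggering a Treasure Data query or workflow."""
    print(f"running {job_name}")


with DAG(
    dag_id="td_sequential_etl",
    start_date=datetime(2024, 1, 1),
    schedule="0 2 * * *",          # one nightly window for heavy jobs
    catchup=False,
    max_active_runs=1,             # never overlap two DAG runs
) as dag:
    ingest = PythonOperator(
        task_id="ingest_rollups",
        python_callable=run_td_job,
        op_args=["ingest_rollups"],
    )
    unify = PythonOperator(
        task_id="unify_profiles",
        python_callable=run_td_job,
        op_args=["unify_profiles"],
    )
    segment = PythonOperator(
        task_id="refresh_segments",
        python_callable=run_td_job,
        op_args=["refresh_segments"],
    )

    # Strictly sequential: each heavy job waits for the previous one.
    ingest >> unify >> segment
```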
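For the throttled-collector pattern, this is a minimal in-process sketch of bounded-rate shipping with a local buffer and explicit backpressure. ship_batch() is a hypothetical hook for your forwarder or queue, and the rate and buffer numbers are illustrative, not Treasure Data limits.

```python
"""Minimal sketch of a throttled shipper with local buffering and backpressure."""
import time
from collections import deque

MAX_RECORDS_PER_SECOND = 5_000   # illustrative throttle, set below your quota
BATCH_SIZE = 500
MAX_BUFFERED = 100_000           # beyond this, push back on producers


class ThrottledShipper:
    def __init__(self, ship_batch):
        self.ship_batch = ship_batch  # hypothetical downstream call
        self.buffer = deque()

    def offer(self, record) -> bool:
        """Accept a record, or refuse it (backpressure) when the buffer is full."""
        if len(self.buffer) >= MAX_BUFFERED:
            return False              # caller should slow down or spill to disk/Kafka
        self.buffer.append(record)
        return True

    def drain_forever(self):
        """Ship batches at a bounded rate; on failure, keep records buffered."""
        interval = BATCH_SIZE / MAX_RECORDS_PER_SECOND
        while True:
            if self.buffer:
                batch = [self.buffer.popleft()
                         for _ in range(min(BATCH_SIZE, len(self.buffer)))]
                try:
                    self.ship_batch(batch)
                except Exception:
                    # put the batch back in original order and retry later
                    self.buffer.extendleft(reversed(batch))
            time.sleep(interval)
```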
Monitoring & runbook (essential checks)
- Daily ingestion and storage dashboards (compare to tier limits).
- Alerts at 60%, 80%, and 95% of monthly ingest and daily storage quotas (see the sketch below).
- Pre-deploy checklist: notify TD CSM; run a 24–48 hour ramp with stepped volumes; review costs from TD billing portal.
- Overage runbook: detect -> notify stakeholders -> reduce/hold upstream ingestion -> request temporary quota bump -> confirm billing impact.
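A minimal sketch of the threshold alerts listed above, assuming hypothetical get_usage() and send_alert() hooks wired to your own metering (collector counters or billing exports) and paging or chat tooling; the quota constants are illustrative placeholders, not actual tier limits.

```python
"""Minimal sketch of threshold alerting against tier quotas."""
MONTHLY_INGEST_QUOTA = 2_000_000_000   # illustrative tier limit (records/month)
DAILY_STORAGE_ROW_QUOTA = 500_000_000  # illustrative tier limit (rows/day)
THRESHOLDS = (0.60, 0.80, 0.95)


def check_quota(name: str, used: int, quota: int, send_alert) -> None:
    """Fire one alert for the highest threshold crossed."""
    ratio = used / quota
    crossed = [t for t in THRESHOLDS if ratio >= t]
    if crossed:
        send_alert(
            f"{name}: {used:,} of {quota:,} ({ratio:.0%}) - "
            f"crossed {crossed[-1]:.0%} threshold"
        )


def run_checks(get_usage, send_alert) -> None:
    # e.g. get_usage() -> {"monthly_ingest": 1_300_000_000, "daily_rows": 410_000_000}
    usage = get_usage()
    check_quota("monthly ingest", usage["monthly_ingest"],
                MONTHLY_INGEST_QUOTA, send_alert)
    check_quota("daily storage rows", usage["daily_rows"],
                DAILY_STORAGE_ROW_QUOTA, send_alert)
```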
Quick checklist for product teams (practical)
- Review your ICDP Tier limits and map expected monthly ingest and daily rows to your forecast (policy link).
- Implement source-side aggregation and deduplication.
- Use throttles/buffers in collectors and implement backpressure.
- Schedule heavy jobs sequentially; avoid overlapping jobs.
- Use representative sampling for Sandbox tests; plan a paid ramp for production-scale tests.
- Add billing alerts and test your alert-triggered mitigation steps.
- Negotiate contractual protections and temporary lift processes with Treasure Data.
Illustrative quotes & citations
"A leading cloud-based financial operations platform ingested 61 million records in under 10 minutes, unified 6.8 million customer profiles, and identified 27,800 overlapping customers across business units within four weeks." (FinOps case study).
"Customers shall not run production-level volumes and/or frequency of real-time data processing in Sandbox Instances." (ToS sandbox guidance).
"If Customer exceeds any of the above limitations for their purchased ICDP Tier... Customer will be automatically upgraded to the next level per the 'Upgrade Pricing & Overages Table'..." (ICDP policy).
Conclusion & recommendation
Treasure Data's fair usage limits are explicit and, for many enterprise customers, more than adequate—backed by case studies showing massive ingestion and unification. However, operational and financial risks arise when workloads spike, when Sandbox testing is mistaken for production-scale validation, or when monitoring/automation is insufficient.
If you plan to use Treasure Data at scale:
- Start with a responsible capacity plan, notify Treasure Data about major changes, and negotiate clear overage and temporary-bump procedures.
- Use source-side aggregation, throttling, sequential job orchestration, and hybrid streaming patterns to reduce raw record counts and burstiness.
- Treat Sandbox as limited; use staged ramps or paid short-term environments for realistic testing.
Following the design principles and recipes above will minimize the chance of surprise overages or performance problems while allowing you to benefit from Treasure Data's scalable platform.
Inline follow-up topics (suggested next truth reports)
- [[how-to-negotiate-overage-protections-with-treasure-data|How to negotiate overage protections and temporary quota bumps with Treasure Data]]
- [[does-treasure-data-provide-real-time-usage-alerting-and-dashboards|Does Treasure Data provide real-time usage alerting and dashboards?]]
- [[how-to-design-fluentd-aggregators-for-treasure-data-ingest|How to design Fluentd aggregators for efficient Treasure Data ingest]]
- [[what-are-common-hidden-costs-when-scaling-cdp-workloads|What are common hidden costs when scaling CDP workloads on Treasure Data]]
- [[can-hybrid-lambda-architectures-minimize-treasure-data-ingest-costs|Can hybrid Lambda architectures minimize Treasure Data ingest costs]]
- What testing strategies reduce Sandbox-to-production surprises with Treasure Data
Sources: Treasure Data ICDP Fair Usage & Performance Policy and related product, blog, and customer pages (links embedded above).