The only way to work with 'real-time' data is to actually work with that data - not a facsimile or abstraction of it. To do this, we developed a way to simply connect your database directly to Chartio.
When you build a visualization in Chartio, it creates a query that is sent to the database and returns a set of results that are then charted. For most customers this is a lightweight way to get nearly immediate results from their data - hardware today is relatively fast, and queries and databases are relatively small. But for some customers, working directly with an active data source is unthinkable for security or performance reasons. Fortunately, there are plenty of good alternatives to connecting to an active production database that will still let you get the most out of Chartio.
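As a rough illustration of that flow (this is not Chartio's code; the table, query, and data below are made up), charting "live" data amounts to running a query against the database and plotting whatever comes back:

```python
import sqlite3
import matplotlib.pyplot as plt

# Hypothetical example: an in-memory database standing in for your production data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE signups (day TEXT, count INTEGER);
    INSERT INTO signups VALUES ('2013-01-01', 40), ('2013-01-02', 55), ('2013-01-03', 62);
""")

# The chart is simply the result set of a query run directly against the database.
rows = conn.execute("SELECT day, count FROM signups ORDER BY day").fetchall()
days, counts = zip(*rows)

plt.bar(days, counts)
plt.title("Daily signups")
plt.show()
```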
If query load on your database is a concern, whether you're using Chartio or something home-brewed, we recommend running your analytics on a replica database rather than directly on the production instance. Analytics queries often have a different shape than application queries and can place significant additional load on your database. You can create a live slave replica, or simply make a nightly copy. The advantage of the live slave is freshness, but the nightly copy tends to be much easier to set up.
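As a sketch of the nightly-copy approach - assuming PostgreSQL with pg_dump and pg_restore on the PATH, credentials handled via .pgpass, and hypothetical host and database names - the whole job can be a small script run from cron:

```python
import subprocess
from datetime import date

# Hypothetical hosts and database names; adjust for your environment.
PROD = {"host": "prod-db.internal", "db": "app_production", "user": "readonly"}
ANALYTICS = {"host": "analytics-db.internal", "db": "app_analytics", "user": "etl"}

def nightly_copy():
    dump_file = f"/var/backups/app_{date.today():%Y%m%d}.dump"

    # Dump the production database in PostgreSQL's custom format.
    subprocess.run([
        "pg_dump", "-h", PROD["host"], "-U", PROD["user"],
        "-Fc", "-f", dump_file, PROD["db"],
    ], check=True)

    # Restore the dump into the analytics database, replacing existing objects.
    subprocess.run([
        "pg_restore", "-h", ANALYTICS["host"], "-U", ANALYTICS["user"],
        "-d", ANALYTICS["db"], "--clean", "--if-exists", dump_file,
    ], check=True)

if __name__ == "__main__":
    nightly_copy()  # schedule via cron, e.g. nightly at 2am
```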
Each database in Chartio has an adjustable cache duration, which can be set to as long as a full day. Every time a query is run in Chartio, the cache is checked first to see whether the same query has been run recently. If it has been run within your established cache duration, the cached results are returned. If the cache has expired, a fresh query is run, and the stale cached data is displayed with a loading spinner while you wait.
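The caching logic boils down to a timestamp check before each query. Here is a minimal sketch of that idea (illustrative only; the names and structure are not Chartio's internals):

```python
import time

# Duration-based query caching: keyed by query text, expiring after CACHE_DURATION.
CACHE_DURATION = 24 * 60 * 60  # seconds; adjustable per database
_cache = {}  # query text -> (timestamp, results)

def run_query(query, execute):
    """Return cached results when fresh; otherwise hit the database."""
    now = time.time()
    cached = _cache.get(query)
    if cached and now - cached[0] < CACHE_DURATION:
        return cached[1]              # fresh enough: skip the database entirely
    results = execute(query)          # stale or missing: run the query for real
    _cache[query] = (now, results)
    return results
```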
Our friends with really, actually, very big data rely almost solely on summary tables as a window into the activity in their database. A common practice is to generate these summary tables in a separate database once per hour, and then use the collected database of these summary records to perform analytics.
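A bare-bones version of that hourly roll-up might look like the following sketch, using SQLite as a stand-in for both databases and hypothetical table and column names:

```python
import sqlite3

# Illustrative sketch: roll hourly aggregates from a production table into a
# separate summary database. The orders/amount schema is hypothetical.
def build_hourly_summary(prod_path="production.db", summary_path="summary.db"):
    prod = sqlite3.connect(prod_path)
    summary = sqlite3.connect(summary_path)
    summary.execute("""
        CREATE TABLE IF NOT EXISTS orders_hourly (
            hour TEXT PRIMARY KEY,
            order_count INTEGER,
            revenue REAL
        )
    """)

    # Aggregate the raw rows down to one row per hour.
    rows = prod.execute("""
        SELECT strftime('%Y-%m-%d %H:00', created_at) AS hour,
               COUNT(*), SUM(amount)
        FROM orders
        GROUP BY hour
    """).fetchall()

    summary.executemany(
        "INSERT OR REPLACE INTO orders_hourly VALUES (?, ?, ?)", rows)
    summary.commit()
```

Analytics queries then run against the small orders_hourly table instead of the raw production data.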
The downside of summary tables is the extra setup cost and the loss of discoverability. Every time you'd like to measure a new statistic, you need to set up new tables, build a process to populate them with the summarized data, and then wait for a significant amount of data to accumulate.
Third-party warehousing is the process of uploading a copy of your database off-site, usually for analytics, auditing, or data retention purposes. In typical warehousing projects the data is 'cleaned' or reformatted in some way to make analytics faster and more reliable.
At Chartio, we are against the idea of warehousing your data with someone else for a few key reasons. First, you've already incurred many costs recording and storing your data - in salary, hard drive space, and engineering effort. Warehousing forces you to duplicate that cost by paying for storage and services off-site. As with duplicated databases, warehousing often has a painfully slow recency interval - a week or more in many cases.
Security is another issue - not only are you giving a third party access to your entire database, you're giving them ownership too. You can no longer know for certain who has access to your data, how it is being used, or where it is being sent.
Finally, the old data processing steps of cubing, schema manipulation, and purpose-built analytics databases aren't as necessary anymore. Advances in hardware and database design have made returning results from database queries far faster than when warehousing was first invented.