Cloud9 SaaS Analytics Platform
Introduction
Cloud9 Analytics’ mission is to deliver powerful business analytic applications for revenue performance management as a service over the Internet. As a SaaS vendor, the company must satisfy the expectations of line-of-business managers, the typical SaaS buyers. The fast-growing SaaS marketplace, exemplified by companies like cloud CRM provider salesforce.com, has produced a checklist for successful SaaS offerings: they must be packaged applications that are domain- and role-aware, deliver zero time-to-value, and require no IT staff or skills.
While SaaS vendors of transactional applications are finding great acceptance, no vendor before Cloud9 Analytics has delivered revenue performance management applications that meet all these criteria. The Cloud9 Analytics platform architecture was designed specifically to do so, and was shaped by five key design requirements:
- Automated Data Warehousing
- Lazy Analytics
- Mass Customization
- Zero Administration
- Cloud Scalability
Each of these requirements led to novel decisions about technology architecture and implementation.
Automated Data Warehousing
The last three decades of IT experience with on–premise data warehousing can be typified by projects that are slow, incomplete and expensive. The mix of multiple tools and skills (database, ETL, quality, design, operations, etc.), complex business requirements, performance challenges, data volume growth and increasing velocity of business change led to expensive, failure-prone initial projects and large operational teams and overhead dedicated to ongoing maintenance.
Clearly, replicating this type of data warehousing project “in the cloud” makes little sense. The economics of subscription SaaS offerings can only be supported by fully automated, multi-tenant (or, even better, pooled) infrastructure. If teams of IT professionals were required to support each analytics customer, the SaaS model would not work.
Cloud9 Analytics recognized early on that a new approach to data warehousing was required in the SaaS domain, one that still delivered the “system of record” for historical truth, but one based on very different architectural considerations and technological capabilities.
Versioned Replication
Cloud9 started by deconstructing the traditional data warehousing infrastructure and processes, then synthesized a radically new approach that it calls Versioned Replication. This innovation delivers the automation, flexibility and scale required to truly bring analytic solutions into the cloud era.
In traditional data warehousing, decisions about how the warehoused data will eventually be used are made up front, and locked into the physical structure of the data warehouse. First, teams of database designers, subject matter experts and business analysts work together to design large, complex schemas. Then database administrators instantiate those schemas in traditional relational databases and tune them carefully for performance. Finally, ETL developers define custom data extraction, transformation and loading scripts that physically move and reshape data from source systems into the data warehouse.
In Cloud9 Analytics’ Versioned Replication approach, no assumptions are made up front about how the warehoused data will eventually be used. The warehouse is managed separately from the solutions that will eventually be built on it. This allows the data warehouse tier to efficiently become the system of record for historical truth, and nothing more.
Cloud9’s automated data warehousing service comprises two main technologies: a replication service and a versioned database.
Replication Service
Operationally, Cloud9’s Replication Service automatically introspects source systems, determines their structure and data content, then replicates the full schemas in Cloud9 Analytics’ systems. As metadata and data change in the source systems, Cloud9’s replication service efficiently captures just those changes and, again, replicates them in the company’s systems.
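The idea of capturing "just those changes" can be illustrated with a minimal sketch. The source does not say how the Replication Service detects changes, so the snapshot comparison below is an assumption made for illustration, and the names `diff_snapshots`, `Row` and `Snapshot` are hypothetical. The sketch covers row data only, not metadata changes.

```python
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]
Snapshot = Dict[str, Row]  # primary key -> row (hypothetical in-memory form)


def diff_snapshots(old: Snapshot, new: Snapshot) -> List[Tuple[str, str, Row]]:
    """Return (operation, key, changed_columns) events that turn old into new.

    Assumption: changes are found by comparing two snapshots. A real service
    might instead rely on source-side timestamps or change logs.
    """
    events: List[Tuple[str, str, Row]] = []
    for key, row in new.items():
        if key not in old:
            events.append(("insert", key, dict(row)))
            continue
        before = old[key]
        # Keep only the columns whose values are new or different.
        changed = {col: val for col, val in row.items()
                   if col not in before or before[col] != val}
        if changed:
            events.append(("update", key, changed))
    for key in old:
        if key not in new:
            events.append(("delete", key, {}))
    return events


old = {"a1": {"name": "Acme", "stage": "Prospect"},
       "a2": {"name": "Globex", "stage": "Closed"}}
new = {"a1": {"name": "Acme", "stage": "Negotiation"},
       "a3": {"name": "Initech", "stage": "Prospect"}}
events = diff_snapshots(old, new)
for event in events:
    print(event)
```

Only the modified `stage` value of row `a1` is emitted for the update, which keeps the volume of replicated data proportional to what actually changed.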
The Replication Service is designed to work across the Internet, and with many different types of source systems. Internet-based communications protocols are not transactional, so referential integrity in the replicated data cannot be assumed. A sophisticated set of recovery and cleansing algorithms is employed to adapt to real-time network and source system behavioral anomalies, and to post-process the replicated data before it is integrated into the data warehouse.
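One kind of cleansing step implied here can be sketched as follows. Cloud9's actual algorithms are not described in this document, so this is only an assumed example: child rows whose parent reference has not yet arrived are held back and retried once a later batch delivers the parent. The function `split_orphans` and the field names are hypothetical.

```python
from typing import Any, Dict, List, Set, Tuple

Row = Dict[str, Any]


def split_orphans(children: List[Row], parent_keys: Set[str],
                  fk: str) -> Tuple[List[Row], List[Row]]:
    """Split child rows into (ready, held) by whether their parent is known.

    Rows with no reference at all (None) are treated as ready.
    """
    ready: List[Row] = []
    held: List[Row] = []
    for row in children:
        ref = row.get(fk)
        if ref is None or ref in parent_keys:
            ready.append(row)
        else:
            held.append(row)  # parent not replicated yet; retry later
    return ready, held


accounts = {"a1", "a2"}
opportunities = [
    {"id": "o1", "account_id": "a1"},
    {"id": "o2", "account_id": "a9"},   # parent account not received yet
    {"id": "o3", "account_id": None},
]
ready, held = split_orphans(opportunities, accounts, "account_id")

# A later batch delivers account a9, so the held row can now be integrated.
ready_later, still_held = split_orphans(held, accounts | {"a9"}, "account_id")
```

Holding rows back rather than rejecting them reflects the non-transactional setting described above: a missing parent is more likely to be late than absent.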
The replication approach to data warehouse construction eliminates the need to hand design customer–specific warehouse schemas, or to manually craft complex ETL scripts to move and transform data. Cloud9’s replication service does all that automatically.
Versioned Database
A replication methodology has many advantages, but also some challenges, especially for a data warehouse. One of the primary responsibilities of a data warehouse is to accumulate the historical data that is continuously overwritten and/or periodically purged from source systems. Typically, with naive replication into an off–the–shelf relational database, whenever a data value or metadata entity is updated or deleted in the source systems, the destructive change propagates to the replicated copy and history is lost.
Cloud9’s solution to this problem is to preserve the advantages of replication, but remove the limitations of naive replication into standard databases. To that end, the company has developed a proprietary data management technology called a versioned database, which is much like a familiar relational database, but with a few key advantages.
- 1) Update and Delete events are not destructive.
- In a versioned database, when a destructive operation is requested, instead of overwriting and/or removing the previous value of the data, the system simply creates a new logical version of the database and appends the new value. Changes to the database are therefore cumulative instead of destructive, giving the versioned database a built-in native time dimension.
- 2) Changes are efficiently managed.
- Because old values are never erased, a versioned database grows as changes accumulate. It is therefore important to optimize the representation of versions so that storage grows only with the actual size of the changed values. As an example, if a single value is modified in a row, only the new atomic value is stored, not a new, slightly modified copy of the entire row.
- 3) Structural changes are versioned like any other change.
- A versioned database must adapt to ongoing schema changes automatically. When tables and/or columns are created, modified and dropped in source systems, the replicated data warehouse must reflect those changes for go-forward versions, while retaining the previous structure for historical versions.
- 4) All versions of the database are equally accessible.
- A versioned database must be queryable within the current version, within historical versions, and across versions with constant overhead. To efficiently support analytic applications, the overhead of querying a particular version should be proportional to the size of the database, not to the effective date of any particular version.
Cloud9 provides both proprietary (object-oriented API) and industry standard (SQL-92 with version extensions) interfaces to its versioned database technology. The company’s versioned database technology serves as the historical system of record for Cloud9 applications, and works hand in hand with the analysis service that powers the end user application experience.
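The four properties listed above can be made concrete with a small in-memory sketch. This is not Cloud9's proprietary implementation or its API. The class `VersionedTable` and its methods are hypothetical, written only to show appended versions, per-value storage and "as of" reads. Schema versioning is covered only in the limited sense that a column first appears in the version that first writes it.

```python
from bisect import bisect_right
from typing import Any, Dict, List, Optional, Tuple

Cell = Tuple[List[int], List[Any]]  # parallel lists: versions, values


class VersionedTable:
    """Append-only table: updates and deletes add versions, never overwrite."""

    def __init__(self) -> None:
        self.version = 0
        self._cells: Dict[str, Dict[str, Cell]] = {}          # key -> column -> history
        self._alive: Dict[str, Tuple[List[int], List[bool]]] = {}  # key -> existence history

    def _mark(self, key: str, version: int, exists: bool) -> None:
        versions, flags = self._alive.setdefault(key, ([], []))
        if not flags or flags[-1] != exists:  # record transitions only
            versions.append(version)
            flags.append(exists)

    def upsert(self, key: str, values: Dict[str, Any]) -> int:
        """Write only the supplied columns; returns the new version number."""
        self.version += 1
        self._mark(key, self.version, True)
        columns = self._cells.setdefault(key, {})
        for col, val in values.items():
            versions, vals = columns.setdefault(col, ([], []))
            versions.append(self.version)
            vals.append(val)
        return self.version

    def delete(self, key: str) -> int:
        """Logical delete: history before this version stays readable."""
        self.version += 1
        self._mark(key, self.version, False)
        return self.version

    def read(self, key: str, as_of: Optional[int] = None) -> Optional[Dict[str, Any]]:
        """Return the row as of a version (default: the current version)."""
        v = self.version if as_of is None else as_of
        versions, flags = self._alive.get(key, ([], []))
        i = bisect_right(versions, v)
        if i == 0 or not flags[i - 1]:
            return None
        born = versions[i - 1]  # version at which this row was (re)created
        row: Dict[str, Any] = {}
        for col, (cell_versions, vals) in self._cells.get(key, {}).items():
            j = bisect_right(cell_versions, v)  # binary search, not a scan
            if j and cell_versions[j - 1] >= born:
                row[col] = vals[j - 1]
        return row


t = VersionedTable()
v1 = t.upsert("opp1", {"stage": "Prospect", "amount": 1000})
v2 = t.upsert("opp1", {"stage": "Closed"})  # stores one new value, not a row copy
v3 = t.delete("opp1")

print(t.read("opp1", as_of=v1))  # {'stage': 'Prospect', 'amount': 1000}
print(t.read("opp1", as_of=v2))  # {'stage': 'Closed', 'amount': 1000}
print(t.read("opp1"))            # None
```

Because each value history is searched with a binary search, the cost of reading an old version in this sketch depends on how many times a value changed, not on how long ago the requested version was created.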
The same security you already have
Finally, SaaS applications would be worth little if they did not provide the same security that users and IT departments have come to expect from their traditional CRM systems. To that end, the Cloud9 Analytics platform matches, literally, the security that the best-established CRM systems offer:
- Full enforcement of CRM authentication and authorization
- No additional security administration required
- Uniform security model across all user experiences (even ODBC)
- Encrypted communications
- Data center certification
- Per–customer data segmentation
Bottom Line
Delivering analytic applications with a true SaaS model requires innovation in both the methodology and mechanics of data warehousing. Simply applying traditional on-premise models does not work. Cloud9 Analytics has developed a novel data warehousing methodology, Versioned Replication, and a set of proprietary technologies that together deliver automated data warehousing and serve as the foundation for the company’s unique revenue performance management applications.