NoSQL and Elastic Caching in Papyrus

Mike Gualtieri wrote about NoSQL and Elastic Caching on his Forrester Research blog on Application Development. Quote: ‘The NoSQL idea is pretty simple: Not all applications need a traditional relational database management system (RDBMS) that uses SQL to perform operations on data. Rather, data can be stored and retrieved using a single key. The NoSQL products that store data using keys are called Key-Value stores (aka KV stores).’ Mike sums up the difference as follows: ‘Ultimately, the real difference between NoSQL and elastic caching may be in-memory versus persistent storage on disk.’
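To make the distinction in the quote concrete, here is a minimal Python sketch of a key-value store with an in-memory layer in front of persistent storage on disk. It is an illustration only and has nothing to do with Papyrus internals or any particular NoSQL product; the class `CachedKVStore`, the `persist` flag, and the sample keys are all made up for this example.

```python
import os
import shelve
import tempfile


class CachedKVStore:
    """Hypothetical KV store: an in-memory dict in front of an on-disk shelf."""

    def __init__(self, path):
        self._path = path
        self._cache = {}  # in-memory layer, lost when the process restarts

    def put(self, key, value, persist=True):
        # Every value goes into memory; only persistent values go to disk.
        self._cache[key] = value
        if persist:
            with shelve.open(self._path) as disk:
                disk[key] = value

    def get(self, key, default=None):
        # Serve from memory when possible, otherwise fall back to disk.
        if key in self._cache:
            return self._cache[key]
        with shelve.open(self._path) as disk:
            if key in disk:
                value = disk[key]
                self._cache[key] = value  # warm the cache for the next read
                return value
        return default


def demo():
    path = os.path.join(tempfile.mkdtemp(), "kv")
    store = CachedKVStore(path)
    store.put("customer:42", {"name": "Ada"})                   # persisted
    store.put("session:abc", {"cart": [1, 2]}, persist=False)   # memory only
    restarted = CachedKVStore(path)  # simulates a restart with an empty cache
    return restarted.get("customer:42"), restarted.get("session:abc")
```

After the simulated restart, the persisted customer record is still readable by its key, while the memory-only session entry is gone. That is the trade-off the quote describes.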

I posted about the powerful clustering and caching algorithms of the Papyrus Platform some time back, so it was interesting to read about combining NoSQL and Elastic Caching. The Papyrus Platform uses both concepts at its lowest layer to support the metadata repository, the rule engine, and the distributed, object-relational database and transaction engine. Even the strict security layer and the easy-to-use thick- and thin-client GUI frontends benefit from the powerful object replication and caching.

  • Reliability and Scaling: Papyrus offers the benefits of reliability and scaling through replication. Persistence and storage management are defined per object type and per node type. Data can be spread across thousands of nodes. User PCs can also have their own local node and storage. That will even be true for mobile phone users once our mobile kernel becomes available later this year for iPhone, WinMobile, and Symbian.
  • Fast Key-Value Access: Papyrus supports straight key-value access but also PapyrusQL object-relational access (similar to XPath), offering query and search across data in widely distributed KV storage nodes. Those can also be offline (dumped to tape or DVD).
  • Distributed execution: Papyrus executes object-state-engines and methods (implemented in PQL), events, and rules. The application is deployed automatically to the local node where the data resides or to any other chosen node. It does not take developers (clever or not) to distribute the load across multiple servers.
  • Change of data structures: Thanks to the Papyrus WebRepository and its class versioning, we can add fields to objects without the need to restructure database tables. New instances will simply have the new fields. Data is NOT stored in XML format because XML parsing performance is dreadful. Papyrus uses field-length-keyed hex-codepaged strings that can be parsed 20 times faster.
  • Latency: Papyrus can use transient objects that are not saved to disk when the data does not have to be persisted. This significantly reduces the latency of data operations. In-memory operation is thus not a downside for large or persistent objects because it can be chosen per object type (class or template).
  • Reliability: Papyrus provides distributed caching with data replication algorithms to store the data on multiple nodes. If one of the nodes goes down, the load balancer in V7 will move the user session to another node and continue with the proxy objects there. A more efficient object distribution for an HA cluster will be available in Q4 2010.
  • Scale-out: With Papyrus you can add and remove nodes during operation. Currently the application can choose how the objects are distributed across nodes. The next release in Q4 2010 will provide this distribution at the system level as a part of the backup and recovery procedure.
  • Execute in data location: Using distributed code execution, developers can distribute the workload to where the data resides rather than moving the data to the application. Execution of methods on the owner node of the tool is the basic functionality. Full distribution is no problem with PQL.
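The 'Change of data structures' point above is easier to picture with a small example. The Python sketch below uses a made-up length-prefixed text encoding to show why such records tolerate added fields: a record written before a field existed still parses, and the missing field simply reads as empty. This is an assumption-laden illustration, not the actual Papyrus storage format; the layout (2 hex digits for the name length, 4 hex digits for the value length) and the functions `encode`, `decode`, and `load` are hypothetical.

```python
def encode(fields):
    """Serialize a dict of str -> str as length-prefixed name/value pairs.

    Hypothetical layout per field: 2 hex digits (name length), the name,
    4 hex digits (value length), the value. Names must be shorter than
    256 characters and values shorter than 65536 characters.
    """
    parts = []
    for name, value in fields.items():
        parts.append(f"{len(name):02x}{name}{len(value):04x}{value}")
    return "".join(parts)


def decode(record):
    """Parse a record by jumping from length prefix to length prefix."""
    fields = {}
    pos = 0
    while pos < len(record):
        name_len = int(record[pos:pos + 2], 16)
        pos += 2
        name = record[pos:pos + name_len]
        pos += name_len
        value_len = int(record[pos:pos + 4], 16)
        pos += 4
        fields[name] = record[pos:pos + value_len]
        pos += value_len
    return fields


def load(record, known_fields):
    """Read a record against the current class version.

    Fields added to the class after the record was written come back as None.
    """
    stored = decode(record)
    return {name: stored.get(name) for name in known_fields}
```

For example, a record written as `encode({"id": "7", "name": "Ada"})` can later be read with `load(record, ["id", "name", "email"])` after an `email` field has been added to the class. No stored data has to be restructured, and the parser never scans for delimiters or closing tags.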

Enterprise application developers and architects do not have to create architectures with the above features because they are embedded in the Papyrus Platform peer-to-peer kernel engine. Papyrus thus provides all the benefits of NoSQL and Elastic Caching without the technical complexity:

  • Achieve savings by reducing RDBMS licenses and maintenance.
  • Add a scaling layer in front of databases, SOA, or MQ messaging.
  • Build Web applications with shared session and application data.