Papyrus Application Scalability

Many Papyrus Platform installations have recently grown substantially, and application scalability has therefore become a common subject. In principle, the scalability of the Papyrus Platform is unlimited due to its peer-to-peer cluster design, but in practice it is restricted by the synchronicity needs of the application. Simply reading or displaying documents or content from storage has no limitation, but keeping multiple write accesses in sync creates scalability issues. Clearly, when the number of users doubles from 1000 to 2000, that requires monitoring and, if necessary, scaling the hardware. When the number of documents or process tasks doubles from 1000 to 2000 per hour, that also requires consideration of how that load can be safely spread across server nodes.

Scaling Papyrus Platform applications is very simple compared to, for example, three-tiered Web/Java/SQL/SOA application clusters. I have posted a more detailed discussion of Java application scalability on my ‘Real World’ blog.

User complaints that the ‘application is slow’ without any measured details are not helpful, but they have to be taken seriously. Often the user has no means to understand that the processing requirements for similar-looking documents can differ substantially. The document might be simple for the user while the backend process is fairly complex. We propose that proactive monitoring be established. The Papyrus Platform has all the means for such monitoring available. Simple dashboards can be created and summary reports are available.

Scalability is not only about tuning or maintaining an acceptable response time for a growing number of users. It is unreasonable to expect that the system will handle growth in users and transactions automatically. No system does. Papyrus has many load balancing and tuning options and many are set either by default or by the system. We found that some automatic functions had to be changed for weak networks or when growth was not gradual but for example many users, nodes or new documents and processes were added at once. That is related to the automatic version control and deployment of the Papyrus Platform.

Document applications are a complicated conglomerate of GUI, process, and rule execution threads that read and write data from a number of service interfaces or databases. The performance of the SOA backend service interfaces or the database is much more relevant to scalability than the user front end. Using common database or transaction measurements, the Papyrus Platform executes millions of transactions per hour. That is however irrelevant. The question is how that translates into user experience!

The following measurements are used to define the quality of scalability:

1) IRT – Initial Response Time: from request to first usable feedback

2) TRT – Total Response Time: from request to completed function

3) SRP – Service Processing Time: the time the request is actively being processed

4) SQT – Service Queuing Time: the time the request waits for processing

5) ATT – Average Transaction Throughput: transactions performed in a time window

These values cannot be taken from the system itself; the measurement of user experience has to be defined in the application. In a multi-tiered Java application that is practically impossible, because a screen refresh may or may not be connected to a previous user entry. Papyrus assigns a JOB-ID to each user data entry and enables tracing through all the functions and servers. A JOB-ID will now trigger the five measurement points above and thus enable a real-world measurement. It is planned to provide a ‘User Experience Dashboard’ for each user that will display a kind of ‘VitalSigns’ statistics for real-time feedback.
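To make the five measurements concrete, here is a minimal sketch in Python of how they could be derived from timestamps recorded per job. It is not the Papyrus or APA implementation: the `JobTrace` record, its field names and the helper functions are hypothetical, and the queuing time is assumed to be simply the part of the total response time in which the request was not actively processed (with non-overlapping processing intervals).

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class JobTrace:
    """Hypothetical per-job record; all times are seconds on one clock."""
    job_id: str
    requested: float        # user submits the data entry
    first_feedback: float   # first usable feedback reaches the user
    completed: float        # the requested function has completed
    service_spans: List[Tuple[float, float]]  # (start, end) of active processing


def initial_response_time(t: JobTrace) -> float:
    """IRT: from request to first usable feedback."""
    return t.first_feedback - t.requested


def total_response_time(t: JobTrace) -> float:
    """TRT: from request to completed function."""
    return t.completed - t.requested


def service_processing_time(t: JobTrace) -> float:
    """Time the request is actively being processed (sum of all spans)."""
    return sum(end - start for start, end in t.service_spans)


def service_queuing_time(t: JobTrace) -> float:
    """SQT, assumed here to be total time minus active processing time."""
    return total_response_time(t) - service_processing_time(t)


def average_throughput(traces: List[JobTrace], window_start: float,
                       window_end: float) -> float:
    """ATT: jobs completed inside [window_start, window_end) per second."""
    done = sum(1 for t in traces if window_start <= t.completed < window_end)
    return done / (window_end - window_start)


trace = JobTrace("J1", requested=0.0, first_feedback=0.5, completed=4.0,
                 service_spans=[(1.0, 2.0), (2.5, 3.5)])
print(initial_response_time(trace), total_response_time(trace),
      service_processing_time(trace), service_queuing_time(trace))
```

Keying every record by the job identifier is what allows the spans reported by different functions and servers to be added up for one user action.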

The Papyrus Platform uses the Application Performance Analyzer or APA to measure those values and relate them to the general statistical data about CPU, I/O and RAM usage.

APA Tuning and Monitoring

APA offers a unique level of insight across all application functions from the user click on the desktop or portal to the final display. Next to the elapsed time measurements, the measurement of resource usage is the key to understanding and tuning. How much CPU, RAM, disk I/O and network bandwidth is consumed per transaction and in total has to be known for tuning.

Web/Java/SOA applications have substantial overhead for sticky load balancing, transaction-safe Java caching and database clustering, and for parsing and validating the XML data for SOA data communication. Rather than accepting the immense complexity of clustering multi-layered caches – with multiple conversions from tables to cache pages to objects and back – the Papyrus Platform collapses the horizontal structures and works with a purely (vertical) object model from definition to storage. The proxy replication mechanism of the Papyrus Platform uses a partitioned caching concept where there is always a unique data owner and each server node uses a replicated copy that is updated by either pull or push. The objects do not have to be serialized or deserialized, as they are cached, replicated and binary-stored as is. The same object caching mechanism works transparently for all objects regardless of whether they are populated via Web services, external databases or from the Papyrus objectspace.
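The idea of a unique data owner with pull- or push-updated replicas can be illustrated with a small Python sketch. This is only a toy model of the general pattern, not the Papyrus proxy replication mechanism: the `OwnerNode` and `ReplicaNode` classes and all their methods are hypothetical, and networking, partitioning and persistence are left out.

```python
class OwnerNode:
    """The single node that is allowed to write an object."""

    def __init__(self):
        self._objects = {}       # object_id -> (version, value)
        self._subscribers = []   # replicas that asked for push updates

    def subscribe(self, replica):
        self._subscribers.append(replica)

    def write(self, object_id, value):
        # Only the owner assigns versions, so there are no write conflicts.
        version = self._objects.get(object_id, (0, None))[0] + 1
        self._objects[object_id] = (version, value)
        for replica in self._subscribers:
            replica.receive(object_id, version, value)

    def read(self, object_id):
        return self._objects[object_id]


class ReplicaNode:
    """A server node holding replicated copies of the owner's objects."""

    def __init__(self, owner, push=False):
        self._owner = owner
        self._cache = {}         # object_id -> (version, value)
        if push:
            owner.subscribe(self)

    def receive(self, object_id, version, value):
        # Push update: the owner sends the new state after a write.
        self._cache[object_id] = (version, value)

    def pull(self, object_id):
        # Pull update: the replica fetches the current state on demand.
        self._cache[object_id] = self._owner.read(object_id)

    def get(self, object_id):
        if object_id not in self._cache:
            self.pull(object_id)
        return self._cache[object_id][1]


owner = OwnerNode()
pushed = ReplicaNode(owner, push=True)
pulled = ReplicaNode(owner)
owner.write("doc-1", "draft")
print(pushed.get("doc-1"), pulled.get("doc-1"))
```

In this sketch reads never contact the owner once a copy is cached, which mirrors the point made above that reading scales freely while writes have to go through one place. A pull-updated replica can serve a stale copy until it pulls again.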

As a consequence of this future-proof design, the Papyrus Platform can scale linearly as servers and nodes are added to the application clusters. It is however important to segment the database properly. Papyrus provides transparent object access and search across any number of nodes without any changes to an application. It is important to understand that even with a powerful system such as Papyrus, the application has to be designed to be scalable as much as the system is.

Forrester Research rates Papyrus a ‘Strong Performer’!

In this post I want to comment a bit more on ‘The Forrester Wave – Document Output for Customer Communications Management’.

Overall, analysts Sheri McLeish and Craig Le Clair set a fairly high goal to analyze the current market offering in document output management. Only vendors with enterprise class functionality in interactive, on-demand and structured document creation were analyzed. The ISIS Papyrus Platform was positioned as a strong performer in the same league as all the large vendors. No overall leader emerged.

“ISIS Papyrus by far owns the broadest vision for DOCCM in the industry,” stated The Forrester Wave™: Document Output For Customer Communications Management (DOCCM), Q2 2009. Forrester defines DOCCM as software used to compose, format, personalize and distribute content to support physical and electronic customer communications and improve the customer experience.

The ISIS Papyrus Platform v7 was reviewed in The Forrester Wave™ as follows: “ISIS scored well as a Strong Performer across all segments and as a well-balanced product with enterprise potential. ISIS Papyrus has the broadest and most unique vision for DOCCM in the industry, supporting ECM, CRM, analytics, event processing, and BPM. It has a modular and easy-to-configure system based on patented technology that combines with innovative post composition, output management, and production management.”

Forrester also reviewed reference installations and stated: “Customers are committed to the ISIS vision and feel passionate about this vendor.”

I am pretty satisfied with the report, but it is obviously highly subjective how the functionality, strategy and market presence are rated. If you purchase the report you can change the weightings in the spreadsheet to find your own leader. What comes out of these reports has a lot to do with how cleverly the vendor positions features in relation to buzzwords and trends at the analyst firm. As Papyrus is always against the current trend, we usually do not do too well in such reports, except when a more detailed comparison and an in-depth evaluation of long-term benefits is made. Butler Group did that excellently.

Also at Forrester, market share and business size are still number one. Therefore a fairly weak strategy and little innovation go a long way with a huge vendor. A smaller vendor who is very innovative is seen as a risk because of that. So you cannot expect small innovative businesses to be at the forefront in terms of ratings, except if once again you hit one of the buzzwords that matches the analysts’ predictions! There is a kind of commonality between IT and stock market analysts, and it is related to self-fulfilling prophecies.

I noticed a lack of understanding of what the various product functions are actually supposed to do. The true complexity of structured output functionality was underrated, with fairly basic documents being considered the norm. For interactive output Forrester mostly rated form-filling applications that in most cases are not enough for even the most basic requirements. Document generation with such vendors then always means additional hardcoding in Java or similar. OnDemand generation is defined as a document request from another application, which widely underestimates the need for interactive text editing as well. Do not forget that you also need the ability to collate, bundle, sort and resend those documents either on paper or electronically, and that this ability is neatly integrated in the Papyrus Platform.

What Forrester considered to be ‘Workflow’ or ‘process’ for document applications cannot really be taken into account. Only the large BPM setups of either EMC or Oracle would come close to what customers are asking for and getting today from a Papyrus Platform. What Papyrus EYE offers in terms of GUI, not even Adobe FLEX can come close to, but how would Forrester understand that? Therefore, if you need highly user-customizable content, free-text editing, dynamic processes for creation and production, powerful color charts, images and graphics, you had better make sure that you do a proof of concept before signing a contract.