Friday, September 8, 2017

Integrate systems with processes - jBPM workitems

Processes are almost always completed in several steps, or activities. These activities work together to achieve common goals that define the process they are part of. Let’s just jump straight into an example:
Mary's goal is to go on vacation. To achieve this goal she has to complete several activities such as scheduling time off with her employer, booking a flight and a hotel, notifying her friends on social media about her travels, finding transportation from the airport to her hotel, and so on. There can also be a number of unforeseen activities during her travels, such as updating health insurance info in case of an illness/injury or checking the weather forecast in case of a storm at her destination. The point here is that all of Mary's activities involve existing systems such as booking systems, banks, online services, etc. Mary's activities have to be able to integrate with these systems in order to achieve her goal.
If we look at Mary's activities as making up a process which we have to define and implement, and then look at any process runtime engine out there, it is clear that none comes out of the box with every single integration point Mary needs. The same pretty much applies to any process out there, so we need an easy and intuitive way to define and implement integration points to other existing systems.
So how do we deal with this integration of processes and systems in jBPM? Well, the good news is that jBPM out of the box provides integration with all existing and future systems out there. The bad news is that's not really true. What jBPM does provide, however, is the ability for you to define and implement these integration points rather quickly and start using them right away - first, of course, to test, and then to run your processes. This integration is done through (to use the jBPM lingo) "custom work items": process activities where you as a developer can define how Mary books her flight, what information she will need to make the booking, which company she will book with, and so on.
Some people reading this might think "Oh, here comes yet another proprietary integration technology that I'll be stuck with forever and eternity", but that is far from the case. The short answer here is that the integration between systems and processes in jBPM is made of two parts: the markup part (BPMN2), which is completely portable, and the definition/implementation part, which is just your own implementation code, so it's reusable between different systems. For people concerned about the setup time for all this, be assured that it is not bad, as jBPM provides tooling (a Maven archetype, installation of custom work items in the workbench, etc.) that helps you along the way to get your job done. All of this will be explained in detail in the rest of this text, so if you have gotten this far and are still interested in what jBPM has to offer, stick around and read on. As always, comments and suggestions are more than welcome, and in case you are a cool developer who would like to contribute to this or any other part of jBPM, here is the starting page to get you started on that: https://www.jbpm.org/.
  1. Let’s get started - jBPM workitems module
If you are starting to use or are already a jBPM pro you should/will know where the code is on github: https://github.com/kiegroup/jbpm. jBPM is made up of many different modules but the one we are concerned about here is the jbpm-workitems module: https://github.com/kiegroup/jbpm/tree/master/jbpm-workitems. This module in itself contains many more modules so let’s take a look at what’s there:
  1. Pre-existing integration points: jbpm-workitems module contains a number of pre-existing integration points such as Email, Jabber, Rest, Webservice, etc.  These are the “out-of-the-box” integration points that you can use as-is or start extending for your own process needs.
  2. Light-weight core: the jbpm-workitems-core module contains the core utilities (and test utilities) to start creating and testing your new custom work item with jBPM. All custom workitem modules extend this core.
  3. Workitems Maven archetype: jbpm-workitems-archetype is a Maven archetype you can use to easily create your own workitems Maven project. We will go through all the steps on how to use this archetype in the sections below, so don't worry. This archetype not only helps with workitems project generation but also with automatic generation of your integration configuration definition, so all you have to be concerned about is your own implementation of the integration. Another nice thing about using the workitems archetype is that the Maven project it creates for you has the same structure as any other module of jbpm-workitems, so if you are willing to contribute your integration solution to jBPM and share it with the whole jBPM community, the archetype makes that very simple for you.
      2. Learning by doing - let’s help Mary!
For the sake of all of our time here we will focus on one specific part of our process to help Mary during her vacation planning activities. We will help her determine what type of clothes she should pack depending on the weather forecast during the time of her travels. In our example this will be the only activity we implement as an integration point, and we will integrate with the free Yahoo weather service (via a free Java API for this service) to find out everything we need to help Mary make her clothing decisions.
    a) Install jBPM
There are a number of ways to start using jBPM locally. You can find the download links and instructions here. You can also clone the jBPM git repo directly and install it to your local Maven repo with:
mvn clean install -DskipTests
Whichever installation path you choose, at the end you should have the jBPM workitems archetype registered locally and be able to start using it.
    b) Build your workitems module
Create a directory (let’s say “myworkitems”) where you want your workitems module to be. Inside that directory let’s run:
Here is the Maven archetype:generate command so you can cut/paste it. Make sure to change the artifactId and the classPrefix to whatever you want (you can also change the version of your workitem if needed, it defaults to 1.0).
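A sketch of what that command can look like (the archetype coordinates and property names here are my assumptions - check the jbpm-workitems-archetype documentation for the exact ones; artifactId and classPrefix carry the values discussed above, and the archetype version should match your installed jBPM version):

mvn archetype:generate -DarchetypeGroupId=org.jbpm -DarchetypeArtifactId=jbpm-workitems-archetype -DarchetypeVersion=<jbpm-version> -DgroupId=org.jbpm.demo.workitems -DartifactId=weather-workitem -DclassPrefix=Weather -Dversion=1.0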
The archetype:generate command will prompt you to confirm the given inputs. Just type "Y" to confirm if everything is to your liking, and the archetype will generate the project for you.
You should now see the "weather-workitem" Maven project generated for you inside your myworkitems directory. This is our base workitems project, so let's import it into your favorite IDE and start working on it (I use IntelliJ but you can easily import Maven projects into Eclipse, etc.).
The workitem archetype has generated for us a multi-module Maven project which already includes our base workitem handler (the Java code that will get executed when our process reaches our weather workitem), a base JUnit test for our workitem handler, as well as Maven assembly instructions on how to package our workitem so that it can easily be uploaded to the Kie Workbench or the jBPM workitem repository.
    c) Implement workitem details
Now let's expand on the generic project that the jBPM workitems archetype has created for us and start plugging in the integration logic to the Yahoo weather service. For this, let's start editing our WorkItemHandler class:
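If you are following along, the generated handler (before our changes) looks roughly like the sketch below - the class name comes from the classPrefix you gave the archetype, so yours may differ, and the base class shown is the one provided by jbpm-workitems-core:

import org.jbpm.process.workitem.core.AbstractLogOrThrowWorkItemHandler;
import org.kie.api.runtime.process.WorkItem;
import org.kie.api.runtime.process.WorkItemManager;

public class WeatherWorkItemHandler extends AbstractLogOrThrowWorkItemHandler {

    public void executeWorkItem(WorkItem workItem, WorkItemManager manager) {
        try {
            // read the (sample) input parameters defined for this workitem
            Object sampleParam = workItem.getParameter("SampleParam");

            // ... integration logic goes here ...

            // tell the engine this workitem is done (optionally passing results back)
            manager.completeWorkItem(workItem.getId(), null);
        } catch (Throwable cause) {
            handleException(cause);
        }
    }

    public void abortWorkItem(WorkItem workItem, WorkItemManager manager) {
        // nothing to do - this workitem cannot be aborted
    }
}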
Ok, so this needs a little explanation. A WorkItemHandler is a class that will get executed during process execution when the process reaches our weather activity. The jBPM runtime will then call our handler's executeWorkItem method, passing in a WorkItem instance which includes our activity's input parameters (which we specify). After our integration implementation code has completed we can choose to return some results, or simply complete our workitem handler by calling the completeWorkItem method on the passed-in WorkItemManager instance.
I'm sure you also noticed all the Java annotations on top of our class definition. They are there to help us generate our workitem configuration metadata, which is then used by process editors (such as the jbpm-designer) inside the Kie Workbench to allow us to drag/drop our workitem activity, as we can with any other BPMN2 type, onto the canvas during process modelling. This configuration lives in a separate file and uses the MVEL format, but we don't have to be concerned with it: we can use simple annotations to define our workitem configuration, and our project already has assembly instructions to generate this configuration file and package it properly for us.
There are 3 annotations that we need to tackle in order to correctly set up our workitem configuration, namely @Wid, @WidParameter, and @WidMavenDepends (by the way, Wid stands for workitem definition). The @Wid annotation is the parent annotation where we specify configuration information about our workitem, such as its display name, description, display icon, the Java handler which should be executed when our workitem is active during process execution (defaultHandler), the input and output parameters of our workitem, as well as all the Maven dependencies we need in order to properly execute our workitem implementation during process execution. With this information both the BPMN2 editor and the jBPM runtime will know how to display our workitem as well as how to execute it.
The workitem archetype we used has already set up all the parameters with default values for us. All we need to do now is expand on that and implement our own integration with the weather provider system.
After some “fun” coding our workitem handler could look like this:
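Here is a sketch of how the finished handler could be laid out. The @Wid attribute names follow the annotations described above but may differ slightly in your jBPM version, and the weather lookup is hidden behind a stub method because the exact yahoo-weather-java-api calls are beyond the scope of this sketch - treat both as assumptions to adapt:

import java.util.HashMap;
import java.util.Map;

import org.jbpm.process.workitem.core.AbstractLogOrThrowWorkItemHandler;
import org.jbpm.process.workitem.core.util.Wid;
import org.jbpm.process.workitem.core.util.WidMavenDepends;
import org.jbpm.process.workitem.core.util.WidParameter;
import org.kie.api.runtime.process.WorkItem;
import org.kie.api.runtime.process.WorkItemManager;

@Wid(widfile = "WeatherDefinitions.wid", name = "WeatherWorkitem",
        displayName = "WeatherWorkitem", icon = "Weather.png",
        defaultHandler = "mvel: new org.jbpm.demo.workitems.WeatherWorkItemHandler()",
        parameters = {
                @WidParameter(name = "LocationZip")
        },
        results = {
                @WidParameter(name = "WhatToWear")
        },
        mavenDepends = {
                @WidMavenDepends(group = "com.github.fedy2", artifact = "yahoo-weather-java-api", version = "2.0.2")
        })
public class WeatherWorkItemHandler extends AbstractLogOrThrowWorkItemHandler {

    public void executeWorkItem(WorkItem workItem, WorkItemManager manager) {
        try {
            // zip code Mary provides when the workitem is activated
            String locationZip = (String) workItem.getParameter("LocationZip");

            // average forecast temperature for the next days at that location
            double averageTemperature = getAverageTemperature(locationZip);

            // naive evaluation - ideally a Business Rule Task would decide this
            String whatToWear = averageTemperature < 20 ? "Pack warm clothes" : "Pack a swimsuit";

            Map<String, Object> results = new HashMap<>();
            results.put("WhatToWear", whatToWear);
            manager.completeWorkItem(workItem.getId(), results);
        } catch (Throwable cause) {
            handleException(cause);
        }
    }

    public void abortWorkItem(WorkItem workItem, WorkItemManager manager) {
        // nothing to do - this workitem cannot be aborted
    }

    private double getAverageTemperature(String zip) {
        // in the real handler this calls the Yahoo weather service through
        // com.github.fedy2:yahoo-weather-java-api and averages the 10-day forecast
        return 25.0;
    }
}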
Let's see what we have changed from the default handler code generated by the archetype. First, we have defined one input and one output for our workitem, namely LocationZip and WhatToWear. LocationZip is the information Mary has to provide - the zip code of the location she is traveling to. WhatToWear is the response of our workitem once we have the weather information for her location from our weather integration system of choice. We have also updated the mavendepends configuration section to add a Maven dependency on the Java API we use to communicate with the Yahoo weather system. This is needed so that all required dependencies are available when our workitem handler is executed.
The actual implementation of the executeWorkItem method is rather simple (and just a test impl for this example). We contact our weather service, passing it the zip code Mary provided, and retrieve the weather forecast. The evaluation of this data should ideally be implemented via business rules (Drools) in a separate Business Rule Task inside our process, but for the sake of this example we make a simple (and most likely wrong) assumption: if the average temperature over the next 10 days at the specified location is below 20 degrees Celsius it will be "cold", and if it's above that it will be "warm" enough to pack a swimsuit. Yes, rather simplistic, but you can go all out and implement this any way you like in your workitem handler.
By the way, don't forget to also add the Maven dependencies for any APIs you use to your project's pom.xml. For this example we just had to add to the dependencies section:
<!-- weather provider api (yahoo) -->
   <dependency>
     <groupId>com.github.fedy2</groupId>
     <artifactId>yahoo-weather-java-api</artifactId>
     <version>2.0.2</version>
   </dependency>
At this point you can also update the already generated workitem test class and run the JUnit tests to make sure that everything is fine.
    d) Build, deploy and start using your new workitem
You have your workitem implementation done and tested, and now you are ready to build your workitem project and start using it inside the Kie Workbench. Yay!
First thing let’s build our project. Under our project directory run:
mvn clean install
This will build our workitem project, run our tests and produce a zip file which includes everything we need to start using our workitem. After the build, the zip file we need is located in your $project_dir$/target directory and, for our example, is called jbpm-workitems-weather-workitem-1.0.zip. Create a new directory anywhere on your local file system and unzip jbpm-workitems-weather-workitem-1.0.zip there.
The file structure we extracted from the zip file not only includes all the files we need to start using our workitem in the jBPM tooling and the Kie Workbench, but also already has the exact format needed to be uploaded to the jBPM Workitem Repository. We will go into this repository in more detail in a future post, but if you are already familiar with it then note that you can upload these files as-is - no changes are needed to start installing your workitem via the jbpm-designer in the workbench.
e) Install your workitem to the Kie Workbench
In order for our workitem implementation to be recognized and available inside the Kie Workbench we need to install it first. There are a couple of ways you can do this depending on what’s best for you:
  1. Install on appserver startup: This can be done by adding the following options to the appserver startup command:
./standalone.sh -Dorg.jbpm.service.repository=file:///Users/tsurdilovic/devel/tmp/myworkitems/weather-workitem/jbpm-workitems-weather-workitem/target/tmp -Dorg.jbpm.service.servicetasknames=WeatherWorkitem
-Dorg.jbpm.service.repository specifies the location on our file system to the directory where we extracted our workitem zip file, and -Dorg.jbpm.service.servicetasknames specifies the names of all our workitems (in this case just one) that we want to install once the server starts up.
With this approach on app server startup our workitem will be installed directly.
   
  2. Create a business process in the Kie Workbench and install it from your workitems repo:
With this option you can either first upload the contents of our created zip file to your remote jBPM workitem repository, or you can point the jBPM designer at the file system location where we extracted the zip file and install our workitem assets directly from there. Here is how that would look:

This allows you to connect to your jBPM workitems repository via URL and will list all available workitems in the repository. We can now click on the little wrench icon next to our weather workitem and the workbench will install it for us. Once installed, you will need to save and re-open your process in order for it to be present in the palette under "Service Tasks" (note that if you specify a category in your workitem configuration, your workitem will show up under that section of the palette).
f) Create our business process and execute it
Once we have started our appserver running the kie workbench war and have given it instructions to install our created workitem we can go ahead and create a new business process that can look something like this:
As you can see, our workitem was installed and is ready to be used from the jbpm-designer palette on the left-hand side. Remember the icon we specified in the @Wid annotation of our workitem handler class? Well, I changed the default icon provided by the archetype to a more "weather-ish" icon here and you can do the same (any png you want, preferred size is 16x16 pixels). Our business process is very simple - when the process starts, Mary will be prompted to enter the zip code of the location she is traveling to. The process will then run through our workitem handler's executeWorkItem method when our workitem is activated, map our workitem results to a process variable, and print that variable to the console via the "print results" script task... and that's it!
You will notice that since we asked our appserver to automatically install our workitem, it took care of setting our project's dependencies for us as well as adding the handler information to the project's deployment descriptor:
Now all that is left is for you to compile and deploy your project (you will need an execution server running on the same or another app server, connected to your workbench) and start playing around with your test process.
I hope this text shows how easy it is to start integrating different services inside your processes in jBPM. We plan to further increase the automation and lessen the amount of work you have to do to get up and running with this and many other aspects of jBPM in the near future.

Thursday, August 24, 2017

Cloud runtime architectures for jBPM

In the days when more and more software is moving to the cloud, I'd like to take a moment (or two) to describe various runtime architectures that jBPM can be deployed with.

This article mainly talks about version 7 and onwards though some of the aspects are also applicable for version 6.

Terminology



  • Admin Console - the workbench (or its lighter version that includes only the runtime views) with an embedded controller
  • Controller - the KIE Controller used with KIE Servers that are running in managed mode
  • Smart Router - an optional component that acts as a kind of intelligent load balancer, as it can both route requests to individual KIE Servers and aggregate data from different KIE Servers
  • Managed KIE Server - a KIE Server that is connected to the Controller and takes its configuration from the controller, overriding anything it has locally (even what is included in the image)
  • Unmanaged KIE Server - a KIE Server that runs completely standalone and does not require any other component to be fully functional
  • Managed Smart Router - a router that is connected to the Controller, though it owns the right to dynamically update the server template it represents


Architecture 1: Immutable unmanaged KIE Servers with Smart Router and Admin Console

This architecture is as cloud native as possible: it promotes the immutable execution server paradigm, which means both the execution server itself and all KJARs should be colocated and included in the image itself (with all dependencies). That way any instance of that service (regardless of when it starts) will always be identical.
It then uses the Managed Smart Router to benefit from routing and aggregation. The Smart Router will dynamically update the Controller with new containers coming in, so the Admin Console can properly set up clients to interact with it.


In this architecture end users always interact with KIE Servers via the Smart Router - either by using the Admin Console and its runtime views, or from another application.

Individual KIE Servers can come and go at any time and register/unregister in the Smart Router. These KIE Servers might have the same id - meaning they represent the same image - or come from different images, in which case (since images are immutable) they represent a different set of kjars.

As an example, let's look at the diagram above:

  • There are three KIE Servers behind the router - each represents an independent image (meaning each has different kjars included)
    • KIE Server - ABC
    • KIE Server - DEF
    • KIE Server - GHI
  • There can be multiple instances of a given KIE Server image - multiple PODs in OpenShift terminology
  • KIE Servers start completely independently of each other and of the Smart Router (though each will keep attempting to register in the Smart Router in case it was not up at the time the KIE Server started)
  • The KIE Containers to be started are included in the image


Architecture 2: Immutable managed KIE Servers with Admin Console and optionally Smart Router

The immutable managed architecture is a slight variation of the first architecture: all KIE Servers are managed by the controller, so the Smart Router becomes an optional component, as users can access individual KIE Servers directly because they are managed. The Smart Router is needed when end users should be able to look at all KIE Servers at once instead of grouped by server template.

The images are still immutable, meaning they include the KJARs that should be running, but the controller has the final word on which set of KJARs (included in the image) should be started. This is to ensure that both the Admin Console and the KIE Server have the same set of KIE Containers defined.

In practice, this architecture requires an additional step in the deployment (as part of the deployment pipeline) to create an immutable server template in the controller that matches the kjars included in the image. The main reason for this is to protect the Admin Console from being affected by a wrong image connecting with a server template id it should not use. So the Admin Console relies completely on the controller configuration rather than the runtime.


In this architecture, users can use either the router or individual KIE Servers to interact with their capabilities. The same rules apply as for the first architecture when it comes to KIE Server images and their instances. Whenever a new instance of a KIE Server starts, it registers itself in the controller and optionally in the Smart Router.

Architecture 3: "Empty" managed KIE Servers with Admin Console and optionally Smart Router

The next architecture moves in another direction: instead of making the images immutable, it promotes the dynamic behaviour of the KIE Server. An "empty" KIE Server means that the image has no KJARs included - it's a pure KIE Server runtime that, when started, has nothing deployed to it.

With managed capabilities, the controller can dynamically instruct KIE Servers what needs to be deployed, so the additional deployment step (as in architecture 2) can be used. By making the server template immutable, an architecture similar to 2 can be achieved, though it might be affected by differences in downloaded artefacts - this might be especially visible in case snapshots are used.

The benefit is that there is a single KIE Server image (per release version) that, when started, is given a list of parameters that define its behaviour:

  • KIE Server ID that refers to server template in controller
  • switches to turn off capabilities
  • url to controller
  • optionally url to smart router


The main difference here is that the controller should have server templates defined (with kie containers) for all the KIE Servers in the diagram:
  • ABC
  • DEF
  • GHI
In case a server template is not there, it will be created upon connection from the KIE Server, though it will be completely empty, meaning there is nothing to execute on it.

Another flavour of this architecture could be that the server template is not immutable, so at any point in time new kie containers can be added or removed. This makes the environment completely dynamic, which might actually be a good fit for development environments.

Interaction with KIE Servers is exactly the same as for architecture 2 and can be done either directly with individual servers or via the smart router.


Architecture 4: Immutable unmanaged KIE Servers with Smart Router

This is another take on immutable images, though simplified, as there is no need for a controller and thus the Admin Console is not used. With that in mind, users will still have a KIE Server image per set of KJARs to ensure immutability, though there is no "managed" client for it.

This architecture targets mainly setups where there will be other components (applications/services) interacting with KIE Servers via Smart Router.


KIE Servers behave exactly the same as in architecture 1, and new instances or images can be added at any time. The Smart Router will constantly update its routing table to make sure it provides access to all available servers with efficient balancing.


Architecture 5: "Empty" unmanaged KIE Servers with Smart Router

This architecture is pretty much the same as the one above (4), though it adds the dynamic deployment feature. So instead of having immutable images of the KIE Server with all KJARs included, it starts with an "empty" KIE Server image that can be manually configured (since it is unmanaged) - KJARs can be deployed/undeployed at any point in time.

This gives the most flexibility, but at the same time it requires the most manual configuration of the runtime environment, so it is most likely suitable for simple environments. Still, it might be a good fit for some use cases.


Conclusion 


Of course the final selection of the architecture will depend on a number of factors, but the overall recommendation would be to follow the order of these architectures in the article.

The most cloud "friendly" seems to be the first one, as it fits nicely into the continuous delivery approach with the fewest additional steps.

The second one adds the ability to control individual servers from the controller and to use the admin console for selected server templates in isolation.

The third provides a really flexible (though not immutable) environment with a small number of KIE Server images to manage. It might be an option for certain use cases where dynamic behaviour of the business logic is required but a complete image redeployment is not.

The fourth removes the controller and admin console from the landscape, so it might be a good fit for lighter setups where business automation is used with another UI and managed completely by the cloud infrastructure - immutable images.

The fifth is most likely just for environments where deployment is managed by an external system that keeps track of all the KIE Servers being active.

Hope this gives a nice overview and helps with selecting the right runtime architecture based on requirements.

Wednesday, August 23, 2017

Elasticsearch empowers jBPM

As a follow-up to the article that introduced experimental NoSQL support for jBPM, this article aims at illustrating one potential integration to enhance search capabilities and, potentially, routing support for larger environments.

Elasticsearch will be used as an additional data store where both process instances and tasks will end up being indexed. Please keep in mind that at this point in time it is a rather basic integration, though it has already proven to be extremely valuable. Before jumping into details, let's look at what use cases this integration brings:
  • ability to collect process instances and tasks from different sources - e.g. different execution servers connected to different dbs
  • ability to search for process instances and tasks using full text search - indexed values etc.
  • ability to search for process instances and tasks by their variables - multiple variables (both name and value) in a single request
  • ability to retrieve variables with the search results in a single request
  • and all other things that Elasticsearch provides :)

Implementation


The actual implementation that integrates Elasticsearch with jBPM, based on the PersistenceEventManager hooks, is actually simple - it consists of a single class that implements the EventEmitter interface: ElasticSearchEventEmitter.

It utilises the Elasticsearch REST API - to be precise, its _bulk REST endpoint - and pushes all events in a single HTTP call. This covers both types of instances:
  • ProcessInstanceView
  • TaskInstanceView
all views are serialised as JSON documents. This integration uses:
  • http://localhost:9200 as the location where Elasticsearch server is
  • jbpm as the name of the index
  • processes as the type for ProcessInstanceView documents
  • tasks as the type for TaskInstanceView documents

The location of the Elasticsearch server and the name of the index are configurable via system properties:
  • org.jbpm.integration.elasticsearch.url
  • org.jbpm.integration.elasticsearch.index
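Just to give a feel for it, a search against that index could look like the hypothetical request below - the exact field names depend on how ProcessInstanceView is serialised to JSON, so treat them as assumptions. It asks the processes type for active instances carrying a given variable value:

curl -X POST "http://localhost:9200/jbpm/processes/_search" -H "Content-Type: application/json" -d '
{
  "query": {
    "bool": {
      "must": [
        { "match": { "state": 1 } },
        { "match": { "variables.destination": "Paris" } }
      ]
    }
  }
}'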

There is one more file in the project: the ServiceLoader services file that provides information about the emitter implementation for discovery at runtime.

ElasticSearchEventEmitter delivers the actual events asynchronously so it does not hold back the thread that was used to execute the process, and the (performance) impact on the process engine is minimal. Moreover, thanks to the default PersistenceEventManager implementation, this emitter will only be invoked when the transaction completes successfully - meaning that if a process instance is rolled back, that information won't end up in Elasticsearch.

Installation


For those who would like to try this out, first of all install Elasticsearch on your box (or wherever you prefer, as you can point jBPM to any server via the system property mentioned above).
Next, build the elasticsearch-jbpm project locally (it's not yet included in the regular jBPM builds), drop it into the KIE Server web app (inside WEB-INF/lib) and that's it!

Now when you execute any processes you will have their data in Elasticsearch as well, so you can query them in a very advanced way.


In action

Let's now look at a short screencast that shows this in action. This demo still illustrates a rather small data set (around 12,000 process instances and 12,000 task instances) being queried. Anyway, what it will show is:
  • speed of execution
  • a query that neither JPA nor jBPM advanced queries allow you to do without additional setup
  • data retrieved directly from the query


In detail:

  • first, a search for all active process instances is done in the workbench - this uses data sets / advanced queries - though it is slightly slower because it also collects execution errors, which affects performance and is under investigation
  • then the same query is done over the KIE Server REST API - which uses JPA underneath
  • last, the same query is done over Elasticsearch
  • next, it shows a few more advanced queries by multiple variables, people assignments, etc.

What can be seen in the screencast illustrates the benefits, but on a small scale; more will be visible where there are several independent execution servers and you can search across them.


The main difference is that Elasticsearch directly returns process instance variables. Similarly for user tasks, though there it provides much more information - both task inputs and outputs plus people assignments, e.g. potential owners, business admins and excluded owners.

Expect more integration with other NoSQL data stores to come... so stay tuned.

NoSQL enters jBPM ... as an experiment ... so far

Quite frequently there are questions around jBPM about whether there is any way to use NoSQL as the data store for a persistable setup. From the very beginning, persistence in the KIE projects (Drools and jBPM) was designed to be pluggable. In versions prior to 7 it was, however, a rather tight integration, which meant dependencies on JPA were still needed. With version 7 the persistence layer was refactored (thanks to Mariano De Maio, who did the majority of the work), enabling much cleaner integration with persistence stores other than the default.

That opened the door for more research on how to utilise NoSQL data stores to benefit the overall projects. With that in mind, we started to think about which options are valuable, and the initial set is as follows:

  • complete replacement of the JPA-based persistence layer with another data store (e.g. NoSQL)
  • enhancing the persistence layer with an additional data store tailored to its capabilities

Replacement of default persistence layer with NoSQL - MapDB

When it comes to the first approach, it's rather self-explanatory - it completely replaces the entire persistence layer, thus freeing it from any JPA-based mechanism. This follows Mariano's work on providing a persistence mechanism based on MapDB. You can find that work here; it provides a rather complete replacement of JPA and covers:
  • drools use cases - persistence of KieSession
  • jBPM use cases - persistence of 
    • KieSession, 
    • WorkItem, 
    • ProcessInstance, 
    • Task
  • jBPM runtime manager use cases - mainly around PerProcessInstance and PerCase strategies
  • jBPM services use cases - additional implementations of RuntimeDataService and DeploymentService that take advantage of the MapDB store - it does not persist all audit log data, so some of the methods of RuntimeDataService (like those related to node instances or variables) won't work
  • KIE Server use cases - an alternative implementation of the jBPM KIE Server extension that uses MapDB as the backend store instead of an RDBMS - though with limited capabilities: only operations on process instances and tasks are supported, and there is no async execution (jBPM executor)
The good thing about MapDB is that it's a transactional store, so it fits nicely with the jBPM infrastructure.

It didn't prove (in basic load tests*) to be faster than the RDBMS-based store, though. Quite the opposite - it was 2-3 times slower on a single box. But that does not mean there is no value in it.

Personally I think the biggest value of this experiment was to illustrate that a complete replacement of the persistence layer is possible (up to and including KIE Server), although quite significant work is required to do so and there might be some edge cases that could limit or change the available features.

Nevertheless it's an option in case some environments can't use RDBMS for whatever reason.

* the basic load tests consist of two types of requests: 1) just start a process with a human task, 2) start a process with a human task and complete it.


Enhance persistence layer with additional data store

The alternative approach (and in my opinion the one that brings much more value with less work) is to enhance the persistence layer with an additional data store. This means that the default data store used by the internal services is still JPA (and thus requires an RDBMS), though certain use cases can be offloaded to another data store that might be much better suited for them.

Some of the use cases we are exploring are:
  • aggregation of data from various execution servers (different dbs)
  • aggregation of business data and process data
  • analytics e.g. BAM, stream processing, etc
  • advanced search capabilities like full text search
  • replication across data centres for searchability 
  • routing across data centres that runs individual process engines
  • and more... in case you have any ideas feel free to comment

This was sort of possible already in jBPM by utilising event listeners (ProcessEventListener, TaskLifeCycleEventListener), though it was slightly too fine-grained and required a bit of plumbing code to deal with how the engine behaves - mainly around transactions.

So to ease this work, jBPM provides a few hooks that allow easier integration and let developers focus only on the actual integration code with the external data store, instead of having to know all the details of the process engine.

So the main two hook points are:
  • PersistenceEventManager - responsible for receiving information from the engine when instances (ProcessInstance, Task) are in any way updated - created, updated or deleted. Its other responsibility is to collect all those events and at some point push them to the event emitter implementation for actual delivery to the external data store.
  • EventEmitter - the interface that must be implemented to activate the PersistenceEventManager - if no emitter is found, the PersistenceEventManager acts in a no-op way. The event emitter has two main responsibilities (a minimal sketch follows this list):
    • it provides the EventCollection implementation that decides how to deal with events that are added (new instance), updated (updated instance) or removed (deleted instance) - different implementations of the EventCollection can decide on individual events, e.g. in case a single instance is added and removed in the same scope (transaction), the collection can decide to drop it and deal only with still active instances.
    • it integrates with the external data store - encapsulating the client API of the external system
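To make this more tangible, here is a minimal emitter sketch. The package names and the exact interface shape are my assumptions based on the jBPM 7 integration API - verify them against your version; the real ElasticSearchEventEmitter described in the follow-up article is naturally more involved:

import java.util.Collection;

import org.jbpm.persistence.api.integration.EventCollection;
import org.jbpm.persistence.api.integration.EventEmitter;
import org.jbpm.persistence.api.integration.InstanceView;
import org.jbpm.persistence.api.integration.base.BaseEventCollection;

public class LoggingEventEmitter implements EventEmitter {

    public void deliver(Collection<InstanceView<?>> data) {
        // called on beforeCompletion - useful for transactional delivery (e.g. JMS)
    }

    public void apply(Collection<InstanceView<?>> data) {
        // called after a successful commit - push the views to the external store
        data.forEach(view -> System.out.println("Storing " + view));
    }

    public void drop(Collection<InstanceView<?>> data) {
        // called after a rollback - discard anything prepared in deliver()
    }

    public EventCollection newCollection() {
        // reuse the default collection that keeps only the last state of each instance
        return new BaseEventCollection();
    }

    public void close() {
        // release any clients/connections held by this emitter
    }
}

The implementation is then registered through the standard ServiceLoader services file so it can be discovered at runtime.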


Implementations that come out of the box


PersistenceEventManager

There is a default PersistenceEventManager provided that integrates with transactions. That means there is no need (in most cases) to implement a new PersistenceEventManager. The default implementation collects events from a single transaction and delivers them to the emitter at:
  • beforeCompletion of the transaction - the manager will invoke the deliver method of the emitter; this is mainly to give a complete list of events in case the emitter wants to send them in a transactional way, for example JMS transactional delivery
  • afterCompletion of the transaction - this will again deliver the same list of events as on beforeCompletion and is more for emitters that can't send events in a transactional way, e.g. a REST/HTTP call. The manager will invoke:
    • the apply method of the emitter in case the transaction was successfully committed
    • the drop method of the emitter in case the transaction was rolled back


EventCollection

There is also a default EventCollection implementation, BaseEventCollection, that collects all events (instances regardless of their event type - create, update, delete) but eliminates duplicates, meaning it keeps only the last state of each instance.

Events

Now let's take a look at what an event is - the term is maybe a bit overused, but it fits well in this scenario - it is fired when things happen in the engine, and these events mainly represent the instances that the process engine is managing:
  • ProcessInstance
  • Task
Currently only these two types are managed, but the hooks within the engine allow plugging in more, for example async jobs.

As soon as an instance is updated (created, updated, deleted), that instance is wrapped with an InstanceView type and delivered to the PersistenceEventManager over its dedicated method representing the type of the event - create, update or remove.

An InstanceView has a dedicated implementation to provide access to individual instance details, though every implementation will always provide a link to the actual source of the view. Why is there a need for the *View types? Mainly to simplify their consumption - the InstanceView type is designed to be serialisable, for example to JSON or XML, without too much hassle.

Out of the box there are two implementations of the InstanceView: ProcessInstanceView and TaskInstanceView.
An InstanceView might decide when the data should be copied from the source, though at the latest it will be invoked by the PersistenceEventManager before calling the deliver method - so it's important that in case an InstanceView implementation copies data earlier, it marks itself as copied to avoid a double copy.



That concludes the introduction to how jBPM looks at support for NoSQL. A following article will show some implementations of the second approach that empower jBPM with additional capabilities.

I'd like to encourage everyone to share their opinion on how NoSQL could provide value for jBPM, or which use cases you see as a good fit for NoSQL so that jBPM can support them better.

Thursday, August 17, 2017

Maven plugins for KIE Server

Since version 7 of jBPM, KIE Server is the only execution server available by default and thus it's getting more and more traction. With that in mind, there is a need to have it better aligned with CI/CD pipelines to allow simple integration with runtime environments.

To help with that, two Maven plugins were built:

  • KIE Server Deploy Maven Plugin
  • KIE Server Controller Deploy Maven Plugin

The main purpose of these plugins is to enable simple deployment (and not only deployment) of kjars into KIE Servers.
The first one is dedicated to unmanaged KIE Servers, as that plugin interacts directly with the KIE Server REST API, while the second one targets managed KIE Servers, as it interacts with the KIE Controller (either the one in workbench/business central or a standalone controller).

These maven plugins can be used to perform deployment of kjar to execution server directly from within a build pipeline. 

Both plugins have comprehensive documentation (see links above) but just for completeness I'd like to list their capabilities in this article:

KIE Server Deploy Maven Plugin

  • deploy -  deploy kjar to runtime environment
  • dispose - dispose running kjar (kie container) in runtime environment
  • update - update version of running kjar (kie container) in runtime environment

KIE Server Controller Deploy Maven Plugin

  • get-template - retrieves existing server templates from controller
  • create-template - creates new server templates with set of containers 
  • delete-template - removes server template
  • get-containers - retrieves containers in given server template
  • get-container - retrieves given container from server template
  • create-container - create new container in given server template
  • delete-container - delete container from given server template
  • start-container - starts container in given server template
  • stop-container - stops container in given server template
  • deploy-container - creates and starts container in given server template
  • dispose-container - stops and removes container from given server template 

    Contribution - a win-win situation!

    And now the most important part - these Maven plugins were added by Fabio Massimo as contributions to the KIE projects. So I'd like to thank Fabio for his outstanding work and excellent addition to the projects.

    This clearly shows how valuable contributions are! With that I'd like to encourage others to follow Fabio and share with other community members the great stuff you all have done or plan to do!


          Wednesday, August 2, 2017

          Managed KIE Server gets ready for the cloud

          As described in this article, KIE Server can run in two modes:

          • managed, with a controller that is responsible for providing kie containers to be deployed
          • unmanaged, a self-contained server that allows deploying kie containers manually
          In this article, I'd like to focus on managed mode and show some improvements in that area that will make managed KIE Server ready for the cloud.

          Background

          With the default configuration of a managed KIE Server, both the controller and the kie server need to know how to communicate with each other. By default it is REST-based communication and thus requires credentials to be provided when sending requests:
          • user and password - for BASIC authentication
          • token - for BEARER authentication
          These should be given as system properties on each side:

          • org.kie.server.user and org.kie.server.password are to be set on the controller jvm to instruct what credentials to use when connecting to the kie server(s)
          • org.kie.server.controller.user and org.kie.server.controller.password are to be set on the kie server jvm to instruct what credentials to use when connecting to the controller

          This configuration fits nicely in an unrestricted environment where the controller and the KIE Server(s) don't have any limitations on talking to each other. It does require, though, that the user name and password used by the controller to connect to kie servers are set globally via system properties and thus will be used whenever talking to any KIE Server instance.

          This setup can become problematic, though, if there are any restrictions between these two. In some cases the controller might be hidden behind a firewall, which makes it an issue for it to communicate with the KIE Server(s) when needed. Similarly, this becomes an issue in an OpenShift environment where the controller and the KIE Server(s) are in different namespaces - they won't see each other's internal IP.

          Here we touch upon another aspect of managed KIE Servers - their location. A KIE Server running in managed mode requires the following configuration parameters (given as system properties on the jvm that runs the KIE Server):
          • org.kie.server.id - an id that points to a server template id defined in the controller
          • org.kie.server.controller - the URL of the controller to connect to upon start
          • org.kie.server.location - the URL of this KIE Server instance where it will be accessible over HTTP/REST
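          For illustration, a managed KIE Server could then be started along these lines (host names, ports and context paths are placeholders for your environment):

          ./standalone.sh -Dorg.kie.server.id=weather-server -Dorg.kie.server.controller=http://controller-host:8080/kie-wb/rest/controller -Dorg.kie.server.location=http://kieserver-host:8080/kie-server/services/rest/server -Dorg.kie.server.controller.user=controllerUser -Dorg.kie.server.controller.password=controllerPwd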
          The location of the KIE Server is expected to be unique, since this is the URL where the actual instance is accessible. That becomes an issue, though, when running kie servers behind a load balancer or in cloud-based environments.

          It puts us in a situation where we either give the load balancer URL and thereby lose the ability to receive updates from the controller (as only one of the servers will get the updates, based on the load balancer selection), or we bypass the load balancer and then lose its capabilities for runtime operations. Keep in mind that the location the kie server provides on connection to the controller is then used by the (so-called) runtime views in the workbench - process instances, tasks, etc.

          In an OpenShift environment it's pretty much the same issue - either the public IP is provided, which completely hides the individual PODs, or the internal IP of the POD. That has the same consequences as a load balancer, with one addition - the internal IP won't work at all across namespaces.

          Websockets to the rescue...

          To resolve all the issues mentioned above, an alternative (and soon to be the default) way of communicating between the KIE Server and the Controller was introduced. It is based on Websockets, which are now available in pretty much any JEE container (including servlet containers), and it solves pretty much all the issues that were identified, both on premise and in the cloud.


          As illustrated in the diagram above, the KIE Server is the one that initiates the communication and keeps it active as long as it's alive. That in turn removes any need for the KIE Controller to know how to communicate with (and by that connect to) KIE Server instances. So there is no more need to configure any user name or password on the controller jvm to talk to KIE Servers; it will simply reuse the open channel to the connected KIE Servers.

          The KIE Server is solely responsible for the connection. That means it needs to know where the controller is, how to authenticate when opening the connection and how to handle a lost connection (e.g. when the controller goes down).

          So the first two are exactly the same as before, given as system properties on the jvm that the KIE Server is going to run on:
          • the credentials (org.kie.server.controller.user and org.kie.server.controller.password, or a token) - using either BASIC or BEARER authentication
          • org.kie.server.controller - the URL of the controller to connect to upon start
          Lost connections are handled by a retry mechanism - as soon as the KIE Server gets a notification that the connection is closed, it starts a background thread that attempts to connect to the controller every 10 seconds. Once it is reconnected, that thread is terminated. It will reconnect only if the KIE Server itself is not the one who closed the connection.

          Since we keep the connection open between kie servers and the kie controller, the location given when the kie server connects does not have to be unique any more. That solves the issue with running behind a load balancer or in OpenShift with different namespaces. The system property that provides the location (org.kie.server.location) should now be given as the load balancer URL or the public IP in OpenShift.

          NOTE: If you don't run behind a load balancer in an on-premise setup (not OpenShift), then keep the location of the kie server unique regardless of whether websockets are used. A similar rule applies - the same public IP/load balancer should be kept for a single server template only.

          There is no need for any extra configuration to enable websocket-based communication; it is selected based only on the actual URL given as the controller url - the org.kie.server.controller system property.

          -Dorg.kie.server.controller=ws://localhost:8080/kie-wb/websocket/controller

          Depending on where your controller is, you might need to change:
          • localhost - to the actual host/IP of the server where the controller is deployed
          • 8080 - to the actual port number of the server where the controller is deployed
          • kie-wb - to the actual context path of the controller web app

          Both protocols - HTTP/REST and Websocket - are active by default and either of them can be used, though one rule must be kept: use a single protocol for all kie servers of a given server template.
          It is recommended to keep a single protocol across all kie servers connected to a single controller.

          The workbench, which provides the UI for process-related operations (the Process Instances, Process Definitions and Tasks perspectives), will utilise the websocket channel only for administration operations, that is:
          • controller-based operations to manage kie servers
          • data set query registration required by the runtime views
          All other operations, like getting user tasks or getting process definitions or instances, will use the regular REST-based communication, as it calls endpoints on behalf of the logged-in user to enforce security.

          With this enhancement, a managed KIE Server is a much nicer option to run in the cloud and behind a load balancer than ever before :)

          Stay tuned for more to come!


          Thursday, July 6, 2017

          Make use of rules to drive your cases

          In case management, which was recently released with jBPM version 7, there is a change in the way we look at cases - they are more data driven than flow driven. Of course users are free to define parts of the case definition as process fragments (see the attached sample), but what is important is to look at cases as data that is handled.

          The steps required to resolve a case are mainly driven by data - it can be the people involved in the case (who take certain actions based on the available data), or the system itself can decide, based on that data, to trigger further actions.

          This article is about the latter case - the system taking decisions about further actions. And what better way to do that than business rules :)

          Let's have a look at simple scenario, where we have basic car insurance case definition that looks like this


          There are two roles involved:

          • insured 
          • insuranceRepresentative
          At any given point in time, data can be inserted into the case instance - to be precise, into its case file. CaseFile data is under constant supervision of the rule engine, and thus we can build rules that react to the data our case instance contains.

          To give a very simple scenario, let's assume that at some point there is a need for more information to be collected from the insured. This could also be handled by a human actor, e.g. the person who takes the role of insurance representative in a particular case instance. But for the sake of the example we'll just write a business rule that reacts immediately once the data of the case instance indicates there is a decision to ask for more details. It will automatically create a human task assigned to the insured.


          rule "ask user for details"

          when 
              $caseData : CaseFileInstance()
              String(this == "AskForDetails") from $caseData.getData("decision")
                    
          then 
              $caseData.remove("decision");
              CaseService caseService = (CaseService) ServiceRegistry.get().service(ServiceRegistry.CASE_SERVICE);
              Map<String, Object> parameters = new HashMap<>();
              parameters.put("reason", "How did it happen?");
              caseService.addDynamicTask($caseData.getCaseId(), caseService.newHumanTaskSpec("Please provide additional details", "Action", "insured", null, parameters));
              
          end


          So that simple rule will do exactly that. If there is a decision in the case file that is set to AskForDetails, the rule will:
          • remove the decision from the case to avoid a rule loop
          • use the ServiceRegistry to get hold of the case service instance
          • configure the task input parameters
          • finally, add a dynamic task to the case instance, assigned to the user who has the role insured in the case instance
          That's all, as simple as that :) Obviously this is a simplistic use case, but it opens the door for integration between rules and case management to make it even more powerful for users. Since it is all about dealing with data, the combination of rules and case management is a perfect fit.
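          For completeness, the decision that triggers the rule above ends up in the case file through the case service - a minimal sketch, assuming the addDataToCaseFile method on CaseService (check the CaseService API in your version):

          CaseService caseService = (CaseService) ServiceRegistry.get().service(ServiceRegistry.CASE_SERVICE);
          // putting this data into the case file makes the "ask user for details" rule fire
          caseService.addDataToCaseFile(caseId, "decision", "AskForDetails");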

          Note: ServiceRegistry is part of the jbpm-services-api module, so make sure it is available on the class path. When building the project (kjar) in the workbench there is no need to add anything else, but if you would like to build it outside of the workbench, make sure you add the following dependencies to your project - both in scope provided:
          • org.jbpm:jbpm-services-api
          • org.jbpm:jbpm-case-mgmt-api
          ServiceRegistry can also be used for regular processes in exactly the same way. Here are the services that are automatically registered in the registry:

          org.jbpm.services.api.DefinitionService
          org.jbpm.services.api.DeploymentService
          org.jbpm.services.api.ProcessService
          org.jbpm.services.api.RuntimeDataService
          org.jbpm.services.api.UserTaskService
          org.jbpm.services.api.admin.ProcessInstanceAdminService
          org.jbpm.services.api.admin.ProcessInstanceMigrationService
          org.jbpm.services.api.admin.UserTaskAdminService
          org.jbpm.services.api.query.QueryService
          org.jbpm.casemgmt.api.CaseRuntimeDataService
          org.jbpm.casemgmt.api.CaseService


          ServiceRegistry has public static members for all out-of-the-box services, so that is the recommended way to look them up in the registry. If for whatever reason you prefer string-based lookup, the key is the simple name of the interface listed above, e.g. DefinitionService or CaseService.

          One final note: this comes with jBPM 7.1, which is just around the corner...

          Friday, June 30, 2017

          Execution errors - how to deal with the unexpected in jBPM 7.1

          jBPM technical error handling is based on transactionality and going back to the last (stable) state. That means an error (of any kind) that is not handled by the process will result in rolling back the entire transaction and leaving the process instance in the previous wait state. Any trace of this is only visible in the logs and is usually displayed to the caller (who sent the request to the process engine).

          In some cases that might not be enough, and thus additional error handling is required to provide:
          • Better traceability
          • Visibility in case of critical processes
          • Reporting and analytics - based on error situations 
          • External system error handling and compensation

          Overview

          Version 7.1 introduces configurable error handling that is responsible for catching any technical errors thrown throughout process engine execution (including the task service). A technical error means anything that:
          • extends java.lang.Throwable
          • was not handled before - e.g. by process-level error handling
          There are several components that make up the error handling mechanism and allow a pluggable approach to extend its capabilities.

          The entry point, from the process engine point of view, is the ExecutionErrorManager, which is integrated with the RuntimeManager; the RuntimeManager is in turn responsible for providing it to the underlying components - KieSession and TaskService. From the API point of view, ExecutionErrorManager gives access to:

          • ExecutionErrorHandler - the heart of the error handling mechanism
          • ExecutionErrorStorage - pluggable storage for execution error information
          ExecutionErrorHandler is bound to the life cycle of the RuntimeEngine, meaning it is created when a new runtime engine is created and destroyed when the RuntimeEngine is disposed. A single instance of the ExecutionErrorHandler is used within a given execution context (transaction). Both KieSession and TaskService use that instance to inform the error handling about processed nodes/tasks. ExecutionErrorHandler can be informed about:
          • starting processing of a given node instance
          • completing processing of a given node instance
          • starting processing of a given task instance
          • completing processing of a given task instance

          This information is mainly used for errors of an unknown type - in other words, errors that do not provide information about the process context. For example, a database exception at commit time will not carry any process information, which would make the error information really poor and pretty much useless.

          ExecutionErrorStorage is a pluggable strategy that allows various ways of persisting information about execution errors. The store is used directly by the handler, which gets an instance of the store upon creation (at the time the RuntimeEngine is created). The default store implementation is based on a database table; every error is stored into that table with all the information available. Not all errors will have all the details - that depends on the type of error and the possibility to extract the information from it.


          Error types and filters

          Since the error handling will attempt to catch and handle any kind of error, it needs a way to categorize errors so it can properly extract information out of them. Categorization also needs to be pluggable, since users might throw their own special error types that should be handled differently than the ones provided out of the box.
          Error categorization and filtering is based on so-called ExecutionErrorFilters. This is a simple interface that is solely responsible for building the instance of ExecutionError that is later stored via the ExecutionErrorStorage. It has the following methods:
          • accept - indicates if a given error can be handled by the filter
          • filter - where the actual filtering/handling etc. happens
          • getPriority - indicates the priority used when calling filters
          Filters provide their priority because only one filter can process a given error - this is mainly to avoid having multiple filters return alternative “views” of the same error. The priority allows more specialized filters to check first whether they can accept the error and, if so, deal with it; otherwise it is left to be handled by another filter.

          ExecutionErrorFilters can be provided using the ServiceLoader mechanism, which is easy and proven, so extending the capability of the error handling is very simple (a sketch of a custom filter follows the table below).

          Out of the box ExecutionErrorFilters:

          Class name                                                               | Type    | Priority
          org.jbpm.runtime.manager.impl.error.filters.ProcessExecutionErrorFilter | Process | 100
          org.jbpm.runtime.manager.impl.error.filters.TaskExecutionErrorFilter    | Task    | 80
          org.jbpm.runtime.manager.impl.error.filters.DBExecutionErrorFilter      | DB      | 200
          org.jbpm.executor.impl.error.JobExecutionErrorFilter                    | Job     | 100

          The lower the priority value, the earlier the filter is invoked. The filters from the above table will therefore be invoked in the following order:
          • Task
          • Process
          • Job
          • DB
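          To give a feel for how a custom filter plugs in, here is a minimal sketch. The accept/filter/getPriority methods come from the description above, but the exact packages, the ExecutionErrorContext parameter and the ExecutionError setters are assumptions from memory, so treat this as a shape to follow rather than copy-paste code:

          // packages as I recall them in kie-internal - verify for your version
          import org.kie.internal.runtime.error.ExecutionError;
          import org.kie.internal.runtime.error.ExecutionErrorContext;
          import org.kie.internal.runtime.error.ExecutionErrorFilter;

          // hypothetical filter that claims errors caused by our own integration exception
          public class MyIntegrationErrorFilter implements ExecutionErrorFilter {

              @Override
              public boolean accept(ExecutionErrorContext errorContext) {
                  // only handle errors caused by MyIntegrationException (a made up exception type)
                  return errorContext.getCause() instanceof MyIntegrationException;
              }

              @Override
              public ExecutionError filter(ExecutionErrorContext errorContext) {
                  // build the ExecutionError that will be persisted via ExecutionErrorStorage;
                  // the setter names here are assumptions - adapt to the actual ExecutionError API
                  ExecutionError error = new ExecutionError();
                  error.setType("Integration");
                  error.setErrorMessage(errorContext.getCause().getMessage());
                  return error;
              }

              @Override
              public Integer getPriority() {
                  // lower value than the out of the box Task filter (80) so this filter is asked first
                  return 50;
              }
          }

          Such a filter is then registered via the standard ServiceLoader contract - a file under META-INF/services named after the fully qualified ExecutionErrorFilter interface, containing the fully qualified name of the implementation class.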

          Error acknowledgment

          By definition, every error that is caught and stored is unacknowledged, meaning it is still to be handled by someone (or something, in case of automatic error recovery). This is the base approach that allows filtering existing errors by whether they have already been taken care of or not. Acknowledging an error records the user who did the acknowledgment and a time stamp, for traceability purposes.

          Since the ExecutionErrorFilter is responsible for creating the ExecutionError instance, different implementations might decide to set the acknowledgement to true immediately when the error is handled - maybe because a notification is sent to some issue tracking system or an email to an administrator. Again, that is up to the concrete implementation of the filters, or even of the storage.

          Auto acknowledgement of execution errors

          By default, execution errors are created unacknowledged and thus require a manual action to be performed; otherwise they will always be seen as information that requires attention. With bigger volumes, manual actions can be time consuming and not suitable in some situations. To help with that, auto acknowledgement of errors has been provided. It is based on scheduled jobs (via the jbpm executor) and there are three types of jobs available:
          • org.jbpm.executor.commands.error.JobAutoAckErrorCommand
            • Job responsible for finding jobs that previously failed but are now either cancelled, completed or rescheduled for another execution. This job will only acknowledge execution errors of type “Job”
          • org.jbpm.executor.commands.error.TaskAutoAckErrorCommand
            • Job responsible for auto acknowledgment of user task execution errors for tasks that previously failed but are now in one of the exit states (completed, failed, exited, obsolete). This job will only acknowledge execution errors of type “Task”
          • org.jbpm.executor.commands.error.ProcessAutoAckErrorCommand
            • Job responsible for auto acknowledgment of errors attached to process instances. It will acknowledge errors when the process instance is already finished (completed or aborted), or when the activity the error originated from is already finished - based on the init_activity_id value. This job will acknowledge any type of error that matches the above criteria.
          All three jobs can be registered on KIE Server to automatically acknowledge errors, and they are recurring jobs, meaning that unless explicitly marked as SingleRun they will run once a day by default. They can be configured to run at any time interval by providing NextRun as a time expression, e.g. 2h, 5d etc.

          The last parameter these jobs support is EmfName, which provides a custom name of the entity manager factory that should be used when searching for errors to acknowledge. All of these parameters are optional.
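          Assuming you have the jBPM executor API (org.kie.api.executor) at hand, scheduling one of these jobs could look roughly like the sketch below. The parameter names come from the description above, while the concrete values (the 12h interval, the persistence unit name) are just examples:

          import org.kie.api.executor.CommandContext;
          import org.kie.api.executor.ExecutorService;

          public class ScheduleAutoAckJob {

              public void schedule(ExecutorService executorService) {
                  CommandContext ctx = new CommandContext();
                  ctx.setData("SingleRun", "false");          // keep the job recurring
                  ctx.setData("NextRun", "12h");              // run every 12 hours instead of the daily default
                  ctx.setData("EmfName", "org.jbpm.domain");  // optional - example persistence unit name

                  executorService.scheduleRequest(
                          "org.jbpm.executor.commands.error.TaskAutoAckErrorCommand", ctx);
              }
          }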

          There is a base class that is extended by the individual jobs and can be seen as the starting point for additional implementations of auto acknowledge options:
          org.jbpm.executor.commands.error.AutoAckErrorCommand

          Once extended there are two methods to be implemented:
          • protected abstract List<ExecutionErrorInfo> findErrorsToAck(EntityManager em);
          • protected abstract String getAckRule();
          The first is the most important, as it abstracts the way individual jobs find errors to be acknowledged. The second provides the rule based on which the errors were found; it is used only for logging purposes, to indicate what led to the auto acknowledgement.
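          A custom job might then look like the sketch below. The method signatures come from the list above, but the JPQL, the entity field names and the 'hiring' process id are purely illustrative assumptions - adapt them to the actual ExecutionErrorInfo mapping of your version:

          import java.util.List;
          import javax.persistence.EntityManager;
          import org.jbpm.executor.commands.error.AutoAckErrorCommand;
          import org.jbpm.runtime.manager.impl.jpa.ExecutionErrorInfo;

          public class HiringAutoAckErrorCommand extends AutoAckErrorCommand {

              @Override
              protected List<ExecutionErrorInfo> findErrorsToAck(EntityManager em) {
                  // hypothetical rule: pick up unacknowledged errors of the 'hiring' process definition
                  return em.createQuery(
                          "select e from ExecutionErrorInfo e "
                        + "where e.acknowledged = 0 and e.processId = :processId",
                          ExecutionErrorInfo.class)
                          .setParameter("processId", "hiring")
                          .getResultList();
              }

              @Override
              protected String getAckRule() {
                  // used only for logging, to explain what led to the auto acknowledgement
                  return "unacknowledged errors of the 'hiring' process definition";
              }
          }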

          Services and access to error information

          Access to error information (for the out-of-the-box storage) is provided through jBPM services. Two admin-facing services provide basic access to the error information and the ability to acknowledge errors:

          • ProcessInstanceAdminService
            • allows finding execution errors of any type, mainly focusing on search capabilities around process instances
          • UserTaskAdminService
            • allows finding Task type errors and focuses on searches around task details such as name or id
          Since the ways of looking for errors are pretty much unlimited, the above services provide only basic access. For more advanced/tailored searches, advanced queries should be used. There is an out-of-the-box query mapper available to directly produce ExecutionError instances out of the data set.
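          Using the string based lookup described earlier, finding and acknowledging errors for a single process instance could look roughly like this. The method names follow the 7.1 admin services as I recall them, so double check the signatures and imports against your version:

          import java.util.List;
          import org.jbpm.services.api.admin.ProcessInstanceAdminService;
          import org.jbpm.services.api.service.ServiceRegistry;
          import org.kie.api.runtime.query.QueryContext;
          import org.kie.internal.runtime.error.ExecutionError;

          public class ErrorAdminExample {

              public void reviewAndAcknowledge(long processInstanceId) {
                  ProcessInstanceAdminService adminService = (ProcessInstanceAdminService)
                          ServiceRegistry.get().service("ProcessInstanceAdminService");

                  // first page of unacknowledged errors for the given process instance
                  List<ExecutionError> errors = adminService.getErrorsByProcessInstanceId(
                          processInstanceId, false, new QueryContext(0, 10));

                  for (ExecutionError error : errors) {
                      // acknowledge each error once it has been reviewed/handled
                      adminService.acknowledgeError(error.getErrorId());
                  }
              }
          }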

          Similar access and capabilities are exposed over the KIE Server remote API and its client library.

          Clean up mechanism

          To keep the ExecutionErrorInfo table in good health it needs to be cleaned up from time to time. Since errors can stay there for quite some time, depending on the life cycle of the processes, there is no direct API to clean it up. Instead there is a jBPM executor command that can be scheduled for recurring execution to periodically clean up errors. The clean up command supports several options:
          • DateFormat 
            • date format for further date related params - if not given yyyy-MM-dd is used (pattern of SimpleDateFormat class)
          • EmfName 
            • name of entity manager factory to be used for queries (valid persistence unit name)
          • SingleRun 
            • indicates if execution should be single run only (true|false)
          • NextRun 
            • provides next execution time (valid time expression e.g. 1d, 5h, etc)
          • OlderThan 
            • indicates what errors should be deleted - older than given date
          • OlderThanPeriod 
            • indicates what errors should be deleted - older than the given time expression (valid time expression e.g. 1d, 5h, etc)
          • ForProcess 
            • indicates errors to be deleted only for given process definition
          • ForProcessInstance 
            • indicates errors to be deleted only for given process instance
          • ForDeployment 
            • indicates errors to be deleted that are from given deployment id
          An important note is that the command will always (regardless of the parameters given) restrict deletion to already completed/aborted process instances. If there is any other need, the command should be extended or a custom command provided.
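          The clean up command is scheduled the same way as the auto acknowledge jobs above. The command class name below is an assumption (the article does not name it), as are the parameter values, so verify them against the jbpm-executor module of your version:

          import org.kie.api.executor.CommandContext;
          import org.kie.api.executor.ExecutorService;

          public class ScheduleErrorCleanup {

              public void schedule(ExecutorService executorService) {
                  CommandContext ctx = new CommandContext();
                  ctx.setData("SingleRun", "false");       // recurring clean up
                  ctx.setData("NextRun", "7d");            // run once a week
                  ctx.setData("OlderThanPeriod", "30d");   // only remove errors older than 30 days
                  ctx.setData("ForProcess", "hiring");     // hypothetical process definition id

                  // command class name assumed - check the jbpm-executor module of your version
                  executorService.scheduleRequest(
                          "org.jbpm.executor.commands.error.ExecutionErrorCleanupCommand", ctx);
              }
          }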

          Time to see this in action

          The screen cast below shows this error handling in action. It also shows the excellent UI support for it, for which I would like to give credit to the team that worked on it - Cristiano, Neus and Rafael.

          In the screen cast you'll see a simple process that, based on a variable, either continues as expected or throws an exception. The exception is then handled as an execution error and made available to users/administrators to deal with. In addition, the screen cast illustrates the use of auto acknowledge jobs to acknowledge the errors based on various conditions. Please be patient, as there are some waiting periods in the screen cast while the jobs execute :)

          Enjoy and stay tuned for more!!!