Workflow Purging for AEM Performance Optimization

Workflow Purging:

A new workflow instance is created every time a workflow is launched (for example, on asset upload or asset publishing).

• Once a workflow instance completes (successfully, aborted, or terminated), it is archived and is never deleted automatically.
• Workflow purging is needed to clean up these archived workflow instances.
• Purging can be based on three criteria:
  • Workflow model
  • Completion status
  • Age of the workflow instance
• To purge workflows manually, execute the purgeCompleted operation on the com.adobe.granite.workflow MBean.

Use the “Adobe Granite Workflow Purge Configuration” in the OSGi configuration to set up automatic workflow purging.

In older versions of Adobe CQ5/AEM, purging old completed workflows required either writing a custom job or installing a package provided by Adobe/Day Care. From AEM 5.6.1 onwards, this functionality is built in.

Workflow instances are stored as nodes in the AEM/CQ repository. On long-running AEM instances where website users or automated jobs can start workflows, this quickly adds large amounts of content to the repository.

This slows down the AEM server and increases disk usage.

   ENABLING WORKFLOW PURGE SCHEDULER

Automatic purging is not preconfigured in AEM; it has to be enabled explicitly.
We can create multiple configurations of the service to purge workflow instances that satisfy different criteria. For example:

• First configuration: purges instances of a particular workflow model that have been running for longer than expected.
• Second configuration: purges all completed workflows after a certain number of days to minimize the size of the repository.

• Create a Workflow Purge Scheduler instance. The Workflow Purge Scheduler is a Sling service factory, so the configuration is added under the service PID: com.adobe.granite.workflow.purge.Scheduler

• Because the service is a factory service, the name of the sling:OsgiConfig node requires an identifier suffix, for example: com.adobe.granite.workflow.purge.Scheduler-myidentifier

Add the following properties to the node:

| Property Name | OSGi Property Name | Description |
|---|---|---|
| Job Name | scheduledpurge.name | A descriptive name for the scheduled purge. |
| Workflow Status | scheduledpurge.workflowStatus | The status of the workflow instances to purge. Valid values: COMPLETED (completed workflow instances are purged) and RUNNING (running workflow instances are purged). |
| Models To Purge | scheduledpurge.modelIds | The IDs of the workflow models to purge. The ID is the path to the model node, for example /etc/workflow/models/dam/update_asset/jcr:content/model. Specify no value to purge instances of all workflow models. To specify multiple models, click the + button in the Web Console. |
| Workflow Age | scheduledpurge.daysold | The age of the workflow instances to purge, in days. |

Once the configuration is deployed as a content package, it shows up under the Workflow Purge Scheduler in the OSGi console, and workflows older than the configured number of days are purged.


   Configure Workflow Purge Scheduler in a Package
Deploying the configuration as part of the package deployment process is another good approach. With Apache Sling's OSGi configuration support, this takes only a simple XML file.
• Create an XML file under a path such as /apps/[my-app]/config
• Name the file com.adobe.granite.workflow.purge.Scheduler-[some-arbitrary-id].xml

Add the content below to the XML file and replace the values with your own configuration details:
<?xml version="1.0" encoding="UTF-8"?>
<jcr:root xmlns:sling="http://sling.apache.org/jcr/sling/1.0" xmlns:jcr="http://www.jcp.org/jcr/1.0"
    jcr:primaryType="sling:OsgiConfig"
    scheduledpurge.name="Purge All Completed Workflows"
    scheduledpurge.modelIds="[]"
    scheduledpurge.workflowStatus="COMPLETED"
    scheduledpurge.cron="0 0 * * * ?"
    scheduledpurge.daysold="30"/>

Once the configuration is saved, the scheduler registers the cron schedule and purges workflows older than the number of days specified in the configuration. The expression 0 0 * * * ? in scheduledpurge.cron fires at the start of every hour.
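
The first configuration described earlier (purging instances of one workflow model that have been running too long) could be sketched in the same format. This is an illustration under assumptions, not a recommended setting: the job name and the 7-day threshold are made up for the example, the model path is the one quoted in the property table, and the file would need its own identifier suffix (for instance com.adobe.granite.workflow.purge.Scheduler-longrunning.xml, a hypothetical name).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: the job name and the daysold value are illustrative assumptions. -->
<!-- RUNNING targets instances that are still in progress, so pick the age threshold with care. -->
<jcr:root xmlns:sling="http://sling.apache.org/jcr/sling/1.0" xmlns:jcr="http://www.jcp.org/jcr/1.0"
    jcr:primaryType="sling:OsgiConfig"
    scheduledpurge.name="Purge Long Running Update Asset Workflows"
    scheduledpurge.modelIds="[/etc/workflow/models/dam/update_asset/jcr:content/model]"
    scheduledpurge.workflowStatus="RUNNING"
    scheduledpurge.cron="0 0 * * * ?"
    scheduledpurge.daysold="7"/>
```

Because the scheduler is a factory service, this file can sit next to the "Purge All Completed Workflows" file in the same config folder, and each one becomes a separate scheduler instance.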

We can also use the JMX console to purge completed workflows from the repository.
If the list of archived workflows grows too large, purge those older than a certain age; this speeds up page loading.

In AEM 6.0, workflow purging is a built-in feature, available both through the configurable scheduler and through JMX at the link below:
http://hostName:portNo/system/console/jmx/com.adobe.granite.workflow%3Atype%3DMaintenance
When invoking the purge operation there, specify the number of days of workflows to purge along with a few other parameters.

For More Info: