TAR Compaction for AEM Performance Optimization

TARMK Compaction

We need compaction when the TarMK files in the repository keep growing. Tar file data is never overwritten; new data is always appended, so disk usage increases even when only existing content is updated. To avoid such growth in the repository,

AEM provides Tar Compaction. This mechanism reclaims disk space by removing obsolete data from the repository.

We have two kinds of Tar Compaction:
     Online Compaction (no longer recommended as of AEM 6.2)
     Offline Compaction

Before jumping into those details, let us briefly discuss the automatic compaction that can be triggered manually.

Revision Clean Up: 

               The automatic compaction can be triggered manually in the Operations Dashboard via a maintenance job called Revision Clean Up.

To start Revision Clean Up you need to:

     Go to the AEM Welcome Screen.
     In the main AEM window, go to Tools → Operations → Dashboard → Maintenance, or browse to the Maintenance page directly.
     Click the Daily Maintenance Window card.
     Hover over the Revision Clean Up card and click the Run icon.
     The icon turns orange to indicate that the Revision Clean Up job is running.

You can stop it at any time by hovering the mouse over the icon and pressing the Stop button.

  
Invoking Revision Garbage Collection via the JMX Console
     Open the JMX console at http://<host>:<port>/system/console/jmx
     Click the RevisionGarbageCollection MBean.
     In the next window, click startRevisionGC() and then Invoke to start the Revision Garbage Collection job.
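
For automation, the same operation can also be invoked over HTTP through the Felix JMX console. The sketch below is only an illustration: the credentials, host, port, and especially the URL-encoded MBean object name are assumptions that vary with the AEM/Oak version, so copy the exact object name from the JMX console URL on your own instance.

# Hypothetical example: trigger startRevisionGC() without opening the UI
curl -u admin:admin -X POST "http://localhost:4502/system/console/jmx/<url-encoded-RevisionGarbageCollection-object-name>/op/startRevisionGC/"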

TAR OFFLINE COMPACTION:

NOTE: Never run offline compaction while the repository is up and running. The checkpoint commands should not be run while the server is up either.

The procedure is called offline compaction because the repository needs to be shut down in order to properly run the Oak-run tool.

For faster compaction of the Tar files, and for situations where normal garbage collection does not work, Adobe provides a manual Tar compaction tool called oak-run. It can be downloaded at the following location: http://mvnrepository.com/artifact/org.apache.jackrabbit/oak-run/

Note: The oak-run jar must match the Oak version running in the AEM server, so make sure you have the right version of the oak-run jar.

The procedure to run the tool is:
     Always make sure you have a recent working backup of the AEM instance
     Shut down AEM
     Use the tool to find old checkpoints:

java -jar oak-run.jar checkpoints <aem-inst-folder>/crx-quickstart/repository/segmentstore


     Running the above command lists the checkpoint references, which need to be cleared before compaction takes place.

     Now delete the unreferenced checkpoints using the command below.
            java -jar oak-run.jar checkpoints <aem-inst-folder>/crx-quickstart/repository/segmentstore rm-unreferenced


     Finally, run the compaction and wait for it to complete:
java -jar oak-run.jar compact <aem-inst-folder>/crx-quickstart/repository/segmentstore

     The command first lists all files under the segmentstore directory. After a while a "Cleaning up" message appears, and once cleanup finishes the tar files are listed again, with a lower count after compaction.
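
     To check the effect, the tar file count and segmentstore size can be compared before and after compaction. A minimal sketch, assuming the default repository layout:

# Count the tar files and show the total size of the segment store
ls -1 <aem-inst-folder>/crx-quickstart/repository/segmentstore/*.tar | wc -l
du -sh <aem-inst-folder>/crx-quickstart/repository/segmentstore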

We can also capture all of these changes in a log file by using the configuration below on the server where the compaction is run.

This is a one-time activity; the command further below then logs the information/errors to the log file.

     Create the config file below in the same place as the oak-run-*.jar file.
     Name it "logback-compaction.xml" and add the lines below to the XML file.
           
<configuration>
  <appender name="STDERR" class="ch.qos.logback.core.ConsoleAppender">
    <target>System.err</target>
    <encoder>
      <pattern>%d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n</pattern>
    </encoder>
  </appender>
  <logger name="org.apache.jackrabbit.oak.plugins.segment.Compactor" level="INFO"/>
  <root level="warn">
    <appender-ref ref="STDERR" />
  </root>
</configuration>


     Now run the command below to log the errors/info to the log file.
nohup java -Dtar.memoryMapped=true -Dupdate.limit=5000000 -Dcompaction-progress-log=1500000 -Dcompress-interval=10000000 -Doffline-compaction=true -Dlogback.configurationFile=logback-compaction.xml -Xmx10g -jar <oak jar file path> compact <aem-installation-path> > tarcompaction.log 2>&1

     You should now see a file named "tarcompaction.log" in the same place where you created the XML config above.
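
     Because the command runs under nohup in the background, progress can be followed by tailing that log file, for example:

tail -f tarcompaction.log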

TAR OFFLINE COMPACTION PREREQUISITES:

·                   How to find the correct version of the oak-run jar?
                    Go to the Felix console (/system/console/bundles), search for oak, and note the version shown for the Oak bundles.
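
                    The installed Oak version can also be read from the command line via the Felix console's bundle JSON endpoint. A sketch only, assuming admin credentials and the default port; verify the endpoint and output format on your own instance:

# Hypothetical example: read the version of the oak-core bundle
curl -s -u admin:admin "http://localhost:4502/system/console/bundles/org.apache.jackrabbit.oak-core.json" | grep -o '"version":"[^"]*"' | head -1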

      Find Checkpoints
      ----------------
·  java -jar /folderPath/aem/oak-run.jar checkpoints F:/AEM/AEM-4502/crx-quickstart/repository/segmentstore

    Run the command below on a Linux/Unix machine, as Windows does not support the -Dtar option:
·  java -Dtar.memoryMapped=true -Xmx8g -jar /folderPath/aem/oak-run-jar/oak-run-1.2.7.jar checkpoints /folderPath/aem/crx-quickstart/repository/segmentstore


      Remove Checkpoints
      ------------------
·  java -Dtar.memoryMapped=true -Xmx4g -jar /folderPath/aem/oak-run-jar/oak-run-1.2.7.jar checkpoints /folderPath/aem/crx-quickstart/repository/segmentstore rm-unreferenced

      Finally Run Compact
      -------------------
·  java -Dtar.memoryMapped=true -Xmx8g -jar /folderPath/aem/oak-run-jar/oak-run-1.2.7.jar compact /folderPath/aem/crx-quickstart/repository/segmentstore >> /folderPath/aem/help/logs/compactLog


Running the Script File:

·     Go to the folder that contains the script file.
·     Running the command below executes all three of the above steps (finding checkpoints, deleting checkpoints, compacting), which are incorporated within the script file; a sample invocation follows the command.

Command:
                ./scriptFileName
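
A typical invocation, assuming the script is named compact.sh (the file name and the nohup wrapping are just one way to run it):

chmod +x compact.sh
nohup ./compact.sh > compact-run.out 2>&1 &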


ONLINE COMPACTION:

                   For situations where AEM cannot be shut down for maintenance, compaction can also be performed while the instance is running. This is called Online Compaction.

You can configure Online Compaction by doing the following:
     Go to the folder where AEM is installed, then browse to crx-quickstart\install
     Open the org.apache.jackrabbit.oak.plugins.segment.SegmentNodeStoreService.config file.
     If that file doesn't exist, create it, then add the following lines to the configuration file:
                        repository.home=${repository.home}/segmentstore
            tarmk.size=256
            pauseCompaction=false
     Restart AEM

To verify that the configuration has taken effect:
  Go to the JMX console by pointing your browser to http://<host>:<port>/system/console/jmx
  Search for CompactionStrategy and click the MBean that shows up in the search.
  Next, verify that the value of PausedCompaction is set to false. This confirms that online compaction is set to run.
  Next, verify that Online Compaction is running properly. You can do this by first going to the Operations Dashboard and checking what time interval is configured for the Daily Maintenance Window. By default, it is scheduled to run between 2 and 5 AM.
  Now inspect the error.log file for events logged during the daily maintenance window to see whether online compaction ran correctly.
  Example Log:
   [TarMK compaction thread [/author/crx-quickstart/repository/segmentstore], active since Thu Mar 19 02:00:10 EDT 2015, previous max duration 1369831ms] org.apache.jackrabbit.oak.plugins.segment.file.FileStore TarMK compaction started
19.03.2015 02:00:30.441 *INFO* [pool-9-thread-2] com.adobe.granite.taskmanagement.impl.jcr.TaskArchiveService archiving tasks at: 'Thu Mar 19 02:00:30 EDT 2015'
19.03.2015 02:01:01.699 *INFO* [TarMK compaction thread [/author/crx-quickstart/repository/segmentstore], active since Thu Mar 19 02:00:10 EDT 2015, previous max duration 1369831ms] org.apache.jackrabbit.oak.plugins.segment.file.FileStore Estimated compaction in 51.47 s, gain is 69% (1018859520/3343598080) or (1.0 GB/3.3 GB), so running compaction

     Log entry confirming that online compaction completed:

                [TarMK compaction thread [/author/crx-quickstart/repository/segmentstore], active since Thu Mar 19 02:00:10 EDT 2015, previous max duration 1369831ms] org.apache.jackrabbit.oak.plugins.segment.file.FileStore TarMK compaction completed in 1310939ms
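
     A quick way to pull these compaction entries out of the log, assuming the default error.log location under crx-quickstart/logs:

grep "TarMK compaction" <aem-inst-folder>/crx-quickstart/logs/error.log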


How to automate the offline compaction process

Offline Tar Compaction is still the Adobe-recommended way of compacting Oak.
Below is a script that automates the entire process.
For this, download a version of oak-run that matches your repository version.

Steps to follow:
  1. Shutdown AEM
  2. Find Old Checkpoints
  3. Remove Unreferenced Checkpoints
  4. Compact Oak (using compact keyword in command).
  5. Restart AEM
SCRIPT:
#!/bin/bash
todayDate="$(date +'%d-%m-%Y')"
logfile="compact-$ todayDate.log"
installfolder="/data/aem"
aemfolder="$installfolder/crx-quickstart"
oakrun="$installfolder/help/oak-run-1.0.18.jar"

## Shutdown AEM
printf "Shutting down AEM.\n"
$aemfolder/bin/stop
todayDate="$(date)"
echo "AEM Shutdown at: $todayDate" >> $installfolder/help/logs/$logfile

## Find old checkpoints
printf "Finding old checkpoints.\n"
java -Dtar.memoryMapped=true -Xms8g -Xmx8g -jar $oakrun checkpoints $aemfolder/repository/segmentstore >> $installfolder/help/logs/$logfile

## Delete unreferenced checkpoints
printf "Deleting unreferenced checkpoints.\n"
java -Dtar.memoryMapped=true -Xms8g -Xmx8g -jar $oakrun checkpoints $aemfolder/repository/segmentstore rm-unreferenced >> $installfolder/help/logs/$logfile

## Run compaction
printf "Running compaction. This may take a while.\n"
java -Dtar.memoryMapped=true -Xms8g -Xmx8g -jar $oakrun compact $aemfolder/repository/segmentstore >> $installfolder/help/logs/$logfile

## Report Completed
printf "Compaction complete. Please check the log at:\n"
printf "$installfolder/help/logs/$logfile\n"

## Start AEM back up
todayDate="$(date)"
printf "Starting up AEM.\n"
$aemfolder/bin/start
echo "AEM Startup at: $ todayDate " >> $installfolder/help/logs/$logfile




Workflow Purging for AEM Performance Optimization

Workflow Purging:

A new workflow instance is created every time we launch a workflow (for asset upload, asset publishing, etc.).

• Once the workflow completes (successful, aborted, or terminated), it is archived and never gets deleted.
• Workflow purging needs to be done to clean up archived workflow instances.
• Purging can be done based on 3 criteria:
    Workflow model
    Completion status
    Age of the workflow instance
• To manually purge workflows, execute the purgeCompleted operation on the MBean com.adobe.granite.workflow.

Use the “Adobe Granite Workflow Purge Configuration” in the OSGi configuration console to configure automatic workflow purging.

In older versions of Adobe CQ5/AEM, purging old completed workflows required either writing a custom job or installing a package provided by Adobe/Day Care. From AEM 5.6.1 onwards this functionality is built in.

Workflow instances are stored as nodes inside AEM/CQ. On long-running instances where website users or automated jobs launch workflows, this quickly adds large amounts of content to the repository.

This slows down the AEM server and also grows disk usage.
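
To get a feel for how much archived workflow content has accumulated, the instance nodes can be counted with a QueryBuilder request. This is only a sketch; the admin credentials, port, and the classic /etc/workflow/instances storage location are assumptions, and the count is read from the "total" field of the JSON response.

# Hypothetical example: count workflow instance nodes under /etc/workflow/instances
curl -s -u admin:admin "http://localhost:4502/bin/querybuilder.json?path=/etc/workflow/instances&type=cq:Workflow&p.limit=1"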

   ENABLING THE WORKFLOW PURGE SCHEDULER

This is not a pre-configured feature in AEM; we need to enable AEM to automatically purge workflows.
We can create two configurations of the service to purge workflow instances that satisfy different criteria:

·  First configuration: purges instances of a particular workflow model that have been running longer than expected.
·  Second configuration: purges all completed workflows after a certain number of days to minimize the size of the repository.

·  Create a Workflow Purge Scheduler instance. The Workflow Purge Scheduler is a Sling service factory; its configuration is added under the service PID: com.adobe.granite.workflow.purge.Scheduler

·  Because the service is a factory service, the name of the sling:OsgiConfig node requires an identifier suffix, for example: com.adobe.granite.workflow.purge.Scheduler-myidentifier

Add the below properties on the node:

Job Name (scheduledpurge.name): A descriptive name for the scheduled purge.

Workflow Status (scheduledpurge.workflowStatus): The status of the workflow instances to purge. The following values are valid:
  • COMPLETED: Completed workflow instances are purged.
  • RUNNING: Running workflow instances are purged.

Models To Purge (scheduledpurge.modelIds): The IDs of the workflow models to purge. The ID is the path to the model node, for example /etc/workflow/models/dam/update_asset/jcr:content/model. Specify no value to purge instances of all workflow models. To specify multiple models, click the + button in the Web Console.

Workflow Age (scheduledpurge.daysold): The age of the workflow instances to purge, in days.

Once we deploy this configuration as a content package, it shows up under the Workflow Purge Scheduler in the OSGi console, and workflows older than the number of days specified in the configuration are purged.


   Configure the Workflow Purge Scheduler in a Package
Deploying a configuration as part of the package deployment process is also a good approach. Through Apache Sling's OSGi configurations, we can do this with the simple XML file below.
·         Create an XML file under a path such as /apps/[my-app]/config
·         Set the name of the file to: com.adobe.granite.workflow.purge.Scheduler-[some-arbitrary-id].xml

Add the content below to the XML file and replace the values with your configuration details:
<?xml version="1.0" encoding="UTF-8"?>
<jcr:root xmlns:sling="http://sling.apache.org/jcr/sling/1.0"
    xmlns:jcr="http://www.jcp.org/jcr/1.0"
    jcr:primaryType="sling:OsgiConfig"
    scheduledpurge.name="Purge All Completed Workflows"
    scheduledpurge.modelIds="[]"
    scheduledpurge.workflowStatus="COMPLETED"
    scheduledpurge.cron="0 0 * * * ?"
    scheduledpurge.daysold="30" />

Once this configuration is saved, the scheduler runs on the configured cron schedule and purges workflows older than the number of days we specified.

We can also use the JMX console to purge the repository of completed workflows.
If the list of archived workflows grows too large, purging those past a certain age speeds up page loading.

In AEM 6.0, the workflow purge scheduler is also available as a built-in feature accessible via JMX and a configurable scheduler at the link below, where you specify the number of days of workflows to purge and a few more parameters:
http://hostName:portNo/system/console/jmx/com.adobe.granite.workflow%3Atype%3DMaintenance

For More Info: