TARMK Compaction
Compaction is needed when the TarMK files in the repository keep growing. Data in tar files is never overwritten; new data is always appended, so disk usage increases even when only existing data is updated. To keep this growth in check, AEM provides Tar Compaction, a mechanism that reclaims disk space by removing obsolete data from the repository.
There are two kinds of Tar Compaction:
➢ Online Compaction (no longer recommended as of AEM 6.2)
➢ Offline Compaction
Before going into the details of each, let us briefly discuss how the automatic compaction can be triggered manually.
Revision Clean Up:
The automatic compaction can be triggered manually in the
Operations Dashboard via a maintenance job called Revision Clean Up.
To start Revision Clean Up:
❏ Go to the AEM Welcome Screen.
❏ In the main AEM window, go to Tools → Operations → Dashboard → Maintenance.
❏ Or browse to the Maintenance page directly.
❏ The Maintenance screen is displayed.
❏ Click on the Daily Maintenance Window.
❏ Hover over the Revision Clean Up card and press the start button.
❏ Click on the “Run” icon.
❏ The icon turns orange to indicate that the Revision Clean Up job is running.
You can stop it at any time by hovering the mouse over the icon and pressing the Stop button.
Invoking Revision Garbage Collection via the JMX Console
➔ Open the JMX console.
➔ Click the RevisionGarbageCollection MBean.
➔ In the next window, click startRevisionGC() and then Invoke to start the Revision Garbage Collection job.
TAR OFFLINE COMPACTION:
NOTE: Never run offline compaction while the repository is up and running. Even the checkpoint commands should not be run while the server is up.
The procedure is called offline compaction because the repository needs to be shut down in order to run the Oak-run tool properly.
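Because offline compaction must never run against a live repository, a script can guard the oak-run calls with a check like the one below. This is only a minimal sketch: the function name aem_is_running is hypothetical, and it assumes that the AEM start script records the process ID in crx-quickstart/conf/cq.pid, which should be verified on your installation.

```shell
#!/bin/bash
# Hypothetical helper: succeed (exit 0) if an AEM process appears to be running.
# Assumption: the start script records the JVM PID in <crx-quickstart>/conf/cq.pid.
aem_is_running() {
  local pidfile="$1/conf/cq.pid"
  local pid
  # No PID file: treat the instance as stopped.
  [ -f "$pidfile" ] || return 1
  pid="$(cat "$pidfile")"
  # Empty PID file: treat the instance as stopped.
  [ -n "$pid" ] || return 1
  # kill -0 sends no signal; it only checks that the process exists
  # and can be signalled by the current user.
  kill -0 "$pid" 2>/dev/null
}
```

A compaction script could then start with a line such as: aem_is_running /data/aem/crx-quickstart && { echo "AEM is still running, aborting."; exit 1; }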
For faster compaction of the Tar files, and for situations where normal garbage collection does not work, Adobe provides a manual Tar compaction tool called Oak-run. It can be downloaded from the following location: http://mvnrepository.com/artifact/org.apache.jackrabbit/oak-run/
Note: Use the oak-run jar whose version matches the Oak version running in the AEM server, so make sure you have the right version of the oak-run jar.
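The version check in the note above can be scripted. The sketch below is illustrative only: the function names are hypothetical, it assumes the jar keeps its Maven-style file name (oak-run-<version>.jar), and it treats an exact version match as the requirement.

```shell
#!/bin/bash
# Hypothetical helper: print the version embedded in an oak-run jar file name.
# Assumption: the jar keeps its Maven-style name, oak-run-<version>.jar.
oakrun_version_from_jar() {
  local name
  name="$(basename "$1")"
  name="${name#oak-run-}"          # strip the leading "oak-run-"
  printf '%s\n' "${name%.jar}"     # strip the trailing ".jar"
}

# Hypothetical helper: succeed only if the jar version equals the Oak version
# of the running repository (looked up by hand and passed as second argument).
oakrun_matches() {
  [ "$(oakrun_version_from_jar "$1")" = "$2" ]
}
```

For example, oakrun_matches /folderPath/aem/oak-run-jar/oak-run-1.2.7.jar 1.2.7 succeeds, while the same jar checked against 1.0.18 fails.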
The procedure to run the tool is:
➔ Always make sure you have a recent working backup of the AEM instance.
➔ Shut down AEM.
➔ Use the tool to find old checkpoints:
java -jar oak-run.jar checkpoints <aem-inst-folder>/crx-quickstart/repository/segmentstore
➔ The command lists the checkpoint references; the unreferenced ones need to be cleared before compaction takes place.
➔ Delete the unreferenced checkpoints using the command below:
java -jar oak-run.jar checkpoints <aem-inst-folder>/crx-quickstart/repository/segmentstore rm-unreferenced
➔ Finally, run the compaction and wait for it to complete:
java -jar oak-run.jar compact <aem-inst-folder>/crx-quickstart/repository/segmentstore
➔ The command first lists all the files under the segmentstore directory. After a while, a cleaning up message appears.
➔ Once the cleanup finishes, the tar files are listed again; there should be fewer of them after compaction.
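To confirm that compaction actually reclaimed space, the number and total size of the tar files can be recorded before and after the run. The helper below is a minimal sketch; segmentstore_summary is a hypothetical name, and it only looks at files ending in .tar directly inside the given directory.

```shell
#!/bin/bash
# Hypothetical helper: print "<tar file count> <total bytes>" for a segmentstore.
segmentstore_summary() {
  local dir="$1" count=0 bytes=0 size f
  for f in "$dir"/*.tar; do
    [ -f "$f" ] || continue        # skip the literal pattern when nothing matches
    size="$(wc -c < "$f")"         # size of this tar file in bytes
    count=$((count + 1))
    bytes=$((bytes + size))
  done
  printf '%s %s\n' "$count" "$bytes"
}
```

Calling segmentstore_summary <aem-inst-folder>/crx-quickstart/repository/segmentstore once before the compact command and once after it gives two lines that can be compared directly.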
The output of the compaction can be written to a log file by using the configuration below on the server where the compaction runs.
Creating the configuration is a one-time activity; the command given after it then logs the information and errors to the log file.
➔ Create the config file below in the same place as the oak-run-*.jar file.
➔ Name it “logback-compaction.xml” and add the lines below to it.
<configuration>
  <appender name="STDERR" class="ch.qos.logback.core.ConsoleAppender">
    <target>System.err</target>
    <encoder>
      <pattern>%d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %msg%n</pattern>
    </encoder>
  </appender>
  <logger name="org.apache.jackrabbit.oak.plugins.segment.Compactor" level="INFO"/>
  <root level="warn">
    <appender-ref ref="STDERR" />
  </root>
</configuration>
➔ Now run the command below to log the errors and information to the log file.
nohup java -Dtar.memoryMapped=true -Dupdate.limit=5000000 -Dcompaction-progress-log=1500000 -Dcompress-interval=10000000 -Doffline-compaction=true -Dlogback.configurationFile=logback-compaction.xml -Xmx10g -jar <oak jar file path> compact <aem-installation-path>/crx-quickstart/repository/segmentstore > tarcompaction.log 2>&1
➔ The file “tarcompaction.log” should now be present in the place where you created the above XML config, provided the command was run from that directory.
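Once the run has finished, the log can be scanned for problems instead of being read line by line. The function below is a rough sketch with a hypothetical name; it assumes that failures show up either as ERROR-level lines or as Java exception or out-of-memory messages, which may not cover every failure mode.

```shell
#!/bin/bash
# Hypothetical helper: print how many lines of a compaction log look like problems.
# Assumption: problems appear as ERROR-level lines, exceptions, or OutOfMemoryError.
compaction_log_problems() {
  # grep -c prints the count; "|| true" keeps the exit status 0 when the count is 0.
  grep -c -E 'ERROR|Exception|OutOfMemoryError' "$1" || true
}
```

For example, compaction_log_problems tarcompaction.log prints 0 for a log without such lines.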
TAR OFFLINE COMPACTION PREREQUISITES:
· How to find the correct version of the oak-run jar?
Go to the Felix console (/system/console/bundles), search for oak, and note the version shown against each Oak bundle.
Find Check Points:
----------------------
· java -jar /folderPath/aem/oak-run.jar checkpoints F:/AEM/AEM-4502/crx-quickstart/repository/segmentstore
Run the command below on a Linux/Unix machine, as Windows does not support the -Dtar.memoryMapped option.
· java -Dtar.memoryMapped=true -Xmx8g -jar /folderPath/aem/oak-run-jar/oak-run-1.2.7.jar checkpoints /folderPath/aem/crx-quickstart/repository/segmentstore
Remove CheckPoints:
--------------------------
· java -Dtar.memoryMapped=true -Xmx4g -jar /folderPath/aem/oak-run-jar/oak-run-1.2.7.jar checkpoints /folderPath/aem/crx-quickstart/repository/segmentstore rm-unreferenced
Finally Run Compact:
-------------------------
· java -Dtar.memoryMapped=true -Xmx8g -jar /folderPath/aem/oak-run-jar/oak-run-1.2.7.jar compact /folderPath/aem/crx-quickstart/repository/segmentstore >> /folderPath/aem/help/logs/compactLog
Running Script File:
· Go to the folder that contains the script file.
· Running the command below executes all three commands above (finding checkpoints, deleting checkpoints, compacting), which are incorporated in that script file.
Command:
./scriptFileName
ONLINE COMPACTION:
For situations where AEM cannot be shut down for maintenance, compaction can also be performed while the instance is running. This is called Online Compaction.
You can configure Online Compaction by doing the following:
➔ Go to the folder where AEM is installed, then browse to crx-quickstart\install
➔ Open the org.apache.jackrabbit.oak.plugins.segment.SegmentNodeStoreService.config file.
➔ If it doesn’t exist, create it, and add the following lines to the configuration file:
repository.home=${repository.home}/segmentstore
tarmk.size=256
pauseCompaction=false
➔ Restart AEM
To verify that the configuration has taken effect:
➔ Go to the JMX console in your browser.
➔ Search for CompactionStrategy and click the MBean that shows up in the search.
➔ Next, verify that the value for PausedCompaction is set to false. This confirms that online compaction is set to run.
➔ Next, verify that Online Compaction is running properly. You can do this by first going to the Operations Dashboard and checking which time interval is configured for the Daily Maintenance Window. By default, it is scheduled to run between 2 and 5 AM.
➔ Now, inspect the error.log file for events logged during the daily maintenance window to see if online compaction ran correctly.
➔ Example Log:
[TarMK compaction thread [/author/crx-quickstart/repository/segmentstore], active since Thu Mar 19 02:00:10 EDT 2015, previous max duration 1369831ms] org.apache.jackrabbit.oak.plugins.segment.file.FileStore TarMK compaction started
19.03.2015 02:00:30.441 *INFO* [pool-9-thread-2] com.adobe.granite.taskmanagement.impl.jcr.TaskArchiveService archiving tasks at: 'Thu Mar 19 02:00:30 EDT 2015'
19.03.2015 02:01:01.699 *INFO* [TarMK compaction thread [/author/crx-quickstart/repository/segmentstore], active since Thu Mar 19 02:00:10 EDT 2015, previous max duration 1369831ms] org.apache.jackrabbit.oak.plugins.segment.file.FileStore Estimated compaction in 51.47 s, gain is 69% (1018859520/3343598080) or (1.0 GB/3.3 GB), so running compaction
➔ Log entry confirming that online compaction has completed:
[TarMK compaction thread [/author/crx-quickstart/repository/segmentstore], active since Thu Mar 19 02:00:10 EDT 2015, previous max duration 1369831ms] org.apache.jackrabbit.oak.plugins.segment.file.FileStore TarMK compaction completed in 1310939ms
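Checking error.log for these messages can be scripted. The sketch below uses hypothetical function names and takes its match strings from the example log entries above; it assumes each event sits on a single line in the real log file.

```shell
#!/bin/bash
# Hypothetical helper: print the TarMK compaction events found in a log file.
# The match strings come from the example log entries in this article.
compaction_events() {
  grep -E 'TarMK compaction (started|completed)|Estimated compaction' "$1" || true
}

# Hypothetical helper: succeed only if the log reports a completed compaction.
compaction_completed() {
  grep -q 'TarMK compaction completed' "$1"
}
```

For example, compaction_completed <aem-inst-folder>/crx-quickstart/logs/error.log || echo "No completed compaction found" flags a maintenance window in which compaction did not finish.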
How to automate the offline compaction process.
Offline Tar Compaction is still the Adobe-recommended way of compacting Oak.
The script below automates the entire process.
Before running it, download a version of Oak Run that matches your repository version.
Steps to follow:
- Shut down AEM
- Find old checkpoints
- Remove unreferenced checkpoints
- Compact Oak (using the compact keyword in the command)
- Restart AEM
SCRIPT:
#!/bin/bash
# Date stamp used in the log file name
todayDate="$(date +'%d-%m-%Y')"
logfile="compact-$todayDate.log"
installfolder="/data/aem"
aemfolder="$installfolder/crx-quickstart"
oakrun="$installfolder/help/oak-run-1.0.18.jar"
## Shutdown AEM
printf "Shutting down AEM.\n"
$aemfolder/bin/stop
todayDate="$(date)"
echo "AEM Shutdown at: $todayDate" >> $installfolder/help/logs/$logfile
## Find old checkpoints
printf "Finding old checkpoints.\n"
java -Dtar.memoryMapped=true -Xms8g -Xmx8g -jar $oakrun checkpoints $aemfolder/repository/segmentstore >> $installfolder/help/logs/$logfile
## Delete unreferenced checkpoints
printf "Deleting unreferenced checkpoints.\n"
java -Dtar.memoryMapped=true -Xms8g -Xmx8g -jar $oakrun checkpoints $aemfolder/repository/segmentstore rm-unreferenced >> $installfolder/help/logs/$logfile
## Run compaction
printf "Running compaction. This may take a while.\n"
java -Dtar.memoryMapped=true -Xms8g -Xmx8g -jar $oakrun compact $aemfolder/repository/segmentstore >> $installfolder/help/logs/$logfile
## Report Completed
printf "Compaction complete. Please check the log at:\n"
printf "$installfolder/help/logs/$logfile\n"
## Start AEM back up
todayDate="$(date)"
printf "Starting up AEM.\n"
$aemfolder/bin/start
echo "AEM Startup at: $todayDate" >> $installfolder/help/logs/$logfile