The sailpoint log contains multiple Early Heartbeat warnings, such as:
2022-04-24T21:19:34,931 WARN HeartbeatThread sailpoint.server.HeartbeatService:636 - Early heartbeat for server myserver
2022-04-24T21:19:34,931 WARN HeartbeatThread sailpoint.server.HeartbeatService:637 - Last: 4/24/2022 21:19:32 PM PDT
2022-04-24T21:19:34,931 WARN HeartbeatThread sailpoint.server.HeartbeatService:638 - Now: 4/24/2022 21:19:34 PM PDT
2022-04-24T21:19:34,932 WARN HeartbeatThread sailpoint.server.HeartbeatService:639 - Expected: 4/24/2022 21:19:42 PM PDT
You might also see concurrency issues such as OverlappingFileLockException from the FullTextService:
2022-04-24T21:19:33,148 ERROR ServerThread sailpoint.api.FullTextifier:412 - Exception building index
java.nio.channels.OverlappingFileLockException: null
at sun.nio.ch.SharedFileLockTable.checkList(FileLockTable.java:255) ~[?:1.8.0_181]
at sun.nio.ch.SharedFileLockTable.add(FileLockTable.java:152) ~[?:1.8.0_181]
at sun.nio.ch.FileChannelImpl.tryLock(FileChannelImpl.java:1108) ~[?:1.8.0_181]
at java.nio.channels.FileChannel.tryLock(FileChannel.java:1155) ~[?:1.8.0_181]
at org.apache.lucene.store.NativeFSLockFactory.obtainFSLock(NativeFSLockFactory.java:126) ~[lucene-core-8.2.0.jar:8.2.0 31d7ec7bbfdcd2c4cc61d9d35e962165410b65fe - ivera - 2019-07-19 15:05:56]
at org.apache.lucene.store.FSLockFactory.obtainLock(FSLockFactory.java:41) ~[lucene-core-8.2.0.jar:8.2.0 31d7ec7bbfdcd2c4cc61d9d35e962165410b65fe - ivera - 2019-07-19 15:05:56]
at org.apache.lucene.store.BaseDirectory.obtainLock(BaseDirectory.java:45) ~[lucene-core-8.2.0.jar:8.2.0 31d7ec7bbfdcd2c4cc61d9d35e962165410b65fe - ivera - 2019-07-19 15:05:56]
at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:729) ~[lucene-core-8.2.0.jar:8.2.0 31d7ec7bbfdcd2c4cc61d9d35e962165410b65fe - ivera - 2019-07-19 15:05:56]
at sailpoint.fulltext.Builder.startLucene(Builder.java:299) [identityiq.jar:8.1 Build dfd55b86e66-20210421-132208]
at sailpoint.fulltext.Builder.refreshIndex(Builder.java:212) [identityiq.jar:8.1 Build dfd55b86e66-20210421-132208]
at sailpoint.fulltext.Builder.refreshAll(Builder.java:170) [identityiq.jar:8.1 Build dfd55b86e66-20210421-132208]
at sailpoint.api.FullTextifier.refreshAll(FullTextifier.java:199) [identityiq.jar:8.1 Build dfd55b86e66-20210421-132208]
at sailpoint.server.FullTextService.refresh(FullTextService.java:186) [identityiq.jar:8.1 Build dfd55b86e66-20210421-132208]
at sailpoint.server.FullTextService.execute(FullTextService.java:141) [identityiq.jar:8.1 Build dfd55b86e66-20210421-132208]
at sailpoint.server.Servicer.execute(Servicer.java:981) [identityiq.jar:8.1 Build dfd55b86e66-20210421-132208]
at sailpoint.server.Servicer.processServices(Servicer.java:932) [identityiq.jar:8.1 Build dfd55b86e66-20210421-132208]
at sailpoint.server.Servicer.run(Servicer.java:677) [identityiq.jar:8.1 Build dfd55b86e66-20210421-132208]
and other unexpected events, such as duplicate copies of the same Task running at once on the same server, which can result in all kinds of errors, including LazyInitializationException or random data corruption. You might also see performance issues, as tomcat will be using twice as much memory as normally expected.
In thread dumps (from /debug/threads.jsf) you will have multiple copies of various threads such as the RequestProcessor and the HeartbeatThread:
27 | HeartbeatThread | 10 | TIMED_WAITING | |||
|
...
178 | HeartbeatThread | 10 | TIMED_WAITING | |||
|
||||||
These duplicate HeartbeatThreads are the cause of the early heartbeat warnings.
There are two instances of IdentityIQ running on the same appserver. On tomcat, very often this is due to an attempt to change the web context of IdentityIQ from "/identityiq" to just "/". If this is done by adding a <Context> in the server.xml, for example:
<Host name="localhost" appBase="webapps" unpackWARs="true" autoDeploy="true">
<Context path="" docBase="identityiq">
</Context>
then instead of just changing the context you are actually creating a second context, with duplicate threads.
On other appservers, it could be due to having two copies of the same identityiq.war deployed.
Note: You may also get early heartbeat warnings if you have IIQ installed on multiple server and the server times are not in sync. If that is the case, simply make sure the server times get synchronised. In this case, you will likely also see warnings about the heartbeat being missed: "Servers missed a heartbeat: <list of servers that missed a heartbeat>".
There are a few ways to correct this, depending on your configuration:
Please consult with your appserver admin for the best solution for your environment.
Great writeup here, thanks! Additionally, I've seen early heartbeat warnings when the time on the server is different from the other IdentityIQ application server(s). Cheers!
What about for parallel deployments in tomcat? What are the pitfalls of this? Or has anyone tried this? I've seen this due to parallel deployments.
@walker_pa , what do you mean by parallel deployments? Do you mean multiple IIQ installs on the same server but separate tomcats, all connecting to the same IIQ database? If so then yes, those also would cause issues.
@Eric_Mendes_CISSP , yes, having different times between servers can also cause early heartbeat warnings. I'll add a note.
@stephan_dasilva I mean tomcat parallel deployments. One tomcat server installed on a server, with parallel deployments turned on. https://tomcat.apache.org/tomcat-9.0-doc/config/context.html#Parallel_deployment
@walker_pa , thanks for the link explaining about parallel deployments!
if I understand correctly how parallel deployments work, this seems like a bad idea for IdentityIQ. It sounds like you'd have different versions of your IIQ install running on the same appserver. Presumably that would be different patches, or maybe just a version with some .js or .xhtml files changed.
If it's different patch versions all connecting to the same database, I don't think the older patch would even work, since it checks the database for the patch version during startup. But if you do have multiple versions running and they are using the same IIQ database, I would expect you would get the same early heartbeat issue.
@stephan_dasilva I can definitely see how the different version would be a complication. I hadn't actually considered that, so thank you for bringing that up! I will have to build that into the process somehow.
The idea of the parallel deployment is that you can have two wars (same version/patch, of course) out there at the same time. The newer war will start up and all new connections will go to it. Eventually, all connections will be on the newer war, and the older war will be removed from the server. I was trying this out to prevent any downtime when deploying the war. I just wasn't sure if anyone at SailPoint had explored that as a possibility. It sounds like it's not a common practice, but I got curious and tried it in our lower environments. It honestly seems to work properly, but I did notice the heartbeat logs, so thought I'd ask.
@walker_pa , I see the appeal of using parallel deployments. I'm not aware of anyone having tried it.
The Early Heartbeat messages (and other unwanted consequences) are not coming from the http connections made to the deployment; they are coming from the background threads that the deployments start up to run Requests and Services. These are threads that run some process every 30s/1min/etc, one of which is the Heartbeat Service. And if you have multiple deployments, whether it be two instances of IIQ under different contexts or parallel deployments, those threads will interfere with each other.
Thank you for the article , What is the issue it was it cause ( deployment having multi task and UI server) if we ignore this warning log or just update log4j2 not to not show this log.
@vishal_kejriwal sorry for the late reply. One of the issues with having two instances if IIQ on the same server is that you have duplicate threads for some processes, which can cause Tasks and Requests for that server to run in duplicate, i.e. two separate threads running the same Task or Request. And that leads to all sorts of issues (locking, data corruption, unexpected/unexplained errors, ...)