Troubleshooting
This section provides information to help you troubleshoot Autonomous Identity. It covers where to access the logs, how to change the Docker root folder, and how to tune Cassandra for large data.
More troubleshooting tips will be added in the future.
Where to access the logs
Autonomous Identity captures information in its log files to help you troubleshoot problems.
Deployment Logs
When you run the Ansible playbook during the deployment, logs print to your screen (STDOUT). Additional information is available through the -v or --verbose option. For more information, try -vvv. To enable connection debugging, try -vvvv.
The Cassandra install log file (installcassandra.log) is located at /data/opt/autoid/cassandra.
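To keep a copy of the verbose deployment output for later review, the playbook run can be piped through tee. The following is a minimal sketch, not part of Autonomous Identity: the run_logged helper, the log path, and the playbook and inventory names in the usage comment are all hypothetical placeholders.

```shell
# Hypothetical helper: run a command and append everything it prints
# (STDOUT and STDERR) to a log file while still showing it on screen.
# Note: the return status is that of tee, not of the command itself.
run_logged() {
    log_file=$1
    shift
    "$@" 2>&1 | tee -a "$log_file"
}

# Example (playbook and inventory names are placeholders):
#   run_logged /tmp/autoid-deploy.log ansible-playbook -i hosts deploy.yml -vvv
```

Appending with tee -a keeps the output of earlier runs in the same file, which can help when comparing a failed run with a later successful one.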
Front-End Logs
You can view any output logs of the running services on Docker using the following commands:
$ docker service logs <SERVICE NAME> --follow
$ docker service ps <SERVICE NAME> --no-trunc
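When it is not clear which service is failing, it may be quicker to look at the most recent log lines of every service in one pass. The sketch below assumes the services run in Docker swarm mode, as the commands above imply. The tail_all_services helper is a hypothetical name. It relies only on the standard docker service ls --format and docker service logs --tail options.

```shell
# Hypothetical helper: print the last N log lines (default 20) of every
# Docker service known to the swarm.
tail_all_services() {
    lines=${1:-20}
    docker service ls --format '{{.Name}}' | while read -r svc; do
        echo "=== $svc ==="
        # Redirect stdin so the docker command cannot consume the service list.
        docker service logs --tail "$lines" "$svc" 2>&1 </dev/null
    done
}

# Example:
#   tail_all_services 50
```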
Cassandra Logs
You can view the output logs of the Cassandra database, which starts at system startup. Autonomous Identity pipes the output messages to a log file in the standard installation folder.
Log | Location |
---|---|
Standard Cassandra log | /data/opt/autoid/cassandra.out |
Backup Log | /data/opt/autoid/cassandra/cassandra-backup/cassandra-backup.log |
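A quick way to triage either log is to count and list the lines that carry an ERROR or WARN level. The following is a minimal sketch, assuming the log lines include the level as plain text. The summarize_log helper is a hypothetical name, and the default path is the standard Cassandra log location from the table above.

```shell
# Hypothetical helper: count ERROR and WARN entries in a log file and show
# the ten most recent ones.
summarize_log() {
    log_file=${1:-/data/opt/autoid/cassandra.out}
    if [ ! -r "$log_file" ]; then
        echo "Cannot read $log_file" >&2
        return 1
    fi
    echo "ERROR lines: $(grep -c 'ERROR' "$log_file")"
    echo "WARN lines:  $(grep -c 'WARN' "$log_file")"
    # Show the ten most recent problem lines, if any.
    grep -E 'ERROR|WARN' "$log_file" | tail -n 10
}

# Example:
#   summarize_log /data/opt/autoid/cassandra/cassandra-backup/cassandra-backup.log
```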
Spark UI Logs
If the Spark UI is not available on port 8080 of the Spark master server, then do the following:
-
Check the Spark start-up logs. Check whether the Spark UI is running on a port other than the default port 8080, or whether another service is using the port.
-
If the UI is not accessible, run some curl commands to check the cores and memory in the cluster:
$ curl -s https://<ip-address>:8080 | grep -A 2 'Memory in use'
$ curl -s https://<ip-address>:8080 | grep -A 2 'Cores in use'
For more information, refer to the Spark REST API.
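The curl commands above return raw HTML, which can be hard to read in a terminal. As a small convenience, the output can be passed through a filter that removes the markup. The strip_html helper below is a hypothetical name, and the sketch assumes only that the page marks up its values with ordinary HTML tags.

```shell
# Hypothetical helper: strip HTML tags, leading whitespace, and blank lines
# from whatever is piped in, so the grep output is easier to read.
strip_html() {
    sed -e 's/<[^>]*>//g' -e 's/^[[:space:]]*//' -e '/^$/d'
}

# Example (replace <ip-address> with the Spark master address):
#   curl -s https://<ip-address>:8080 | grep -A 2 'Cores in use' | strip_html
```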
How to change the Docker root folder
Docker stores its images under the /var folder (by default, in /var/lib/docker). Customers who mount /var with low storage can run out of disk space quickly.
To change the Docker root folder:
-
Stop the Docker service:
$ sudo systemctl stop docker.service
$ sudo systemctl stop docker.socket
-
Edit the Docker service file, and add a -g option to the ExecStart line to redirect the root folder to another location. On Docker releases that no longer accept -g, use --data-root in its place:
$ sudo vi /usr/lib/systemd/system/docker.service
ExecStart=/usr/bin/dockerd -g /opt/autoid/docker -H fd:// --containerd=/run/containerd/containerd.sock
-
Make a new folder for the Docker root if needed:
$ sudo mkdir -p /opt/autoid/docker
-
Copy all of the content from the old Docker root folder to the new Docker root folder (/opt/autoid/docker in this example):
$ sudo rsync -aqxP /var/lib/docker/* /opt/autoid/docker/.
-
Reload the system daemon:
$ sudo systemctl daemon-reload
-
Start the Docker service:
$ sudo systemctl start docker
-
Make sure Docker is running with the right arguments. The output should show the dockerd process with the new root folder set:
$ ps aux | grep -i docker | grep -v grep
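Before copying the old Docker root in the procedure above, it may be worth confirming that the target filesystem has enough free space. The following is a minimal sketch using du and df. The fits_on_target helper is a hypothetical name, and the paths in the usage comment are the ones used in this example.

```shell
# Hypothetical helper: succeed if the contents of SRC fit in the free space
# of the filesystem that holds DST. Sizes are reported in kilobytes.
fits_on_target() {
    src=$1
    dst=$2
    need_kb=$(du -sk "$src" | awk '{print $1}')
    # df -P prints a fixed POSIX layout; column 4 is the available space.
    free_kb=$(df -Pk "$dst" | awk 'NR==2 {print $4}')
    echo "Need ${need_kb} KB, ${free_kb} KB available on $dst"
    [ "$need_kb" -lt "$free_kb" ]
}

# Example (run as root so that du can read the Docker folder):
#   fits_on_target /var/lib/docker /opt/autoid && echo "OK to copy"
```

After Docker restarts, docker info --format '{{.DockerRootDir}}' offers another way to confirm which root folder the daemon is using.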
Tune Cassandra for large data
You can tune Cassandra for large data sets or when Cassandra times out during analytics.
-
Navigate to the Cassandra configuration folder:
$ cd /opt/autoid/apache-cassandra-3.11.2/conf
-
Edit the jvm.options file, and change the Java heap size and the heap size for the young generation as follows:
-Xms10G
-Xmx10G
-Xmn2800M
-
Edit the cassandra.yaml file, and change the settings as follows:
$ vi cassandra.yaml
key_cache_size_in_mb: 1000
key_cache_save_period: 34400
max_mutation_size_in_kb: 65536
commitlog_segment_size_in_mb: 128
read_request_timeout_in_ms: 200000
write_request_timeout_in_ms: 200000
request_timeout_in_ms: 200000
counter_write_request_timeout_in_ms: 200000
cas_contention_timeout_in_ms: 50000
truncate_request_timeout_in_ms: 600000
slow_query_log_timeout_in_ms: 50000
concurrent_writes: 256
commitlog_compression:
  - class_name: LZ4Compressor
-
After saving the file, restart the Cassandra and Docker jobs:
-
First, find the Cassandra process and note its PID:
$ ps -ef | grep cassandra
-
Kill the Cassandra process, replacing <PID> with the PID that you found:
$ kill -9 <PID>
-
Make sure no Cassandra process is running:
$ ps -ef | grep cassandra
-
Restart Cassandra:
$ cd /opt/autoid/apache-cassandra-3.11.2/bin
$ nohup ./cassandra > /opt/autoid/apache-cassandra-3.11.2/cassandra.out 2>&1 &
-
Check that the following informational message, or a similar one, is present in the Cassandra output:
INFO [main] 2022-01-24 23:38:26,207 Gossiper.java:1701 - No gossip backlog; proceeding
-
Restart Docker:
$ sudo systemctl restart docker
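The stop and verify sub-steps above can be scripted. The sketch below is one possible approach, not the documented procedure. It sends a regular termination signal first and uses kill -9 only as a fallback, so Cassandra has a chance to shut down cleanly, and it then polls the output file for the gossip message. The helper names stop_gracefully and wait_for_log_line and the timeout values are hypothetical.

```shell
# Hypothetical helper: ask a process to exit with SIGTERM, and fall back to
# SIGKILL (kill -9) only if it is still running after TIMEOUT seconds.
stop_gracefully() {
    pid=$1
    timeout=${2:-30}
    kill "$pid" 2>/dev/null || return 0      # process is already gone
    while [ "$timeout" -gt 0 ]; do
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        timeout=$((timeout - 1))
    done
    kill -9 "$pid" 2>/dev/null
    return 0
}

# Hypothetical helper: poll a log file until PATTERN appears or TIMEOUT
# seconds pass. Returns 0 if the pattern was found, 1 otherwise.
wait_for_log_line() {
    log_file=$1
    pattern=$2
    timeout=${3:-120}
    while [ "$timeout" -gt 0 ]; do
        if grep -q "$pattern" "$log_file" 2>/dev/null; then
            return 0
        fi
        sleep 1
        timeout=$((timeout - 1))
    done
    return 1
}

# Example (replace <PID> with the Cassandra PID):
#   stop_gracefully <PID> 30
#   wait_for_log_line /opt/autoid/apache-cassandra-3.11.2/cassandra.out 'No gossip backlog' 120
```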