Run Analytics

Autonomous Identity administrators perform several tasks to run analytics.

The following sections describe the basic tasks to run the analytics pipeline.

Ingest the Data Files

At this point, you should have set your data sources and configured your attribute mappings. You can now run the initial analytics job to import the data into the Cassandra or MongoDB database.

Run ingest using the UI:

  1. On the Autonomous Identity UI, click the Administration link, and then click Jobs.

  2. On the Jobs page, click New Job. You will see a job schedule with each job in the analytics pipeline.

  3. Click Ingest, and then click Next.

  4. In the New Ingest Job box, enter the name of the job, and then select the data source file.

  5. Click Advanced and adjust any of the Spark properties, if necessary:

    • Driver Memory (GB)

    • Driver Cores

    • Executor Memory (GB)

    • Executor Cores

  6. Click Save to continue.

  7. If necessary, click one of the following commands before you run the job:

    1. If you need to edit any of the job settings, click Edit.

    2. If you want to remove the job from your Jobs page, click Delete job.

  8. Click Run Now to start the ingestion run.

  9. Next, monitor the state of the job by clicking Logs, or click Refresh to update the Jobs page.

  10. When the job completes, you can see the change in the status.
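The four Advanced settings in step 5 share their names with standard Spark configuration properties. The following minimal sketch assumes that correspondence and uses a hypothetical helper, `spark_conf_args`, to show how such values are commonly expressed as `spark-submit` options. It illustrates the meaning of the settings only; it is not how Autonomous Identity itself launches jobs.

```python
def spark_conf_args(driver_memory_gb, driver_cores, executor_memory_gb, executor_cores):
    """Build spark-submit --conf arguments from the four Advanced values.

    The mapping from UI fields to these Spark property names is an assumption.
    """
    conf = {
        "spark.driver.memory": f"{driver_memory_gb}g",      # Driver Memory (GB)
        "spark.driver.cores": str(driver_cores),            # Driver Cores
        "spark.executor.memory": f"{executor_memory_gb}g",  # Executor Memory (GB)
        "spark.executor.cores": str(executor_cores),        # Executor Cores
    }
    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


# Example values only; size them for your own deployment.
print(" ".join(spark_conf_args(2, 1, 4, 2)))
```

Raising executor memory or cores generally helps large data sets, at the cost of cluster resources available to other jobs.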


Run Training

After you have ingested the data into Autonomous Identity, start the training run.

Training involves two steps:

  • Autonomous Identity starts an initial machine learning run where it analyzes the data and produces association rules, which are relationships discovered within your large set of data. In a typical deployment, you can have several million generated rules. The training process can take time depending on the size of your data set.

  • Each of these rules is mapped from the user attributes to the entitlements and assigned a confidence score.

The initial training run may take time as it goes through the analysis process. Once it completes, it saves the results directly to the database.
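To make the idea of a rule with a confidence score concrete, the following sketch uses the textbook association-rule definition: the confidence of "attribute implies entitlement" is the share of users with that attribute who also hold the entitlement. The sample users, the single-attribute rules, and the function name `rule_confidences` are illustrative assumptions; the product's actual training algorithm is not described here and may differ.

```python
from collections import Counter

# Hypothetical sample data: each user has a set of attributes and entitlements.
users = [
    {"attrs": {"dept=Finance"}, "ents": {"ledger_read"}},
    {"attrs": {"dept=Finance"}, "ents": {"ledger_read", "ledger_write"}},
    {"attrs": {"dept=Finance"}, "ents": {"ledger_read"}},
    {"attrs": {"dept=Finance"}, "ents": set()},
    {"attrs": {"dept=IT"}, "ents": {"admin_console"}},
]


def rule_confidences(users):
    """Return {(attribute, entitlement): confidence} for single-attribute rules."""
    attr_counts = Counter()   # number of users that have each attribute
    pair_counts = Counter()   # number of users that have attribute AND entitlement
    for user in users:
        for attr in user["attrs"]:
            attr_counts[attr] += 1
            for ent in user["ents"]:
                pair_counts[(attr, ent)] += 1
    return {pair: n / attr_counts[pair[0]] for pair, n in pair_counts.items()}


for (attr, ent), conf in sorted(rule_confidences(users).items()):
    print(f"{attr} -> {ent}: {conf:.2f}")
```

In this toy data, three of the four Finance users hold `ledger_read`, so that rule scores 0.75, while `ledger_write` scores 0.25.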

Run training using the UI:

  1. On the Autonomous Identity UI, click the Administration link, and then click Jobs.

  2. On the Jobs page, click New Job. You will see a job schedule with each job in the analytics pipeline.

  3. Click Training, and then click Next.

  4. In the New Training Job box, enter the name of the job, and then select the data source file.

  5. Click Advanced and adjust any of the Spark properties, if necessary.

  6. Click Save to continue.

  7. Click Run Now.

  8. Next, monitor the state of the job by clicking Logs, or click Refresh to update the Jobs page.

  9. When the job completes, you can see the change in the status.


Run Recommendations

During the second phase of the predictions process, the recommendations process analyzes each employee who may not have a particular entitlement and predicts the access rights that they should have according to their high-confidence justifications. These rules are then displayed in the UI and saved directly to the database.
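The recommendation idea can be sketched as follows: for a user who lacks an entitlement, look for rules that match the user's attributes and keep those whose confidence clears a threshold. The rule table, the 0.8 threshold, and the function name `recommend` are hypothetical and chosen for illustration; they are not taken from the product.

```python
# Hypothetical rules: (attribute, entitlement) -> confidence score.
rules = {
    ("dept=Finance", "ledger_read"): 0.90,
    ("title=Analyst", "ledger_read"): 0.85,
    ("dept=Finance", "ledger_write"): 0.30,
    ("dept=IT", "admin_console"): 0.95,
}


def recommend(user_attrs, user_ents, rules, threshold=0.8):
    """Return {entitlement: (confidence, justifying attribute)} for missing entitlements."""
    best = {}
    for (attr, ent), conf in rules.items():
        # Only consider rules that match the user and entitlements the user lacks.
        if attr in user_attrs and ent not in user_ents and conf >= threshold:
            if ent not in best or conf > best[ent][0]:
                best[ent] = (conf, attr)
    return best


print(recommend({"dept=Finance", "title=Analyst"}, set(), rules))
```

Here the user is recommended `ledger_read`, justified by the Finance rule, while `ledger_write` falls below the assumed threshold and is not recommended.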

Run predict-recommendation using the UI:

  1. On the Autonomous Identity UI, click the Administration link, and then click Jobs.

  2. On the Jobs page, click New Job. You will see a job schedule with each job in the analytics pipeline.

  3. Click Predict-Recommendation, and then click Next.

  4. In the New Predict-Recommendation Job box, enter the name of the job, and then select the data source file.

  5. Click Advanced and adjust any of the Spark properties, if necessary.

  6. Click Save to continue.

  7. Click Run Now.

  8. Next, monitor the state of the job by clicking Logs, or click Refresh to update the Jobs page.

  9. When the job completes, you can see the change in the status.


Run As-Is Predictions

After your initial training run, the association rules are saved to disk. The next phase is to use these rules as a basis for the predictions module.

The predictions module consists of two different processes:

  • As-is. During the as-is prediction process, confidence scores are assigned to the entitlements that users do not have. The as-is process maps the highest confidence score to the highest freqUnion rule for each user-entitlement access. These rules are then displayed in the UI and saved directly to the database.

  • Recommendations. See Run Recommendations.
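One way to read the as-is mapping is as a selection among competing rules: for a given user-entitlement pair, take the matching rule with the highest confidence, and break ties in favor of the higher freqUnion value. The sketch below implements that reading only. Treating freqUnion as a per-rule frequency count, the rule layout, and the function name `best_rule` are all assumptions made for illustration.

```python
# Hypothetical rules; "freq_union" stands in for the freqUnion value of a rule.
rules = [
    {"attrs": frozenset({"dept=Finance"}), "ent": "ledger_read", "conf": 0.90, "freq_union": 120},
    {"attrs": frozenset({"dept=Finance", "title=Analyst"}), "ent": "ledger_read", "conf": 0.90, "freq_union": 40},
    {"attrs": frozenset({"title=Analyst"}), "ent": "ledger_read", "conf": 0.70, "freq_union": 300},
    {"attrs": frozenset({"dept=IT"}), "ent": "ledger_read", "conf": 0.99, "freq_union": 10},
]


def best_rule(user_attrs, entitlement, rules):
    """Pick the matching rule with the highest confidence, then the highest freq_union."""
    matching = [
        r for r in rules
        if r["ent"] == entitlement and r["attrs"] <= user_attrs  # rule attributes are a subset
    ]
    if not matching:
        return None  # no rule justifies this user-entitlement pair
    return max(matching, key=lambda r: (r["conf"], r["freq_union"]))


chosen = best_rule({"dept=Finance", "title=Analyst"}, "ledger_read", rules)
print(chosen["conf"], chosen["freq_union"])
```

The IT rule has the highest confidence overall but does not match this user, so the selection falls to the two 0.90 rules and the one with the larger freqUnion value is chosen.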

Run predict as-is using the UI:

  1. On the Autonomous Identity UI, click the Administration link, and then click Jobs.

  2. On the Jobs page, click New Job. You will see a job schedule with each job in the analytics pipeline.

  3. Click Predict-As-Is, and then click Next.

  4. In the New Predict-As-Is Job box, enter the name of the job, and then select the data source file.

  5. Click Advanced and adjust any of the Spark properties, if necessary.

  6. Click Save to continue.

  7. Click Run Now.

  8. Next, monitor the state of the job by clicking Logs, or click Refresh to update the Jobs page.

  9. When the job completes, you can see the change in the status.


Publish the Analytics Data

Populate the output of the training, predictions, and recommendation runs to a large table with all assignments and justifications for each assignment. The table data is then pushed to the Cassandra or MongoDB backend.

Run publish using the UI:

  1. On the Autonomous Identity UI, click the Administration link, and then click Jobs.

  2. On the Jobs page, click New Job. You will see a job schedule with each job in the analytics pipeline.

  3. Click Publish, and then click Next.

  4. In the New Publish Job box, enter the name of the job, and then select the data source file.

  5. Click Advanced and adjust any of the Spark properties, if necessary.

  6. Click Save to continue.

  7. If necessary, click Edit to change any of the job settings, or click Delete job to remove the job from your Jobs page.

  8. Click Run Now.

  9. Next, monitor the state of the job by clicking Logs, or click Refresh to update the Jobs page.

  10. When the job completes, you can see the change in the status.


Create Assignment Index

Next, generate the Elasticsearch index for the system.

Run create-assignment-index using the UI:

  1. On the Autonomous Identity UI, click the Administration link, and then click Jobs.

  2. On the Jobs page, click New Job. You will see a job schedule with each job in the analytics pipeline.

  3. Click Create Assignment Index, and then click Next.

  4. In the New Create Assignment Index Job box, enter the name of the job, and then select the data source file.

  5. Click Advanced and adjust any of the Spark properties, if necessary.

  6. Click Save to continue.

  7. Click Run Now.

  8. Next, monitor the state of the job by clicking Logs, or click Refresh to update the Jobs page.

  9. When the job completes, you can see the change in the status.


Run Insight Report

Next, run an insight report on the rules and predictions generated during the training and predictions runs. The analytics command generates insight_report.txt and insight_report.xlsx and writes them to the /data/input/spark_runs/reports directory.

The report provides the following insights:

  • Number of assignments received, scored, and unscored.

  • Number of entitlements received, scored, and unscored.

  • Number of assignments scored above 80% and below 5%.

  • Distribution of assignment confidence scores.

  • List of the high volume, high average confidence entitlements.

  • List of the high volume, low average confidence entitlements.

  • Top 25 users with more than 10 entitlements.

  • Top 25 users with more than 10 entitlements and confidence scores greater than 80%.

  • Top 25 users with more than 10 entitlements and confidence scores less than 5%.

  • Breakdown of all applications and confidence scores of their assignments.

  • Supervisors with most employees and confidence scores of their assignments.

  • Top 50 role owners by number of assignments.

  • List of the "Golden Rules," high confidence justifications that apply to a large volume of people.
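Several of these insights are simple aggregates over assignment confidence scores. The sketch below shows how the received, scored, and unscored counts, the above-80% and below-5% counts, and a score distribution could be computed from a list of scores. The input format (scores between 0 and 1, with `None` for unscored assignments), the ten-bucket distribution, and the function name `summarize_scores` are assumptions; the report's actual calculations may differ.

```python
from collections import Counter


def summarize_scores(scores):
    """Summarize confidence scores; None marks an unscored assignment."""
    scored = [s for s in scores if s is not None]
    # Bucket scores into ten 10%-wide bins; a score of 1.0 goes in the top bin.
    buckets = Counter(min(int(s * 10), 9) for s in scored)
    return {
        "received": len(scores),
        "scored": len(scored),
        "unscored": len(scores) - len(scored),
        "above_80": sum(s > 0.80 for s in scored),
        "below_5": sum(s < 0.05 for s in scored),
        "distribution": {f"{b * 10}-{b * 10 + 10}%": buckets.get(b, 0) for b in range(10)},
    }


# Hypothetical sample scores.
summary = summarize_scores([0.95, 0.82, 0.5, 0.03, None, 0.01, 0.81])
for key, value in summary.items():
    print(key, value)
```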

Run the insight report using the UI:

  1. On the Autonomous Identity UI, click the Administration link, and then click Jobs.

  2. On the Jobs page, click New Job. You will see a job schedule with each job in the analytics pipeline.

  3. Click Insight, and then click Next.

  4. In the New Insight Job box, enter the name of the job, and then select the data source file.

  5. Click Advanced and adjust any of the Spark properties, if necessary.

  6. Click Save to continue.

  7. Click Run Now.

  8. Next, monitor the state of the job by clicking Logs, or click Refresh to update the Jobs page.

  9. When the job completes, you can see the change in the status.

  10. Access the insight report. The report is available at /data/output/reports/insight_report.xlsx.

Run Anomaly Report

Autonomous Identity provides a report on any anomalous entitlement assignments that have a low confidence score but are for entitlements that have a high average confidence score. The report’s purpose is to identify true anomalies rather than poorly managed entitlements.

The report does the following:

  • Identifies potential anomalous assignments.

  • Identifies the number of users who fall below a low confidence score threshold. For example, if 100 people all have low confidence score assignments to the same entitlement, then it is likely not an anomaly. The entitlement is either missing data or the assignment is poorly managed.
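The anomaly logic described above can be sketched as a two-part filter: flag low-confidence assignments only when the entitlement's average confidence is high, and skip entitlements where many users fall below the low threshold. The thresholds (`low`, `high_avg`, `max_low_users`), the input tuples, and the function name `find_anomalies` are hypothetical values chosen for illustration, not the product's settings.

```python
from collections import defaultdict


def find_anomalies(assignments, low=0.05, high_avg=0.8, max_low_users=10):
    """assignments: iterable of (user, entitlement, confidence) tuples."""
    by_ent = defaultdict(list)
    for user, ent, conf in assignments:
        by_ent[ent].append((user, conf))

    anomalies = []
    for ent, rows in by_ent.items():
        avg = sum(conf for _, conf in rows) / len(rows)
        low_users = [user for user, conf in rows if conf < low]
        # A true anomaly: the entitlement scores well on average, and only a
        # few users hold it with a very low confidence score.
        if avg >= high_avg and 0 < len(low_users) <= max_low_users:
            anomalies.extend((user, ent) for user in low_users)
    return anomalies


# Hypothetical data: one outlier on a well-managed entitlement, plus a
# poorly managed entitlement where every assignment scores low.
sample = (
    [(f"user{i}", "vpn", 0.95) for i in range(9)]
    + [("mallory", "vpn", 0.02)]
    + [(f"temp{i}", "legacy_app", 0.02) for i in range(3)]
)
print(find_anomalies(sample))
```

Only the single low-scoring `vpn` assignment is flagged; the `legacy_app` assignments are all low, which points to missing data or poor management rather than an anomaly.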

Run the anomaly report using the UI:

  1. On the Autonomous Identity UI, click the Administration link, and then click Jobs.

  2. On the Jobs page, click New Job. You will see a job schedule with each job in the analytics pipeline.

  3. Click Anomaly, and then click Next.

  4. In the New Anomaly Job box, enter the name of the job, and then select the data source file.

  5. Click Advanced and adjust any of the Spark properties, if necessary.

  6. Click Save to continue.

  7. Click Run Now to start the anomaly report run.

  8. Next, monitor the state of the job by clicking Logs, or click Refresh to update the Jobs page.

  9. When the job completes, you can see the change in the status.

  10. Access the anomaly report. The report is available at /data/output/reports/anomaly_report/<report-id>.csv.