Identity Cloud

Administration

The key administrator tasks are to run the training pipeline, tune the AI/ML models, and adjust the risk configuration.

ForgeRock Professional Services (FPS) runs these tasks, as any misconfiguration can result in failed operations.

Training

The training pipeline is a multi-step, automated process that generates the machine learning (ML) models.

Initially, Autonomous Access’s training pipeline runs without a model on a default, pre-seeded data source. The training pipeline uses this input data and heuristics to generate ML models, and iteratively repeats its processing to improve the accuracy of those models.

When training completes, you must tune the models for accuracy and performance. After the models are tuned, the Autonomous Access engine polls for a configuration change and downloads the new model. The engine then runs a risk explainability job to refresh the model.

From this point, Autonomous Access begins to use the new ML model.

Summary:

  • Verify data source. The initial step is to define a training dataset, or data source. For initial deployments, Autonomous Access uses a default, pre-seeded data source, autoaccess-ds, which pulls in data from the data lake. You only need to verify that autoaccess-ds is present and active.

  • Run training. Run the training pipeline. When training starts, Autonomous Access runs its heuristics; the heuristics produce meaningful results as soon as a few events have been collected.

  • Tune the models. Tune the models for training performance if necessary. If you tune the models, rerun the training pipeline.

  • Publish the model. Publish the model to save it and put it into use.

Verify the data source

You can skip this section for Identity Cloud tenants. This section is presented for information only.

Autonomous Access automatically uses an out-of-the-box data source, autoaccess-ds, that accesses the customer’s data lake within the Identity Cloud tenant’s cloud storage for ML training. You do not need to define any data sources in this case.

The out-of-the-box data source also does not require attribute mapping. You simply verify the data source on the Autonomous Access Identity Cloud UI (see Run training) when setting up your training run.


Autonomous Access uses cloud storage data for its ML training runs and Elasticsearch data for its heuristic predictions. The Autonomous Access result journey node collects data, storing it in cloud storage and storing risky events in Elasticsearch.

The general guidelines for customer data storage are as follows:

  • Three months of access logs. The Autonomous Access activity dashboard displays the anomalous accesses occurring over the past three months. As a result, Autonomous Access requires three months or more of data in Elasticsearch.

  • Google Cloud storage. Autonomous Access requires six months of customer data or data with 1000 or more events for optimal AI/ML analytics results.

  • Secure data. All customer data resides within each customer’s private tenant and cannot be accessed from outside the tenant.

Verify the default data source (new deployments):
  1. On the Autonomous Access UI, click Risk Administration > Data Sources.

  2. Verify that the autoaccess-ds is present and activated.

You do not have to set the mapping as it is already configured. Next, see Run training.

Run training

Using the default data source, autoaccess-ds, run the training pipeline on the UI.

The general guidance for when you can run your first training pipeline is as follows:

  • Six months of customer data. For optimal results, Autonomous Access requires six months of customer data, or data with 1000 or more events. Data collections with fewer than 1000 events will not yield good ML results.

The training pipeline takes time to complete because it runs the machine learning process iteratively.
Run training:
  1. On the Identity Cloud UI, go to Risk Administration > Pipelines.

  2. Click Add Pipeline.


  3. On the Add Pipeline dialog box, enter the following information:

    1. Name. Enter a descriptive name for the training pipeline.

    2. Data Source. Select the data source to use for the pipeline.

    3. Select Type. Select Training. The dialog displays model settings that you can change if you are familiar with machine learning; a sketch illustrating the main hyperparameters follows this procedure.

      • Model A. Model A is a neural-network module used to encode data. You can configure the following:

        • Batch size. The batch size of a dataset in MB.

        • Epochs. An epoch is one complete pass of the ML algorithm through the entire training dataset; this setting specifies how many passes to run.

        • Learning rate. The learning rate determines how much to adjust the model each time its weights are updated; it sets the step size of each iterative training pass.

      • Model B. Model B is a neural-network module used to manage data points. You can configure the following:

        • Batch size. The batch size of a dataset in MB. The smaller the batch size, the longer the training runs, but the results may be better.

        • Epochs. An epoch is one complete pass of the ML algorithm through the entire training dataset; this setting specifies how many passes to run.

        • Learning rate. The learning rate determines how much to adjust the model each time its weights are updated; it sets the step size of each iterative training pass.

      • Model C. Model C is a neural-network module that groups data points into clusters. You can configure the following:

        • Max number_of_clusters. The maximum number of clusters for the dataset.

        • Min number_of_clusters. The minimum number of clusters for the dataset.

      • Embeddings. The Embeddings module is a meta-model that converts fields into numbers the ML models can understand. Autonomous Access trains the embedding model first, and then trains the other models on top of it.

        • Embedding dimension. The embedding dimension determines how many numbers (tokens) to use in the encoding. The default is 20, which is sufficient for most Autonomous Access applications.

        • Learning rate. The learning rate determines how much to adjust the model each time its weights are updated; it sets the step size of each iterative training pass.

        • Window. The window indicates how far ahead and behind to look in the data. The minimum is 1; the default is 5.

  4. Click Save.

  5. Click the trailing dots, click Run Pipeline, and then click Run. Depending on the size of your data source and how you configured your pipeline settings, the training run may take time to process.

  6. Upon a successful run, you will see a Succeeded status.
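
To illustrate how these settings interact, the following minimal sketch runs a generic mini-batch training loop. It is an illustration only, not the Autonomous Access implementation: the linear model, synthetic data, and loss function are hypothetical stand-ins that show the roles of batch size, epochs, learning rate, and the embeddings window.

    import numpy as np

    # Hypothetical stand-ins: a linear model trained with mini-batch gradient
    # descent on synthetic data. Only the hyperparameter roles matter here.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 8))
    y = rng.normal(size=1000)

    batch_size = 32       # events per weight update (smaller: slower, often better)
    epochs = 20           # complete passes through the entire training dataset
    learning_rate = 0.01  # step size applied to each weight update

    w = np.zeros(8)
    for epoch in range(epochs):
        order = rng.permutation(len(X))              # reshuffle every epoch
        for start in range(0, len(X), batch_size):
            batch = order[start:start + batch_size]
            error = X[batch] @ w - y[batch]
            grad = X[batch].T @ error / len(batch)
            w -= learning_rate * grad                # learning rate scales the step
        print(f"epoch {epoch + 1}: loss={np.mean((X @ w - y) ** 2):.4f}")

    # Embeddings "window": how far ahead and behind to look around each event.
    events = ["login", "mfa", "app1", "app2", "logout"]
    i, window = 2, 1
    context = events[max(0, i - window):i] + events[i + 1:i + 1 + window]
    print(context)  # ['mfa', 'app2']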

Tuning Training

Autonomous Access supports the ability to tune the AI/ML training models for greater accuracy. There are three things you can check to evaluate ML model performance:

  • Training logs. Each model generates a metadata file that shows the train_losses and val_losses (validation losses). Train loss indicates how well the model fits the training data; validation loss indicates how well the model fits new, unseen data. The losses tend to move down from a high value (near 1) to a low value (toward 0). Also check how closely the train losses track the validation losses: the bigger the gap between the two loss numbers, the more likely the model memorizes the training set but does not generalize well. To narrow this gap, you can increase the number of epochs and make the learning rate smaller, which fits the model more closely to the data. For the embedding model, you can increase the window size and decrease the learning rate to improve results. A sketch of this loss-gap check follows this list.

    If you are seeing a big gap between the training loss and validation loss, the model may have too many parameters and be overfitting the data. One possible solution is to reduce the embedding dimension. Contact the ForgeRock data scientists for assistance.
  • ROC curve and confusion matrix. You can view the receiver operator characteristics (ROC) curve and confusion matrix on the UI. For an explanation of these terms, see Training model terms.

  • Risk Configuration. Autonomous Access lets you fine-tune the backend Autonomous Access server. For more details, see Risk configuration.
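
As a rough illustration of the loss-gap check described in the first item above, the following sketch reads a model’s training metadata and compares the final training and validation losses. The file name and JSON layout (train_losses and val_losses arrays, one value per epoch) are assumptions for illustration; the actual metadata format may differ.

    import json

    # Assumed layout: {"train_losses": [...], "val_losses": [...]}.
    # "model_a_metadata.json" is a hypothetical file name.
    with open("model_a_metadata.json") as f:
        meta = json.load(f)

    train_loss = meta["train_losses"][-1]  # final-epoch training loss
    val_loss = meta["val_losses"][-1]      # final-epoch validation loss
    gap = val_loss - train_loss

    print(f"train={train_loss:.3f} val={val_loss:.3f} gap={gap:.3f}")
    if gap > 0.1:  # illustrative cutoff, not a product default
        print("Large gap: possible overfitting. Try more epochs, a smaller "
              "learning rate, or a smaller embedding dimension.")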

ForgeRock Professional Services staff will tune your training models. Any misconfiguration can result in failed operations.
Tune the training models:
  1. On the Pipelines page, click the dots next to a training run, and then click View Logs.

  2. Click the dots next to a training run, and then click View Run Details.

  3. On the Training Execution Details page, click the dots, and then click Results. The training results display.

  4. Tune each model by adjusting the threshold. Select a model from the drop-down list, and adjust the threshold to view the optimal balance of parameters in the confusion matrix on the Decision node. You can view the graphs of the following models:

    • Ensemble. Displays the best average of all charts in one view.

    • Model A. Displays the Model A charts.

    • Model B. Displays the Model B charts.

    • Model C. Displays the Model C charts.

  5. To close the dialog, click OK.

  6. Finally, if you are satisfied with the models' performance, click Publish to save the training model. Once published, you can only overwrite it with another training run.


Training model terms

The following training model terms are presented to help you better understand the tuning models:

  • Receiver Operator Characteristics. A receiver operator characteristic (ROC) curve is a graphical plot that shows the tradeoffs between true positives and false positives of a model as the risk threshold changes. The x-axis shows the false positive rate (FPR), the probability of a false alarm; the y-axis shows the true positive rate (TPR), the probability of correct detection. The diagonal line represents a random classifier: points above the diagonal represent good classification results; points below the diagonal represent bad results. The ROC curve with blue points is a probability curve. The ideal representation is called a perfect classification, where the point (0,1) indicates no false positives and 100% true positives, and the graph looks like an upside-down "L". The area under the curve (AUC) represents how well the model can distinguish between the two classes. The higher the AUC, the better the model is at distinguishing between a risky threat and no threat.


  • Confusion Matrix. A confusion matrix is a 2x2 table (also called a binary classification table) that aggregates the ML model’s correct and incorrect predictions. The horizontal axis is the actual results; the vertical axis is the predicted results. Note that each prediction in a confusion matrix represents one point on the ROC curve.


  • True Positive. A true positive is an outcome where the model correctly identifies an actual risky threat as a risky threat.

  • True Negative. A true negative is an outcome where the model correctly identifies a non-risky threat as a non-risky threat.

  • False positive. A false positive is an outcome where the model incorrectly identifies a non-risky threat as a risky threat.

  • False negative. A false negative is an outcome where the model incorrectly identifies a risky threat as a non-risky threat.

  • True Positive Rate (TPR). The probability of detection, calculated as TP/(TP+FN), where TP is the number of true positives and FN is the number of false negatives. The rate is the probability that a positive threat is predicted when the actual result is positive. The TPR is also known as recall.

  • False Positive Rate (FPR). The probability of a false alarm, calculated as FP/(FP+TN), where FP is the number of false positives and TN is the number of true negatives. "FP+TN" is the total number of negatives. The rate is the probability that a false alarm will be raised: a positive threat is predicted when the actual result is negative.

  • Positive Predictive Value (PPV). The rate calculated as TP/(TP+FP), where TP is the number of true positives and FP is the number of false positives. "TP+FP" is the total number of predicted positives. The rate is the probability that a predicted positive is a true positive. The ideal PPV with a perfect prediction is 1 (100%); the worst is zero. The PPV is also known as precision.

  • Area under the curve. On a scale from 0 to 1, the area under the curve (AUC) shows how well the model distinguishes between a true positive and a false negative. The closer the AUC is to 1, the better the model.

  • Centroids. On Model C, the centroids represent typical users and describe the profiles of known users.

  • Precision Recall. The precision recall curve is a graphical plot that shows the tradeoff between precision and recall for different threshold values. The y-axis plots the positive predictive value (PPV), or precision; the x-axis plots the true positive rate (TPR), or recall. A model with ideal results is at the point (1,1). The precision recall curve is a different way to view the model’s performance; most users can rely on the ROC curve and confusion matrix.

The Ensemble model is the best average of the three models and should show better results than any individual model. The Model C chart is a choppier step graph.
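
The following sketch computes the rates defined above (TPR, FPR, and PPV) directly from the four confusion-matrix counts, using the same formulas as the definitions. The counts are hypothetical values chosen for illustration.

    # Hypothetical confusion-matrix counts at one risk threshold.
    tp, fn = 80, 20   # actual risky events, correctly and incorrectly classified
    fp, tn = 5, 895   # actual non-risky events, incorrectly and correctly classified

    tpr = tp / (tp + fn)  # true positive rate (recall): probability of detection
    fpr = fp / (fp + tn)  # false positive rate: probability of a false alarm
    ppv = tp / (tp + fp)  # positive predictive value (precision)

    print(f"TPR={tpr:.3f} FPR={fpr:.3f} PPV={ppv:.3f}")
    # Sweeping the risk threshold and plotting each (FPR, TPR) pair traces the
    # ROC curve; the area under that curve is the AUC.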

Configuration

Customers in different industry verticals require varying risk policies and heuristics in their applications. ForgeRock designed Autonomous Access’s risk configuration with this in mind.

To enable easy configuration of its risk parameters, Autonomous Access stores its risk configuration settings in a YAML-based file. You can modify the parameters to determine how risk is evaluated and what response Autonomous Access sends back to its node. The Autonomous Access server polls the configuration file every ten minutes (default) for changes.

Customers should not edit this file unless they know what they are doing; misconfiguration of the file can result in an inoperable service. The ForgeRock Professional Services group will make the appropriate risk configuration modifications.

Risk configuration

The risk configuration page provides an extensible, performant server configuration that gives you full control of your Autonomous Access system. There are two main sections to focus on: heuristics enablement and risk thresholds.

Edit the risk configuration:
  1. On the Autonomous Access UI, go to Risk Administration > Risk Config.

  2. In the heuristics section of the file, set a heuristic to false if you want to disable it. For example, in the code snippet below, change the value next to "machine_to_machine_done" from true to false to disable the automated user agent filter.

    - name: automatedUserAgentsFilter
      withParallelStep:
        joinStep: Check if ready to respond
        postAction: >-
          let request = .request
    
          let response = .response
    
          $request | {
            "stepStatus" : {
              "machine_to_machine_done" : true, * : .
            },
            "heuristic_agg_result" : {
              "risk_score_data" : {
                "raw_results" : .heuristic_agg_result.risk_score_data.raw_results + [$response]
              }, * : .
            }, * : .
          }
        actions:
          - action: automatedUserAgent(.)
  3. In the heuristics config section of the file, edit any of the threshold parameters for each heuristic if necessary; a sketch illustrating how these windowed thresholds behave follows this procedure.

      "heuristics_config": {
        "single_line": {
          "machine_to_machine": {
            "MACHINE_TO_MACHINE_HEURISTIC_ENABLED": true
          }
        },
        "HEURSITIC_RISK_THRESSHOLD": 95.0,
        "multi_line": {
          "brute_force": {
            "BRUTE_FORCE_ENABLED": true,
            "BRUTE_FORCE_WINDOW_MS": 300000,
            "BRUTE_FORCE_COUNT_THRESHOLD": 10,
            "BRUTE_FORCE_RISK_SCORE": 100.0
          },
          "rarity_vectorizer": {
            "RARITY_VECTORIZER_ENABLED": false,
            "MIN_EVENT_COUNT_FOR_ZERO_RISK_SCORE": 20,
            "RARITY_RISK_SCORE_COMPUTE_STRATEGY": "softmax"
          },
          "credential_stuffing": {
            "CREDENTIAL_STUFFING_ENABLED": true,
            "CREDENTIAL_STUFFING_RISK_SCORE": 100.0,
            "CREDENTIAL_STUFFING_WINDOW_MS": 300000,
            "CREDENTIAL_STUFFING_COUNT_THRESHOLD": 10
          },
          "impossible_travel": {
            "IMPOSSIBLE_TRAVEL_ENABLED": true,
            "IMPOSSIBLE_TRAVEL_SPEED_CUTOFF_MPH": 700.0,
            "IMPOSSIBLE_TRAVEL_RISK_SCORE": 100.0
          },
          "HEURISTIC_RISK_SCORE_COMPUTE_STRATEGY": "max",
          "suspicious_ip": {
            "SUSPICIOUS_IP_ENABLED": true,
            "SUSPICIOUS_IP_WINDOW_MS": 300000,
            "SUSPICIOUS_IP_RISK_SCORE": 100.0,
            "SUSPICIOUS_IP_COUNT_THRESHOLD": 10
          }
        },
        "HEURISTIC_RISK_SCORE_COMPUTE_STRATEGY": "max",
        "RARITY_RISK_THRESSHOLD": 75.0
      },
  4. After you have made your changes to the file, click Save. You will see a Preview Risk Evaluation modal.

  5. On the Preview Risk Evaluation modal, do the following:

    1. Click Bucket Search to select your data source location, or type the name of the data source location.

    2. Optional. Enter an object prefix to filter your search results.

    3. Next to your desired object, click the trailing dots, and then click Preview Object to view your data source change(s).

    4. Click Preview Risk Evaluation to review a simulated risk evaluation for the first event.

    5. If you are satisfied with your change(s), click Save Config.

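To make the threshold parameters in step 3 concrete, here is a minimal sketch of how a count-over-window heuristic (the shape shared by brute_force, credential_stuffing, and suspicious_ip) and the impossible-travel speed cutoff typically behave. It illustrates the configuration semantics only and is not the Autonomous Access implementation; the function names and event structures are hypothetical.

    from math import asin, cos, radians, sin, sqrt

    # Windowed-count heuristic, shaped like BRUTE_FORCE_WINDOW_MS,
    # BRUTE_FORCE_COUNT_THRESHOLD, and BRUTE_FORCE_RISK_SCORE.
    def windowed_risk(event_times_ms, now_ms, window_ms=300000,
                      count_threshold=10, risk_score=100.0):
        recent = [t for t in event_times_ms if now_ms - t <= window_ms]
        return risk_score if len(recent) >= count_threshold else 0.0

    # Great-circle distance, needed for the impossible-travel check.
    def haversine_miles(lat1, lon1, lat2, lon2):
        lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
        a = (sin((lat2 - lat1) / 2) ** 2
             + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
        return 2 * 3959 * asin(sqrt(a))  # Earth radius is roughly 3959 miles

    # Impossible travel, shaped like IMPOSSIBLE_TRAVEL_SPEED_CUTOFF_MPH
    # and IMPOSSIBLE_TRAVEL_RISK_SCORE.
    def impossible_travel_risk(prev, curr, cutoff_mph=700.0, risk_score=100.0):
        miles = haversine_miles(prev["lat"], prev["lon"], curr["lat"], curr["lon"])
        hours = (curr["ts_ms"] - prev["ts_ms"]) / 3_600_000
        return risk_score if hours > 0 and miles / hours > cutoff_mph else 0.0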