AK-MCS#

Theory#

AK-MCS is an active learning reliability method combining Kriging and Monte Carlo Simulation.

Todo

Write this section.

Syntax#

class flx.gpr.akmcs#

Represents/manages an instance of AK-MCS.

__init__(config)#

Defines the AK-MCS handler.

Parameters:

config (dict) – Configuration directory. The structure of config is outlined in detail in the following.

General properties:

The following properties are required/allowed in config:

  • sampler (flx.sampler): Sampler of a set of random variables.

  • lsf (flxPara): Limit-state function of the structural reliability problem to investigate. The function should depend on the random variables associated with sampler.

  • gp (flx.gpr.gp, optional): Gaussian process for Kriging of the limit-state function.

    If not specified, a zero-mean Gaussian process with a Gaussian kernel structure is used.

    Important

    It is strongly recommended to use only Gaussian processes with a mean_type of "zero" (see flx.gpr.gp.__init__()).

  • seed (int, default: 0): Seed value for the Pseudo-Random-Number-Generator. Number must be positive or zero.

  • N_RNG_init (int, default: 0): Number of initial calls to the Pseudo-Random-Number-Generator. Number must be positive or zero.

  • N_reserve (int, default: \(10^4\)): Memory is allocated for N_reserve calls of the actual limit-state function (lsf). Number must be positive.

  • itermax (int, default: 500): Maximum number of iterations used for optimizing the process parameters of the Gaussian process (see parameter itermax of flx.gpr.gp.optimize()).

  • NmaxSur (int, default: \(10^7\)): Maximum number of samples to use for Monte Carlo sampling of the surrogate model. If set to 0, an upper limit is not taken into account. Value must not be negative.

  • Nsmpls (int, default: \(10^6\)): Initial number of calls of the surrogate model (per simulation step). Value must be larger than zero.

  • err_thresh (float, default: 0.3): Threshold \(\varepsilon_\mathrm{t}\) for evaluating the Stopping criterion. Value must be larger than zero.

  • tqi_val (float, default: 0.99): Quantile value for evaluating the Stopping criterion. Value should be within \((0.5,1)\).

  • data_box (flx.dataBox): A flx.dataBox for post-processing (e.g. writing to a file) of performed limit-state evaluatios. The dimension of the input vector of the flx.dataBox must equal the number of standard Normal random variables in sampler, the dimension of the output vector must eqal one.

  • init_accept_only_unique (bool, default: False):

    • True: when importing samples from data_box, an error if thrown if the samples added are not unique.

    • False: any repeated samples imported from data_box are ignored

  • account4noise (bool, default: False):

    • True: the model for the uncertainty about the limit-state function accounts for nummerical noise

    • False: a close-to-zero value is assigned to the noise parameter

  • optimize_after_N_calls (int, default: 20): Optimize the parameters of the underlying Gaussian process after the specified number of calls of the actual limit-state functions.

initialize_with_LHS(N=0, box_bounds=3.)#

Generate an initial set of samples for AK-MCS, using N Latin-hypercube samples.

Parameters:
  • N (int) – Number of samples. Value must be positive or zero. If N is set to 0, the number of samples is set to two times the dimension of the input vector.

  • box_bounds (float) – Specifies the bounds (in Standard Normal space) of the hyper-cube in which the Latin-hypercube samples are generated. Sampling within the hypercube is performed uniformely.

Return type:

None

simulate(N=0, output_gp_info=False)#

Perform a N simulation steps of the surrogate model.

Parameters:
  • N (int) – Number simulation steps to conduct. If 0, a single simulation step is conducted.

  • output_gp_info (bool) – If True, at each iteration step information about the Gaussian process is outputted.

Return type:

flx.gpr.akmcs_status

res: dict#

A dictionary that contains information about the previous call to flx.gpr.akmcs.simulate().

The dictionary has the following structure:
  • mean_pf_bayesian (type: float): \(\operatorname{E}\left[p_\mathrm{f}\right]\), Bayesian posterior mean value of the probability of failure

  • pf_mle (type: float): MLE estimate of the probability of failure

  • pf_no_Kriging_uncertainty (type: float): Estimate of the probability of failure, if the uncerainty of the Kriging model is ignored.

  • Pr_q_tqi (type: float): \(p_\mathrm{tqi}\) (see Stopping criterion), the current value associated with the quantile tqi_val.

  • err (type: float): the current error value (see Stopping criterion)

  • r (type: float): Value of the ratio \(r\) (see Stopping criterion) that is associated with \(p_\mathrm{tqi}\).

  • r_increase_N_surrogate (type: float): Value of the ratio \(r\) (see Stopping criterion), if the MCS on the surrogate model is performed with twice as many samples.

  • r_no_Kriging_uncertainty (type: float): Value of the ratio \(r\) (see Stopping criterion), if the uncertainty from the Kriging model is ignored; i.e., only the sampling uncertainty from the surrogate model is considered.

  • af (type: float): af = (r_increase_N_surrogate-r_no_Kriging_uncertainty)/r; if this value is positive, the actual limit-state function is evaluated in the next iteration; if the value is negative, the number of surrogate samples is increase in the next iteration.

  • propose_to_increase_N_smpls_surrogate (type: int): 1, if af is negative, otherwise 0.

  • N (type: int): Total number of samples used to perform a Monte Carlo Simulation on the surrogate model.

  • N_model_calls (type: int): Total number of observations of the actual limit-state function (i.e., number of data-points to train the surrogate model).

  • Uval_worst_point (type: float): The probability about the sign of the limit-state function is quantified for each surrogate sample. Uval_worst_point corresponds to the standard Normal transformation of this probability. Thus, the larger Uval_worst_point differs from zero, the larger the confidence about the sign of the most uncertain surrogate point.

  • sd_worst_point (type: float): The standard deviation associated with the worst point.

  • sd_avg (type: float): The average standard deviation of the Kriging model at the sampling points.

  • kernel_sd (type: float): A-priori standard deviation of the kernel.

  • kernel_corrl (type: numpy.ndarray): A-priori correlation length of the kernel.

get_GP()#

Retrieve a reference to the internal Gaussian process.

Note

Do not modify the properties of the returned object to avoid inconsistent behavior.

Return type:

flx.gpr.gp

get_N_model_calls(only_from_current_run=True)#

Retrieve total number of calls of the actual limit-state function.

Parameters:

only_from_current_run (bool) –

  • True: return number of limit-state function calls from the current instance.

  • False: return total number of available observations of the limit-state function.

Return type:

int

class flx.gpr.akmcs_status#

An enumeration of the status of a flx.gpr.akmcs.

Members:
  • undefined: initial state; the Gaussian process is currently not initialized

  • defined: internal state, nothing to do

  • evalLSF: requires a new call of the actual limit-state function of the model

  • increase_N_surrogate: requires an increase of surrogate samples

  • decrease_N_surrogate: a decrease of the number of surrogate samples is proposed

  • stop_success: stop is recommended, as error is below the specified threshold

  • stop_iterLimit: maximum number of surrogate samples is exceeded

Stopping criterion#

The value config['tqi_val'] in flx.gpr.akmcs.__init__() defines a quantile of the distribution that represents the uncertainty about the value of the probability of failure estimated by flx.gpr.akmcs (which includes both the uncertainty from the Kirging model and the sampling uncertainty from MCS on the surrogate model). We refer to the value associated with this quantile as \(p_\mathrm{tqi}\). The ratio \(r\) is defined as:

\[ r = \frac{p_\mathrm{tqi}}{\operatorname{E}\left[p_\mathrm{f}\right]}\;, \]

where \(\operatorname{E}\left[p_\mathrm{f}\right]\) is the Bayesian posterior mean value of the probability of failure. Note that \(r\) equals one if \(p_\mathrm{tqi}\) equals \(\operatorname{E}\left[p_\mathrm{f}\right]\).

The error value \(\varepsilon\) is evaluated based on \(r\) as:

\[ \varepsilon = r - 1 \]

The probability of failure estimated through AK-MCS is considered sufficently accurate, when \(\varepsilon\le\varepsilon_\mathrm{t}\) (where \(\varepsilon_\mathrm{t}\) is defined through value config['err'] in flx.gpr.akmcs.__init__()). If this condition is met, AK-MCS stops the iteration.

Importing data into an instance of AK-MCS#

property akmcs#

A post-processor that imports any sample added to the corresponding flx.dataBox to an instance of flx.gpr.akmcs.

Parametrization:

Parameters of this post-processor can be specified as additional key-value pairs in an object of type dataBox_postProc_type. The following parameters are accepted:

States:

When the function flx.dataBox.postProc.eval() is called on this post-processor, the following states are returned:

Application Examples#

Example 1#

import fesslix as flx
flx.load_engine()
import fesslix.gpr
Random Number Generator: MT19937 - initialized with rand()=1321323247;
Random Number Generator: MT19937 - initialized with 1000 initial calls.
## ==============================================
## Generate input model
## ==============================================
config_rv_R = { 'name':'R', 'type':'logn', 'mu':6., 'sd':1. }
config_rv_S = { 'name':'S', 'type':'normal', 'mu':1., 'sd':1.0 }
rv_set = flx.rv_set( {'name':'rv_set'}, [ config_rv_R, config_rv_S ] )
sampler = flx.sampler(['rv_set'])
## ==============================================
## Set up dataBox (for storing preformed model calls)
## ==============================================
dBox_1 = flx.dataBox(2,1)
dBox_1.write2file( {
    'fname': "akmcs_samples.bin",
    'append': False,
    'binary': True,
    'cols': 'all'
    } )
## ==============================================
## Define the AK-MCS sampler
## ==============================================
config = {
        "sampler": sampler,
        "lsf": "rbrv(rv_set::R)-rbrv(rv_set::S)",
        "err_thresh": 0.05,
        "data_box": dBox_1
    }
ak_mcs = fesslix.gpr.akmcs(config)
## ==============================================
## Initialize with Latin-hypercube sampling
## ==============================================
ak_mcs.initialize_with_LHS(N=5,box_bounds=4.)
## ==============================================
## Perform a single simulation step
## ==============================================
state = ak_mcs.simulate()
print(state, ak_mcs.res)
akmcs_status.evalLSF {'mean_pf_bayesian': 0.00010819967729323618, 'pf_mle': 0.00010719989369259076, 'pf_no_Kriging_uncertainty': 9.9e-05, 'Pr_q_tqi': 0.0002816033867044683, 'err': 1.6026268631216336, 'af': 0.521996659540296, 'r': 2.6026268631216336, 'r_increase_N_surrogate': 2.5956829409173, 'r_no_Kriging_uncertainty': 1.2371204123379684, 'r_no_Kriging_uncertainty_AND_N_half': 1.3538411459613409, 'propose_to_increase_N_smpls_surrogate': 0, 'N': 1000000, 'N_model_calls': 5, 'Uval_worst_point': 2.1886090971947027e-05, 'sd_worst_point': 0.07803116024457694, 'sd_avg': 0.19728767819341295, 'kernel_sd': 5.213420867013631, 'kernel_corrl': array([14.48130949, 14.65722412]), 'kernel_noise': 0.0005213420864618494}
## ==============================================
## Retrieve the current state of AK-MCS
## ==============================================
gp = ak_mcs.get_GP()
gp_info = gp.info()
print( gp_info['noise_log'] )
print( gp_info['kernel'] )
#print( gp_info['opt_log'] )
»» LSE-results  logl=-12.0165  »»  sd_obsv [8.750648 <- 5.213421] »»  sd_Z [8.750648 <- 5.213421] »»  sd_noise [8.750648e-4 <- 5.213421e-4] 

{'type': ['gauss', 'gauss'], 'para_vec': array([5.21342087, 5.60715716, 5.54952167]), 'n_vec': array([1.        , 2.58264733, 2.64116891]), 'kernel_sd': 5.213420867013631}
for i in range(10):
    state = ak_mcs.simulate()
    print(state, ak_mcs.res)
    if state == fesslix.gpr.akmcs_status.stop_success or state == fesslix.gpr.akmcs_status.stop_iterLimit:
        break
dBox_1.close_file()
print(ak_mcs.get_N_model_calls())
akmcs_status.evalLSF {'mean_pf_bayesian': 5.4346325542189164e-05, 'pf_mle': 5.3346434234840245e-05, 'pf_no_Kriging_uncertainty': 5.4e-05, 'Pr_q_tqi': 8.816669328214764e-05, 'err': 0.6223119484628938, 'af': 0.12052270136260605, 'r': 1.6223119484628938, 'r_increase_N_surrogate': 1.537840200144653, 'r_no_Kriging_uncertainty': 1.342314781663072, 'r_no_Kriging_uncertainty_AND_N_half': 1.5219808847912812, 'propose_to_increase_N_smpls_surrogate': 0, 'N': 1000000, 'N_model_calls': 6, 'Uval_worst_point': 0.012751749352661384, 'sd_worst_point': 0.03075644652181861, 'sd_avg': 0.31260778932620537, 'kernel_sd': 9.500292624338744, 'kernel_corrl': array([14.48130949, 14.65722412]), 'kernel_noise': 0.0009500292617908745}
akmcs_status.evalLSF {'mean_pf_bayesian': 5.1892537750657253e-05, 'pf_mle': 5.0892641535732754e-05, 'pf_no_Kriging_uncertainty': 5.1e-05, 'Pr_q_tqi': 7.552625148591057e-05, 'err': 0.45543568997941297, 'af': 0.006207984743823717, 'r': 1.455435689979413, 'r_increase_N_surrogate': 1.3599821419858784, 'r_no_Kriging_uncertainty': 1.3509468194268697, 'r_no_Kriging_uncertainty_AND_N_half': 1.5360656122478058, 'propose_to_increase_N_smpls_surrogate': 0, 'N': 1000000, 'N_model_calls': 7, 'Uval_worst_point': 0.08417067042919676, 'sd_worst_point': 0.023927516793982275, 'sd_avg': 0.17225661268239553, 'kernel_sd': 8.831410181560653, 'kernel_corrl': array([14.48130949, 14.65722412]), 'kernel_noise': 0.0008831410207766181}
akmcs_status.increase_N_surrogate {'mean_pf_bayesian': 5.4262279457463636e-05, 'pf_mle': 5.326238798202255e-05, 'pf_no_Kriging_uncertainty': 5.3e-05, 'Pr_q_tqi': 7.18534059505887e-05, 'err': 0.324187016634913, 'af': -0.0921860498325348, 'r': 1.324187016634913, 'r_increase_N_surrogate': 1.2205286980426697, 'r_no_Kriging_uncertainty': 1.3426002683457714, 'r_no_Kriging_uncertainty_AND_N_half': 1.5224460216320026, 'propose_to_increase_N_smpls_surrogate': 1, 'N': 1000000, 'N_model_calls': 8, 'Uval_worst_point': 1.1511962061298635, 'sd_worst_point': 0.03060004539918942, 'sd_avg': 0.07613730537691898, 'kernel_sd': 8.319332564918614, 'kernel_corrl': array([14.48130949, 14.65722412]), 'kernel_noise': 0.0008319332595143641}
akmcs_status.increase_N_surrogate {'mean_pf_bayesian': 5.52778117444744e-05, 'pf_mle': 5.477786702228615e-05, 'pf_no_Kriging_uncertainty': 5.45e-05, 'Pr_q_tqi': 6.78431540623297e-05, 'err': 0.22731258567071144, 'af': -0.06264179196128328, 'r': 1.2273125856707114, 'r_increase_N_surrogate': 1.1575649674287416, 'r_no_Kriging_uncertainty': 1.234446027091791, 'r_no_Kriging_uncertainty_AND_N_half': 1.3496537768268997, 'propose_to_increase_N_smpls_surrogate': 1, 'N': 2000000, 'N_model_calls': 8, 'Uval_worst_point': 0.2830333006335305, 'sd_worst_point': 0.0006177498883443583, 'sd_avg': 0.07614886752048858, 'kernel_sd': 8.319332564918614, 'kernel_corrl': array([14.48130949, 14.65722412]), 'kernel_noise': 0.0008319332595143641}
akmcs_status.increase_N_surrogate {'mean_pf_bayesian': 5.169786018363477e-05, 'pf_mle': 5.144788603256486e-05, 'pf_no_Kriging_uncertainty': 5.125e-05, 'Pr_q_tqi': 6.0365030728812665e-05, 'err': 0.1676504697562229, 'af': -0.04295426532051246, 'r': 1.1676504697562229, 'r_increase_N_surrogate': 1.1186874035621186, 'r_no_Kriging_uncertainty': 1.1688429716416484, 'r_no_Kriging_uncertainty_AND_N_half': 1.2482768336503067, 'propose_to_increase_N_smpls_surrogate': 1, 'N': 4000000, 'N_model_calls': 8, 'Uval_worst_point': 0.2830333006335305, 'sd_worst_point': 0.0006177498883443583, 'sd_avg': 0.07616101168791528, 'kernel_sd': 8.319332564918614, 'kernel_corrl': array([14.48130949, 14.65722412]), 'kernel_noise': 0.0008319332595143641}
akmcs_status.increase_N_surrogate {'mean_pf_bayesian': 5.995114176254636e-05, 'pf_mle': 5.9826156750331796e-05, 'pf_no_Kriging_uncertainty': 5.9875e-05, 'Pr_q_tqi': 6.696819105330256e-05, 'err': 0.11704613264163055, 'af': -0.020370226239972564, 'r': 1.1170461326416306, 'r_increase_N_surrogate': 1.0865239321736648, 'r_no_Kriging_uncertainty': 1.1092784146160612, 'r_no_Kriging_uncertainty_AND_N_half': 1.158564995076154, 'propose_to_increase_N_smpls_surrogate': 1, 'N': 8000000, 'N_model_calls': 8, 'Uval_worst_point': 0.048573505651167796, 'sd_worst_point': 0.011059467009892323, 'sd_avg': 0.076157681596171, 'kernel_sd': 8.319332564918614, 'kernel_corrl': array([14.48130949, 14.65722412]), 'kernel_noise': 0.0008319332595143641}
akmcs_status.increase_N_surrogate {'mean_pf_bayesian': 6.212192362023621e-05, 'pf_mle': 6.202193604462093e-05, 'pf_no_Kriging_uncertainty': 6.2e-05, 'Pr_q_tqi': 6.851941983326113e-05, 'err': 0.10298290587609782, 'af': -0.017289621846497003, 'r': 1.1029829058760978, 'r_increase_N_surrogate': 1.0766236400904425, 'r_no_Kriging_uncertainty': 1.0956937974361907, 'r_no_Kriging_uncertainty_AND_N_half': 1.1384234953377173, 'propose_to_increase_N_smpls_surrogate': 1, 'N': 10000000, 'N_model_calls': 8, 'Uval_worst_point': 0.048573505651167796, 'sd_worst_point': 0.011059467009892323, 'sd_avg': 0.07616098613173633, 'kernel_sd': 8.319332564918614, 'kernel_corrl': array([14.48130949, 14.65722412]), 'kernel_noise': 0.0008319332595143641}
akmcs_status.stop_iterLimit {'mean_pf_bayesian': 6.212192362023621e-05, 'pf_mle': 6.202193604462093e-05, 'pf_no_Kriging_uncertainty': 6.2e-05, 'Pr_q_tqi': 6.851941983326113e-05, 'err': 0.10298290587609782, 'af': -0.017289621846497003, 'r': 1.1029829058760978, 'r_increase_N_surrogate': 1.0766236400904425, 'r_no_Kriging_uncertainty': 1.0956937974361907, 'r_no_Kriging_uncertainty_AND_N_half': 1.1384234953377173, 'propose_to_increase_N_smpls_surrogate': 1, 'N': 10000000, 'N_model_calls': 8, 'Uval_worst_point': 0.048573505651167796, 'sd_worst_point': 0.011059467009892323, 'sd_avg': 0.07616098613173633, 'kernel_sd': 8.319332564918614, 'kernel_corrl': array([14.48130949, 14.65722412]), 'kernel_noise': 0.0008319332595143641}
8

Example 2 - restart sampling#

## ==============================================
## Set up dataBox (for storing preformed model calls)
## ==============================================
dBox_2 = flx.dataBox(2,1)
dBox_2.write2file( {
    'fname': "akmcs_samples.bin",
    'append': True,
    'binary': True,
    'cols': 'all'
    } )
## ==============================================
## Define the AK-MCS sampler
## ==============================================
config = {
        "sampler": sampler,
        "lsf": "rbrv(rv_set::R)-rbrv(rv_set::S)",
        "NmaxSur": int(1e8),
        "err_thresh": 0.05,
        "data_box": dBox_2
    }
ak_mcs = fesslix.gpr.akmcs(config)
## ==============================================
## Initialize with past model calls
## ==============================================
dBox_3 = flx.dataBox(2,1)
dBox_3.register_post_processor({ 'type':'akmcs', 'akmcs':ak_mcs })
dBox_3.read_from_file({'fname':"akmcs_samples.bin", 'binary':True})
print(ak_mcs.get_N_model_calls(), ak_mcs.get_N_model_calls(False))
0 8
state = ak_mcs.simulate()
print(state, ak_mcs.res)
akmcs_status.increase_N_surrogate {'mean_pf_bayesian': 5.3847300881759355e-05, 'pf_mle': 5.2847408576361116e-05, 'pf_no_Kriging_uncertainty': 5.3e-05, 'Pr_q_tqi': 7.308639708975918e-05, 'err': 0.35728988998438416, 'af': -0.07134141123208065, 'r': 1.3572898899843842, 'r_increase_N_surrogate': 1.2471891330509215, 'r_no_Kriging_uncertainty': 1.344020109253443, 'r_no_Kriging_uncertainty_AND_N_half': 1.5247600335361826, 'propose_to_increase_N_smpls_surrogate': 1, 'N': 1000000, 'N_model_calls': 8, 'Uval_worst_point': 0.17685144760904603, 'sd_worst_point': 0.0037175390015843583, 'sd_avg': 0.028252261409772043, 'kernel_sd': 39.05342617203303, 'kernel_corrl': array([ 24.13003035, 101.08318782]), 'kernel_noise': 0.0039053426118001136}
state = ak_mcs.simulate(100)
print(state, ak_mcs.res)
print(ak_mcs.get_N_model_calls(), ak_mcs.get_N_model_calls(False))
dBox_2.close_file()