# Quick Start

## Configure Existing Experiment

The procedure is very simple (NEER):

1. Navigate to the example directory.
2. Edit the configuration file `da_solver.inp`.
3. Edit the experiment module file `model_<Model Name>.py` if needed.
4. Run the test case file `<Model Name>_Pytest.py`.

Inside each example directory you will find a configuration file `da_solver.inp` that you can edit. Here is a list of options currently used by DAPack:

filter_name: Name of the data assimilation filter. This option will be revised when smoothers are added to DAPack. As of now, it can be one of the following (case-insensitive):

EnKF: A stochastic (perturbed-observations) version of the ensemble Kalman Filter (EnKF).

SqrEnKF: A deterministic (square root) version of the ensemble Kalman Filter (EnKF).

BootStrap_PF: Bootstrap particle filter.

PF: Particle filter with resampling step. This also implements the sequential importance resampling (SIR).

HMC: Choose this if the Hybrid Monte Carlo (HMC) sampler is to be used. It covers each of the following cases:

- Vanilla HMC with manual parameter tuning.
- The No-U-Turn Sampler (NUTS).
- Generalized NUTS for DA (work in progress).

The sampler is determined by the settings of the two variables `Hamiltonian_integrator` and `mass_matrix_strategy`.

particle_filter_resampling_strategy: The strategy used to re-sample states from the prior ensemble. Currently `systematic` and `stratified` are implemented.
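The two resampling strategies can be sketched as follows. This is a minimal textbook version, not DAPack's own code; the function names and the helper `_pick` are hypothetical. Systematic resampling uses one shared random offset for all strata, while stratified resampling draws a separate offset inside each stratum.

```python
import random
from itertools import accumulate

def _pick(positions, weights):
    """Map sorted positions in [0, 1) to particle indices via cumulative weights."""
    total = sum(weights)
    cumulative = list(accumulate(w / total for w in weights))
    cumulative[-1] = 1.0  # guard against round-off in the last bin
    indices, j = [], 0
    for p in positions:
        while cumulative[j] <= p:
            j += 1
        indices.append(j)
    return indices

def systematic_resample(weights, rng):
    """One uniform offset shared by all n equally spaced positions."""
    n = len(weights)
    u = rng.random()
    return _pick([(k + u) / n for k in range(n)], weights)

def stratified_resample(weights, rng):
    """An independent uniform offset inside each of the n strata."""
    n = len(weights)
    return _pick([(k + rng.random()) / n for k in range(n)], weights)

rng = random.Random(0)
weights = [0.1, 0.2, 0.3, 0.4]
systematic_indices = systematic_resample(weights, rng)
stratified_indices = stratified_resample(weights, rng)
```

Both functions return a list of particle indices of the same length as `weights`; particles with larger weights tend to appear more often.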

ensemble_size: The number of ensemble members to keep during the assimilation process.

initial_time: Beginning of the timespan of the experiment.

final_time: End of the timespan of the experiment.

cycle_length: Length of the assimilation cycle. Measurements will be made at multiples of that interval. The three variables `initial_time`, `final_time`, and `cycle_length` together define the filter timespan.
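As a rough illustration of how the three variables define the filter timespan, the sketch below lists the time points. The function name is hypothetical, and including both endpoints is an assumption rather than documented DAPack behavior.

```python
def assimilation_times(initial_time, final_time, cycle_length):
    """Return initial_time, initial_time + cycle_length, ... up to final_time."""
    if cycle_length <= 0:
        raise ValueError("cycle_length must be positive")
    # Count whole cycles that fit; the small tolerance guards against round-off.
    n_cycles = int((final_time - initial_time) / cycle_length + 1e-9)
    return [initial_time + k * cycle_length for k in range(n_cycles + 1)]

times = assimilation_times(initial_time=0.0, final_time=1.0, cycle_length=0.25)
```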

observation_operator_type: Type of the observation operator. For now, `linear` and `empirical` are available. Both choices create a sparse matrix with ones at the observed entries and zeros elsewhere. The operator implemented here is very simple, but more will be added. The design of the linear observation operator depends on the value of `observed_variables_jump`.

observation_noise_type: Only `Gaussian` observation errors are implemented.

observation_noise_level: The standard deviation of observation errors is calculated as the product of `observation_noise_level` and the average magnitude of the signal over the timespan of the assimilation experiment. This is calculated per prognostic variable.
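The calculation described above might look like the following sketch. The array layout `(n_times, n_variables, n_grid)` and the choice to average over grid points as well as over time are assumptions made for illustration; the function name is hypothetical.

```python
import numpy as np

def noise_std_per_variable(trajectory, noise_level):
    """Return one standard deviation per prognostic variable.

    trajectory has the assumed shape (n_times, n_variables, n_grid).
    """
    # Average magnitude of each variable over all times and grid points.
    mean_magnitude = np.mean(np.abs(trajectory), axis=(0, 2))
    return noise_level * mean_magnitude

trajectory = np.array([[[2.0, -2.0], [4.0, -4.0]],
                       [[2.0, 2.0], [-4.0, 4.0]]])
observation_std = noise_std_per_variable(trajectory, noise_level=0.05)
```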

observation_spacing_type: Either `fixed` or `random`. If `fixed` is chosen, the observation time points are selected and fixed based on the variable `observation_steps_per_filter_steps`. If `random` is chosen, each time point in the filter timespan becomes an actual observation point (assimilation point) based on a coin flip, with the probability of being picked set in the variable `observation_chance_per_filter_steps`.

observation_steps_per_filter_steps: Observation frequency in time. 1 means observations are made at each time instance in the timespan, 2 means observations are made at every other time instance, and so on. Only used if `observation_spacing_type == fixed`.

observation_chance_per_filter_steps: Probability of a time instance in the filter timespan being picked as an observation point. Only used if `observation_spacing_type == random`.
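The two spacing modes can be sketched as below. The function and its `seed` argument are hypothetical, and starting the fixed pattern at the first time point is an assumption.

```python
import random

def observation_points(filter_times, spacing_type,
                       steps_per_filter_steps=1,
                       chance_per_filter_steps=0.5,
                       seed=None):
    """Select the time points at which observations are assimilated."""
    spacing = spacing_type.lower()
    if spacing == "fixed":
        # Keep every steps_per_filter_steps-th time point.
        return filter_times[::steps_per_filter_steps]
    if spacing == "random":
        # Independent coin flip for each time point.
        rng = random.Random(seed)
        return [t for t in filter_times if rng.random() < chance_per_filter_steps]
    raise ValueError(f"Unknown observation_spacing_type: {spacing_type!r}")

filter_times = [0, 1, 2, 3, 4]
fixed_points = observation_points(filter_times, "fixed", steps_per_filter_steps=2)
random_points = observation_points(filter_times, "random",
                                   chance_per_filter_steps=0.5, seed=0)
```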

observed_variables_jump: This controls the observation frequency over the grid points of the state. 1 means all prognostic variables are observed at all grid points, 2 means all prognostic variables are observed at every other grid point, and so on.
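A linear observation operator of this kind can be sketched as a selection matrix. The example uses a dense NumPy array for brevity where the text describes a sparse matrix, and it assumes a state vector that stores all grid values of the first variable, then all grid values of the second, and so on. The function name is hypothetical.

```python
import numpy as np

def linear_observation_operator(n_grid, n_variables, jump):
    """Build H with a single 1 per row, observing every jump-th grid point."""
    # Assumed state layout: variable 0 on all grid points, then variable 1, ...
    observed = [v * n_grid + g
                for v in range(n_variables)
                for g in range(0, n_grid, jump)]
    H = np.zeros((len(observed), n_grid * n_variables))
    H[np.arange(len(observed)), observed] = 1.0
    return H

H = linear_observation_operator(n_grid=4, n_variables=2, jump=2)
state = np.arange(8.0)
observation = H @ state  # picks entries 0, 2, 4, 6 of the state
```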

screen_output: Yes(True) or No(False). Controls whether to show numerical results on the screen or not.

screen_output_iter: The frequency of screen output, measured in simulation cycles. Used only if `screen_output == Yes`.

file_output: Yes(True) or No(False). Controls whether to save numerical results to files on disk or not.

file_output_iter: The frequency of file output, measured in simulation cycles. Used only if `file_output == Yes`.

file_output_means_only: Yes(True) or No(False). Controls whether to save (to files on disk) the ensemble means (Yes) or the whole ensembles (No).

decorrelate: Yes(True) or No(False). Create and apply a decorrelation operator to the background error covariance matrix. This is known as localization.

decorrelation_radius: Localization distance (radius).

periodic_decorrelation: If periodic boundaries are used, this should be set to True.

read_decorrelation_from_file: Yes(True) or No(False). Check for an ‘hdf5’ file named `Decorr` to read the decorrelation matrix from. An Exception will be thrown if the file is not in place.
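To illustrate what the decorrelation settings control, the sketch below builds a decorrelation matrix on a one-dimensional grid and applies it to a covariance matrix as an element-wise (Schur) product. The Gaussian kernel is an assumption chosen for illustration; the text does not state which kernel DAPack uses, and the function name is hypothetical.

```python
import numpy as np

def decorrelation_matrix(n_grid, radius, periodic):
    """Distance-based decorrelation coefficients on a 1D grid (Gaussian kernel assumed)."""
    idx = np.arange(n_grid)
    dist = np.abs(idx[:, None] - idx[None, :]).astype(float)
    if periodic:
        # On a periodic grid, distance wraps around the domain.
        dist = np.minimum(dist, n_grid - dist)
    return np.exp(-0.5 * (dist / radius) ** 2)

B = np.ones((5, 5))  # a fully correlated toy covariance
rho_periodic = decorrelation_matrix(5, radius=1.0, periodic=True)
rho_open = decorrelation_matrix(5, radius=1.0, periodic=False)
B_localized = rho_periodic * B  # element-wise product damps distant correlations
```

With periodic boundaries, the first and last grid points are neighbors, so their correlation is damped no more than that of any adjacent pair.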

background_errors_covariance_method: This creates a modeled version of the background error covariance matrix. Only two methods are currently implemented:

- `diagonal`: This results in an uncorrelated structure.
- A full covariance matrix that may be decorrelated if requested by setting `decorrelate=Yes`.

These options necessarily require background errors to be Gaussian. More options will be considered. In both cases, a standard deviation vector is calculated as the product of `background_noise_level` and the average magnitude of the signal over the timespan of the assimilation experiment. This is calculated per prognostic variable. This vector is then either set as the diagonal of the background error covariance matrix, if `diagonal` is chosen, or used to set the dense (pre-localized) version of the background error covariance matrix to the outer product of the vector with itself.
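The two constructions can be sketched as follows. The function name and the boolean flag are hypothetical. Placing the squared standard deviations (variances) on the diagonal is an assumption made so that the diagonal case matches the diagonal of the outer product.

```python
import numpy as np

def modeled_background_covariance(std, diagonal_only):
    """Build a modeled background error covariance from a standard deviation vector."""
    std = np.asarray(std, dtype=float)
    if diagonal_only:
        # Uncorrelated structure: variances on the diagonal (assumption: std squared).
        return np.diag(std ** 2)
    # Dense, pre-localized version: outer product of the vector with itself.
    return np.outer(std, std)

std = [1.0, 2.0]
B_diagonal = modeled_background_covariance(std, diagonal_only=True)
B_full = modeled_background_covariance(std, diagonal_only=False)
```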

background_noise_level: See `background_errors_covariance_method` above.

background_noise_type: `Gaussian` for now.

update_B_factor: The background error covariance matrix is updated as a linear combination of the modeled background error covariance matrix, and a flow-dependent (ensemble-based) version. This factor is multiplied by the flow-dependent version. 1 means flow-dependent version dominates, 0 means modeled version is used, and any other value (between 0 and 1) results in a hybrid version of the background error covariance matrix.
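A minimal sketch of such a hybrid covariance is shown below. Weighting the modeled matrix by one minus the factor is an assumption that is consistent with the limiting cases described above (0 gives the modeled version, 1 gives the flow-dependent version). The function name and the ensemble layout `(n_state, n_members)` are hypothetical.

```python
import numpy as np

def hybrid_covariance(B_modeled, ensemble, update_B_factor):
    """Blend a modeled covariance with an ensemble-based (flow-dependent) one."""
    if not 0.0 <= update_B_factor <= 1.0:
        raise ValueError("update_B_factor must lie in [0, 1]")
    # Rows of `ensemble` are state variables, columns are ensemble members.
    B_ensemble = np.cov(ensemble)
    return (1.0 - update_B_factor) * B_modeled + update_B_factor * B_ensemble

B_modeled = np.eye(2)
ensemble = np.array([[1.0, 2.0, 3.0],
                     [2.0, 4.0, 6.0]])
B_hybrid = hybrid_covariance(B_modeled, ensemble, update_B_factor=0.5)
```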

model_errors_covariance_method: This creates a model error covariance matrix. We assume model errors are Gaussian; this will be investigated further later. The construction strategy is the same as for the background error covariance matrix, and uses the uncertainty level set in the variable `model_noise_level`.

model_error_steps_per_model_steps: Time frequency of adding model errors, measured in model time steps. The model time step is set in the configuration file `solver.inp` in the model directory and can be set in the setup function in the model class.

model_noise_type: `Gaussian` for now.

model_noise_level: See `model_errors_covariance_method` and `model_error_steps_per_model_steps` above.

use_sparse_packages: Yes(True) or No(False). Use sparse packages for matrix representation or not.

linear_system_solver: Either `lu` or `splu` will be used for solving linear systems if required (to find the effect of a matrix inverse on a vector). Constructing the full inverse is avoided.

Hamiltonian_integrator: Symplectic integrator used to propagate the Hamiltonian system used in HMC. Available choices are `verlet`, `2stage`, `3stage`, and `4stage`.

Hamiltonian_step_size: Step size used in the symplectic Hamiltonian integrator. If NUTS is used, this serves only as the initial value.

Hamiltonian_number_of_steps: Number of steps taken by the symplectic integrator between proposed points. This controls the length of the Hamiltonian trajectory. Not needed for NUTS.
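To show how the step size and number of steps shape a Hamiltonian trajectory, here is a minimal sketch of the standard position Verlet (leapfrog) scheme with a diagonal mass matrix. It is a generic textbook version, not DAPack's implementation, and all names are hypothetical. The toy potential is a harmonic oscillator, whose exact solution is known.

```python
import math

def verlet(position, momentum, grad_potential, step_size, n_steps, mass_diag):
    """Leapfrog integration for H(q, p) = U(q) + 0.5 * sum(p_i**2 / m_i)."""
    q, p = list(position), list(momentum)
    grad = grad_potential(q)
    for _ in range(n_steps):
        # Half step in momentum, full step in position, half step in momentum.
        p = [pi - 0.5 * step_size * gi for pi, gi in zip(p, grad)]
        q = [qi + step_size * pi / mi for qi, pi, mi in zip(q, p, mass_diag)]
        grad = grad_potential(q)
        p = [pi - 0.5 * step_size * gi for pi, gi in zip(p, grad)]
    return q, p

# Toy potential U(q) = 0.5 * q**2, so grad U(q) = q and q(t) = cos(t) from q=1, p=0.
q, p = verlet([1.0], [0.0], lambda q: list(q),
              step_size=0.01, n_steps=100, mass_diag=[1.0])
energy_drift = abs(0.5 * (q[0] ** 2 + p[0] ** 2) - 0.5)
position_error = abs(q[0] - math.cos(1.0))
```

The trajectory length is the product of the step size and the number of steps; the energy drift stays small because the scheme is symplectic.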

Hamiltonian_burn_in_steps: Number of generated states (to reach convergence) before starting the sampling process.

Hamiltonian_mixing_steps: Number of states dropped between retained states after convergence. This is usually useful to reduce correlations and consequently increase independence of sampled states.
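The interplay of burn-in and mixing steps can be sketched by listing which states of the raw chain are retained. The function name is hypothetical, and tying the number of retained states to the ensemble size is an assumption.

```python
def retained_indices(ensemble_size, burn_in_steps, mixing_steps):
    """Indices in the raw chain of the states kept as ensemble members."""
    # Skip the burn-in states, then drop `mixing_steps` states between kept ones.
    stride = mixing_steps + 1
    return [burn_in_steps + k * stride for k in range(ensemble_size)]

kept = retained_indices(ensemble_size=3, burn_in_steps=10, mixing_steps=4)
chain_length_needed = kept[-1] + 1
```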

mass_matrix_strategy: The mass matrix is assumed to be diagonal. It can be a multiple of the identity matrix, but it is better for it to depend on the variances of the posterior. Available choices are `identity`, `prior_variances`, `prior_precisions`, `modeled_variances`, and `modeled_precisions`. The `_precisions` choices are recommended.

mass_matrix_scale_factor: This scales the diagonal of the mass matrix.

hamiltonian_sampling_strategy: `HMC` or `NUTS`. The former directs the sampler to use a fixed step size and number of steps in the symplectic integrator, while the latter activates generalized NUTS with dual averaging.

## Add New Model

This is a big task that we will continue working on to make it much easier. Currently, the two easiest options are:

- Add the model to the HyPar package by editing the source code. In this case, you will need to create a directory under the `models` directory, and a corresponding one under the `examples` directory. Follow the pattern in the other examples; you will find that all are almost the same except for naming. You can also modify the model class file to override the default functions given in the base classes.
- Write the model class in detail. We will add an example soon.

## Add New Filter

You will need to edit two modules. First, add your assimilation cycle function and related functions to the module named `_Assimilation_Filters`. Second, edit the function `DA_filtering_cycle()` inside `HyPar_DAFiltering` to add the option corresponding to the name you chose for your filter. If you add statistical monitors to your filter, you may also want to update the function `DA_filtering_process()` inside `HyPar_DAFiltering` by updating the file output section(s).
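The general pattern of registering a new filter under a name and selecting it case-insensitively can be sketched as below. Every name here is hypothetical and the signatures are placeholders; this does not reproduce the actual code or arguments of `DA_filtering_cycle()`.

```python
def enkf_cycle(ensemble, observation):
    """Placeholder standing in for an existing assimilation cycle function."""
    return ensemble

def my_filter_cycle(ensemble, observation):
    """Placeholder for the new filter's assimilation cycle."""
    return ensemble

# Hypothetical registry mapping lower-case filter names to cycle functions.
FILTER_CYCLES = {
    "enkf": enkf_cycle,
    "myfilter": my_filter_cycle,
}

def filtering_cycle(filter_name, ensemble, observation):
    """Dispatch to the cycle function matching filter_name (case-insensitive)."""
    try:
        cycle = FILTER_CYCLES[filter_name.lower()]
    except KeyError:
        raise ValueError(f"Unknown filter_name: {filter_name!r}") from None
    return cycle(ensemble, observation)

analysis = filtering_cycle("MyFilter", [1.0, 2.0], observation=None)
```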