pyblock.pd_utils

Pandas-based wrapper around pyblock.blocking.

pyblock.pd_utils.reblock(data, axis=0, weights=None)

Blocking analysis of correlated data.

data : pandas.Series or pandas.DataFrame: Data to be blocked. See axis for order.
axis : int: If non-zero, the variables in data are in rows, with the columns corresponding to the observation values, and blocking is performed along the rows. Otherwise each column is a variable, the observations are in the rows, and blocking is performed down the columns. Only used if data is a pandas.DataFrame.
weights : pandas.Series or pandas.DataFrame: A 1D weighting of the data to be reblocked. For multidimensional data an identical weighting is applied to the data for each variable.

data_len : pandas.Series: Number of data points used in each reblocking iteration. Note that a reblocking iteration discards one data point if the previous iteration had an odd number of data points.
block_info : pandas.DataFrame: Mean, standard error and estimated error in the standard error for each variable at each reblock step.
covariance : pandas.DataFrame: Covariance matrix at each reblock step.

See also pyblock.blocking.reblock(): numpy-based implementation; see it for documentation and notes on the reblocking procedure. pyblock.pd_utils.reblock() is a simple wrapper around it.
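The reblocking procedure behind data_len and block_info can be sketched in plain Python. This is a minimal illustration, not pyblock's implementation: the function name reblock_sketch is hypothetical, it handles a single unweighted series only, it assumes the discarded unpaired point is the last one, and it assumes the usual blocking estimates (standard error sqrt(var/n) using the sample variance, and error in the standard error as standard error / sqrt(2(n-1))).

```python
import math
import statistics


def reblock_sketch(values):
    """Return one (data_len, mean, std_err, std_err_err) tuple per reblock step.

    Hypothetical helper for illustration; not part of pyblock.
    """
    data = list(values)
    steps = []
    while len(data) >= 2:
        n = len(data)
        mean = statistics.fmean(data)
        # Assumed estimates: sample variance (n - 1 denominator) over n.
        std_err = math.sqrt(statistics.variance(data) / n)
        std_err_err = std_err / math.sqrt(2 * (n - 1))
        steps.append((n, mean, std_err, std_err_err))
        # Average adjacent pairs; with an odd n the final, unpaired point
        # is dropped (assumption: the last point is the one discarded).
        data = [(data[i] + data[i + 1]) / 2 for i in range(0, n - 1, 2)]
    return steps


steps = reblock_sketch([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0])
data_len = [step[0] for step in steps]
print(data_len)  # [7, 3]
```

Starting from seven points, the first reblocking step keeps three block averages and discards the seventh point, which is why the mean can shift slightly between steps.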

pyblock.pd_utils.optimal_block(block_sub_info)

Get the optimal block value from the reblocking data.

block_sub_info : pandas.DataFrame or pandas.Series: Reblocking data (i.e. the block_info item of the tuple returned by reblock), or a subset thereof containing the statistics columns for one or more data items.

index : int: Reblocking index corresponding to the reblocking iteration at which serial correlation has been removed (as estimated by the procedure in pyblock.blocking.find_optimal_block). If multiple data sets are passed in block_sub_info, this is the maximum index out of all data sets. Set to inf if an optimal block is not found for a data set.

Raises ValueError if block_sub_info contains no Series or DataFrame column named ‘optimal block’.
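The rule for combining several data sets (take the largest optimal index, or inf if any data set has none) can be sketched as follows. This is an illustration of the documented behaviour under stated assumptions, not the library's code: combined_optimal_index and the data set names are hypothetical, and a missing optimal block is represented here by None.

```python
import math


def combined_optimal_index(optimal_by_name):
    """Largest per-data-set optimal reblock index, or inf if any is missing.

    Hypothetical helper: maps data set name -> optimal index, with None
    standing in for "no optimal block found".
    """
    if not optimal_by_name:
        # Loosely mirrors the documented ValueError for input that carries
        # no optimal block information at all.
        raise ValueError("no optimal block information supplied")
    return max(
        math.inf if index is None else index
        for index in optimal_by_name.values()
    )


print(combined_optimal_index({"energy": 3, "overlap": 5}))     # 5
print(combined_optimal_index({"energy": 3, "overlap": None}))  # inf
```

Taking the maximum means every data set is read at a reblocking iteration where its serial correlation is estimated to have been removed.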

pyblock.pd_utils.reblock_summary(block_sub_info)

Get the data corresponding to the optimal block from the reblocking data.

block_sub_info : pandas.DataFrame or pandas.Series: Reblocking data (i.e. the block_info item of the tuple returned by reblock), or a subset thereof containing the statistics columns for one or more data items.

summary : pandas.DataFrame: Mean, standard error and estimate of the error in the standard error corresponding to the optimal block size in the reblocking data (or the largest optimal size if multiple data sets are given). The index is labelled with the data name, if known. An empty DataFrame is returned if no optimal block size was found.
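The shape of the summary can be illustrated without pandas. The sketch below is hypothetical throughout: summary_sketch is not a pyblock function, plain dictionaries stand in for the DataFrame, and the statistics are made-up numbers chosen only to show which row is selected and what happens when no optimal block was found.

```python
import math


def summary_sketch(stats_by_name, optimal_index):
    """Pick each data set's (mean, std_err, std_err_err) at the optimal index.

    Returns an empty dict when optimal_index is inf, standing in for the
    empty DataFrame described above.
    """
    if math.isinf(optimal_index):
        return {}
    return {name: rows[optimal_index] for name, rows in stats_by_name.items()}


# Made-up per-iteration statistics for one hypothetical data set.
stats = {
    "energy": [
        (1.00, 0.010, 0.001),
        (1.00, 0.014, 0.002),
        (1.00, 0.015, 0.003),
    ],
}

print(summary_sketch(stats, 1))         # {'energy': (1.0, 0.014, 0.002)}
print(summary_sketch(stats, math.inf))  # {}
```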