ding.utils.normalizer_helper

DatasetNormalizer

Overview

The DatasetNormalizer class normalizes and unnormalizes the data in a dataset. It flattens the dataset, then fits one normalizer per key; a key whose normalizer cannot be constructed is skipped with a printed message.

Interfaces

__init__, __repr__, normalize, unnormalize.

__init__(dataset, normalizer, path_lengths=None)

Overview

Initialize the DatasetNormalizer object.

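Parameters:
  • dataset (:obj:dict): The dataset to be normalized: a mapping from key to per-episode arrays that must include 'observations' and 'actions'. The signature annotates it as np.ndarray, but the constructor iterates over dataset.items().
  • normalizer (:obj:str): The normalizer to use: either a normalizer class or a string naming one, such as 'GaussianNormalizer'. A string is resolved with eval.
  • path_lengths (:obj:list): The length of each path (episode) in the dataset. Defaults to None. The constructor passes this value straight to flatten, which calls len(path_lengths), so a list is needed in practice.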

__repr__()

Overview

Returns a string representation of the DatasetNormalizer object. The string representation includes the key-value pairs of the normalizers stored in the DatasetNormalizer object.

Returns:
  • ret (:obj:str): A string representation of the DatasetNormalizer object.

normalize(x, key)

Overview

Normalize the input data using the specified key.

Parameters:
  • x (:obj:np.ndarray): The input data to be normalized.
  • key (:obj:str): The key to identify the normalizer.

Returns:
  • ret (:obj:np.ndarray): The normalized value of the input data.

unnormalize(x, key)

Overview

Unnormalizes the given value x using the specified key.

Parameters:
  • x (:obj:np.ndarray): The value to be unnormalized.
  • key (:obj:str): The key to identify the normalizer.

Returns:
  • ret (:obj:np.ndarray): The unnormalized value.
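
The per-key dispatch described for DatasetNormalizer can be illustrated with a minimal sketch. It does not import ding: MeanStdNormalizer and the toy dataset are hypothetical stand-ins, and the dataset is assumed to be already flattened to one 2-D array per key.

```python
import numpy as np


class MeanStdNormalizer:
    """Hypothetical stand-in for a normalizer class; not part of ding."""

    def __init__(self, X: np.ndarray):
        self.means = X.mean(axis=0)
        self.stds = X.std(axis=0)

    def normalize(self, x: np.ndarray) -> np.ndarray:
        return (x - self.means) / self.stds

    def unnormalize(self, x: np.ndarray) -> np.ndarray:
        return x * self.stds + self.means


# Toy dataset, assumed to be already flattened: one 2-D array per key.
dataset = {
    "observations": np.array([[0.0, 10.0], [2.0, 30.0], [4.0, 50.0]]),
    "actions": np.array([[-1.0], [0.0], [1.0]]),
}

# One fitted normalizer per key, mirroring the idea of a `normalizers` dictionary.
normalizers = {key: MeanStdNormalizer(val) for key, val in dataset.items()}

# The key selects which fitted normalizer handles the data.
obs = dataset["observations"]
normed = normalizers["observations"].normalize(obs)
restored = normalizers["observations"].unnormalize(normed)
```

Because each key has its own statistics, observations and actions are scaled independently, and a round trip through the same key recovers the original values.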

Normalizer

Overview

Base class for normalizers. Subclasses implement the normalize and unnormalize methods.

Interfaces

__init__, __repr__, normalize, unnormalize.

__init__(X)

Overview

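Initialize the Normalizer object. The data is stored as float32, and its per-dimension minima and maxima are recorded as mins and maxs.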

Arguments: - X (:obj:np.ndarray): The data to be normalized.

__repr__()

Overview

Returns a string representation of the Normalizer object.

Returns: - ret (:obj:str): A string representation of the Normalizer object.

normalize(*args, **kwargs)

Overview

Normalize the input data. The base class raises NotImplementedError; subclasses override this method.

Arguments: - args (:obj:list): The arguments passed to the normalize function. - kwargs (:obj:dict): The keyword arguments passed to the normalize function.

unnormalize(*args, **kwargs)

Overview

Unnormalize the input data. The base class raises NotImplementedError; subclasses override this method.

Arguments: - args (:obj:list): The arguments passed to the unnormalize function. - kwargs (:obj:dict): The keyword arguments passed to the unnormalize function.
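
As a rough illustration of the subclassing contract, the sketch below re-declares a minimal base class (a stand-in so the example runs without ding) and adds a hypothetical UnitRangeNormalizer subclass. Neither the subclass name nor the [0, 1] mapping comes from the library.

```python
import numpy as np


class Normalizer:
    """Minimal stand-in for the documented base class."""

    def __init__(self, X: np.ndarray):
        self.X = X.astype(np.float32)
        self.mins = X.min(axis=0)
        self.maxs = X.max(axis=0)

    def normalize(self, *args, **kwargs):
        raise NotImplementedError()

    def unnormalize(self, *args, **kwargs):
        raise NotImplementedError()


class UnitRangeNormalizer(Normalizer):
    """Hypothetical subclass: maps each dimension to [0, 1]."""

    def normalize(self, x: np.ndarray) -> np.ndarray:
        return (x - self.mins) / (self.maxs - self.mins)

    def unnormalize(self, x: np.ndarray) -> np.ndarray:
        return x * (self.maxs - self.mins) + self.mins


data = np.array([[1.0, 100.0], [3.0, 300.0], [5.0, 500.0]])
norm = UnitRangeNormalizer(data)
scaled = norm.normalize(data)  # each column now spans 0 to 1
```

The subclass only supplies the two transforms; the statistics it relies on (mins and maxs) are computed once by the base constructor.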

GaussianNormalizer

Bases: Normalizer

Overview

A class that normalizes data to zero mean and unit variance.

Interfaces

__init__, __repr__, normalize, unnormalize.

__init__(*args, **kwargs)

Overview

Initialize the GaussianNormalizer object.

Arguments: - args (:obj:list): The arguments passed to the __init__ function of the parent class, i.e., the Normalizer class. - kwargs (:obj:dict): The keyword arguments passed to the __init__ function of the parent class, i.e., the Normalizer class.

__repr__()

Overview

Returns a string representation of the GaussianNormalizer object.

Returns: - ret (:obj:str): A string representation of the GaussianNormalizer object.

normalize(x)

Overview

Normalize the input data as (x - means) / stds, using the per-dimension statistics computed at initialization.

Parameters:
  • x (:obj:np.ndarray): The input data to be normalized.

Returns:
  • ret (:obj:np.ndarray): The normalized data.

unnormalize(x)

Overview

Unnormalize the input data as x * stds + means.

Parameters:
  • x (:obj:np.ndarray): The input data to be unnormalized.

Returns:
  • ret (:obj:np.ndarray): The unnormalized data.
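
The z-score arithmetic can be checked on a small array. This is a sketch with made-up data rather than a call into the library; it also flags a caveat that follows from dividing by the standard deviation: a constant dimension has a standard deviation of 0, so its z-score is undefined.

```python
import numpy as np

# Toy data: the second column is constant.
X = np.array([[1.0, 5.0], [2.0, 5.0], [3.0, 5.0]])

means = X.mean(axis=0)
stds = X.std(axis=0)  # population standard deviation (ddof=0), NumPy's default

# Column 0 has a non-zero spread, so the z-score is well defined.
z0 = (X[:, 0] - means[0]) / stds[0]
x0 = z0 * stds[0] + means[0]  # inverse transform recovers the column

# Column 1 is constant: its standard deviation is 0, so dividing by it is undefined.
constant_dims = np.flatnonzero(stds == 0)
```

Checking for zero-variance dimensions before normalizing is one way to avoid NaN or infinite values in the output.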

CDFNormalizer

Bases: Normalizer

Overview

A class that makes training data uniform (over each dimension) by transforming it with marginal CDFs.

Interfaces

__init__, __repr__, normalize, unnormalize.

__init__(X)

Overview

Initialize the CDFNormalizer object.

Arguments: - X (:obj:np.ndarray): The data to be normalized.

__repr__()

Overview

Returns a string representation of the CDFNormalizer object.

Returns: - ret (:obj:str): A string representation of the CDFNormalizer object.

wrap(fn_name, x)

Overview

Looks up the method named fn_name on each per-dimension CDFNormalizer1d and applies it to the matching column of x. The input is reshaped to 2-D for the loop, and the result is returned in the original shape.

Parameters:
  • fn_name (:obj:str): The name of the method to apply, 'normalize' or 'unnormalize'.
  • x (:obj:np.ndarray): The input data.

Returns:
  • ret (:obj:np.ndarray): The output of the function applied to the input data.
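
The reshape-and-loop pattern behind wrap can be sketched independently of the CDF classes. Here apply_per_dim and the two lambda transforms are hypothetical; they only stand in for the per-dimension normalizers.

```python
import numpy as np


def apply_per_dim(fns, x: np.ndarray) -> np.ndarray:
    """Apply fns[i] to the i-th feature of x, whatever its leading shape."""
    dim = len(fns)
    shape = x.shape
    flat = x.reshape(-1, dim)       # collapse all leading axes into one
    out = np.zeros_like(flat)
    for i, fn in enumerate(fns):
        out[:, i] = fn(flat[:, i])  # each column gets its own 1-D transform
    return out.reshape(shape)       # restore the original shape


# Hypothetical per-dimension transforms standing in for 1-D normalizers.
fns = [lambda v: v * 2.0, lambda v: v + 100.0]

batch = np.arange(12, dtype=np.float64).reshape(2, 3, 2)  # (episodes, steps, dim)
result = apply_per_dim(fns, batch)
```

Flattening the leading axes lets the same code handle a single batch of shape (n, dim) and a stack of trajectories of shape (episodes, steps, dim).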

normalize(x)

Overview

Normalizes the input data.

Parameters:
  • x (:obj:np.ndarray): The input data.

Returns:
  • ret (:obj:np.ndarray): The normalized data.

unnormalize(x)

Overview

Unnormalizes the input data.

Parameters:
  • x (:obj:np.ndarray): The input data.

Returns:
  • ret (:obj:np.ndarray): The unnormalized data.

CDFNormalizer1d

Overview

CDF normalizer for a single dimension. This class provides methods to normalize and unnormalize data using the Cumulative Distribution Function (CDF) approach.

Interfaces

__init__, __repr__, normalize, unnormalize.

__init__(X)

Overview

Initialize the CDFNormalizer1d object.

Arguments: - X (:obj:np.ndarray): The data to be normalized.

__repr__()

Overview

Returns a string representation of the CDFNormalizer1d object.

normalize(x)

Overview

Normalize the input data.

Parameters:
  • x (:obj:np.ndarray): The data to be normalized.

Returns:
  • ret (:obj:np.ndarray): The normalized data.

unnormalize(x, eps=0.0001)

Overview

Unnormalize the input data.

Parameters:
  • x (:obj:np.ndarray): The data to be unnormalized.
  • eps (:obj:float): Tolerance for the range check: a warning is printed when a rescaled value falls below ymin - eps or above ymax + eps. Defaults to 1e-4.

Returns:
  • ret (:obj:np.ndarray): The unnormalized data.
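
A one-dimensional CDF transform and its inverse can be sketched as follows. This is not the library code: np.interp is swapped in for scipy.interpolate.interp1d so the example needs only NumPy, and the sample values are made up.

```python
import numpy as np

sample = np.array([1.0, 2.0, 2.0, 4.0])

# Empirical CDF: sorted unique values and the fraction of samples <= each value.
quantiles, counts = np.unique(sample, return_counts=True)
cumprob = np.cumsum(counts) / sample.size  # [0.25, 0.75, 1.0]


def normalize(x: np.ndarray) -> np.ndarray:
    x = np.clip(x, quantiles.min(), quantiles.max())
    y = np.interp(x, quantiles, cumprob)   # value -> cumulative probability
    return 2 * y - 1                       # rescale towards [-1, 1]


def unnormalize(y: np.ndarray) -> np.ndarray:
    p = np.clip((y + 1) / 2.0, cumprob.min(), cumprob.max())
    return np.interp(p, cumprob, quantiles)  # cumulative probability -> value


codes = normalize(sample)      # [-0.5, 0.5, 0.5, 1.0]
decoded = unnormalize(codes)   # [1.0, 2.0, 2.0, 4.0]
```

Note that the smallest sample value maps to 2 * cumprob.min() - 1 (here -0.5), not to -1, because the empirical CDF of the minimum is already positive.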

LimitsNormalizer

Bases: Normalizer

Overview

A class that normalizes and unnormalizes values within specified limits. This class maps values within the range [xmin, xmax] to the range [-1, 1].

Interfaces

__init__, __repr__, normalize, unnormalize.

normalize(x)

Overview

Normalizes the input values.

Arguments:
  • x (:obj:np.ndarray): The input values to be normalized.

Returns:
  • ret (:obj:np.ndarray): The normalized values.

unnormalize(x, eps=0.0001)

Overview

Unnormalizes the input values.

Parameters:
  • x (:obj:np.ndarray): The input values to be unnormalized.
  • eps (:obj:float): Tolerance for the range check: the input is clipped to [-1, 1] only when some value exceeds 1 + eps or falls below -1 - eps. Defaults to 1e-4.

Returns:
  • ret (:obj:np.ndarray): The unnormalized values.
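
The two-step mapping to [-1, 1] and back can be written out directly. The sketch below uses free functions and hand-picked mins and maxs (both assumptions for illustration) in place of the class and its fitted statistics.

```python
import numpy as np

# Assumed per-dimension limits; the class derives these from its data.
mins = np.array([0.0, -10.0])
maxs = np.array([4.0, 10.0])


def normalize(x: np.ndarray) -> np.ndarray:
    unit = (x - mins) / (maxs - mins)   # [mins, maxs] -> [0, 1]
    return 2 * unit - 1                 # [0, 1] -> [-1, 1]


def unnormalize(x: np.ndarray, eps: float = 1e-4) -> np.ndarray:
    if x.max() > 1 + eps or x.min() < -1 - eps:
        x = np.clip(x, -1, 1)           # clip only when clearly out of range
    unit = (x + 1) / 2.0                # [-1, 1] -> [0, 1]
    return unit * (maxs - mins) + mins  # [0, 1] -> [mins, maxs]


x = np.array([[1.0, 5.0], [4.0, -10.0]])
y = normalize(x)                                 # [[-0.5, 0.5], [1.0, -1.0]]
back = unnormalize(y)                            # recovers x
clipped = unnormalize(np.array([[1.5, -2.0]]))   # [[4.0, -10.0]]
```

Out-of-range inputs are pulled back to the limits, so the inverse never returns a value outside [mins, maxs] for them.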

flatten(dataset, path_lengths)

Overview

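Flattens a dataset of { key: [ n_episodes x max_path_length x dim ] } to { key: [ sum(path_lengths) x dim ] } by keeping the first path_lengths[i] steps of episode i and concatenating the episodes. The 'path_lengths' key itself is skipped.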

Parameters:
  • dataset (:obj:dict): The dataset to be flattened.
  • path_lengths (:obj:list): A list of path lengths for each episode.

Returns:
  • flattened (:obj:dict): The flattened dataset.
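
The effect of flattening on a single key can be seen with two padded toy episodes. The array contents are invented for illustration; only the slicing and concatenation mirror the description.

```python
import numpy as np

# Two episodes padded to max_path_length = 3, with dim = 2.
observations = np.array([
    [[1, 1], [2, 2], [0, 0]],   # episode 0: 2 valid steps, 1 padding row
    [[3, 3], [4, 4], [5, 5]],   # episode 1: 3 valid steps
])
path_lengths = [2, 3]

# Keep the first `length` rows of each episode and stack them along axis 0.
flat = np.concatenate(
    [episode[:length] for episode, length in zip(observations, path_lengths)],
    axis=0,
)
```

The result has sum(path_lengths) = 5 rows, and the padding row of the first episode is dropped.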

empirical_cdf(sample)

Overview

Compute the empirical cumulative distribution function (CDF) of a given sample.

Parameters:
  • sample (:obj:np.ndarray): The input sample for which to compute the empirical CDF.

Returns:
  • quantiles (:obj:np.ndarray): The unique values in the sample.
  • cumprob (:obj:np.ndarray): The cumulative probabilities corresponding to the quantiles.

References:
  • Stack Overflow: https://stackoverflow.com/a/33346366
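
A small worked example, using an invented sample, shows what the two returned arrays contain:

```python
import numpy as np

sample = np.array([3.0, 1.0, 3.0, 2.0, 3.0])

# Sorted unique values and how often each occurs.
quantiles, counts = np.unique(sample, return_counts=True)

# Running total of the counts, divided by the sample size.
cumprob = np.cumsum(counts).astype(np.double) / sample.size

# quantiles -> [1.0, 2.0, 3.0]
# cumprob   -> [0.2, 0.4, 1.0]
```

Each entry of cumprob is the fraction of the sample that is less than or equal to the matching entry of quantiles, so the last entry is always 1.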

atleast_2d(x)

Overview

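Ensure that the input array has at least two dimensions. An array with fewer than two dimensions gets a trailing axis, so a 1-D array of shape (n,) becomes a column of shape (n, 1).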

Parameters:
  • x (:obj:np.ndarray): The input array.

Returns:
  • ret (:obj:np.ndarray): The input array with at least two dimensions.
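
The trailing-axis behaviour differs from NumPy's built-in np.atleast_2d, which adds a leading axis instead. A short comparison on a toy array, written inline rather than calling the helper:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])

# Trailing axis, as in this module's helper: n samples of one feature.
col = x[:, None] if x.ndim < 2 else x   # shape (3, 1)

# NumPy's built-in adds a leading axis: one sample of n features.
row = np.atleast_2d(x)                  # shape (1, 3)
```

The column layout matches the (n_samples, dim) convention the normalizers use when they take statistics along axis 0.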

Full Source Code

../ding/utils/normalizer_helper.py

import numpy as np


class DatasetNormalizer:
    """
    Overview:
        The `DatasetNormalizer` class provides functionality to normalize and unnormalize data in a dataset.
        It takes a dataset as input and applies a normalizer function to each key in the dataset.

    Interfaces:
        ``__init__``, ``__repr__``, ``normalize``, ``unnormalize``.
    """

    def __init__(self, dataset: np.ndarray, normalizer: str, path_lengths: list = None):
        """
        Overview:
            Initialize the DatasetNormalizer object.

        Arguments:
            - dataset (:obj:`np.ndarray`): The dataset to be normalized.
            - normalizer (:obj:`str`): The type of normalizer to be used. Can be a string representing the name of \
                the normalizer class.
            - path_lengths (:obj:`list`): The length of the paths in the dataset. Defaults to None.
        """
        dataset = flatten(dataset, path_lengths)

        self.observation_dim = dataset['observations'].shape[1]
        self.action_dim = dataset['actions'].shape[1]

        if isinstance(normalizer, str):
            normalizer = eval(normalizer)

        self.normalizers = {}
        for key, val in dataset.items():
            try:
                self.normalizers[key] = normalizer(val)
            except:
                print(f'[ utils/normalization ] Skipping {key} | {normalizer}')
            # key: normalizer(val)
            # for key, val in dataset.items()

    def __repr__(self) -> str:
        """
        Overview:
            Returns a string representation of the DatasetNormalizer object. \
            The string representation includes the key-value pairs of the normalizers \
            stored in the DatasetNormalizer object.
        Returns:
            - ret (:obj:`str`): A string representation of the DatasetNormalizer object.
        """
        string = ''
        for key, normalizer in self.normalizers.items():
            string += f'{key}: {normalizer}]\n'
        return string

    def normalize(self, x: np.ndarray, key: str) -> np.ndarray:
        """
        Overview:
            Normalize the input data using the specified key.

        Arguments:
            - x (:obj:`np.ndarray`): The input data to be normalized.
            - key (:obj:`str`): The key to identify the normalizer.

        Returns:
            - ret (:obj:`np.ndarray`): The normalized value of the input data.
        """
        return self.normalizers[key].normalize(x)

    def unnormalize(self, x: np.ndarray, key: str) -> np.ndarray:
        """
        Overview:
            Unnormalizes the given value `x` using the specified `key`.

        Arguments:
            - x (:obj:`np.ndarray`): The value to be unnormalized.
            - key (:obj:`str`): The key to identify the normalizer.

        Returns:
            - ret (:obj:`np.ndarray`): The unnormalized value.
        """
        return self.normalizers[key].unnormalize(x)


def flatten(dataset: dict, path_lengths: list) -> dict:
    """
    Overview:
        Flattens dataset of { key: [ n_episodes x max_path_length x dim ] } \
        to { key : [ sum(path_lengths) x dim ] }

    Arguments:
        - dataset (:obj:`dict`): The dataset to be flattened.
        - path_lengths (:obj:`list`): A list of path lengths for each episode.

    Returns:
        - flattened (:obj:`dict`): The flattened dataset.
    """
    flattened = {}
    for key, xs in dataset.items():
        assert len(xs) == len(path_lengths)
        if key == 'path_lengths':
            continue
        flattened[key] = np.concatenate([x[:length] for x, length in zip(xs, path_lengths)], axis=0)
    return flattened


class Normalizer:
    """
    Overview:
        Parent class, subclass by defining the `normalize` and `unnormalize` methods

    Interfaces:
        ``__init__``, ``__repr__``, ``normalize``, ``unnormalize``.
    """

    def __init__(self, X):
        """
        Overview:
            Initialize the Normalizer object.
        Arguments:
            - X (:obj:`np.ndarray`): The data to be normalized.
        """

        self.X = X.astype(np.float32)
        self.mins = X.min(axis=0)
        self.maxs = X.max(axis=0)

    def __repr__(self) -> str:
        """
        Overview:
            Returns a string representation of the Normalizer object.
        Returns:
            - ret (:obj:`str`): A string representation of the Normalizer object.
        """

        return (
            f"""[ Normalizer ] dim: {self.mins.size}\n -: """
            f"""{np.round(self.mins, 2)}\n +: {np.round(self.maxs, 2)}\n"""
        )

    def normalize(self, *args, **kwargs):
        """
        Overview:
            Normalize the input data.
        Arguments:
            - args (:obj:`list`): The arguments passed to the ``normalize`` function.
            - kwargs (:obj:`dict`): The keyword arguments passed to the ``normalize`` function.
        """

        raise NotImplementedError()

    def unnormalize(self, *args, **kwargs):
        """
        Overview:
            Unnormalize the input data.
        Arguments:
            - args (:obj:`list`): The arguments passed to the ``unnormalize`` function.
            - kwargs (:obj:`dict`): The keyword arguments passed to the ``unnormalize`` function.
        """

        raise NotImplementedError()


class GaussianNormalizer(Normalizer):
    """
    Overview:
        A class that normalizes data to zero mean and unit variance.

    Interfaces:
        ``__init__``, ``__repr__``, ``normalize``, ``unnormalize``.
    """

    def __init__(self, *args, **kwargs):
        """
        Overview:
            Initialize the GaussianNormalizer object.
        Arguments:
            - args (:obj:`list`): The arguments passed to the ``__init__`` function of the parent class, \
                i.e., the Normalizer class.
            - kwargs (:obj:`dict`): The keyword arguments passed to the ``__init__`` function of the parent class, \
                i.e., the Normalizer class.
        """

        super().__init__(*args, **kwargs)
        self.means = self.X.mean(axis=0)
        self.stds = self.X.std(axis=0)
        self.z = 1

    def __repr__(self) -> str:
        """
        Overview:
            Returns a string representation of the GaussianNormalizer object.
        Returns:
            - ret (:obj:`str`): A string representation of the GaussianNormalizer object.
        """

        return (
            f"""[ Normalizer ] dim: {self.mins.size}\n """
            f"""means: {np.round(self.means, 2)}\n """
            f"""stds: {np.round(self.z * self.stds, 2)}\n"""
        )

    def normalize(self, x: np.ndarray) -> np.ndarray:
        """
        Overview:
            Normalize the input data.

        Arguments:
            - x (:obj:`np.ndarray`): The input data to be normalized.

        Returns:
            - ret (:obj:`np.ndarray`): The normalized data.
        """
        return (x - self.means) / self.stds

    def unnormalize(self, x: np.ndarray) -> np.ndarray:
        """
        Overview:
            Unnormalize the input data.

        Arguments:
            - x (:obj:`np.ndarray`): The input data to be unnormalized.

        Returns:
            - ret (:obj:`np.ndarray`): The unnormalized data.
        """
        return x * self.stds + self.means


class CDFNormalizer(Normalizer):
    """
    Overview:
        A class that makes training data uniform (over each dimension) by transforming it with marginal CDFs.

    Interfaces:
        ``__init__``, ``__repr__``, ``normalize``, ``unnormalize``.
    """

    def __init__(self, X):
        """
        Overview:
            Initialize the CDFNormalizer object.
        Arguments:
            - X (:obj:`np.ndarray`): The data to be normalized.
        """

        super().__init__(atleast_2d(X))
        self.dim = self.X.shape[1]
        self.cdfs = [CDFNormalizer1d(self.X[:, i]) for i in range(self.dim)]

    def __repr__(self) -> str:
        """
        Overview:
            Returns a string representation of the CDFNormalizer object.
        Returns:
            - ret (:obj:`str`): A string representation of the CDFNormalizer object.
        """

        return f'[ CDFNormalizer ] dim: {self.mins.size}\n' + ' | '.join(
            f'{i:3d}: {cdf}' for i, cdf in enumerate(self.cdfs)
        )

    def wrap(self, fn_name: str, x: np.ndarray) -> np.ndarray:
        """
        Overview:
            Wraps the given function name and applies it to the input data.

        Arguments:
            - fn_name (:obj:`str`): The name of the function to be applied.
            - x (:obj:`np.ndarray`): The input data.

        Returns:
            - ret: The output of the function applied to the input data.
        """
        shape = x.shape
        # reshape to 2d
        x = x.reshape(-1, self.dim)
        out = np.zeros_like(x)
        for i, cdf in enumerate(self.cdfs):
            fn = getattr(cdf, fn_name)
            out[:, i] = fn(x[:, i])
        return out.reshape(shape)

    def normalize(self, x: np.ndarray) -> np.ndarray:
        """
        Overview:
            Normalizes the input data.

        Arguments:
            - x (:obj:`np.ndarray`): The input data.

        Returns:
            - ret (:obj:`np.ndarray`): The normalized data.
        """
        return self.wrap('normalize', x)

    def unnormalize(self, x: np.ndarray) -> np.ndarray:
        """
        Overview:
            Unnormalizes the input data.

        Arguments:
            - x (:obj:`np.ndarray`): The input data.

        Returns:
            - ret (:obj:`np.ndarray`): The unnormalized data.
        """
        return self.wrap('unnormalize', x)


class CDFNormalizer1d:
    """
    Overview:
        CDF normalizer for a single dimension. This class provides methods to normalize and unnormalize data \
        using the Cumulative Distribution Function (CDF) approach.
    Interfaces:
        ``__init__``, ``__repr__``, ``normalize``, ``unnormalize``.
    """

    def __init__(self, X: np.ndarray):
        """
        Overview:
            Initialize the CDFNormalizer1d object.
        Arguments:
            - X (:obj:`np.ndarray`): The data to be normalized.
        """

        import scipy.interpolate as interpolate
        assert X.ndim == 1
        self.X = X.astype(np.float32)
        if self.X.max() == self.X.min():
            self.constant = True
        else:
            self.constant = False
            quantiles, cumprob = empirical_cdf(self.X)
            self.fn = interpolate.interp1d(quantiles, cumprob)
            self.inv = interpolate.interp1d(cumprob, quantiles)

            self.xmin, self.xmax = quantiles.min(), quantiles.max()
            self.ymin, self.ymax = cumprob.min(), cumprob.max()

    def __repr__(self) -> str:
        """
        Overview:
            Returns a string representation of the CDFNormalizer1d object.
        """

        return (f'[{np.round(self.xmin, 2):.4f}, {np.round(self.xmax, 2):.4f}')

    def normalize(self, x: np.ndarray) -> np.ndarray:
        """
        Overview:
            Normalize the input data.

        Arguments:
            - x (:obj:`np.ndarray`): The data to be normalized.

        Returns:
            - ret (:obj:`np.ndarray`): The normalized data.
        """
        if self.constant:
            return x

        x = np.clip(x, self.xmin, self.xmax)
        # [ 0, 1 ]
        y = self.fn(x)
        # [ -1, 1 ]
        y = 2 * y - 1
        return y

    def unnormalize(self, x: np.ndarray, eps: float = 1e-4) -> np.ndarray:
        """
        Overview:
            Unnormalize the input data.

        Arguments:
            - x (:obj:`np.ndarray`): The data to be unnormalized.
            - eps (:obj:`float`): A small value used for numerical stability. Defaults to 1e-4.

        Returns:
            - ret (:obj:`np.ndarray`): The unnormalized data.
        """
        # [ -1, 1 ] --> [ 0, 1 ]
        if self.constant:
            return x

        x = (x + 1) / 2.

        if (x < self.ymin - eps).any() or (x > self.ymax + eps).any():
            print(
                f"""[ dataset/normalization ] Warning: out of range in unnormalize: """
                f"""[{x.min()}, {x.max()}] | """
                f"""x : [{self.xmin}, {self.xmax}] | """
                f"""y: [{self.ymin}, {self.ymax}]"""
            )

        x = np.clip(x, self.ymin, self.ymax)

        y = self.inv(x)
        return y


def empirical_cdf(sample: np.ndarray) -> (np.ndarray, np.ndarray):
    """
    Overview:
        Compute the empirical cumulative distribution function (CDF) of a given sample.

    Arguments:
        - sample (:obj:`np.ndarray`): The input sample for which to compute the empirical CDF.

    Returns:
        - quantiles (:obj:`np.ndarray`): The unique values in the sample.
        - cumprob (:obj:`np.ndarray`): The cumulative probabilities corresponding to the quantiles.

    References:
        - Stack Overflow: https://stackoverflow.com/a/33346366
    """

    # find the unique values and their corresponding counts
    quantiles, counts = np.unique(sample, return_counts=True)

    # take the cumulative sum of the counts and divide by the sample size to
    # get the cumulative probabilities between 0 and 1
    cumprob = np.cumsum(counts).astype(np.double) / sample.size

    return quantiles, cumprob


def atleast_2d(x: np.ndarray) -> np.ndarray:
    """
    Overview:
        Ensure that the input array has at least two dimensions.

    Arguments:
        - x (:obj:`np.ndarray`): The input array.

    Returns:
        - ret (:obj:`np.ndarray`): The input array with at least two dimensions.
    """
    if x.ndim < 2:
        x = x[:, None]
    return x


class LimitsNormalizer(Normalizer):
    """
    Overview:
        A class that normalizes and unnormalizes values within specified limits. \
        This class maps values within the range [xmin, xmax] to the range [-1, 1].

    Interfaces:
        ``__init__``, ``__repr__``, ``normalize``, ``unnormalize``.
    """

    def normalize(self, x: np.ndarray) -> np.ndarray:
        """
        Overview:
            Normalizes the input values.

        Arguments:
            - x (:obj:`np.ndarray`): The input values to be normalized.

        Returns:
            - ret (:obj:`np.ndarray`): The normalized values.

        """
        # [ 0, 1 ]
        x = (x - self.mins) / (self.maxs - self.mins)
        # [ -1, 1 ]
        x = 2 * x - 1
        return x

    def unnormalize(self, x: np.ndarray, eps: float = 1e-4) -> np.ndarray:
        """
        Overview:
            Unnormalizes the input values.

        Arguments:
            - x (:obj:`np.ndarray`): The input values to be unnormalized.
            - eps (:obj:`float`): A small value used for clipping. Defaults to 1e-4.

        Returns:
            - ret (:obj:`np.ndarray`): The unnormalized values.

        """
        if x.max() > 1 + eps or x.min() < -1 - eps:
            # print(f'[ datasets/mujoco ] Warning: sample out of range | ({x.min():.4f}, {x.max():.4f})')
            x = np.clip(x, -1, 1)

        # [ -1, 1 ] --> [ 0, 1 ]
        x = (x + 1) / 2.

        return x * (self.maxs - self.mins) + self.mins