ding.torch_utils.data_helper
LogDict
Bases: dict
Overview
Derived from dict. Converts torch.Tensor values to lists for convenient logging.
Interfaces:
_transform, __setitem__, update.
__setitem__(key, value)
Overview
Override the __setitem__ function of built-in dict.
Arguments:
- key (:obj:Any): The key of the data item.
- value (:obj:Any): The value of the data item.
update(data)
Overview
Override the update function of built-in dict.
Arguments:
- data (:obj:dict): The dict for updating current object.
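One reason to override both `__setitem__` and `update` is that, in CPython, the built-in `dict.update` does not route through an overridden `__setitem__`. The following is a minimal sketch of that pattern, not ding's implementation: `SketchLogDict` and the body of `_transform` are assumptions made for illustration, and `array.array` stands in for `torch.Tensor` because both expose a `tolist()` method.

```python
from array import array


class SketchLogDict(dict):
    """Hypothetical stand-in for LogDict: stores list copies of array-like values."""

    def _transform(self, value):
        # Assumption: anything exposing tolist() (torch.Tensor, np.ndarray,
        # array.array) is converted; everything else is stored unchanged.
        return value.tolist() if hasattr(value, "tolist") else value

    def __setitem__(self, key, value):
        super().__setitem__(key, self._transform(value))

    def update(self, data):
        # dict.update bypasses an overridden __setitem__, so each item is
        # routed through it explicitly.
        for key, value in data.items():
            self[key] = value


log = SketchLogDict()
log["loss"] = array("d", [0.5, 0.25])
log.update({"reward": array("d", [1.0]), "step": 3})
```

After these calls, every array-like value in `log` is held as a plain list, while the integer `step` is stored as given.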
CudaFetcher
Bases: object
Overview
Fetch data from source, and transfer it to a specified device.
Interfaces:
__init__, __next__, run, close.
__init__(data_source, device, queue_size=4, sleep=0.1)
Overview
Initialize the CudaFetcher object using the given arguments.
Arguments:
- data_source (:obj:Iterable): The iterable data source.
- device (:obj:str): The device to put data to, such as "cuda:0".
- queue_size (:obj:int): The maximum size of the internal queue, such as 4.
- sleep (:obj:float): Sleeping time when the internal queue is full.
__next__()
Overview
Respond to a request for data by returning one data item from the internal queue.
Returns:
- item (:obj:Any): The data item on the required device.
run()
Overview
Start the producer thread, which keeps fetching data from the source, moves it to the target device, and puts it into the queue for later requests.
Examples:
>>> timer = EasyTimer()
>>> dataloader = iter([torch.randn(3, 3) for _ in range(10)])
>>> dataloader = CudaFetcher(dataloader, device='cuda', sleep=0.1)
>>> dataloader.run()
>>> data = next(dataloader)
close()
Overview
Stop the producer thread by setting end_flag to True.
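The interfaces above (`run`, `__next__`, `close`, a bounded queue, and a sleep interval used when the queue is full) describe a producer/consumer prefetcher. The sketch below illustrates that pattern with the standard library only. It is an assumption-laden illustration rather than ding's code: `SketchFetcher` is a hypothetical name, and the `transfer` callable stands in for the device copy.

```python
import queue
import threading
import time


class SketchFetcher:
    """Hypothetical prefetcher; `transfer` stands in for moving data to a device."""

    def __init__(self, data_source, transfer, queue_size=4, sleep=0.1):
        self._source = iter(data_source)
        self._transfer = transfer
        self._queue = queue.Queue(maxsize=queue_size)
        self._sleep = sleep
        self._end_flag = False
        self._thread = threading.Thread(target=self._producer, daemon=True)

    def __iter__(self):
        return self

    def __next__(self):
        # Blocks until the producer has put an item into the queue.
        return self._queue.get()

    def run(self):
        self._thread.start()

    def close(self):
        self._end_flag = True

    def _producer(self):
        while not self._end_flag:
            if self._queue.full():
                time.sleep(self._sleep)  # back off while the consumer catches up
                continue
            try:
                item = next(self._source)
            except StopIteration:
                break  # source exhausted
            self._queue.put(self._transfer(item))


fetcher = SketchFetcher(range(10), transfer=lambda x: x * 2, queue_size=4, sleep=0.01)
fetcher.run()
first, second = next(fetcher), next(fetcher)
fetcher.close()
```

In this sketch `__next__` blocks indefinitely once the source is exhausted and the queue is empty, so a caller would need its own stopping condition.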
to_device(item, device, ignore_keys=[])
Overview
Transfer data to a specified device.
Arguments:
- item (:obj:Any): The item to be transferred.
- device (:obj:str): The target device.
- ignore_keys (:obj:list): The keys to be ignored in the transfer. Defaults to an empty list.
Returns:
- item (:obj:Any): The transferred item.
Examples:
>>> setup_data_dict['module'] = nn.Linear(3, 5)
>>> device = 'cuda'
>>> cuda_d = to_device(setup_data_dict, device, ignore_keys=['module'])
>>> assert cuda_d['module'].weight.device == torch.device('cpu')
Examples:
>>> setup_data_dict['module'] = nn.Linear(3, 5)
>>> device = 'cuda'
>>> cuda_d = to_device(setup_data_dict, device)
>>> assert cuda_d['module'].weight.device == torch.device('cuda:0')
.. note::
Now supports item type: :obj:`torch.nn.Module`, :obj:`torch.Tensor`, :obj:`Sequence`, :obj:`dict`, :obj:`numbers.Integral`, :obj:`numbers.Real`, :obj:`np.ndarray`, :obj:`str` and :obj:`None`.
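The effect of `ignore_keys` on a nested item can be illustrated with a small recursive helper. This is a sketch under stated assumptions, not the library's code: `apply_nested` is a hypothetical name, the tagging lambda stands in for an actual device transfer, sequences are returned as lists for brevity, and applying `ignore_keys` to nested dicts as well is a choice made by this sketch.

```python
from collections.abc import Mapping, Sequence


def apply_nested(item, fn, ignore_keys=()):
    """Hypothetical helper: apply `fn` to every leaf of a nested dict/sequence."""
    if isinstance(item, Mapping):
        # Values stored under an ignored key are passed through untouched.
        return {
            k: v if k in ignore_keys else apply_nested(v, fn, ignore_keys)
            for k, v in item.items()
        }
    if isinstance(item, str) or item is None:
        return item  # strings and None have nothing to transfer
    if isinstance(item, Sequence):
        return [apply_nested(v, fn, ignore_keys) for v in item]
    return fn(item)


data = {"obs": [1, 2], "meta": {"id": 7}, "module": 3, "name": "env"}
# The lambda tags each leaf instead of really moving it to a device.
moved = apply_nested(data, lambda x: ("cuda", x), ignore_keys=["module"])
```

Here `moved["module"]` is still the original value, while every other numeric leaf has been passed through the stand-in transfer function.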
to_dtype(item, dtype)
Overview
Convert data to a specified dtype.
Arguments:
- item (:obj:Any): The item whose dtype is to be changed.
- dtype (:obj:type): The target dtype.
Returns:
- item (:obj:object): The item with changed dtype.
Examples (tensor):
>>> t = torch.randint(0, 10, (3, 5))
>>> tfloat = to_dtype(t, torch.float)
>>> assert tfloat.dtype == torch.float
Examples (list):
>>> tlist = [torch.randint(0, 10, (3, 5))]
>>> tlfloat = to_dtype(tlist, torch.float)
>>> assert tlfloat[0].dtype == torch.float
Examples (dict):
>>> tdict = {'t': torch.randint(0, 10, (3, 5))}
>>> tdictf = to_dtype(tdict, torch.float)
>>> assert tdictf['t'].dtype == torch.float
.. note::
Now supports item type: :obj:`torch.Tensor`, :obj:`Sequence`, :obj:`dict`.
to_tensor(item, dtype=None, ignore_keys=[], transform_scalar=True)
Overview
Convert numpy.ndarray object to torch.Tensor.
Arguments:
- item (:obj:Any): The numpy.ndarray objects to be converted. It can be exactly a numpy.ndarray object or a container (list, tuple or dict) that contains several numpy.ndarray objects.
- dtype (:obj:torch.dtype): The dtype of the resulting tensor. If set to None, the dtype is left unchanged.
- ignore_keys (:obj:list): If the item is a dict, values whose keys are in ignore_keys will not be converted.
- transform_scalar (:obj:bool): If set to True, a scalar will also be converted to a tensor object.
Returns:
- item (:obj:Any): The converted tensors.
Examples (scalar):
>>> i = 10
>>> t = to_tensor(i)
>>> assert t.item() == i
Examples (dict):
>>> d = {'i': i}
>>> dt = to_tensor(d, torch.int)
>>> assert dt['i'].item() == i
Examples (named tuple):
>>> data_type = namedtuple('data_type', ['x', 'y'])
>>> inputs = data_type(np.random.random(3), 4)
>>> outputs = to_tensor(inputs, torch.float32)
>>> assert type(outputs) == data_type
>>> assert isinstance(outputs.x, torch.Tensor)
>>> assert isinstance(outputs.y, torch.Tensor)
>>> assert outputs.x.dtype == torch.float32
>>> assert outputs.y.dtype == torch.float32
.. note::
Now supports item type: :obj:`dict`, :obj:`list`, :obj:`tuple` and :obj:`None`.
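The named tuple example asserts that the output keeps the input's named tuple type. One common way to get that behavior is to detect the `_fields` attribute and rebuild the tuple from positional arguments. The sketch below shows the idea with the standard library only; `convert_leaves` is a hypothetical helper, and `float` stands in for tensor construction.

```python
from collections import namedtuple


def convert_leaves(item, fn):
    """Hypothetical helper: convert leaves while keeping container types."""
    if isinstance(item, dict):
        return {k: convert_leaves(v, fn) for k, v in item.items()}
    if isinstance(item, tuple) and hasattr(item, "_fields"):
        # Named tuples take their fields positionally, not as one iterable.
        return type(item)(*(convert_leaves(v, fn) for v in item))
    if isinstance(item, (list, tuple)):
        return type(item)(convert_leaves(v, fn) for v in item)
    if item is None:
        return None
    return fn(item)


Point = namedtuple("Point", ["x", "y"])
out = convert_leaves({"p": Point(1, 2), "pair": (3, 4), "none": None}, float)
```

The named tuple branch has to come before the generic tuple branch, because a named tuple is also an instance of `tuple`.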
to_ndarray(item, dtype=None)
Overview
Convert torch.Tensor to numpy.ndarray.
Arguments:
- item (:obj:Any): The torch.Tensor objects to be converted. It can be exactly a torch.Tensor object or a container (list, tuple or dict) that contains several torch.Tensor objects.
- dtype (:obj:np.dtype): The dtype of the resulting array. If set to None, the dtype is left unchanged.
Returns:
- item (:obj:object): The converted arrays.
Examples (ndarray):
>>> t = torch.randn(3, 5)
>>> tarray1 = to_ndarray(t)
>>> assert tarray1.shape == (3, 5)
>>> assert isinstance(tarray1, np.ndarray)
Examples (list):
>>> t = [torch.randn(5, ) for i in range(3)]
>>> tarray1 = to_ndarray(t, np.float32)
>>> assert isinstance(tarray1, list)
>>> assert tarray1[0].shape == (5, )
>>> assert isinstance(tarray1[0], np.ndarray)
.. note::
Now supports item type: :obj:`torch.Tensor`, :obj:`dict`, :obj:`list`, :obj:`tuple` and :obj:`None`.
to_list(item)
Overview
Convert torch.Tensor, numpy.ndarray objects to list objects, and keep their dtypes unchanged.
Arguments:
- item (:obj:Any): The item to be converted.
Returns:
- item (:obj:Any): The list after conversion.
Examples:
>>> data = {
...     'tensor': torch.randn(4),
...     'list': [True, False, False],
...     'tuple': (4, 5, 6),
...     'bool': True,
...     'int': 10,
...     'float': 10.,
...     'array': np.random.randn(4),
...     'str': "asdf",
...     'none': None,
... }
>>> transformed_data = to_list(data)
.. note::
Now supports item type: :obj:`torch.Tensor`, :obj:`numpy.ndarray`, :obj:`dict`, :obj:`list`, :obj:`tuple` and :obj:`None`.
tensor_to_list(item)
Overview
Convert torch.Tensor objects to list, and keep their dtypes unchanged.
Arguments:
- item (:obj:Any): The item to be converted.
Returns:
- item (:obj:Any): The lists after conversion.
Examples (2d-tensor):
>>> t = torch.randn(3, 5)
>>> tlist1 = tensor_to_list(t)
>>> assert len(tlist1) == 3
>>> assert len(tlist1[0]) == 5
Examples (1d-tensor):
>>> t = torch.randn(3, )
>>> tlist1 = tensor_to_list(t)
>>> assert len(tlist1) == 3
Examples (list):
>>> t = [torch.randn(5, ) for i in range(3)]
>>> tlist1 = tensor_to_list(t)
>>> assert len(tlist1) == 3
>>> assert len(tlist1[0]) == 5
Examples (dict):
>>> td = {'t': torch.randn(3, 5)}
>>> tdlist1 = tensor_to_list(td)
>>> assert len(tdlist1['t']) == 3
>>> assert len(tdlist1['t'][0]) == 5
.. note::
Now supports item type: :obj:`torch.Tensor`, :obj:`dict`, :obj:`list`, :obj:`tuple` and :obj:`None`.
to_item(data, ignore_error=True)
Overview
Convert data to Python native scalars (i.e. data items), keeping their dtypes unchanged.
Arguments:
- data (:obj:Any): The data that needs to be converted.
- ignore_error (:obj:bool): Whether to ignore the error when the data type is not supported. That is, only data that can be transformed into a Python native scalar will be returned.
Returns:
- data (:obj:Any): Converted data.
Examples:
>>> data = {
...     'tensor': torch.randn(1),
...     'list': [True, False, torch.randn(1)],
...     'tuple': (4, 5, 6),
...     'bool': True,
...     'int': 10,
...     'float': 10.,
...     'array': np.random.randn(1),
...     'str': "asdf",
...     'none': None,
... }
>>> new_data = to_item(data)
>>> assert np.isscalar(new_data['tensor'])
>>> assert np.isscalar(new_data['array'])
>>> assert np.isscalar(new_data['list'][-1])
.. note::
Now supports item type: :obj:`torch.Tensor`, :obj:`numpy.ndarray`, :obj:`ttorch.Tensor`, :obj:`bool`, :obj:`str`, :obj:`dict`, :obj:`list`, :obj:`tuple` and :obj:`None`.
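The `ignore_error` description suggests that entries which cannot be reduced to a scalar are left out of the result rather than raising. The sketch below illustrates one way such behavior could work. It rests on assumptions: `to_item_sketch` and `FakeArray` are hypothetical names, `FakeArray` stands in for a tensor or array whose `item()` only succeeds for a single element, and dropping failed dict entries is this sketch's reading of the argument description.

```python
class FakeArray:
    """Stand-in for a tensor/array: item() only works for a single element."""

    def __init__(self, values):
        self.values = list(values)

    def item(self):
        if len(self.values) != 1:
            raise ValueError("only one-element arrays can be converted to a scalar")
        return self.values[0]


def to_item_sketch(data, ignore_error=True):
    """Hypothetical sketch of scalar extraction over nested containers."""
    if data is None or isinstance(data, (bool, int, float, str)):
        return data
    if hasattr(data, "item"):
        return data.item()
    if isinstance(data, (list, tuple)):
        return [to_item_sketch(d, ignore_error) for d in data]
    if isinstance(data, dict):
        out = {}
        for key, value in data.items():
            try:
                out[key] = to_item_sketch(value, ignore_error)
            except ValueError:
                if not ignore_error:
                    raise
                # Assumption: entries that cannot be converted are dropped.
        return out
    raise ValueError(f"unsupported type: {type(data)}")


data = {"loss": FakeArray([0.5]), "batch": FakeArray([1, 2]), "step": 3}
kept = to_item_sketch(data)
```

With `ignore_error=False`, the same input raises `ValueError` for the two-element `batch` entry instead of skipping it.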
same_shape(data)
Overview
Judge whether all data elements in a list have the same shape.
Arguments:
- data (:obj:list): The list of data.
Returns:
- same (:obj:bool): Whether the list of data all have the same shape.
Examples:
>>> tlist = [torch.randn(3, 5) for i in range(5)]
>>> assert same_shape(tlist)
>>> tlist = [torch.randn(3, 5), torch.randn(4, 5)]
>>> assert not same_shape(tlist)
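The check can be pictured as comparing every element's `shape` against the first one. The following is a minimal sketch of that idea, not the library's code: `same_shape_sketch` is a hypothetical name, and `FakeTensor` is a stand-in that exposes only a `shape` attribute so the example runs without torch.

```python
def same_shape_sketch(data):
    """Hypothetical equivalent: True when every element reports the same shape."""
    shapes = [tuple(t.shape) for t in data]
    return all(s == shapes[0] for s in shapes)


class FakeTensor:
    """Stand-in for torch.Tensor exposing only a shape attribute."""

    def __init__(self, *shape):
        self.shape = shape


uniform = [FakeTensor(3, 5) for _ in range(5)]
mixed = [FakeTensor(3, 5), FakeTensor(4, 5)]
uniform_ok = same_shape_sketch(uniform)
mixed_ok = same_shape_sketch(mixed)
```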
build_log_buffer()
Overview
Build log buffer, a subclass of dict, which can convert the input data into log format.
Returns:
- log_buffer (:obj:LogDict): Log buffer dict.
Examples:
>>> log_buffer = build_log_buffer()
>>> log_buffer['not_tensor'] = torch.randn(3)
>>> assert isinstance(log_buffer['not_tensor'], list)
>>> assert len(log_buffer['not_tensor']) == 3
>>> log_buffer.update({'not_tensor': 4, 'a': 5})
>>> assert log_buffer['not_tensor'] == 4
get_tensor_data(data)
Overview
Get pure tensor data from the given data (without disturbing grad computation graph).
Arguments:
- data (:obj:Any): The original data. It can be exactly a tensor or a container (Sequence or dict).
Returns:
- output (:obj:Any): The output data.
Examples:
>>> a = {
...     'tensor': torch.tensor([1, 2, 3.], requires_grad=True),
...     'list': [torch.tensor([1, 2, 3.], requires_grad=True) for _ in range(2)],
...     'none': None
... }
>>> tensor_a = get_tensor_data(a)
>>> assert not tensor_a['tensor'].requires_grad
>>> for t in tensor_a['list']:
...     assert not t.requires_grad
unsqueeze(data, dim=0)
Overview
Unsqueeze the tensor data.
Arguments:
- data (:obj:Any): The original data. It can be exactly a tensor or a container (Sequence or dict).
- dim (:obj:int): The dimension to be unsqueezed.
Returns:
- output (:obj:Any): The output data.
Examples (tensor):
>>> t = torch.randn(3, 3)
>>> tt = unsqueeze(t, dim=0)
>>> assert tt.shape == torch.Size([1, 3, 3])
Examples (list):
>>> t = [torch.randn(3, 3)]
>>> tt = unsqueeze(t, dim=0)
>>> assert tt[0].shape == torch.Size([1, 3, 3])
Examples (dict):
>>> t = {"t": torch.randn(3, 3)}
>>> tt = unsqueeze(t, dim=0)
>>> assert tt["t"].shape == torch.Size([1, 3, 3])
squeeze(data, dim=0)
Overview
Squeeze the tensor data.
Arguments:
- data (:obj:Any): The original data. It can be exactly a tensor or a container (Sequence or dict).
- dim (:obj:int): The dimension to be squeezed.
Returns:
- output (:obj:Any): The output data.
Examples (tensor):
>>> t = torch.randn(1, 3, 3)
>>> tt = squeeze(t, dim=0)
>>> assert tt.shape == torch.Size([3, 3])
Examples (list):
>>> t = [torch.randn(1, 3, 3)]
>>> tt = squeeze(t, dim=0)
>>> assert tt[0].shape == torch.Size([3, 3])
Examples (dict):
>>> t = {"t": torch.randn(1, 3, 3)}
>>> tt = squeeze(t, dim=0)
>>> assert tt["t"].shape == torch.Size([3, 3])
get_null_data(template, num)
Overview
Get null data given an input template.
Arguments:
- template (:obj:Any): The template data.
- num (:obj:int): The number of null data items to generate.
Returns:
- output (:obj:List[Any]): The generated null data.
Examples:
>>> temp = {'obs': [1, 2, 3], 'action': 1, 'done': False, 'reward': torch.tensor(1.)}
>>> null_data = get_null_data(temp, 2)
>>> assert len(null_data) == 2
>>> assert null_data[0]['null'] and null_data[0]['done']
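The example shows that each generated item carries `null` and `done` flags set to true. A minimal sketch of how such items could be produced from a template follows. It is an illustration under assumptions: `make_null_data` is a hypothetical name, only the two flags confirmed by the example are set, and any handling of other fields (such as the reward) is left out.

```python
import copy


def make_null_data(template, num):
    """Hypothetical sketch: `num` independent copies flagged as null and done."""
    items = []
    for _ in range(num):
        item = copy.deepcopy(template)  # avoid sharing nested lists between copies
        item["null"] = True
        item["done"] = True
        items.append(item)
    return items


template = {"obs": [1, 2, 3], "action": 1, "done": False, "reward": 1.0}
null_data = make_null_data(template, 2)
null_data[0]["obs"].append(4)  # mutating one copy leaves the others untouched
```

Using a deep copy keeps the template and the generated items independent, so editing one null item does not leak into the rest.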
zeros_like(h)
Overview
Generate zero-tensors like the input data.
Arguments:
- h (:obj:Any): The original data. It can be exactly a tensor or a container (Sequence or dict).
Returns:
- output (:obj:Any): The output zero-tensors.
Examples (tensor):
>>> t = torch.randn(3, 3)
>>> tt = zeros_like(t)
>>> assert tt.shape == torch.Size([3, 3])
>>> assert torch.sum(torch.abs(tt)) < 1e-8
Examples (list):
>>> t = [torch.randn(3, 3)]
>>> tt = zeros_like(t)
>>> assert tt[0].shape == torch.Size([3, 3])
>>> assert torch.sum(torch.abs(tt[0])) < 1e-8
Examples (dict):
>>> t = {"t": torch.randn(3, 3)}
>>> tt = zeros_like(t)
>>> assert tt["t"].shape == torch.Size([3, 3])
>>> assert torch.sum(torch.abs(tt["t"])) < 1e-8
Full Source Code
../ding/torch_utils/data_helper.py