How to handle missing values in Torch?

10 months ago

Noah Thompson

1 minute

In Torch, dealing with missing values usually involves replacing the missing values with a specific value, such as 0 or NaN, before proceeding with the necessary data processing operations.

A common way to handle this is by using the torch.masked_fill_() function, which can replace specific values in the data based on a specified mask condition. For example, if missing values are represented by -1, you can use the following code to replace them with 0:

import torch

# 创建一个包含缺失值的张量
x = torch.tensor([1, 2, -1, 4, -1])

# 创建一个掩码，标记缺失值的位置
mask = x == -1

# 替换缺失值为0
x.masked_fill_(mask, 0)

print(x)

Another common approach is to use the torch.where() function, which selects values at corresponding positions between two tensors based on a specified condition. For example, missing values can be replaced with 0 using the following code:

import torch

# 创建一个包含缺失值的张量
x = torch.tensor([1, 2, -1, 4, -1])

# 创建一个掩码，标记缺失值的位置
mask = x == -1

# 替换缺失值为0
x = torch.where(mask, torch.tensor(0), x)

print(x)

These are two common methods for dealing with missing values, you can choose the most appropriate method based on the specific situation.