Azure - Machine Learning: Data Schemas for Training Computer Vision Models with Automated Machine Learning

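I. Data schema for training

Image AutoML in Azure Machine Learning requires input image data to be prepared in JSONL (JSON Lines) format. This section describes the input data formats, or schemas, for multi-class image classification, multi-label image classification, object detection, and instance segmentation. It also provides an example of a final training or validation JSON Lines file for each task.

Image classification (binary/multi-class)

Input data format/schema in each JSON line: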

{
   "image_url":"AmlDatastore://data_directory/../Image_name.image_format",
   "image_details":{
      "format":"image_format",
      "width":"image_width",
      "height":"image_height"
   },
   "label":"class_name"
}

| Key | Description | Example |
| --- | --- | --- |
| image_url | Image location in the Azure Machine Learning datastore (Required, String) | "AmlDatastore://data_directory/Image_01.jpg" |
| image_details | Image details (Optional, Dictionary) | "image_details":{"format": "jpg", "width": "400px", "height": "258px"} |
| format | Image type; all image formats available in the Pillow library are supported (Optional, String from {"jpg", "jpeg", "png", "jpe", "jfif", "bmp", "tif", "tiff"}) | "jpg" or "jpeg" or "png" or "jpe" or "jfif" or "bmp" or "tif" or "tiff" |
| width | Width of the image (Optional, String or Positive Integer) | "400px" or 400 |
| height | Height of the image (Optional, String or Positive Integer) | "200px" or 200 |
| label | Class/label of the image (Required, String) | "cat" |

Example of a JSONL file for multi-class image classification:

{"image_url": "AmlDatastore://image_data/Image_01.jpg", "image_details":{"format": "jpg", "width": "400px", "height": "258px"}, "label": "can"}
{"image_url": "AmlDatastore://image_data/Image_02.jpg", "image_details": {"format": "jpg", "width": "397px", "height": "296px"}, "label": "milk_bottle"}
.
.
.
{"image_url": "AmlDatastore://image_data/Image_n.jpg", "image_details": {"format": "jpg", "width": "1024px", "height": "768px"}, "label": "water_bottle"}
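A file in this shape can be produced with a few lines of Python. The following is only a minimal sketch: the helper name `make_classification_line` and the sample tuples are hypothetical, and it assumes the image sizes are already known. It emits `image_details` only when at least one detail is supplied, since that key is optional.

```python
import json

def make_classification_line(image_url, label, image_format=None, width=None, height=None):
    """Build one JSON line for multi-class image classification."""
    record = {"image_url": image_url}
    details = {}
    if image_format is not None:
        details["format"] = image_format
    if width is not None:
        details["width"] = f"{width}px"
    if height is not None:
        details["height"] = f"{height}px"
    if details:  # image_details is optional, so emit it only when something is known
        record["image_details"] = details
    record["label"] = label
    return json.dumps(record)

# Hypothetical samples: (image_url, label, format, width, height)
samples = [
    ("AmlDatastore://image_data/Image_01.jpg", "can", "jpg", 400, 258),
    ("AmlDatastore://image_data/Image_02.jpg", "milk_bottle", "jpg", 397, 296),
]
jsonl_text = "\n".join(make_classification_line(u, l, f, w, h) for u, l, f, w, h in samples)
print(jsonl_text)
```

Each record is serialized onto a single line, which is what the JSON Lines format requires.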

Multi-label image classification

The following is the input data format/schema in each JSON line for multi-label image classification.

{
   "image_url":"AmlDatastore://data_directory/../Image_name.image_format",
   "image_details":{
      "format":"image_format",
      "width":"image_width",
      "height":"image_height"
   },
   "label":[
      "class_name_1",
      "class_name_2",
      "class_name_3",
      "...",
      "class_name_n"
        
   ]
}

| Key | Description | Example |
| --- | --- | --- |
| image_url | Image location in the Azure Machine Learning datastore (Required, String) | "AmlDatastore://data_directory/Image_01.jpg" |
| image_details | Image details (Optional, Dictionary) | "image_details":{"format": "jpg", "width": "400px", "height": "258px"} |
| format | Image type; all image formats available in the Pillow library are supported (Optional, String from {"jpg", "jpeg", "png", "jpe", "jfif", "bmp", "tif", "tiff"}) | "jpg" or "jpeg" or "png" or "jpe" or "jfif" or "bmp" or "tif" or "tiff" |
| width | Width of the image (Optional, String or Positive Integer) | "400px" or 400 |
| height | Height of the image (Optional, String or Positive Integer) | "200px" or 200 |
| label | List of classes/labels in the image (Required, List of Strings) | ["cat","dog"] |

Example of a JSONL file for multi-label image classification:

{"image_url": "AmlDatastore://image_data/Image_01.jpg", "image_details":{"format": "jpg", "width": "400px", "height": "258px"}, "label": ["can"]}
{"image_url": "AmlDatastore://image_data/Image_02.jpg", "image_details": {"format": "jpg", "width": "397px", "height": "296px"}, "label": ["can","milk_bottle"]}
.
.
.
{"image_url": "AmlDatastore://image_data/Image_n.jpg", "image_details": {"format": "jpg", "width": "1024px", "height": "768px"}, "label": ["carton","milk_bottle","water_bottle"]}
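The main difference from the multi-class schema is that `label` is a list of strings rather than a single string. A small check along these lines can catch that mistake before a file is submitted; the function name `check_multilabel_line` is hypothetical and the checks cover only the two required keys, so treat it as a sketch rather than a full validator.

```python
import json

def check_multilabel_line(line):
    """Return a list of problems found in one multi-label JSON line (empty if none)."""
    problems = []
    record = json.loads(line)
    if not isinstance(record.get("image_url"), str):
        problems.append("image_url must be a string")
    label = record.get("label")
    if not isinstance(label, list) or not all(isinstance(x, str) for x in label):
        problems.append("label must be a list of strings")
    return problems

good = '{"image_url": "AmlDatastore://image_data/Image_02.jpg", "label": ["can", "milk_bottle"]}'
bad = '{"image_url": "AmlDatastore://image_data/Image_01.jpg", "label": "can"}'
print(check_multilabel_line(good))  # []
print(check_multilabel_line(bad))   # ['label must be a list of strings']
```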

Object detection

The following is the input data format/schema in each JSON line for object detection.

{
   "image_url":"AmlDatastore://data_directory/../Image_name.image_format",
   "image_details":{
      "format":"image_format",
      "width":"image_width",
      "height":"image_height"
   },
   "label":[
      {
         "label":"class_name_1",
         "topX":"xmin/width",
         "topY":"ymin/height",
         "bottomX":"xmax/width",
         "bottomY":"ymax/height",
         "isCrowd":"isCrowd"
      },
      {
         "label":"class_name_2",
         "topX":"xmin/width",
         "topY":"ymin/height",
         "bottomX":"xmax/width",
         "bottomY":"ymax/height",
         "isCrowd":"isCrowd"
      },
      "..."
   ]
}

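Where:

xmin = x-coordinate of the top-left corner of the bounding box
ymin = y-coordinate of the top-left corner of the bounding box
xmax = x-coordinate of the bottom-right corner of the bounding box
ymax = y-coordinate of the bottom-right corner of the bounding box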
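The schema stores each corner as a ratio (for example `xmin/width`), not as a pixel value. As a minimal sketch, assuming the annotations start out as pixel coordinates, the conversion could look like this; `normalize_box` and the sample numbers are hypothetical.

```python
def normalize_box(xmin, ymin, xmax, ymax, width, height):
    """Convert pixel corner coordinates to the ratios used by topX/topY/bottomX/bottomY."""
    return {
        "topX": xmin / width,
        "topY": ymin / height,
        "bottomX": xmax / width,
        "bottomY": ymax / height,
    }

# Hypothetical box on a 500 x 400 pixel image.
box = normalize_box(xmin=100, ymin=80, xmax=250, ymax=300, width=500, height=400)
print(box)  # {'topX': 0.2, 'topY': 0.2, 'bottomX': 0.5, 'bottomY': 0.75}
```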
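
| Key | Description | Example |
| --- | --- | --- |
| image_url | Image location in the Azure Machine Learning datastore (Required, String) | "AmlDatastore://data_directory/Image_01.jpg" |
| image_details | Image details (Optional, Dictionary) | "image_details":{"format": "jpg", "width": "400px", "height": "258px"} |
| format | Image type; all image formats available in the Pillow library are supported, but for YOLO only the image formats allowed by opencv are supported (Optional, String from {"jpg", "jpeg", "png", "jpe", "jfif", "bmp", "tif", "tiff"}) | "jpg" or "jpeg" or "png" or "jpe" or "jfif" or "bmp" or "tif" or "tiff" |
| width | Width of the image (Optional, String or Positive Integer) | "499px" or 499 |
| height | Height of the image (Optional, String or Positive Integer) | "665px" or 665 |
| label (outer key) | List of bounding boxes, where each box is a dictionary of label, topX, topY, bottomX, bottomY, isCrowd describing its top-left and bottom-right coordinates (Required, List of dictionaries) | [{"label": "cat", "topX": 0.260, "topY": 0.406, "bottomX": 0.735, "bottomY": 0.701, "isCrowd": 0}] |
| label (inner key) | Class/label of the object in the bounding box (Required, String) | "cat" |
| topX | Ratio of the x-coordinate of the top-left corner of the bounding box to the image width (Required, Float in the range [0,1]) | 0.260 |
| topY | Ratio of the y-coordinate of the top-left corner of the bounding box to the image height (Required, Float in the range [0,1]) | 0.406 |
| bottomX | Ratio of the x-coordinate of the bottom-right corner of the bounding box to the image width (Required, Float in the range [0,1]) | 0.735 |
| bottomY | Ratio of the y-coordinate of the bottom-right corner of the bounding box to the image height (Required, Float in the range [0,1]) | 0.701 |
| isCrowd | Indicates whether the bounding box surrounds a crowd of objects. If this special flag is set, this bounding box is skipped when calculating metrics (Optional, Bool) | |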

Example of a JSONL file for object detection:

{"image_url": "AmlDatastore://image_data/Image_01.jpg", "image_details": {"format": "jpg", "width": "499px", "height": "666px"}, "label": [{"label": "can", "topX": 0.260, "topY": 0.406, "bottomX": 0.735, "bottomY": 0.701, "isCrowd": 0}]}
{"image_url": "AmlDatastore://image_data/Image_02.jpg", "image_details": {"format": "jpg", "width": "499px", "height": "666px"}, "label": [{"label": "carton", "topX": 0.172, "topY": 0.153, "bottomX": 0.432, "bottomY": 0.659, "isCrowd": 0}, {"label": "milk_bottle", "topX": 0.300, "topY": 0.566, "bottomX": 0.891, "bottomY": 0.735, "isCrowd": 0}]}
.
.
.
{"image_url": "AmlDatastore://image_data/Image_n.jpg", "image_details": {"format": "jpg", "width": "499px", "height": "666px"}, "label": [{"label": "carton", "topX": 0.0180, "topY": 0.297, "bottomX": 0.380, "bottomY": 0.836, "isCrowd": 0}, {"label": "milk_bottle", "topX": 0.454, "topY": 0.348, "bottomX": 0.613, "bottomY": 0.683, "isCrowd": 0}, {"label": "water_bottle", "topX": 0.667, "topY": 0.279, "bottomX": 0.841, "bottomY": 0.615, "isCrowd": 0}]}
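Because every coordinate must be a ratio in [0, 1], it can be worth checking each line before training. The sketch below is one hedged way to do so: `check_detection_line` is a hypothetical helper, and the ordering check (top-left before bottom-right) is an assumption derived from the corner definitions rather than a documented rule.

```python
import json

COORD_KEYS = ("topX", "topY", "bottomX", "bottomY")

def check_detection_line(line):
    """Return a list of problems found in one object detection JSON line (empty if none)."""
    problems = []
    record = json.loads(line)
    boxes = record.get("label")
    if not isinstance(boxes, list):
        return ["label must be a list of dictionaries"]
    for i, box in enumerate(boxes):
        if not isinstance(box.get("label"), str):
            problems.append(f"box {i}: label must be a string")
        coords = [box.get(key) for key in COORD_KEYS]
        if not all(isinstance(v, (int, float)) and 0 <= v <= 1 for v in coords):
            problems.append(f"box {i}: coordinates must be numbers in [0, 1]")
        elif not (coords[0] < coords[2] and coords[1] < coords[3]):
            # Assumed sanity check: the top-left corner precedes the bottom-right one.
            problems.append(f"box {i}: top-left corner must precede bottom-right corner")
    return problems

good = '{"image_url": "AmlDatastore://image_data/Image_01.jpg", "label": [{"label": "can", "topX": 0.260, "topY": 0.406, "bottomX": 0.735, "bottomY": 0.701, "isCrowd": 0}]}'
bad = '{"image_url": "AmlDatastore://image_data/Image_01.jpg", "label": [{"label": "can", "topX": 1.2, "topY": 0.406, "bottomX": 0.735, "bottomY": 0.701}]}'
print(check_detection_line(good))  # []
print(check_detection_line(bad))   # ['box 0: coordinates must be numbers in [0, 1]']
```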

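Instance segmentation

For instance segmentation, automated ML supports only polygons as input and output, not masks.

The following is the input data format/schema in each JSON line for instance segmentation.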

{
   "image_url":"AmlDatastore://data_directory/../Image_name.image_format",
   "image_details":{
      "format":"image_format",
      "width":"image_width",
      "height":"image_height"
   },
   "label":[
      {
         "label":"class_name",
         "isCrowd":"isCrowd",
         "polygon":[["x1", "y1", "x2", "y2", "x3", "y3", "...", "xn", "yn"]]
      }
   ]
}

| Key | Description | Example |
| --- | --- | --- |
| image_url | Image location in the Azure Machine Learning datastore (Required, String) | "AmlDatastore://data_directory/Image_01.jpg" |
| image_details | Image details (Optional, Dictionary) | "image_details":{"format": "jpg", "width": "400px", "height": "258px"} |
| format | Image type (Optional, String from {"jpg", "jpeg", "png", "jpe", "jfif", "bmp", "tif", "tiff"}) | "jpg" or "jpeg" or "png" or "jpe" or "jfif" or "bmp" or "tif" or "tiff" |
| width | Width of the image (Optional, String or Positive Integer) | "499px" or 499 |
| height | Height of the image (Optional, String or Positive Integer) | "665px" or 665 |
| label (outer key) | List of masks, where each mask is a dictionary of label, isCrowd, polygon coordinates (Required, List of dictionaries) | [{"label": "can", "isCrowd": 0, "polygon": [[0.577, 0.689, 0.562, 0.681, 0.559, 0.686]]}] |
| label (inner key) | Class/label of the object in the mask (Required, String) | "cat" |
| isCrowd | Indicates whether the mask surrounds a crowd of objects (Optional, Bool) | |
| polygon | Polygon coordinates for the object (Required, List of list for multiple segments of the same instance. Float values in the range [0,1]) | [[0.577, 0.689, 0.567, 0.689, 0.559, 0.686]] |

Example of a JSONL file for instance segmentation:

{"image_url": "AmlDatastore://image_data/Image_01.jpg", "image_details": {"format": "jpg", "width": "499px", "height": "666px"}, "label": [{"label": "can", "isCrowd": 0, "polygon": [[0.577, 0.689, 0.567, 0.689, 0.559, 0.686, 0.380, 0.593, 0.304, 0.555, 0.294, 0.545, 0.290, 0.534, 0.274, 0.512, 0.2705, 0.496, 0.270, 0.478, 0.284, 0.453, 0.308, 0.432, 0.326, 0.423, 0.356, 0.415, 0.418, 0.417, 0.635, 0.493, 0.683, 0.507, 0.701, 0.518, 0.709, 0.528, 0.713, 0.545, 0.719, 0.554, 0.719, 0.579, 0.713, 0.597, 0.697, 0.621, 0.695, 0.629, 0.631, 0.678, 0.619, 0.683, 0.595, 0.683, 0.577, 0.689]]}]}
{"image_url": "AmlDatastore://image_data/Image_02.jpg", "image_details": {"format": "jpg", "width": "499px", "height": "666px"}, "label": [{"label": "carton", "isCrowd": 0, "polygon": [[0.240, 0.65, 0.234, 0.654, 0.230, 0.647, 0.210, 0.512, 0.202, 0.403, 0.182, 0.267, 0.184, 0.243, 0.180, 0.166, 0.186, 0.159, 0.198, 0.156, 0.396, 0.162, 0.408, 0.169, 0.406, 0.217, 0.414, 0.249, 0.422, 0.262, 0.422, 0.569, 0.342, 0.569, 0.334, 0.572, 0.320, 0.585, 0.308, 0.624, 0.306, 0.648, 0.240, 0.657]]}, {"label": "milk_bottle",  "isCrowd": 0, "polygon": [[0.675, 0.732, 0.635, 0.731, 0.621, 0.725, 0.573, 0.717, 0.516, 0.717, 0.505, 0.720, 0.462, 0.722, 0.438, 0.719, 0.396, 0.719, 0.358, 0.714, 0.334, 0.714, 0.322, 0.711, 0.312, 0.701, 0.306, 0.687, 0.304, 0.663, 0.308, 0.630, 0.320, 0.596, 0.32, 0.588, 0.326, 0.579]]}]}
.
.
.
{"image_url": "AmlDatastore://image_data/Image_n.jpg", "image_details": {"format": "jpg", "width": "499px", "height": "666px"}, "label": [{"label": "water_bottle", "isCrowd": 0, "polygon": [[0.334, 0.626, 0.304, 0.621, 0.254, 0.603, 0.164, 0.605, 0.158, 0.602, 0.146, 0.602, 0.142, 0.608, 0.094, 0.612, 0.084, 0.599, 0.080, 0.585, 0.080, 0.539, 0.082, 0.536, 0.092, 0.533, 0.126, 0.530, 0.132, 0.533, 0.144, 0.533, 0.162, 0.525, 0.172, 0.525, 0.186, 0.521, 0.196, 0.521 ]]}, {"label": "milk_bottle", "isCrowd": 0, "polygon": [[0.392, 0.773, 0.380, 0.732, 0.379, 0.767, 0.367, 0.755, 0.362, 0.735, 0.362, 0.714, 0.352, 0.644, 0.352, 0.611, 0.362, 0.597, 0.40, 0.593, 0.444,  0.494, 0.588, 0.515, 0.585, 0.621, 0.588, 0.671, 0.582, 0.713, 0.572, 0.753 ]]}]}
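Each inner list under `polygon` is a flat sequence x1, y1, x2, y2, ... of values in [0, 1]. The sketch below shows one way to build such a list from pixel points. It assumes, by analogy with the bounding-box ratios, that x is divided by the image width and y by the image height; `normalize_polygon` and the sample triangle are hypothetical.

```python
def normalize_polygon(points, width, height):
    """Flatten [(x, y), ...] pixel points into [x1, y1, x2, y2, ...] ratios in [0, 1]."""
    flat = []
    for x, y in points:
        flat.append(round(x / width, 3))   # assumed: x is scaled by image width
        flat.append(round(y / height, 3))  # assumed: y is scaled by image height
    return flat

# Hypothetical triangle outlined on a 500 x 400 pixel image.
segment = normalize_polygon([(100, 80), (250, 80), (250, 300)], width=500, height=400)
# One inner list per segment of the same instance.
label_entry = {"label": "can", "isCrowd": 0, "polygon": [segment]}
print(label_entry["polygon"])  # [[0.2, 0.2, 0.5, 0.2, 0.5, 0.75]]
```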

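II. Data format for inference

This section documents the input data format required to make predictions with a deployed model. Any of the image formats listed above is accepted, with content type application/octet-stream.

Input format

The following is the input format needed to generate predictions for any task using a task-specific model endpoint. After the model is deployed, the following code snippet can be used to get predictions for all tasks.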

import requests

# placeholders: replace with the scoring URI and key of your deployed endpoint
scoring_uri = '<scoring-uri>'
key = '<endpoint-key>'

# input image for inference
sample_image = './test_image.jpg'
# load image data
with open(sample_image, 'rb') as f:
    data = f.read()
# set the content type
headers = {'Content-Type': 'application/octet-stream'}
# if authentication is enabled, set the authorization header
headers['Authorization'] = f'Bearer {key}'
# make the request and display the response
response = requests.post(scoring_uri, data=data, headers=headers)
print(response.json())

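Output format

Predictions made on model endpoints follow a different structure depending on the task type. This section covers the output data formats for multi-class and multi-label image classification, object detection, and instance segmentation tasks.

Image classification

The endpoint for image classification returns all the labels in the dataset and their probability scores for the input image, in the following format: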

{
   "filename":"/tmp/tmppjr4et28",
   "probs":[
      2.098e-06,
      4.783e-08,
      0.999,
      8.637e-06
   ],
   "labels":[
      "can",
      "carton",
      "milk_bottle",
      "water_bottle"
   ]
}
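Since `probs` and `labels` are parallel lists, the predicted class is the label at the position of the largest probability. A minimal sketch, using the response above as a plain dictionary (the helper name `top_prediction` is hypothetical):

```python
response_body = {
    "filename": "/tmp/tmppjr4et28",
    "probs": [2.098e-06, 4.783e-08, 0.999, 8.637e-06],
    "labels": ["can", "carton", "milk_bottle", "water_bottle"],
}

def top_prediction(body):
    """Return the (label, probability) pair with the highest probability."""
    return max(zip(body["labels"], body["probs"]), key=lambda pair: pair[1])

print(top_prediction(response_body))  # ('milk_bottle', 0.999)
```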

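Multi-label image classification

For multi-label image classification, the model endpoint returns the labels and their probabilities.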

{
   "filename":"/tmp/tmpsdzxlmlm",
   "probs":[
      0.997,
      0.960,
      0.982,
      0.025
   ],
   "labels":[
      "can",
      "carton",
      "milk_bottle",
      "water_bottle"
   ]
}
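Unlike the multi-class case, several labels can apply to the same image, so a common approach is to keep every label whose probability clears a threshold. The sketch below uses 0.5, which is an arbitrary illustrative choice and not a value taken from the service; `labels_above` is a hypothetical helper.

```python
response_body = {
    "filename": "/tmp/tmpsdzxlmlm",
    "probs": [0.997, 0.960, 0.982, 0.025],
    "labels": ["can", "carton", "milk_bottle", "water_bottle"],
}

def labels_above(body, threshold=0.5):
    """Return the labels whose probability is at least the threshold."""
    return [label for label, prob in zip(body["labels"], body["probs"]) if prob >= threshold]

print(labels_above(response_body))  # ['can', 'carton', 'milk_bottle']
```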

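Object detection

The object detection model returns multiple boxes with their scaled top-left and bottom-right coordinates, along with the box label and confidence score.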

{
   "filename":"/tmp/tmpdkg2wkdy",
   "boxes":[
      {
         "box":{
            "topX":0.224,
            "topY":0.285,
            "bottomX":0.399,
            "bottomY":0.620
         },
         "label":"milk_bottle",
         "score":0.937
      },
      {
         "box":{
            "topX":0.664,
            "topY":0.484,
            "bottomX":0.959,
            "bottomY":0.812
         },
         "label":"can",
         "score":0.891
      },
      {
         "box":{
            "topX":0.423,
            "topY":0.253,
            "bottomX":0.632,
            "bottomY":0.725
         },
         "label":"water_bottle",
         "score":0.876
      }
   ]
}
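To draw the boxes on the original image, the scaled coordinates have to be multiplied back by the image size. The response does not carry that size, so the sketch below takes it as an argument; the 499 x 666 size, the `min_score` cutoff of 0.9, and the helper name `to_pixel_boxes` are all assumptions for illustration.

```python
response_body = {
    "filename": "/tmp/tmpdkg2wkdy",
    "boxes": [
        {"box": {"topX": 0.224, "topY": 0.285, "bottomX": 0.399, "bottomY": 0.620},
         "label": "milk_bottle", "score": 0.937},
        {"box": {"topX": 0.664, "topY": 0.484, "bottomX": 0.959, "bottomY": 0.812},
         "label": "can", "score": 0.891},
        {"box": {"topX": 0.423, "topY": 0.253, "bottomX": 0.632, "bottomY": 0.725},
         "label": "water_bottle", "score": 0.876},
    ],
}

def to_pixel_boxes(body, width, height, min_score=0.0):
    """Scale boxes back to pixel coordinates, dropping those below min_score."""
    results = []
    for item in body["boxes"]:
        if item["score"] < min_score:
            continue
        box = item["box"]
        results.append({
            "label": item["label"],
            "score": item["score"],
            "xmin": round(box["topX"] * width),
            "ymin": round(box["topY"] * height),
            "xmax": round(box["bottomX"] * width),
            "ymax": round(box["bottomY"] * height),
        })
    return results

# Assumed image size; only boxes scoring at least 0.9 are kept.
for detection in to_pixel_boxes(response_body, width=499, height=666, min_score=0.9):
    print(detection)
```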

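Instance segmentation

In instance segmentation, the output consists of multiple boxes with their scaled top-left and bottom-right coordinates, labels, confidence scores, and polygons (not masks). The polygon values are in the same format discussed in the schema section.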

{
   "filename":"/tmp/tmpi8604s0h",
   "boxes":[
      {
         "box":{
            "topX":0.679,
            "topY":0.491,
            "bottomX":0.926,
            "bottomY":0.810
         },
         "label":"can",
         "score":0.992,
         "polygon":[
            [
               0.82, 0.811, 0.771, 0.810, 0.758, 0.805, 0.741, 0.797, 0.735, 0.791, 0.718, 0.785, 0.715, 0.778, 0.706, 0.775, 0.696, 0.758, 0.695, 0.717, 0.698, 0.567, 0.705, 0.552, 0.706, 0.540, 0.725, 0.520, 0.735, 0.505, 0.745, 0.502, 0.755, 0.493
            ]
         ]
      },
      {
         "box":{
            "topX":0.220,
            "topY":0.298,
            "bottomX":0.397,
            "bottomY":0.601
         },
         "label":"milk_bottle",
         "score":0.989,
         "polygon":[
            [
               0.365, 0.602, 0.273, 0.602, 0.26, 0.595, 0.263, 0.588, 0.251, 0.546, 0.248, 0.501, 0.25, 0.485, 0.246, 0.478, 0.245, 0.463, 0.233, 0.442, 0.231, 0.43, 0.226, 0.423, 0.226, 0.408, 0.234, 0.385, 0.241, 0.371, 0.238, 0.345, 0.234, 0.335, 0.233, 0.325, 0.24, 0.305, 0.586, 0.38, 0.592, 0.375, 0.598, 0.365
            ]
         ]
      },
      {
         "box":{
            "topX":0.433,
            "topY":0.280,
            "bottomX":0.621,
            "bottomY":0.679
         },
         "label":"water_bottle",
         "score":0.988,
         "polygon":[
            [
               0.576, 0.680, 0.501, 0.680, 0.475, 0.675, 0.460, 0.625, 0.445, 0.630, 0.443, 0.572, 0.440, 0.560, 0.435, 0.515, 0.431, 0.501, 0.431, 0.433, 0.433, 0.426, 0.445, 0.417, 0.456, 0.407, 0.465, 0.381, 0.468, 0.327, 0.471, 0.318
            ]
         ]
      }
   ]
}
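Each polygon segment in the response is a flat list of alternating x and y ratios. For plotting, it is usually easier to work with pixel (x, y) pairs. The following sketch assumes x values scale with the image width and y values with the image height; `polygon_to_points` and the short sample segment are hypothetical.

```python
def polygon_to_points(segment, width, height):
    """Turn [x1, y1, x2, y2, ...] ratios into [(x_px, y_px), ...] pixel points."""
    xs = segment[0::2]  # values at even positions are x ratios
    ys = segment[1::2]  # values at odd positions are y ratios
    return [(round(x * width), round(y * height)) for x, y in zip(xs, ys)]

segment = [0.2, 0.2, 0.5, 0.2, 0.5, 0.75]  # hypothetical short polygon
print(polygon_to_points(segment, width=500, height=400))  # [(100, 80), (250, 80), (250, 300)]
```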


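Follow TechLead for knowledge sharing across all aspects of AI. The author has 10+ years of experience in internet service architecture, AI product development, and team management; holds a bachelor's degree from Tongji University and a master's degree from Fudan University; is a member of the Fudan Robotics Intelligence Lab; and is an Alibaba Cloud certified senior architect, a project management professional, and the R&D lead for AI products with revenue in the hundreds of millions.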


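This article is reposted from: https://blog.csdn.net/magicyangjay111/article/details/134286386
Copyright belongs to the original author, TechLead KrisChang. If there is any infringement, please contact us for removal.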
