1000字范文 > 遥感影像制作语义分割数据集——摸索（踩坑）记录

遥感影像制作语义分割数据集——摸索（踩坑）记录

时间：2023-12-19 05:11:46

遥感影像制作语义分割数据集——探索（踩坑）记录

由于本人是个小白，本篇文章主要为了记录自己的摸索过程，文中一些名词描述可能不是很准确，我尽量说清楚整个处理的流程。文章大多代码都是参考或者直接使用其他前辈的代码，文中会给出一些前辈的文章链接。

任务背景

给定某区域的影像，我们需要用它来制作遥感影像的语义分割数据集。由于目标区域的影像大概率不是规则形状，在影像变为栅格数据时，其中很可能包含一些无效区域（这里一定要注意！）。为完成数据集制作这个目标，主要有两个不同的方式（方向），第一种是用arcgis这类软件来标注，得到相应的标注结果；另一种方法是将原始tiff影像转变为普通图片格式，之后在利用例如labelme这种软件来标注。

用ArcGIS来标注

标注

这里给出当初的参考：

/weixin_50988792/article/details/120509506/p/163353715

切割影像

/p/139757130/zcq0525/article/details/119618043

转为普通图片格式

转为普通图片的重点在于，将辐射值从0到几千大小的辐射值标定到[0,255]范围内。在影像拉伸上，一些博主已经写了一些总结，没什么问题。但是需要十分注意的地方是，影像中存在nodata的处理。

先给出我最开始使用的影像拉伸代码：

def stretchImg(imgPath, resultPath, lower_percent=0.5, higher_percent=99.5):"""#将光谱DN值映射至0-255，并保存:param imgPath: 需要转换的tif影像路径（***.tif）:param resultPath: 转换后的文件存储路径(***.jpg):param lower_percent: 低值拉伸比率:param higher_percent: 高值拉伸比率:return: 无返回参数，直接输出图片"""RGB_Array = readTif(imgPath)band_Num = RGB_Array.shape[2]JPG_Array = np.zeros_like(RGB_Array, dtype=np.uint8)for i in range(band_Num):minValue = 0maxValue = 255# 获取数组RGB_Array某个百分比分位上的值low_value = np.percentile(RGB_Array[:, :, i], lower_percent)high_value = np.percentile(RGB_Array[:, :, i], higher_percent)temp_value = minValue + (RGB_Array[:, :, i] - low_value) * (maxValue - minValue) / (high_value - low_value)temp_value[temp_value < minValue] = minValuetemp_value[temp_value > maxValue] = maxValueJPG_Array[:, :, i] = temp_valueoutputImg = Image.fromarray(np.uint8(JPG_Array))outputImg.save(resultPath)————————————————原文链接：/zcq0525/article/details/119618043

下面是我踩坑的过程：

我最开始并不知道nodata这个属性，然后转换出来的结果为看起来是黑白的影像，后来我debug看了下，发现影像中有65535这个值，最开始我并不清楚这个值十分合理，只是觉得有点特别，且数值太大。后来我发现，影像中是有无效区域的，对于无效的区域到底赋值了什么呢？是不是这个65535?若是65535，这个值太大了，而且对于我处理的那张原图，无效区域占比很多，即使百分比截断，取值上限仍然是65535，这很可能就是导致影像转变之后总是黑白的原因。后来我把图中等于65535的地方的值改为了0，影像就不再是看起来黑白的了。

然而新的问题又出现了，我按照他人的代码，转换出来虽然能看到图中的各种类别的物体，但是总是感觉像是有一层薄雾覆盖了整张图。若是我没在arcgis打开，看到影像我可能觉得就是有雾，但是有点奇怪地覆盖了整张图。然而，我用arcgis打开过，在arcgis中并没有出现这种薄雾，它的百分率阶段地取值和我差不多，我把gamma校正关闭，影像仍然不带有“薄雾”。

所以是什么造成了我转换的时候出现“薄雾”这种现象呢？我试了一些方法，其中有两个比较有启示和指导作用。

其一、我把有“薄雾”的数据放到了arcgis中，发现它竟然显得的时候没有用其他图片查看器看到了那种“薄雾”，我人傻了，这*****。我想是arcgis所用的线性拉伸不一样吗？应该不大可能吧，我检查了线性拉伸的代码实现没啥问题呀，难道是arcgis做了其他的数据处理操作，但是我不知道做了什么？嗯，肯定arcgis这么干了，肯定是的，我这样干没毛病。

其二、我各种尝试，我开始各种调上面的stretchImg函数的lower_percent和higher_percent的值，然后发现竟然可以出现不带“薄雾”的影像，太**离谱了，发什么了，为什么可以这样？当时我是熬夜到了很久，脑子懵了，没想到，我就先放弃了。第二天我终于发现自己干得不太合理的地方，我把65535变成0就没问题吗？这样不是把原来过大值得影像放到了最小值上面？这可真是菜dog行为，自己当时还完全不自知，为自己得智商捉急…

终于，我尝试了在做百分比截断统计各个辐射值个数时，把65536排除在外，终于，终于，它正常了！下面是我后来用的一些代码，做了一些简单注释，代码存在部分冗余，功能畅通，美感稀碎，还请各位忽略，或者自己完善一下。

import gdalimport osimport cv2import numpy as np# 读取tif数据集def readTif(fileName, xoff = 0, yoff = 0, data_width = 0, data_height = 0):dataset = gdal.Open(fileName, gdal.GA_ReadOnly)if dataset == None:print(fileName + "文件无法打开")# 栅格矩阵的列数width = dataset.RasterXSize# 栅格矩阵的行数height = dataset.RasterYSize# 波段数bands = dataset.RasterCountassert bands >= 1, "the band of data is less than 1, please check the data!"first_band = dataset.GetRasterBand(1)nodata_value = first_band.GetNoDataValue() # 获取影像的nodata_value，用于后面影像拉伸时，提出nodata的干扰，踩坑密集区# 获取数据if(data_width == 0 and data_height == 0):data_width = widthdata_height = heightdata = dataset.ReadAsArray(xoff, yoff, data_width, data_height)# 获取仿射矩阵信息geotrans = dataset.GetGeoTransform()# 获取投影信息proj = dataset.GetProjection()first_band = Nonedataset = Nonereturn width, height, bands, data, geotrans, proj, nodata_value# 保存tif文件函数（我没验证这部分）def writeTiff(im_data, im_geotrans, im_proj, path):if 'int8' in im_data.dtype.name:datatype = gdal.GDT_Byteelif 'int16' in im_data.dtype.name:datatype = gdal.GDT_UInt16else:datatype = gdal.GDT_Float32if len(im_data.shape) == 3:im_bands, im_height, im_width = im_data.shapeelif len(im_data.shape) == 2:im_data = np.array([im_data])im_bands, im_height, im_width = im_data.shape#创建文件driver = gdal.GetDriverByName("GTiff")dataset = driver.Create(path, int(im_width), int(im_height), int(im_bands), datatype)if(dataset!= None):dataset.SetGeoTransform(im_geotrans) #写入仿射变换参数dataset.SetProjection(im_proj) #写入投影for i in range(im_bands):dataset.GetRasterBand(i+1).WriteArray(im_data[i])del dataset# 影像拉伸def stretchImg(rgb_array, lower_percent=0.5, higher_percent=99.5, nodata_value=None):"""# 将光谱DN值映射至0-255，并保存:rgb_array: 用gdal读取之后的RGB三通道影像的辐射值:param lower_percent: 低值拉伸比率:param higher_percent: 高值拉伸比率:nodata_value: 影像种nodata的值"""RGB_Array = rgb_array.copy()band_Num = RGB_Array.shape[2]IMG_Array = np.zeros_like(RGB_Array)for i in range(band_Num):minValue = 0maxValue = 255# 获取数组RGB_Array某个百分比分位上的值if nodata_value is not None:low_value = np.percentile(RGB_Array[:, :, i][RGB_Array[:, :, i]!=ndv], lower_percent)high_value = np.percentile(RGB_Array[:, :, i][RGB_Array[:, :, i]!=ndv], higher_percent)else:low_value = np.percentile(RGB_Array[:, :, i], lower_percent)high_value = np.percentile(RGB_Array[:, :, i], higher_percent)temp_value = minValue + (RGB_Array[:, :, i] - low_value) * (maxValue - minValue) / (high_value - low_value)temp_value[temp_value < minValue] = minValuetemp_value[temp_value > maxValue] = maxValueIMG_Array[:, :, i] = temp_valuereturn IMG_Arrayif __name__ == "__main__":rs_path = "***.tif" # 原始影像的路径im_path = "***.png" # 处理之后图片的储存路径# print(gdal.Info(rs_path))width, height, bands, data, geotrans, proj, ndv = readTif(rs_path)arr = data.transpose(1, 2, 0)t_arr = arr[:, :, 0:3]image = t_arr.copy()# print("max value：", t_arr.max())# print("min value：", t_arr.min())output_img = stretchImg(image, lower_percent=1, higher_percent=99.5, nodata_value=ndv).astype(np.uint8)cv2.imwrite(im_path, output_img)

新坑记录

在Uint16格式下，上述代码没问题，但是，在Uint32下面，上面的代码出现了卡顿在stretchImg()函数那里，尝试了各种办法，效果都不太行（我太菜）。当然，可以将Uint32转为Uint16然后用上面的代码来处理。但是，为了代码的通用性，我还是在试着解决问题，最后的解决方法我自己也很震惊，至今不知道为啥，但是它就是可以…(为了快速使用，先放出来作为参考）。其实改动也很小，就是把stretchImg函数改动了一下下，它就好了。是的，它就好了（摊手手~）。下面给出代码

import osimport gdalimport numpy as npimport cv2from PIL import Imageimport matplotlib.pyplot as pltimport sysimport time# 设置IO读取超大图片，防止报错os.environ["OPENCV_IO_MAX_IMAGE_PIXELS"] = pow(2,40).__str__()Image.MAX_IMAGE_PIXELS = 100000000000os.environ['PROJ_LIB'] = "/home/dmt/anaconda3/envs/mmlab/share/proj"# 读取tif数据集def readTif(fileName, xoff = 0, yoff = 0, data_width = 0, data_height = 0):dataset = gdal.Open(fileName, gdal.GA_ReadOnly)if dataset == None:print(fileName + "Cannot open!")# 栅格矩阵的列数width = dataset.RasterXSize# 栅格矩阵的行数height = dataset.RasterYSize# 波段数bands = dataset.RasterCountassert bands >= 1, "the band of data is less than 1, please check the data!"first_band = dataset.GetRasterBand(1)nodata_value = first_band.GetNoDataValue() # 获取影像的nodata_value，用于后面影像拉伸时，提出nodata的干扰，踩坑密集区# 获取数据if(data_width == 0 and data_height == 0):data_width = widthdata_height = heightdata = dataset.ReadAsArray(xoff, yoff, data_width, data_height)# 获取仿射矩阵信息geotrans = dataset.GetGeoTransform()# 获取投影信息proj = dataset.GetProjection()first_band = Nonedataset = Nonereturn width, height, bands, data, geotrans, proj, nodata_value# 保存tif文件函数def writeTiff(im_data, im_geotrans, im_proj, path):if 'int8' in im_data.dtype.name:datatype = gdal.GDT_Byteelif 'int16' in im_data.dtype.name:datatype = gdal.GDT_UInt16else:datatype = gdal.GDT_Float32if len(im_data.shape) == 3:im_bands, im_height, im_width = im_data.shapeelif len(im_data.shape) == 2:im_data = np.array([im_data])im_bands, im_height, im_width = im_data.shape#创建文件driver = gdal.GetDriverByName("GTiff")dataset = driver.Create(path, int(im_width), int(im_height), int(im_bands), datatype)if(dataset!= None):dataset.SetGeoTransform(im_geotrans) #写入仿射变换参数dataset.SetProjection(im_proj) #写入投影for i in range(im_bands):dataset.GetRasterBand(i+1).WriteArray(im_data[i])del dataset# 保存tif文件函数def writeTiffV2(im_data, im_geotrans, im_proj, path, data_type=None):if data_type is not None:if 'int8' in data_type:datatype = gdal.GDT_Byteelif 'int16' in data_type:datatype = gdal.GDT_UInt16elif 'int32' in data_type:datatype = gdal.GDT_UInt32else:datatype = gdal.GDT_Float32else:if 'int8' in im_data.dtype.name:datatype = gdal.GDT_Byteelif 'int16' in im_data.dtype.name:datatype = gdal.GDT_UInt16elif 'int32' in im_data.dtype.name:datatype = gdal.GDT_UInt32else:datatype = gdal.GDT_Float32if len(im_data.shape) == 3:im_bands, im_height, im_width = im_data.shapeelif len(im_data.shape) == 2:im_data = np.array([im_data])im_bands, im_height, im_width = im_data.shape#创建文件driver = gdal.GetDriverByName("GTiff")dataset = driver.Create(path, int(im_width), int(im_height), int(im_bands), datatype)if(dataset!= None):dataset.SetGeoTransform(im_geotrans) #写入仿射变换参数dataset.SetProjection(im_proj) #写入投影for i in range(im_bands):dataset.GetRasterBand(i+1).WriteArray(im_data[i])del dataset# 影像拉伸，处理Uint32数据会卡在这部分，不要用def stretchImg_ori(rgb_array, lower_percent=0.5, higher_percent=99.5, nodata_value=None): """#将光谱DN值映射至0-255，并保存:rgb_array: 用gdal读取之后的RGB三通道影像的辐射值:param lower_percent: 低值拉伸比率:param higher_percent: 高值拉伸比率:nodata_value: 影像种nodata的值"""# RGB_Array = rgb_array.copy()RGB_Array = rgb_arrayband_Num = RGB_Array.shape[2]IMG_Array = np.zeros_like(RGB_Array)for i in range(band_Num):minValue = 0maxValue = 255# 获取数组RGB_Array某个百分比分位上的值if nodata_value is not None:low_value = np.percentile(RGB_Array[:, :, i][RGB_Array[:, :, i]!=nodata_value], lower_percent)high_value = np.percentile(RGB_Array[:, :, i][RGB_Array[:, :, i]!=nodata_value], higher_percent)else:low_value = np.percentile(RGB_Array[:, :, i], lower_percent)high_value = np.percentile(RGB_Array[:, :, i], higher_percent)temp_value = minValue + (RGB_Array[:, :, i] - low_value) * (maxValue - minValue) / (high_value - low_value)temp_value[temp_value < minValue] = minValuetemp_value[temp_value > maxValue] = maxValueIMG_Array[:, :, i] = temp_valueprint("Band {} has beed stretched!".format(i))return IMG_Arraydef stretchImg(rgb_array, lower_percent=0.5, higher_percent=99.5, nodata_value=None):"""#将光谱DN值映射至0-255，并保存:rgb_array: 用gdal读取之后的RGB三通道影像的辐射值:param lower_percent: 低值拉伸比率:param higher_percent: 高值拉伸比率:nodata_value: 影像种nodata的值"""# RGB_Array = rgb_array.copy()RGB_Array = rgb_arrayband_Num = RGB_Array.shape[2]# IMG_Array = np.zeros_like(RGB_Array)for i in range(band_Num):minValue = 0maxValue = 255# 获取数组RGB_Array某个百分比分位上的值if nodata_value is not None:low_value = np.percentile(RGB_Array[:, :, i][RGB_Array[:, :, i]!=nodata_value], lower_percent)high_value = np.percentile(RGB_Array[:, :, i][RGB_Array[:, :, i]!=nodata_value], higher_percent)else:low_value = np.percentile(RGB_Array[:, :, i], lower_percent)high_value = np.percentile(RGB_Array[:, :, i], higher_percent)temp_value = minValue + (RGB_Array[:, :, i] - low_value) * (maxValue - minValue) / (high_value - low_value)temp_value[temp_value < minValue] = minValuetemp_value[temp_value > maxValue] = maxValueRGB_Array[:, :, i] = temp_valueprint(time.strftime("%y%m%d-%H%M%S", time.localtime()))print("Band {} has beed stretched!".format(i))return RGB_Arrayif __name__ == "__main__":rs_path = "/***.tif" # 待处理的tif影像的路径im_path = "***.png" # 要保存的图片路径l_p = 2 # 用来线性拉伸的下界（百分位）h_p = 98 # 用来线性拉伸的上界print(gdal.Info(rs_path)) # 个人理解就是打印地理投影信息now_t = time.strftime("%y%m%d-%H%M%S", time.localtime()) # 显示时间便于自己查看处理各步骤的耗时print(now_t)start_t = time.time()print("Starting Function readTif...")# 读取影像width, height, bands, data, geotrans, proj, ndv = readTif(rs_path)print("Finish Function readTif")readtif_t = time.time() - start_tprint("readTif need {}s".format(readtif_t))arr = data.transpose(1, 2, 0)image = arr[:, :, 0:3]now_t = time.strftime("%y%m%d-%H%M%S", time.localtime())print(now_t)print("executing Function stretchImg...")output_img = stretchImg(image, lower_percent=l_p, higher_percent=h_p, nodata_value=ndv).astype(np.uint8)print("Complete Function stretchImg...")stretchimg_t = time.time() - start_tprint("stretchImg need {}s".format(stretchimg_t))now_t = time.strftime("%y%m%d-%H%M%S", time.localtime())print(now_t)cv2.imwrite(im_path, output_img)saveimg_t = time.time() - start_tprint("saveimg need {}s".format(saveimg_t))now_t = time.strftime("%y%m%d-%H%M%S", time.localtime())print(now_t)

本内容不代表本网观点和政治立场，如有侵犯你的权益请联系我们处理。

网友评论

网友评论仅供其表达个人看法，并不表明网站立场。