
Processing a Dataset and Drawing a World Map with Python

Date: 2020-06-28 02:52:23


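Datasets are usually stored in JSON or CSV format.

The data used in this article to draw the world population map has been uploaded, or can be downloaded from /.

It looks roughly like this:

How do we process the data? First read the file, then store its contents in lists, and then process those lists with a visualization method.

For example: plot how a country's population changes over time.

Only opening the CSV file is new here; plotting was covered in the previous blog post.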

# -*- coding: utf-8 -*-
"""Plot a country's population over time from yearly population data."""
import csv  # CSV handling; import json instead if the data is JSON
from matplotlib import pyplot as plt  # plotting

country_code = input("Enter the country code to plot the population for: ")
population_nums = []  # population values
years = []  # the year for each population value
filename = "population_csv.csv"

with open(filename) as f:
    reader = csv.reader(f)
    header_row = next(reader)  # the first row holds the column names
    # Uncomment to show each column's position in a row:
    # for index, column_header in enumerate(header_row):
    #     print(index, column_header)
    for row in reader:
        if row[1] == country_code:
            population_nums.append(int(row[3]))
            years.append(int(row[2]))

# The end year is missing from the original listing, so it is taken from the data.
end_year = max(years)

fig = plt.figure(dpi=128, figsize=(10, 8))  # figure size
plt.plot(years, population_nums, c="blue")  # x: years, y: population
plt.title(country_code + " population_num in 1960-" + str(end_year), fontsize=24)
plt.xlabel("year", fontsize=16)
plt.xticks([x for x in range(1960, end_year + 1) if x % 3 == 0])  # ticks at years divisible by 3
plt.yticks(list(range(0, 1492730000, 50000000)))
fig.autofmt_xdate()  # slant the x-axis labels
plt.ylabel("population(10^9)", fontsize=16)
plt.tick_params(axis="both", which="major", labelsize=16)
plt.axis([1960, end_year, 0, 1492730000])  # x and y ranges
plt.show()

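The code first reads the data. Because every row has the same columns, it looks values up by column position, stores the rows that meet the condition in lists, and then plots them with plt.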
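Looking values up by position works as long as the column order never changes. As a hedged alternative, `csv.DictReader` lets the same filter be written against column names. The sketch below assumes the header names `Country Name`, `Country Code`, `Year`, and `Value`, and uses a few made-up rows in place of the real file; both are assumptions for illustration, not taken from the article.

```python
import csv
import io

# Made-up sample rows standing in for population_csv.csv; the header names
# are an assumption about the file, not taken from the article.
SAMPLE = """Country Name,Country Code,Year,Value
Testland,TST,1960,100
Testland,TST,1961,110
Otherland,OTH,1960,50
"""


def population_series(csv_file, country_code):
    """Return (years, populations) for one country code."""
    years, populations = [], []
    for row in csv.DictReader(csv_file):
        if row["Country Code"] == country_code:
            years.append(int(row["Year"]))
            # int(float(...)) also accepts values written like "100.0"
            populations.append(int(float(row["Value"])))
    return years, populations


years, populations = population_series(io.StringIO(SAMPLE), "TST")
print(years, populations)
```

The two returned lists can be passed to `plt.plot` exactly as in the script above.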

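Population change for China:

USA:

With so many countries, can the populations be shown on a world map? Here is the result first:

To draw on a map, we need a map template.

Let's start by drawing a simple map: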

# -*- coding: utf-8 -*-
import pygal_maps_world.maps

wm = pygal_maps_world.maps.World()  # empty map template
wm.title = 'North, Central, and South America'
# Each series takes a label and a list of two-letter country codes
wm.add('North America', ['ca', 'mx', 'us'])
wm.add('Central America', ['bz', 'cr', 'gt', 'hn', 'ni', 'pa', 'sv'])
wm.add('South America', ['ar', 'bo', 'br', 'cl', 'co', 'ec', 'gf',
                         'gy', 'pe', 'py', 'sr', 'uy', 've'])
wm.render_to_file('americas.svg')

Map of the Americas:

You can also pass in values, for example to show population:

# -*- coding: utf-8 -*-
import pygal_maps_world.maps

wm = pygal_maps_world.maps.World()
wm.title = "Populations of Countries in North America"
wm.add('North America', {'ca': 392911, 'mx': 232321, 'us': 231313})
wm.render_to_file('na_americas.svg')

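With an internet connection, open the file in a browser; hovering the mouse over a country shows its population.

With this basic understanding of plotting, let's draw a world population map from our dataset.

pygal uses two-letter country codes, while our data uses three-letter codes, so we need to convert the country names in the data into two-letter codes before we can plot.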

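# -*- coding: utf-8 -*-
from pygal_maps_world.i18n import COUNTRIES
# from pygal.i18n import COUNTRIES  # a pygal update removed the i18n module
'''
pygal uses two-letter country codes, while the data uses three-letter codes.
The COUNTRIES dictionary maps two-letter country codes (keys) to country
names (values) and is used to convert between them.
'''
for country_code in sorted(COUNTRIES.keys()):
    print(country_code, COUNTRIES[country_code])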

The result:

ad Andorra
ae United Arab Emirates
af Afghanistan
al Albania
am Armenia
ao Angola
aq Antarctica
ar Argentina
at Austria
au Australia
az Azerbaijan
ba Bosnia and Herzegovina
bd Bangladesh
be Belgium
bf Burkina Faso
bg Bulgaria
bh Bahrain
bi Burundi
bj Benin
bn Brunei Darussalam
bo Bolivia, Plurinational State of
br Brazil
bt Bhutan
bw Botswana
by Belarus
bz Belize
ca Canada
cd Congo, the Democratic Republic of the
cf Central African Republic
cg Congo
ch Switzerland
ci Cote d'Ivoire
(Too much data; not shown in full.)

The output lists each two-letter country code with its country name.

So we can loop over the countries, look up the code that matches each name, and use that code for plotting.

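# -*- coding: utf-8 -*-
# Saved as country_codes.py; the later scripts import it under that name.
from pygal_maps_world.i18n import COUNTRIES


def get_country_code(country_name):
    """Return the two-letter code for a country name, or None if not found."""
    for code, name in COUNTRIES.items():
        if name == country_name:
            return code
    return None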
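To see how the lookup behaves without installing pygal, here is a minimal sketch that uses a three-entry stand-in dictionary copied from the listing above. Building a reversed dictionary once is a swapped-in variation on the loop, not the article's method; it gives the same result for exact names. The name `NAME_TO_CODE` is hypothetical.

```python
# Stand-in for pygal_maps_world.i18n.COUNTRIES, limited to three entries
# that appear in the listing above.
COUNTRIES = {
    "ad": "Andorra",
    "ae": "United Arab Emirates",
    "bo": "Bolivia, Plurinational State of",
}

# Invert the mapping once so each lookup is a single dictionary access.
NAME_TO_CODE = {name: code for code, name in COUNTRIES.items()}


def get_country_code(country_name):
    """Return the two-letter code for an exact country name, else None."""
    return NAME_TO_CODE.get(country_name)


print(get_country_code("Andorra"))
print(get_country_code("Bolivia"))
```

The name has to match exactly: a shorter spelling such as "Bolivia" is not a key, so the function returns None for it.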

Now we just call the function to draw the map.

# -*- coding: utf-8 -*-
"""Draw each country's population on a world map."""
import csv
from country_codes import get_country_code  # country-code lookup
import pygal_maps_world.maps
from pygal.style import LightColorizedStyle, RotateStyle  # color and lightness

population_codes = {}  # two-letter code -> population; one series, so one color
country_codes = []  # the codes that were found
# For more varied colors, split the countries into three groups by population:
# population_codes_1 = {}  # under 100 million
# population_codes_2 = {}  # 100 million to 1 billion
# population_codes_3 = {}  # over 1 billion

year = input("Enter the year to draw the world map for: ")
filename = "population_csv.csv"

with open(filename) as f:
    reader = csv.reader(f)
    header_row = next(reader)  # the first row holds the column names
    for row in reader:
        if row[2] == year:
            num = int(row[3])  # population of this country
            code = get_country_code(row[0])  # two-letter code looked up by country name
            if code:
                # if num < 100000000:
                #     population_codes_1[code] = num
                # elif num < 1000000000:
                #     population_codes_2[code] = num
                # else:
                #     population_codes_3[code] = num
                population_codes[code] = num
                country_codes.append(code)
            # else:
            #     print("Error --", row[0])

# Base color as an RGB hex code: two digits each for red, green, and blue.
# The original assigned RotateStyle and then overwrote it with
# LightColorizedStyle; here the two are combined.
wm_style = RotateStyle('#336699', base_style=LightColorizedStyle)
wm = pygal_maps_world.maps.World(style=wm_style)
wm.title = year + " of populations in world"
wm.add(year, population_codes)
# "y" in the labels below stands for yi (100 million)
# wm.add("0-1y", population_codes_1)
# wm.add("1-10y", population_codes_2)
# wm.add(">10y", population_codes_3)
wm.render_to_file('pupulation_world.svg')  # save to this file

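Of the two figures below, the first uses a single color. You can also group the countries by population to draw a better-looking map. To do so, take these lines in the code

# population_codes_1 = {}  # under 100 million
# population_codes_2 = {}  # 100 million to 1 billion
# population_codes_3 = {}  # over 1 billion

and uncomment them, together with the other commented-out lines that use these dictionaries, adjusting as needed. That gives the second figure.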
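The three-way split can also be written as a small helper, which keeps the reading loop short. This is a sketch using the same thresholds as the commented-out branches (100 million and 1 billion); the function name `group_by_population` and the sample numbers are made up for illustration.

```python
def group_by_population(population_codes):
    """Split {code: population} into three dicts by population size."""
    under_100m, from_100m_to_1bn, over_1bn = {}, {}, {}
    for code, num in population_codes.items():
        if num < 100_000_000:
            under_100m[code] = num
        elif num < 1_000_000_000:
            from_100m_to_1bn[code] = num
        else:
            over_1bn[code] = num
    return under_100m, from_100m_to_1bn, over_1bn


# Made-up numbers, for illustration only.
sample = {"aa": 5_000_000, "bb": 250_000_000, "cc": 1_200_000_000}
small, medium, large = group_by_population(sample)
print(small, medium, large)
```

Each returned dictionary would then be passed to its own `wm.add` call with its own label, so that pygal draws the three groups as separate series.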

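Sharp-eyed readers will have noticed that some areas of the map are still blank. The reason is that some country names could not be matched and converted to two-letter country codes.

Let's check which ones. Modify the code above so that every name that cannot be found is stored in a list: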

# Before the loop:
err_country = []

# In place of the original else branch:
else:
    err_country.append(row[0])

# After the loop:
print(err_country)

Result:

?? Why are there so many?

Discard everything that comes before the 'World' row. Then match on strings, after stripping the trailing part of each name and any spaces.

#! /usr/bin/python
# -*- coding: utf8 -*-
# Saved as countries.py; the plotting script imports get_pop from it.
import csv
from country_codes import get_country_code
from pygal.maps.world import COUNTRIES


def get_first_word(name):
    """Return the part of name before the first comma, space, or period."""
    new_name = ''
    for letter in name:
        if letter in (',', ' ', '.'):
            break
        new_name += letter
    return new_name


def get_pop(filename):
    """Return {two-letter code: population}, including partially matched names."""
    cc_population = {}
    err_country = []  # names with no exact match
    err_pop = []  # their populations, in the same order
    key = True  # stays True until the 'World' row has been read
    with open(filename) as f:
        pop_data = csv.reader(f)
        for pop_dict in pop_data:
            if pop_dict[2] == '':  # the year compared against is missing in the source
                country_name = pop_dict[0]
                population = int(float(pop_dict[3]))
                code = get_country_code(country_name)
                if code:
                    cc_population[code] = population
                if code is None and not key:
                    err_country.append(country_name)
                    err_pop.append(population)
                if country_name == 'World':
                    key = False
    print(err_country)

    # Reduce each unmatched name to its first word; for names that start
    # with "St. ", use the first word after that prefix.
    new_err = []
    for country_name in err_country:
        new_name = get_first_word(country_name)
        if new_name == 'St':
            new_name = get_first_word(country_name[4:])
        new_err.append(new_name)

    # Give the population to every pygal country whose name contains that word.
    new_country = {}
    for short_name, pop in zip(new_err, err_pop):
        for code, country in COUNTRIES.items():
            if short_name in country:
                new_country[code] = pop
    cc_population.update(new_country)
    return cc_population

Then draw the map again:

#! /usr/bin/python
# -*- coding: utf8 -*-
import pygal
from pygal.style import RotateStyle
from pygal.style import LightColorizedStyle
from countries import get_pop

filename = 'population_csv.csv'

# Build a dictionary of populations keyed by two-letter country code
cc_population = get_pop(filename)

# Group the countries into three population levels
cc_pop1, cc_pop2, cc_pop3 = {}, {}, {}
for cc, pop in cc_population.items():
    if pop < 10000000:
        cc_pop1[cc] = pop
    elif pop < 1000000000:
        cc_pop2[cc] = pop
    else:
        cc_pop3[cc] = pop

wm_style = RotateStyle('#226699', base_style=LightColorizedStyle)
wm = pygal.maps.world.World(style=wm_style)
wm.title = 'World Population in , by Country'
wm.add('0-10m', cc_pop1)
wm.add('10m-1bn', cc_pop2)
wm.add('>1bn', cc_pop3)
wm.render_to_file('world_population_v10.svg')

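The result is below; it is an improvement over the earlier map.

The approach for finding the unrecognized countries follows this reference: "Finding the countries that could not be recognized" (找出未能识别的国家).

Once you have learned this kind of plotting, you can build many things with it.

The same method, with only simple changes, works on other datasets, such as global GDP, which can be downloaded from /data/core/gdp/.

The plot looks like this:

You can also download other datasets to work with; datasets are available from the World Bank at /indicator/.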
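Since only the input file and the labels change between datasets, the row-filtering step can be factored into a function. The sketch below assumes the GDP file has the same four-column layout as the population file (name, code, year, value), which the article does not confirm; the function name `values_for_year` and the sample rows are made up.

```python
import csv
import io


def values_for_year(csv_file, year):
    """Return {country name: value} for one year.

    Assumes the columns come in the order: name, code, year, value.
    """
    reader = csv.reader(csv_file)
    next(reader)  # skip the header row
    result = {}
    for row in reader:
        if row[2] == str(year):
            result[row[0]] = float(row[3])
    return result


# Made-up rows standing in for a downloaded GDP file.
SAMPLE_GDP = """Country Name,Country Code,Year,Value
Testland,TST,2000,1500.5
Testland,TST,2001,1600.0
Otherland,OTH,2000,900.25
"""

gdp_2000 = values_for_year(io.StringIO(SAMPLE_GDP), 2000)
print(gdp_2000)
```

The names in the returned dictionary would still need to go through `get_country_code` before being handed to pygal.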
