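This post was generated from the Markdown source of the original post, as backed up from the old VeriMake forum
Original title: Python Data Visualization Case Study: The Night Sky (HYG Database)
Original URL: https://verimake.com/topics/73 (old forum address, no longer valid)
Original author: YX (old forum id = 22, registered 2020-04-06 23:28:45)
First published by the author on 2020-04-18 16:01:09, last edited on 2020-04-18 16:01:09 (the edit time may be inaccurate)
As of the database backup on 2021-12-18 14:27:30, the original post had 982 views, 0 likes, and 0 replies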
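A data visualization author on GitHub has published a set of examples that are well worth studying. The link is:
https://github.com/aaronpenne/data_visualization
This post briefly walks through one of them, The Night Sky (HYG Database). The resulting image is shown at the end of the post.
Data source and parsing: hygdata_v3.csv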
# -*- coding: utf-8 -*-
"""
Exploring star mapping
Author: Aaron Penne
Created: 03/15/2018
Developed with Python 3.6 on Windows 10
"""
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import imageio
import os
#np.random.seed(1138)
min_mag = -20
max_mag = 20
vis_mag = 8
steps = 40
twinkles = 10
# Set output directory, make it if needed
output_dir = os.path.relpath('output') # Windows machine
if not os.path.isdir(output_dir):
    os.mkdir(output_dir)
# Get input file
df = pd.read_csv('Desktop//hygdata_v3.csv')
# Inspect the column names (the interactive output is kept below as comments)
df.columns
# Index(['id', 'hip', 'hd', 'hr', 'gl', 'bf', 'proper', 'ra', 'dec', 'dist',
#        'pmra', 'pmdec', 'rv', 'mag', 'absmag', 'spect', 'ci', 'x', 'y', 'z',
#        'vx', 'vy', 'vz', 'rarad', 'decrad', 'pmrarad', 'pmdecrad', 'bayer',
#        'flam', 'con', 'comp', 'comp_primary', 'base', 'lum', 'var', 'var_min',
#        'var_max'],
#       dtype='object')
# Filter to certain magnitudes, ignore the sun
df = df[df.mag > min_mag]
df = df[df.mag < max_mag]
# Filter the size/alpha of each marker by magnitude bin (log scale)
mag = {}
# For the enumerate function, see Appendix [1]
# For np.geomspace, see Appendix [2]
for i, value in enumerate(np.geomspace(abs(min_mag), abs(min_mag)+max_mag, num=steps)):
    # Cumulative mask: stars at least this faint, but still brighter than vis_mag
    mag[i] = (df['mag'] >= (value + min_mag)) & (df['mag'] < vis_mag)
marker = {}
for i, value in enumerate(np.geomspace(4, 0.1, num=steps)):
    marker[i] = value
alpha = {}
for i, value in enumerate(np.geomspace(1, 0.4, num=steps)):
    alpha[i] = value
# Workaround to get each magnitude bin only plotted once
mag_xor = {}
for i in range(len(mag)):
    if i == len(mag)-1:
        mag_xor[i] = mag[i]
        break
    # & is bitwise AND, ^ is bitwise XOR, see Appendix [3]
    mag_xor[i] = mag[i] & (mag[i] ^ mag[i+1])
mag = mag_xor
# Plot with varying alphas to get twinkle effect
for i in range(twinkles):
    # Set up plot
    fig, ax = plt.subplots(figsize=(10, 5), dpi=150)
    # Black out the entire background
    fig.set_facecolor('black')
    ax.set_facecolor('black')
    # Plot each star, differing parameters depending on magnitude
    # 'ra' and 'dec' are the right ascension and declination of each star, see Appendix [4]
    for j in range(steps):
        x = df.loc[mag[j], 'ra']
        y = df.loc[mag[j], 'dec']
        plt.plot(x, y,
                 color='white',
                 linestyle='none',
                 linewidth=0,
                 marker='.',
                 markersize=marker[j],
                 alpha=alpha[j],
                 markeredgewidth=0)
    # Twinkle hack because other methods failed. Just plot random black patches.
    # For np.random.uniform, see Appendix [5]
    x = np.random.uniform(0, 24, (1,300))
    y = np.random.uniform(-90, 90, (1,300))
    print(x[0][0], y[0][0])
    # Cover some of the white stars with black dots to create the twinkle effect
    plt.plot(x, y,
             color='black',
             linestyle='none',
             linewidth=0,
             marker='.',
             markersize=3,
             alpha=0.5,
             markeredgewidth=0)
    # Despine plot
    for side in ['right', 'left', 'top', 'bottom']:
        ax.spines[side].set_visible(False)
    # Set axis ticks/labels
    plt.xticks(np.linspace(0, 24, 5),
               family='monospace',
               size=5,
               color='white',
               alpha=0.2)
    plt.yticks(np.linspace(-90, 90, 5),
               family='monospace',
               size=5,
               color='white',
               alpha=0.2)
    plt.text(0, -105,
             'Right Ascension (hours)',
             family='monospace',
             size=5,
             color='white',
             alpha=0.15,
             horizontalalignment='left')
    plt.text(-1.2, -51,
             'Declination (degrees)',
             family='monospace',
             size=5,
             color='white',
             alpha=0.15,
             horizontalalignment='left',
             rotation='vertical')
    # Set max/min axis limits
    plt.ylim([-90, 90])
    plt.xlim([0, 24])
    # Add text
    plt.text(0, 115, 'The Night Sky',
             family='monospace',
             size=12,
             horizontalalignment='left',
             weight='bold',
             color='white',
             alpha=0.8)
    # The {} placeholder is filled with vis_mag by format()
    plt.text(0, 111, 'Equirectangular projection of stars (mag<{})'.format(vis_mag),
             family='monospace',
             size=7,
             horizontalalignment='left',
             verticalalignment='top',
             weight='bold',
             color='white',
             alpha=0.5)
    plt.text(24, -120, '© 2018 Aaron Penne\nData: HYG Stellar Database\n\nApparent magnitude scale is logarithmic\nBrighter stars have a smaller apparent magnitude',
             family='monospace',
             size=7,
             horizontalalignment='right',
             verticalalignment='top',
             weight='bold',
             color='white',
             alpha=0.5)
    # Save plot
    fig.savefig(os.path.join(output_dir, 'hyg_scatter_{:02}.png'.format(i)),
                dpi=fig.dpi,
                facecolor=fig.get_facecolor(),
                edgecolor='none',
                bbox_inches='tight',
                pad_inches=0.5)
    plt.close(fig)
## Append images to create GIF
# Read in all png files in folder - https://stackoverflow.com/a/27593246
# os.listdir returns names in arbitrary order, so sort to keep the frames in sequence
png_files = sorted(f for f in os.listdir(output_dir) if f.endswith('.png'))
charts = []
# Append all the charts
for f in png_files:
    charts.append(imageio.imread(os.path.join(output_dir, f)))
# Save gif
imageio.mimsave(os.path.join(output_dir, 'hyg_scatter_twinkle.gif'), charts, format='GIF', duration=0.01)
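
The "plot each magnitude bin only once" workaround in the script above can be hard to picture, so here is a minimal sketch on toy data. The magnitudes in `toy` and the bin edges in `thresholds` are hypothetical values chosen for illustration, not taken from the HYG catalogue. The masks are built the same way as in the script: each mask is cumulative, and `masks[i] & (masks[i] ^ masks[i + 1])` keeps only the rows that the next mask does not already cover.

```python
import numpy as np
import pandas as pd

# Toy apparent magnitudes (hypothetical, not from the HYG catalogue)
toy = pd.DataFrame({'mag': [0.5, 1.5, 2.5, 3.5, 7.9, 9.0]})
vis_mag = 8
thresholds = [0, 1, 2, 3]  # hypothetical lower bin edges, in increasing order

# Cumulative masks, as in the script: "at least this faint, but still visible"
masks = [(toy['mag'] >= t) & (toy['mag'] < vis_mag) for t in thresholds]

# Keep in bin i only the rows that bin i+1 does not already cover
bins = []
for i in range(len(masks)):
    if i == len(masks) - 1:
        bins.append(masks[i])  # the last bin keeps everything that is left
    else:
        bins.append(masks[i] & (masks[i] ^ masks[i + 1]))

# How many bins each star ended up in
counts = sum(b.astype(int) for b in bins)
print(counts.tolist())
```

Each star brighter than `vis_mag` lands in exactly one bin, and the star at magnitude 9.0 lands in none, so every visible star is drawn once with a single marker size and alpha.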
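geomspace(start, stop, num=50, endpoint=True, dtype=None)  # geometric progression
Returns an array of numbers spaced evenly on a log scale (a geometric progression).
start: first value of the sequence. Required.
stop: final value of the sequence (whether it is actually included depends on endpoint). Required.
num: number of samples to generate, spaced geometrically. Default is 50. Optional.
endpoint: if True (the default), stop is the last sample; otherwise stop is not included. Optional.
How it works (with n = num and endpoint=True):
d = (stop/start)**(1/(n-1))
start, start*d, start*d*d, start*d*d*d, …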
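
The common-ratio formula above can be checked numerically. The sketch below reuses the arguments of the marker-size loop from the script (`4`, `0.1`, `num=40`) and assumes the default `endpoint=True`. It compares the output of `np.geomspace` with a sequence built by hand from `d`.

```python
import numpy as np

start, stop, num = 4.0, 0.1, 40  # same arguments as the marker-size loop

values = np.geomspace(start, stop, num=num)

# Common ratio predicted by d = (stop / start) ** (1 / (num - 1))
d = (stop / start) ** (1.0 / (num - 1))

# Build the same sequence by hand: start, start*d, start*d*d, ...
manual = start * d ** np.arange(num)

# Ratio between consecutive samples should be constant and equal to d
ratios = values[1:] / values[:-1]

print(values[0], values[-1])
print(np.allclose(ratios, d))
print(np.allclose(values, manual))
```

Because `stop` is smaller than `start` here, `d` is less than 1 and the sequence decreases, which is why fainter bins get smaller markers.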
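Appendix
[1] The Python enumerate() function
[2] Arithmetic sequences with arange and linspace, log-spaced sequences with logspace, and geometric sequences with geomspace: principles and construction
[3] How Python's "&", "|", and "^" bitwise logical operators actually work
[4] RIGHT ASCENSION & DECLINATION: CELESTIAL COORDINATES FOR BEGINNERS
[5] Summary of how to use the numpy.random module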