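Self-describing: BUFR data carries its own metadata, so the recipient can determine what the data contains and what its attributes are from the metadata embedded in the data itself.
Table-driven: Most of the information needed to encode and decode BUFR data is defined in standard tables. As long as the encoder and the decoder maintain the same set of tables, all kinds of data can be encoded and decoded in a uniform way.
Extensible: By maintaining a unified set of code tables and defining different templates on top of them, BUFR can be extended to represent many data types.
Compressible: BUFR is binary and represents data at the bit level, which gives it a clear size advantage over ASCII data. In addition, BUFR can represent multiple data subsets in compressed form, which reduces the data volume further and is especially useful for high-volume data such as satellite and wind profiler observations.
Platform-independent: BUFR is a binary encoding whose layout is fixed by the format itself, so it does not depend on the platform.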
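The table-driven and bit-level properties can be illustrated with a minimal sketch. It assumes a made-up two-entry table (`TABLE_B`, with illustrative scale, reference, and bit-width values that are not copied from the WMO tables) and uses the usual BUFR convention that the stored integer is `value * 10**scale - reference`. The function names `encode` and `decode` are hypothetical and not part of any library.

```python
# Hypothetical mini "Table B": descriptor -> (scale, reference, bit width).
# The entries are illustrative and are not copied from the WMO tables.
TABLE_B = {
    "TEMP": (1, 0, 12),      # tenths of a kelvin, 12 bits
    "LAT": (2, -9000, 15),   # hundredths of a degree, 15 bits
}


def encode(values, table):
    """Pack (descriptor, value) pairs into one bit string."""
    bits = ""
    for name, value in values:
        scale, reference, width = table[name]
        raw = int(round(value * 10 ** scale)) - reference
        if not 0 <= raw < 2 ** width - 1:  # all ones is reserved for "missing"
            raise ValueError("value out of range for " + name)
        bits += format(raw, "0{}b".format(width))
    return bits


def decode(bits, names, table):
    """Unpack values in the order given by the shared descriptor list."""
    out, pos = [], 0
    for name in names:
        scale, reference, width = table[name]
        raw = int(bits[pos:pos + width], 2)
        pos += width
        out.append((raw + reference) / 10.0 ** scale)
    return out


bits = encode([("TEMP", 285.3), ("LAT", 39.93)], TABLE_B)
values = decode(bits, ["TEMP", "LAT"], TABLE_B)
```

Both sides only need the same table and the same descriptor order. The two values occupy 12 + 15 = 27 bits here, whereas the text `285.3 39.93` takes 11 ASCII characters.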
fn = 'D:/Temp/bufr/prepbufr.gdas.20230325.t00z.nr'
f = addfile(fn, keepopen=True)
The information for the data file object f is as follows:
>>> f
File Name: D:/Temp/bufr/prepbufr.gdas.20230325.t00z.nr
File type: WMO Binary Universal Form (BUFR)
Dimensions: 0
Global Attributes:
: :history = "Read using CDM BufrIosp2"
: :location = "D:/Temp/bufr/prepbufr.gdas.20230325.t00z.nr"
: :BUFR:categoryName = "ADPUPA UPPER-AIR (RAOB, PIBAL, RECCO, DROPS) REPORTS"
: :BUFR:centerName = "7.3 (US National Weather Service - National Centres for Environmental Prediction (NCEP) / NCEP Central Operations)"
: :BUFR:centerId = 7
: :BUFR:subCenter = 3
: :BUFR:table = 0
: :BUFR:tableVersion = 29
: :BUFR:localTableVersion = 0
: :Conventions = "BUFR/CDM"
: :BUFR:edition = 3
Variables: 8
Sequence ADPUPA);
ADPUPA: :coordinates = "YOB "
Sequence AIRCFT);
AIRCFT: :coordinates = "YOB YOB-2 "
Sequence SATWND);
SATWND: :coordinates = "YOB YOB-2 YOB-3 "
Sequence VADWND);
VADWND: :coordinates = "YOB YOB-2 YOB-3 YOB-4 "
Sequence ADPSFC);
ADPSFC: :coordinates = "YOB YOB-2 YOB-3 YOB-4 YOB-5 "
Sequence SFCSHP);
SFCSHP: :coordinates = "YOB YOB-2 YOB-3 YOB-4 YOB-5 YOB-6 "
Sequence RASSDA);
RASSDA: :coordinates = "YOB YOB-2 YOB-3 YOB-4 YOB-5 YOB-6 YOB-7 "
Sequence ASCATW);
ASCATW: :coordinates = "YOB YOB-2 YOB-3 YOB-4 YOB-5 YOB-6 YOB-7 YOB-8 "
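The file contains 8 root variables, each representing a different type of observation: for example, ADPUPA is upper-air sounding data, ADPSFC is land surface observation data, and SATWND is satellite wind data. All of these variables have the data type Sequence. A Sequence variable acts as a root variable that contains multiple inner variables, and an inner variable may itself be the root variable of the next level, so that multiple levels allow complex observation data to be expressed flexibly.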
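The nesting can be pictured with a small stand-in built from plain dictionaries. This is only a sketch of the idea, not how the library stores the data: the `tree` below is a heavily trimmed, hand-written structure, and `leaf_paths` is a hypothetical helper.

```python
# Hypothetical, heavily trimmed stand-in for the nested layout: a dict is a
# Sequence (container), any other value is a final (leaf) variable.
tree = {
    "ADPSFC": {
        "XOB-5": "LONGITUDE",
        "T___INFO_TEMPERATURE_INFORMATION": {
            "T__EVENT_TEMPERATURE_EVENT_SEQUENCE": {
                "TOB-5": "TEMPERATURE OBSERVATION",
            },
        },
    },
}


def leaf_paths(node, prefix=""):
    """Return the dotted path of every leaf variable under node."""
    paths = []
    for name in sorted(node):
        child = node[name]
        path = prefix + "." + name if prefix else name
        if isinstance(child, dict):
            paths.extend(leaf_paths(child, path))  # descend into a Sequence
        else:
            paths.append(path)                     # final variable
    return paths


for p in leaf_paths(tree):
    print(p)
```

Walking the containers recursively yields dotted names of the form `root.inner.leaf`, which is the naming the library prints for a final variable.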
The varnames attribute of the data file object lists just the variable names:
>>> f.varnames
[ADPUPA, AIRCFT, SATWND, VADWND, ADPSFC, SFCSHP, RASSDA, ASCATW]
To read land surface observation data, first get the ADPSFC variable:
obs = f['ADPSFC']
Then check which inner variables it contains:
>>> obs.varnames
[u'BYTCNT-5', u'SID-5', u'XOB-5', u'YOB-5', u'DHR-5', u'ELV-5', u'TYP-5', u'T29-5', u'TSB-5', u'ITP-5', u'SQN-5', u'PROCN-5', u'RPT-5', u'TCOR-5', u'RSRD_SEQ_RESTRICTIONS_ON_REDISTRIBUTION_SEQUENCE', u'CAT-5', u'P___INFO_PRESSURE_INFORMATION', u'Q___INFO_SPECIFIC_HUMIDITY_INFORMATION', u'T___INFO_TEMPERATURE_INFORMATION', u'Z___INFO_HEIGHT_INFORMATION', u'W___INFO_WIND_INFORMATION', u'W2_EVENT_WIND_{DIRECTION-SPEEDm-s}_EVENT_SEQUENCE', u'PMSL_SEQ_MEAN_SEA_LEVEL_PRESSURE_SEQUENCE', u'ALTIMSEQ_ALTIMETER_SETTING_SEQUENCE', u'SST_INFO_SEA_TEMPERATURE_INFORMATION', u'TOPC_SEQ_TOTAL_PRECIPITATION-TOTAL_WATER_EQUIVALENT_SEQUENCE', u'PREWXSEQ_PRESENT_WEATHER_SEQUENCE', u'CLOUDSEQ_OBSERVED_CLOUD_SEQUENCE_#_1', u'TMXMNSEQ_MAXIMUM-MINIMUM_TEMPERATURE_SEQUENCE', u'SWELLSEQ_SWELL_WAVE_SEQUENCE', u'VISB1SEQ_VISIBILITY_SEQUENCE_#_1', u'PSTWXSEQ_PAST_WEATHER_SEQUENCE', u'PKWNDSEQ_PEAK_WIND_SEQUENCE', u'GUST1SEQ_MAXIMUM_WIND_GUST_SEQUENCE_#_1', u'TPRECSEQ_TOTAL_PRECIPITATION_SEQUENCE', u'SUNSHSEQ_TOTAL_SUNSHINE_SEQUENCE', u'CLOU2SEQ_OBSERVED_CLOUD_SEQUENCE_#_2', u'SNOW_SEQ_SNOW_DEPTH_SEQUENCE', u'WAVE_SEQ_WAVE_SEQUENCE', u'PTENDSEQ_PRESSURE_TENDENCY_SEQUENCE', u'CLOU3SEQ_OBSERVED_CLOUD_SEQUENCE_#_3_CEILING', u'seq5']
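XOB and YOB are the longitude and latitude of the observation station, and SID is the station identifier. Each inner variable name gets a numeric suffix that depends on its root variable (for example -5 under ADPSFC and -3 under SATWND). The details of a single variable can be viewed as follows:
>>> obs['XOB-5']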
ADPSFC.XOB-5):
ADPSFC.XOB-5: long_name = "LONGITUDE"
ADPSFC.XOB-5: units = "DEG E"
ADPSFC.XOB-5: missing_value = 67108863
ADPSFC.XOB-5: scale_factor = 1.0E-5f
ADPSFC.XOB-5: add_offset = -180.0f
ADPSFC.XOB-5: BUFR:TableB_descriptor = "0-6-240"
ADPSFC.XOB-5: BUFR:bitWidth = 26
A variable that shows detailed information like this is a final variable, and data can be read from it directly:
>>> lon = obs['XOB-5'][:]
>>> lat = obs['YOB-5'][:]
>>> sid = obs['SID-5'][:]
>>> typ = obs['TYP-5'][:]
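The scale_factor, add_offset, and missing_value attributes printed for XOB-5 describe how a stored integer maps to a physical value. The sketch below shows only that arithmetic, using the attribute values from the listing. The helper `unpack` and the raw value 30000000 are hypothetical, and whether the arrays returned by `[:]` are already unpacked by the library is not stated here.

```python
def unpack(raw, scale_factor, add_offset, missing_value):
    """Apply value = raw * scale_factor + add_offset; None marks missing."""
    if raw == missing_value:
        return None
    return raw * scale_factor + add_offset


# Attribute values taken from the XOB-5 listing.
XOB = dict(scale_factor=1.0e-5, add_offset=-180.0, missing_value=67108863)

# bitWidth = 26, and the missing value is the all-ones 26-bit pattern.
assert XOB["missing_value"] == 2 ** 26 - 1

lon_value = unpack(30000000, **XOB)   # hypothetical raw value, about 120 degrees east
gap = unpack(67108863, **XOB)         # the missing marker maps to None
```

With 26 bits and a scale of 1.0E-5, the stored integers cover the full longitude range at a resolution of 0.00001 degrees.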
Observations such as temperature and pressure usually sit beneath two more levels of root variables:
>>> v = obs['T___INFO_TEMPERATURE_INFORMATION']
>>> vv = v['T__EVENT_TEMPERATURE_EVENT_SEQUENCE']
For every root variable, the variables attribute shows the full information, and varnames shows just the names of the variables it contains:
>>> v.varnames
[u'T__EVENT_TEMPERATURE_EVENT_SEQUENCE', u'TVO-5', u'T__BACKG_TEMPERATURE_BACKGROUND_SEQUENCE', u'T__POSTP_TEMPERATURE_POSTPROCESSING_SEQUENCE']
>>> vv.varnames
[u'TOB-5', u'TQM-5', u'TPC-5', u'TRC-5']
The TOB-5 variable inside the root variable vv is the final temperature variable:
>>> vvv = vv['TOB-5']
>>> vvv
ADPSFC.T___INFO_TEMPERATURE_INFORMATION.T__EVENT_TEMPERATURE_EVENT_SEQUENCE.TOB-5):
ADPSFC.T___INFO_TEMPERATURE_INFORMATION.T__EVENT_TEMPERATURE_EVENT_SEQUENCE.TOB-5: long_name = "TEMPERATURE OBSERVATION"
ADPSFC.T___INFO_TEMPERATURE_INFORMATION.T__EVENT_TEMPERATURE_EVENT_SEQUENCE.TOB-5: units = "DEG C"
ADPSFC.T___INFO_TEMPERATURE_INFORMATION.T__EVENT_TEMPERATURE_EVENT_SEQUENCE.TOB-5: missing_value = 16383S
ADPSFC.T___INFO_TEMPERATURE_INFORMATION.T__EVENT_TEMPERATURE_EVENT_SEQUENCE.TOB-5: scale_factor = 0.1f
ADPSFC.T___INFO_TEMPERATURE_INFORMATION.T__EVENT_TEMPERATURE_EVENT_SEQUENCE.TOB-5: add_offset = -273.2f
ADPSFC.T___INFO_TEMPERATURE_INFORMATION.T__EVENT_TEMPERATURE_EVENT_SEQUENCE.TOB-5: BUFR:TableB_descriptor = "0-12-245"
ADPSFC.T___INFO_TEMPERATURE_INFORMATION.T__EVENT_TEMPERATURE_EVENT_SEQUENCE.TOB-5: BUFR:bitWidth = 14
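Now read the data from the final temperature variable. Note that [0] is used here. Reading with [:] returns all of the data; for sounding stations, each station has several levels, so the resulting array data would be much larger than the lon and lat arrays and hard to match to the stations. Reading with [0] avoids this problem. You can also use [0:] to read the data of the first station alone (all of that station's values for this variable).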
data = vvv[0]
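The size mismatch described above can be reproduced with plain Python lists. This is an analogy under made-up numbers, not the library's implementation: `per_station` and `lon_demo` are hypothetical data, with one inner list per station and one entry per level.

```python
# Hypothetical observations: one inner list per station, one entry per level.
per_station = [
    [12.5],                 # surface station: a single value
    [15.0, 9.8, 3.1],       # sounding: several levels
    [-2.4, -8.0],
]
lon_demo = [116.3, 121.5, 87.6]   # hypothetical station longitudes

# Reading everything flattens all levels of all stations into one array.
flat = [v for levels in per_station for v in levels]

# Taking one value per station keeps the result aligned with the coordinates.
first = [levels[0] for levels in per_station]

print(len(flat), len(first), len(lon_demo))
```

The flattened list has more entries than there are stations, so it cannot be paired with the coordinate arrays, while the one-value-per-station list can.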
The complete code is as follows:
fn = 'D:/Temp/bufr/prepbufr.gdas.20230325.t00z.nr'
f = addfile(fn, keepopen=True)
obs = f['ADPSFC']
print(obs.varnames)
lon = obs['XOB-5'][:]
lat = obs['YOB-5'][:]
sid = obs['SID-5'][:]
typ = obs['TYP-5'][:]
v = obs['T___INFO_TEMPERATURE_INFORMATION']
vv = v['T__EVENT_TEMPERATURE_EVENT_SEQUENCE']
vvv = vv['TOB-5']
data = vvv[0]
f.close()
#Plot
axesm()
geoshow('country')
levs = arange(-20, 41, 5)
layer = scatter(lon, lat, data, levs, size=2, edgecolor=None, zorder=0)
colorbar(layer)
title('Bufr data example')
An example of reading satellite data:
fn = 'D:/Temp/bufr/prepbufr.gdas.20230325.t00z.nr'
f = addfile(fn, keepopen=True)
obs = f['SATWND']
print(obs.varnames)
lon = obs['XOB-3'][:]
lat = obs['YOB-3'][:]
sid = obs['SID-3'][:]
typ = obs['TYP-3'][:]
v = obs['P___INFO_PRESSURE_INFORMATION']
vv = v['P__EVENT_PRESSURE_EVENT_SEQUENCE']
vvv = vv['POB-3']
data = vvv[0]
f.close()
#Plot
axesm()
geoshow('country')
layer = scatter(lon, lat, data, size=2, edgecolor=None, zorder=0)
colorbar(layer)
title('Bufr data example')