The librosa library [NLP]: audio feature engineering (1), part 2

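Feature extraction
Zero-crossing rate (ZCR)
The zero-crossing rate is the rate of sign changes along a signal, i.e., the rate at which the signal changes from positive to negative or back. This feature has been used heavily in both speech recognition and music information retrieval. It usually has higher values for highly percussive sounds like those in metal and rock.
In other words, the zero-crossing rate describes how often the signal crosses zero within each frame.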
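# Zero-crossing rate: plot a short window of the waveform
import matplotlib.pyplot as plt

def ZCR_plot():
    # `info` is the audio time series loaded earlier in this series
    start, end = 1300, 1500
    plt.figure(figsize=(10, 6))
    plt.plot(info[start:end])
    plt.grid(True)
    plt.show()

ZCR_plot()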
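Count the zero crossings in the same window:
import librosa

# start and end were local to ZCR_plot(), so define them again here
start, end = 1300, 1500
n_zcr = librosa.zero_crossings(info[start:end], pad=False)  # boolean array, True at each crossing
print('# of ZCR is {}'.format(sum(n_zcr)))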
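To compute the zero-crossing rate per frame, use librosa.feature.zero_crossing_rate(). Its interface is:
librosa.feature.zero_crossing_rate(y, frame_length=2048, hop_length=512, center=True)

Parameters:
y: audio time series
frame_length: frame length, in samples
hop_length: hop length (frame shift), in samples
center: bool; if True, frames are centered by padding the edges of y

Returns:
zcr: zcr[0, i] is the zero-crossing rate in frame i

print('ZCR is')
print(librosa.feature.zero_crossing_rate(info))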
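To make the definition concrete, here is a minimal NumPy-only sketch of counting sign changes and turning the count into a per-frame rate. The helper names `count_zero_crossings` and `frame_zcr` are made up for this illustration, the test tone is synthetic, and the framing skips the edge padding that `center=True` applies, so the values will not match librosa frame for frame.

```python
import numpy as np

def count_zero_crossings(x):
    """Count sign changes between consecutive samples."""
    signs = np.signbit(x)
    return int(np.count_nonzero(signs[1:] != signs[:-1]))

def frame_zcr(x, frame_length=2048, hop_length=512):
    """Fraction of samples per frame at which the sign changes (no padding)."""
    rates = []
    for begin in range(0, len(x) - frame_length + 1, hop_length):
        frame = x[begin:begin + frame_length]
        rates.append(count_zero_crossings(frame) / frame_length)
    return np.array(rates)

# Synthetic 100 Hz tone, 1 second at 8000 Hz (assumed values for illustration).
# The phase offset keeps zero crossings from landing exactly on a sample.
sr = 8000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 100 * t + 0.5)

total = count_zero_crossings(tone)
rates = frame_zcr(tone)
print(total)          # a 100 Hz tone changes sign about 200 times per second
print(rates.mean())   # roughly 2 * 100 / 8000
```

A pure tone of frequency f crosses zero about 2f times per second, which is why noisy or percussive material, rich in high frequencies, tends to score higher.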
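Spectral centroid
It indicates where the "centre of mass" for a sound is located and is calculated as the weighted mean of the frequencies present in the sound. If the frequencies in a piece of music are the same throughout, the spectral centroid stays around a centre; if there are high frequencies at the end of the sound, the centroid moves towards its end.
The spectral centroid is the center of mass of the sound, i.e. the first moment of the spectrum. Its position indicates the frequency band around which the energy of that segment of the spectrum is concentrated.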
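# Spectral centroid
import librosa
import librosa.display as dd  # assumption: `dd` is an alias for librosa.display
import matplotlib.pyplot as plt
import sklearn.preprocessing

def spec_center():
    x = info[:80000]  # 80000 / 8000 = 10 seconds of data
    # y is passed by keyword, as recent librosa versions require
    spec_centroids = librosa.feature.spectral_centroid(y=x, sr=sr)[0]
    frames = range(len(spec_centroids))
    t = librosa.frames_to_time(frames, sr=sr)  # time axis

    # Scale to [0, 1] so the curve can be drawn over the waveform
    def normalize(x, axis=0):
        return sklearn.preprocessing.minmax_scale(x, axis=axis)

    # waveshow replaces waveplot, which was removed in librosa 0.10
    dd.waveshow(x, sr=sr, alpha=0.4, label='wave')
    plt.plot(t, normalize(spec_centroids), color='r', linewidth=1, linestyle=':', label='spec_center')
    plt.legend()
    plt.show()

spec_center()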
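The code computes the spectral centroid of every frame.
It then converts frame indices to times, so that time[i] corresponds to frame[i].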
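The "weighted mean of the frequencies" definition can be checked by hand on a single frame. The following is a rough sketch using only NumPy's FFT; the function name `centroid_hz` and the synthetic tones are assumptions made for illustration, and librosa's own implementation works on short-time frames rather than on one long frame.

```python
import numpy as np

def centroid_hz(frame, sr):
    """Magnitude-weighted mean frequency of one frame, in Hz."""
    magnitude = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sr)
    return float(np.sum(freqs * magnitude) / np.sum(magnitude))

# Two synthetic tones, 1 second each at 8000 Hz (assumed values)
sr = 8000
t = np.arange(sr) / sr
low = np.sin(2 * np.pi * 440 * t)
high = np.sin(2 * np.pi * 2000 * t)

c_low = centroid_hz(low, sr)
c_high = centroid_hz(high, sr)
c_mix = centroid_hz(low + high, sr)
print(c_low, c_high, c_mix)  # the mix lands between the two tones
```

With equal amplitudes, the centroid of the mix sits midway between the two tone frequencies, which matches the center-of-mass reading of the feature.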
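Spectral rolloff
Spectral rolloff is the frequency below which a specified percentage of the total spectral energy, e.g. 90%, lies.
Put differently, choose a frequency f such that the spectral energy below f reaches a set fraction of the total energy, such as 90% or 85% (empirical values).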
$$\mathop{\arg\min}_{f_c \in \{1, \dots, N\}} \; \sum_{i=1}^{f_c} m_i \geq 0.85 \times \sum_{i=1}^{N} m_i$$
Here $f_c$ is the rolloff frequency and $m_i$ is the energy component of the spectrum at frequency $i$.
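The formula amounts to a cumulative sum followed by a search for the first bin that reaches the threshold. Here is a small sketch under that reading; `rolloff_bin` is a hypothetical helper, the spectra are toy values, and the result is a 1-based bin index to match the indexing in the formula, not a frequency in Hz.

```python
import numpy as np

def rolloff_bin(m, roll_percent=0.85):
    """Smallest 1-based bin f_c whose cumulative energy reaches the threshold."""
    m = np.asarray(m, dtype=float)
    cumulative = np.cumsum(m)
    threshold = roll_percent * cumulative[-1]
    # first index whose cumulative sum is >= threshold, shifted to 1-based
    return int(np.searchsorted(cumulative, threshold, side="left")) + 1

# Toy spectra: energy concentrated in low bins vs. spread evenly
low_heavy = [4, 3, 2, 1]
flat = np.ones(10)

print(rolloff_bin(low_heavy))  # cumulative sums 4, 7, 9, 10 against a threshold of 8.5
print(rolloff_bin(flat))
```

The more energy sits in the low bins, the earlier the cumulative sum reaches the threshold and the lower the rolloff point.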
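# Spectral rolloff
import librosa
import librosa.display as dd  # assumption: `dd` is an alias for librosa.display
import matplotlib.pyplot as plt
import sklearn.preprocessing

def spec_rolloff():
    x = info[:160000]  # first 20 seconds
    # y is passed by keyword, as recent librosa versions require
    spec_roll = librosa.feature.spectral_rolloff(y=x, sr=sr)[0]
    frames = range(len(spec_roll))
    t = librosa.frames_to_time(frames, sr=sr)  # time axis

    # Scale to [0, 1] so the curve can be drawn over the waveform
    def normalize(x, axis=0):
        return sklearn.preprocessing.minmax_scale(x, axis=axis)

    # waveshow replaces waveplot, which was removed in librosa 0.10
    dd.waveshow(x, sr=sr, alpha=0.4, label='wave')
    plt.plot(t, normalize(spec_roll), color='r', linewidth=1, linestyle=':', label='spec_roll')
    plt.legend()
    plt.show()

spec_rolloff()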
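MFCC (Mel-Frequency Cepstral Coefficients)
This feature is one of the most important ways to extract features from an audio signal and is used widely when working on audio signals. The mel-frequency cepstral coefficients (MFCCs) of a signal are a small set of features (usually about 10-20) which concisely describe the overall shape of a spectral envelope.
MFCCs are an important audio feature. They form a set of features, one vector per frame, and can represent the envelope of the spectrum.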
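# MFCC
import librosa
import librosa.display as dd  # assumption: `dd` is an alias for librosa.display
import matplotlib.pyplot as plt

def mfcc_plot():
    x = info[:160000]  # first 20 seconds of samples
    # y is passed by keyword, as recent librosa versions require
    mfccs = librosa.feature.mfcc(y=x, sr=sr)
    print(mfccs.shape)
    dd.specshow(mfccs, sr=sr, x_axis='time')
    plt.show()

mfcc_plot()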
The resulting mfccs.shape is (20, 313): each frame has 20 MFCC dimensions, and there are 313 frames.
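The shape (20, 313) can be reproduced with a little arithmetic. This sketch assumes librosa's defaults of n_mfcc=20, hop_length=512 and centered framing, plus the 8000 Hz sampling rate implied by "160000 samples = 20 seconds"; `n_frames` is a helper written only for this illustration.

```python
def n_frames(n_samples, hop_length=512):
    """Frame count with centered framing: one frame per full hop, plus one."""
    return 1 + n_samples // hop_length

sr = 8000             # implied by 160000 samples = 20 seconds
n_samples = 20 * sr   # 160000
n_mfcc = 20           # librosa's default number of coefficients

shape = (n_mfcc, n_frames(n_samples))
print(shape)  # (20, 313)
```

Changing hop_length or the clip length changes only the second dimension; the first is set by n_mfcc.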
Interfaces
Resampling
Resample a time series from orig_sr to target_sr.
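y_hat = librosa.resample(y, orig_sr=orig_sr, target_sr=target_sr, fix=True, scale=False)

Parameters:
y: audio time series, mono or stereo
orig_sr: original sampling rate of y
target_sr: target sampling rate
fix: bool; adjust the length of the resampled signal so that it is exactly len(y) / orig_sr * target_sr = t * target_sr
scale: bool; scale the resampled signal so that y and y_hat have approximately equal energy

Returns:
y_hat: the resampled audio series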
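The length relation len(y) / orig_sr * target_sr is easy to see with a toy resampler. The sketch below uses plain linear interpolation via np.interp; this is a stand-in chosen for brevity, not the band-limited method librosa uses, and `resample_linear` and the test tone are assumptions made for illustration.

```python
import numpy as np

def resample_linear(y, orig_sr, target_sr):
    """Naive linear-interpolation resampling (illustration only)."""
    duration = len(y) / orig_sr                 # t, in seconds
    n_out = int(round(duration * target_sr))    # t * target_sr samples
    t_in = np.arange(len(y)) / orig_sr
    t_out = np.arange(n_out) / target_sr
    return np.interp(t_out, t_in, y)

# 1 second of a 440 Hz tone, upsampled from 8000 Hz to 16000 Hz
orig_sr, target_sr = 8000, 16000
t = np.arange(orig_sr) / orig_sr
y = np.sin(2 * np.pi * 440 * t)

y_hat = resample_linear(y, orig_sr, target_sr)
print(len(y), len(y_hat))  # the duration stays the same, the sample count doubles
```

For real audio, prefer librosa.resample: linear interpolation does no anti-aliasing, so downsampling with it can fold high frequencies back into the signal.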

