Using awk - 白红宇

Published: 2019-06-08

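The -F option sets the field separator, here ' "p_words":' (including the leading space); print $2 outputs the second field, that is, the text to the right of the separator.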

awk -F ' "p_words":' '{print $2}'
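As a small illustration of how the separator splits a record, the sketch below runs the same command on one line. The sample record is hypothetical (the real layout of the JSON file is not shown in this post), and the variable names line and right are made up for the example.

```shell
# Hypothetical sample record; only the key names come from the commands in this post.
line='{"id": 1, "p_words": the day is sunny, "p_q_relation": 0}'

# awk treats a multi-character -F value as a regular expression.
# This one contains no metacharacters, so it matches literally.
right=$(printf '%s\n' "$line" | awk -F ' "p_words":' '{print $2}')

# Prints everything to the right of the separator, leading space included.
printf '%s\n' "$right"
```

Everything before the separator ends up in $1, so printing $1 instead would give the left-hand part of the record.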

Count the number of words in each passage of the file:

cat data/train_middle-processed.json | awk -F ' "p_words":' '{print $2}'|  awk -F ', "p_q_relation":' '{print $1}' | awk '{print NF}'

Count the words in a line (NF is the number of fields):

echo 'he said no SDG JCD DDDV .' | awk '{print NF}'

Word frequency:

Given a file with these two lines:

the day is sunny the the
the sunny is is

the desired output is:

the 4
is 3
sunny 2
day 1

Command:

awk -F" " '{for(i=1;i<=NF;i++){array[$i]+=1;}} END{for(s in array){print s" "array[s];}}' words.txt|sort -nr -k 2
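The command above can be tried end to end with the sketch below. It is a variant under a few assumptions rather than the original script: the input is written to a temporary file instead of words.txt, the array is renamed count, -F" " is dropped because awk already splits on whitespace by default, and a second sort key is added so that words with equal counts come out in a fixed order.

```shell
# Write the two sample lines to a temporary file (stand-in for words.txt).
tmp=$(mktemp)
printf '%s\n' 'the day is sunny the the' 'the sunny is is' > "$tmp"

# Count every field of every line, then print "word count" pairs.
# sort: column 2 numeric descending, ties broken by the word itself.
freq=$(awk '{for (i = 1; i <= NF; i++) count[$i]++}
            END {for (w in count) print w, count[w]}' "$tmp" |
       sort -k2,2nr -k1,1)

printf '%s\n' "$freq"
rm -f "$tmp"
```

The order in which awk walks an array with for (w in count) is unspecified, which is why the sorting is left to sort.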

Computing an average:

File:

1 50
2 30
3 20
4 50

Command and output:

# awk -F' ' '{sum+=$2;count+=1} END{print "SUM:"sum"\nAVG:"sum/count}' inputfile
SUM:150
AVG:37.5
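One caveat with the command above is that sum/count divides by zero when the input is empty. A minimal sketch of a guarded version follows; avg_col2 is a hypothetical helper name introduced only for this example, and the "no input" message is likewise an assumption.

```shell
# avg_col2: hypothetical helper that sums and averages the second column.
# Reads the files given as arguments, or standard input when there are none.
avg_col2() {
    awk '{sum += $2; count++}
         END {
             # Guard against division by zero on empty input.
             if (count == 0) { print "no input"; exit 1 }
             print "SUM:" sum
             print "AVG:" sum / count
         }' "$@"
}

out=$(printf '%s\n' '1 50' '2 30' '3 20' '4 50' | avg_col2)
printf '%s\n' "$out"
```

On the four sample rows this prints SUM:150 and AVG:37.5, the same values as the original command.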

Usage in the project:

cat length.txt |  awk -F" " '{for(i=1;i<=NF;i++){array[$i]+=1;}} END{for(s in array){print s" "array[s];}}' |sort -nr -k 2

cat data/train_middle-processed.json | awk -F ' "p_words":' '{print $2}'|  awk -F ', "p_q_relation":' '{print $1}'  |awk '{print NF}' | awk -F" " '{for(i=1;i<=NF;i++){array[$i]+=1;}} END{for(s in array){print s" "array[s];}}'|sort -nr -k 2
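To see what the last pipeline produces, the sketch below runs a slightly condensed form of it on three made-up records. The record layout is an assumption (only the two key names are taken from the commands above), and the final two awk stages are merged into one by counting NF directly, which gives the same "length count" pairs.

```shell
# Hypothetical records standing in for data/train_middle-processed.json.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
{"id": 1, "p_words": the day is sunny, "p_q_relation": 0}
{"id": 2, "p_words": it is raining today, "p_q_relation": 1}
{"id": 3, "p_words": snow, "p_q_relation": 0}
EOF

# 1) keep the text after "p_words":   2) cut it off before "p_q_relation":
# 3) tally passages by word count     4) most frequent length first
hist=$(awk -F ' "p_words":' '{print $2}' "$tmp" |
       awk -F ', "p_q_relation":' '{print $1}' |
       awk '{len[NF]++} END {for (n in len) print n, len[n]}' |
       sort -k2,2nr -k1,1n)

printf '%s\n' "$hist"
rm -f "$tmp"
```

Each output line is a passage length followed by the number of passages of that length; here two passages have 4 words and one has 1 word.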

Reposted from: https://www.cnblogs.com/hozhangel/p/9442293.html
